Mac vs Linux Text issues (performance / leak)

I’m using this textbook code on my Mac without any problems. My input text is bigger though, around 30 KB.

[quote]Dim t As Text = "Hello, World!"
Dim reverse As Text

For Each c As Text In t.Characters
	reverse = c + reverse
Next[/quote]

On Linux, with the same input, it takes forever to finish.

Any thoughts or alternatives are welcome.

I made a short test with a 39 KB file:
2:45 minutes on Linux.

But what is the purpose of displaying a large text backwards?

It’s milliseconds on a Mac. :slight_smile:

I’m not displaying it, I want to search in it.

Just to clarify: I’d prefer a compatible short-term solution (reversed text), not a rewrite of my search logic.

And should I report this in Feedback as a bug?

try this… I was surprised how fast it was

		Dim t As Text = "Hello, World!"
		Dim reverse As Text
		dim start as  double
		while t.Length<30000
				t=t+t
		wend // creates a string 53,248 characters long
//
//  original method
//
		start=Microseconds
		For Each c As Text In t.Characters
				reverse = c + reverse
		Next
// OSX = 0.111 seconds
		msgbox str((Microseconds-start)/1000000)+":"+str(t.Length)+":"+str(reverse.Length)
//
// New Method
//
		dim v() As Text
		dim z() As Text
		dim i as integer
		start=Microseconds
		v=t.split
		for i=v.Ubound DownTo 0
				z.append(v(i))
		next
		reverse=text.join(z,"")
// OSX= 0.0225 seconds (5x faster)
		msgbox str((Microseconds-start)/1000000)+":"+str(t.Length)+":"+str(reverse.Length)
//

Can’t test on Linux, but if OSX got “faster”, then I’d think Linux would too

I suspect the issue is that the first method has to reallocate the string every time a character is added, whereas the second method only appends to an array and joins once at the end.
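If it helps, the array approach can be wrapped in a small reusable method. This is only a sketch: `ReverseText` is a made-up name, and it uses nothing beyond the `Split`, `Append` and `Text.Join` calls from the test above.

```xojo
// Hypothetical helper (the name ReverseText is an assumption, not a framework method).
// Collects the characters back to front in an array and joins them once at the end,
// so no intermediate Text values are built up character by character.
Function ReverseText(source As Text) As Text
  Dim chars() As Text = source.Split   // one element per character, as in the test above
  Dim reversed() As Text

  For i As Integer = chars.Ubound DownTo 0
    reversed.Append(chars(i))
  Next

  Return Text.Join(reversed, "")
End Function
```

Usage would then be a one-liner such as `Dim reverse As Text = ReverseText(t)`.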

my Linux Result:

1.MsgBox
299.121:53248:53248
2.MsgBox
1.558646:53248:53248

[quote=314844:@Axel Schneider]my Linux Result:

1.MsgBox
299.121:53248:53248
2.MsgBox
1.558646:53248:53248[/quote]

So it would seem the array method is WAY WAY faster on Linux, and “way” faster on OSX :slight_smile:

You know that you can search normal text by reversing only the search string, right ?
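A rough sketch of what I mean, assuming `Text.IndexOf` returns a zero-based position (or -1 when nothing is found). `ReverseNeedle` and the variable names are just placeholders. Only the short search term gets reversed, so the slow loop never touches the large text.

```xojo
// Hypothetical helper: reverses only the (short) search term.
Function ReverseNeedle(needle As Text) As Text
  Dim reversed As Text
  For Each c As Text In needle.Characters
    reversed = c + reversed
  Next
  Return reversed
End Function

// In the calling code:
Dim haystack As Text = "Hello, World!"
Dim needle As Text = "dlroW"   // what you would have searched for in the reversed text

// Search the original text for the reversed needle instead.
Dim p As Integer = haystack.IndexOf(ReverseNeedle(needle))
If p >= 0 Then
  // Map the hit back to the position it would have in the reversed text.
  Dim positionInReversed As Integer = haystack.Length - p - needle.Length
End If
```

One caveat: the first hit in the original text corresponds to the last hit in the reversed text, so this is not a drop-in replacement if you need the first match counted from the end.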

299 vs 1.55 seconds is a huge difference.

Cheers guys!

And @Michel Bujardet I have to search the other way around. So I’m doing both. :slight_smile:

Oh, should I file a Feedback case on this? It might have other consequences on Linux too, right?

Hope you guys are still around. Found the next one. Again milliseconds on Mac, minutes on Linux.

The code is simple enough. Suggestions?

Dim quickRows() As Text = dictVal.Split("""fields"":{") // Split the text rows in an array

Mac:

15:30:46 AirTableScript.findJoinManual: 2
15:30:46 AirTableScript.findJoinManual: 2a -rows:1405

Linux:

14:04:05 AirTableScript.findJoinManual: 2
14:07:43 AirTableScript.findJoinManual: 2a -rows:1405

I’ll try a string type too…

I had a feeling that the type could make a difference, and it does. The exact same code with String instead of Text is fast on Linux too…

Did a quick hack:

Dim quickRowsTemp() As String = str(dictVal).Split("""fields"":{") // Split the text rows in an array

Linux with both versions: the String type is instantaneous, the Text type takes almost 4 minutes.

15:02:20 AirTableScript.findJoinManual: 2
15:02:20 AirTableScript.findJoinManual: 2a -stringrows:1405
15:06:06 AirTableScript.findJoinManual: 2a -rows:1405
15:06:06 AirTableScript.findJoinManual: 2b

I’ll report it.

Case #46943

Don’t use Text split. It is bugged. And as you see, much slower.

Use string and it should fly.

Yep. This is an issue in most high level languages where strings are immutable. It’s why there’s a StringBuilder class in .NET, though you can do the same thing with a string array and join in Xojo.

In my testing on a Mac this is about twice as fast as the split technique:

		dim mb as new Xojo.Core.MutableMemoryBlock(Xojo.Core.TextEncoding.ASCII.ConvertTextToData(t))
		dim x As Integer = 0
		dim y As Integer = mb.Size-1
		dim ch As UInt8
		while x < y
				ch = mb.UInt8Value(x)
				mb.UInt8Value(x) = mb.UInt8Value(y)
				mb.UInt8Value(y) = ch
				x = x + 1
				y = y -1
		wend
		t = Xojo.Core.TextEncoding.ASCII.ConvertDataToText(mb)

However, using an old style MemoryBlock is 7-8x faster:

		dim mb As MemoryBlock = t
		dim x As Integer = 0
		dim y As Integer = mb.Size-1
		dim ch As UInt8
		while x < y
				ch = mb.UInt8Value(x)
				mb.UInt8Value(x) = mb.UInt8Value(y)
				mb.UInt8Value(y) = ch
				x = x + 1
				y = y -1
		wend
		dim s as string = mb.StringValue(0, mb.Size)
		s = s.DefineEncoding(Encodings.ASCII)
		t = s.ToText

Naturally if you have non-ASCII encoding then you can’t do it this way, though I’m guessing if you convert to/from UTF32 you can get away with 4-byte swaps.

[quote=314864:@Arthur v. d. B.]Cheers guys!

And @Michel Bujardet I have to search the other way around. So I’m doing both. :)[/quote]

This would affect your core logic, but…the absolute fastest and most memory efficient way to do what you need should be to walk the Text as an old style MemoryBlock in both directions. Aside from the initial conversion from Text to MemoryBlock, you’re not wasting any time allocating or swapping memory.

Of course it may not matter if you’re dealing with small blocks of Text.
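To make the "walk it in both directions" idea a bit more concrete, here is a minimal sketch of a backward search over the bytes. `LastIndexOfBytes` is a hypothetical name, it relies only on `Size` and `UInt8Value` as used above, and byte offsets equal character offsets only for single-byte data such as ASCII.

```xojo
// Hypothetical helper: scans haystack from the end towards the start and returns
// the byte offset of the last occurrence of needle, or -1 if there is none.
Function LastIndexOfBytes(haystack As MemoryBlock, needle As MemoryBlock) As Integer
  If needle.Size = 0 Then Return -1   // nothing to look for

  Dim lastStart As Integer = haystack.Size - needle.Size
  For start As Integer = lastStart DownTo 0
    Dim matched As Boolean = True
    For j As Integer = 0 To needle.Size - 1
      If haystack.UInt8Value(start + j) <> needle.UInt8Value(j) Then
        matched = False
        Exit For
      End If
    Next
    If matched Then Return start
  Next

  Return -1
End Function
```

A forward search is the same loop with `start` running from 0 up to `lastStart`, so nothing has to be reversed or copied after the initial conversion to a MemoryBlock.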

Thanks again people!

And @Daniel Taylor: for now it’s easily fast enough, as the bottleneck is the API communication, which is many times slower.

I’ll consider it for a rewrite though, as the amount of data might grow in the future too.

[quote=315132:@Michel Bujardet]Don’t use Text split. It is bugged. And as you see, much slower.

Use string and it should fly.[/quote]

Excluding the performance issue, what did you find for a bug?

Hi guys! The feedback is still open, but any chance this has been fixed in the meantime (I’m still on 2017 R2.1) due to a different compiler or something else?