Fast String Concatenation with a defined encoding

I think you are misunderstanding what the global effect is, though Kem's example above describes it a bit more.

I do not know what kind of strings you are working with, but Join is the fastest way, even with that example concatenating numbers.
You just add strings to an array, so there is no actual copy; you only add references, and then Join creates the final String. So there is only one copy in total, and it is already UTF-8 by default.
In your technique, you are taking the string, copying it to a stream, and then copying it again into a string at the end. That doubles the copies.
And in both scenarios you will have the two copies in RAM at some point.
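
For illustration, a minimal sketch of the Join approach described above. The loop bound and the variable names are assumptions made up for the example, not taken from the original code in the thread:

```xojo
// Collect the pieces in an array; each Append stores a reference to the piece.
Dim parts() As String
For i As Integer = 1 To 1000 // 1000 is an arbitrary, illustrative count
  parts.Append(Str(i))
Next

// Join builds the final string in one pass, using an empty delimiter.
Dim result As String = Join(parts, "")
```

Whether this beats the MemoryBlock method in practice depends on the data, so it is worth timing both on the real workload.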

Sorry for my English if it sounds weird or rude, it is not the intention, but it is not my native language.

@Mar_Tin As @JensK said earlier, he was using Join with an array and got a 25% speed improvement by switching to the memory block method. I can’t say I’ve tried the Join method myself.

@Kem_Tekinay & @Björn_Eiríksson I can see the problem now. However, in the example here we are creating a brand new string anyway, as it is being taken from a memory block, so there is no possibility of pointers from elsewhere.

In my opinion Xojo really should solve that by adding something like myBlock.ToString(Encodings.UTF8) to the MemoryBlock class. It would do the same as the assignment operator, except that it would also give you the chance to pass in the encoding.

@Björn_Eiríksson It is already there:

mBlock.StringValue(0, mBlock.Size, Encodings.UTF8)
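
As a rough sketch of how that call might fit into the memory block method discussed in this thread: the stream setup, the loop, and the variable names below are assumptions for illustration, not the original poster's code.

```xojo
// Write the pieces into a MemoryBlock through a BinaryStream.
// Assumption: the block starts at size 0 and the stream grows it as data is written.
Dim mBlock As New MemoryBlock(0)
Dim bs As New BinaryStream(mBlock)
For i As Integer = 1 To 1000 // arbitrary, illustrative count
  bs.Write(Str(i))
Next
bs.Close // close first so that mBlock.Size reflects the bytes written

// Read the bytes back as a String and define the encoding in the same call.
Dim result As String = mBlock.StringValue(0, mBlock.Size, Encodings.UTF8)
```

This avoids a separate DefineEncoding step on the result, although StringValue itself still creates a new string from the block's bytes.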

All of the existing mechanisms produce a copy of the original string. Even if you assign the result to the same variable (s = s.DefineEncoding), there will be a moment in time, however brief, when there are two copies of the data in memory. @Mar_Tin, this is true for Join as well. The only way to avoid copying the string would be for Xojo to provide a method that does not return a value (and so does not make a copy) but works directly on the string itself. Something like

dim s as string = "abcd"
DefineEncoding(s, Encodings.ISOLatin1)

That would not create a second copy of the string, whereas

s = DefineEncoding(s, Encodings.ISOLatin1)

necessitates making a copy, because the result could be assigned to anything.