Windows arm64: split with endofline

Daniel_Fritzsche · December 12, 2022, 10:09pm

Hi,

I am using Windows 11 on ARM and am experiencing problems with the Split method.

lStringArray1 = lString1.Split(EndOfLine)
lStringArray2 = lString2.Split(EndOfLine.Windows)
lStringArray3 = lString3.Split(EndOfLine.Unix)
lStringArray4 = lString4.Split(chr(10))
lStringArray5 = lString5.Split(chr(13)+chr(10))

None of these calls splits the given text, which contains Windows line endings.

I know that it makes no sense to split a string containing Windows line breaks with Unix line breaks. Please spare me comments about this; it’s just a test series.

lStringArray6 = lString6.Split(" ")

Splitting with a space works fine…

Unfortunately I cannot test on native Windows x86-64 to check whether this is a general Windows issue or only a Windows arm64 issue. I have also tried building the x86-64 version and running it on Windows ARM, with no success. I tried building the Windows x86-64 and Windows arm64 versions of the test project on macOS and running them on Windows ARM, also with no success.

Can anyone confirm this, comment on the issue, and upvote it?

https://tracker.xojo.com/xojoinc/xojo/-/issues/71214

It is a real showstopper for us, as we cannot test our application on Windows ARM with this bug. The issue contains a test project that shows the problem.

Any ideas to work around this or to modify the code are welcome…

Thanks
Daniel

Tim_Parnell · December 12, 2022, 10:19pm

Does ReplaceLineEndings do anything to your string? For testing, try replacing line endings with something visible.
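A minimal sketch of that test, assuming the text is in lString1 as in the first post (the "<EOL>", "<CR>" and "<LF>" markers are arbitrary placeholders chosen for illustration). The first log line shows whether ReplaceLineEndings finds any line endings at all; the second distinguishes CR from LF, so a CRLF pair shows up as "<CR><LF>":

```xojo
// Assumption: lString1 holds the text that refuses to split.

// 1) Replace every line ending (CR, LF or CRLF) with one visible marker.
System.DebugLog(lString1.ReplaceLineEndings("<EOL>"))

// 2) Mark CR and LF separately to see which characters are really there.
Var visible As String = lString1
visible = visible.ReplaceAll(Chr(13), "<CR>")
visible = visible.ReplaceAll(Chr(10), "<LF>")
System.DebugLog(visible)
```

If the second line shows only "<CR>" markers, the text does not contain CRLF at that point, whatever an external editor shows for the original file.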

Daniel_Fritzsche · December 12, 2022, 10:22pm

Great idea… will try that…

AlbertoD · December 12, 2022, 10:32pm

For your sample, try EndOfLine.CR

Daniel_Fritzsche · December 12, 2022, 10:44pm

Now it gets weird…

Using EndOfLine.CR, the split works.
Using EndOfLine.CRLF, the split fails.

But that is strange… on Windows there should be CRLF, and Notepad++ shows CR LF…

@Tim_Parnell: using ReplaceLineEndings also works as a workaround.

Douglas_Handy · December 12, 2022, 10:49pm

That is not a workaround. It is by far the best way to get a consistent result regardless of the source of the data, or the platform it runs on. You may THINK it has a given sequence since you are on Windows or whatever, but my standard practice has always been to use ReplaceLineEndings() to get to a known sequence, and then Split() on that sequence. (And regardless of the platform you are on, if the source originated on another machine – like an attachment or from a website or whatever – you have no reliable expectation of what is in the source.)

Daniel_Fritzsche · December 12, 2022, 10:57pm

After thinking about your approach for some time, I will take it into account and build a splitEOL extension method so that I always use it…
Thanks Douglas
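One way such a method might look, as a sketch only: the name SplitEOL and the choice of LF as the normalized sequence are assumptions for illustration, and the method would need to live in a module for Extends to work.

```xojo
// Hypothetical helper, placed in a module.
Public Function SplitEOL(Extends source As String) As String()
  // Normalize every line-ending variant (CRLF, CR, LF) to LF first,
  // then split on that single known sequence.
  Var normalized As String = source.ReplaceLineEndings(EndOfLine.LF)
  Return normalized.Split(EndOfLine.LF)
End Function
```

It could then be called as lStringArray1 = lString1.SplitEOL, independent of the platform the text came from.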

kevin_g · December 13, 2022, 8:27am

You’ve possibly found a bug in Xojo. Could you create a ticket in the feedback system so that it gets investigated?

Greg_O · December 13, 2022, 11:11am

It could also be that the EndOfLine sequence used by Windows ARM is not CR+LF.

Christian_Schmitz · December 13, 2022, 11:20am

We always normalise line endings here with ReplaceLineEndings before doing such a Split().
You can’t predict what type of line endings a text has when you can’t control the source.

E.g. a user may edit a record on macOS and put in CR or LF for line endings on saving. A Windows user reads it and can’t split on CRLF.

Rick_Araujo · December 13, 2022, 11:51am

I sadly learned that when you store content in a Xojo constant using copy/paste, Xojo alters the pasted content. One known interference is that it replaces whatever line endings you pasted with just Chr(13): CRLF becomes CR, LF becomes CR. That looks much like what you are seeing.