Splitting lines windows/mac

dim lines() as string
dim msg as string = "…"

lines = split(msg, chr(&h0a))
vs
lines = split(msg, chr(&h0d) + chr(&h0a))

How should you split lines on unix vs dos?

Use EndOfLine; it is OS-agnostic.

lines = split(ReplaceLineEndings(msg, EndOfLine), EndOfLine)

I thought that split wouldn’t know what to do if you asked it to do this:

lines = split("ABC11DEF11GHI", "11")

[quote=428547:@Brian O’Brien]I thought that split wouldn’t know what to do if you asked it to do this:

lines = split("ABC11DEF11GHI", "11")[/quote]
What does that have to do with EndOfLine? [confused]

Beware of multiplatform text files.

If you move a file from Windows to macOS or Linux, the end-of-line character(s) used in the file may differ from the platform's own, and your code may work… or not.

Which is why Tim’s suggestion handles that too.

lines = split(ReplaceLineEndings(msg, EndOfLine), EndOfLine)

It first replaces all variations of line endings with the current OS standard ending, then splits on that ending style. Problem solved.
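To make the two steps concrete, here is a minimal sketch. The sample text and the variable names (crChar, lfChar, normalized, oneLine) are made up for illustration; only ReplaceLineEndings, Split and EndOfLine come from the one-liner above.

```xojo
// Sketch only: msg is a made-up sample that mixes CRLF, LF and CR endings.
Dim crChar As String = Chr(&h0D)
Dim lfChar As String = Chr(&h0A)
Dim msg As String = "one" + crChar + lfChar + "two" + lfChar + "three" + crChar + "four"

// Step 1: convert every line ending to the current platform's ending.
Dim normalized As String = ReplaceLineEndings(msg, EndOfLine)

// Step 2: split on that single ending style.
Dim lines() As String
lines = Split(normalized, EndOfLine)

// Each entry should now be a clean line with no stray &h0D or &h0A.
For Each oneLine As String In lines
  System.DebugLog(oneLine)
Next
```

Because the text is normalized before splitting, it should not matter which platform wrote the file or which one is running the code.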

Thanks Douglas, I forgot that one!

Carriage Return and Line Feed.

A Carriage Return brought the typewriter's carriage back to the beginning of the line.
A Line Feed rotated the platen and fed the paper up one line.

Teletype machines later became possible and they emulated the typewriter.
The alphabet was encoded into something called ASCII.
In the ASCII set, a Carriage Return was encoded as the 7-bit value &h0D (13).
The Line Feed was encoded as &h0A (10).
Data was sent over RS-232 using 300 baud modems programmed to send no parity bit, 8/7 data bits and one stop bit…

So to send the message “Hello World” one would say:

print("Hello World" + chr(&h0D) + chr(&h0A))

When operating systems came along, they also adopted ASCII.
However, Unix systems decided that only one character was necessary and dropped the carriage return.
So on a unix operating system you would do this:

print("Hello" + chr(&h0a))

But on MS-DOS systems:

print("Hello" + chr(&h0d)+chr(&h0a))

So back to EndOfLine and the split function.
I was under the impression that the split function would take only a single character as its delimiter, not two or more.
So my example is:

dim msg as string = "Hello" + chr(&h0d) + chr(&h0a) + "World" // To emulate reading a DOS text file on a Mac.
dim lines() as string
#If TargetWindows Then
  lines = split(msg, chr(&h0d) + chr(&h0a))
#Else
  lines = split(msg, chr(&h0a))
#EndIf

So I tried this on my Mac:

dim lines() as string
dim msg as string = "Hello" + chr(&h0d) + chr(&h0a) + "World" // What happens with a DOS file when read on a Mac?
lines = split(msg, chr(&h0d) + chr(&h0a))
lines = split(msg, chr(&h0a)) // Produces two strings, but the first has a trailing &h0D

Dave said this could be simplified to EndOfLine no matter what the OS.

I wanted to keep it simple…

dim msg as string = "ABC12DEF"
dim lines() as string
lines = split(msg, "12")

Which produces two strings!

The delimiter can be any number of characters.

The LR does not imply that at all, even though it says the Delimiter is a String.

Also, the two LR descriptions of what happens when you do not pass a delimiter are… dubious.

In these cases, the best way to know is to check for yourself, but how would you even think to test whether a Delimiter of more than one character is possible?

The issue is… when I open a file and want to break it into lines, that file may have been created on a PC while I am running on a Mac.
So if I use EndOfLine… it’s the end of line for that system, not the end of line for the file.

So if I strip the whole file of &h0d and then split the lines using &h0a,
this should work no matter what platform I’m running on or what system created the file.
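A rough sketch of that strip-then-split idea, for comparison. The sample text and the name stripped are hypothetical; ReplaceAll, Split and Chr are the standard Xojo functions.

```xojo
// Sketch only: msg is a made-up sample with one CRLF ending and one LF ending.
Dim msg As String = "Hello" + Chr(&h0D) + Chr(&h0A) + "World" + Chr(&h0A) + "Again"

// Remove every carriage return so only line feeds remain.
Dim stripped As String = ReplaceAll(msg, Chr(&h0D), "")

// Split on the line feed alone.
Dim lines() As String
lines = Split(stripped, Chr(&h0A))
// lines now holds "Hello", "World" and "Again"
```

One caveat: a file that uses a lone &h0d as its line ending (classic Mac style) would lose all its line breaks this way and come back as a single entry.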

[quote=428645:@Brian O’Brien]So if I strip the whole file of &h0d and then split the lines using &h0a,
this should work no matter what platform I’m running on or what system created the file.[/quote]

Which is similar to Tim’s suggested one-liner. Check out what ReplaceLineEndings() does:

lines = split(ReplaceLineEndings(msg, EndOfLine), EndOfLine)

No need to roll your own solution. Xojo already has a method to do it for you.

That’s what ReplaceLineEndings is for.