Splitting lines windows/mac

  1. Brian O

    Mar 14 Pre-Release Testers, Xojo Pro Calgary, AB

    dim lines() as string
    dim msg as string = "........"

    lines = split(msg, chr(&h0a))
    vs
    lines = split(msg, chr(&h0d) + chr(&h0a))

    How should you split lines on unix vs dos?


  2. Dave S

    Mar 14 Answer San Diego, California USA

    Use EndOfLine; it is OS-agnostic.

  3. Tim H

    Mar 14 Pre-Release Testers Portland, OR USA

    lines = split(ReplaceLineEndings(msg, EndOfLine), EndOfLine)

  4. Brian O

    Mar 14 Pre-Release Testers, Xojo Pro Calgary, AB

    @Dave S use ENDOFLINE it is OS agnostic

    I thought that split wouldn't know what to do if you asked it to do this:

    lines = split("ABC11DEF11GHI", "11")

  5. Dave S

    Mar 14 San Diego, California USA

    @Brian OBrien I thought that split wouldn't know what to do if you asked it to do this:

    lines = split("ABC11DEF11GHI", "11")

    What does that have to do with EndOfLine?

  6. Emile S

    Mar 14 Europe (France, Strasbourg)

    Beware of multiplatform text files.

    If you move a file from Windows to macOS or Linux, the line-ending character(s) used in the file may differ from the platform's EndOfLine, and your code may work… or not.

  7. Douglas H

    Mar 14 Pre-Release Testers, Xojo Pro

    Which is why Tim's suggestion handles that too.

    lines = split(ReplaceLineEndings(msg, EndOfLine), EndOfLine)

    It first replaces all variations of line endings with the current OS standard ending, then splits on that ending style. Problem solved.
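
    As a rough sketch of how that one-liner behaves, assuming a made-up message that mixes all three ending styles (the sample text and variable names are only for illustration):

    ```xojo
    // Hypothetical sample text mixing CRLF, LF, and CR endings
    Dim msg As String = "one" + Chr(&h0D) + Chr(&h0A) + "two" + Chr(&h0A) + "three" + Chr(&h0D) + "four"

    // Step 1: normalize every ending style to the current platform's EndOfLine
    Dim normalized As String = ReplaceLineEndings(msg, EndOfLine)

    // Step 2: split on that single, now-consistent ending
    Dim lines() As String = Split(normalized, EndOfLine)

    // lines should now hold "one", "two", "three", "four",
    // regardless of which platform the code runs on
    ```

    Because the text is normalized before splitting, the result does not depend on which system wrote the file.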

  8. Emile S

    Mar 14 Europe (France, Strasbourg)

    Thanks Douglas, I forgot that one!

  9. Brian O

    Mar 14 Pre-Release Testers, Xojo Pro Calgary, AB

    Carriage Return and Line Feed.
    A Carriage Return brought the typewriter's carriage back to the beginning of the line.
    A Line Feed rotated the platen and caused the paper to feed up one line.

    Teletype machines came later, and they emulated the typewriter.
    The alphabet was encoded into something called ASCII.
    In the ASCII set, a Carriage Return was encoded as the 7-bit value &h0D (13).
    The Line Feed was encoded as &h0A (10).
    Data was sent over RS-232 using 300 baud modems programmed to send no parity bit, 7 or 8 data bits, and one stop bit.

    So to send the message "Hello World" one would say:

    print("Hello World" + chr(&h0D) + chr(&h0A))

    When operating systems came along, they also adopted ASCII.
    However, Unix systems decided that only one character was necessary and dropped the carriage return.
    So on a Unix operating system you would do this:

    print("Hello" + chr(&h0a))

    But on MS-DOS systems:

    print("Hello" + chr(&h0d)+chr(&h0a))

    So back to EndOfLine and the split function.
    I was under the impression that the split function would take only a single character as its delimiter, not two or more characters.
    So my example is:

    dim msg as string = "Hello" + chr(&h0d) + chr(&h0a) + "World"  // To emulate reading a DOS text file on a Mac.
    dim lines() as string
    #If TargetWindows Then
       lines=split(msg, chr(&h0d)+chr(&h0a))
    #else
       lines=split(msg, chr(&h0a))
    #endif

    So I tried this on my Mac:

    dim lines() as string
    dim msg as string = "Hello" + chr(&h0d) + chr(&h0a) + "World"  // What happens with a DOS file when read on a Mac?
    lines = split(msg, chr(&h0d)+chr(&h0a))  // Produces two strings: "Hello" and "World"
    lines = split(msg, chr(&h0a))  // Produces two strings, but the first has a trailing &h0D

    Dave said this could be simplified to EndOfLine no matter what the OS.

    I wanted to keep it simple...

    dim msg as string = "ABC12DEF"
    dim lines() as string
    lines = split(msg, "12")

    Which produces two strings! :)

  10. Tim H

    Mar 14 Pre-Release Testers Portland, OR USA

    The delimiter can be any number of characters.
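
    A minimal sketch of that point, reusing the strings from earlier in the thread (the variable names are only for illustration):

    ```xojo
    // A two-character delimiter is treated as one unit
    Dim parts() As String = Split("ABC11DEF11GHI", "11")
    // parts(0) = "ABC", parts(1) = "DEF", parts(2) = "GHI"

    // The same holds for a CR+LF pair used as the delimiter
    Dim crlf As String = Chr(&h0D) + Chr(&h0A)
    Dim lines() As String = Split("Hello" + crlf + "World", crlf)
    // lines(0) = "Hello", lines(1) = "World"
    ```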

  11. Emile S

    Mar 15 Europe (France, Strasbourg)

    The LR (Language Reference) does not imply that at all, even though it says the Delimiter is a String.

    Also, the two LR descriptions of what happens when you do not pass a delimiter are… dubious.

    In these cases, the best way to know is to check for yourself, but for the delimiter string, how would you even think to test whether a Delimiter longer than one character is possible?

  12. Brian O

    Mar 15 Pre-Release Testers, Xojo Pro Calgary, AB

    The issue is: when I open a file and want to break it into lines, that file may have been created on a PC while I am running on a Mac.
    So if I use EndOfLine, it's the end of line for that system, not the end of line for the file.

    So what if I strip the whole file of &h0d, then split the lines using &h0a?
    Then this should work no matter what platform I'm running on or what system created the file.
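
    A rough sketch of that strip-then-split idea, assuming the file contents are already in a string (the sample text and variable names are made up for illustration):

    ```xojo
    // Hypothetical contents of a DOS-style text file
    Dim msg As String = "Hello" + Chr(&h0D) + Chr(&h0A) + "World"

    // Remove every carriage return, leaving only line feeds
    Dim stripped As String = ReplaceAll(msg, Chr(&h0D), "")

    // Split on the remaining line feeds
    Dim lines() As String = Split(stripped, Chr(&h0A))
    // lines(0) = "Hello", lines(1) = "World"
    ```

    One caveat with this approach: a file that uses a bare &h0d as its only line ending (classic Mac OS style) has no &h0a left after stripping, so it would come back as a single line.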

  13. Douglas H

    Mar 15 Pre-Release Testers, Xojo Pro

    @Brian OBrien So what if I strip the whole file of &h0d, then split the lines using &h0a?
    Then this should work no matter what platform I'm running on or what system created the file.

    Which is similar to Tim's suggested one liner. Check out what ReplaceLineEndings() does:

    lines = split(ReplaceLineEndings(msg, EndOfLine), EndOfLine)

    No need to roll your own solution. Xojo already has a method to do it for you.

  14. Tim H

    Mar 15 Pre-Release Testers Portland, OR USA

    @Brian OBrien that file may have been created on a PC and I am running on a Mac

    That's what ReplaceLineEndings is for.
