Replace EndOfLines: Strings end in EndOfLines?!?

I’ve been using Replaces for ages (who hasn’t) and i’m totally baffled by this odd behaviour:

I am trying to clean a string which has loads of spaces in front of each line of text. So i replace double spaces with single spaces (in a loop until all have been replaced), which works fine until i’m left with the last space before each line.

So i try this:
strText=ReplaceAll(strText, EndOfLine + " ", EndOfLine)

and absolutely nothing happens.

So i put my search and replace strings into variables and now i can see that apparently, when a string begins with an EndOfLine, it also ends with an EndOfLine:

strFind=EndOfLine + " " ' returns EndOfLine + " " + EndOfLine ?!?

Is this a Windows oddity? Or am i imagining things?

The EndOfLine on Windows is two characters, carriage return (13) + linefeed (10), and that might be what you’re seeing.

But this is a great use of a regular expression.

var rx as new RegEx
rx.SearchPattern = "^[\x20\t]+"
rx.ReplacementPattern = ""
rx.Options.ReplaceAllMatches = true

strText = rx.Replace( strText )

(Untested, forgive typos.)

If you first replace all line endings, you make sure the EndOfLine is the expected ending, since many platforms can have different line endings.

myStr = myStr.ReplaceLineEndings(EndOfLine.Unix) // Makes every line ending a single-char LF
myStr = myStr.ReplaceAll(EndOfLine.Unix + " ", EndOfLine.Unix)
// or maybe you want to trim the whitespaces?
// mystr = mystr.trimRight  ?

I’m reading the text from a file where the EndOfLine is a single chr(13), no chr(10) added.

And yes, of course i can use regular expressions in this case (and your code does work, thanks!), however the question is whether this is some fluke or if there is some system behind this i need to be aware of for other scenarios also?

Try
strFind=EndOfLine + " "
and you will see that an EndOfLine is added to the end of the string. Without any EndOfLines anywhere in the string, this doesn’t happen … At least this is what i see on Windows.

I’m not seeing this behavior on Windows nor Mac. Which Xojo version are you using?

@DerkJ gave you the answer as to why. Either use Kem’s regex solution or Derk’s ReplaceLineEndings solution.

You are imagining things. The line:
strFind=EndOfLine + " "
Returns: chr(13) + chr(10) + chr(32)

Which is correct: as Kem already mentioned, EndOfLine on Windows is two characters, chr(13) + chr(10), and " " (space) is just one, chr(32).

So, why are you using EndOfLine if it will always return chr(13) + chr(10) on Windows? Why don’t you simply use chr(13)?

NO, as described first.

I have had this happen to me also. Just recently too. Hopefully it gets solved in the next version.

Resolve what? I think it is working as designed. The OP already stated that his text has only a chr(13) character and NOT the chr(13) + chr(10) pair that is (correctly) represented by an unqualified EndOfLine when run under Windows. The “problem” here was assuming EndOfLine would match against a single carriage return character.

Maybe I misunderstood. I thought ReplaceLineEndings wasn’t working. I didn’t take proper time.

2021, 2.1

THANK YOU, at least one other person can see what i am seeing!!!

So you also feel this only began recently?

I guess i need to rephrase the question a little to hopefully get an answer …

Alright, let me rephrase since the whole replace and EndOfLine thing seems to detract from the issue. That is how and where the problem became apparent, but it actually happens right here:

strText="One line of normal text"

One line of normal text

strText="One line of normal text" + chr(13) + "Another line"

One line of normal text_
Another line_

where the _ shows the chr(13)s.

I found out that no matter where or how many EndOfLines or chr(13)s i use in a string, Xojo tacks another one onto the end of the string. This happens as soon as i save this in a variable, so replacing end of lines doesn’t help. The extra one is already there!

Can you upload a really simple binary project showing the problem?

Of course, nothing of the sort happens, either in Win7 or Win10. I put:

Var strText As String

strText = "one line of normal text" + chr(13) + "Another line"

break

in the Open event of a new project, then examined the hex string at the break. The last character in the hex was from the ‘e’ of the second word ‘line’.

So if you think there’s a problem, please demonstrate that with a small example we can all run.
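
For anyone who would rather check this in code than in the debugger panes, here is a rough sketch of the same test. It assumes API 2.0 (String.Bytes, Integer.ToString) and a new desktop project with the code in the Open event; the variable name hexDump is made up for the example. It is untested here, so treat it as a starting point only.

```xojo
// Sketch only: paste into the Open event of a new project.
Var strText As String = "One line of normal text" + Chr(13) + "Another line"

// EncodeHex with True puts a space between the byte pairs.
Var hexDump As String = EncodeHex(strText, True)
System.DebugLog(hexDump)

// 23 characters + 1 carriage return + 12 characters.
System.DebugLog("Bytes: " + strText.Bytes.ToString)
```

If nothing is being appended, the dump should contain a single 0D and end in 65 (the ‘e’ of the second ‘line’), and the byte count should be 36. This reads the string directly, so the clipboard never gets involved.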

Ok, it seems to be working now. I can split and replace even including carriage returns again. Yay.

In my current project, i am loading xml files, cleaning some entries and then using them elsewhere. I never had any problem replacing EndOfLines before. When this didn’t work any more the other day, i looked at the xml and switched to chr(13) since those appeared in the xml code. Still no luck. Since i do my replacing in my own procedure, i looked at the variables and found this anomaly which can be demonstrated here:

Dim strText As String

strText="One line of normal text"
strText="One line of normal text" + chr(13) + "Another line"

After the first line, looking at the variable’s text in the debug window, the text reads ‘One line of normal text’ and if you press ‘Ctrl+a’ to select all, there are no surprises. If you copy the text to a string analyzer, no surprises here either.

Then step through to the second line and you will see something appear after the last character of the word ‘line’ after pressing ‘Ctrl+a’. If you copy the text to a string analyzer, you will see that the additional character is a trailing chr(13). The trailing carriage return does not show up in binary view and also is not counted.

So naturally i assumed that the additional chr(13) was messing up my replaces (and as i noticed later, also my splits).

Then i tried the ReplaceLineEndings function but this also didn’t help. Until today. Out of lack of options, i tried it not as the function, but used the other notation ‘strXML.ReplaceLineEndings …’ and now i can replace and split again. Why it didn’t work before – i guess i must have made a mistake – but why i have to do this at all if chr(13) obviously already is the line end in my xml files, i don’t understand.

And it’s still a mystery to me why there are carriage returns popping up in the debugger …

This line, in the debugger, does NOT show any chr(13) at the end.

Ah, finally we get to the cause, now that you have documented your process: the copy and paste you mention above is what makes you see a trailing end-of-line character when you paste into an external hex editor. If you click the Binary tab in the IDE, you can see the raw data, which will not have that final character.

Not seeing this on 2021 R2.1 Windows, 32 or 64 bit.

Patrick, are you using a clipboard utility or other add-on that might be adding this extra end of line?

Because, as others have pointed out, Windows uses a different sequence for line endings - a two character sequence of CR and LF. So when you use EndOfLine in a Windows application, you will get this two character string, which will not match a lone CR line ending. ReplaceLineEndings “fixes” this for you, if you use it correctly.
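
A rough sketch of what “using it correctly” could look like for the original problem (stripping the spaces in front of each line). It assumes strText already holds the text read from the file and that API 2.0 is in use; the names cleaned and needle are made up for the example, and the code is untested.

```xojo
// Sketch only: strText is assumed to hold the text read from the file.
// 1. Force every line ending (CR, LF or CRLF) to a single LF.
Var cleaned As String = strText.ReplaceLineEndings(EndOfLine.Unix)

// 2. Remove one leading space per pass until none is left.
Var needle As String = EndOfLine.Unix + " "
While cleaned.IndexOf(needle) > -1
  cleaned = cleaned.ReplaceAll(needle, EndOfLine.Unix)
Wend

// 3. Optionally convert back to the platform's native line ending.
cleaned = cleaned.ReplaceLineEndings(EndOfLine)
```

The point is that the search string and the text use the same line ending while the replacing happens. Note that spaces in front of the very first line are not preceded by a line ending, so this loop leaves them alone.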

BUT… I am suspicious about your XML file. A lone CR line ending is not used by any modern OSes today (that I know of). It was used on Macs/Apples pre-OSX, and I think Commodore, but those are ancient. Are you sure your file only uses a single CR for line breaks? If so, where do these XML files come from?