RegEx Problems

I’m seeing some strange behavior when working with regular expressions.

I have the following code:

[code]Public Function ToJSONEscapedText(extends x as XmlDocument) as Text
Dim contents As String = x.ToString

//Carriage return is replaced with \r
Dim re As New RegEx
re.SearchPattern = "\r"
re.ReplacementPattern = "\\r"
re.Options.ReplaceAllMatches = True

Dim m As RegExMatch = re.Search( contents )
Dim output As String = re.Replace
contents = output

//Double quote is replaced with \"
re = New RegEx
re.SearchPattern = "\"""
re.ReplacementPattern = "\"""
re.Options.ReplaceAllMatches = True

output = re.Replace( contents )
contents = output

//Backslash is replaced with \\
re = New RegEx
re.SearchPattern = "\\"
re.ReplacementPattern = "\\\\"
re.Options.ReplaceAllMatches = True

output = re.Replace( contents )
contents = output

Return contents.DefineEncoding( Encodings.UTF8 ).ToText
End Function
[/code]

When using the above regular expressions in RegEx Tester they work no problem, but in Xojo nothing is matched or replaced when using the below XML (pulled from the XMLDocument page in the docs):

<?xml version="1.0" encoding="UTF-8"?>
<League>
  <Team name="Seagulls">
    <Player name="Bob" position="1B" />
    <Player name="Tom" position="2B" />
  </Team>
  <Team name="Pigeons">
    <Player name="Bill" position="1B" />
    <Player name="Tim" position="2B" />
  </Team>
  <Team name="Crows">
    <Player name="Ben" position="1B" />
    <Player name="Ty" position="2B" />
  </Team>
</League>

To make things even more strange, I was initially testing this out in a Web App but couldn’t even instantiate the XML. I kept getting a Parser Error 2: Syntax Error. I did double check that this XML is valid using an online validator which didn’t report any problems.

Does there appear to be anything wrong with my code or with this XML before I file bug reports?

First, you don’t have to match before doing a replace, unless you have some other use for the match.

Most likely the text you’re matching against contains linefeeds, not returns. \r matches only a carriage return (ASCII 13) and ignores linefeeds (ASCII 10), which are more common these days.

If you want to replace either, you can use \R in your pattern, which matches a return, a linefeed, or return + linefeed (the Windows standard, for some reason). If you want to be exact, you’ll have to use two patterns, one to replace \r with \\r and another to replace \n with \\n.
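As a rough sketch of the \R approach (the sample string and the variable names are made up for illustration, and no Search call comes before the Replace):

[code]// Hypothetical sample text containing a Windows-style CR+LF line break
Dim contents As String = "line one" + Chr(13) + Chr(10) + "line two"

Dim re As New RegEx
re.Options.ReplaceAllMatches = True

// \R is meant to match CR, LF, or CR+LF as a single line break
re.SearchPattern = "\R"
// In a replacement pattern, \\ stands for one literal backslash,
// so this writes a backslash followed by the letter n
re.ReplacementPattern = "\\n"

// Replace can be given the source string directly; no prior Search is required
contents = re.Replace( contents )
[/code]

Note that this collapses every kind of line break into the same escape. If the distinction between return and linefeed matters, the two-pattern route is the one to use.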

Thanks Kem! Using the search method was actually an artifact of my debugging. I made the change to do a search with \R, but it’s still failing to find anything for any of the searches.

I have no explanation for that. It works fine here in RegExRX and in Xojo with the code you pasted above with just the “\R” change.

BTW, you don’t need to create new RegEx objects for each replacement, you can reuse the same one. That’s just an FYI.
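A minimal sketch of reusing one object, assuming a String variable named contents (the sample value here is invented):

[code]// Hypothetical sample text with a return and a linefeed
Dim contents As String = "a" + Chr(13) + "b" + Chr(10) + "c"

Dim re As New RegEx
re.Options.ReplaceAllMatches = True   // set once, applies to every Replace below

// First pass: carriage return
re.SearchPattern = "\r"
re.ReplacementPattern = "\\r"
contents = re.Replace( contents )

// Second pass: same object, only the patterns change
re.SearchPattern = "\n"
re.ReplacementPattern = "\\n"
contents = re.Replace( contents )
[/code]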

Submitted a bug report: <https://xojo.com/issue/54965>

It appears this was working, but some compounding factors led to my not seeing it.

XML.ToString appears to remove EndOfLine and Tab characters already, which is why \R didn’t seem to work (nor did \t when I tried that too).
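For anyone who wants to confirm this on their own document, a quick check along these lines should show whether any of those characters survive. This is only a sketch: the sample string stands in for the result of ToString, and the log messages are made up.

[code]// Substitute the result of your XmlDocument's ToString here;
// this literal is only a placeholder for illustration
Dim contents As String = "<League><Team name=""Seagulls"" /></League>"

// InStr returns 0 when the character is not found
If InStr( contents, Chr(13) ) > 0 Then System.DebugLog( "contains a return (ASCII 13)" )
If InStr( contents, Chr(10) ) > 0 Then System.DebugLog( "contains a linefeed (ASCII 10)" )
If InStr( contents, Chr(9) ) > 0 Then System.DebugLog( "contains a tab (ASCII 9)" )
[/code]

If nothing is logged, the RegEx has nothing to match, which would explain the "no replacement" result without any bug in the patterns.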

The XML also does not have any backslashes, although when I tried to add one in various places, I got a parsing error pulling it into XML. That is strange, since nothing I can find indicates that a backslash is illegal or reserved.

Finally, my coding error is where I set the search and replace string for double quotes.

Rather than:

re.SearchPattern = "\"""
re.ReplacementPattern = "\"""
It should look like this:

re.SearchPattern = """"
re.ReplacementPattern = "\\\"""
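Putting the pieces from this thread together, the function might end up looking something like the sketch below. One change here is my own assumption and was not raised in the thread: the backslash replacement is moved to the front. If it runs last, it also doubles the backslashes that the earlier replacements just inserted, so \" would turn into \\". The \n step is likewise an addition for linefeeds.

[code]Public Function ToJSONEscapedText(extends x As XmlDocument) As Text
  Dim contents As String = x.ToString

  // One RegEx object is reused for every replacement
  Dim re As New RegEx
  re.Options.ReplaceAllMatches = True

  // 1. Backslash first, so backslashes added by later steps are not escaped again
  re.SearchPattern = "\\"
  re.ReplacementPattern = "\\\\"
  contents = re.Replace( contents )

  // 2. Double quote becomes \" (a doubled "" inside a Xojo string literal is one quote)
  re.SearchPattern = """"
  re.ReplacementPattern = "\\\"""
  contents = re.Replace( contents )

  // 3. Carriage return becomes \r
  re.SearchPattern = "\r"
  re.ReplacementPattern = "\\r"
  contents = re.Replace( contents )

  // 4. Linefeed becomes \n
  re.SearchPattern = "\n"
  re.ReplacementPattern = "\\n"
  contents = re.Replace( contents )

  Return contents.DefineEncoding( Encodings.UTF8 ).ToText
End Function
[/code]

This still only covers the characters discussed here. Full JSON string escaping also has to handle tabs and other control characters.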