RegEx Problems

  1. Jonathan E

    Feb 19 Pre-Release Testers, Xojo Pro Las Vegas, NV

    I'm seeing some strange behavior when working with regular expressions.

    I have the following code:

    Public Function ToJSONEscapedText(extends x as XmlDocument) as Text
      Dim contents As String = x.ToString
      
      //Carriage Return Is replaced With \r
      Dim re As New RegEx
      re.SearchPattern = "\r"
      re.ReplacementPattern = "\\r"
      re.Options.ReplaceAllMatches = True
      
      Dim m As RegExMatch = re.Search( contents )
      Dim output As String = re.Replace
      contents = output
      
      //Double quote Is replaced With \"
      re = New RegEx
      re.SearchPattern = "\"""
      re.ReplacementPattern = "\"""
      re.Options.ReplaceAllMatches = True
      
      output = re.Replace( contents )
      contents = output
      
      //Backslash Is replaced With \\
      re = New RegEx
      re.SearchPattern = "\\"
      re.ReplacementPattern = "\\\\"
      re.Options.ReplaceAllMatches = True
      
      output = re.Replace( contents )
      contents = output
      
      Return contents.DefineEncoding( Encodings.UTF8 ).ToText
    End Function

    When I use the above regular expressions in RegEx Tester they work without a problem, but in Xojo nothing is matched or replaced when I use the XML below (pulled from the XMLDocument page in the docs):

    <?xml version="1.0" encoding="UTF-8"?>
     <League>
     	<Team name="Seagulls">
     		<Player name="Bob" position="1B" />
     		<Player name="Tom" position="2B" />
     	</Team>
     	<Team name="Pigeons">
     		<Player name="Bill" position="1B" />
     		<Player name="Tim" position="2B" />
     	</Team>
     	<Team name="Crows">
     		<Player name="Ben" position="1B" />
     		<Player name="Ty" position="2B" />
     	</Team>
     </League>

    To make things even stranger, I was initially testing this in a Web App but couldn't even instantiate the XML; I kept getting "Parser Error 2: Syntax Error". I did double-check that this XML is valid using an online validator, which didn't report any problems.

    Does there appear to be anything wrong with my code or with this XML before I file bug reports?

  2. Kem T

    Feb 19 Pre-Release Testers, Xojo Pro, XDC Speakers New York

    First, you don't have to match before doing a replace, unless you have some other use for the match.

    Most likely the text you're matching against has linefeeds, not returns. \r is specific to a return (ASC 13), and will ignore linefeeds (ASC 10), and linefeeds are more common these days.

    If you want to replace either, you can use \R in your pattern, and that will match return, linefeed, or return + linefeed (the Windows standard, for some reason). If you want to be exact, you'll have to use two patterns, one to replace \r with \\r and another to replace \n with \\n.
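
    A minimal sketch of the exact, two-pattern approach. The variable name contents and the sample text are made up for illustration; the RegEx properties are the same ones used in the original function.

    ```xojo
    // Hypothetical sample text with a Windows-style line ending (return + linefeed)
    Dim contents As String = "line one" + Chr(13) + Chr(10) + "line two"

    Dim re As New RegEx
    re.Options.ReplaceAllMatches = True

    // Return (ASC 13) becomes the two characters \r
    re.SearchPattern = "\r"
    re.ReplacementPattern = "\\r"
    contents = re.Replace(contents)

    // Linefeed (ASC 10) becomes the two characters \n
    re.SearchPattern = "\n"
    re.ReplacementPattern = "\\n"
    contents = re.Replace(contents)
    ```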

  3. Jonathan E

    Feb 19 Pre-Release Testers, Xojo Pro Las Vegas, NV

    Thanks Kem! Using the search method was actually an artifact of my debugging. I made the change to do a search with \R, but it's still failing to find anything for any of the searches.

  4. Kem T

    Feb 19 Pre-Release Testers, Xojo Pro, XDC Speakers New York

    I have no explanation for that. It works fine here in RegExRX and in Xojo with the code you pasted above with just the "\R" change.

    BTW, you don't need to create a new RegEx object for each replacement; you can reuse the same one. That's just an FYI.

  5. Jonathan E

    Feb 19 Pre-Release Testers, Xojo Pro Las Vegas, NV

    Submitted a bug report: Feedback Case #54965

  6. Jonathan E

    Feb 19 Pre-Release Testers, Xojo Pro Las Vegas, NV

    It appears this was working, but some compounding factors led to my not seeing it.

    XML.ToString appears to remove EndOfLine and Tab characters already, which is why \R didn't seem to work (nor did \t when I tried that).

    The XML also does not have any backslashes, although when I tried to add one in various places, I got a parsing error pulling it into XML. That's strange, since nothing I can find indicates that a backslash is illegal or reserved.

    Finally, my coding error is where I set the search and replacement strings for double quotes.

    Rather than:

    re.SearchPattern = "\"""
    re.ReplacementPattern = "\"""

    It should look like this:

    re.SearchPattern = """"
    re.ReplacementPattern = "\\"""
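
    For reference, a sketch of the whole function with that fix applied. Two further changes here are assumptions rather than anything confirmed in this thread: the backslash step is moved first, so that the backslashes added by the later replacements are not doubled again, and a single RegEx object is reused, as Kem suggested. It has not been run against the sample XML.

    ```xojo
    Public Function ToJSONEscapedText(extends x As XmlDocument) As Text
      Dim contents As String = x.ToString

      // One RegEx object, reused for every replacement
      Dim re As New RegEx
      re.Options.ReplaceAllMatches = True

      // Backslash is replaced with \\ (assumed order: first, so that the
      // backslashes inserted by the steps below are not escaped a second time)
      re.SearchPattern = "\\"
      re.ReplacementPattern = "\\\\"
      contents = re.Replace(contents)

      // Carriage return is replaced with \r
      re.SearchPattern = "\r"
      re.ReplacementPattern = "\\r"
      contents = re.Replace(contents)

      // Double quote is replaced with \"
      // (a Xojo string literal escapes a quote by doubling it, not with a backslash)
      re.SearchPattern = """"
      re.ReplacementPattern = "\\"""
      contents = re.Replace(contents)

      Return contents.DefineEncoding(Encodings.UTF8).ToText
    End Function
    ```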
