Example of Using RegEx with SubExpressions (Capture Groups)

I am very new to Xojo, but I am a long-time user and fan of RegEx. I have to say that I found the Xojo flavor of RegEx somewhat difficult to understand and use.
Most of this comes from the terms that Xojo has chosen to use.
If you have used other RegEx engines, then you know that the usual term for “SubExpressions” is Capture Groups.
Furthermore, returning the full match from SubExpressionString(0) is very confusing, since the full match is clearly NOT a subexpression.
One final note: Xojo RegEx is Case INSENSITIVE by default. This is highly unusual. In fact, I don’t remember seeing that in any other RegEx engine I have used.
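
To make the case-sensitivity point concrete, here is a minimal sketch of the switch in isolation. It assumes the CaseSensitive property on the RegEx.Options object (False by default); the pattern and the sample text are made up purely for illustration, and the results are written to the debug log rather than to a window control.

```xojo
// Minimal sketch: toggling case sensitivity on a Xojo RegEx.
// The pattern and source text are hypothetical examples.
Var re As New RegEx
re.SearchPattern = "hello"

// Default: CaseSensitive is False, so "Hello" should still match
Var m As RegExMatch = re.Search("Hello World")
If m <> Nil Then
  System.DebugLog("Insensitive match: " + m.SubExpressionString(0))
End If

// Turn case sensitivity on through the Options object
re.Options.CaseSensitive = True
m = re.Search("Hello World")
If m Is Nil Then
  System.DebugLog("No match once CaseSensitive is True")
End If
```

Prefixing the pattern with (?-i) should give the same effect for a single pattern without touching the Options object.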

Be that as it may, I finally figured out the basics of Xojo RegEx, and so I’d like to share a basic example that I hope will help others trying to learn Xojo RegEx.

This code reads the source text and the RegEx pattern from controls on the main window, and writes the results to a third control.
So you will need to change the control names I have used to match your own.
This code is in the Action() method of a PushButton on the window.

[h]Method Code to Process RegEx with SubExpressions (Capture Groups)[/h]

//~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
// Action() Method for runTestPB Control
// Ver: 2.0   2020-06-11    Author: JMichaelTX
//
//  REQUIRES These Controls on the Window:
//    • sourceTA
//    • regexPatternTF
//    • regexResultsTA
//
//  OPTIONAL App Property (a local Var LF is declared below in its place)
//    • LF As String = Encodings.UTF8.Chr(10)
//
//  RegEx Notes:
//    • Xojo RegEx is Case INSENSITIVE by default (unlike most other RegEx engines)
//       • You can make it Case SENSITIVE by prefixing the RegEx Pattern with:
//          (?-i)
//         OR by setting re.Options.CaseSensitive = True
//~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Var LF As String = Encodings.UTF8.Chr(10)  // in place of App.LF
Var TAB As String = Encodings.UTF8.Chr(9)

Var re As New RegEx
Var oMatch As RegExMatch
Var fullMatchStr As String
Var numSubExp As Integer
Var numCaptureGroups As Integer
Var cgStr As String  // Capture Group (aka SubExpression) results


//--- Get the Source String and RegEx Pattern from Window Controls ---
re.SearchPattern = regexPatternTF.Value
oMatch = re.Search(sourceTA.Value)  // returns Nil if NO MATCH

//--- LOOP THRU ALL (if any) MATCHES ---

while oMatch isA RegExMatch   // Starts only if there is at least one match
  
  numSubExp = oMatch.SubExpressionCount   // = fullMatch + Number of Capture Groups
  numCaptureGroups = numSubExp - 1
  
  fullMatchStr = oMatch.SubExpressionString(0)
  regexResultsTA.AddText(fullMatchStr + LF)    // ADD to Control
  
  //--- Get and Output Any/All Capture Groups ---
  For iCG As Integer = 1 to numCaptureGroups
    cgStr = oMatch.SubExpressionString(iCG)
    regexResultsTA.AddText(TAB + cgStr + LF)
  next
  
  oMatch = re.Search  // Get the NEXT Match (if any)
  
wend

It seems to work well on my Mac, but there could very well be bugs I have overlooked.

If anyone has any issues and/or suggestions, I would welcome them.

Hope this helps you guys.


One of my frustrations with Xojo RegEx is finding the right documentation page to use.
So I put together a OneTab list to help me find things:
https://www.one-tab.com/page/-HIV2dzhQiWSyg6kTmvTmA

I also found it better to do a Google site search than to search the Xojo documentation wiki.
For example:
https://www.google.com/search?q=site%3Adocumentation.xojo.com%20RegEx

I have a Keyboard Maestro Macro that automates this if anyone is interested.