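For our new PCRE2CodeMBS class, we added a new Matches method. It returns an iterator, so you can use it with a For Each loop in Xojo:

    Sub test()
      // kMarkerPattern, UseJIT, TestString and foundCount are assumed to be
      // defined elsewhere in the project (a constant and properties).
      Dim rx As New PCRE2CompilerMBS
      rx.CaseLess = True
      rx.DotAll = False
      rx.Ungreedy = False
      rx.NewLine = rx.kNewLineAnyCRLF
      rx.Multiline = True
      rx.Pattern = kMarkerPattern

      // Compile once; the resulting PCRE2CodeMBS object can be reused.
      Dim code As PCRE2CodeMBS = rx.Compile
      If UseJIT Then
        code.JITCompile(code.kJITComplete)
      End If

      // Count every match the iterator yields for TestString.
      For Each MatchData As PCRE2MatchDataMBS In code.Matches(TestString, 0)
        foundCount = foundCount + 1
      Next
    End Sub
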
One of the key elements to reach the best performance is to reuse the PCRE2CodeMBS objects and avoid compilation as much as you can. Second, avoid creating new PCRE2MatchDataMBS objects and reuse existing ones instead. The iterator does that for you by default.
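To illustrate the first point, here is a minimal sketch of compiling a pattern once and reusing the compiled object for many strings. The helper names MarkerCode and CountMarkers are hypothetical and only serve this example; kMarkerPattern is the constant from the snippet above, and the JIT call is optional. It uses only the calls already shown in this post.

```xojo
// Hypothetical helper: compiles the pattern on first use and caches the result.
Function MarkerCode() As PCRE2CodeMBS
  // A Static variable keeps the compiled object alive between calls.
  Static cached As PCRE2CodeMBS
  If cached Is Nil Then
    Dim rx As New PCRE2CompilerMBS
    rx.Pattern = kMarkerPattern
    cached = rx.Compile
    // Optional: enable the just-in-time compiler, as in the example above.
    cached.JITCompile(cached.kJITComplete)
  End If
  Return cached
End Function

// Hypothetical caller: counts matches in many strings with one compiled pattern.
Function CountMarkers(texts() As String) As Integer
  Dim code As PCRE2CodeMBS = MarkerCode
  Dim total As Integer
  For Each t As String In texts
    // The iterator takes care of the match data objects for us.
    For Each m As PCRE2MatchDataMBS In code.Matches(t, 0)
      total = total + 1
    Next
  Next
  Return total
End Function
```

With this layout the compilation cost is paid once, no matter how many strings are passed to CountMarkers.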
Let us show you the numbers from our benchmark:
Test | RegEx Xojo | RegEx MBS | PCRE2 | PCRE2+JIT | PCRE2 Iterator | PCRE2 Iterator+JIT |
---|---|---|---|---|---|---|
Match 500 in 1 MB | 171317 | 3478 | 3521 | 266 | 3231 | 277 |
Match 500 in 10 MB | 1711241 | 30627 | 30742 | 928 | 30542 | 1063 |
Match 5000 in 1 MB | 1642035 | 3234 | 3776 | 434 | 3660 | 478 |
Match 5000 in 10 MB | 16834198 | 30906 | 31200 | 1363 | 31157 | 1456 |
Tests were run with MBS Xojo Plugins 22.2 and Xojo 2022r2.1 on a current MacBook Pro. The Xojo value is averaged over 10 runs, while the MBS values are averaged over 100 runs. With turnaround times this short for the compiled code, precise measurement gets hard!
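If you want to take similar measurements in your own project, averaging over many runs could look roughly like the sketch below. This is not the harness used for the table above; the function name AverageMatchTime and its parameters are hypothetical, and it assumes runs is greater than zero.

```xojo
// Hypothetical timing helper: returns the average time per run in microseconds.
Function AverageMatchTime(code As PCRE2CodeMBS, subject As String, runs As Integer) As Double
  Dim found As Integer
  Dim startTime As Double = System.Microseconds
  For i As Integer = 1 To runs
    // Walk over all matches so the full search is timed.
    For Each m As PCRE2MatchDataMBS In code.Matches(subject, 0)
      found = found + 1
    Next
  Next
  Dim elapsed As Double = System.Microseconds - startTime
  Return elapsed / runs
End Function
```

Compiling the pattern before calling the function keeps the compilation time out of the measurement.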
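If you haven’t tried them yet, please give the PCRE2 classes with the just-in-time compiler a try. They are often 10 times faster than our already quick RegExMBS class; in the benchmark above, the PCRE2+JIT values are roughly 7 to 33 times lower than the RegEx MBS values.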
Please try the new classes if you have a chance!