Iterate with PCRE2

Our new PCRE2CodeMBS class gets a new Matches method. It returns an iterator, so you can walk through all matches with a For Each loop in Xojo:

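Sub test()
	// kMarkerPattern, TestString, UseJIT and foundCount are defined elsewhere in the project
	Dim rx As New PCRE2CompilerMBS
	rx.CaseLess = True
	rx.DotAll = False
	rx.Ungreedy = False
	rx.NewLine = rx.kNewLineAnyCRLF
	rx.Multiline = True
	rx.Pattern = kMarkerPattern
	
	// compile the pattern once; the PCRE2CodeMBS object can be reused
	Dim code As PCRE2CodeMBS = rx.Compile
	
	If UseJIT Then
		// request just-in-time compilation of the pattern here
	End If
	
	// the iterator hands us a PCRE2MatchDataMBS for each match in TestString
	For Each MatchData As PCRE2MatchDataMBS In code.Matches(TestString, 0)
		foundCount = foundCount + 1
	Next
End Sub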

One of the keys to the best performance is to reuse PCRE2CodeMBS objects and avoid recompiling patterns wherever you can. The second is to reuse PCRE2MatchDataMBS objects instead of creating new ones. The iterator does that for you by default.
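To illustrate the first point, here is a minimal sketch of compiling a pattern once and reusing the resulting PCRE2CodeMBS object on every call. The property mMarkerCode and the methods MarkerCode and CountMarkers are hypothetical names made up for this example; only PCRE2CompilerMBS, Compile, and Matches come from the code above.

```xojo
// Assumed module-level property (hypothetical): Private mMarkerCode As PCRE2CodeMBS

Function MarkerCode() As PCRE2CodeMBS
	// Compile only on first use; later calls return the cached object
	If mMarkerCode Is Nil Then
		Dim rx As New PCRE2CompilerMBS
		rx.Pattern = kMarkerPattern
		mMarkerCode = rx.Compile
	End If
	Return mMarkerCode
End Function

Function CountMarkers(subject As String) As Integer
	// Reuses the cached compiled pattern and lets the iterator manage the match data
	Dim count As Integer
	For Each md As PCRE2MatchDataMBS In MarkerCode().Matches(subject, 0)
		count = count + 1
	Next
	Return count
End Function
```

With this layout, calling CountMarkers many times pays the compilation cost only once.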

Let us show you the numbers from our benchmark:

| Test | RegEx Xojo | RegEx MBS | PCRE2 | PCRE2+JIT | PCRE2 Iterator | PCRE2 Iterator+JIT |
|---|---:|---:|---:|---:|---:|---:|
| Match 500 in 1 MB | 171317 | 3478 | 3521 | 266 | 3231 | 277 |
| Match 500 in 10 MB | 1711241 | 30627 | 30742 | 928 | 30542 | 1063 |
| Match 5000 in 1 MB | 1642035 | 3234 | 3776 | 434 | 3660 | 478 |
| Match 5000 in 10 MB | 16834198 | 30906 | 31200 | 1363 | 31157 | 1456 |

Tests were made with MBS Xojo Plugins 22.2 and Xojo 2022r2.1 on a current MacBook Pro. The Xojo value is an average over 10 runs, while the MBS values are averaged over 100 runs. With the compiled code finishing this quickly, the times get hard to measure!
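If you want to run a similar measurement yourself, averaging over many runs could look roughly like the sketch below. This is not the benchmark code used for the numbers above: the method name AverageMicroseconds is hypothetical, and the use of Xojo's System.Microseconds as the timer is an assumption, since the post does not say which timer or unit the benchmark used.

```xojo
Function AverageMicroseconds(code As PCRE2CodeMBS, subject As String, runs As Integer) As Double
	// Hypothetical helper: average time of one full pass over all matches
	Dim total As Double
	For i As Integer = 1 To runs
		Dim start As Double = System.Microseconds
		For Each md As PCRE2MatchDataMBS In code.Matches(subject, 0)
			// nothing to do here, we only time the iteration itself
		Next
		total = total + (System.Microseconds - start)
	Next
	Return total / runs
End Function
```

Passing in an already compiled PCRE2CodeMBS keeps the compilation time out of the measurement.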

If you haven’t tried them yet, please give the PCRE2 classes with the just-in-time compiler a try. They are often 10 times faster than our already quick RegExMBS class.

Please try the new classes if you have a chance!
