RegexMBS MatchLimitRecursion question

One user of my app complained that my app was slow. He sent me a process sample and I saw that the following code was causing the slowness:

dim theRegex as new RegEx
theRegex.Options.ReplaceAllMatches = True
theRegex.Options.Greedy = True
theRegex.SearchPattern = "(\r|\n)+(\s)+"
theRegex.ReplacementPattern = CarbonModule.EndOfLineMacintosh
theText = theRegex.Replace(theText)

Clever Xojo user that I am, I simply replaced the code with the MBS equivalent:

[code]dim Pattern as String = "(\r|\n)+(\s)+"

dim theRegex as new RegExMBS
theRegex.CompileOptionDotAll = True
theRegex.CompileOptionUngreedy = False
theRegex.CompileOptionNewLineAnyCRLF = True

if theRegex.Compile(Pattern) and theRegex.Study then
theText = theRegex.ReplaceAll(theText, CarbonModule.EndOfLineMacintosh)
end if[/code]

Which made a nice hard crash:

Thread 7 Crashed:
0 MBS_Tools_RegEx_Plugin_20105.dylib 0x000000010fae3077 match + 23
1 MBS_Tools_RegEx_Plugin_20105.dylib 0x000000010faeea1e match + 47550
2 MBS_Tools_RegEx_Plugin_20105.dylib 0x000000010faf10e4 match + 57476
3 MBS_Tools_RegEx_Plugin_20105.dylib 0x000000010faeea1e match + 47550
4 MBS_Tools_RegEx_Plugin_20105.dylib 0x000000010faf10e4 match + 57476
5 MBS_Tools_RegEx_Plugin_20105.dylib 0x000000010faeea1e match + 47550
6 MBS_Tools_RegEx_Plugin_20105.dylib 0x000000010faf10e4 match + 57476
7 MBS_Tools_RegEx_Plugin_20105.dylib 0x000000010faeea1e match + 47550
8 MBS_Tools_RegEx_Plugin_20105.dylib 0x000000010faf10e4 match + 57476
9 MBS_Tools_RegEx_Plugin_20105.dylib 0x000000010faeea1e match + 47550
10 MBS_Tools_RegEx_Plugin_20105.dylib 0x000000010faf10e4 match + 57476
11 MBS_Tools_RegEx_Plugin_20105.dylib 0x000000010faeea1e match + 47550
12 MBS_Tools_RegEx_Plugin_20105.dylib 0x000000010faf10e4 match + 57476
13 MBS_Tools_RegEx_Plugin_20105.dylib 0x000000010faeea1e match + 47550
14 MBS_Tools_RegEx_Plugin_20105.dylib 0x000000010faf10e4 match + 57476
15 MBS_Tools_RegEx_Plugin_20105.dylib 0x000000010faeea1e match + 47550
16 MBS_Tools_RegEx_Plugin_20105.dylib 0x000000010faf10e4 match + 57476
17 MBS_Tools_RegEx_Plugin_20105.dylib 0x000000010faeea1e match + 47550
18 MBS_Tools_RegEx_Plugin_20105.dylib 0x000000010faf10e4 match + 57476
19 MBS_Tools_RegEx_Plugin_20105.dylib 0x000000010faeea1e match + 47550
20 MBS_Tools_RegEx_Plugin_20105.dylib 0x000000010faf10e4 match + 57476

This looks like a recursion problem, so I had a look at the MBS docs. That led me to a property called MatchLimitRecursion. The clearest explanation I found was this one: https://serverfault.com/questions/408265/what-are-pcre-limits .

But I’m still not sure what value gives me a fast regex, a good result and no crash. The string that caused the crash has a lot of whitespace.

Well, you could increase the stack size for the thread.

See http://documentation.xojo.com/api/language/thread.html#thread-stacksize

That has been in place for a couple of years now:

me.StackSize = 1280000

This seems to work now:

dim Pattern as String = "(\r|\n)+\s+"

dim theRegex as new RegExMBS
theRegex.CompileOptionDotAll = True
theRegex.CompileOptionUngreedy = False
theRegex.CompileOptionNewLineAnyCRLF = True
theRegex.MatchLimitRecursion = 1500

if theRegex.Compile(Pattern) and theRegex.Study then
theText = theRegex.ReplaceAll(theText, CarbonModule.EndOfLineMacintosh)
end if

I removed the brackets around the \s. Without the brackets the recursion seems less deep.
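A possible further step in the same direction, offered as an untested sketch: the remaining group (\r|\n)+ can be written as the character class [\r\n]+, which matches exactly the same text. The assumption here is that RegExMBS is built on PCRE (the linked serverfault question and the repeated match frames in the crash log suggest so), where a repeated group usually costs recursion per repetition and a repeated character class usually does not. Only calls already used in this thread appear below.

```xojo
// Sketch only: same RegExMBS calls as in the post above, with the
// alternation group replaced by a character class. Not tested against
// the text that caused the crash.
dim Pattern as String = "[\r\n]+\s+"

dim theRegex as new RegExMBS
theRegex.CompileOptionDotAll = True
theRegex.CompileOptionUngreedy = False
theRegex.CompileOptionNewLineAnyCRLF = True
theRegex.MatchLimitRecursion = 1500

if theRegex.Compile(Pattern) and theRegex.Study then
  theText = theRegex.ReplaceAll(theText, CarbonModule.EndOfLineMacintosh)
end if
```

Whether this is actually faster or safer on whitespace-heavy input would need to be measured with the real text.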

1.2 MB for a stack is not much if you do a lot of recursions.

Well, the documentation says:

The nasty thing about the regex is that the recursion is kind of invisible. What value would be good for testing? 5 or 10 MB?

I made the stacksize 10 times larger and not everything works fine. Thanks!

not or now?

Thick fingers: now is correct.
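For anyone finding this later, the change that resolved it might look roughly like the sketch below. CleanupThread is a hypothetical Thread subclass (the thread only shows me.StackSize = 1280000, presumably set inside the thread itself), and the value is simply ten times that figure.

```xojo
// Sketch under assumptions: CleanupThread is a hypothetical Thread subclass
// whose Run event performs the regex replacement shown earlier.
dim worker as new CleanupThread

// Ten times the 1280000 bytes quoted above. This is the value reported
// to work in this thread, not a general recommendation.
worker.StackSize = 12800000

// Set StackSize before starting the thread.
// Older API: Run; newer Xojo versions name this method Start.
worker.Run
```

A larger stack only moves the limit, so keeping MatchLimitRecursion set as well seems prudent for unusually long runs of whitespace.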