Careful with that RegEx!

An interesting detail: Stack Exchange was down yesterday because someone posted a string with a long run of whitespace that sent their RegEx trim into what amounted to an endless loop:
http://stackstatus.net/post/147710624694/outage-postmortem-july-20-2016

I tried to access it several times yesterday.

Proof that even the biggest sites are not immune to bugs…

Interesting, and not wrong. According to RegExRX, a string with 20,000 spaces at the end takes 0.04 ms to match with \s+$, but put another character at the end of those spaces and it takes 2.6 seconds to fail. I can see how that would severely hamper a high-traffic site, even without their other issues.
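To try the same comparison outside RegExRX, here is a minimal Python sketch. The helper names (time_search, compare) are made up for illustration, the run length is shortened so it finishes quickly, and the timings will differ from the RegExRX figures above because the regex engine is not the same one.

```python
import re
import time

# The trailing-whitespace pattern discussed above.
TRAILING_WS = re.compile(r"\s+$")


def time_search(text):
    """Return (matched, seconds) for one search of TRAILING_WS in text."""
    start = time.perf_counter()
    matched = TRAILING_WS.search(text) is not None
    return matched, time.perf_counter() - start


def compare(n=5000):
    """Time the matching case and the failing case for a run of n spaces."""
    ok = " " * n         # whitespace runs to the end: matches on the first attempt
    bad = " " * n + "x"  # one extra character: every start position is tried and fails
    return time_search(ok), time_search(bad)


if __name__ == "__main__":
    (ok_hit, ok_t), (bad_hit, bad_t) = compare()
    print(f"match: {ok_hit} in {ok_t * 1000:.3f} ms")
    print(f"fail:  {bad_hit} in {bad_t * 1000:.3f} ms")
```

Raising n should make the failing case grow much faster than the matching one, since the engine retries from each space in the run and backs off one character at a time before giving up.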

By the way, in PCRE and some other regex flavors, the problem could have been solved by making the match atomic or, in this case, possessive. Either of these reduces the fail time to about 325 ms.

(?>\s+)$
\s++$

“Atomic” tells the engine that a subsequent failure will not allow the engine to backtrack into what was matched within the atomic group. That doesn’t seem to be an option in JavaScript, however.
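For engines without atomic groups, one commonly used workaround is to emulate them with a lookahead plus a backreference. To be clear, this is a swapped-in technique and not the (?>...) syntax itself: the lookahead captures the whole run, and the backreference then consumes it as one piece that cannot be given back. A rough Python sketch follows; the function names are hypothetical. Python 3.11 and later also accept the native atomic and possessive syntax, and the emulation is used here only so the sketch runs on older versions too.

```python
import re

# Ordinary greedy version: a+ can give characters back to later parts of the pattern.
GREEDY_A = re.compile(r"a+a")
# Emulated atomic version: the lookahead captures the full run of a's,
# and \1 consumes that exact text with no way to shorten it afterwards.
ATOMIC_A = re.compile(r"(?=(a+))\1a")

# The same idea applied to the trailing-whitespace pattern.
ATOMIC_TRAILING_WS = re.compile(r"(?=(\s+))\1$")


def strip_trailing(text):
    """Remove trailing whitespace using the emulated atomic pattern."""
    return ATOMIC_TRAILING_WS.sub("", text)


def shows_atomic_behaviour():
    """True if the greedy pattern matches 'aaa' but the emulated atomic one does not."""
    return GREEDY_A.search("aaa") is not None and ATOMIC_A.search("aaa") is None


if __name__ == "__main__":
    print(repr(strip_trailing("abc   ")))
    print(shows_atomic_behaviour())
```

The "aaa" check is only there to make the no-backtracking behaviour visible: the greedy pattern gives back one "a" to satisfy the final "a", while the emulated atomic one cannot and so fails.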

For more information, see

http://www.regular-expressions.info/atomic.html

I dimly remember making my app crash with a regex where I had added parentheses - these here () - for clarity. I got an explanation of why this made a difference, which I didn’t understand.

What he said!