I'm getting a really weird hang when running a regex. The regex itself is relatively simple:
\b(https?://|www\.)([^<>\s]+)
It is supposed to turn URLs into links. I have a really old email (ironically from @Kem_Tekinay ) where the second match starts before the search start position that I set after the first match. The result is an infinite loop: the same text keeps matching, so the loop never finishes.
The code isn’t really new. Does anyone see what I’m doing wrong?
'make urls to links
Dim theRegex As New RegExMBS
theRegex.CompileOptionCaseLess = True
theRegex.CompileOptionDotAll = True
theRegex.CompileOptionUngreedy = False
theRegex.CompileOptionNewLineAny = True
Dim searchString As String = "\b(https?://|www\.)([^<>\s]+)"
If theRegex.Compile(searchString) Then
  Dim searchStart As Integer
  Dim punctuation As String
  Dim protocol As String
  Dim url As String
  Dim replacement As String
  While theRegex.Execute(theText, searchStart) > 0
    ' Get match offsets
    Dim matchStart As Integer = theRegex.OffsetCharacters(0)
    Dim matchEnd As Integer = theRegex.OffsetCharacters(1)
    ' Extract submatches using offsets
    Dim protocolStart As Integer = theRegex.OffsetCharacters(2)
    Dim protocolEnd As Integer = theRegex.OffsetCharacters(3)
    protocol = theText.Middle(protocolStart, protocolEnd - protocolStart)
    Dim urlStart As Integer = theRegex.OffsetCharacters(4)
    Dim urlEnd As Integer = theRegex.OffsetCharacters(5)
    url = theText.Middle(urlStart, urlEnd - urlStart)
    ' Remove punctuation at the end of the URL
    Select Case url.Right(1)
    Case ")"
      url = url.TrimRight(")")
      punctuation = ")"
    Case "."
      url = url.TrimRight(".")
      punctuation = "."
    Case ","
      url = url.TrimRight(",")
      punctuation = ","
    Case "?"
      url = url.TrimRight("?")
      punctuation = "?"
    Else
      punctuation = ""
    End Select
    ' Add https and change http to https
    If protocol = "www." Then
      protocol = "https://www."
    ElseIf protocol = "http://" Then
      protocol = "https://"
    End If
    ' Create the replacement string
    replacement = "<a href=""" + protocol + url + """>" + protocol + url + "</a>" + punctuation
    ' Replace the match in the text
    theText = theText.Left(matchStart) + replacement + theText.Middle(matchEnd)
    ' Adjust search start position for the next match
    searchStart = matchStart + replacement.Length
  Wend
End If
beep
Example:
regex hang.xojo_binary_project.zip (7.0 KB)