Let’s say you have a RegEx pattern like “*** My stars ***” but you don’t want the *'s to be wildcards, you want them to be literal. How do you escape them? Better yet, how do you escape ALL characters?
Put a backslash in front of it, like \*, and then it’s escaped.
As Marc says, you escape them with a backslash. You don’t escape ALL characters, you escape each character.
That becomes a problem when there’s a RegEx special modifier. For example, what if you wanted to escape “b”? There’s already a \b that means something special (a word boundary).
Would this be a good solution? Or, is there a better way to do this?
Dim xRegEx As New RegEx
Dim xMatch As RegExMatch
// Split the pattern into individual characters
Dim sPatternBits() As String = Split(this_bit, "")
Dim m, mEnd As Integer
mEnd = UBound(sPatternBits)
For m = 0 To mEnd
  If sPatternBits(m) = "*" Then
    // Wildcard: match any run of characters
    sPatternBits(m) = ".*"
  Else
    // Anything else: swap in its octal escape so it is taken literally
    sPatternBits(m) = "\" + Oct(Asc(sPatternBits(m)))
  End If
Next
xRegEx.Options.CaseSensitive = False
xRegEx.Options.TreatTargetAsOneLine = False
xRegEx.Options.StringBeginIsLineBegin = True
xRegEx.Options.StringEndIsLineEnd = True
xRegEx.Options.MatchEmpty = False
// Use the escaped characters built above, not the raw pattern
xRegEx.SearchPattern = Join(sPatternBits, "")
xMatch = xRegEx.Search(FThisItem.Name)
If xMatch = Nil Or xMatch.SubExpressionCount = 0 Then Exit For i // Haven't handled partial wildcard matches yet
Why would you want to escape a b? A “b” is already a b – it doesn’t need to be escaped. You only need to escape RegEx-specific characters, such as ., *, and \ (which you can do with a \).
Unless I’m missing something?
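As a rough sketch of that idea, a helper could put a backslash in front of only the characters PCRE treats as special and leave letters alone. The function name and the exact metacharacter list are assumptions made for illustration; nothing like this appears in the thread.

```xojo
// Hypothetical helper: escapes only RegEx metacharacters, never letters or digits.
Function EscapeMetaChars(s As String) As String
  // Assumed metacharacter set for PCRE outside a character class
  Dim meta As String = "\^$.|?*+()[]{}"
  Dim chars() As String = Split(s, "")
  For i As Integer = 0 To UBound(chars)
    If InStr(meta, chars(i)) > 0 Then
      chars(i) = "\" + chars(i)
    End If
  Next
  Return Join(chars, "")
End Function
```

Because letters are never prefixed, a plain “b” can’t accidentally turn into \b. Note that this treats every * as literal, so it would not handle the wildcard case on its own.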
I’m matching file names on the user’s computer, which could be anything.
Wouldn’t you just use a .+ then?
Not really, because the escape character is not supposed to be used to escape alphabetic characters (although it does work when the letter in question is not part of an escape sequence).
My understanding is that Xojo uses PCRE. You don’t escape low-ascii characters in regex. Alternatively, I believe you can do \Q…\E where everything between \Q and \E will be taken literally.
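A minimal sketch of the \Q…\E idea, assuming Xojo’s RegEx passes the pattern to PCRE unchanged. The target string is made up for the example.

```xojo
Dim rx As New RegEx
// Everything between \Q and \E is matched literally, asterisks included
rx.SearchPattern = "\Q*** My stars ***\E"

Dim m As RegExMatch = rx.Search("Title: *** My stars *** (draft)")
If m <> Nil Then
  // SubExpressionString(0) holds the whole match
  MsgBox(m.SubExpressionString(0))
End If
```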
I’m not matching everything. I’m matching the potential for characters.
For example, I want to have a query like:
*Caches*db*
But, with a bit more flexibility. I want to check the user’s home path so like:
/Users/This is my super star (***) home folder/*/Library/Caches/*.db
Now you see those ***'s shouldn’t be matched dynamically but the other ones should be.
I think I’ve figured it out… does anyone notice any issues?
Function MatchesWildcardKSW(Extends search_this As String, the_pattern As String) As Boolean
  Dim sPatternBits() As String = Split(the_pattern, "")
  Dim xRegEx As New RegEx
  Dim xMatch As RegExMatch
  Dim m, mEnd As Integer
  mEnd = UBound(sPatternBits)
  For m = 0 To mEnd
    If sPatternBits(m) = "*" Then
      sPatternBits(m) = ".*"
    Else
      sPatternBits(m) = "\" + Oct(Asc(sPatternBits(m)))
    End If
  Next
  xRegEx.Options.CaseSensitive = False
  xRegEx.Options.TreatTargetAsOneLine = False
  xRegEx.Options.StringBeginIsLineBegin = True
  xRegEx.Options.StringEndIsLineEnd = True
  xRegEx.Options.MatchEmpty = False
  xRegEx.SearchPattern = Join(sPatternBits, "")
  xMatch = xRegEx.Search(search_this)
  If xMatch = Nil Or xMatch.SubExpressionCount = 0 Then Return False
  Return True
End Function
It probably won’t be an issue, but if you get into Unicode characters with values > &o777, this code won’t work. As an alternative, you can use the hex code with the token \x{XXX}.
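A sketch of how the escaping loop might look with hex tokens instead of octal ones. The function name is hypothetical, and it assumes the pattern string is UTF-8, so that Asc returns the Unicode code point, and that the RegEx engine accepts \x{…} for those values.

```xojo
// Hypothetical variant of the loop in MatchesWildcardKSW using \x{...} tokens
Function WildcardToHexPattern(the_pattern As String) As String
  Dim bits() As String = Split(the_pattern, "")
  For i As Integer = 0 To UBound(bits)
    If bits(i) = "*" Then
      // Wildcard: match any run of characters
      bits(i) = ".*"
    Else
      // Literal: code point in hexadecimal (assumes a UTF-8 string)
      bits(i) = "\x{" + Hex(Asc(bits(i))) + "}"
    End If
  Next
  Return Join(bits, "")
End Function
```

The result would be assigned to SearchPattern in place of Join(sPatternBits, "").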
Also as an alternative:
the_pattern = the_pattern.ReplaceAllB( "\E", "\\EE\Q" )
the_pattern = the_pattern.ReplaceAllB( "*", "\E.*\Q" )
the_pattern = "\Q" + the_pattern + "\E"
So a pattern like “/this/path/*/to/n*thing” would become “\Q/this/path/\E.*\Q/to/n\E.*\Qthing\E”, and that will work.
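Wrapped into a helper, the \Q…\E approach might look like the sketch below. The function name and the sample target path are assumptions for illustration. Xojo string literals have no backslash escapes, so each backslash in the code is a single literal character.

```xojo
// Hypothetical helper: only "*" is special; everything else sits inside \Q...\E
Function WildcardToQuotedPattern(the_pattern As String) As String
  Dim p As String = the_pattern
  // A literal "\E" in the input would end the quoted run early.
  // Inside the quote, "\\EE\Q" reads as: literal "\", then "\E" (quote ends),
  // then a literal "E", then "\Q" (quote resumes).
  p = p.ReplaceAllB("\E", "\\EE\Q")
  // Each wildcard closes the quote, matches anything, and reopens the quote
  p = p.ReplaceAllB("*", "\E.*\Q")
  Return "\Q" + p + "\E"
End Function

// Usage
Dim rx As New RegEx
rx.Options.CaseSensitive = False
rx.SearchPattern = WildcardToQuotedPattern("/this/path/*/to/n*thing")
Dim m As RegExMatch = rx.Search("/this/path/over/to/nothing")
Dim found As Boolean = (m <> Nil)
```

The order of the two replacements matters: the \E fix-up has to run first, because the wildcard replacement itself inserts \E and \Q sequences that must not be rewritten.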