again …
I need to be able to find a sequence of digits from 1 to N in length that are bounded on either side by ANYTHING except A-Z,a-z
and may or may not begin or end a line
Can you post some real text examples?
Off the top of my head…
[^A-Za-z][0-9]+[^A-Za-z]
Dave you want something like?:
[code] // match 109
a109b // no match[/code]
How about this:
a109%
#109b
Greg has the right idea but that will match the character before and after the digits too, and won’t match beginning or end of document, so I’d use lookarounds.
(?i)(?<![a-z\d])\d+(?![a-z])
The lookbehind makes sure the previous character is not a letter or a digit, and the lookahead makes sure it’s not a letter.
Why a digit in the lookbehind? To make sure it doesn’t start matching in the middle of a stream of digits. In the string “a1234”, “1” will not match because it is preceded by “a”, but, without the \d in the lookbehind, “234” would match because it is preceded by “1”.
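To see the difference concretely, here is a small sketch. It is in Python rather than Xojo only so that it can be run anywhere; for ASCII input, Python's `re` module should treat these particular lookarounds the same way as Xojo's PCRE-based engine, but that is an assumption worth re-checking in your own project. The helper name `matches` and the sample strings `x (109) y` and `42` are made up for illustration.

```python
import re

# Greg's character-class pattern, plus the lookaround pattern
# with and without \d inside the lookbehind.
consuming = re.compile(r"[^A-Za-z][0-9]+[^A-Za-z]")
no_digit_guard = re.compile(r"(?i)(?<![a-z])\d+(?![a-z])")
with_digit_guard = re.compile(r"(?i)(?<![a-z\d])\d+(?![a-z])")


def matches(pattern, text):
    """Return the full text of every match of pattern in text."""
    return [m.group(0) for m in pattern.finditer(text)]


# The character classes consume the neighbours, so they end up in the match.
print(matches(consuming, "x (109) y"))     # ['(109)']
# A short number at the start/end of the text has no neighbour to consume.
print(matches(consuming, "42"))            # []
# Without \d in the lookbehind, the match starts mid-number.
print(matches(no_digit_guard, "a1234"))    # ['234']
# With \d in the lookbehind, nothing in "a1234" matches.
print(matches(with_digit_guard, "a1234"))  # []
# Lookarounds consume nothing and are satisfied at the text boundaries.
print(matches(with_digit_guard, "42"))     # ['42']
```

Because lookarounds are zero-width, the match contains only the digits, which is what you want when the match range is used for highlighting.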
Kem, I used a regex checker online and get:
109a // match 10
I don’t know if Dave will have something like that and if he expects to get no-match, 109 or 10 as a match.
So I guess, depending what Dave wants, maybe add \d to the negative lookahead.
[quote=413876:@Alberto De Poo]Dave you want something like?:
[code] // match 109 -
a109b // no match[/code]
How about this:
a109%
#109b
[/quote]
First example: yes, but excluding the (). Other examples: no, because a LETTER is a prefix or postfix.
Then:
(?i)(?<![a-z\d])\d+(?![a-z\d])
I guess
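The reason "109a" gives "10" with the first pattern is backtracking: `\d+` first grabs "109", the lookahead rejects it because "a" follows, so the engine gives back one digit and tries "10", which is now followed by "9" (not a letter) and succeeds. Adding `\d` to the lookahead closes that escape route. A quick sketch of both patterns against the examples from this thread, again in Python as a stand-in for Xojo (an assumption that the engines agree on this ASCII input); the names `letters_only_ahead`, `letters_digits_ahead` and `find_all` are invented here:

```python
import re

# Kem's original pattern, and the version with \d added to the lookahead.
letters_only_ahead = re.compile(r"(?i)(?<![a-z\d])\d+(?![a-z])")
letters_digits_ahead = re.compile(r"(?i)(?<![a-z\d])\d+(?![a-z\d])")


def find_all(pattern, text):
    """Return every whole match of pattern in text."""
    return pattern.findall(text)


for sample in ["109a", "a109b", "a109%", "#109b", "(109)"]:
    print(sample,
          find_all(letters_only_ahead, sample),
          find_all(letters_digits_ahead, sample))
# 109a ['10'] []
# a109b [] []
# a109% [] []
# #109b ['10'] []
# (109) ['109'] ['109']
```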
almost works … yeah I guess I should have said “BOTH sides” not “EITHER”
123 a456 789a
the 123 are highlighted (which is right)
the 456 are not (which is right)
but 78 are highlighted, and 9 is not (I wanted none to be highlighted)
on a side but related note.
I am modifying some code that Jim McKay wrote a few years back…
and all the existing highlighting RegEx uses SUBEXPRESSION(1)
Try
  If group.Words.Ubound=-1 Then Continue
  r= New RegEx
  r.Options.TreatTargetAsOneLine=True
  r.Options.CaseSensitive=False
  r.Options.MatchEmpty=True
  r.SearchPattern="(?<!\B)("+Join(group.Words,"|")+")\b"
  m=r.Search(theText,startChar)
  While m<>Nil
    Dim characterPosition As Integer = st.Text.LeftB(m.SubExpressionStartB(1)).Len
    st.TextColor(characterPosition,m.SubExpressionString(1).Len)=group.HighlightColor
    m=r.Search
  Wend
Catch
  System.DebugLog("exception occurred. match string:"+r.SearchPattern)
End Try
but for this I needed to use “0”?
r= New RegEx
r.Options.TreatTargetAsOneLine=True
r.Options.CaseSensitive=False
r.Options.MatchEmpty=True
r.SearchPattern="(?i)(?<![a-z\d])\d+(?![a-z])"
m=r.Search(theText,startChar)
While m<>Nil
  Dim characterPosition As Integer = st.Text.LeftB(m.SubExpressionStartB(0)).Len
  st.TextColor(characterPosition,m.SubExpressionString(0).Len)=color.orange
  m=r.Search
Wend
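On the "0" versus "1" question: subexpression 0 is always the whole match, and subexpression 1 only exists if the pattern contains a capturing group. The word-list pattern wraps the alternatives in `( … )`, so index 1 is available there; the digit pattern has only lookarounds, which do not capture, so only index 0 exists. Wrapping `\d+` in parentheses would let the existing index-1 code be reused. A minimal sketch of that idea in Python (a stand-in for Xojo's `SubExpressionString`/`SubExpressionStartB`, and assuming the two engines number groups the same way); the variable names are invented:

```python
import re

text = "123 a456 789a"

# No parentheses around \d+: the only subexpression is the whole match (0).
whole_only = re.compile(r"(?i)(?<![a-z\d])\d+(?![a-z\d])")
# Same pattern with \d+ in a capturing group, so index 1 exists as well.
with_group = re.compile(r"(?i)(?<![a-z\d])(\d+)(?![a-z\d])")

print(whole_only.groups)  # 0 capturing groups
print(with_group.groups)  # 1 capturing group

m0 = whole_only.search(text)
m1 = with_group.search(text)

print(m0.group(0), m0.start(0))  # 123 0
print(m1.group(1), m1.start(1))  # 123 0

# Asking the group-less pattern for index 1 is an error in Python.
try:
    m0.group(1)
    group_one_exists = True
except IndexError:
    group_one_exists = False
print(group_one_exists)  # False
```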
Also, how “expensive” is “r=New RegEx” plus all the property assignments?
Would it be faster to create an array of RegEx objects ahead of time and just reuse them?
It’s a finite amount (a dozen or less)… but it seems that creating them literally hundreds or thousands of times…
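For what it's worth, the "build them once, reuse them" idea could look roughly like the sketch below. It is in Python, not Xojo, and it measures nothing: whether it is actually faster in Xojo would have to be timed in the real project. Only the digit pattern comes from this thread; the `keywords` pattern and the names `PATTERN_SOURCES`, `COMPILED` and `highlight_spans` are hypothetical.

```python
import re

# Hypothetical highlight groups: name -> pattern text.
# Only "numbers" is from the thread; "keywords" is invented for illustration.
PATTERN_SOURCES = {
    "numbers": r"(?<![a-z\d])\d+(?![a-z\d])",
    "keywords": r"\b(?:if|then|else)\b",
}

# Build every regex object once, up front, with its options already applied.
COMPILED = {
    name: re.compile(source, re.IGNORECASE)
    for name, source in PATTERN_SOURCES.items()
}


def highlight_spans(text):
    """Return (group name, start, length) per match, reusing the prebuilt objects."""
    spans = []
    for name, regex in COMPILED.items():
        for m in regex.finditer(text):
            spans.append((name, m.start(), len(m.group(0))))
    return spans


print(highlight_spans("if x then 42 else a7"))
# [('numbers', 10, 2), ('keywords', 0, 2), ('keywords', 5, 4), ('keywords', 13, 4)]
```

The highlighting routine can then be called as often as needed without constructing or configuring any regex objects inside the loop.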
Use Albert’s modified pattern above.
works much better
what about my other question?