RegEx help

again …

I need to be able to find a sequence of digits from 1 to N in length that are bounded on either side by ANYTHING except A-Z,a-z
and may or may not begin or end a line

Some real text examples?

Off the top of my head…

[^A-Za-z][0-9]+[^A-Za-z]

Dave, do you want something like this?

[code]
(109) // match 109
a109b // no match
[/code]

How about this:

a109% #109b

Greg has the right idea but that will match the character before and after the digits too, and won’t match beginning or end of document, so I’d use lookarounds.

(?i)(?<![a-z\d])\d+(?![a-z])

The lookbehind makes sure the previous character is not a letter or a digit, and the lookahead makes sure it’s not a letter.

Why a digit in the lookbehind? To make sure it doesn’t start matching in the middle of a stream of digits. In the string “a1234”, “1” will not match because it is preceded by “a”, but, without the \d in the lookbehind, “234” would match because it is preceded by “1”.
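
Both points can be checked outside Xojo. The sketch below uses Python's re module as a stand-in, on the assumption that it handles these particular lookarounds the same way Xojo's PCRE-based RegEx class does. The variable names and sample strings are made up for illustration.

```python
import re

# Greg's pattern: the neighbouring characters are consumed as part of the match.
consuming = re.compile(r"[^A-Za-z][0-9]+[^A-Za-z]")
# Lookaround version WITHOUT \d in the lookbehind (for comparison only).
no_digit_guard = re.compile(r"(?i)(?<![a-z])\d+(?![a-z])")
# Kem's pattern: the lookbehind rejects a preceding letter or digit.
kem = re.compile(r"(?i)(?<![a-z\d])\d+(?![a-z])")

print(consuming.findall("(109)"))       # includes the parentheses
print(kem.findall("(109)"))             # digits only
print(consuming.findall("7"))           # needs a character on each side, so nothing
print(kem.findall("7"))                 # works at the start/end of the text
print(no_digit_guard.findall("a1234"))  # starts matching mid-run
print(kem.findall("a1234"))             # no match at all
```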

Kem, I used a regex checker online and get:

109a // match 10

I don’t know if Dave will have something like that and if he expects to get no-match, 109 or 10 as a match.

So I guess, depending what Dave wants, maybe add \d to the negative lookahead.
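
The “10” comes from backtracking: \d+ first grabs “109”, the lookahead rejects it because “a” follows, so the engine gives back one digit and tries “10”, which is followed by “9” (not a letter) and is accepted. A small Python check of that behaviour, again assuming Python's re behaves like the engine Xojo uses for these constructs:

```python
import re

# Kem's pattern: the lookahead only rejects a following letter.
kem = re.compile(r"(?i)(?<![a-z\d])\d+(?![a-z])")
# Same pattern with \d added to the lookahead, so the match cannot
# stop in the middle of a run of digits.
strict = re.compile(r"(?i)(?<![a-z\d])\d+(?![a-z\d])")

print(kem.findall("109a"))     # backtracks to a shorter match
print(strict.findall("109a"))  # no match
print(strict.findall("109%"))  # still matches when no letter follows
```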

[quote=413876:@Alberto De Poo]Dave you want something like?:

[code]
(109) // match 109
a109b // no match
[/code]

How about this:

a109% #109b[/quote]
First example: yes, but excluding the (). Other examples: no, because a LETTER is a prefix or postfix.

Then:

(?i)(?<![a-z\d])\d+(?![a-z\d])

I guess

Almost works :) … yeah, I guess I should have said “BOTH sides”, not “EITHER”.

123 a456 789a

the 123 are highlighted (which is right)
the 456 are not (which is right)
but 78 are highlighted, and 9 is not (I wanted none to be highlighted)
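
For reference, here is the same sample run through both lookahead variants from earlier in the thread, reporting the start offset and length a highlighter would need. This is a Python sketch, not the Xojo code, and highlight_spans is a hypothetical helper name.

```python
import re

def highlight_spans(pattern, text):
    """Return (start, length, matched_text) for every match in text."""
    return [(m.start(), len(m.group(0)), m.group(0))
            for m in re.finditer(pattern, text)]

sample = "123 a456 789a"
loose = r"(?i)(?<![a-z\d])\d+(?![a-z])"     # lookahead rejects letters only
strict = r"(?i)(?<![a-z\d])\d+(?![a-z\d])"  # lookahead rejects letters and digits

print(highlight_spans(loose, sample))   # "123" plus the unwanted "78"
print(highlight_spans(strict, sample))  # only "123"
```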

On a side but related note:
I am modifying some code that Jim McKay wrote a few years back,
and all the existing highlighting RegEx code uses SUBEXPRESSION(1).

Try
   If group.Words.Ubound=-1 Then Continue
   r= New RegEx
   r.Options.TreatTargetAsOneLine=True
   r.Options.CaseSensitive=False
   r.Options.MatchEmpty=True
   r.SearchPattern="(?<!\B)("+Join(group.Words,"|")+")\b"
						
   m=r.Search(theText,startChar)
   While m<>Nil
	Dim characterPosition As Integer = st.Text.LeftB(m.SubExpressionStartB(1)).Len
	st.TextColor(characterPosition,m.SubExpressionString(1).Len)=group.HighlightColor
	m=r.Search
   Wend
Catch
   System.DebugLog("exception occurred. match string:"+r.SearchPattern)
End Try

but for this I needed to use “0”?

r= New RegEx
r.Options.TreatTargetAsOneLine=True
r.Options.CaseSensitive=False
r.Options.MatchEmpty=True
r.SearchPattern="(?i)(?<![a-z\d])\d+(?![a-z])"
m=r.Search(theText,startChar)
		
While m<>Nil
	Dim characterPosition As Integer = st.Text.LeftB(m.SubExpressionStartB(0)).Len
	st.TextColor(characterPosition,m.SubExpressionString(0).Len)=color.orange
	m=r.Search
Wend
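
On the “0” question: subexpression 0 is the whole match, and subexpression 1 is the first parenthesized capture group. The word-list pattern wraps its alternatives in ( ), so 1 exists there; the digit pattern has no capturing group, so only 0 is available unless \d+ is wrapped in parentheses. A minimal Python sketch of that numbering, assuming Xojo's SubExpressionString follows the same convention (the word list is hypothetical):

```python
import re

# Pattern with one capturing group: group 0 is the whole match,
# group 1 is whatever the parentheses captured.
words = re.search(r"\b(foo|bar)\b", "x foo y")
print(words.group(0), words.group(1))

# The digit pattern has lookarounds but no capturing group.
digits = re.search(r"(?i)(?<![a-z\d])\d+(?![a-z\d])", "abc 123 def")
print(digits.group(0))
try:
    digits.group(1)
except IndexError as err:
    print("no group 1:", err)

# Wrapping \d+ in parentheses makes group 1 available again.
wrapped = re.search(r"(?i)(?<![a-z\d])(\d+)(?![a-z\d])", "abc 123 def")
print(wrapped.group(1))
```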

Also, how “expensive” is “r = New RegEx” and all the property assignments?
Would it be faster to create an array of RegEx objects ahead of time and just reuse them?
It's a finite number (a dozen or fewer), but creating them literally hundreds or thousands of times seems wasteful.
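
The build-once-and-reuse idea looks roughly like this. It is a Python sketch only: the group names, the keyword pattern, and the get_regex and highlight helpers are hypothetical, and whether it is measurably faster in Xojo would need profiling there.

```python
import re

# Hypothetical highlight groups: name -> pattern text.
PATTERNS = {
    "numbers": r"(?<![a-z\d])\d+(?![a-z\d])",
    "keywords": r"\b(?:if|then|wend)\b",
}

_compiled = {}  # filled lazily, one entry per group

def get_regex(name):
    """Compile a pattern on first use and return the same object afterwards."""
    if name not in _compiled:
        _compiled[name] = re.compile(PATTERNS[name], re.IGNORECASE)
    return _compiled[name]

def highlight(name, text):
    """Return (start, end) offsets of every match for the named group."""
    return [(m.start(), m.end()) for m in get_regex(name).finditer(text)]

print(highlight("numbers", "123 a456 789a"))
print(highlight("keywords", "If x Then"))
```

With a dozen or fewer patterns, a fixed table like this keeps the setup cost to one construction per pattern, however many times the highlighter runs.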

Use Alberto’s modified pattern above.

works much better

What about my other question? :)