Dim re As New RegEx
Dim rm As RegExMatch
re.SearchPattern="^[a-zA-Z][a-zA-Z_0-9]\w*$"
rm=re.search(keyword)
If rm<>Nil Then Return tokenSYMBOL
Return tokenUNKNOWN
I need to validate that the variable “keyword” contains a valid symbol name
(i.e. it starts with A-Z, in either case, and is optionally followed by A-Z, 0-9 or _).
What I am missing is the “optionally” part… this fails if keyword is only a single character.
You don’t need the second character class since \w will match all of those characters anyway, and having it there makes a second character required. Try this:
^[a-z]\w*$
(Unless you change the options, the patterns are case-insensitive.)
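To see why the extra character class makes a second character mandatory, here is a minimal sketch in Python rather than Xojo (the pattern itself is meant to be portable). The helper name `is_symbol` is made up for illustration, and `re.IGNORECASE` plus `re.ASCII` are assumptions chosen to approximate the case-insensitive, ASCII-only `\w` behaviour described above.

```python
import re

# Assumption: IGNORECASE mimics the case-insensitive default mentioned above;
# ASCII keeps \w limited to [A-Za-z0-9_].
FLAGS = re.IGNORECASE | re.ASCII

# Original pattern: letter, then ONE required character, then \w*.
original = re.compile(r"^[a-zA-Z][a-zA-Z_0-9]\w*$", FLAGS)

# Suggested pattern: letter, then zero or more word characters.
suggested = re.compile(r"^[a-z]\w*$", FLAGS)


def is_symbol(pattern, keyword):
    """Hypothetical helper: True if the whole keyword matches the pattern."""
    return pattern.fullmatch(keyword) is not None


print(is_symbol(original, "x"))    # False: the second class needs a character
print(is_symbol(original, "x3"))   # True
print(is_symbol(suggested, "x"))   # True
print(is_symbol(suggested, "3x"))  # False: must start with a letter
```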
Ok… forgetting about the case sensitive part… doesn’t that say the string must ONLY be the characters a-z?
I need the FIRST character to be A-Z
and the 2nd to N characters to be A-Z, 0-9 or _ , but ONLY if the string > 1 character long
if it is only 1 character long then it must be A-Z
basically a string that matches the definition of a variable name, for the most part
x is valid
x3 is valid
x_3_4z is valid
3x is not valid
I see nothing in your pattern that checks for digits ONLY after the 1st character… now I’m not saying you are wrong… I’m trying to see what I am missing
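On the "where are the digits checked" question: `[a-z]` matches exactly one character, and `\w` is shorthand for letters, digits and underscore, so digits can only appear from the second position on. A rough Python check of the four examples above, comparing the shorthand against a spelled-out class (the names `explicit`, `shorthand` and `check` are just for illustration, and the flags are an assumption to keep `\w` ASCII-only and case-insensitive):

```python
import re

FLAGS = re.IGNORECASE | re.ASCII

# Spelled-out version: one letter, then any number of letters, digits or "_".
explicit = re.compile(r"[a-z][a-z0-9_]*", FLAGS)

# Shorthand version: \w stands for the same set [A-Za-z0-9_] under re.ASCII.
shorthand = re.compile(r"[a-z]\w*", FLAGS)

# The examples from the post and whether each should be accepted.
samples = {"x": True, "x3": True, "x_3_4z": True, "3x": False}


def check(pattern, text):
    """Hypothetical helper: True if the entire text matches."""
    return pattern.fullmatch(text) is not None


results = {name: (check(explicit, name), check(shorthand, name)) for name in samples}
for name, (a, b) in results.items():
    print(name, a, b)
```

Both patterns should agree on every sample, which is the point: the second character class in the original was redundant with `\w`.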
As for the second and third bullets, are we talking about a single character, or multiple? And the character following the period, are we talking any non-space character?
This RegEx needs to be “generic”: I have two projects to use it in, one in Xojo, one not, so I’d rather not use something like RegExRX
Starts with a Letter (case is not important)
followed by 0 to n characters that must be [A-Z] (any case) , or [0-9], or “_”, or “.”
and in the case of the “.”, only ONE is allowed, and must be followed by at least one other character
$abc.3 is not valid (no $ allowed)
9.abc is not valid (must start with letter)
A. is not valid (period must have at least one [A-Z,0-9] following it)
A..B is not valid (contains multiple ".")
as to “any non-space” character, no, not “any”… the entire string must be A-Z a-z 0-9 . or _ characters only… nothing else
no spaces, or non alphanumeric except “.” or “_”
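One pattern that seems to fit these rules is `^[a-z]\w*(\.\w+)?$`: a letter, any number of word characters, then at most one period that must be followed by at least one word character. This is only a sketch, checked here in Python; the flags and the `is_valid` helper are assumptions for illustration. Note that `\w+` after the period would also accept an underscore there, which may or may not be wanted.

```python
import re

# Assumption: case-insensitive, ASCII-only \w, to mirror the rules above.
FLAGS = re.IGNORECASE | re.ASCII

# Letter, word characters, then an OPTIONAL ".something" group (at most once).
dotted_symbol = re.compile(r"^[a-z]\w*(\.\w+)?$", FLAGS)


def is_valid(name):
    """Hypothetical helper: True if name satisfies the dotted-symbol rules."""
    return dotted_symbol.fullmatch(name) is not None


for name in ["abc.3", "x_3_4z", "$abc.3", "9.abc", "A.", "A..B", "A.B.C"]:
    print(name, is_valid(name))
```

Only the first two names should print True; the rest break one of the listed rules.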
And, I was mistaken in saying that yours and Kem’s were essentially the same; the ‘1 or more instances of 0 or more word characters’ should evaluate differently.
(\.\w+)? = an optional group consisting of a period followed by 1 or more word characters.
([\.][\w*]+)? = an optional group consisting of a period followed by one or more instances of 0 or more word characters.
I’m not sure why the second works; I would assume that this would evaluate true if the period is not followed by a character. I expected this regex to result true for string=r. (no character following the period) but it didn’t. Hopefully Kem can explain this one, as it’s beyond my understanding.
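A likely explanation: inside square brackets the `*` is no longer a quantifier. `[\w*]` is a character class meaning "one word character or a literal asterisk", so `[\w*]+` still demands at least one character after the period. That is why `r.` is rejected. A small Python sketch of the difference (flags and names are assumptions for illustration; most PCRE-style engines treat `*` inside a class the same way):

```python
import re

FLAGS = re.IGNORECASE | re.ASCII

# Optional group: period followed by one or more word characters.
with_group = re.compile(r"^[a-z]\w*(\.\w+)?$", FLAGS)

# Same shape, but [\w*] is a class: a word character OR a literal "*".
with_class = re.compile(r"^[a-z]\w*([\.][\w*]+)?$", FLAGS)


def matches(pattern, text):
    """Hypothetical helper: True if the entire text matches."""
    return pattern.fullmatch(text) is not None


print(matches(with_group, "r."), matches(with_class, "r."))    # False False
print(matches(with_group, "r.x"), matches(with_class, "r.x"))  # True True
print(matches(with_group, "r.*"), matches(with_class, "r.*"))  # False True
```

So the two patterns agree on ordinary input and differ only in that the second one also lets literal asterisks through after the period.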