RegEx Question : OH MIGHTY KEM!!!!

  Dim re As New RegEx
  Dim rm As RegExMatch
  If rm<>Nil Then Return tokenSYMBOL
  Return tokenUNKNOWN

I need to validate that the variable “KEYWORD” contains a valid symbol name
(i.e. it starts with A-Z (case doesn’t matter) and is optionally followed by A-Z, 0-9 or _).
What I am missing is the “optionally” part… this fails if keyword is only a single character

You don’t need the second character class since \w will match all of those characters anyway, and having it there makes a second character required. Try this:


(Unless you change the options, the patterns are case-insensitive.)

Ok… forgetting about the case sensitive part… doesn’t that say the string must ONLY be the characters a-z?

I need the FIRST character to be A-Z
and the 2nd to Nth characters to be A-Z, 0-9 or _ , but ONLY if the string is > 1 character long
if it is only 1 character long then it must be A-Z
basically a string that matches the definition of a variable name for the most part

x is valid
x3 is valid
x_3_4z is valid
3x is not valid

seems this works


You just wrote the longer version of the same pattern. :slight_smile:

I see nothing in your pattern that checks for digits ONLY after the 1st character… now I’m not saying you are wrong… I’m trying to see what I am missing

the magic word here is the “\w”

start with A-Z, optionally followed by 0-n “word characters”, which are “A-Z”, “0-9” and “_”
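To make the “optional” part concrete, here is a minimal sketch in Python rather than Xojo. Both patterns are written from the description in this thread, not copied from anyone’s post, so treat them as assumptions. The names `SYMBOL`, `TWO_OR_MORE` and `is_symbol` are made up for the illustration. `re.ASCII` is used so that `\w` means only A-Z, a-z, 0-9 and _.

```python
import re

# Assumed pattern: one letter, then ZERO or more word characters.
SYMBOL = re.compile(r"^[A-Z]\w*$", re.IGNORECASE | re.ASCII)

# Assumed "broken" variant: a second character class with "+" makes
# at least one extra character mandatory, so single letters fail.
TWO_OR_MORE = re.compile(r"^[A-Z][A-Z0-9_]+$", re.IGNORECASE)


def is_symbol(name: str) -> bool:
    """Return True if name looks like a valid symbol name."""
    return SYMBOL.search(name) is not None


for name in ("x", "x3", "x_3_4z", "3x"):
    print(name, is_symbol(name), TWO_OR_MORE.search(name) is not None)
```

The only difference that matters is the quantifier: `*` allows zero following characters, while a second class with `+` (or with no quantifier at all) demands at least one.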

NOW IT MAKES SENSE :slight_smile:

Of all the elements of computer programming, RegEx is the one that I have never ever been able to make sense out of…

I THOUGHT I could take what I learned above and make a simple addition, but it’s not happening… :frowning:

I need a pattern that will match a string, that must meet this criteria

  • Starts with a Letter (case is not important)
  • followed by [A-Z] (any case) , or [0-9], or “_”, or “.”
  • and in the case of the “.”, only ONE is allowed, and must be followed by at least one other character

basically a strict filename pattern

Do you have Kem’s RegExRX? Absolutely essential.

Does this do what you want?


or, if you are not setting the case insensitive flag


As for the second and third bullets, are we talking about a single character, or multiple? And the character following the period, are we talking any non-space character?

this RegEx needs to be “generic” I have two projects to use it in … one in Xojo, one not, so I’d rather not use something like RegExRX

  • Starts with a Letter (case is not important)
  • followed by 0 to n characters that must be [A-Z] (any case) , or [0-9], or “_”, or “.”
  • and in the case of the “.”, only ONE is allowed, and must be followed by at least one other character
$abc.3 is not valid (no $ allowed, must start with a letter)
A. is not valid (period must have at least one [A-Z,0-9] following it)
A..B is not valid (contains multiple ".")

as to “any non-space” character, no, not “any”… the entire string must be A-Z a-z 0-9 . or _ characters only… nothing else
no spaces, or non alphanumeric except “.” or “_”




if you are not setting the case insensitive flag

this is close , but a dot and following are required


I would have thought this would make the dot etc optional, but it doesn’t work


Try this:


I will, but this is what I came up with via trial and error


Kem… your suggestion gave this

RegEx=^[A-Z]\\w*(\\.\\w+)?$  String=filename.ext result=false
RegEx=^[A-Z]\\w*(\\.\\w+)?$  String=filename result=false
RegEx=^[A-Z]\\w*(\\.\\w+)?$  String=filename..ext result=false
RegEx=^[A-Z]\\w*(\\.\\w+)?$  String=.ext result=false

mine gave this

RegEx=^[A-Za-z]\\w*([\\.][\\w*]+)?$  String=filename.ext result=true
RegEx=^[A-Za-z]\\w*([\\.][\\w*]+)?$  String=filename result=true
RegEx=^[A-Za-z]\\w*([\\.][\\w*]+)?$  String=filename..ext result=false
RegEx=^[A-Za-z]\\w*([\\.][\\w*]+)?$  String=.ext result=false

Your regex must have the case sensitive flag set, as yours and Kem’s are essentially the same (aside from yours having [A-Za-z] while Kem’s has [A-Z])
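A quick way to see the effect of the case flag is to run the same pattern both ways. This is a Python sketch, not Xojo, and the helper name `check` is made up for the illustration; the assumption is that the Xojo engine behaves the same once its case sensitive option is turned on or off.

```python
import re

# The pattern from the results above, with [A-Z] only.
PATTERN = r"^[A-Z]\w*(\.\w+)?$"
SAMPLES = ["filename.ext", "filename", "filename..ext", ".ext"]


def check(pattern: str, text: str, ignore_case: bool) -> bool:
    """Return True if text matches pattern, with or without case folding."""
    flags = re.ASCII | (re.IGNORECASE if ignore_case else 0)
    return re.search(pattern, text, flags) is not None


for s in SAMPLES:
    # Columns: sample, case-sensitive result, case-insensitive result
    print(s, check(PATTERN, s, False), check(PATTERN, s, True))
```

With case sensitivity on, a lowercase first letter can never satisfy `[A-Z]`, so every sample fails; with it off, the first two samples pass and the last two still fail.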

You don’t need the brackets around the \. and the \w ( your code [\w*]+ is essentially saying that there are 1 or more instances of 0 or more words)

well I don’t know… but mine works and Kem’s (sorry) does not, even when I change it to [A-Za-z]

“r.r” is valid, but Kem regex says it is not

RegEx=^[A-Za-z]\\w*([\\.][\\w*]+)?$  String=r.r result=true
RegEx=^[A-Za-z]\\w(\\.\\w+)?$  String=r.r result=false

You’re missing the asterisk after the first \w in the second line.
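The difference the asterisk makes can be checked directly. A small Python sketch (not Xojo; the names `WITHOUT_STAR` and `WITH_STAR` are made up for the illustration):

```python
import re

# Without "*": exactly ONE word character must follow the first letter.
WITHOUT_STAR = re.compile(r"^[A-Za-z]\w(\.\w+)?$", re.ASCII)

# With "*": zero or more word characters may follow the first letter.
WITH_STAR = re.compile(r"^[A-Za-z]\w*(\.\w+)?$", re.ASCII)

for s in ("r.r", "rr.r", "r"):
    print(s, WITHOUT_STAR.search(s) is not None, WITH_STAR.search(s) is not None)
```

Without the asterisk, “r.r” fails because the mandatory `\w` lands on the period, while “rr.r” passes.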

And, I was mistaken in saying that yours and Kem’s were essentially the same; the ‘1 or more instances of 0 or more words’ should evaluate differently

(\.\w+)? = an optional group consisting of a period followed by 1 or more words.
([\.][\w*]+)? = an optional group consisting of a period followed by one or more instances of 0 or more words.

I’m not sure why the second works; I would assume that this would evaluate true if the period is not followed by a character. I expected this regex to result true for string=r. (no character following the period) but it didn’t. Hopefully Kem can explain this one, as it’s beyond my understanding.
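One likely explanation: inside square brackets, `*` is not a quantifier but a literal asterisk, so `[\w*]+` reads as “one or more characters, each of which is a word character or a `*`”. That still requires at least one character after the period, which is why “r.” fails. The Python sketch below shows this; the assumption is that the Xojo engine treats character classes the same way, and the names `BRACKETED` and `PLAIN` are made up for the illustration.

```python
import re

# Bracketed version: "*" inside [...] is a literal asterisk.
BRACKETED = re.compile(r"^[A-Za-z]\w*([\.][\w*]+)?$", re.ASCII)

# Plain version: a period followed by one or more word characters.
PLAIN = re.compile(r"^[A-Za-z]\w*(\.\w+)?$", re.ASCII)

for s in ("r.r", "r.", "r.*", "r.a*b"):
    print(s, BRACKETED.search(s) is not None, PLAIN.search(s) is not None)
```

So the two patterns agree on ordinary names, but the bracketed one also accepts asterisks after the period (for example “r.*”), which is probably not wanted for a strict filename check.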