Xojo RegEx Not Correct

I am using RegEx to find various lines in a long text string that start with certain words. One example would be the following line:

Filename: CCC - Attorneys Only.docx

In RegExRX, the RegEx:

^(?:Filename\:\s*)(.*)[\r\n]

Returns “CCC - Attorneys Only.docx”.

However, this code in Xojo:

rg.SearchPattern = "^(?:Filename\:\s)(.*)[\r\n]"
rgMatch = rg.Search(App.dtFileConverter.OutputString)
If rgMatch <> Nil Then
  strRegExOut = rgMatch.SubExpressionString(0)
Else
  strRegExOut = "Text not found!"
End If

Returns the full line:

Filename: CCC - Attorneys Only.docx

The Xojo RegEx documentation says it follows Perl regular expressions, and the Perl documentation describes the same behavior I see in RegExRX. Xojo’s documentation is ambiguous about the action of “(?:…)”, but in Xojo it does not seem to function as it does in Perl.

Anyone know of a way to make Xojo RegEx do this properly? I know MBS RegEx will probably do this correctly, but I would only like to go there as a last resort. Thanks.

Kurt

It’s your use of SubExpressionString.
Index 0 is the entire matched string; index 1 is the first real subexpression.

That’s what this view means by the way:

$1 is your first subexpression which returns exactly what you’re looking for.
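To make the difference between index 0 and index 1 concrete, here is a minimal sketch using the pattern from the question. The hard-coded sample string and the variable names strSource, strWholeMatch and strFilename are assumptions for illustration and are not from the original code.

```xojo
Dim rg As New RegEx
rg.SearchPattern = "^(?:Filename\:\s*)(.*)[\r\n]"

// Hypothetical sample text. Chr(10) is a line feed, which the
// trailing [\r\n] in the pattern needs in order to match.
Dim strSource As String = "Filename: CCC - Attorneys Only.docx" + Chr(10)

Dim rgMatch As RegExMatch = rg.Search(strSource)
If rgMatch <> Nil Then
  // Index 0: everything the pattern matched, including "Filename: "
  // and the line break. The (?:...) group still takes part in the match.
  Dim strWholeMatch As String = rgMatch.SubExpressionString(0)

  // Index 1: only the first capturing group, i.e. the filename.
  Dim strFilename As String = rgMatch.SubExpressionString(1)
End If
```

The non-capturing group only means that “Filename:” does not get a subexpression number of its own; it is still part of the overall match returned at index 0.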

Also, I think you should be terminating with $ instead of [\r\n], because [\r\n] will not match if no newline comes after the line.

If you are working a lot with RegEx then I recommend RegExRX.

Kurt, your pattern will match the entire line, even in RegExRX as shown in your screenshot. As Tim pointed out, you want SubExpressionString( 1 ) to match just the filename.

You can rewrite the pattern this way:

^Filename:[ \t]*(.*)

Unless you change options, the dot won’t match a newline anyway so there is no reason to match it at the end. You can add “$” at the end instead as Tim suggested, but that too is unnecessary. As a rule of thumb, don’t match more than you need.
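As a rough sketch of how the simplified pattern could be used to collect every matching line from a longer text, assuming the RegEx options are left at their defaults (so “^” matches at the start of each line and the dot stops at a line feed). The sample text and the names strSource and filenames are hypothetical.

```xojo
Dim rg As New RegEx
rg.SearchPattern = "^Filename:[ \t]*(.*)"

// Hypothetical multi-line sample; Chr(10) is a line feed.
Dim strSource As String = "Title: Example" + Chr(10) + _
  "Filename: CCC - Attorneys Only.docx" + Chr(10) + _
  "Filename: Second File.docx"

Dim filenames() As String
Dim rgMatch As RegExMatch = rg.Search(strSource)
While rgMatch <> Nil
  // Index 1 is the capturing group, i.e. the text after "Filename:".
  filenames.Append(rgMatch.SubExpressionString(1))
  // Calling Search with no arguments continues after the previous match.
  rgMatch = rg.Search
Wend
```

Note that the last line has no trailing line break and is still matched, since the pattern no longer requires one. If the text uses Windows line endings, the captured value may need trimming.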

He already uses it: see the third line of the original post …

Thanks for everyone’s help. I have this working as expected now and I learned a bit more about RegEx along the way.