RegEx not working

[\+|\-]\d*\.?\d*\W

This RegEx pattern should match + or - followed by an integer or decimal, and not match + or - followed by a word, right?

(3,12)+4 // should match +4
(3.12)+4.0 // should match +4.0
(3,12)-15 // should match -15
(3.12)-15.22 // should match -15.22
(3,12)+pow(2,3) // should not match anything
(3,12)-pow(2,3) // should not match anything

It works in online RegEx testers and in Kem’s RegExRX. It doesn’t work in Xojo 2018r2: no matches are returned. If I remove the \W, matches are returned, but they are wrong: integers are matched, but any digits following the decimal point aren’t.

What am I doing wrong?

In RegExRX, have you tried right clicking the pattern and telling it to copy the Xojo/Realbasic search pattern? That will include any flags you had set.

Thanks, I just tried that, and although I had not selected any flags, the string copied was

(?mi-Us)[\+|\-]\d*\.?\d*\W

But there is no change; it still doesn’t work.

I’m building 64-bit, thought maybe that was causing the problem, switched it to 32-bit, no change. It doesn’t work, period.

How about * at the end?
[\+|\-]\d*\.?\d*\W*

Note: I don’t have RegEx experience, just did a couple of tests. Still don’t know how to deal with (3,12)+pow(2,3). See post below

Edit: My guess is that with the \W at the end, the search requires a non-word character after the number. Maybe other languages treat \W the same as \W* or \W? (0 or 1 time).
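That guess can be checked with a small sketch. It uses Python's `re` module as a stand-in, on the assumption that it treats this particular pattern the same way as the PCRE engine behind Xojo's RegEx class; it is not Xojo code, and the helper name `first_match` is made up for the illustration.

```python
import re

# Original pattern from the first post. The trailing \W must consume
# exactly one non-word character, so it cannot succeed at end of text.
ORIGINAL = re.compile(r"[\+|\-]\d*\.?\d*\W")

def first_match(text):
    """Return the first matched substring, or None if nothing matches."""
    m = ORIGINAL.search(text)
    return m.group(0) if m else None

# Number at the very end of the text: nothing is left for \W to consume.
bare = first_match("(3,12)+4")

# Same sample with the trailing comment, as pasted into an online tester:
# the space after the 4 satisfies \W.
with_comment = first_match("(3,12)+4 // should match +4")

# Decimal at the end: the engine backtracks until \W can take the ".".
decimal_bare = first_match("(3.12)-15.22")

print(repr(bare))
print(repr(with_comment))
print(repr(decimal_bare))
```

This prints `None`, `'+4 '` and `'-15.'`. So a tester that is fed the whole sample line, comment included, reports a match, while a search on the bare expression either finds nothing or stops at the decimal point.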

How about:

[\+\-0-9.]+

This worked for all your samples:
[\+|\-]\d+\.?\d*\W*

Edit: from what I can tell, none of your examples need the \W. Do you have other examples that do?
Edit 2: maybe instead of \W* use \W? or not at all. It all depends if you need to match something like:
“(3,12)+4 // should match” <-- will match "+4 // " with \W* and "+4 " (yes a space at the end) with \W?
Edit 3: using [\+|\-] will make something like “(3,12)|4” match with “|4”
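A rough sketch of Edits 2 and 3, again using Python's `re` as a stand-in for Xojo's RegEx class (an assumption; the variable names are only for illustration):

```python
import re

sample = "(3,12)+4 // should match"

# \W* swallows every non-word character after the number.
greedy_tail = re.search(r"[\+|\-]\d+\.?\d*\W*", sample).group(0)

# \W? takes at most one non-word character (here, the space).
optional_tail = re.search(r"[\+|\-]\d+\.?\d*\W?", sample).group(0)

# Without any \W the match ends with the last digit.
no_tail = re.search(r"[\+|\-]\d+\.?\d*", sample).group(0)

# Inside [...] the | is a literal pipe, not an "or" operator.
pipe_hit = re.search(r"[\+|\-]\d+\.?\d*", "(3,12)|4").group(0)

# Dropping the | from the class stops that false match.
pipe_fixed = re.search(r"[-+]\d+\.?\d*", "(3,12)|4")

print(repr(greedy_tail), repr(optional_tail), repr(no_tail))
print(repr(pipe_hit), pipe_fixed)
```

The first line of output is `'+4 // ' '+4 ' '+4'` and the second is `'|4' None`.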

[quote=408837:@Alberto De Poo]This worked for all your samples:
[\+|\-]\d+\.?\d*\W*[/quote]

Sorry, but that matches a + or - anywhere regardless of what follows it, so it doesn’t work.

[quote=408836:@Greg O’Lone]How about:

[\+\-0-9.]+[/quote]

Sorry no, that matches numbers in other places.

[\+|\-]\d*\.?\d*\W*
matches + or - in your last 2 samples
[\+|\-]\d+\.?\d*\W*
didn’t match those

Sorry it didn’t work for you, it did work on my tests with your samples. Do you have a sample where it fails that I can test?
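For anyone who wants to reproduce the difference between \d* and \d+, here is a minimal sketch. It assumes Python's `re` behaves like Xojo's PCRE-based RegEx for these two patterns; `hits` is a hypothetical helper, not part of any library.

```python
import re

STAR = re.compile(r"[\+|\-]\d*\.?\d*\W*")  # digits are optional
PLUS = re.compile(r"[\+|\-]\d+\.?\d*\W*")  # at least one digit is required

def hits(pattern, text):
    """Return every matched substring in text, in order."""
    return [m.group(0) for m in pattern.finditer(text)]

# With \d* everything after the sign is optional, so a lone sign matches.
star_plus_pow = hits(STAR, "(3,12)+pow(2,3)")
star_minus_pow = hits(STAR, "(3,12)-pow(2,3)")

# With \d+ the sign must be followed by a digit, so "+pow" is rejected.
plus_plus_pow = hits(PLUS, "(3,12)+pow(2,3)")
plus_minus_pow = hits(PLUS, "(3,12)-pow(2,3)")
plus_m4 = hits(PLUS, "(3,12)+m4")

print(star_plus_pow, star_minus_pow)
print(plus_plus_pow, plus_minus_pow, plus_m4)
```

The first line prints `['+'] ['-']` and the second prints three empty lists.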

I have this code without a match:

[code]Dim re As New RegEx
re.SearchPattern = "[\+|\-]\d+\.?\d*\W*"

' An "m" follows the "+", so \d+ cannot match and no message box appears.
Dim match As RegExMatch = re.Search("(3,12)+m4")

While match <> Nil
  ' SubExpressionString(0) is the whole matched text.
  MsgBox(match.SubExpressionString(0))

  ' Calling Search with no arguments continues after the previous match.
  match = re.Search
Wend[/code]

[quote=408845:@Alberto De Poo][\+|\-]\d*\.?\d*\W*
matches + or - in your last 2 samples
[\+|\-]\d+\.?\d*\W*
didn’t match those

Sorry it didn’t work for you, it did work on my tests with your samples. Do you have a sample where it fails that I can test?[/quote]

Oops, my mistake! I had copied your corrected search pattern incorrectly. This seems to work:

[\+|\-]\d+\.?\d*\W*

Thank you very much for the help!

I recommend you read my notes above, about | and \W*.

To illustrate:
[\+|\-]\d+\.?\d*

[\+|\-]\d+\.?\d*\W*

Maybe [\+\-]\d+\.?\d* will be enough?

Unless you want to match | too, you should remove that from the character class. For readability, you can also do this:

[-+]

[quote=408850:@Kem Tekinay]Unless you want to match | too, you should remove that from the character class. For readability, you can also do this:

[-+][/quote]

Thanks, I was mistakenly trying to use | as an “or” operator. I thought - and + have special meanings in RegEx and must be preceded by a backslash. Is that not true for character sets?

[quote=408849:@Alberto De Poo]I recommend you read my notes above, about | and \W*.
… Maybe [\+\-]\d+\.?\d* will be enough?
[/quote]

Oh, I hadn’t seen your notes (edits) at all before (since you added them later). I put “\W” there in the search in order to avoid matching “+pow”. It shouldn’t match a + or - if it’s not followed by a number, so I can’t see how [\+\-]\d+\.?\d* would be enough, but obviously I don’t fully understand RegEx.

So the corrected search should be, I think:

[\+\-]\d+\.?\d*\W*

I’m new to RegEx so I hope this information is correct:

[-+]\d+\.?\d*\W*

will search for a pattern that has

[-+]   // + or -
\d+    // 1 or more digits
\.?    // 0 or 1 decimal point
\d*    // 0 or more digits
\W*    // 0 or more non-word characters

If your source is:

(3,12)-15.22 // ; : & * this is a test

you will get:

-15.22 // ; : & * 

If you remove \W* then you will get:

-15.22

My guess is that:

[-+]\d+\.?\d*

is better for you.
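To check that against the samples from the first post, here is a small sketch in Python. The assumption is that Python's `re` gives the same results as Xojo's RegEx for these patterns; the comments were stripped from the samples because they contain "+4", "-15" and so on themselves.

```python
import re

NUMBER = re.compile(r"[-+]\d+\.?\d*")         # suggested pattern
WITH_TAIL = re.compile(r"[-+]\d+\.?\d*\W*")   # same, with trailing \W*

def first_match(pattern, text):
    """Return the first matched substring, or None if nothing matches."""
    m = pattern.search(text)
    return m.group(0) if m else None

# Sample expressions from the first post and what each should yield.
expected = {
    "(3,12)+4": "+4",
    "(3.12)+4.0": "+4.0",
    "(3,12)-15": "-15",
    "(3.12)-15.22": "-15.22",
    "(3,12)+pow(2,3)": None,
    "(3,12)-pow(2,3)": None,
}

results = {text: first_match(NUMBER, text) for text in expected}
for text, found in results.items():
    print(f"{text!r:22} -> {found!r}")

# The longer source line from this post, with and without \W*.
source = "(3,12)-15.22 // ; : & * this is a test"
tail_result = first_match(WITH_TAIL, source)
plain_result = first_match(NUMBER, source)
print(repr(tail_result), repr(plain_result))
```

All six samples come out as listed in `expected`, and the last line prints `'-15.22 // ; : & * ' '-15.22'`.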

Within a character class, the only characters that have special meaning are ^, -, \, and ]. Here are the rules:

  • \ is the escape character.
  • If - or ] is the first character, it need not be escaped.
  • If - is the last character before the closing bracket, it need not be escaped.
  • If ^ is the first character, it means “not”, so escape it if you mean, literally, the caret. Otherwise, it need not be escaped.
  • A literal backslash must always be escaped.
  • No other character need be escaped even if it has special meaning outside the character class, like $ or +.

To be on the safe side, just escape any of those four characters when you mean for them to be taken literally.
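The rules above can be tried out with a short sketch. It is written in Python, on the assumption that its `re` module follows the same character-class rules as PCRE for these specific cases; `matches_all` is a made-up helper for the illustration.

```python
import re

def matches_all(char_class, chars):
    """True if the character class matches every single character in chars."""
    return all(re.fullmatch(char_class, c) is not None for c in chars)

# - as the first or last character is a literal hyphen.
leading_dash = matches_all(r"[-+]", "-+")
trailing_dash = matches_all(r"[+-]", "-+")

# - between two characters forms a range and no longer matches a hyphen.
middle_dash_is_range = (
    re.fullmatch(r"[a-c]", "b") is not None
    and re.fullmatch(r"[a-c]", "-") is None
)

# ] as the first character is a literal closing bracket.
leading_bracket = matches_all(r"[]x]", "]x")

# ^ is literal unless it comes first, where it negates the class.
caret_not_first = matches_all(r"[a^]", "a^")
caret_first_negates = (
    re.fullmatch(r"[^a]", "b") is not None
    and re.fullmatch(r"[^a]", "a") is None
)

# A literal backslash must be escaped.
escaped_backslash = matches_all(r"[\\]", "\\")

# Characters that are special outside a class are plain inside one.
plain_metachars = matches_all(r"[$+.*?|]", "$+.*?|")

print(leading_dash, trailing_dash, middle_dash_is_range, leading_bracket)
print(caret_not_first, caret_first_negates, escaped_backslash, plain_metachars)
```

Every value printed is `True`.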

Within RegExRX, the tokens within a character class will be represented as black when taken literally and in color if not. In other words, RegExRX knows and follows these rules.

Always admiring Master Kem…
Kem, where did you learn Klingon… sorry, RegEx?
Any (very) (good) book to recommend?

When writing RegExRX I scoured the man pages and other PCRE documentation. Before writing it, I knew just the basics. By the time I was on the third or fourth revision, I had a much deeper understanding. Even then, a lot of it was trial and error.

But “master”? I appreciate the compliment, but I don’t think the person who is the master of regular expressions exists. :)