RegEx: How do I get rid of surrounding () and [] ?

RegEx question.

I found some ways to get things in a string with the RegEx example.

Prelude:
When I want to add a new feature to a running project, I create a brand new project and build a prototype. In this case, I only added a ListBox to that project and the code is in Window1.Open. I keep in mind the environment where the code will be pasted later. That is why I set the search string: in the main project (see the screen shot), the search string is a file name read from a folder. *

The string to search in is:
“Star Trek - The Next Generation 02 - [huckyc] [2012-01-15] (Miss page 15).7z”

How to get the date (SQLDate format):
Search Pattern: “\d\d\d\d-\d\d-\d\d”
Found string: 2012-01-15

How to get a word with 10 characters:
Search Pattern: “\w\w\w\w\w\w\w\w\w\w”
Found string: Generation

How to get text between two html tags ?
Just read the doc and you will be far more than happy.

Unfortunately, I did not find a way to get the text between brackets (huckyc is just an example; I have many other data strings, not always the same length).

Also, I was unable to find the information between parens ‘(’ and ‘)’.

Any help is welcome.

Of course, I can search specifically for these using InStr or NthField, but then when will I learn to use RegEx?

Worse, I started with RegEx by searching for the SQLDate in my search string and it took me only minutes to do so. Now, after 4 hours (3:16AM to 7:45AM), I have not advanced by a single bit!
(except for the “search for a word that is n characters long” one!)

I FOUND SOMETHING:
Search Pattern: “\x28.*\x29” or “\(.*\)”
Found string: (Miss page 15)

But how do I exclude the open and close parens ?

8:00 AM discovery:
Search Pattern: “\[.*\]”
Found string: [huckyc] [2012-01-15]

I remember I read something in the docs about how to avoid matching from the first [ to the last ].

I got it ! I got it ! I just have to be a bit less greedy !

Search Pattern: “\[.*?\]”
Found string: [huckyc]

BTW: I still get the useless [] and ().

The question now is: How can I remove them (besides using a Mid() function) ?

Here is a screen shot of what I am actually doing:

Yes, we can have more than two background colors in a ListBox. As you may understand, it is a missing icon in Column 0 that produces the magenta background.
Also, lines that have a SQLDateTime do not have a date in their file names.

Finally, you can see the enclosed strings in columns “Scanner” and “Warning”.

  • I just realized that in fact this is the second spin-off from the main project: how to use RegEx to set data in a project that shows how to report data from a folder full of files, which is in fact a spin-off of a larger folder !

Just a remark : in your title you probably wanted to write ‘How to get rid of’. In English, a ride is something else :wink:

Yes, Michel. These last weeks I tend to add an ending e or a (but not o, i or u) to some words. I even think that sometimes my fingers write the text on their own.

Easy Rider…

Google is awesome…

http://stackoverflow.com/questions/2403122/regular-expression-to-extract-text-between-square-brackets
http://stackoverflow.com/questions/3812055/how-to-remove-square-brackets-in-string-using-regex

Those are for square brackets, but easily modifiable for normal ones.

So, use the .replace function of a regex object.
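A minimal sketch of that approach, assuming the classic Xojo RegEx class (SearchPattern, ReplacementPattern, Options.ReplaceAllMatches, Replace) and the file name from the first post. The variable names are only illustrative. The subexpression captures what sits between the brackets, and the replacement keeps only that capture:

```xojo
Dim source As String = "Star Trek - The Next Generation 02 - [huckyc] [2012-01-15] (Miss page 15).7z"

Dim rx As New RegEx
// \[ and \] match the literal brackets; ([^\]]*) captures everything between them
rx.SearchPattern = "\[([^\]]*)\]"
// \1 stands for the first subexpression, so the brackets themselves are dropped
rx.ReplacementPattern = "\1"
// Replace every bracketed field, not only the first one
rx.Options.ReplaceAllMatches = True

// stripped holds the file name with the [ and ] removed but their content kept
Dim stripped As String = rx.Replace(source)
```

Swapping the search pattern for "\(([^)]*)\)" should do the same for parentheses.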

You need to use subexpressions. See the entry on subexpressions at http://documentation.xojo.com/index.php/RegEx

Eric, Markus: thank you.

Thanks.

Yes, awesome, when Internet is on ;-).

[quote=130631:@Emile Schwarz]Thanks.

Yes, awesome, when Internet is on ;-).[/quote]

I remember when I had to go to McDonald’s…

dim rx as new RegEx

// Between [ and ]
rx.SearchPattern = "\[([^\]]*)"

dim textBetweenBrackets as string

dim match as RegExMatch = rx.Search( searchText )
if match <> nil then
  textBetweenBrackets = match.SubExpressionString( 1 )
end if

// Between ( and )
rx.SearchPattern = "\(([^)]*)"

dim textBetweenParens as string

match  = rx.Search( searchText )
if match <> nil then
  textBetweenParens = match.SubExpressionString( 1 )
end if
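The sample file name has two bracketed fields, and a single Search only returns the first one. A hedged sketch of collecting all of them, assuming the classic API where calling Search again without arguments continues after the previous match (bracketFields is a name made up for this example):

```xojo
Dim searchText As String = "Star Trek - The Next Generation 02 - [huckyc] [2012-01-15] (Miss page 15).7z"

Dim rx As New RegEx
// Same pattern as above: a literal [ followed by a capture of everything up to the next ]
rx.SearchPattern = "\[([^\]]*)"

Dim bracketFields() As String

// The first call passes the text; later calls without arguments
// resume after the end of the previous match
Dim match As RegExMatch = rx.Search(searchText)
While match <> Nil
  // Subexpression 1 is the text without the surrounding brackets
  bracketFields.Append(match.SubExpressionString(1))
  match = rx.Search()
Wend
```

Each element of bracketFields can then go to its own ListBox column.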

Kem, surely you can combine this into one search …? :wink:

Certainly.

Thank you Kem, Markus.

I do not need to get them into one search since this comes for two different fields.

Unfortunately, the success of the searches depends on the validity of the source. When you get data in a different format, the search cannot be successful.

In a list of files (files I get data from in a folder), I discovered fields enclosed in parentheses where I expected brackets…, fields not enclosed at all, dates in calendar format instead of SQLDate, and so on… For this set of files, I was able to fix the original file names manually because there were fewer than 10 malformed file names (out of more than 200 files). That will not always be practical to do by hand.

Is there nothing we can do in these cases, Kem ?

I forgot to mention that to remove the first and last characters of a string Mid() is easy to use and fast.

That is what I added yesterday. Now, I will go back to a full regex code.

Thank you pals.

Unfortunately, regular expressions depend on the existence of a pattern within the source text. If no such pattern exists, they are of no use.
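What a pattern can still do is tell you which names do not fit, so they can be fixed by hand. A rough sketch, assuming the expected layout includes a bracketed SQLDate. Both the second file name and the needsReview array are hypothetical and only there for illustration:

```xojo
// Hypothetical sample: the second name uses a calendar-style date and parentheses
Dim fileNames() As String = Array( _
  "Star Trek - The Next Generation 02 - [huckyc] [2012-01-15] (Miss page 15).7z", _
  "Star Trek - The Next Generation 03 - (huckyc) 15 January 2012.7z")

Dim rx As New RegEx
// Expected layout (an assumption): a SQLDate enclosed in square brackets
rx.SearchPattern = "\[\d\d\d\d-\d\d-\d\d\]"

Dim needsReview() As String

For Each fileName As String In fileNames
  // Search returns Nil when the pattern is not found in the text
  If rx.Search(fileName) Is Nil Then
    needsReview.Append(fileName)
  End If
Next
```

Names that end up in needsReview are the ones the other patterns cannot be trusted on.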

And here it is.

((\()|(\[))((?(2)[^)]|[^\]])*)

:slight_smile:

[quote=130797:@Kem Tekinay]And here it is.

((\()|(\[))((?(2)[^)]|[^\]])*) [/quote]
I only see brackets ]-]

Counted them three times and got a different result each time :slight_smile:

I should point out that the text will be in SubexpressionString( 4 ) with this pattern.
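For anyone who wants to try the combined pattern, here is a sketch of how it might be used, assuming single backslashes in the Xojo string literal and the same Search loop idiom as with the separate patterns (the enclosed array is just an illustrative name):

```xojo
Dim searchText As String = "Star Trek - The Next Generation 02 - [huckyc] [2012-01-15] (Miss page 15).7z"

Dim rx As New RegEx
// Group 2 matches "(" and group 3 matches "[".
// The conditional (?(2)...|...) then picks which closing character to stop at.
rx.SearchPattern = "((\()|(\[))((?(2)[^)]|[^\]])*)"

Dim enclosed() As String

Dim match As RegExMatch = rx.Search(searchText)
While match <> Nil
  // Group 4 holds the text after the opening ( or [, up to the matching closer
  enclosed.Append(match.SubExpressionString(4))
  match = rx.Search()
Wend
```

The trade-off is readability: two separate patterns are easier to maintain when the bracketed and parenthesized values go to different fields anyway.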

I was thinking that but had a little hope to be wrong…

BTW: the first thing I found while trying to use RegEx was the SQLDate search: I wrote it just as if I knew what I was doing ! And it worked ! ("\d\d\d\d-\d\d-\d\d"). I was carefully reading the scarce doc when I did that.

Markus: do you use this kind of search ?

RegEx searches? Sure. And I use Kem’s excellent RegExRX app to help me get it right.

RegExRX is a really great app for trying to get RegEx right. Quick and easy :smiley: