Regex Extract Between HTML Tags

Having read a number of forum posts and online examples, I thought the code below would give me the ‘table’ within an HTML page returned from a socket into data (the text arrives in data fine). Instead, it gives me no match:

    Dim rg As New RegEx
    Dim myMatch As RegExMatch
    rg.SearchPattern="<table>(.*?)<\/table>"

    myMatch = rg.Search(data)

    If myMatch <> Nil Then
      txtdata.Text = myMatch.SubExpressionString(0)
      'html.LoadPage(mymatch)
    Else
      txtdata.Text = "Text not found!"
    End If

    Exception err As RegExException
      MsgBox(err.Message)

Any suggestions?

.* will stop at the end of the line unless you change options/switches, so my guess is you’re trying to match against multiple lines. Try [\s\S]* instead. (That’s one of a few solutions.)
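
A minimal sketch of that change, in case it helps to see it in place. ExtractTable is a hypothetical helper name, and the pattern assumes the opening tag is exactly <table> with no attributes, as in the original post. Note that it returns subexpression 1 (the captured group); as far as I know, subexpression 0 is the whole match, tags included.

```xojo
Function ExtractTable(data As String) As String
  // Hypothetical helper: returns the text between <table> and </table>,
  // or an empty string when there is no match.
  Dim rg As New RegEx

  // [\s\S] matches any character, including line endings, so the lazy
  // quantifier can span multiple lines. The / is left unescaped.
  rg.SearchPattern = "<table>([\s\S]*?)</table>"

  // Alternative (my understanding of the "options" route): keep (.*?)
  // in the pattern and set rg.Options.DotMatchAll = True instead.

  Dim myMatch As RegExMatch = rg.Search(data)
  If myMatch <> Nil Then
    // 1 = first captured group, i.e. the content between the tags.
    Return myMatch.SubExpressionString(1)
  End If

  Return ""
End Function
```

With that in place, the original code would reduce to something like txtdata.Text = ExtractTable(data), followed by a check for an empty result.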

Also, although it doesn’t hurt, / does not need to be escaped. PCRE doesn’t care, but some engines would give you an error for that.

Thanks as always, Kem. That works perfectly, although sadly I am no nearer to understanding the vagaries of regex. Perhaps I need to find a book on the subject.

Or get a tool for creating them… lemme see if I can remember the name…

RegExRX?

I use it two or three times a week and find that I’m learning new things about regex every time I use it.

Since you mention it…

Happy RegExRX user here, even though I still pester Kem from time to time when I can’t wrap my head around something.