HTML Parser

Has anyone here developed an HTML parser?
What I need to do is read an HTML page and extract some divs with a fixed ID or class.
Can anyone tell me whether something ready-made already exists?
Thanks to all

You could do this with a regular expression, something like:

<div[^>]*id="put your id in here"[^>]*>(.*?)</div>

The first capture group would be the contents of the div, as long as the div contains no nested divs.
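
For anyone who wants to try that pattern in Xojo, here is a minimal sketch using the built-in RegEx class. The function name ExtractDivById and the id "content" are made up for illustration. It assumes the id contains no regex metacharacters and that the target div has no nested divs, since the lazy match stops at the first closing tag.

```xojo
Function ExtractDivById(html As String, divId As String) As String
  // Hypothetical helper: returns the inner HTML of the first <div> whose
  // id attribute equals divId, or "" if there is no match.
  Dim re As New RegEx
  // Quotes are doubled inside a Xojo string literal.
  re.SearchPattern = "<div[^>]*\bid=""" + divId + """[^>]*>(.*?)</div>"
  re.Options.DotMatchAll = True // let . match line breaks as well

  Dim m As RegExMatch = re.Search(html)
  If m Is Nil Then Return ""

  // Subexpression 0 is the whole match; 1 is the first capture group.
  Return m.SubExpressionString(1)
End Function
```

It could then be called as MsgBox ExtractDivById(pageSource, "content"), where pageSource is whatever string holds the HTML.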

You can also read the source text of the HTML page from an HTMLViewer. Run this JavaScript in the viewer:


me.ExecuteJavaScript "window.status = document.getElementsByTagName('html')[0].innerHTML;"

Then, in the viewer's StatusChanged event, where newStatus holds the page source:

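Dim s As String = newStatus
// Keep what follows the opening tag <div id="xxxxx"> (quotes are doubled inside a Xojo string literal)
s = NthField(s, "<div id=""xxxxx"">", 2)
// Keep what precedes the first closing </div>
s = NthField(s, "</div>", 1)
MsgBox s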

Regex is the way to go; I scrape content from HTML that way all the time.

But remember not to overdo it.


I’ve developed an HTML parser library in Xojo. It is similar to XMLDocument and can extract the contents of tags using XPath. But if you only need to analyze HTML that always has the same structure, it’s easier to use a regex.
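
As a rough illustration of the XPath-style lookup, Xojo's built-in XMLDocument can do something similar, but only when the page is well-formed XML. Typical real-world HTML will fail to load. This is not the library mentioned above, just a sketch with a made-up function name, and it assumes the document declares no default namespace.

```xojo
Function FindDivById(xhtml As String, divId As String) As String
  // Hypothetical helper: returns the XML of the first <div> whose id
  // equals divId, including the div tag itself, or "" if not found.
  Try
    Dim doc As New XMLDocument
    doc.LoadXML(xhtml) // raises XMLException if xhtml is not well-formed

    Dim nodes As XMLNodeList = doc.XQL("//div[@id='" + divId + "']")
    If nodes.Length > 0 Then Return nodes.Item(0).ToString
  Catch e As XMLException
    // Not well-formed XML; a regex or a dedicated HTML parser is needed instead.
  End Try
  Return ""
End Function
```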