Has anyone here developed an HTML parser?
I need to read an HTML page and extract a div with a fixed ID or class.
Can anyone tell me whether a ready-made solution already exists?
Thanks to all
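You could do this with a regular expression, something like:
<div [^>]*id="put your id in here"[^>]*>(.*?)</div>
The first capture group of the match would be the contents of the div, as long as the div contains no nested divs. If the div spans several lines, enable the option that lets the dot match line breaks.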
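To make the idea concrete, here is a minimal sketch of that pattern in JavaScript. The helper name extractDivById and the sample page are made up for illustration. The pattern uses [^>]* to stay inside the opening tag and a non-greedy group so the match stops at the first closing tag. The same pattern syntax should work in most other regex engines, but that is an assumption and not something tested here.

```javascript
// Hypothetical helper: returns the inner HTML of the first <div> whose id
// attribute equals `id`, or null when no such div is found.
function extractDivById(html, id) {
  // Escape regex metacharacters so the id is matched literally.
  const safeId = id.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // [^>]* keeps the match inside the opening tag; [\s\S]*? is non-greedy and
  // also spans line breaks, so it stops at the first closing </div>.
  const pattern = new RegExp(
    "<div\\b[^>]*\\bid=\"" + safeId + "\"[^>]*>([\\s\\S]*?)</div>",
    "i"
  );
  const match = pattern.exec(html);
  return match ? match[1] : null;
}

// Made-up sample page for illustration.
const page =
  '<body><div class="nav">menu</div>\n' +
  '<div id="main" class="box">Hello\nworld</div></body>';

console.log(extractDivById(page, "main"));
```

This only works when the target div has no divs nested inside it, because the match ends at the first closing tag it meets.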
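You can read the source text of the HTML page with an HTMLViewer.
In the DocumentComplete event:
me.ExecuteJavaScript "window.status = document.getElementsByTagName('html')[0].innerHTML;"
In the StatusChanged event:
Dim s As String = newStatus
' Keep everything after the opening tag of the div...
s = NthField(s, "<div id=""xxxxx"">", 2)
' ...and cut it off at the first closing tag.
s = NthField(s, "</div>", 1)
MsgBox s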
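The two NthField calls boil down to "take what follows the opening tag, then cut at the first closing tag". A rough JavaScript sketch of the same slicing is below. The helper name fieldAfter is hypothetical, and the code only approximates the NthField behaviour; it is not a drop-in replacement.

```javascript
// Hypothetical helper: take what follows the first occurrence of `openTag`,
// then cut the result at the first "</div>".
function fieldAfter(html, openTag) {
  const start = html.indexOf(openTag);
  if (start === -1) return ""; // opening tag not present in the page
  const rest = html.slice(start + openTag.length);
  const end = rest.indexOf("</div>");
  // Without a closing tag, return everything after the opening tag.
  return end === -1 ? rest : rest.slice(0, end);
}

// Made-up sample markup for illustration.
const sample = '<p>intro</p><div id="xxxxx">inner text</div><div>other</div>';
console.log(fieldAfter(sample, '<div id="xxxxx">'));
```

Note that the opening tag has to match character for character, so a div written as <div class="a" id="xxxxx"> would be missed by this approach.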
Regex is the way to go; I scrape content from HTML that way all the time.
But remember not to overdo it: http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454
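A small JavaScript illustration of where regex scraping breaks down, using made-up markup: as soon as the target div contains another div, neither a non-greedy nor a greedy pattern returns the right contents.

```javascript
// Made-up markup: the outer div contains a nested div and is followed by another div.
const html = '<div id="outer">a<div>b</div>c</div><div id="next">d</div>';

// Non-greedy: stops at the FIRST </div>, which closes the inner div.
const nonGreedy = /<div id="outer">([\s\S]*?)<\/div>/;
// Greedy: runs to the LAST </div> in the page, swallowing the following div.
const greedy = /<div id="outer">([\s\S]*)<\/div>/;

const shortMatch = nonGreedy.exec(html)[1];
const longMatch = greedy.exec(html)[1];

console.log(shortMatch); // a<div>b
console.log(longMatch);  // a<div>b</div>c</div><div id="next">d
```

The correct contents would be a<div>b</div>c. Getting that requires tracking how deeply the tags are nested, and that is the point where a real parser becomes the better tool.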
I’ve developed an HTML parser library in Xojo. It is similar to XMLDocument and can extract the contents of specific tags using XPath. But if you only analyze HTML pages that always have the same structure, it’s easier to use a regex.