Read HTMLviewer contents

I want to be able to grab a web page, copy its contents to the clipboard, and search for one item on that page (it’s from a music site that lists a song’s key). I can get the page up easily enough in an HTMLViewer but don’t know how to grab the text within.

Any help would be appreciated as it would allow me to get the key signatures of over 1000 songs.

Add this method to a module:

Public Function HTML(extends h as DesktopHTMLViewer) As String
  Return h.ExecuteJavaScriptSync( "document.documentElement.outerHTML;" )
End Function

Call as:

var result as String = HTMLViewer1.HTML
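
Since the original question also asked about the clipboard, here is a minimal sketch of one way to combine the two. It assumes the viewer is named HTMLViewer1, that the HTML extension method above is in a module, and that the code sits in the viewer's DocumentComplete event handler, so the page has finished loading before its content is read:

```xojo
// Assumed to live in the DocumentComplete event handler of HTMLViewer1.
Sub DocumentComplete(url As String)
  // Read the loaded page through the extension method defined above.
  Var result As String = Me.HTML

  // Put the page source on the system clipboard.
  Var c As New Clipboard
  c.Text = result
  c.Close
End Sub
```

Reading the content earlier, for example right after calling LoadURL, may return an empty or partial document.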

I’ll try it!

If you want the (more or less) plain-text content:

Public Function PlainText(extends h as DesktopHTMLViewer) As String
  Return h.ExecuteJavaScriptSync( "document.documentElement.textContent;" )
End Function
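
To then pull a single item out of that text, something like the following might work. This is only a sketch: ValueAfterLabel is a hypothetical helper name, and the label "Key:" is an assumption about how the page presents the key signature, so check the actual text the page returns first.

```xojo
// Hypothetical helper: returns the rest of the line that follows a label,
// or an empty string if the label does not appear in the text.
Public Function ValueAfterLabel(source As String, label As String) As String
  Var startPos As Integer = source.IndexOf(label)
  If startPos < 0 Then Return ""

  // Keep everything after the label, normalize line endings to LF,
  // then take only the first line and strip surrounding whitespace.
  Var rest As String = source.Middle(startPos + label.Length)
  rest = rest.ReplaceLineEndings(Chr(10))
  Return rest.NthField(Chr(10), 1).Trim
End Function

// Possible usage, assuming the page text contains a line such as "Key: ...":
// Var songKey As String = ValueAfterLabel(HTMLViewer1.PlainText, "Key:")
```

If the key does not sit on the same line as its label in the extracted text, the helper would need adjusting to match the page's layout.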

I do feel obligated to say that, instead of scraping from a web site which may be a violation of that site’s licensing or otherwise questionable, you should see if there is an API available that you can use to get the information you need. Maybe something like this?

Worked! But I first had to update the HTMLViewer to the Desktop version (DesktopHTMLViewer).

Thanks. As for the comment that scraping a site may be a violation of licensing, etc., I hadn’t thought of that. I highly doubt it’s a problem, since it’s a website of hymns I’m scouring, but I’ll look into that.

Again, thanks. This is the type of thing an amateur programmer like me would never have figured out on my own.

Happy to help!
