HTMLviewer Text

I am sorry about this, but I have read some of the posts relating to HTMLviewer and I still can’t work out how to grab the text from an HTMLviewer…

I tried to follow some of the replies to questions about getting stuff from an HTMLviewer, but no complete code is shown and some of the links are broken.

I cannot use plugins, as I cannot afford them.
So any answer must be free to use; this is not for commercial use.

Any help appreciated.

Ian.
(I have the original html text if that helps.)

When you say “text” do you mean the actual HTML, or just the text as seen on the screen when the HTMLviewer renders the text within it?

If the former then you could do:

myhtmlviewer.ExecuteJavaScript ("document.title = document.documentElement.innerHTML;")
This will cause a TitleChanged event for the HTMLviewer where you can pick up the result. I just tested this, it seems to work.

If you want the latter, you may have to just use the method here and parse the output to get what you want.

I have the original HTML that is used in the HTMLviewer, but I need to get the text so I can process it.

Thanks for your Reply.

Ian

Better not to assign to the title, as that may truncate the text.

Better to run the JavaScript and return the result directly, e.g. via the HTMLViewer.HTMLTextMBS function if you use the MBS Xojo Plugins.

I just did another test. Instead of .innerHTML, as above, try the exact same but with .innerText instead.

Christian is right in that a problem with the existing events is that what you get back may be incomplete (limited no. of chars). But maybe Xojo will change that, who knows. It’s been a long-standing problem and is what caused me to circumvent the issue with a WebSocket Server in my app, but that’s a bit heavy-handed.

Hi,

Thanks for everything you have done but I have a stupid question (as usual)

Where do I put the code… in a button, or somewhere else?

Ian

I have put this in the Action event of a button… and nothing happens:

htmlviewer1.ExecuteJavaScript ("document.title = document.documentElement.innerTEXT;")

Ian

  1. innerText not innerTEXT

  2. You have to implement the TitleChanged event on the HTMLViewer. That is where you handle the incoming event. It has a parameter (newTitle) which contains the output.
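
To make the "where does it go" part concrete, here is a minimal sketch of the two pieces. It assumes the controls are named Button1, HTMLViewer1 and TextArea1 (adjust to your own names), and that your Xojo version exposes the text area's contents as .Value; in some versions the property is .Text instead.

```xojo
// Button1, Action event:
// ask the page to copy its visible text into the document title.
HTMLViewer1.ExecuteJavaScript("document.title = document.documentElement.innerText;")

// HTMLViewer1, TitleChanged(newTitle As String) event:
// runs when the script above changes the title; newTitle carries the text.
TextArea1.Value = newTitle
```

The two lines live in two different event handlers: the first only triggers the script, and the result arrives later in TitleChanged.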

Thank you very much Tim,

I placed the lines

htmlviewer.ExecuteJavaScript ("document.title = document.documentElement.innerText;")
textarea1.value = newtitle

and lo and behold, textarea1 filled with the text from the HTMLviewer.

Ian

Actually that was a bit of a surprise to me that this worked, I’d not come across documentElement before. And even more so that this was a simple way to get the HTML and Text back from an HTMLViewer.

Now, you may wish to check that this works on macOS, Windows, and Linux (or at least whichever ones you care about). Also, I’d be inclined to check for length limits. Put a big chunk of raw text (several kbytes) in your HTML and see if you can get it all back. When I tested this a year or so ago, I was sure that TitleChanged was limited to 4 kbytes on Windows, with no limit on macOS. Also, the other similar event, for Window.StatusChanged, didn’t work at all on Windows.
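
One rough way to run that length check, as a sketch only: load a page with a known number of characters and log how many come back. The control names (Button1, HTMLViewer1), the 20000 figure and the temporary file name limit_test.html are all just assumptions for illustration.

```xojo
// Button1, Action event:
// load a page whose body is exactly 20000 "x" characters.
Dim body As String
For i As Integer = 1 To 20000
  body = body + "x"
Next
HTMLViewer1.LoadPage("<html><body>" + body + "</body></html>", SpecialFolder.Temporary.Child("limit_test.html"))

// HTMLViewer1, DocumentComplete(url As String) event:
// once the page has loaded, ask for its text via the title.
Me.ExecuteJavaScript("document.title = document.documentElement.innerText;")

// HTMLViewer1, TitleChanged(newTitle As String) event:
// log how many characters actually arrived.
System.DebugLog("Received " + Str(Len(newTitle)) + " characters")
```

If the logged count is well below 20000 on a given platform, the title route is truncating there. TitleChanged may also fire for reasons other than your script, so look at the entry logged after DocumentComplete.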

Have just done a quick count and one email was more than 256 words…

Thanks again.

Ian

Ian,

more details there:

https://www.w3schools.com/Jsref/prop_node_innertext.asp

Thanks for the info, Emile.
Ian

Happy to help.

Note all the other technologies there @ www.w3schools.com.