Trying to extract HTML code from an HTMLViewer

I have a similar objective to this thread:
https://forum.xojo.com/38031-extract-text-from-an-htmlviewer/0#p310106

I am trying to extract HTML code from an HTMLViewer, with no luck.
I have tried the MBS method using MBS 8.3, and I get type errors with “WebViewMBS”.
This method requires adding 3 additional plugins to xojo/plugins, which I would like to avoid to keep overhead low.

I tried the JavaScript method and the subclass method, but have not been able to get either to work.
I do not completely understand the concept of JavaScript in the subclass and how it will reference my HTMLViewer … ?

Please advise.

The method in that thread no longer works out of the box. window.status is not a solution anymore, and you have to chunk data out through the title. Chances are you don’t need a HTMLViewer, so I recommend using a HTTPSocket to download text from the internet. http://documentation.xojo.com/index.php/HTTPSocket

Tim

It is unfortunate that window.status no longer works.
I will try the Socket method and report back either way.

Thanks for pointing me in the right direction.

Hi, using the MBS plugin, can you try this:

  dim  webSat , cadSat As String
  dim posi, posf as Integer
  
  #If TargetWin32
    webSat=htmpage.IEHTMLTextMBS
  #ElseIf TargetMacOS
    webSat=htmpage.mainFrameMBS.dataSource.data
  #endif

The plugins have dependencies.

Whether you need them or not, the compiler needs them to build.
They probably don’t all end up in the app.

But you need MacControls, MacCocoa, MacBase and Main at least, I think.

Dependencies are listed here:
https://www.monkeybreadsoftware.net/plugindeps.shtml

Christian, yes, I know MacControls, MacCocoa, MacBase, and Main are needed; I received an email from you about this.

Tim,

I have tried the Socket method and put the code into a Shared Method.
It does retrieve a page. The bad news is that it retrieves a page with status code 300 (this page has moved).
When using HTMLViewer, the exact page is loaded.
The code I am using is not able to download the exact page I wanted, because it has moved.
The next question is: how can I emulate a browser using Sockets? Perhaps with a form, or certain HTTP headers?

Here is my current Socket code, which gets a page with a (300) code:

  Dim http As New HTTPSocket
  http.Yield = True ' set before Get; it has no effect once the request has finished
  Dim data As String = http.Get(URL, 30)
  Return data

A HTTP status code 30x (page has moved) is not something emulating a browser will fix. The classic framework HTTPSocket leaves more things to the Xojo developer, and will not automatically follow redirects. You can find the URL you’re being directed to in the headers, and follow that.

If you use the new framework's Xojo.Net.HTTPSocket instead, it will follow redirects (and store cookies, which can be handy for emulating a browser). It is, unfortunately, a little more complicated to use with the classic framework. http://developer.xojo.com/xojo-net-httpsocket
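As a rough sketch of what that looks like in a desktop project: Xojo.Net.HTTPSocket is asynchronous, so the page arrives in its PageReceived event rather than as a return value. The names PageSocket (assumed here to be a property of type Xojo.Net.HTTPSocket on the window), StartDownload, HandlePageReceived and HandleError are made up for illustration, and the page is assumed to be UTF-8.

```xojo
' Kick off the request. PageSocket is a hypothetical window property
' that keeps the socket alive while the request is in flight.
Sub StartDownload(url As Text)
  PageSocket = New Xojo.Net.HTTPSocket
  AddHandler PageSocket.PageReceived, AddressOf HandlePageReceived
  AddHandler PageSocket.Error, AddressOf HandleError
  PageSocket.Send("GET", url)
End Sub

' Called when the final page (after any redirects) has been received.
Sub HandlePageReceived(sender As Xojo.Net.HTTPSocket, URL As Text, HTTPStatus As Integer, Content As Xojo.Core.MemoryBlock)
  ' Assumption: the page is UTF-8; True allows lossy conversion instead of raising.
  Dim html As Text = Xojo.Core.TextEncoding.UTF8.ConvertDataToText(Content, True)
  Dim source As String = html ' Text converts to String for classic-framework code
  ' ... use source here ...
End Sub

' Called when the request fails (no connection, timeout, and so on).
Sub HandleError(sender As Xojo.Net.HTTPSocket, err As RuntimeException)
  MsgBox("Download failed: " + err.Message)
End Sub
```

A String URL from classic code can be passed as `StartDownload(myURL.ToText)`.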


[quote=403386:@David Cullins]I have tried the Socket method and put the code into a Shared Method.
Actually it does retrieve a page. the bad news is it retrieves a page with a code 300 ( this page has moved )[/quote]

When you get status 30x the server will also include a “response header” named “Location” which will have the new location for the page. This is what a browser (or HTMLViewer) sees and just automatically sends another request to get that page. You just need to do the same thing yourself.

For example, see this thread.

Or as Tim suggested, use the new framework Xojo.Net.HTTPSocket which uses HTTP 1.1 and can do it for you. But it isn’t hard to do by yourself either with a classic HTTPSocket.
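A minimal sketch of doing it by hand with the classic HTTPSocket might look like the following. FetchFollowingRedirects and maxRedirects are illustrative names, not part of Xojo. The sketch assumes the Location header holds an absolute URL that a plain HTTPSocket can reach; a relative Location would have to be resolved against the previous URL first.

```xojo
' Sketch: follow 3xx redirects manually with the classic HTTPSocket.
' FetchFollowingRedirects and maxRedirects are hypothetical names.
Function FetchFollowingRedirects(url As String, maxRedirects As Integer = 5) As String
  Dim data As String

  For attempt As Integer = 0 To maxRedirects
    Dim http As New HTTPSocket
    http.Yield = True ' let the app stay responsive during the synchronous Get
    data = http.Get(url, 30) ' 30 second timeout

    Dim status As Integer = http.HTTPStatusCode
    If status < 300 Or status > 399 Then
      Return data ' not a redirect: this is the page (or an error body)
    End If

    ' 3xx: the server names the new address in the Location response header.
    If http.PageHeaders Is Nil Then Return data
    Dim location As String = http.PageHeaders.Value("Location")
    If location = "" Then Return data ' nothing to follow

    url = location ' assumption: Location is an absolute URL
  Next

  Return data ' gave up after too many hops; this is the last redirect body
End Function
```

The hop limit is there so that two pages redirecting to each other cannot loop forever.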

Thx Douglas,

I actually did download the 302 Moved page, parsed the new URL, and followed it using the Classic HTTPSocket, and I got a blank page. I then pasted this new URL into a browser manually and also got a blank page.

It looks like I'll be learning about Xojo.Net.HTTPSocket, but it is much more complex, as Tim noted.

For future Xojo updates, it would be handy if a new property could be added to HTMLViewer, such as HTMLViewer.SourceCode.
It would make life a lot easier. After all, all browsers have a source code option.

Thanks to everyone for the help.

I will be asking a lot of questions about Xojo.Net.HTTPSocket unless I get lucky and figure it out …

Hi @David Cullins

To solve the 3xx problem, try using HTTPSecureSocket instead of HTTPSocket.

Javier

I asked that very same question some weeks ago (error 300 / relocated URL) and got a good answer (change the transfer protocol).
That was with the Classic framework.

You are fortunate, I found that conversation here:
https://forum.xojo.com/48656-301-error-using-a-documentation-example

Andrew Lambert's advice was:

  Dim socket1 As New HTTPSecureSocket
  socket1.ConnectionType = SSLSocket.TLSv12
  Dim data As String = socket1.Get("http://www.xojo.com/", 30)

And that works.

Unfortunately, this never made it into the documentation.