htmlViewer.documentComplete - well not quite

James_Pitchford · March 15, 2016, 5:51pm

I’m using htmlViewer.documentComplete event to trigger a capture of the resultant web page utilising RenderWebsiteImageMBS.

I couple this with htmlViewer.cancelLoad to prevent loading of any URLs not originally requested, so that I only get one documentComplete event.

However, in many cases I find that the document is not, in fact, complete at that time, and RenderWebsiteImageMBS can end up producing a blank NSImage.

I have tried to fix this using a DelayMBS x, and am already up to a 0.2 second delay before the pictures seem to be reliably rendered. Not sure if this is a function of machine speed or internet speed.

Does anyone have a more reliable way of determining that the loaded document is really complete?

Michel_Bujardet · March 15, 2016, 6:41pm

DocumentComplete may fire several times on certain sites. The way I worked around it is to use a single timer and reset it at every occurrence of DocumentComplete; after the last one, the timer fires and I use its Action event instead.
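The timer workaround described here is essentially a debounce. The following is a minimal, language-neutral sketch of that logic in Python, not Xojo code: `fire_times` is a hypothetical helper, and the event times and quiet period (in milliseconds) are made-up values for illustration. Each event restarts the countdown, and the capture would happen only when the countdown runs out undisturbed.

```python
def fire_times(event_times, quiet_period):
    """Return the times at which a restartable single-shot timer would fire.

    event_times  -- times (ms) at which a "document complete" event arrives
    quiet_period -- timer period (ms); every event restarts the countdown
    """
    fires = []
    deadline = None  # None means the timer is not running
    for t in sorted(event_times):
        if deadline is not None and t >= deadline:
            # The timer ran out before this event arrived, so it fired.
            fires.append(deadline)
        # Every event (re)starts the countdown from its own arrival time.
        deadline = t + quiet_period
    if deadline is not None:
        # No more events: the last countdown runs to completion.
        fires.append(deadline)
    return fires


# Events close together collapse into a single firing after the last one.
print(fire_times([0, 50, 120, 300], 200))
# An event arriving after the quiet period causes a second firing.
print(fire_times([0, 50, 400], 200))
```

The second call shows the trade-off: if an event arrives later than the quiet period, the timer has already fired once, so a longer period is safer but adds waiting time to every capture.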

James_Pitchford · March 15, 2016, 7:05pm

Thanks Michel - I’ll give that a try though it does look like the length of the timer would need to be dependent on internet speed - which is a variable…

Michel_Bujardet · March 15, 2016, 7:11pm

If you make the period something like 100-200 ms you should be fine. It is much less critical than a simple delay. You can even go to 500 ms without it feeling strange to the user.

Tim_Parnell · March 15, 2016, 7:13pm

What kind of results do you get if you track http://documentation.xojo.com/index.php/HTMLViewer.DocumentProgressChanged instead?

James_Pitchford · March 15, 2016, 8:02pm

Much the same Tim. The progressChanged =100% event always seems to fire before the documentComplete event.

James_Pitchford · March 15, 2016, 10:30pm

I am using a separate window to build the htmlViewer in and then to capture the rendered picture. It seems the delay needed to ensure the viewer is complete is different according to whether the viewer window is visible or not.

For a visible window, a delay of less than 0.1 seconds seems to reliably catch a completed rendering, but if the window is set to visible = false, then the delay needs to go out beyond 0.5 seconds to reliably get a complete rendering.

Strange huh?

The challenge of course is that I have only tested a small number of websites. Other sites may take longer. It would be good if this stuff was a little more deterministic.

Norman_P · March 15, 2016, 10:49pm

page received ≠ page rendered, which is what you’re assuming

the page html etc is downloaded
at that point the page received event is raised
then it’s handed to the rendering engine which does its thing with it
there is no “page fully rendered” event, which is what you’re using page received as - but that’s not correct

not sure we have a way to know when it is rendered

James_Pitchford · March 15, 2016, 10:54pm

Ah, that explains it Norman. Thanks.

Then the delay required is a function of the processing speed, rather than internet speed, and presumably the size/complexity of the web page being downloaded. Hmm…

Will_Shank · March 15, 2016, 11:08pm

Workaround idea: after you get the initial image, scan a diagonal of the pixels. If they are all white, run another 0.1 second timer and repeat until there are non-white pixels.
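A rough sketch of the diagonal scan, written in Python rather than Xojo so the logic can be checked in isolation. The image is assumed to be a list of rows of `(r, g, b)` tuples, and `diagonal_has_content` is a hypothetical helper name; reading pixels from the real captured picture is left out.

```python
WHITE = (255, 255, 255)


def diagonal_has_content(pixels, background=WHITE):
    """Return True if any pixel on the top-left to bottom-right diagonal
    differs from the background colour."""
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    if width == 0 or height == 0:
        return False  # an empty image has nothing rendered
    steps = max(width, height)
    for i in range(steps):
        # Scale the step so the diagonal runs corner to corner
        # even when the image is not square.
        x = i * (width - 1) // max(steps - 1, 1)
        y = i * (height - 1) // max(steps - 1, 1)
        if pixels[y][x] != background:
            return True
    return False


# A blank 4x3 image, then the same image with one dark pixel
# in the bottom-right corner, which lies on the diagonal.
blank = [[WHITE] * 4 for _ in range(3)]
partial = [row[:] for row in blank]
partial[2][3] = (0, 0, 0)
print(diagonal_has_content(blank), diagonal_has_content(partial))
```

A positive result only shows that something has been drawn along the diagonal. It says nothing about whether the rest of the page has finished rendering, and content that sits entirely off the diagonal is missed.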

Norman_P · March 15, 2016, 11:10pm

[quote=253543:@James Pitchford]Ah, that explains it Norman. Thanks.

Then the delay required is a function of the processing speed, rather than internet speed, and presumably the size/complexity of the web page being downloaded. Hmm…[/quote]
Very likely

And there is no “web page rendered” event even from the underlying objects

So knowing when the page has fully rendered is tricky as heck for sure

Michel_Bujardet · March 15, 2016, 11:29pm

Here is DocumentComplete on the site http://cnn.com :

12:27:37 AM Window1.HTMLViewer1.DocumentComplete document complete http://widgets.outbrain.com/nanoWidget/3rd/comScore/comScore.htm#pid=235
12:27:38 AM Window1.HTMLViewer1.DocumentComplete document complete http://ads.pubmatic.com/AdServer/js/showad.js#PIX&ptask=DSP&kdntuid=1&SPug=true&p=35675&predirect=http%3A%2F%2Fsync.rhythmxchange.com%2Fusersync%2Fpubmatic%2F&it=0&np=0
Window1.HTMLViewer1.DocumentComplete document complete http://ads.pubmatic.com/AdServer/js/showad.js#PIX&ptask=DSP&kdntuid=1&SPug=true&p=35675&predirect=http%3A%2F%2Fsync.rhythmxchange.com%2Fusersync%2Fpubmatic%2F&it=0&np=0
12:27:39 AM Window1.HTMLViewer1.DocumentComplete document complete about:srcdoc
Window1.HTMLViewer1.DocumentComplete document complete about:blank
Window1.HTMLViewer1.DocumentComplete document complete about:blank
Window1.HTMLViewer1.DocumentComplete document complete http://aspen.turner.com/static/xdm_iframe.html?xdm_e=http%3A%2F%2Fedition.cnn.com&xdm_c=default4455&xdm_p=1
12:28:48 AM Window1.HTMLViewer1.DocumentComplete document complete about:blank
Window1.HTMLViewer1.DocumentComplete document complete about:blank
12:39:07 AM My Application Ended

That is just an example of where the timer workaround I use works reliably. A simple delay will never do.

James_Pitchford · March 16, 2016, 12:01am

I’ll give Will’s and Michel’s a go and let you know.

Peter_Job · March 16, 2016, 3:30am

[quote=253547:@Norman Palardy]
And there is no “web page rendered” event even from the underlying objects

So knowing when the page has fully rendered is tricky as heck for sure[/quote]

Not that this is any help, and it relates to Windows, but using an MS WebBrowser control, you get a “NavigateComplete2” event when all is done. Is something similar not available in the “underlying objects”?

Norman_P · March 16, 2016, 3:45am

Navigation Complete isn’t quite the same as “and it’s now showing” which is what James needs

James_Pitchford · March 16, 2016, 11:16pm

Well, Michel’s method certainly works - and is simple - but in my app it can cause long delays whilst the app is rendering a bunch of web pages.

Next to try Will’s method - which looks like it should cut the time down a bit.

Michel_Bujardet · March 16, 2016, 11:51pm

From what I can gather, Will’s method may capture the page before it is fully displayed, as soon as it shows anything other than white. Which means “not quite”.

If you want the page really complete, it’s ready when it’s ready. Not an instant before…

James_Pitchford · March 17, 2016, 2:01am

It looks like you are right, Michel. Will’s method is fast at getting fairly static websites, often taking around 100 ms to complete rendering, but more complicated websites, e.g. with embedded WordPress, don’t work: part of the website is loaded first and detected as a non-black pixel, so the page is captured only partially rendered.