As part of my (seemingly never-ending) quest to create a cross-platform Markdown parser for Xojo (iOS module done - GitHub) I need to access the contents of an HTMLViewer in desktop projects. Is this possible? It seems like it should be, but I can’t see a property or method in the docs that lets me do this. I don’t want to use a plugin. I don’t mind declares, but it must work on macOS, Windows and Linux.
Any help is appreciated.
Do you want to use the MBS Plugin for desktop?
First, you will notice that you need different code for:
- Windows with IE
- Windows with Chrome
- Linux with WebKit
- Mac with WebKit
The current method of getting data out of an HTML Viewer is to push the data through window.status so you can capture it in the StatusChanged event. If MBS is an option I would really recommend it. I went through a lot of trouble to make sure HTML Edit can get the data out on both Mac and Windows reliably (more an issue on Windows, really).
According to StackOverflow you should be able to get the source out quite easily. The only thing you won’t get is the doctype apparently.
window.status = document.documentElement.innerHTML; and then capture the result in StatusChanged.
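For anyone following along, here is a minimal sketch of the page-side half of that trick. It is only an illustration, not code from HTML Edit: the function name `publishPageSource` is made up, and `win` and `doc` are passed in as parameters (standing in for `window` and `document`) purely so the function can be exercised outside a browser.

```javascript
// Sketch only. `win` and `doc` stand in for the browser's `window` and
// `document`; they are parameters so this can run outside a browser.
// `publishPageSource` is a hypothetical name, not part of any library.
function publishPageSource(win, doc) {
  // innerHTML of the root element covers <head> and <body>, but not the
  // <html> tag itself and not the doctype.
  const html = doc.documentElement.innerHTML;

  // Assigning to window.status is what the host application is expected
  // to observe in its StatusChanged event.
  win.status = html;
  return html;
}

// Inside a page loaded into the viewer, the call would be:
//   publishPageSource(window, document);
```

The Xojo side then only has to read the new status text in StatusChanged. Whether that event fires reliably depends on the platform's rendering engine, as discussed above.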
You may want to try the headless browser approach as well, you can get results back from it through shell without hacking HTML viewer.
I’m still really surprised that Markdown is hard to parse; it seems so simple.
(I don’t disbelieve you, I’m just legitimately surprised.)
@Christian Schmitz As I said in the question, I don’t want to use plugins. I’m trying to do this for two reasons:
1). It should be doable in Xojo
2). I want to open source the module and I can’t do that idealistically if the code requires a proprietary license.
I have no issues with your plugins Christian - mostly they are excellent. In fact, I own a full license to them already. I’m also aware that you have a Markdown plugin but that has a couple of issues:
1). I don’t know what renderer you’ve ported but it doesn’t correctly parse John Gruber’s seminal syntax text (link) and it also doesn’t support various newer elements of Markdown such as code fences, tables or checklists
2). See point (2) above.
@Tim Parnell The hack to get the source code via the StatusChanged event seems too convoluted. Not sure the approach would work as the value I’m interested in (the source code) would be available asynchronously in the StatusChanged() event making it difficult for me to utilise. Essentially I want to be able to grab the contents synchronously like:
theContents = myHTMLViewer.contents
Is this doable with declares? It seems bananas that it’s not possible.
Regarding Markdown parsing - I’m really starting to get fed up of it! Here’s a brief summary of what I’ve been doing and why it’s hard:
1). I need a truly cross-platform (desktop and iOS) way to parse Markdown. This rules out plugins because they don’t work on iOS
2). Even if iOS wasn’t an issue (technically it’s not anymore because I’ve published an open source working fast Markdown parser for it that’s essentially a wrapper for remarkable.js) I can’t use the MBS plugins because of the reasons cited above as well as the fact that they don’t offer feature parity with the iOS implementation.
3). I tried (and have had moderate success with) writing my own parser in native Xojo. It made heavy use of RegEx. Trouble is it’s slow, incomplete and I’ve reached an impasse.
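To illustrate why point 3 runs into trouble, here is a deliberately tiny regex-based converter. It is a sketch written for this thread, not MarkdownKit's code, and it is in JavaScript rather than Xojo; the names `escapeHtml`, `renderInline` and `renderMarkdown` are invented for the example. It handles only ATX headings, paragraphs, bold, italics and inline code.

```javascript
// Escape the characters that are special in HTML text.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Inline pass. Bold runs before italic so "**" is not consumed as two "*".
function renderInline(text) {
  return escapeHtml(text)
    .replace(/\*\*(.+?)\*\*/g, "<strong>$1</strong>")
    .replace(/\*(.+?)\*/g, "<em>$1</em>")
    .replace(/`(.+?)`/g, "<code>$1</code>");
}

// Block pass. Blocks are separated by blank lines; a single-line block
// starting with 1-6 "#" characters is a heading, anything else a paragraph.
function renderMarkdown(source) {
  return source
    .split(/\n{2,}/)
    .map((block) => block.trim())
    .filter((block) => block.length > 0)
    .map((block) => {
      const heading = /^(#{1,6})\s+(.*)$/.exec(block);
      if (heading) {
        const level = heading[1].length;
        return `<h${level}>${renderInline(heading[2])}</h${level}>`;
      }
      return `<p>${renderInline(block)}</p>`;
    })
    .join("\n");
}
```

Simple input works, but the passes interfere with each other: `renderInline("`a * b * c`")` turns the asterisks inside the code span into `<em>` tags, which a correct parser must not do. Each such case needs another special rule or an extra pass, which is roughly how a regex approach ends up both incomplete and slow.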
You mentioned the HTMLEdit control. How does that get the contents of an HTML viewer?
I’ve read this topic a few times, and am confused as to the intent.
If you wish to get the “HTML” behind a webpage to perform some processing on… why not bypass the HTMLViewer altogether, and just use a socket to download the page into a string variable directly?
Or am I missing something else here?
I also got one markdown engine for a client running in HTMLViewer already.
[quote=307203:@Garry Pettet]@Tim Parnell The hack to get the source code via the StatusChanged event seems too convoluted. Not sure the approach would work as the value I’m interested in (the source code) would be available asynchronously in the StatusChanged() event making it difficult for me to utilise. Essentially I want to be able to grab the contents synchronously like:
theContents = myHTMLViewer.contents
Is this doable with declares? It seems bananas that it’s not possible.[/quote]
Hours of fiddling around with StatusChanged and dealing with the shortcomings, er, intricacies of Windows.
As for RegEx to parse, I don’t understand why it would be slow. I don’t use Markdown regularly, so I frequently check this reference which makes it seem like there isn’t really too much to it. This is where my confusion comes from. I haven’t researched it or anything, just my quick glance feels like it shouldn’t be hard.
It would seem that the most efficient way to get a cross-platform solution is to do just the same as what you do in iOS: use HTMLViewer.
On macOS, it is possible through declares to do exactly the same as the iOS declare.
See Shao Sean’s contribution here: https://forum.xojo.com/14481-should-changing-an-htmlviewer-s-window-status-work-in-windows/0
OK. I think I’ve got my head around how to use JS to push the contents of a div into window.status and then retrieve it. Trouble is, I can’t get it to work.
Here’s a link to a completely stripped down example project - link.
The project has two modules: (1) Shakespeare (the module for using JS to parse Markdown) and (2) MarkdownKit (my Xojo native attempt at a Markdown parser). Running the app will let you try to parse with either Shakespeare or MarkDownKit.
Basically, Shakespeare creates an instance of my HTMLViewer subclass (BetterHTMLViewer) and loads a string constant containing a bare bones HTML document, the embedded remarkable.js script and a custom JS function to call remarkable.js. The constant is stored in Shakespeare.MARKDOWN_CONVERTER_HTML.
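As a rough sketch of what such a custom JS function could look like (this is not the code in Shakespeare.MARKDOWN_CONVERTER_HTML, which isn't shown in the thread): the names `convertAndPublish`, `RESULT_PREFIX` and `ERROR_PREFIX` are hypothetical, and the prefixes are an assumption added so the host can tell converter output apart from any other status text. The renderer is passed in as a plain function so the sketch does not depend on a particular remarkable.js version.

```javascript
// Hypothetical markers so the host can recognise converter output.
const RESULT_PREFIX = "md-result:";
const ERROR_PREFIX = "md-error:";

// `win` stands in for `window`; `render` is any function that maps a
// Markdown string to an HTML string.
function convertAndPublish(win, render, markdown) {
  let payload;
  try {
    payload = RESULT_PREFIX + render(markdown);
  } catch (err) {
    // Report failures through the same channel, so a script error can be
    // told apart from the status event simply never firing.
    payload = ERROR_PREFIX + String(err && err.message ? err.message : err);
  }
  win.status = payload;
  return payload;
}

// In the page, with remarkable.js loaded, the call might look like this
// (the constructor's exact name depends on the remarkable.js version):
//   const md = new Remarkable();
//   convertAndPublish(window, (text) => md.render(text), "# Hello");
```

On the Xojo side, StatusChanged would then check for the prefix and strip it before using the HTML. None of this helps if the event never fires, but the error prefix at least rules out a silent JavaScript exception as the cause.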
You’ll notice that no answer is given when using Shakespeare. In fact, the StatusChanged event never seems to fire.
What am I doing wrong here?
Tested adding actual HTML elements to the Shakespeare constant.
It’s not loading up, so there’s a bug in your code somewhere.
Still looking, just thought you might like a status update.
@Tim Parnell Thanks a lot for taking a look. Much appreciated. Dumping the contents of the constant into a .html file and visiting it in a browser seems to work so I can’t see the error. Then again, I’ve been staring at this project too long!
Oh. I see.
You can’t use HTML Viewer to render anything without actually displaying it.
I had done testing with that when people wanted to print HTML Edit contents without MBS; we ended up gathering the contents, opening a new HTML Viewer and printing it.
That’s different than iOS it would seem. The same trick works on iOS by only instantiating iOSHTMLViewer in code. I hadn’t realised that the desktop HTMLViewer must be instantiated in the IDE.
Hmmm. I feel like I’m SOL with this approach.
@Christian Schmitz Fancy wrapping Pandoc in a plugin? The Markdown engine is way better than the one you currently have (no offence) plus as a bonus, it offers a bunch of other conversion formats…
I just looked at the iOS project.
If you use the macOS Method I linked to above, you can do exactly the same.
window.status = xxx;
It could even be conditional based on platform.
Pandoc is GPL, so I can’t use it in a plugin.
(And you can’t use it in a commercial app!)