MARKDOWN Class for Xojo™ - now available

I have recently completed a custom class for Xojo that translates Markdown code to HTML. This class is written 100% in Xojo and uses no declares or custom helper utilities. It supports the basic CommonMark specification as best I can tell, and includes a number of built-in extensions that let it support Markdown syntax beyond what I have seen in other available classes.

A demo can be downloaded for free.

Note: the Linux version has NOT been tested, as I have no machine to test it on.
The class SHOULD work with Console and Web apps as well, but again I have no platform to test on :frowning:

The test Markdown code used in the demo can be viewed here
the HTML results can be viewed here

The unencrypted source code will be available shortly for $69.00US

And I am soliciting feedback on more features that might be added.

Currently here is what is included

| Markdown Feature         |  
+--------------------------+ 
| H1...H6 headers          |  
| Horizontal Rule          |  
| Full Character Escapes   |  
| Typographic Replacements |  
| Bold Text                |  
| Italic Text              |  
| Strikethru               |  
| Insert/Underline         |  
| SuperScript              |  
| Subscript                |  
| [[Keyboard]] Highlight   |  
| Fractions                |  
| Block-quotes             |  
| Unordered Lists          |  
| Ordered Lists            |  
| Task Lists               |  
| Inline Code              |  
| Fenced Code              |  
| Tables                   |  
| Markdown Links           |  
| Auto-convert Links       |  
| Images                   |  
| Limited Emoji Support    |  
| Footnotes                |  
| Table of Contents        |  
| Printable Page Breaks    |  

Also, if for some reason you would like a 32-bit version of the demo, let me know.

Now supports custom containers…

The test Markdown code used in the demo can be viewed here
the HTML results can be viewed here

Impressive.

The Markdown syntax I am used to allows me to include html tags. Are you aware of that?

See here: https://spec.commonmark.org/0.28/#raw-html

Yours converts any < into &lt; - that makes it impossible to add <div …> (to add color to text, for instance) or comments with <!-- … -->.

…does it work with Mojave?

[quote=401370:@Thomas Tempelmann]Impressive.

The Markdown syntax I am used to allows me to include html tags. Are you aware of that?

See here: CommonMark Spec

Yours converts any < into &lt; - that makes it impossible to add <div …> (to add color to text, for instance) or comments with <!-- … -->.[/quote]
I am aware of that… and I have read various articles describing parsers that do and do not allow raw HTML…
This is something I need to look into… the issue is being able to determine what is syntactically correct without it being misinterpreted, since the < and > signs are used within Markdown as well.

I see no reason why not… since the output is pure HTML, and I doubt Mojave is changing how that works.

In regards to raw HTML… as I mentioned, various other Markdown parsers handle it in different ways, from not allowing it at all, to allowing only a “whitelist” of elements, to those that have a full-blown HTML lexer involved. Meaning there is no real “standard”.

So my Markdown parser takes this approach, which I think allows the most flexibility:

Right now markdownDS supports “code fences”, and to this I added a “language” option. If you specify HTML as the language, then anything between the code fences is taken AS-IS; otherwise it is rendered “safe” and wrapped in a line-number presentation box.

```` HTML
<div><span>This code is rendered to markdown with NO changes</span></div>
````

where this would be rendered as discrete text

```` 
<div><span>This code is rendered to markdown with NO changes</span></div>
````
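The two fence behaviours described above can be sketched roughly as follows. This is a minimal illustration in Python, not the markdownDS source (which is written in Xojo and has not been published in this thread). The function name `render_fences` and the `<pre><code>` wrapper are assumptions; the real class uses a line-number presentation box that is not reproduced here.

```python
import html

FENCE = "````"  # the examples in this thread use four backticks


def render_fences(markdown: str) -> str:
    """Pass HTML-tagged fences through untouched; escape every other fence."""
    out = []
    lines = markdown.split("\n")
    i = 0
    while i < len(lines):
        line = lines[i]
        if line.startswith(FENCE):
            # Whatever follows the backticks is treated as the language tag.
            lang = line[len(FENCE):].strip().lower()
            body = []
            i += 1
            while i < len(lines) and not lines[i].startswith(FENCE):
                body.append(lines[i])
                i += 1
            i += 1  # step past the closing fence (or the end if unclosed)
            if lang == "html":
                out.extend(body)  # taken as-is
            else:
                # Assumed wrapper; stands in for the line-number box.
                escaped = html.escape("\n".join(body), quote=False)
                out.append("<pre><code>" + escaped + "</code></pre>")
        else:
            out.append(line)
            i += 1
    return "\n".join(out)
```

Only the language tag decides whether the fenced body is escaped, so the parser never has to guess whether a stray < belongs to Markdown or to HTML.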

The demo will be updated in the next hour or two.

NOTE : it now also supports HTML style comments… ANYWHERE in the Markdown (does not have to be in an HTML fence)

I am also going to experiment with allowing

to “infer” an HTML fence as well

Note : This DID require a macOS declare to turn off SMARTDASHES etc.

I disagree: CommonMark clearly specifies this feature, as does the original Markdown spec.

The fact that others do not support this is clearly against both those specs. So, while some may omit this feature, the standard is quite clear on wanting to support this.

So, the least you should do is state that your implementation does not support this part of the standard instead of saying it’s not a clear part of the standard.

Also, seeing you intend to look into supporting this now for

, please note that the specs say it should work with almost anything that looks like an html tag. I, for instance, also use and

for this purpose, and it works with the MBS Markdown class that I currently use. And, of course, "

needs to be seen as a html tag as well.

I agree that this can be quite difficult to parse correctly, though.

I think 100% of the cases where this isn’t implemented in the interpreter boil down to this, really.

Supporting HTML means either taking it all in and not caring about the output (which I think could be a disclaimer but wouldn’t help much less knowledgeable people) or implementing various degrees of parsing a second language (the 1st being markdown) that can be orders of magnitude more complex.

I’d choose a middle ground: Make sure that all opened tags are closed (in proper nesting order) before the block is finished and ignore what the tags may be or do. It’s still a lot of work but it provides the functionality while putting the burden of HTML validation on the user.
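A rough sketch of that middle ground, in Python for illustration only: it checks that every opened tag is closed in proper nesting order and ignores what the tags mean. The function name `tags_balanced`, the regular expression, and the short list of void elements are assumptions made for this example; attribute values that themselves contain < or > are not handled.

```python
import re

# Matches opening, closing and self-closing tags; attributes are skipped.
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)[^<>]*?(/?)>")

# Elements that never take a closing tag (deliberately incomplete list).
VOID = {"br", "hr", "img", "input", "meta", "link"}


def tags_balanced(block: str) -> bool:
    """Return True if every opened tag is closed in proper nesting order."""
    stack = []
    for closing, name, self_closing in TAG.findall(block):
        name = name.lower()
        if self_closing or name in VOID:
            continue  # nothing to balance
        if closing:
            # A closing tag must match the most recently opened tag.
            if not stack or stack.pop() != name:
                return False
        else:
            stack.append(name)
    return not stack  # leftovers mean something was never closed
```

A block that fails this check could simply be escaped as text, which leaves the burden of writing valid HTML on the user.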

OK… I think I have it!! :slight_smile:
It’s a mixture of what I had said earlier today (HTML code fence) and what Thomas and Eduardo said.

The Markdown preprocessor looks for a qualified set of HTML elements. If it is not already inside a qualifying element, it starts a fenced area and proceeds until it finds the closing tag at the same level (this allows for nested DIVs, for example).

Once it has inserted the “fences” it then processes it as I described above. So far it seems to work just fine.

Here is a list of the elements it looks for:

  • ADDRESS
  • ARTICLE
  • ASIDE
  • BLOCKQUOTE
  • CANVAS
  • DIV
  • DL
  • FIGURE
  • FOOTER
  • FORM
  • MAIN
  • NAV
  • NOSCRIPT
  • SCRIPT
  • OL
  • P
  • PRE
  • SECTION
  • TABLE

The opening tag MUST start the Markdown line.
The closing tag “can” be anywhere, but if it is NOT at the end of the line, then any HTML following it will not be dealt with properly.

<div>test</div> // this works
<div><div>test</div></div> // this works
<div>test</div><span>xxx</span> // the span will not be processed 
<div>test</div>
<span>xxx</span> // but here it will be
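The preprocessing pass described above could look something like this. It is a sketch in Python under stated assumptions, not the actual markdownDS code: the name `insert_html_fences` is hypothetical, nesting is tracked only for the element that opened the block, and self-closing tags are not handled.

```python
import re

# The qualifying elements listed in this thread.
BLOCK_TAGS = {
    "address", "article", "aside", "blockquote", "canvas", "div", "dl",
    "figure", "footer", "form", "main", "nav", "noscript", "script",
    "ol", "p", "pre", "section", "table",
}
FENCE = "````"


def insert_html_fences(markdown: str) -> str:
    """Wrap runs of qualifying block-level HTML in HTML code fences."""
    out = []
    open_name = None  # element that started the current raw block
    depth = 0
    for line in markdown.split("\n"):
        if open_name is None:
            # The opening tag must start the line.
            m = re.match(r"<([A-Za-z][A-Za-z0-9]*)\b", line)
            if m and m.group(1).lower() in BLOCK_TAGS:
                open_name = m.group(1).lower()
                depth = 0
                out.append(FENCE + " HTML")
            else:
                out.append(line)
                continue
        # Count opens and closes of the same element to follow nesting.
        opens = len(re.findall(r"<%s\b" % open_name, line, re.IGNORECASE))
        closes = len(re.findall(r"</%s\s*>" % open_name, line, re.IGNORECASE))
        depth += opens - closes
        out.append(line)
        if depth <= 0:
            out.append(FENCE)
            open_name = None
    if open_name is not None:
        out.append(FENCE)  # close a block that never found its end tag
    return "\n".join(out)
```

In this sketch the whole line holding the final closing tag goes inside the fence, so anything after the closing tag on that line is passed through as-is and not handled separately.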

All the files for markdownDS have been updated on my server with the latest changes.
This includes the custom containers and support for raw HTML.

Great work.

BTW, if you want to be correctly compliant with CommonMark (I don’t care for it, though), this line is NOT a comment you should leave as that:

That’s because of the “--” inside. I don’t see a good reason for it (other than that they might think it would make parsing easier), but they made that intentional.

Another thing: In your sample markdown text, shouldn’t each paragraph (separated by two newlines) be separately enclosed in <p> tags? Because they aren’t. I wonder if you broke that with your latest changes.

[quote=401545:@Thomas Tempelmann]BTW, if you want to be correctly compliant with CommonMark (I don’t care for it, though), this line is NOT a comment you should leave as that:

[/quote]

Not sure I understand what you are getting at… Every online markdown editor that I tried either rendered that as a comment, or was an editor that didn’t understand raw html.

I will look into that. I think now it is just rendering it as “text”… but you are correct it should be wrapped in “p” tags.

I’m again referring to the CommonMark spec.

See CommonMark Spec vs. example 597

and it seems to boil down to “just because”… yet that same editor breaks (my opinion) if there are blank lines in a multi-line comment…

Yep, it makes not much sense. Just pointing it out because you mentioned CommonMark conformity. I don’t care how you solve it either way :wink:

They don’t even conform to themselves… :slight_smile: and Dingus is the only editor I found that does that… and even then not consistently

and for this I’m going with the HTML method… ends… regardless of what’s between

Commonmark does this because it’s invalid HTML, for stupid reasons unrelated to them.

This is one of the quirks of the history of HTML. In standards mode, HTML5 tries to play nice with XML: to remain a valid subset of XML it forbids double hyphens in comments, and XML in turn forbids them to stay compatible with SGML.

XML:
https://www.w3.org/TR/REC-xml/#sec-comments

SGML:
http://www.howtocreate.co.uk/SGMLComments.html#doubledash

HTML4:
https://www.w3.org/TR/html4/intro/sgmltut.html#h-3.2.4

HTML5:
https://www.w3.org/TR/html5/syntax.html#comments

Commonmark, thus, tries to play nice with HTML5 by being compliant with an exception to comments existing in XML which in turn exists to not upset SGML parsers (so, essentially, this is a “feature” to be compatible with the least used language of the family).

The suggestion of HTML5, which at least gives an option, is to add a space whenever two consecutive dashes are found within a comment but don’t terminate the tag. This, I think, is a good option to adopt for a parser: convert “--” to either “- -” or “—” (an em-dash).
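One way a parser might apply that suggestion, sketched in Python. The function name `soften_double_dashes` is made up for this example, and the sketch only rewrites dashes inside the body of a comment that is already delimited; a comment body that ends in a dash right before the terminator is an edge case it does not address.

```python
import re

# Non-greedy so each comment ends at the first "-->" that follows it.
COMMENT = re.compile(r"<!--(.*?)-->", re.DOTALL)


def soften_double_dashes(html_text: str) -> str:
    """Insert a space between consecutive dashes inside HTML comments."""

    def fix(match):
        body = match.group(1)
        # Repeat until no "--" remains, so runs like "---" are covered too.
        while "--" in body:
            body = body.replace("--", "- -")
        return "<!--" + body + "-->"

    return COMMENT.sub(fix, html_text)
```

Text outside comments is left alone, so a double dash in ordinary prose is still available for typographic replacement.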

It has to be tackled because Firefox, for example, chokes on double-dashes in comments in standards mode.

but again not consistent… at least my interpretation is

foo <!-- 
foo
 --

boo
-->

foo <!-- foo --boo -->

resolves to this (per the spec, assuming the Dingus editor used by the above examples is “correct”):

<h2>foo &lt;!--
foo</h2>
<p>boo
--&gt;</p>
<p>foo &lt;!-- foo --boo --&gt;</p>

yes… it changed it to “safe” html instead of comments… but uh… where did the <h2> come in?

Nice!

I look forward to when you release it :smiley:

You can download the demo at the links posted above.