MARKDOWN Class for Xojo™ - now available

  1. 7 months ago

    Dave S

    16 Aug 2018 San Diego, California USA
    Edited 7 months ago

    I have recently completed a custom class for Xojo that translates Markdown code to HTML. This class is written 100% in Xojo and uses no declares or custom helper utilities. It supports the basic CommonMark specification as best I can tell, and includes a number of built-in extensions that let it support Markdown syntax beyond what I have seen in other available classes.

    A demo can be downloaded for free.

    Note: the Linux version has NOT been tested, as I have no machine to test it on.
    The class SHOULD work with Console and Web apps as well, but again I have no platform to test on :(

    The test Markdown code used in the demo can be viewed here
    the HTML results can be viewed here

    The unencrypted source code will be available shortly for $69.00US

    And I am soliciting feedback on more features that might be added.

    Currently, here is what is included:

    | Markdown Feature         |  
    +--------------------------+ 
    | H1...H6 headers          |  
    | Horizontal Rule          |  
    | Full Character Escapes   |  
    | Typographic Replacements |  
    | Bold Text                |  
    | Italic Text              |  
    | Strikethru               |  
    | Insert/Underline         |  
    | SuperScript              |  
    | Subscript                |  
    | [[Keyboard]] Highlight   |  
    | Fractions                |  
    | Block-quotes             |  
    | Unordered Lists          |  
    | Ordered Lists            |  
    | Task Lists               |  
    | Inline Code              |  
    | Fenced Code              |  
    | Tables                   |  
    | Markdown Links           |  
    | Auto-convert Links       |  
    | Images                   |  
    | Limited Emoji Support    |  
    | Footnotes                |  
    | Table of Contents        |  
    | Printable Page Breaks    |  

    Also, if for some reason you would like a 32-bit version of the demo, let me know.

  2. Dave S

    17 Aug 2018 San Diego, California USA
    Edited 7 months ago

    Now supports custom containers...

    The test Markdown code used in the demo can be viewed here
    the HTML results can be viewed here

  3. Thomas T

    18 Aug 2018 Pre-Release Testers, Xojo Pro Europe (Germany, Munich)
    Edited 7 months ago

    Impressive.

    The Markdown syntax I am used to allows me to include html tags. Are you aware of that?

    See here: https://spec.commonmark.org/0.28/#raw-html

    Yours converts any < into &lt; - that makes it impossible to add <div ...> (to add color to text, for instance) or comments with <!-- ... -->.

  4. Emile S

    18 Aug 2018 Europe (France, Strasbourg)

    …does it work with Mojave?

  5. Dave S

    18 Aug 2018 San Diego, California USA

    @ThomasTempelmann Impressive.

    The Markdown syntax I am used to allows me to include html tags. Are you aware of that?

    See here: https://spec.commonmark.org/0.28/#raw-html

    Yours converts any < into &lt; - that makes it impossible to add <div ...> (to add color to text, for instance) or comments with <!-- ... -->.

    I am aware of that.... and I have read various articles about parsers that do and do not allow raw HTML.
    This is something I need to look into.... the issue is the ability to determine what is syntactically correct, so that it is not misinterpreted, since the < and > signs are used within Markdown as well.

    @Emile S …does it work with Mojave?

    I see no reason why not...... since the output is pure HTML, and I doubt Mojave is changing how that works.

  6. Dave S

    18 Aug 2018 San Diego, California USA
    Edited 7 months ago

    In regards to Raw HTML.... as I mentioned, various other Markdown parsers handle it in different ways, from not allowing it at all, to allowing only a "whitelist" of elements, to those that have a full-blown HTML lexer involved. Meaning there is no real "standard".

    So my Markdown parser takes this approach, which I think allows the most flexibility.

    Right now markdownDS supports "code fences"; to this I added a "language" option. If you specify HTML as the language, then anything between the code fences is taken AS-IS, whereas otherwise it is rendered "safe" and wrapped in a line-numbered presentation box.

    ```` HTML
    <div><span>This code is rendered to markdown with NO changes</span></div>
    ````

    whereas this would be rendered as escaped ("safe") literal text

    ```` 
    <div><span>This code is rendered to markdown with NO changes</span></div>
    ````

    The demo will be updated in the next hour or two.

    NOTE : it now also supports HTML style comments.... ANYWHERE in the Markdown (does not have to be in an HTML fence)

    I am also going to experiment with allowing <DIV> </DIV> to "infer" an HTML fence as well

    Note : This DID require a macOS declare to turn off SMARTDASHES etc.
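
For readers following along, here is a minimal Python sketch of the fence rule described in this post. It is not the markdownDS source (which is Xojo and is not shown in the thread). The function name `render_fences`, the regex, and the plain `<pre><code>` wrapper are assumptions for illustration only; the real class is said to wrap "safe" code in a line-numbered box instead.

```python
import html
import re

# Assumption: a fence is a line of three or more backticks, optionally
# followed by a language name. This is an illustration, not markdownDS code.
FENCE = re.compile(r"^\s*`{3,}\s*(\w*)\s*$")


def render_fences(markdown: str) -> str:
    """Pass HTML-tagged fences through as-is; escape every other fence."""
    out = []
    in_fence = False
    raw = False
    for line in markdown.splitlines():
        m = FENCE.match(line)
        if m and not in_fence:
            # Opening fence: remember whether its language is HTML.
            in_fence = True
            raw = m.group(1).upper() == "HTML"
            if not raw:
                out.append("<pre><code>")
        elif m and in_fence:
            # Closing fence.
            if not raw:
                out.append("</code></pre>")
            in_fence = False
        elif in_fence:
            # Inside a fence: raw HTML is untouched, anything else is escaped.
            out.append(line if raw else html.escape(line, quote=False))
        else:
            # Ordinary Markdown is left alone for later passes.
            out.append(line)
    return "\n".join(out)
```

With the two fenced examples from this post as input, the first `<div>` line would come out unchanged and the second would come out with `<` and `>` replaced by `&lt;` and `&gt;`.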

  7. Thomas T

    18 Aug 2018 Pre-Release Testers, Xojo Pro Europe (Germany, Munich)
    Edited 7 months ago

    @Dave S In regards to Raw HTML.... as I mentioned, various other Markdown parsers handle it in different ways, from not allowing it at all, to allowing only a "whitelist" of elements, to those that have a full-blown HTML lexer involved. Meaning there is no real "standard".

    I disagree: CommonMark clearly specifies this feature, as does the original Markdown spec.

    The fact that others do not support this is clearly against both those specs. So, while some may omit this feature, the standard is quite clear on wanting to support this.

    So, the least you should do is state that your implementation does not support this part of the standard instead of saying it's not a clear part of the standard.

    Also, seeing you intend to look into supporting this now for <div>, please note that the specs say it should work with almost anything that looks like an HTML tag. I, for instance, also use <span> and <p> for this purpose, and it works with the MBS Markdown class that I currently use. And, of course, <div style="color:red"> needs to be seen as an HTML tag as well.

    I agree that this can be quite difficult to parse correctly, though.

  8. Eduardo G

    18 Aug 2018 Pre-Release Testers Europe (Madrid, Spain)

    @ThomasTempelmann I agree that this can be quite difficult to parse correctly, though.

    I think 100% of the cases where this isn't implemented in the interpreter boil down to this, really.

    Supporting HTML means either taking it all in and not caring about the output (which I think could be covered by a disclaimer, but that wouldn't much help less knowledgeable people) or implementing various degrees of parsing a second language (the first being Markdown) that can be orders of magnitude more complex.

    I'd choose a middle ground: Make sure that all opened tags are closed (in proper nesting order) before the block is finished and ignore what the tags may be or do. It's still a lot of work but it provides the functionality while putting the burden of HTML validation on the user.
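
The middle ground suggested in this post (check only that opened tags are closed in proper nesting order, and ignore what the tags mean) can be sketched in Python with a simple stack. The names `tags_balanced` and `VOID`, and the regex-based tag matching, are illustrative assumptions and not part of any class discussed in the thread. The regex is deliberately naive, so attribute values containing `>` would confuse it.

```python
import re

# Assumption: a tag is "<", optional "/", a name, then anything up to ">".
TAG = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)\b[^>]*?(/?)>")

# A few elements that never take a closing tag (not an exhaustive list).
VOID = {"br", "hr", "img", "input", "meta", "link"}


def tags_balanced(block: str) -> bool:
    """Return True if every opened tag is closed in proper nesting order."""
    stack = []
    for m in TAG.finditer(block):
        closing, name, self_closing = m.group(1), m.group(2).lower(), m.group(3)
        if self_closing or name in VOID:
            continue  # nothing to balance for <br/> and similar
        if closing:
            # A closing tag must match the most recently opened tag.
            if not stack or stack.pop() != name:
                return False
        else:
            stack.append(name)
    # Anything left on the stack was opened but never closed.
    return not stack
```

A parser could run such a check on a candidate block and fall back to escaping the text when it fails, which keeps HTML validation itself the user's responsibility.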

  9. Dave S

    18 Aug 2018 San Diego, California USA
    Edited 7 months ago

    OK.... I think I have it!! :)
    It's a mixture of what I had said earlier today (HTML code fence) and what Thomas and Eduardo said.

    The markdown preprocessor looks for a qualified set of HTML elements. If it is not already inside a qualifying one, it starts a fenced area and proceeds until it finds a closing one at the same level (this allows for nested DIVs, for example).

    Once it has inserted the "fences" it then processes it as I described above. So far it seems to work just fine.

    Here is a list of the elements it looks for:

    • ADDRESS
    • ARTICLE
    • ASIDE
    • BLOCKQUOTE
    • CANVAS
    • DIV
    • DL
    • FIGURE
    • FOOTER
    • FORM
    • MAIN
    • NAV
    • NOSCRIPT
    • SCRIPT
    • OL
    • P
    • PRE
    • SECTION
    • TABLE

    The opening tag MUST start the markdown line.
    The closing tag "can" be anywhere, but if it is NOT at the end of the line, then any HTML following it will not be dealt with properly.

    <div>test</div> // this works
    <div><div>test</div></div> // this works
    <div>test</div><span>xxx</span> // the span will not be processed 
    <div>test</div>
    <span>xxx</span> // but here it will be
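
As a rough illustration of the preprocessor rule described in this post, here is a Python sketch that finds the line ranges to be fenced as raw HTML. It is an assumption-laden sketch and not the markdownDS implementation: the names `find_html_blocks` and `BLOCK_ELEMENTS` are made up here, whole lines are treated as the unit, and an opener that is never closed is simply left as ordinary Markdown.

```python
import re

# Element list as given in the post.
BLOCK_ELEMENTS = {
    "ADDRESS", "ARTICLE", "ASIDE", "BLOCKQUOTE", "CANVAS", "DIV", "DL",
    "FIGURE", "FOOTER", "FORM", "MAIN", "NAV", "NOSCRIPT", "SCRIPT", "OL",
    "P", "PRE", "SECTION", "TABLE",
}

# The opening tag must start the line.
OPEN_AT_START = re.compile(r"^<([A-Za-z][A-Za-z0-9]*)\b[^>]*>")


def find_html_blocks(lines):
    """Return inclusive (start, end) line-index pairs for raw HTML blocks."""
    blocks = []
    i = 0
    while i < len(lines):
        m = OPEN_AT_START.match(lines[i])
        if not m or m.group(1).upper() not in BLOCK_ELEMENTS:
            i += 1
            continue
        # Count opening and closing tags of the same element to track nesting.
        tag = re.compile(
            r"<(/?)" + re.escape(m.group(1)) + r"\b[^>]*>", re.IGNORECASE
        )
        depth, end = 0, None
        for j in range(i, len(lines)):
            for t in tag.finditer(lines[j]):
                depth += -1 if t.group(1) else 1
                if depth == 0:
                    end = j  # the closing tag at the same level as the opener
                    break
            if end is not None:
                break
        if end is None:
            i += 1  # assumption: an unclosed opener is left as ordinary Markdown
            continue
        blocks.append((i, end))
        i = end + 1
    return blocks
```

Each returned range could then be wrapped in an HTML code fence and handled by the existing fence logic. Because the sketch works on whole lines, anything after the closing tag on the same line stays inside the raw block, which mirrors the limitation noted above.
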
  10. Dave S

    18 Aug 2018 San Diego, California USA

    All the files for markdownDS have been updated on my server with the latest changes
    this includes the custom containers and support for Raw Html

  11. Thomas T

    19 Aug 2018 Pre-Release Testers, Xojo Pro Europe (Germany, Munich)

    Great work.

    BTW, if you want to be correctly compliant with CommonMark (I don't care for it, though), this line is NOT a comment, so you should not pass it through as one:

    <!-- a -- b -->

    That's because of the "--" inside. I don't see a good reason for it (other than that they might think it would make parsing easier), but they made that intentional.

    Another thing: In your sample markdown text, shouldn't each paragraph (separated by two newlines) be separately enclosed in <p> tags? Because they aren't. I wonder if you broke that with your latest changes.

  12. Dave S

    19 Aug 2018 San Diego, California USA

    @ThomasTempelmann BTW, if you want to be correctly compliant with CommonMark (I don't care for it, though), this line is NOT a comment, so you should not pass it through as one:

    <!-- a -- b -->

    Not sure I understand what you are getting at.... Every online markdown editor that I tried either rendered that as a comment, or was an editor that didn't understand raw html.

    @ThomasTempelmann Another thing: In your sample markdown text, shouldn't each paragraph (separated by two newlines) be separately enclosed in <p> tags? Because they aren't. I wonder if you broke that with your latest changes

    I will look into that. I think now it is just rendering it as "text"... but you are correct it should be wrapped in "p" tags.

  13. Thomas T

    19 Aug 2018 Pre-Release Testers, Xojo Pro Europe (Germany, Munich)

    @Dave S Not sure I understand what you are getting at.... Every online markdown editor that I tried either rendered that as a comment, or was an editor that didn't understand raw html.

    I'm again referring to the CommonMark spec.

    See https://spec.commonmark.org/0.28/#example-596 vs. example 597

  14. Dave S

    19 Aug 2018 San Diego, California USA

    and it seems to boil down to "just because".... yet that same editor breaks (my opinion) if there are blank lines in a multi-line comment...

  15. Thomas T

    19 Aug 2018 Pre-Release Testers, Xojo Pro Europe (Germany, Munich)

    Yep, it makes not much sense. Just pointing it out because you mentioned CommonMark conformity. I don't care how you solve it either way ;)

  16. Dave S

    19 Aug 2018 San Diego, California USA

    They don't even conform to themselves... :) and Dingus is the only editor I found that does that... and even then not consistently

    and for this I'm going with the HTML method.... <!-- starts and --> ends... regardless of what's between

  17. Eduardo G

    19 Aug 2018 Pre-Release Testers Europe (Madrid, Spain)
    Edited 7 months ago

    @ThomasTempelmann Yep, it makes not much sense. Just pointing it out because you mentioned CommonMark conformity. I don't care how you solve it either way ;)

    CommonMark does this because it's invalid HTML, for stupid reasons unrelated to them.

    This is one of the quirks of the history of HTML. In standards mode HTML5 tries to play nice with XML, and thus to be a valid subset of XML it forbids double-hyphens in comments to make it compatible with XML, which in turn forbids double-hyphens in comments to be compatible with SGML.

    XML:
    https://www.w3.org/TR/REC-xml/#sec-comments

    For compatibility, the string "--" (double-hyphen) must not occur within comments. Parameter entity references must not be recognized within comments.

    SGML:
    http://www.howtocreate.co.uk/SGMLComments.html#doubledash

    Note, XML (and therefore also XHTML when served using an XML based content-type) took the sensible step of making it not valid to have -- inside a comment

    HTML4:
    https://www.w3.org/TR/html4/intro/sgmltut.html#h-3.2.4

    Authors should avoid putting two or more adjacent hyphens inside comments

    HTML5:
    https://www.w3.org/TR/html5/syntax.html#comments

    If the XML API restricts comments from having two consecutive U+002D HYPHEN-MINUS characters (--), the tool may insert a single U+0020 SPACE character between any such offending characters.

    CommonMark, thus, tries to play nice with HTML5 by being compliant with an exception to comments existing in XML which in turn exists to not upset SGML parsers (so, essentially, this is a "feature" to be compatible with the least used language of the family).

    The suggestion of HTML5, which at least gives an option, is to add a space whenever two consecutive dashes are found within a comment but don't terminate the tag. This, I think, is a good option to adopt for a parser: Convert -- to either "- -" or to "—" (an em-dash).

    It has to be tackled because Firefox, for example, chokes on double-dashes in comments in standards mode.
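
The "- -" option suggested in this post can be sketched in a few lines of Python. The function name `sanitize_comments` is hypothetical, and the regex assumes a comment runs from `<!--` to the first `-->`. It does not handle a hyphen sitting directly before the closing `-->`.

```python
import re

# Assumption: a comment runs from "<!--" to the first "-->" that follows.
COMMENT = re.compile(r"<!--(.*?)-->", re.DOTALL)


def sanitize_comments(html_text: str) -> str:
    """Insert a space between consecutive hyphens inside comment bodies."""

    def fix(m):
        body = m.group(1)
        # Repeat until no "--" remains, so runs like "---" are split too.
        while "--" in body:
            body = body.replace("--", "- -")
        return "<!--" + body + "-->"

    return COMMENT.sub(fix, html_text)
```

Text outside comments is not touched, so a literal `--` in ordinary prose is left for the typographic replacement step to handle.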

  18. Dave S

    19 Aug 2018 San Diego, California USA
    Edited 7 months ago

    but again not consistent... at least my interpretation is

    foo <!-- 
    foo
     --
    
    boo
    -->
    
    foo <!-- foo --boo -->

    resolves to the following (per the spec, assuming the Dingus editor used by the above examples is "correct"):

    <h2>foo &lt;!--
    foo</h2>
    <p>boo
    --&gt;</p>
    <p>foo &lt;!-- foo --boo --&gt;</p>

    yes... it changed it to "safe" html instead of comments.... but uh... where did the <h2> come in?

  19. brian f

    20 Aug 2018 Pre-Release Testers, Xojo Pro Chilly California

    Nice!

    I look forward to when you release it :D

  20. Dave S

    20 Aug 2018 San Diego, California USA

    you can download the demo at the links posted above
