HTMLViewer to PDF (without using print mechanism)

In my desktop app I have an HTMLViewer that gets loaded with HTML my app generates on the fly. Mostly it is text, but there are some images my app creates and saves to disk so that the HTMLViewer can load them and display them in the HTML document as well.

I need to find a way to convert the displayed page in the HTMLViewer into a PDF file so that I can push that file on to a webservice, send it via email, store it on their filesystem, or do whatever else my business requirements dictate.

I want the PDF creation to be completely automated in code, i.e., any solution that involves printing the HTMLViewer and having the user select a PDF printer or similar will be rejected - I need to handle the PDF creation completely in code with no user interaction at all.

What are my options?

Thanks!

– Kimball

May depend on what OS you’re talking about

Need solutions for both Win 7 and newer, and OS X 10.7 and newer.

I’m happy to employ different solutions for each OS - as long as the output PDFs are basically the same.

well on OS X there’s a handy little cmd line helper app
http://gavinballard.com/automatically-converting-html-to-pdf-on-mac/

cross platform?
maybe http://wkhtmltopdf.org
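For what it's worth, here is a rough sketch of how a command-line converter like wkhtmltopdf could be driven with no user interaction. It is written in Python purely for illustration (from Xojo the same command line would go through a Shell call). The function names are made up for this example, and it assumes the wkhtmltopdf binary is installed and on the PATH. The basic invocation is just `wkhtmltopdf input.html output.pdf`.

```python
import shutil
import subprocess
from pathlib import Path


def build_wkhtmltopdf_command(html_path: Path, pdf_path: Path,
                              exe: str = "wkhtmltopdf") -> list[str]:
    """Return the argument list for a basic HTML -> PDF conversion.

    The helper name is hypothetical; only the CLI shape
    `wkhtmltopdf [options] <input> <output>` comes from the tool itself.
    """
    # --quiet suppresses progress output so the call stays silent.
    return [exe, "--quiet", str(html_path), str(pdf_path)]


def html_to_pdf(html_path: Path, pdf_path: Path,
                exe: str = "wkhtmltopdf") -> Path:
    """Run the converter and return the path of the generated PDF."""
    # Assumption: the binary is reachable via PATH (or exe is a full path).
    if shutil.which(exe) is None:
        raise FileNotFoundError(f"{exe} was not found on this system")
    # check=True raises CalledProcessError if the tool exits with an error.
    subprocess.run(build_wkhtmltopdf_command(html_path, pdf_path, exe),
                   check=True)
    return pdf_path
```

Calling `html_to_pdf(Path("report.html"), Path("report.pdf"))` would write the PDF next to the HTML file. Newer wkhtmltopdf builds may also need `--enable-local-file-access` before they will load images from disk, so check that against the version you bundle.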

Or PhantomJS. Even with zero Javascript experience I was able to understand the code. See http://phantomjs.org .

Of course, MBS has functions, too. But you need to load the HTML into an HTMLViewer and then print to PDF. It works, but you need to be careful with the timing and it’s rather slow (on the Mac side).

I had an idea. It might not be what you want. You could base64 the images and inline them in the HTML. No tools required, no user interaction, and would presumably work on all systems.
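To make the idea concrete, here is a minimal sketch of the inlining step, in Python just for illustration. The function name is hypothetical, and it assumes the images are referenced with double-quoted src attributes relative to a known folder. A regex is enough for HTML your own app generates, but it is not a general HTML parser.

```python
import base64
import mimetypes
import re
from pathlib import Path


def inline_images(html: str, base_dir: Path) -> str:
    """Replace local <img src="..."> references with base64 data URIs."""

    def to_data_uri(match: re.Match) -> str:
        src = match.group(2)
        # Leave remote and already-inlined images untouched.
        if src.startswith(("http://", "https://", "data:")):
            return match.group(0)
        path = base_dir / src
        # If the file is missing, keep the original tag as-is.
        if not path.is_file():
            return match.group(0)
        mime = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
        encoded = base64.b64encode(path.read_bytes()).decode("ascii")
        return f"{match.group(1)}data:{mime};base64,{encoded}{match.group(3)}"

    # Assumption: src values are double-quoted, as in app-generated HTML.
    pattern = re.compile(r'(<img\b[^>]*?\bsrc=")([^"]+)(")', re.IGNORECASE)
    return pattern.sub(to_data_uri, html)
```

The result is a single self-contained HTML file with no image files alongside it, which is easier to hand to any converter or send around on its own.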

Thanks @Tim Parnell , but I need to produce a real PDF file.

@Norman Palardy - the command line tool you pointed me to is VERY interesting. It’s basically a very small Objective-C wrapper that instantiates a browser in memory, loads up whatever page you point at it, then prints it to PDF using the built-in PDF creation libraries in OS X. All done in memory, and all automated. This is a perfect solution for OS X (the only thing better would be if I bothered to take the time to make the same library calls as this little tool and did it all inside my Xojo app… but I’m lazy. :wink: )

@Christian Schmitz - this tool looks interesting, but comes with an awful lot of overhead (i.e., on Windows, it bundles the VC++ portable library, etc.) I was sort of under the impression that your DynaPDF plugin could do this - but a cursory look at your docs did not yield anything obvious. Can DynaPDF do what I need?

@Beatrix Willius - I’ll definitely look into PhantomJS. When you say MBS has functions too - which ones? Do you mean in the DynaPDF library? Or are there functions in the other Xojo plugins?

Thanks everyone!

DynaPDF can’t convert HTML.

And all the webkit functions to make PDF are Mac only.

Could you create a PDF file instead of the HTML in the first place (using DynaPDF or similar) and then display the PDF in your HTMLViewer instead of showing the HTML? Then you’d be sure no matter what, people would always see the same thing and your PDF problem would be solved.

Of course, you’d have to rewrite your whole output routine…heh.

@Bill Gookin while waving my hand slowly across your field of view, channeling Luke Skywalker: That is NOT the solution I’m looking for…

Could you just “hide” the print in the background (e.g. https://forum.xojo.com/19961-shell-command-with-pdfcreator)?

@LangueR - essentially, yes, that is what I’d like to do, though I’m not sure that pdfcreator is the best tool for the job. PhantomJS looks like the frontrunner so far, though it has the disadvantage that it just renders the contents of the browser as a giant image, and then wraps that in a PDF. This makes for very large PDF files, and the text is not searchable.

On the other hand, it is rendered pixel-perfectly, so I don’t have to worry about fonts / layout issues at all.

Would the PrintToPDFFileMBS work (http://forums.realsoftware.com/viewtopic.php?f=1&t=47825&hilit=PDF)?

PrintToPDFFileMBS is Mac only. You can try it if you like.

I did try PrintToPDFFileMBS (as well as the renderToPDF flavor) and was only ever able to produce a blank page. I tried lots of different ways, but was never successful.

It looks like the winner for me so far is going to be PhantomJS. It’s fast, does a great job rendering the content, produces acceptably small PDF files, and is cross platform. Nearly ideal.

Thanks everyone!

Could you go at it from the other direction? Generate the PDF via code and display it in the app? That way there’s no HTML code to mess with.

You can use DynaPDF for viewing, but if that’s too pricey, we’ve managed to use the regular Monkeybread Mac plugins on the Mac side and QuickPDF on the Windows side. Kind of fugly but it worked.

There are lots of places where the app is already generating the HTML for in-app display, and I’d REALLY rather not have to re-write all of that to produce PDF instead. I think PhantomJS is going to work well for my use case.

Thanks!

I have an app too that needs this functionality. Either the app already generates the pages as HTML or the input source is HTML. I need Windows & Mac too, would love Linux but that is just a pipedream.

sb