A few months ago, I wrote an article on using LaTeX to easily manage and track a CV. I ended up using LaTeX instead of my first choice (Markdown + HTML Stylesheets), because I had a great deal of trouble getting the page sizing working properly. Since then, I’ve learned a fair bit more, and want to share my knowledge with you!
The Path
Shortly after writing the article mentioned above, I heard about Adam Wathan typesetting the book “Refactoring UI” in Markdown and generating PDFs from those files. He informed me on Twitter that he was using Prince XML to compile the PDF files. Looking into it, I decided it was way too expensive for the occasional (commercial) use that I was planning. It did, however, indicate to me that this was possible. If you’re looking for a free tool for personal use, Prince does allow it, and only adds a small logo on the first page. Instead, I then headed to alternativeto.net and had a look at the alternatives to Prince XML. There were 3 options listed - wkhtmltopdf, PDFReactor, and WeasyPrint. PDFReactor also has a licensing cost associated with it, so I instead focused on the other two.
WeasyPrint
My first look was WeasyPrint, as it looked the most similar to Prince XML. It takes a website, and turns it into a wonderful PDF. If you need to make brochures, or documents with images/diagrams/icons, this would probably be my recommendation. It’s not too complicated to set up and use, but it does require you to create the HTML file somehow, including all the assets and styles. Combining this with Tailwind CSS would probably be the fastest way to create a nice looking PDF. However, I wanted something simpler - just a set of Markdown files that could be turned into basic text-only PDFs.
wkhtmltopdf
This engine can be used with Pandoc (which I have previously used to turn .docx files into Markdown). Pandoc can take Markdown files directly and, with one command, generate the HTML and then the PDF. You can include CSS files and set many other options. Admittedly, I haven’t found too many easy-to-follow guides, and I find the Pandoc documentation confusing when you have little experience with the tool. As a side note, Pandoc also supports WeasyPrint as an engine.
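As a rough sketch of how the two engines slot into the same Pandoc call, the function below changes only the value passed to --pdf-engine. The function name md_to_pdf_with, the PANDOC override variable (handy for a dry run), and the file names are assumptions made for illustration, not part of the setup described here.

```shell
# Hypothetical helper: convert a Markdown file to PDF through HTML,
# using whichever HTML-based engine is named in the first argument.
# Usage: md_to_pdf_with <engine> <input.md> <output.pdf>
# PANDOC is an illustrative override, e.g. PANDOC=echo prints the arguments
# instead of running pandoc.
md_to_pdf_with() {
    local engine="$1"
    local input="$2"
    local output="$3"
    "${PANDOC:-pandoc}" -f markdown -t html5 "$input" \
        --pdf-engine "$engine" -o "$output"
}

# Example calls (assumed file names):
#   md_to_pdf_with wkhtmltopdf notes.md notes.pdf
#   md_to_pdf_with weasyprint  notes.md notes.pdf
```

Whichever engine is named must be installed separately; Pandoc only hands the generated HTML over to it.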
Reddit to the Rescue
While I had done a few tests, I didn’t have the time to invest in creating decent styles for either tool, especially since that was where I ran into the most issues originally. Instead, I put it on the back burner and continued working on my various other projects. That is, until the first week of February, when a user posted their setup on Reddit’s /r/unixporn subreddit. It included a very nice PDF generated from a very normal-looking Markdown file. Hunting through the comments, I found someone who had already asked the question of “how?”. Well…it turns out to have been pandoc + wkhtmltopdf. Following that discussion (and its recommendation of tufte-css), I have successfully compiled a few simple Markdown files into usable, readable PDFs.
Why?
I’ve heard this question a lot when it comes to things I spend my time investigating. The answer for this one is also pretty standard - efficiency. As a developer, I often have to write documentation or make notes about some process or another. When I expect to have upwards of 5 pages of documentation (especially with images, an index, etc), I stick with Sphinx. This is extra useful, as I can output to LaTeX, PDF, ePub, or HTML (among other things). Depending on the needs of my client, I can then compile the same files into any combination of formats they might need. However, if I’m looking at maybe a single page of documentation, setting up Sphinx is massive overkill, especially if it’s not for a client and I just want to keep track of some process I used. I tend to write this stuff in Markdown (and did so even before I could compile it into PDF easily) because I sometimes want to collect various items together, or add them to my internal documentation (which is HTML created from Markdown). Writing short notes in Markdown was therefore always my first approach. Now I can compile the Markdown into HTML (as normal), but also into PDFs for longer-term storage or sharing.
I also find Markdown much faster to type and format than anything like Google Drive, Microsoft Word, or Pages documents, since formatting is taken care of with just a few characters, instead of memorizing ever-changing (between the various applications) shortcuts, or having to use the mouse to select individual styles and settings. Best of all, Markdown is repeatable. I can write a dozen documents, and format them all the same way at the same time with one CSS file.
How?
This is surprisingly simple. Using the links in the Further Reading section, make sure you’ve downloaded the Tufte CSS file and fonts (or any CSS file you’d like to use), and save them somewhere. After that, make yourself a Markdown file you’d like to convert. Then use the command below:

pandoc -f markdown -t html5 ./fcm-notes/pandoc.md --pdf-engine wkhtmltopdf --css tufte.css -o "pandoc.pdf"

The options are pretty self-explanatory: -f for the input format (“from”), -t for the target format (“to”), --pdf-engine for the engine to use, --css for the CSS file to apply, and -o for the output file name. Note that the long options start with two hyphens and the quotes are plain straight quotes; typographic dashes or curly quotes copied from a web page will not work. You can also get fancier by creating a script to watch a specific file, or a bash alias to speed up the process of compiling a file. Either way, this should get you started!
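One possible take on the alias idea is a small shell function that applies the same stylesheet to any number of Markdown files. This is only a sketch: the function name md2pdf, the PANDOC override variable, and the choice to write each PDF next to its source file are assumptions for illustration, not something the command above requires.

```shell
# Hypothetical helper for ~/.bashrc: convert each Markdown file to a PDF
# with the same name, styling all of them with one shared CSS file.
# Usage: md2pdf <stylesheet.css> <file.md>...
# PANDOC is an illustrative override, e.g. PANDOC=echo for a dry run.
md2pdf() {
    local css="$1"
    shift
    local md pdf
    for md in "$@"; do
        # Strip a trailing .md and append .pdf: notes/a.md -> notes/a.pdf
        pdf="${md%.md}.pdf"
        "${PANDOC:-pandoc}" -f markdown -t html5 "$md" \
            --pdf-engine wkhtmltopdf --css "$css" -o "$pdf" || return 1
    done
}

# Example call (assumed file names):
#   md2pdf tufte.css ./fcm-notes/pandoc.md ./fcm-notes/other.md
```

Stopping at the first failure (the return 1) is a design choice; dropping it would let the loop carry on with the remaining files instead.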
Future
Now that I have the Markdown → PDF workflow working, I will see about using pandoc to convert Markdown into Doc formats. This way, I can start writing my articles as Markdown files and host them internally as a website for easy searching, instead of hunting through a folder of Word and Google documents.
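For the curious, a first attempt at that conversion might look like the sketch below. It assumes “Doc formats” means Word’s .docx, which is one of Pandoc’s output formats; the function name md_to_docx, the PANDOC override variable, and the default output name are made up for illustration.

```shell
# Hypothetical helper: convert a Markdown file to a Word .docx document.
# Usage: md_to_docx <input.md> [output.docx]
# If no output name is given, the .md suffix is swapped for .docx.
# PANDOC is an illustrative override, e.g. PANDOC=echo for a dry run.
md_to_docx() {
    local input="$1"
    local output="${2:-${input%.md}.docx}"
    "${PANDOC:-pandoc}" -f markdown -t docx "$input" -o "$output"
}

# Example call (assumed file name):
#   md_to_docx articles/pandoc.md
```

No PDF engine is involved here, since Pandoc writes .docx files itself.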
Conclusion
While Pandoc can do an almost overwhelming number of things, starting with a few simple (but powerful) options seems best. From there, you can move on to creating reveal.js slideshows, or any number of other formats. Have you ever used this? Or are you inspired to do so now? Feel free to share any awesome use-cases, recommendations, questions, or article requests with me via email at lswest34+fcm@gmail.com.
Further Reading
https://www.reddit.com/r/unixporn/comments/al1uge/i3wm_my_comfy_notetaking_setup/ - Reddit thread
https://pandoc.org/MANUAL.html - Pandoc manual
https://weasyprint.org/ - WeasyPrint
https://github.com/edwardtufte/tufte-css - Tufte CSS repo