This time, I am going to work with a real document rather than with one of the thousands of packages available on ctan.org. I am going to convert the existing text file of a book into a PDF by way of LaTeX. I was inspired to do this after I watched a professional bookbinder on YouTube print and bind a book starting from a PDF.

The starting file I will use is the text file for a children’s book called Whitefoot the Wood Mouse by Thornton W. Burgess. Burgess was a favourite author of mine when I was in elementary school. He wrote more than 170 books and over 15,000 stories. Some of the books are available at gutenberg.org.

First I needed a text file for the selected book. On the Gutenberg site it is available in three different epub formats and two Kindle formats, as well as an HTML file and a plain text file, which is the one I wanted. The text file needs some editing. The Table of Contents needs to be removed because LaTeX will build a new one. I also removed the Gutenberg Licence since I plan to print and bind one copy as a gift.

I saved the txt file as an odt file using LibreOffice. Then I used the Writer2LaTeX filter extension to convert the file to the tex format. Once the file had been converted, I opened the tex file in my preferred editor, TeXstudio, to do the cleanup. Of course I could have done some of the cleanup in the odt file before converting it. I could also have edited the generated tex file in a plain text editor, but I like the advantage of the color coding in TeXstudio, and I find it easier and faster to work with the tex file than with the odt file.

The first thing I did, before any editing, was generate the PDF in TeXstudio. I wanted to be sure the file was error free before I started deleting unnecessary code. Good thing I compiled first. LibreOffice inserted two LaTeX commands to make the Table of Contents clickable, but there was an error somewhere in the hypersetup command.
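For readers who have not met them, these two commands belong to the hyperref package. The sketch below shows roughly what such a pair looks like and how a leading percent sign disables a line. The option shown is an assumption chosen for illustration; this column does not show what the generated file actually contained.

```latex
% Illustrative only: the real options in the generated file are not shown here.
\usepackage{hyperref}
\hypersetup{colorlinks=true}

% A leading percent sign turns a line into a comment, so LaTeX ignores it:
% \usepackage{hyperref}
% \hypersetup{colorlinks=true}
```

Commenting a line out rather than deleting it makes it easy to restore if the change turns out not to be the cause of the error.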
Since I intend to print the generated PDF on paper, hypertext links are irrelevant. So I commented out both the hypersetup and the usepackage{hyperref} commands in the generated file. Since that cleared the error, I then deleted those two commands.

Remember to regenerate the PDF each time you change the LaTeX commands. This checks the code for errors and also saves the current code. Doing this after every change is a little time consuming. However, I learned long ago that when making changes to code (or just about anything) you should change one thing, make sure it works as you want, then change the next thing and check again.

There were a number of “length” commands which LibreOffice generated so that the geometry of the LaTeX-generated PDF would match the page style used in LibreOffice. I will set page size, printed area size, and other page geometry options later in the LaTeX process, so I removed those length commands. There was also a short section of footnote rules and another section of page styles. Neither is needed, so I removed them. Then I generated the PDF again to check for errors. This time there were none.

Next I made the necessary corrections to the title and author in the preamble. That made the title and author in the body redundant, so they were removed.

I noticed every paragraph of the text started with {\ttfamily{ and ended with a closing curly brace. While those delimited the paragraph, they were unnecessary, because every paragraph was separated from the next one by at least two CR/LF (two presses of the <enter> key). Two CR/LF codes in a row, which leave a blank line, are what LaTeX uses to separate paragraphs. The ttfamily command also sets the font for all the text to “tt”, a monospaced font family which is inappropriate unless you want text that looks as if it were made with a typewriter. I eliminated \ttfamily{ …} for the first two paragraphs to make sure this was safe. No problems I could see.
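As a rough before-and-after sketch of that test: the paragraph text below is placeholder wording, and the exact bracing the converter writes may differ from this simplified shape.

```latex
% Before: each paragraph wrapped in a monospaced-font group (simplified shape)
{\ttfamily First paragraph of the story.}

{\ttfamily Second paragraph of the story.}

% After: wrapper removed; the blank line alone separates the paragraphs
First paragraph of the story.

Second paragraph of the story.
```

Both versions produce two paragraphs. Only the font changes, from the monospaced family back to the document default.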
So I used Search and Replace in TeXstudio and deleted them all.

As I scrolled through the tex file, I saw the \bigskip command many times. LaTeX uses this command to add a large amount of vertical white space. I assume the Writer2LaTeX extension converted multiple CR/LF codes to bigskip. Since I will use other LaTeX instructions to control white space, I removed the bigskip commands as well. Again I generated the PDF and scrolled through it to make sure there was no inappropriate white space and there were no long paragraphs. (This is a children’s book, so paragraphs should be short, two or three sentences. If you are working with a different file, adjust your editing accordingly.) This finished the majority of the cleanup.

Then it was time to make this text file look more like a book. At this point in the editing process, the PDF was 33 full letter-size pages. The document class was article, and the default type size was 10pt. This is a book for children. It was originally printed and bound in a size suitable to be held by small hands. As in most books, each chapter should start on a separate page. I changed the document class to book and changed the default font to Noto Serif.

In order to take advantage of the built-in heading formatting, I styled each chapter heading as a chapter. Search and Replace was my friend here. The original text has the word CHAPTER at the beginning of each one, so it is easy to change the word CHAPTER to the command \chapter each time. Be aware of three things. First, the \chapter command is not defined in the article class; it is available in book (and in report), so change the document class first. Second, the title of each chapter needs to be inside curly braces, so each chapter heading needs personal attention. Third, LaTeX numbers chapters automatically. If the text you work with has chapter numbers, as Whitefoot the Wood Mouse did, you should probably delete them when you edit the chapter command. Note: If you do not want the automatic chapter numbering then change the \chapter command to \chapter*, i.e.
add an asterisk after the word “chapter”. This also removes the chapter from the Table of Contents.

Many of the chapters in Whitefoot the Wood Mouse have two-line poems at the beginning. The text of these poems started with either a curly brace or a backslash, either of which will cause problems when the file is compiled. As I edited the chapter commands, I also removed the troublemakers. Those two-line poems reminded me of a package I reviewed in FCM #212 which formats small bits of text at the beginning of chapters. The package is called epigraph and is very simple to use. However, there were a lot of epigraphs in this little book, and coding each one as an epigraph took time. These changes had to be done individually. Search and Replace would have eliminated all the backslashes and curly braces in the file, not just the ones around the epigraphs. That would have caused more work than I wanted to think about doing.

I adjusted the leading (spacing between lines), and added a paragraph indent. After I use the geometry package (next issue), I will look at a printed page to find out if I think my work is usable by primary school students. I am also considering using drop caps, large initial capital letters at the beginning of the first paragraph of each chapter. Perhaps I can find some illustrations to add to the text. These are tasks for next time.

Here is the code (with a few comments) for the preamble so far, followed by the code to make an epigraph:

\documentclass[letterpaper]{book}
%book document class means pages will be printed on both sides
%each chapter will start on a right hand page
\usepackage[]{epigraph} %See code below and accompanying image
\usepackage[regular]{noto-serif} %An initial font change
\usepackage{setspace} %Sets space environment between lines
\usepackage[indent=3em,parfill=1em,skip=\baselineskip]{parskip}
\title{Whitefoot the Wood Mouse}
\author{Thornton W. Burgess}
\date{\today}
\begin{document}
\maketitle \tableofcontents
\begin{spacing}{1.3} %Gives entire document extra space between lines
\chapter{Whitefoot Spends A Happy Winter}

To make an epigraph I used the following:

\epigraph{You never can tell! You never can tell! Things going wrong will often end well.}{Whitefoot}

There are images of an epigraph and of the beginning of a chapter with this column.

For the second part of this project, I am going to add a drop cap in the first paragraph of each chapter to add a little visual interest. I will also alter the page geometry to get the file ready for imposition. Imposition is the process of placing the text on the correct printed pages so that they can be bound into a book. If you do not need or want to make your file into a printed book, then imposition is not necessary. However, you might still want to change the page geometry to make the PDF look more like a printed book. If I can find some suitable illustrations I may include them. I hope you will join me next time.
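For anyone who wants to try the fragments above before next time, here is a minimal sketch of how they might fit together in one file. The preamble, chapter title and epigraph are copied from this column. The placeholder sentence, the commented-out \chapter* variant and the closing \end{spacing} and \end{document} lines are assumptions about how the file is finished, since they are not shown above.

```latex
\documentclass[letterpaper]{book}
\usepackage[]{epigraph}
\usepackage[regular]{noto-serif}
\usepackage{setspace}
\usepackage[indent=3em,parfill=1em,skip=\baselineskip]{parskip}
\title{Whitefoot the Wood Mouse}
\author{Thornton W. Burgess}
\date{\today}

\begin{document}
\maketitle
\tableofcontents

\begin{spacing}{1.3}
\chapter{Whitefoot Spends A Happy Winter}
% Unnumbered alternative (assumption, not used in the column):
% \chapter*{Whitefoot Spends A Happy Winter}
% \addcontentsline{toc}{chapter}{Whitefoot Spends A Happy Winter}

\epigraph{You never can tell! You never can tell! Things going wrong will often end well.}{Whitefoot}

Placeholder for the text of the first chapter.

% ...remaining chapters go here...
\end{spacing} % assumption: closes the spacing environment opened above
\end{document}
```

The commented-out lines show one way to keep an unnumbered chapter in the Table of Contents: \addcontentsline adds the entry by hand, which \chapter* on its own does not do. Compile the file twice so the Table of Contents is filled in.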