Before We Begin
There is a lot of information in this section, so before you start reading it I want you to think about what kind of e-book you're making and why you're making it. The answers to these two questions will determine what material you need to understand and what you can safely skip. Some answers I can think of are:
- You have some handouts that you have created in MS Office and rather than printing them off and getting them photocopied you'd like to make PDFs out of them and distribute those to your students.
- You have some students with reading problems. You'd like to make Plain Text files from your handouts so these students can use Read Etexts, which supports Text To Speech with word highlighting.
- You have textbooks that you'd like to convert to e-books so your students don't have to lug them around. You don't care if the e-book is laid out exactly the same way on the screen as it is on the page, as long as the words and pictures are all there.
- You want to make Plain Text versions of your textbooks for your students with reading problems.
- You own some lavishly illustrated children's books and you'd like to make e-books out of them for your students. It is important that the e-book pages look exactly like the book pages.
- You own some lavishly illustrated children's books and you would like to make e-books from them to donate to the Internet Archive.
- You own some lavishly illustrated children's books older than 1923 and would like to donate the books themselves to the Internet Archive so the experts there can make e-books out of them.
- You own some books copyrighted before 1923 and you'd like to make Plain Text e-books to donate to Project Gutenberg.
- You own a copy of White Shadows In The South Seas, published 1919, which you are willing to scan and OCR for Project Gutenberg if only you could get some help with all the proofreading that would require.
- You've written a textbook yourself and you'd like to make an EPUB out of it.
- You want to collaborate with other teachers to create a textbook, and hope to get it translated into several languages.
- You teach a class where the students all have XO laptops and nothing else and you'd like to have the students make some simple e-books using just those computers.
From a technical standpoint, converting a document you created yourself into an e-book is trivial. It is no more difficult than saving a document made in one word processor into the format used by a different brand of word processor.
The website Booki provides a way to create e-books in collaboration with other authors and get those books translated into multiple languages. This very book you are reading was created using Booki.
Making an e-book out of a printed book is more difficult and more work than converting your own work into an e-book. You need to turn printed pages into images, turn images of text into text, proofread everything and correct several kinds of errors that will inevitably come up. Making an e-book to donate to Project Gutenberg or the Internet Archive is more work than making one for your own use. However, the results can be well worth the effort.
Every kind of e-book can be made with free software that is easy to use. In the chapters that follow I begin with the easiest possibilities (creating an e-book from a document you made) and finish with the more difficult ones. If you aren't planning to create an e-book from a printed book the first chapters may be the only ones you need to read.
I will explain how to do every task using Windows and Linux. Much of the software I'll talk about is available for the Macintosh as well, so if you have one you should be able to figure out how to do things there too. Most of the software we will use was originally written for Linux and later adapted to the other platforms. It is no more difficult to use than other Windows software. Sometimes I will explain tricks that only work in Linux, but I will always provide an alternate method for Windows. Linux is an operating system for those who like to open the hood and tinker. If you are a teacher some of your more difficult students may one day fall into this category. These tricks are for them, and may safely be ignored by others.
If you have a Macintosh and want to install and run the software described here you may need to use MacPorts, which you can learn more about here:
I'm not a Mac user so I won't be able to give detailed advice on installing these programs on a Macintosh.
Don't be intimidated by the amount of information in the chapter on scanning books. In the end all you're doing is taking pictures of the book pages with a digital camera, then rotating, cropping, and cleaning up those pictures. The detailed information in this chapter will make that process as painless as possible.
Some of the chapters have very short Python programs in them. Don't be put off by these. Like all other computer programs they are meant to save you work, and they will if you give them a chance.
Python programs can be run on Windows, the Macintosh, or Linux. Linux is the simplest, because Python is used so much on that platform. A typical Linux install will have Python installed by default. For Windows and the Mac you can download Python here:
The version you want will be Python 2.7.1. Python versions starting with 3 probably will not work. Don't be concerned that you aren't using the latest version of Python. At this time Python 3 is not widely used. When it is more mature I'll rewrite these programs to use it.
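If you aren't sure which version of Python is already on your computer, a few lines of Python can tell you. What follows is only a sketch: the function name describe_python and the wording of its messages are made up for this example, and it does nothing more than restate the advice above about the 2.7 series.

```python
import sys


def describe_python(version_info=None):
    """Return a short message about a Python version.

    describe_python is a hypothetical helper written for this example.
    With no argument it reports on the interpreter that is running it.
    """
    if version_info is None:
        version_info = sys.version_info
    major, minor = version_info[0], version_info[1]
    if major == 2 and minor >= 7:
        return "Python %d.%d: matches the 2.7 series this book assumes." % (major, minor)
    return "Python %d.%d: the programs in this book may not run unchanged." % (major, minor)


# Report on whichever Python is running this file.
print(describe_python())
```

Typing python --version at a command prompt gives you the same information without any program at all.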
The proofer.py utility requires PyGTK. While there is a PyGTK download for Windows, you'll need to use MacPorts to get it on the Macintosh. PyGTK is included with every Linux distribution.
To download and install PyGTK for Windows you'll need to follow the instructions here:
On Windows a version of GTK+ is included with The GIMP install, but it is not adequate for running PyGTK. You'll need to uninstall it, install the new GTK+ bundle, and replace the PATH entry for GTK to point to the new one. If that sounds like a lot more work than you normally go through to install a Windows program, it is. You may find running proofer.py on Windows more trouble than it's worth. The other Python programs should still be useful on Windows.
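Before you go to that trouble, or after you think you're done with it, you can ask Python whether it is able to find PyGTK. The sketch below assumes the usual PyGTK 2 way of loading the library (import pygtk, call pygtk.require, then import gtk). The function name pygtk_available is invented for this example and is not part of proofer.py.

```python
def pygtk_available():
    """Return True if PyGTK 2 can be imported, False otherwise.

    pygtk_available is a hypothetical helper written for this example.
    """
    try:
        import pygtk
        pygtk.require("2.0")   # ask for the GTK+ 2 series bindings
        import gtk             # the module a PyGTK program actually uses
    except Exception:
        # A missing module, a wrong version, or a GTK+ library that
        # cannot be loaded all end up here.
        return False
    return True


if pygtk_available():
    print("PyGTK was found.")
else:
    print("PyGTK was not found, so proofer.py will probably not start.")
```

A False answer only tells you that something in the chain is missing. It doesn't say whether the culprit is PyGTK itself, GTK+, or the PATH entry.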
The Python programs themselves can be downloaded here:
One trick for downloading them is to click on the program name on this page, which will give you a formatted listing of the code. When you get that, look to the upper right of the listing for a link named Raw blob data. Click on that to download the program.
To download all of the programs look for a link named Download master as tar.gz. That will give you an archive file that you can open with 7-Zip.
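If you already have Python installed you don't strictly need 7-Zip, because Python's standard tarfile module can unpack a tar.gz archive too. This is a minimal sketch, not a polished tool: the function name extract_tar_gz and the file and directory names in the usage note are placeholders, and like any unpacking program it should only be pointed at archives you trust.

```python
import os
import tarfile


def extract_tar_gz(archive_path, dest_dir):
    """Unpack a .tar.gz archive into dest_dir and return the member names.

    extract_tar_gz is a hypothetical helper written for this example.
    """
    if not os.path.isdir(dest_dir):
        os.makedirs(dest_dir)
    archive = tarfile.open(archive_path, "r:gz")
    try:
        names = archive.getnames()      # everything stored in the archive
        archive.extractall(dest_dir)    # write it all under dest_dir
    finally:
        archive.close()
    return names
```

With the downloaded archive saved as, say, master.tar.gz, calling extract_tar_gz("master.tar.gz", "programs") would place the files in a directory named programs. Both names are made up for illustration.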
A simple way to run these programs is to put Python in your system path (see http://www.computerhope.com/issues/ch000549.htm for instructions for Windows), put the program in the directory where the files you'll be working on live, make that your current directory, and run a command line like this:
python programname.py arguments
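If you're curious what happens to the arguments on that command line, Python hands them to the program as a list named sys.argv, with the program's own name in the first position. The tiny program below is not one of the book's utilities. It is a made-up illustration, and the name describe_arguments is hypothetical.

```python
import sys


def describe_arguments(argv):
    """Return one line of text per item on the command line.

    describe_arguments is a hypothetical helper written for this example.
    argv[0] is the program name; everything after it is an argument.
    """
    lines = ["program: %s" % argv[0]]
    arguments = argv[1:]
    for position, value in enumerate(arguments, 1):
        lines.append("argument %d: %s" % (position, value))
    if not arguments:
        lines.append("no arguments given")
    return lines


if __name__ == "__main__":
    for line in describe_arguments(sys.argv):
        print(line)
```

Saved as showargs.py (again, a made-up name) and run with python showargs.py book.txt, it prints a line naming the program followed by a line naming book.txt as argument 1.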