Open-source .epub vs. Adobe .pdf

Over the past few days I’ve been reading up on e-book file formats. I have a collection of short stories I want to publish, and I have a working understanding of the technology that readers will use to embrace that content, but until recently I haven’t worried too much about delivering content to that technology. (The main reason for my delay is simply the pace of change. Time spent trying to understand or master e-content technology six months ago would have put me at buggy-whip risk.)

As luck would have it, Mark Coker just released data about the file formats most in use on Smashwords, his e-publishing site. At the same time, Joel Friedlander pointed me to a useful video tutorial about formatting content using Adobe’s InDesign software, which seems to be the tool of choice for many people. From these two sources of information I was able to understand and easily navigate the first fork in the road on my own publishing journey.  

Like any author, what I want is for my text — the words I’ve written using whatever tools I’ve chosen to use — to be available to as many people as possible. That’s going to be my main, unchanging goal, no matter what else happens in the future. Because of the technological time we live in, reaching that goal means providing my text in various file formats so it can be accessed by the end user. Ideally there would be only one file format for publishing text, and it would be open source — meaning no one would own or control that particular format, and anyone could use it without having to pay a per-use fee or buy a proprietary application. (For obvious reasons, this is not the preferred course for companies looking to profit from the dissemination of text.)

According to Mark Coker’s file-format data covering the past year on Smashwords, Adobe’s proprietary .pdf file format was the most-used format at 35%, followed by the open-source .epub format at 22%. Mark also noted that this was a change from the previous year, when .epub beat .pdf handily.

Why would a proprietary format beat out an open-source format? In this case the answer has as much to do with the demands of the content being published as it does with the functionality of the file formats being used. As I recently learned, the .epub format’s strength is that it creates reflowable text, meaning text that adjusts itself to the size of the display, the font being used (if the user is able to change fonts), the size of the text, and other reader settings.
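To make "reflowable" concrete, here is a minimal sketch in Python. It is not how any e-reader actually renders .epub files. It only uses the standard textwrap module to re-wrap the same sentence for two hypothetical screen widths, and the reflow function, the sample sentence, and the widths are all made up for illustration.

```python
import textwrap


def reflow(text, width):
    """Re-wrap text so every line fits a display `width` characters wide."""
    # Collapse the original line breaks and spacing first, so the result
    # depends only on the display width, not on how the text was typed.
    normalized = " ".join(text.split())
    return textwrap.wrap(normalized, width=width)


# Hypothetical sample text standing in for a line of fiction.
story = (
    "It was a dark and stormy night; the rain fell in torrents, "
    "except at occasional intervals."
)

# Two made-up display widths: a narrow phone and a wider reader.
for width in (20, 40):
    print(f"--- {width} characters wide ---")
    print("\n".join(reflow(story, width)))
```

The words never change; only the line breaks do. A fixed-layout format, by contrast, keeps one set of line breaks no matter how wide the screen is.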

From the point of view of many authors, however, this is also .epub’s weakness. If what you are publishing is simply a long string of text — as most fiction tends to be — then .epub works fine. If your content includes tables, images, sidebars and other layout-specific elements, then .epub quickly becomes a nightmare because you cannot control when and how these elements will display across all of the various e-readers and viewing applications.

The .pdf file format solves these layout-specific problems because it fixes the layout of each page, much like a picture of the printed page. From the author’s point of view this is a godsend, because content will display essentially the same way for every user. For users, however, there is a downside. Precisely because .pdf text is not reflowable, it will not resize to fit each device or user setting. This means some users on some devices will need to zoom in and out to clearly see things like captions, table data, or sidebar text that may be in a smaller font. All of the information will be present as the author intended, but if the original page was 9 inches high by 6 inches wide, and the end user is looking at that same content on a Kindle or iPhone, there’s probably going to be some zooming involved, provided the device supports that functionality.

Because the stories I want to publish are straight text, the reflowable .epub format not only meets my needs as an author but also provides the most transparent reading experience for end users. That’s a win-win for me because I don’t have to make any trade-offs between my own authorial needs and the end-user reading experience. Having said that, the appeal of the .pdf format is clear because it preserves all the work an author puts into page layout and structure. If I had content that was dependent on images, data or layout, I’d have to decide whether to use .pdf, or how to translate all of those assets into .epub-friendly equivalents. Ugh.

As a follow-up, I encourage you to watch the InDesign tutorial I mentioned above. I learned a lot in the few short minutes it took to watch, and I think it will give you valuable insight into these issues. It will also introduce you to the learning curve you’ll be facing if you decide you want to do some of the more complex stuff yourself.

As for me, I won’t be buying InDesign any time soon, for three reasons. First, at $699 it’s pricey. Second, the life of a successful writer will be defined as much by keeping costs under control as by anything else, so there aren’t going to be a lot of dollars going out until there are a lot of dollars coming in. Third, if I ever decide the software is worth having, I’ll still want to compare the total cost (in time and money) of buying it, learning how to use it and paying for future upgrades with the total cost of having someone else provide that service. If I can get the end result cheaper I’ll go with the service; if not, I’ll buy the software.

It should be noted that you don’t need to buy InDesign, or any Adobe application, to create a .pdf file from most common source documents. Adobe’s proprietary tool for creating .pdf files is called Acrobat, and it currently sells for $299 in the U.S. The application that most computer users use to read .pdf files is called the Acrobat Reader, and it is distributed free, meaning anyone can freely read content that someone else has created as a .pdf. What is less commonly known is that the OpenOffice suite of applications includes a .pdf exporter in its Writer application, allowing documents in a wide variety of file formats (including MS Word .doc) to be exported as .pdf files. More info here (scroll down).


This is a cross-posting from Mark Barrett’s Ditchwalk.