HTML > PDF

I stumbled into the world of PDF critics this week.

Some criticism from as far back as 2001 (emphasis in original):

PDF was designed to specify printable pages. PDF content is thus optimized for letter-sized sheets of paper, not for display in a browser window. I often see users getting lost in PDF because the print-oriented viewer gives them only a small peephole on a big, complicated layout and they can’t scroll it in the simple, linear manner they are accustomed to on the Web. Instead, PDF files often use elaborate graphic layouts and split the content into separate units for each sheet of print. Although this is highly appropriate for printed documents, it causes severe usability problems online.

PDF pages lack navigation bars and other apparatus that might help users move within the information space and relate to the rest of the site. Because PDF documents can be very big, the inability to easily navigate them takes a toll on users. PDF documents also typically lack hypertext, again because they are designed with print in mind.

In a recent study of how journalists use the Web, we found that PDF files sometimes crashed the user’s computer. This happened most often to journalists working from home on low-end computers (especially old Macs). The more fancy the company’s press kit, the less likely it would get quoted.

Because PDF is not the standard Web page format, it dumps users into a non-standard user interface. Deviating from the norm hurts usability because, for example, scrolling works differently, as do certain commands, such as the one to make text larger (or smaller). Also, after finishing with a PDF document, users sometimes close the window instead of clicking the Back button, thus losing their navigation history. Although this behavior is not common, it is symptomatic of the problems caused when you present users with a non-standard Web page that both looks different and follows different rules.

Not all of this holds up almost two decades later. PDFs load more quickly now than they once did, for example, simply because computers and the Internet are faster. Memory constraints are not nearly as strict either, so it’s rare that a PDF will crash a computer or even a program.

That doesn’t mean every criticism is now obsolete. From gov.uk this year (emphasis added):

GOV.UK exists to make government services and information as easy as possible to find and use.

For that reason, we’re not huge fans of PDFs on GOV.UK.

Compared with HTML content, information published in a PDF is harder to find, use and maintain. More importantly, unless created with sufficient care PDFs can often be bad for accessibility and rarely comply with open standards.

We’ll continue to improve GOV.UK content formats so it’s easy to create great-looking, usable and accessible HTML documents.

We also intend to build functionality for users to automatically generate accessible PDFs from HTML documents. This would mean that publishers will only need to create and maintain one document, but users will still be able to download a PDF if they need to. (This work is downstream of some higher priorities, but is on the long-term roadmap).

I encourage you to read the gov.uk piece, as it covers the problems that persist with this format. I’m not always a partisan in these debates, but in this case I’m persuaded.

I do think that PDF is still a good option for exchanging documents by email. Anti-PDF posts often grant that PDF can be better than HTML for printing as well: it provides a clear print preview and is supported consistently across printer hardware and software. But for online reading, HTML wins.
