ECPA Wire: Industry Issues

Breaking Down the EPUB 3 Spec by Eric Freese, Aptara

Wednesday, October 5, 2011   (2 Comments)
Posted by: Sheri Toomb

Breaking It Down: The EPUB 3 Spec

By Eric Freese, Solutions Architect, Aptara and member of the EPUB 3 Working Group

EPUB is widely accepted as the de facto digital format standard for eBooks, with its signature reflowable text that can be read on the greatest variety of reading systems (including the iPad, nook/nookColor, Kobo, and Sony readers to name a few).

The much-anticipated revised edition of the IDPF standard, EPUB 3, will include new features that promise to greatly enhance the reader experience, such as embedded audio, video, and interactivity. Meanwhile, publishers hope that the new standard will end the frustration of EPUB 2.0.1 files behaving differently on different reading systems (a problem that has led me, and others, to stress the importance of testing your files on every intended device).

With speculation abounding since the introduction of the spec in the spring of this year, publishers have been waiting to see what’s really possible with EPUB 3, and which reading systems will support it. To help manage expectations and alleviate confusion, I’ve provided a brief snapshot of some of the spec’s new features, along with what publishers can do to start preparing for them and notes of caution about what may, or may not, be available in EPUB 3 reading systems.


There has been some confusion as to whether HTML5 and EPUB 3 will work together. To set the record straight, HTML5 is the base language of EPUB 3 (with some minor adjustments to allow for pagination and other reading behaviors). Since EPUB 3 content is written in HTML5, the two work hand in hand.

EPUB 3 reading systems must be able to process XHTML files written in HTML5. This doesn’t mean that web browsers will be able to display EPUB files, unless they are able to process the additional navigation information contained within the EPUB file. That being said, there are some reading systems that are implemented within a browser environment.
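
Because the content documents are XHTML, they have to be well-formed XML, which ordinary browser-tolerant HTML often is not. As a rough illustration, the Python sketch below checks whether a string parses as XML and has an `html` root in the XHTML namespace. The function name and the two sample strings are made up for this example, and the check is far short of real EPUB validation.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

def is_xhtml_document(markup: str) -> bool:
    """Return True if markup is well-formed XML whose root is an XHTML <html> element."""
    try:
        root = ET.fromstring(markup)
    except ET.ParseError:
        # Unclosed or mismatched tags make the document unusable as XHTML.
        return False
    return root.tag == f"{{{XHTML_NS}}}html"

# Hypothetical sample documents, for illustration only.
WELL_FORMED = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>Chapter 1</title></head>"
    "<body><p>Hello<br/>world</p></body></html>"
)
TAG_SOUP = "<html><body><p>Hello<br>world</body></html>"

print(is_xhtml_document(WELL_FORMED))
print(is_xhtml_document(TAG_SOUP))
```

The second sample is the kind of markup a browser would forgive, but it fails here because `<br>` and `<p>` are never closed.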


The new baseline for style sheets is CSS2.1 with some CSS3 features added. This will provide much richer layout including multi-column layout, better font support, and directional printing, to name a few. Reading systems are NOT required to support CSS, but almost all of them do. One of the leading causes of frustration is the difference in CSS support between reading systems. In the current EPUB environment, many reading systems do not allow style sheets within an EPUB file to override the system’s default settings. EPUB 3 does not do anything to alleviate this situation and, in fact, might exacerbate it somewhat due to the additional capabilities that are possible. Reading systems also have the ability to implement their own proprietary CSS extensions, which would then be ignored by other reading systems.


Audio can be inserted into eBook files using the HTML5 <audio> tag. This is what Apple, Barnes & Noble, and Amazon have been using all along to embed audio in enhanced eBooks. Now, it’s simply part of the EPUB 3 spec. Reading systems are NOT required to support audio, although many do. If a reading system supports audio, it must support MP3. In addition, support of MP4 AAC and media overlays (explained later) is optional.
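
For readers who want to see what this looks like in markup, here is a minimal Python sketch that generates an XHTML `<audio>` element with an MP3 source and a line of fallback text. The file path, the fallback wording, and the helper name `build_audio` are assumptions chosen for the example, not anything the spec prescribes.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

def build_audio(mp3_href: str, fallback_text: str) -> str:
    """Return an XHTML <audio> element with one MP3 source and fallback text."""
    ET.register_namespace("", XHTML_NS)
    audio = ET.Element(f"{{{XHTML_NS}}}audio", {"controls": "controls"})
    source = ET.SubElement(
        audio, f"{{{XHTML_NS}}}source", {"src": mp3_href, "type": "audio/mpeg"}
    )
    # Text placed after <source> is only meant for systems that do not
    # recognize the <audio> element at all.
    source.tail = fallback_text
    return ET.tostring(audio, encoding="unicode")

# Hypothetical file name, for illustration only.
audio_markup = build_audio(
    "audio/chapter01.mp3", "Your reading system does not support audio."
)
print(audio_markup)
```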


Video can be inserted into eBook files using the HTML5 <video> tag. Again, this is what has already been occurring. And again, reading systems are NOT required to support video. In fact, most of the e-Ink devices are not able to show video in a satisfactory manner.

One of my main bones of contention with the EPUB 3 spec is that there is no specified format that must be supported. If a reading system supports video, the spec recommends support of at least one of either H.264 (also known as MPEG-4 AVC) or VP8 video compression formats, but neither is required. Unfortunately, the spec also does not say that some other format is not allowed. Essentially, there is nothing to stop a reading system developer from implementing some other video format (Flash?). Whether that happens remains to be seen, but there is an opening available.

In the meantime, publishers are going to need to prepare videos in both formats to support the widest range of reading systems. As has been discussed in the past, this could lead to very large EPUB 3 files, or different versions that target specific reading systems.
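
One way to offer both formats from a single content file is to list them as alternative `<source>` children of one `<video>` element and let the reading system pick the one it can play. The Python sketch below generates such an element. The file names and the `build_video` helper are hypothetical, and the MIME types shown are the ones commonly used for H.264-in-MP4 and VP8-in-WebM files.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

def build_video(sources: list[tuple[str, str]], fallback_text: str) -> str:
    """Return an XHTML <video> element with one <source> per (href, MIME type) pair."""
    ET.register_namespace("", XHTML_NS)
    video = ET.Element(f"{{{XHTML_NS}}}video", {"controls": "controls"})
    last = None
    for href, mime_type in sources:
        last = ET.SubElement(
            video, f"{{{XHTML_NS}}}source", {"src": href, "type": mime_type}
        )
    # Put the fallback text after the last <source>, or directly inside
    # <video> when no sources were given.
    if last is not None:
        last.tail = fallback_text
    else:
        video.text = fallback_text
    return ET.tostring(video, encoding="unicode")

# Hypothetical file names, for illustration only.
video_markup = build_video(
    [("video/intro.mp4", "video/mp4"), ("video/intro.webm", "video/webm")],
    "Your reading system does not support video.",
)
print(video_markup)
```

Both files still have to ship inside the EPUB, so this does nothing for the file-size problem; it only avoids maintaining separate editions of the book.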

Media Overlays

Media overlay functionality was added to the spec to enable text and media to be presented in a combined manner, for example by highlighting text as it is spoken by the computer or as part of a sound track. In order to employ these overlays, special Synchronized Multimedia Integration Language (SMIL) files will have to be created.

Reading systems are not required to provide this functionality, but if they do, they should allow readers to skip or escape out of overlays. Overlays can also be used to provide text-to-speech functionality. The spec mentions the Pronunciation Lexicon Specification (PLS) and the Speech Synthesis Markup Language (SSML) as the means for providing assistance in generating synthetic speech, but does not require reading systems to use that information.
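
To give a feel for what such a SMIL file contains, the Python sketch below generates a small overlay document in which each `<par>` pairs a text fragment with a clip of an audio file. The file names, fragment IDs, and timings are invented for the example, and a real overlay also needs matching entries in the package file, which are not shown.

```python
import xml.etree.ElementTree as ET

SMIL_NS = "http://www.w3.org/ns/SMIL"

def build_overlay(text_href: str, audio_href: str,
                  clips: list[tuple[str, float, float]]) -> str:
    """Return a SMIL document pairing text fragments with audio clips.

    clips holds (fragment_id, start_seconds, end_seconds) tuples.
    """
    ET.register_namespace("", SMIL_NS)
    smil = ET.Element(f"{{{SMIL_NS}}}smil", {"version": "3.0"})
    body = ET.SubElement(smil, f"{{{SMIL_NS}}}body")
    for index, (fragment_id, begin, end) in enumerate(clips, start=1):
        # Each <par> plays its <text> and <audio> children in parallel.
        par = ET.SubElement(body, f"{{{SMIL_NS}}}par", {"id": f"par{index}"})
        ET.SubElement(
            par, f"{{{SMIL_NS}}}text", {"src": f"{text_href}#{fragment_id}"}
        )
        ET.SubElement(
            par,
            f"{{{SMIL_NS}}}audio",
            {
                "src": audio_href,
                "clipBegin": f"{begin:.3f}s",
                "clipEnd": f"{end:.3f}s",
            },
        )
    return ET.tostring(smil, encoding="unicode")

# Hypothetical fragment IDs and timings, for illustration only.
overlay_markup = build_overlay(
    "chapter01.xhtml",
    "audio/chapter01.mp3",
    [("sentence1", 0.0, 2.4), ("sentence2", 2.4, 5.1)],
)
print(overlay_markup)
```

The fragment IDs must exist as `id` attributes in the XHTML file, which is one reason overlays take real production effort: someone or something has to mark up the text and time the audio against it.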


Scalable Vector Graphics (SVG) files have been allowed within EPUB files for some time. However, their use was limited, due largely to a lack of reading system support. EPUB 3 now mandates that reading systems be able to process SVG within the eBook, including allowing users to select text and search within the content of the SVG files. The only portion of SVG that is not allowed is the animation capability.


MathML is part of HTML5, and therefore it is part of EPUB 3. Reading systems must be able to process the presentation form of MathML, and may also support the content form. I won’t go into a lot of detail here, but publishers that deal with mathematical and scientific content may be interested, as it will allow formulas to be included as part of the XHTML markup rather than as images. This means the content will be scalable, among other things. It is still recommended that images of the formulas be included as fallbacks.

Foreign Resources

Foreign resources are pieces of content that are not a core media type. For example, PDFs might be considered foreign resources. I have seen cases where PDF files are incorporated into EPUB files. When this is done, at least one fallback (perhaps a plain text equivalent) should be included to allow reading systems that don’t support the resource to operate.
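
In the package file, a fallback is declared by pointing a manifest item's `fallback` attribute at the ID of another item. The Python sketch below walks those chains and reports items that never reach a core media type. The sample manifest, the item IDs, and the shortened list of core media types are assumptions made for the example; note that it uses an XHTML fallback, since the chain is expected to end in a core media type.

```python
import xml.etree.ElementTree as ET

OPF_NS = "http://www.idpf.org/2007/opf"

# Partial list of core media types, for illustration only.
CORE_TYPES = {
    "application/xhtml+xml", "image/svg+xml", "image/png", "image/jpeg",
    "image/gif", "text/css", "audio/mpeg", "audio/mp4",
}

# Hypothetical manifest: one PDF with a fallback, one without.
MANIFEST = """<manifest xmlns="http://www.idpf.org/2007/opf">
  <item id="report-pdf" href="report.pdf" media-type="application/pdf"
        fallback="report-xhtml"/>
  <item id="report-xhtml" href="report.xhtml"
        media-type="application/xhtml+xml"/>
  <item id="orphan-pdf" href="orphan.pdf" media-type="application/pdf"/>
</manifest>"""

def items_without_core_fallback(manifest_xml: str) -> list[str]:
    """Return IDs of items whose fallback chain never reaches a core media type."""
    root = ET.fromstring(manifest_xml)
    items = {item.get("id"): item for item in root.findall(f"{{{OPF_NS}}}item")}
    missing = []
    for item_id, item in items.items():
        seen = set()
        current = item
        while current is not None and current.get("media-type") not in CORE_TYPES:
            if current.get("id") in seen:
                # Circular fallback chain: treat it as having no usable fallback.
                current = None
                break
            seen.add(current.get("id"))
            # Follow the fallback attribute; a missing one ends the chain.
            current = items.get(current.get("fallback"))
        if current is None:
            missing.append(item_id)
    return missing

print(items_without_core_fallback(MANIFEST))
```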


Scripting and interactivity are among the most hyped new features of EPUB 3. Once again, EPUB 3 gets this functionality through HTML5. Usually this means JavaScript, but this is not the only option. While scripting could blur the lines between eBooks and apps, it should be noted that reading system support for scripting is NOT required. Furthermore, reading systems have the ability to place additional limitations on the capabilities provided to scripts for a variety of reasons, including security and processing capabilities. That being said, publishers should be thinking about possible ways that content can be made more interactive and beginning to plan for creating those enhancements. However, they should also make sure that the reading experience is not adversely affected if a reader decides to turn scripting off, or if a reading system does not provide it.


EPUB 3 created a Canonical Fragment Identifier (EPUBCFI) specification for creating and accessing various locations within the content. This allows very fine-grained access to the content, even at the word or phrase level. The use of this spec could allow indexes to link to the exact word within the content. It is also the basis of an inter-document linking spec due out in the near future.

Publishers should consider how best to create additional target IDs within their content to speed the linking process. The good news is that reading systems are required to be able to process EPUBCFI addresses, making them more interoperable.
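
To show why those IDs matter, here is a rough Python sketch that takes apart a simple CFI of the general shape `epubcfi(/6/4[chap01ref]!/4[body01]/10[para05]/3:10)`. Each `/N` is a step down the document tree, a bracketed value is the ID the step is expected to land on, `!` marks the jump from the package file into the content document, and `:10` is a character offset. The parser is deliberately simplified: it ignores ranges, escaping, and the other offset types, and the function name is made up for this example.

```python
import re

# One step: a slash, a number, and an optional [id] assertion.
STEP = re.compile(r"/(\d+)(?:\[([^\]]*)\])?")

def parse_cfi(cfi: str) -> dict:
    """Split a simple EPUB CFI into per-document step lists and a character offset."""
    match = re.fullmatch(r"epubcfi\((.+)\)", cfi)
    if match is None:
        raise ValueError(f"not an epubcfi(...) expression: {cfi!r}")
    # Simplification: assumes a single character offset and no ':' inside IDs.
    path, _, offset = match.group(1).partition(":")
    parts = []
    for part in path.split("!"):
        steps = [(int(number), assertion or None)
                 for number, assertion in STEP.findall(part)]
        parts.append(steps)
    return {"parts": parts, "offset": int(offset) if offset else None}

parsed_cfi = parse_cfi("epubcfi(/6/4[chap01ref]!/4[body01]/10[para05]/3:10)")
for steps in parsed_cfi["parts"]:
    print(steps)
print(parsed_cfi["offset"])
```

The bracketed IDs let a reading system confirm it has landed on the intended element even if the surrounding markup has shifted, which is the practical payoff of adding target IDs to the content.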


EPUB 2.0.1 actually consists of 2 schemas—EPUB and DTBook. DTBook was intended to provide content to assistive systems for visually-impaired readers through Braille readers and other technologies. Because of the accessibility features within HTML5, it was decided that DTBook could be deprecated and the functionality rolled into EPUB. So technically, EPUB 3 files are accessible ‘by design.’

Publishers should do everything within reason to ensure that all items within their content are accessible. This includes descriptions of all images and alternative text for MathML and scripts.


Hopefully this quick dive into the spec provides enough context for you to know what to expect as we move into the new eBook formatting realm of EPUB 3. There will undoubtedly be lots of new capabilities as best practices get solidified and reading systems become even more advanced. So stay tuned. The final membership vote is expected in late August or early September. I’ll be reporting back with updates.

Top 10 Tips for Capitalizing on eBooks & New Media

1 Know Your Readers

2 Know Your Content

3 Know Your Source

4 Know Your Content's Technical Parameters

5 Think Digital, Think Multi-Channel

6 Understand Your Conversion Options

7 Define Quality Upfront

8 Get Some Buzz

9 Re-Imagine Your Book

10 Build-In Interactivity

Reprinted with permission from Aptara, Inc., 2011.


Michael Covington says...
Posted Friday, October 7, 2011
As a follow-up are there any technology partners today who are proficient in HTML5/EPUB3 and what percent of e-readers will support this new standard?
Michael Covington says...
Posted Friday, October 7, 2011
Great article!
