Breaking Down the EPUB 3 Spec by Eric Freese, Aptara
Wednesday, October 5, 2011
Posted by: Sheri Toomb
Breaking It Down: The EPUB 3 Spec
By Eric Freese, Solutions Architect, Aptara, and member of the EPUB 3 Working Group
EPUB is widely accepted as the de facto digital format standard
for eBooks, with its signature reflowable text that can be read on the greatest
variety of reading systems (including the iPad, nook/nookColor, Kobo, and Sony
readers to name a few).
The much anticipated, upcoming revised edition of the IDPF
standard, EPUB 3, will include new features that promise to greatly enhance the
reader experience, such as embedded audio, video, and interactivity. Meanwhile,
publishers hold out hope that the new and improved EPUB standard will rectify
the frustration with EPUB 2.0.1 files behaving differently on different reading
systems (which has led me and others to stress the importance of testing your
files on every intended device).
With speculation abounding since the introduction of the spec in
the spring of this year, publishers have been waiting to see what’s really
possible with EPUB 3—and what reading systems will support it. To help manage
expectations and alleviate confusion, I’ve provided a brief snapshot of some
of the spec’s new features, as well as what publishers can do to start
preparing for them, and notes of caution as to what may or may not be available in EPUB 3 reading systems.
There has been some confusion as to whether HTML5 and EPUB 3 will
work together. To set the record straight, HTML5 is the base language of EPUB 3
(with some minor adjustments to allow for pagination and other reading
behaviors). Since EPUB 3 content is written in HTML5, the two will interact.
EPUB 3 reading systems must be able to process XHTML files written
in HTML5. This doesn’t mean that web browsers will be able to display EPUB
files, unless they are able to process the additional navigation information
contained within the EPUB file. That being said, there are some reading systems
that are implemented within a browser environment.
The new baseline for style sheets is CSS2.1 with some CSS3 features added. This will provide much richer layout including multi-column
layout, better font support, and directional printing, to name a few. Reading
systems are NOT required to support CSS, but almost all of them do. One of the
leading causes of frustration is the difference in CSS support between
reading systems. In the current EPUB environment, many reading systems do not
allow style sheets within an EPUB file to override the system’s default
settings. EPUB 3 does not do anything to alleviate this situation and, in fact,
might exacerbate it somewhat due to the additional capabilities that are
possible. Reading systems also have the ability to implement their own
proprietary CSS extensions, which would then be ignored by other reading systems.
Audio can be inserted into eBook files using the HTML5 <audio> tag. This is what Apple, Barnes & Noble, and Amazon have been using
all along to embed audio in enhanced eBooks. Now, it’s simply part of the EPUB
3 spec. Reading systems are NOT required to support audio, although many do. If
a reading system supports audio, it must support MP3. In addition, support of
MP4 AAC and media overlays (explained later) is optional.
Video can be inserted into eBook files using the HTML5
<video> tag. Again, this is what has already been occurring. And again,
reading systems are NOT required to support video. In fact, most of the e-Ink
devices are not able to show video in a satisfactory manner.
One of my main bones of contention with the EPUB 3 spec is that
there is no specified format that must be supported. If a reading system
supports video, the spec recommends support of at least one of either H.264
(also known as MPEG-4 AVC) or VP8 video compression formats, but neither is
required. Unfortunately, the spec also does not say that some other format is
not allowed. Essentially, there is nothing to stop a reading system developer
from implementing some other video format (Flash?). Whether that happens
remains to be seen, but there is an opening available.
In the meantime, publishers are going to need to prepare videos in
both formats to support the widest range of reading systems. As has been
discussed in the past, this could lead to very large EPUB 3 files, or different
versions that target specific reading systems.
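To make the two-format advice concrete, here is a minimal sketch in Python that builds an HTML5 <video> element offering the same clip as both H.264 (in an MP4 container) and VP8 (in a WebM container). The file names, the helper name video_element, and the fallback sentence are assumptions for illustration only; they do not come from the spec or from any particular reading system.

```python
import xml.etree.ElementTree as ET


def video_element(sources, fallback_text):
    """Build a <video> element with one <source> child per encoding.

    sources is a list of (href, media_type) pairs. A reading system is
    expected to play the first source it supports.
    """
    video = ET.Element("video", {"controls": "controls"})
    for href, media_type in sources:
        ET.SubElement(video, "source", {"src": href, "type": media_type})
    # Plain-text fallback for systems that do not render <video> at all.
    note = ET.SubElement(video, "p")
    note.text = fallback_text
    return ET.tostring(video, encoding="unicode")


# Hypothetical file names: the same clip encoded twice.
markup = video_element(
    [
        ("video/intro.mp4", "video/mp4"),    # H.264 (MPEG-4 AVC)
        ("video/intro.webm", "video/webm"),  # VP8
    ],
    "Video is not supported by this reading system.",
)
print(markup)
```

Both encodings ship inside the same EPUB file, which is exactly why file size grows; the alternative is to run the helper once per target reading system with a single source each.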
Media overlay functionality was added to the spec to enable text
and media to be presented in a combined manner, for example, highlighting text
as it is spoken by the computer or as part of a soundtrack. In order to employ
these overlays, special Synchronized Multimedia Integration Language (SMIL)
files will have to be created.
Reading systems are not required to provide this functionality,
but if they do, they should allow readers to skip or escape out of overlays.
Overlays can also be used to provide text-to-speech functionality. The spec
mentions the Pronunciation Lexicon Specification (PLS) and the Speech Synthesis
Markup Language (SSML) as the means for providing assistance in generating
synthetic speech, but does not require reading systems to use that information.
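As a rough illustration of what such a SMIL file contains, the Python sketch below generates a small overlay in which each <par> pairs a text fragment with a clip of narration. The document name, audio file name, fragment IDs, and timings are all made up for the example, and build_overlay is a hypothetical helper, not part of any EPUB toolkit. A production overlay would carry more metadata than is shown here.

```python
import xml.etree.ElementTree as ET

SMIL_NS = "http://www.w3.org/ns/SMIL"


def build_overlay(text_doc, audio_file, clips):
    """Return a SMIL document pairing text fragments with audio clips.

    clips is a list of (fragment_id, begin_seconds, end_seconds).
    """
    ET.register_namespace("", SMIL_NS)  # serialize SMIL as the default namespace
    smil = ET.Element(f"{{{SMIL_NS}}}smil", {"version": "3.0"})
    body = ET.SubElement(smil, f"{{{SMIL_NS}}}body")
    for fragment_id, begin, end in clips:
        # One <par> per phrase: the text and audio inside it play together.
        par = ET.SubElement(body, f"{{{SMIL_NS}}}par")
        ET.SubElement(par, f"{{{SMIL_NS}}}text",
                      {"src": f"{text_doc}#{fragment_id}"})
        ET.SubElement(par, f"{{{SMIL_NS}}}audio",
                      {"src": audio_file,
                       "clipBegin": f"{begin:.3f}s",
                       "clipEnd": f"{end:.3f}s"})
    return ET.tostring(smil, encoding="unicode")


# Hypothetical content: two sentences in chapter1.xhtml with ids s1 and s2.
overlay = build_overlay(
    "chapter1.xhtml",
    "audio/chapter1.mp3",
    [("s1", 0.0, 2.75), ("s2", 2.75, 5.5)],
)
print(overlay)
```

The practical consequence for publishers is that every phrase to be highlighted needs its own ID in the XHTML and a measured start and end time in the audio.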
Scalable Vector Graphics (SVG) files have been allowed within EPUB
files for some time. However, their use was limited, due largely to a lack of
reading system support. EPUB 3 now mandates that reading systems be able to
process SVG within the eBook, including allowing users to select text and search
within the content of the SVG files. The only portion of SVG that is not
allowed is the animation capability.
MathML is part of HTML5, and therefore, it is part of EPUB 3.
Reading systems must be able to process the presentation form of MathML, but
may also support the content form of MathML. I won’t go into a lot of detail
here. But publishers that deal with mathematical and scientific content may be
interested, as it will allow formulas to be included as part of the XHTML
markup rather than as images. This means the content will be scalable, among
other things. It is still recommended that images of the formulas be included.
Foreign resources are pieces of content that are not a core media
type. For example, PDFs might be considered foreign resources. I have seen
cases where PDF files are incorporated into EPUB files. When this is done, at
least one fallback (perhaps a plain text equivalent) should be included to
allow reading systems that don’t support the resource to operate.
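One way to check this before shipping is to walk each foreign resource's fallback chain and confirm that it ends at something a reading system is expected to handle. The Python sketch below does that over a simplified, in-memory picture of the package manifest. The dictionary layout, the item IDs, and the short CORE_TYPES set are assumptions made for the example (the spec's list of core media types is longer), and the fallback shown is an XHTML version of the PDF.

```python
# Illustrative subset only; the spec defines the full list of core media types.
CORE_TYPES = {
    "application/xhtml+xml",
    "image/png",
    "image/jpeg",
    "image/gif",
    "image/svg+xml",
}


def resolve_fallback(manifest, item_id, core_types=CORE_TYPES):
    """Follow fallback links from item_id until a core media type is found.

    Returns the id of the first usable item, or None if the chain ends
    (or loops) without reaching one.
    """
    seen = set()
    current = item_id
    while current is not None and current not in seen:
        seen.add(current)
        item = manifest[current]
        if item["media_type"] in core_types:
            return current
        current = item.get("fallback")
    return None


# Hypothetical manifest: one PDF with a fallback, one without.
manifest = {
    "report-pdf": {"href": "report.pdf", "media_type": "application/pdf",
                   "fallback": "report-xhtml"},
    "report-xhtml": {"href": "report.xhtml",
                     "media_type": "application/xhtml+xml"},
    "orphan-pdf": {"href": "orphan.pdf", "media_type": "application/pdf"},
}

print(resolve_fallback(manifest, "report-pdf"))  # report-xhtml
print(resolve_fallback(manifest, "orphan-pdf"))  # None
```

Any item that comes back as None is a resource some reading systems will simply be unable to show.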
Scripting and interactivity are among the most hyped new
features of EPUB 3. Once again, EPUB 3 gets this functionality through HTML5.
While this could blur the lines between eBooks and apps, it should be noted that reading
system support for scripting is NOT required. Furthermore, reading systems have
the ability to place additional limitations on the capabilities provided to
scripts for a variety of reasons, including security and processing
capabilities. That being said, publishers should be thinking about possible
ways that content can be made more interactive and beginning to plan for
creating those enhancements. However, they should also make sure that the
reading experience is not adversely affected if a reader decides to turn
scripting off, or if a reading system does not provide it.
EPUB 3 created a Canonical Fragment Identifier (EPUBCFI)
specification for creating and accessing various locations within the content.
This allows very fine-grained access to the content, even at the word or phrase
level. The use of this spec could allow indexes to link to the exact word
within the content. It is also the basis of an inter-document linking
spec due out in the near future.
Publishers should consider how best to create additional target IDs
within their content to speed the linking process. The good news is that
reading systems are required to be able to process EPUBCFI addresses, making
them more interoperable.
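To give a feel for what an EPUBCFI address looks like, the Python sketch below splits a simple one into its parts: numbered steps down the document tree, optional IDs in square brackets, a "!" where the address crosses from the package file into a content document, and a trailing character offset. The parser is deliberately simplified and is my own illustration, not a conforming implementation. It ignores ranges, text assertions, and escaped characters, and it assumes the IDs contain no colons.

```python
import re

# One step: "/" + number, optionally followed by an [id] assertion.
STEP = re.compile(r"/(\d+)(?:\[([^\]]*)\])?")


def parse_cfi(cfi):
    """Split a simple EPUB CFI into step lists and a character offset.

    Returns (parts, offset). parts holds one list of (index, id) steps per
    "!"-separated section; offset is the number after ":" or None.
    """
    match = re.fullmatch(r"epubcfi\((.*)\)", cfi)
    if match is None:
        raise ValueError("not an epubcfi(...) expression")
    path = match.group(1)
    offset = None
    if ":" in path:
        path, raw_offset = path.rsplit(":", 1)
        offset = int(raw_offset)
    parts = []
    for section in path.split("!"):
        # Even indexes address child elements; odd indexes address the
        # text that sits between them.
        steps = [(int(n), ident or None) for n, ident in STEP.findall(section)]
        parts.append(steps)
    return parts, offset


# Illustrative address with made-up IDs: character 10 of a text node
# inside the element with id "para05".
parts, offset = parse_cfi("epubcfi(/6/4[chap01ref]!/4[body01]/10[para05]/3:10)")
print(parts[0])  # [(6, None), (4, 'chap01ref')]
print(parts[1])  # [(4, 'body01'), (10, 'para05'), (3, None)]
print(offset)    # 10
```

The bracketed IDs are where the target IDs mentioned above come in: they let a reading system jump straight to the element rather than counting its way down the tree.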
EPUB 2.0.1 actually consists of 2 schemas—EPUB and DTBook. DTBook
was intended to provide content to assistive systems for visually-impaired
readers through Braille readers and other technologies. Because of the
accessibility features within HTML5, it was decided that DTBook could be
deprecated and the functionality rolled into EPUB. So technically, EPUB 3
files are accessible ‘by design.’
Publishers should do everything within reason to ensure that all
items within their content are accessible. This includes descriptions of all
images and alternative text for MathML and scripts.
Hopefully this quick dive into the spec provides enough context
for you to know what to expect as we move into the new eBook formatting realm
of EPUB 3. There will undoubtedly be lots of new capabilities as best practices
get solidified and reading systems become even more advanced. So stay tuned.
The final membership vote is expected in late August or early September. I’ll
be reporting back with updates.
Reprinted with permission from Aptara, Inc.,