EPUB version 2.0 became an official standard of the IDPF in September 2007, superseding the older Open eBook standard.
In August 2009, the IDPF announced that it would begin maintenance work on the EPUB standard. The working group defined two broad objectives: "One set of activities governs maintenance of the current EPUB Standards (i.e. OCF, OPF, and OPS), while another set of activities addresses the need to keep the Standards current and up-to-date." The working group expected to be active through 2010, publishing updated standards throughout its lifetime. On April 6, 2010, it was announced that this working group would complete its update in April 2010. The result was to be a minor revision, EPUB 2.0.1, that "...corrects errors and inconsistencies and does not change functionality." On July 2, 2010, drafts of the version 2.0.1 standards appeared on the IDPF website.
On April 6, 2010, it was also announced that a working group would form to revise the EPUB specification. The working group's charter draft identified 14 main problems with EPUB for the group to address. The group was chartered through May 2011 and was scheduled to submit a final draft on May 15, 2011. An initial Editors Draft for EPUB 3 was published on November 12, 2010, and the first public draft was published on February 15, 2011. On May 23, 2011, the IDPF released its proposed specification for final review. On October 10, 2011, the IDPF announced that its membership had approved EPUB 3 as a final Recommended Specification.
In September 2012, ISO/IEC JTC1/SC34 re-established Ad Hoc Group 4 on EPUB of IDPF to prepare the creation of a Joint Working Group (JWG) for EPUB.
EPUB 3 will be submitted as a Draft Technical Specification by the Korean National Body via the JTC 1 fast-track procedure, and it will be assigned to the SC 34/JWG when approved. It is currently undergoing standardization under the formal name ISO/IEC DTS 30135-1 - Information technology - Digital publishing - EPUB3 -- Part 1: EPUB3 Overview.[13]
Version 2.0 is the most popular version of EPUB. EPUB 2.0 was approved in October 2007, and a maintenance update (2.0.1), intended to clarify and correct errata in the specifications, was approved in September 2010.[19] EPUB version 2.0.1 consists of three specifications: the Open Publication Structure (OPS), which contains the formatting of the content; the Open Packaging Format (OPF), which describes the structure of the .epub file in XML;[21] and the Open Container Format (OCF), which collects all files as a ZIP archive.
EPUB internally uses XHTML to represent the text and structure of the content document, and a subset of CSS to provide layout and formatting. XML is used to create the document manifest, table of contents, and EPUB metadata. Finally, the files are bundled in a ZIP archive, which serves as the packaging format.
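The bundling step can be sketched in Python with the standard zipfile module. This is a minimal illustration under stated assumptions, not a complete packager: the helper name build_epub and the OEBPS/ paths are hypothetical, and the mimetype entry and META-INF/container.xml layout follow the OCF container conventions rather than anything spelled out in this section.

```python
import io
import zipfile

# Hypothetical container.xml; the OEBPS/content.opf path is an assumption.
CONTAINER_XML = """<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf"
              media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def build_epub(files):
    """Bundle prepared files (archive path -> bytes) into an EPUB-style ZIP."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # The mimetype entry goes first and is stored uncompressed.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        archive.writestr("META-INF/container.xml", CONTAINER_XML,
                         compress_type=zipfile.ZIP_DEFLATED)
        for path, data in files.items():
            archive.writestr(path, data, compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()


epub_bytes = build_epub({"OEBPS/content.opf": b"<package/>"})
print(zipfile.ZipFile(io.BytesIO(epub_bytes)).namelist())
```

The archive is built in memory so the sketch has no side effects; writing epub_bytes to a file with an .epub extension would be the natural next step.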
An EPUB file uses XHTML 1.1 to construct the content of a book as of version 2.0.1.
Styling and layout are performed using a subset of CSS 2.0, referred to as OPS Style Sheets. This specialized syntax requires reading systems to support only a portion of CSS properties and adds a few custom properties. Custom properties include oeb-page-head, oeb-page-foot, and oeb-column-number. Font embedding can be accomplished using the @font-face rule, together with including the font file in the OPF's manifest (see below). The mimetype for CSS documents in EPUB is text/css.[20]
For a table of supported properties and detailed information, see Section 3.0 of the specification.
EPUB also requires that PNG, JPEG, GIF, and SVG images be supported using the mimetypes image/png, image/jpeg, image/gif, and image/svg+xml. Other media types are allowed, but creators must include alternative renditions using supported types.[20]
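As a rough illustration of the image rule, the following Python sketch flags images that would need an alternative rendition. The function name and the (href, media type) input shape are hypothetical, and the check covers only the four image types named above, not the full list of supported media types in the specification.

```python
# The four image media types the text names as required.
REQUIRED_IMAGE_TYPES = {"image/png", "image/jpeg", "image/gif", "image/svg+xml"}


def images_needing_alternatives(items):
    """Return hrefs of images whose media type is outside the required set.

    `items` is an iterable of (href, media_type) pairs.
    """
    return [
        href
        for href, media_type in items
        if media_type.startswith("image/")
        and media_type not in REQUIRED_IMAGE_TYPES
    ]


print(images_needing_alternatives([
    ("cover.png", "image/png"),
    ("map.tiff", "image/tiff"),
]))
```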
Unicode is required, and content producers must use either UTF-8 or UTF-16 encoding. This is to support international and multilingual books. However, reading systems are not required to provide the fonts necessary to display every Unicode character, though they are required to display at least a placeholder for characters that cannot be displayed fully.
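A reading system has to decode content in either encoding. The Python sketch below shows one plausible approach; the function name is hypothetical, and it assumes that UTF-16 content starts with a byte order mark and that everything else is UTF-8.

```python
import codecs


def decode_content(raw):
    """Decode a content document that is assumed to be UTF-8 or UTF-16."""
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The utf-16 codec reads the byte order mark and strips it.
        return raw.decode("utf-16")
    # utf-8-sig accepts UTF-8 with or without a leading byte order mark.
    return raw.decode("utf-8-sig")


print(decode_content("Émile".encode("utf-16")))
```

Input in any other encoding either raises UnicodeDecodeError or decodes incorrectly, which is acceptable here because such content falls outside what the format allows.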
An example skeleton of an XHTML file for EPUB looks like this:
<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
  <head>
    <meta http-equiv="Content-Type" content="application/xhtml+xml; charset=utf-8" />
    <title>Pride and Prejudice</title>
    <link rel="stylesheet" href="css/main.css" type="text/css" />
  </head>
  <body>
    ...
  </body>
</html>
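Because content documents are XML, they can be inspected with an ordinary XML parser. The following Python sketch reads the title, language, and linked stylesheets from a document shaped like the skeleton above; the function name summarize and the sample body text are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

NS = {"x": "http://www.w3.org/1999/xhtml"}
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

SAMPLE = b"""<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
  <head>
    <title>Pride and Prejudice</title>
    <link rel="stylesheet" href="css/main.css" type="text/css" />
  </head>
  <body><p>...</p></body>
</html>
"""


def summarize(xhtml_bytes):
    """Extract the title, language, and stylesheet hrefs of a content document."""
    root = ET.fromstring(xhtml_bytes)
    return {
        "title": root.findtext("x:head/x:title", namespaces=NS),
        "lang": root.get(XML_LANG),
        "stylesheets": [
            link.get("href")
            for link in root.findall("x:head/x:link", NS)
            if link.get("rel") == "stylesheet"
        ],
    }


print(summarize(SAMPLE))
```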
The OPF specification's purpose is to "...[define] the mechanism by which the various components of an OPS publication are tied together and provides additional structure and semantics to the electronic publication."[21] This is accomplished by two XML files with the extensions .opf and .ncx.
.opf file
The OPF file, traditionally named content.opf, houses the EPUB book's metadata, file manifest, and linear reading order. This file has a root element package and four child elements: metadata, manifest, spine, and guide. All of these except guide are required. Furthermore, the package node must have the unique-identifier attribute. The .opf file's mimetype is application/oebps-package+xml.
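These structural rules are simple enough to check mechanically. The Python sketch below is one way to do it, assuming the OPF 2.0 namespace http://www.idpf.org/2007/opf; the function name check_package and its message strings are hypothetical.

```python
import xml.etree.ElementTree as ET

OPF_NS = "http://www.idpf.org/2007/opf"
REQUIRED_CHILDREN = ("metadata", "manifest", "spine")  # guide is optional


def check_package(opf_bytes):
    """Return a list of structural problems found in an OPF document."""
    root = ET.fromstring(opf_bytes)
    problems = []
    if root.tag != f"{{{OPF_NS}}}package":
        problems.append("root element is not package")
    if not root.get("unique-identifier"):
        problems.append("package lacks unique-identifier")
    for name in REQUIRED_CHILDREN:
        if root.find(f"{{{OPF_NS}}}{name}") is None:
            problems.append(f"missing required element: {name}")
    return problems


incomplete = (b'<package xmlns="http://www.idpf.org/2007/opf" version="2.0">'
              b"<metadata/><manifest/></package>")
print(check_package(incomplete))
```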
The metadata element contains all the metadata information for a particular EPUB file. Three metadata tags are required (though many more are available): title, language, and identifier. title contains the title of the book, language contains the language of the book's contents in RFC 3066 format or its successors, such as the newer RFC 4646, and identifier contains a unique identifier for the book, such as its ISBN or a URL. The identifier's id attribute should equal the unique-identifier attribute from the package element.[21]
For a full listing of EPUB metadata, see Section 2.2 of the specification.
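The relationship between unique-identifier and the identifier element can be sketched in Python as follows. The function name is hypothetical, and the sample is a trimmed package containing only a metadata element, so it is not a complete OPF file.

```python
import xml.etree.ElementTree as ET

NS = {
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}

SAMPLE_OPF = b"""<?xml version="1.0"?>
<package version="2.0" xmlns="http://www.idpf.org/2007/opf" unique-identifier="BookId">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/"
            xmlns:opf="http://www.idpf.org/2007/opf">
    <dc:title>Pride and Prejudice</dc:title>
    <dc:language>en</dc:language>
    <dc:identifier id="BookId" opf:scheme="ISBN">123456789X</dc:identifier>
  </metadata>
</package>
"""


def required_metadata(opf_bytes):
    """Read title, language, and the identifier named by unique-identifier."""
    root = ET.fromstring(opf_bytes)
    uid = root.get("unique-identifier")
    metadata = root.find("opf:metadata", NS)
    identifier = None
    for candidate in metadata.findall("dc:identifier", NS):
        # Only the identifier whose id matches the package attribute counts.
        if candidate.get("id") == uid:
            identifier = candidate.text
    return {
        "title": metadata.findtext("dc:title", namespaces=NS),
        "language": metadata.findtext("dc:language", namespaces=NS),
        "identifier": identifier,
    }


print(required_metadata(SAMPLE_OPF))
```

A value of None for any key signals that the corresponding required tag is missing or, for the identifier, that no id matches the package attribute.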
The manifest element lists all the files contained in the package. Each file is represented by an item element, and has the attributes id, href, media-type. All XHTML (content documents), stylesheets, images or other media, embedded fonts, and the NCX file should be listed here. Only the .opf file itself, the container.xml, and the mimetype files should not be included.[21] Note that in the example below, an arbitrary media-type is given to the included font file, even though no mimetype exists for fonts.
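Reading the manifest amounts to collecting the three attributes of every item. The Python sketch below does this for a trimmed sample package; the function name read_manifest is hypothetical, and raising ValueError on an incomplete item is a design choice of the sketch, not a behavior the specification prescribes.

```python
import xml.etree.ElementTree as ET

NS = {"opf": "http://www.idpf.org/2007/opf"}

SAMPLE_OPF = b"""<?xml version="1.0"?>
<package version="2.0" xmlns="http://www.idpf.org/2007/opf" unique-identifier="BookId">
  <manifest>
    <item id="chapter1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
    <item id="stylesheet" href="style.css" media-type="text/css"/>
    <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
  </manifest>
</package>
"""


def read_manifest(opf_bytes):
    """Map each manifest item id to its (href, media-type) pair."""
    root = ET.fromstring(opf_bytes)
    manifest = {}
    for item in root.findall("opf:manifest/opf:item", NS):
        attrs = [item.get(name) for name in ("id", "href", "media-type")]
        if None in attrs:
            raise ValueError("manifest item is missing id, href, or media-type")
        item_id, href, media_type = attrs
        manifest[item_id] = (href, media_type)
    return manifest


for item_id, (href, media_type) in read_manifest(SAMPLE_OPF).items():
    print(item_id, href, media_type)
```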
The spine element lists all the XHTML content documents in their linear reading order. Also, any content document that can be reached through linking or the table of contents must be listed as well. The toc attribute of spine must contain the id of the NCX file listed in the manifest. Each itemref element's idref is set to the id of its respective content document.
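Resolving the spine means following each idref, and the toc attribute, back to a manifest entry. A minimal Python sketch, with a hypothetical function name and a trimmed two-chapter sample, might look like this:

```python
import xml.etree.ElementTree as ET

NS = {"opf": "http://www.idpf.org/2007/opf"}

SAMPLE_OPF = b"""<?xml version="1.0"?>
<package version="2.0" xmlns="http://www.idpf.org/2007/opf" unique-identifier="BookId">
  <manifest>
    <item id="chapter1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
    <item id="chapter2" href="chapter2.xhtml" media-type="application/xhtml+xml"/>
    <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
  </manifest>
  <spine toc="ncx">
    <itemref idref="chapter1"/>
    <itemref idref="chapter2"/>
  </spine>
</package>
"""


def reading_order(opf_bytes):
    """Return (NCX href, content hrefs in linear reading order)."""
    root = ET.fromstring(opf_bytes)
    hrefs = {
        item.get("id"): item.get("href")
        for item in root.findall("opf:manifest/opf:item", NS)
    }
    spine = root.find("opf:spine", NS)
    # A KeyError below means the spine points at an id the manifest lacks.
    toc_href = hrefs[spine.get("toc")]
    order = [hrefs[ref.get("idref")] for ref in spine.findall("opf:itemref", NS)]
    return toc_href, order


print(reading_order(SAMPLE_OPF))
```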
The guide element is an optional element for the purpose of identifying fundamental structural components of the book. Each reference element has the attributes type, title, href. Files referenced in href must be listed in the manifest, and are allowed to have an element identifier (e.g. #figures in the example).
A list of possible values for type can be found in Section 2.6 of the specification.
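The requirement that guide targets appear in the manifest can be checked by stripping the element identifier from each href and comparing the remainder against the manifest. The Python sketch below assumes hrefs are written exactly as they appear in the manifest (no path normalization); the function name and the sample package are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urldefrag

NS = {"opf": "http://www.idpf.org/2007/opf"}

SAMPLE_OPF = b"""<?xml version="1.0"?>
<package version="2.0" xmlns="http://www.idpf.org/2007/opf" unique-identifier="BookId">
  <manifest>
    <item id="chapter1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <guide>
    <reference type="text" title="Start" href="chapter1.xhtml#start"/>
    <reference type="loi" title="List Of Illustrations" href="appendix.html#figures"/>
  </guide>
</package>
"""


def unlisted_guide_targets(opf_bytes):
    """Return guide targets (without fragments) that the manifest omits."""
    root = ET.fromstring(opf_bytes)
    listed = {
        item.get("href") for item in root.findall("opf:manifest/opf:item", NS)
    }
    missing = []
    for reference in root.findall("opf:guide/opf:reference", NS):
        # Drop the element identifier, e.g. "#figures", before comparing.
        target, _fragment = urldefrag(reference.get("href"))
        if target not in listed:
            missing.append(target)
    return missing


print(unlisted_guide_targets(SAMPLE_OPF))
```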
An example OPF file:
<?xml version="1.0"?>
<package version="2.0" xmlns="http://www.idpf.org/2007/opf" unique-identifier="BookId">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:opf="http://www.idpf.org/2007/opf">
    <dc:title>Pride and Prejudice</dc:title>
    <dc:language>en</dc:language>
    <dc:identifier id="BookId" opf:scheme="ISBN">123456789X</dc:identifier>
    <dc:creator opf:file-as="Austen, Jane" opf:role="aut">Jane Austen</dc:creator>
  </metadata>
  <manifest>
    <item id="chapter1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
    <item id="appendix" href="appendix.html" media-type="application/xhtml+xml"/>
    <item id="stylesheet" href="style.css" media-type="text/css"/>
    <item id="ch1-pic" href="ch1-pic.png" media-type="image/png"/>
    <item id="myfont" href="css/myfont.otf" media-type="application/x-font-opentype"/>
    <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
  </manifest>
  <spine toc="ncx">
    <itemref idref="chapter1" />
    <itemref idref="appendix" />
  </spine>
  <guide>
    <reference type="loi" title="List Of Illustrations" href="appendix.html#figures" />
  </guide>
</package>
.ncx file
The NCX file (Navigation Control file for XML), traditionally named toc.ncx, contains the hierarchical table of contents for the EPUB file. The specification for NCX was developed for Digital Talking Book (DTB), is maintained by the DAISY Consortium, and is not a part of the EPUB specification. The NCX file has a mimetype of application/x-dtbncx+xml.
Note that the values for the docTitle, docAuthor, and meta name="dtb:uid" elements should match their analogs in the OPF file. Also, the meta name="dtb:depth" element is set equal to the depth of the navMap element. navPoint elements can be nested to create a hierarchical table of contents. navLabel's content is the text that appears in the table of contents generated by reading systems that use the .ncx. navPoint's content element points to a content document listed in the manifest and can also include an element identifier (e.g. #section1).
A description of certain exceptions to the NCX specification as used in EPUB can be found in Section 2.4.1 of the specification. The complete specification for NCX can be found in Section 8 of the Specifications for the Digital Talking Book.
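The nesting of navPoint elements and its relation to dtb:depth can be illustrated with a short Python sketch. The function names and the two-level sample document are hypothetical; the sample nests a section under a chapter so that the declared depth of 2 has something to match.

```python
import xml.etree.ElementTree as ET

NS = {"ncx": "http://www.daisy.org/z3986/2005/ncx/"}

SAMPLE_NCX = b"""<?xml version="1.0" encoding="UTF-8"?>
<ncx version="2005-1" xml:lang="en" xmlns="http://www.daisy.org/z3986/2005/ncx/">
  <head>
    <meta name="dtb:uid" content="123456789X"/>
    <meta name="dtb:depth" content="2"/>
    <meta name="dtb:totalPageCount" content="0"/>
    <meta name="dtb:maxPageNumber" content="0"/>
  </head>
  <docTitle><text>Pride and Prejudice</text></docTitle>
  <navMap>
    <navPoint id="chapter1" playOrder="1">
      <navLabel><text>Chapter 1</text></navLabel>
      <content src="chapter1.xhtml"/>
      <navPoint id="section1" playOrder="2">
        <navLabel><text>Section 1</text></navLabel>
        <content src="chapter1.xhtml#section1"/>
      </navPoint>
    </navPoint>
  </navMap>
</ncx>
"""


def nav_tree(parent):
    """Recursively convert nested navPoint elements into dictionaries."""
    return [
        {
            "label": point.findtext("ncx:navLabel/ncx:text", namespaces=NS),
            "src": point.find("ncx:content", NS).get("src"),
            "children": nav_tree(point),
        }
        for point in parent.findall("ncx:navPoint", NS)
    ]


def tree_depth(nodes):
    """Return the number of nesting levels in a nav_tree result."""
    if not nodes:
        return 0
    return 1 + max(tree_depth(node["children"]) for node in nodes)


root = ET.fromstring(SAMPLE_NCX)
toc = nav_tree(root.find("ncx:navMap", NS))
metas = {m.get("name"): m.get("content") for m in root.findall("ncx:head/ncx:meta", NS)}
print(tree_depth(toc) == int(metas["dtb:depth"]))
```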
An example .ncx file:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE ncx PUBLIC "-//NISO//DTD ncx 2005-1//EN" "http://www.daisy.org/z3986/2005/ncx-2005-1.dtd">
<ncx version="2005-1" xml:lang="en" xmlns="http://www.daisy.org/z3986/2005/ncx/">
  <head>
    <!-- The following four metadata items are required for all NCX documents,
         including those that conform to the relaxed constraints of OPS 2.0 -->
    <meta name="dtb:uid" content="123456789X"/> <!-- same as in .opf -->
    <meta name="dtb:depth" content="1"/> <!-- 1 or higher -->
    <meta name="dtb:totalPageCount" content="0"/> <!-- must be 0 -->
    <meta name="dtb:maxPageNumber" content="0"/> <!-- must be 0 -->
  </head>
  <docTitle>
    <text>Pride and Prejudice</text>
  </docTitle>
  <docAuthor>
    <text>Austen, Jane</text>
  </docAuthor>
  <navMap>
    <navPoint class="chapter" id="chapter1" playOrder="1">
      <navLabel><text>Chapter 1</text></navLabel>
      <content src="chapter1.xhtml"/>
    </navPoint>
  </navMap>
</ncx>