An EPUB primer
What is an e-book?
E-books come in a number of formats: PDF, EPUB, MOBI, and AZW.
- The most common format is EPUB (the final format for most e-book readers)
- Amazon’s MOBI and AZW formats can be converted from an EPUB
- I upload EPUBs at the KDP site and retrieve MOBIs to proof
- PDF (Portable Document Format) is a standard format developed by Adobe. It’s been around since 1993.
- EPUB, MOBI and AZW are based on HTML (XHTML), the same language used for Web sites. The chapters of the book are bundled with a number of other files that provide the parameters, such as font type and size, table of contents, and paragraph styles.
The EPUB
EPUB is short for “electronic publication.” It’s a standard file format used in the production of e-books. Unlike a fixed digital format like a PDF, an EPUB allows content reflow based on screen size or font size.
An EPUB is a ZIP file that bundles content. Instead of the .zip extension, it uses the file extension .epub. It utilizes markup in HTML (XHTML), XML, and CSS, and may contain images, audio, and video files.
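Because an EPUB is an ordinary ZIP archive, any ZIP library can look inside it. Here is a minimal sketch in Python, assuming the standard zipfile module. The helper names and the tiny in-memory demo archive are made up for illustration, and the demo is not a valid EPUB.

```python
import io
import zipfile


def list_epub_entries(epub_file):
    """Return the names of the files bundled inside an EPUB (a ZIP archive)."""
    with zipfile.ZipFile(epub_file) as archive:
        return archive.namelist()


def build_demo_epub():
    """Build a tiny, incomplete EPUB-like archive in memory for the demo."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("mimetype", "application/epub+zip")
        archive.writestr("META-INF/container.xml", "<container/>")
        archive.writestr("OEBPS/content.opf", "<package/>")
        archive.writestr("OEBPS/Text/chapter01.xhtml", "<html/>")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    # With a real book, pass its path instead, e.g. list_epub_entries("my_book.epub")
    for name in list_epub_entries(build_demo_epub()):
        print(name)
```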
The EPUB format grew out of the older Open eBook Publication Structure (OEBPS), which is why that name still shows up inside EPUB files. Most EPUBs are version 2.0.1 or 3.0. The latter supports more precise layouts and specialized formatting. Keep in mind, however, that older devices may not support the advanced features of EPUB 3.0.
The information in this document will focus on EPUB 2.0.1.
There are three specifications for EPUB 2.0.1:
- The Open Container Format (OCF) specifies how the files are packaged inside the ZIP container
- The Open Packaging Format (OPF) defines the package: its metadata, the list of files it contains, and their reading order
- The Open Publication Structure (OPS) specifies the markup of the content itself (XHTML and CSS)
Tools of the trade
Calibre
- Free
- E-book reader—provides “look and feel”
- E-book converter
- E-book editor
- I use Calibre frequently to convert from one e-book type to another, and I like to use the e-book reader when I’m checking my formatting
Jutoh
- E-book converter
- E-book creator
- You can format your book without worrying about the XHTML, XML, etc. Extremely user-friendly.
Sigil
- Free
- E-book editor
- This is my “go-to” tool for XHTML editing.
Adobe Digital Editions
- Free
- E-book reader—provides “look and feel”
Kindlegen
- Free
- A program you can download to convert your EPUB into a MOBI
- I use this tool to build MOBIs that I then load onto my Kindle and iPad to check formatting
Kindle Previewer
- Free
- E-book reader—simulates “look and feel” for a variety of devices
- Handles EPUBs, MOBIs, HTML
- I also use this tool to evaluate my format over a variety of e-book readers
Editors
In theory, you can use Notepad to edit all of the components in your EPUB, but it’s a primitive approach. There are other free or inexpensive options.
Tag basics
To understand what’s going on in your EPUB, you need to understand tags
- Tags are surrounded by <>
- There’s usually a start tag <> and an end tag </>
- Tags must not overlap; they must be nested, so the inner tag closes before the outer one
Correct: <i>I like to live <b>boldly.</b> How about you?</i>
Hot mess: <i><h1>Your</i> piece of pie? I don’t think so!</h1>
Sigil’s attempt to fix:
But the first set of italic tags isn’t necessary. A cleaner version:
<h1><i>Your</i> piece of pie? I don’t think so!</h1>
- Most tags shouldn’t exceed a single paragraph.
- Resources
- w3schools.com: HTML tutorial
- Amazon’s cheat sheets
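The nesting rule from the tag basics above can be checked mechanically, because XHTML must be well-formed XML. Here is a minimal sketch, assuming Python's standard xml.etree.ElementTree parser; the function name is made up for illustration. It only tests whether the tags nest properly, not whether the markup is valid XHTML.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment):
    """Return True if the markup parses as XML, i.e. its tags are properly nested."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# The two examples from the text above.
correct = "<i>I like to live <b>boldly.</b> How about you?</i>"
hot_mess = "<i><h1>Your</i> piece of pie? I don’t think so!</h1>"

if __name__ == "__main__":
    print("correct:", is_well_formed(correct))
    print("hot mess:", is_well_formed(hot_mess))
```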
Getting to the heart of the matter
Always make a back-up copy before opening your EPUB!
- Change the extension from .epub to .zip and—voila!—you can open it!
- You can use one of the editors listed above (or Notepad) to edit the various files.
- To turn it back into an EPUB:
- Highlight all of the components except the mimetype file and zip them
- Move the mimetype file into the new archive and then change the extension back to .epub.
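The re-zipping steps above can also be scripted. What follows is a sketch only, assuming Python's standard zipfile module and an unpacked EPUB folder on disk; pack_epub and its parameter names are made up for illustration. It writes the mimetype file first and without compression, which is what the container specification asks for, and then compresses everything else.

```python
import os
import zipfile


def pack_epub(source_dir, epub_path):
    """Zip an unpacked EPUB folder so that mimetype is first and uncompressed.

    epub_path should be outside source_dir, or the walk below would pick up
    the archive that is being written.
    """
    with zipfile.ZipFile(epub_path, "w") as archive:
        # The mimetype file goes in first and is stored without compression.
        archive.write(
            os.path.join(source_dir, "mimetype"),
            arcname="mimetype",
            compress_type=zipfile.ZIP_STORED,
        )
        # Everything else can be compressed normally.
        for folder, _subfolders, files in os.walk(source_dir):
            for filename in sorted(files):
                full_path = os.path.join(folder, filename)
                # ZIP entries always use forward slashes, even on Windows.
                arcname = os.path.relpath(full_path, source_dir).replace(os.sep, "/")
                if arcname == "mimetype":
                    continue  # already written above
                archive.write(full_path, arcname=arcname,
                              compress_type=zipfile.ZIP_DEFLATED)
```

A call might look like pack_epub("my_book_unzipped", "my_book.epub"), where both names are placeholders. Run the result through an EPUB validator afterward, since this sketch does no checking of its own.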
Dissecting an EPUB
Mimetype (mandatory)
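A plain-text file at the highest level that identifies the file as an EPUB. It contains the single line application/epub+zip, must be the first file in the archive, and cannot be compressed.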
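If an e-book reader refuses to open a file you zipped by hand, the mimetype entry is a good first suspect. The sketch below assumes Python's standard zipfile module, and check_mimetype is a made-up helper name. It checks three things the container specification expects: the entry comes first, it is stored uncompressed, and it contains exactly application/epub+zip.

```python
import io
import zipfile

EPUB_MIMETYPE = b"application/epub+zip"


def check_mimetype(epub_file):
    """Return a list of problems found with the mimetype entry (empty if none)."""
    problems = []
    with zipfile.ZipFile(epub_file) as archive:
        entries = archive.infolist()
        if not entries or entries[0].filename != "mimetype":
            return ["mimetype is not the first file in the archive"]
        if entries[0].compress_type != zipfile.ZIP_STORED:
            problems.append("mimetype is compressed")
        if archive.read("mimetype") != EPUB_MIMETYPE:
            problems.append("mimetype does not contain exactly application/epub+zip")
    return problems


if __name__ == "__main__":
    # Demo archive built in memory; with a real book, pass its path instead.
    demo = io.BytesIO()
    with zipfile.ZipFile(demo, "w") as archive:
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
    demo.seek(0)
    print(check_mimetype(demo))
```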
META-INF (mandatory)
- Folder, at the highest level
- The file container.xml is required
- Other files are optional:
- encryption.xml
- manifest.xml (manifest for container contents)
- metadata.xml (for container-level metadata)
- rights.xml (reserved for digital rights management (DRM) information)
- signatures.xml (holds digital signatures of the container and its contents)
container.xml (mandatory)
- Located in META-INF folder, directs e-book readers to location of content.opf file.
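To see how container.xml points to the content.opf file, here is a minimal sketch that reads the full-path attribute of the rootfile element. It assumes Python's standard xml.etree.ElementTree module. The sample XML is a typical container.xml written for this example, and find_opf_path is a made-up name.

```python
import xml.etree.ElementTree as ET

# Namespace used by container.xml.
CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}

# A typical container.xml, written out here for illustration.
SAMPLE_CONTAINER = b"""<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf"
              media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def find_opf_path(container_xml):
    """Return the path of the .opf file that container.xml points to."""
    root = ET.fromstring(container_xml)
    rootfile = root.find("c:rootfiles/c:rootfile", CONTAINER_NS)
    if rootfile is None:
        raise ValueError("container.xml has no rootfile entry")
    return rootfile.get("full-path")


if __name__ == "__main__":
    print(find_opf_path(SAMPLE_CONTAINER))
```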
encryption.xml (optional)
Located in the META-INF folder, it lists the files in the EPUB that are encrypted or obfuscated, which usually means some embedded fonts.
- May need to select different fonts if you remove it
- Nook doesn’t allow it
- I had troubles trying to use it in my Smashwords EPUB, so I eliminated it
com.apple.ibooks.display-options.xml (optional)
- Located in META-INF folder, works with embedded fonts for an iBook EPUB
- Consists of a set of display options that tell iOS devices how to present the content
- This may be a deprecated file (included for older iBooks, but not used by newer ones)
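OEBPS (content folder)
- Folder, at the highest level
- Named after the Open eBook Publication Structure (OEBPS), the predecessor of EPUB. It contains all of the e-book’s content (text, images, etc.). The folder name is a convention rather than a requirement: container.xml tells the e-book reader where the content actually lives.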
Inside OEBPS:
You can have folders such as:
- CSS or Styles (contains Cascading Stylesheets)
- Fonts (location for embedded fonts)
- Images (such as author picture)
- Acceptable image formats: JPEG, PNG, GIF, SVG
- Text (XHTML/HTML files)
- Audio
- Video
- content.opf (mandatory): file that describes all of the contents in the EPUB
- toc.ncx (mandatory for EPUB 2.0.1): navigation control file (special TOC in e-book reader)
Fonts (optional)
Folder that contains the fonts you want to embed into your EPUB. This is required for fonts not readily available on e-book readers. Keep in mind that not all e-book readers will utilize embedded fonts.
Styles (optional)
- Folder for Cascading Stylesheets
- This folder is optional; you can insert the stylesheets directly into the OEBPS folder.
- Cascading Stylesheets have a file extension of .css
- Stylesheets are optional, can have various names, and you may choose to include more than one.
Here’s an example with multiple stylesheets that reside in OEBPS. The developer decided to separate the overall page style from the paragraph styles. I’ve found numerous examples of this in books converted with Calibre.
Text (optional)
- Folder that holds all of the XHTML/HTML files
- I recommend that you break your manuscript into separate files for each chapter. This can save a lot of headaches.
- I separate each front matter and back matter component into a separate file.
- Avoid spaces in your filenames
Again, I like to use descriptive file names:
Sample of converted XHTML/HTML with generic filenames:
content.opf (mandatory)
Metadata
Contains the information about your book, such as:
- Unique ID
- Title
- Author (Creator)
- Description
- Publisher
- ISBN
- Language
Manifest
Lists every component in the OEBPS folder except the .opf file.
- Navigation control file (toc.ncx)
- Stylesheets
- Images
- XHTML/HTML files
- Font files
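A file that sits in the OEBPS folder but is missing from the manifest (or the reverse) is a common cause of validation errors. The sketch below compares the two lists, assuming Python's standard xml.etree.ElementTree module. The sample manifest, the file names, and compare_manifest are all made up for illustration, and the comparison assumes plain hrefs with no URL-encoding.

```python
import xml.etree.ElementTree as ET

OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}

# A cut-down content.opf with only a manifest, for illustration.
SAMPLE_OPF = b"""<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="BookId">
  <manifest>
    <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
    <item id="style" href="Styles/style.css" media-type="text/css"/>
    <item id="chapter01" href="Text/chapter01.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
</package>
"""


def compare_manifest(opf_xml, files_in_folder, opf_name="content.opf"):
    """Return (files not listed in the manifest, manifest entries with no file)."""
    root = ET.fromstring(opf_xml)
    listed = {item.get("href") for item in root.findall("opf:manifest/opf:item", OPF_NS)}
    # The .opf file itself is not expected to appear in its own manifest.
    present = set(files_in_folder) - {opf_name}
    return sorted(present - listed), sorted(listed - present)


if __name__ == "__main__":
    folder = ["content.opf", "toc.ncx", "Styles/style.css",
              "Text/chapter01.xhtml", "Images/author.jpg"]
    unlisted, missing = compare_manifest(SAMPLE_OPF, folder)
    print("Not in manifest:", unlisted)
    print("Listed but missing:", missing)
```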
Spine
Lists the order of the XHTML/HTML files
Guide
Lists special files used by e-book readers
- Cover (if you bundle it in your EPUB)
- Table of contents
- Landing spot (the place where the book opens the first time it is loaded in the e-book reader)
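To show how the four parts of content.opf fit together, here is a sketch that reads the metadata, resolves the spine against the manifest to get the reading order, and collects the guide entries. It assumes Python's standard xml.etree.ElementTree module. The sample OPF, including its title, author, identifier, and file names, is invented for this example.

```python
import xml.etree.ElementTree as ET

NS = {
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# A small EPUB 2 style content.opf, invented for illustration.
SAMPLE_OPF = b"""<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="BookId">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:identifier id="BookId">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
    <dc:title>Sample Title</dc:title>
    <dc:creator>Sample Author</dc:creator>
    <dc:language>en</dc:language>
  </metadata>
  <manifest>
    <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>
    <item id="title_page" href="Text/title_page.xhtml" media-type="application/xhtml+xml"/>
    <item id="chapter01" href="Text/chapter01.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine toc="ncx">
    <itemref idref="title_page"/>
    <itemref idref="chapter01"/>
  </spine>
  <guide>
    <reference type="text" title="Start" href="Text/chapter01.xhtml"/>
  </guide>
</package>
"""


def summarize_opf(opf_xml):
    """Pull the title, author, reading order, and guide entries out of an OPF."""
    root = ET.fromstring(opf_xml)
    # Manifest: map each item id to its file.
    hrefs = {item.get("id"): item.get("href")
             for item in root.findall("opf:manifest/opf:item", NS)}
    # Spine: each itemref points at a manifest id, in reading order.
    reading_order = [hrefs[ref.get("idref")]
                     for ref in root.findall("opf:spine/opf:itemref", NS)]
    # Guide: special spots such as the cover, the TOC, and the landing spot.
    guide = {ref.get("type"): ref.get("href")
             for ref in root.findall("opf:guide/opf:reference", NS)}
    return {
        "title": root.findtext("opf:metadata/dc:title", namespaces=NS),
        "author": root.findtext("opf:metadata/dc:creator", namespaces=NS),
        "reading_order": reading_order,
        "guide": guide,
    }


if __name__ == "__main__":
    for key, value in summarize_opf(SAMPLE_OPF).items():
        print(key, "->", value)
```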
toc.ncx
The navigation control file creates the special table of contents found on e-book readers.
Each navPoint indicates a landing spot.
The playOrder indicates the sequence. You must keep your entries in order and without gaps (right: 1, 2, 3, 4, 5; wrong: 1, 2, 4, 5, 7).
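Gaps in playOrder are easy to introduce when you delete or reorder chapters by hand. The sketch below walks the navPoint entries in document order and reports any playOrder value that breaks the 1, 2, 3 sequence. It assumes Python's standard xml.etree.ElementTree module, that every navPoint carries a playOrder attribute, and a simplified toc.ncx generated by a made-up helper (a real one also has a head and a docTitle).

```python
import xml.etree.ElementTree as ET

NCX_NS = "{http://www.daisy.org/z3986/2005/ncx/}"


def make_sample_ncx(orders):
    """Build a simplified toc.ncx with one navPoint per playOrder value."""
    points = "".join(
        f'<navPoint id="navPoint-{n}" playOrder="{n}">'
        f"<navLabel><text>Chapter {n}</text></navLabel>"
        f'<content src="Text/chapter{n:02d}.xhtml"/></navPoint>'
        for n in orders
    )
    return (
        '<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">'
        f"<navMap>{points}</navMap></ncx>"
    )


def play_order_problems(ncx_xml):
    """Return a message for every navPoint whose playOrder is out of sequence."""
    root = ET.fromstring(ncx_xml)
    # iter() visits navPoints in document order, including nested ones.
    orders = [int(point.get("playOrder")) for point in root.iter(NCX_NS + "navPoint")]
    problems = []
    for expected, actual in enumerate(orders, start=1):
        if actual != expected:
            problems.append(f"expected playOrder {expected}, found {actual}")
    return problems


if __name__ == "__main__":
    print(play_order_problems(make_sample_ncx([1, 2, 3, 4, 5])))
    print(play_order_problems(make_sample_ncx([1, 2, 4, 5, 7])))
```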
Cascading Stylesheets
The best way to understand this concept is to correlate a CSS paragraph style to a Word style.
Word style   CSS selector
Normal       body
Header1      h1
Conversion tools usually convert into generic labels. I prefer to use descriptive ones.
Sample of styles after a conversion:
If you have embedded fonts, you define them at the top of your stylesheet, or in the page stylesheet (if you choose to have one).
You can also add styles for spans of text—for example if you want to change the font for a sentence in the middle of a paragraph.
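To make the Word comparison concrete, here is a rough sketch that writes out a small stylesheet containing an embedded-font definition at the top, a body style, a heading style, a descriptively named paragraph style, and a span style. Everything in it is hypothetical: the font name SampleFont, its path, the class names, and the render_stylesheet helper. It is written in Python only so the output can be generated and inspected.

```python
# Hypothetical embedded font, defined at the top of the stylesheet.
FONT_FACE = (
    "@font-face {\n"
    '  font-family: "SampleFont";\n'
    '  src: url("../Fonts/SampleFont.ttf");\n'
    "}\n"
)

# Descriptive selectors, roughly the way Word styles have descriptive names.
STYLES = {
    "body": {"font-family": '"SampleFont", serif', "line-height": "1.4"},
    "h1": {"font-size": "1.6em", "text-align": "center"},
    "p.first_paragraph": {"text-indent": "0"},
    "span.handwriting": {"font-style": "italic"},
}


def render_stylesheet(styles, font_face=""):
    """Turn a {selector: {property: value}} mapping into CSS text."""
    rules = []
    for selector, declarations in styles.items():
        body = "\n".join(f"  {prop}: {value};" for prop, value in declarations.items())
        rules.append(f"{selector} {{\n{body}\n}}")
    return font_face + "\n".join(rules) + "\n"


if __name__ == "__main__":
    print(render_stylesheet(STYLES, FONT_FACE))
```

In the XHTML, the span style would then be applied with something like <span class="handwriting">, again using the made-up class name from this sketch.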
Resource: w3schools.com CSS tutorial
My e-book is broken!
Always make a copy before proceeding!
Run it through an EPUB validator
- A number of tools have an EPUB check embedded in them
- The International Digital Publishing Forum provides an online EPUB Validator at: idpf.org
- This may provide some clues, as it will give you errors and associated line numbers
Find the bad spot and compare it to code that’s working
- Find the point where it isn’t working and open it up in an editor.
- In a tool like Sigil, you can go to the “funky spot” and switch to the XHTML and it’ll take you to that point in the code.
- Compare the malfunctioning code to a similar spot that’s working. Change the code to match and then test.
The Internet is your friend!
- Google it!
- There are communities on vendor sites that can offer insights
- Vendor formatting guidelines
- Nook Press formatting guides
- Amazon’s Basic HTML Formatting Guidelines
- Amazon’s Simplified Formatting Guide
- Amazon’s Kindle Publishing Guidelines (pdf)
- Amazon’s file formatting tips
- Apple’s iBooks Asset Guide 5.2 (contains a great primer on EPUBs)
- Apple’s iBooks Store Formatting Guidelines
- Smashwords Style Guide (free e-book at Smashwords and on Amazon)
- Smashwords instructions for uploading an EPUB instead of a Word document
- Facebook groups can provide assistance as well
- Indie Author Writing Group
- Alliance of Independent Authors (You need to be a member)
- Other useful sites
- Guido Henkel’s Web site
- ePUBSecrets.com
- idpf.org (location of the specifications)
- wiki.mobileread.com
- Smashwords FAQ
- Fantasy Castle Books (great tutorial on fixed-layout e-books)
- Pigsgourdsandwikis.com (it hasn’t been updated in a while, but it has some great advice on quirky aspects of EPUBs)
- kdp.amazon.com/help (you can find all of Amazon’s Kindle publishing resources here)
- Amazon supported formats
- ibm.com/developerworks/xml/tutorials/x-epubtut (basic tutorial on how to build an EPUB)
- EPUB tutorial by Jedisaber.com
- The Yellow Buick Review (this is no longer being updated, but it has a great series of blogs on how to build an EPUB)
- James Calbraith has a two-part tutorial
- KJKlemme.com (I’m adding a writer helps section)
- Other resources
- Zen of eBook Formatting by Guido Henkel
- E-Book Formatting for Novelists: A step-by-step guide for the independent novelist or small press by K.C. May (free at Smashwords and Barnes & Noble)
- APE How to Publish a Book by Guy Kawasaki
- eBook Formatting and Publishing Guide for Epub & Kindle Mobi Books using Sigil ebook editor by Suzanne Fyhrie Parrott
- Ebook Formatting: KF8, Mobi & EPUB by Matt Harrison (gives a variety of coding examples—rather techie)
- EPUB From the Ground Up: A Hands-On Guide to EPUB 2 and EPUB 3 by Jarret Buse (another rather techie resource)
- EPUB Straight to the Point: Creating ebooks for the Apple iPad and other ereaders (One-Off) by Elizabeth Castro (pretty in-depth resource)
- Aaron Shepard’s Kindle series