EPUB for archival preservation

18 June 2012

Over the last few years, the EPUB format has gained widespread popularity in the consumer market. The KB has been approached by a number of publishers that wish to use EPUB for delivering some of their electronic publications. Surprisingly little information is available on the format’s suitability for archival preservation, apart from the Library of Congress’ Sustainability of Digital Formats web pages, which contain entries on EPUB 2 and EPUB 3.

So, the KB’s Departments of Collection and Collection Care requested a more detailed investigation of EPUB’s preservation credentials. More specifically, answers were needed to the following questions:

  • What are the main characteristics of EPUB?

  • What functionality does EPUB provide, and is this sufficient for representing e.g. content with sophisticated layout and typography requirements?

  • How well is EPUB supported by software tools that are used in (pre-)ingest workflows?

  • How suitable is EPUB for archival preservation? What are the main risks?

EPUB for archival preservation

The report EPUB for archival preservation is a first attempt at answering these questions as well as possible. It starts out with a simple example that illustrates the general structure of an EPUB file, followed by a more in-depth discussion of specific aspects of the format. It then covers functionality-related aspects such as layout, appearance and multimedia support, and the main differences between EPUB 2 and EPUB 3.

Support by characterisation tools is important for processing EPUB files in an operational workflow, so a brief review (and some preliminary tests) of relevant identification, validation and feature extraction tools is included as well.
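To give a rough idea of what format identification involves at its simplest, here is a minimal Python sketch. It is not taken from the report or from any of the tools it reviews. It relies only on one property of the EPUB container: an EPUB is a ZIP archive whose first entry is an uncompressed file named `mimetype` containing the string `application/epub+zip`. The function names `looks_like_epub` and `make_minimal_container` are made up for this illustration, and the check says nothing about whether a file is actually valid.

```python
import io
import zipfile

EPUB_MIMETYPE = b"application/epub+zip"


def looks_like_epub(data: bytes) -> bool:
    """Rough identification only (hypothetical helper, not a validator).

    Returns True if `data` is a ZIP archive whose first entry is an
    uncompressed file called `mimetype` holding the EPUB media type.
    """
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            entries = zf.infolist()
            if not entries:
                return False
            first = entries[0]
            return (
                first.filename == "mimetype"
                and first.compress_type == zipfile.ZIP_STORED
                and zf.read(first) == EPUB_MIMETYPE
            )
    except zipfile.BadZipFile:
        # Not a ZIP archive at all, so certainly not an EPUB container.
        return False


def make_minimal_container() -> bytes:
    """Build a tiny in-memory archive that passes the check above.

    The result is NOT a complete EPUB: the container.xml content is a
    placeholder, used here only to have a second entry in the archive.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # The mimetype entry goes first and is stored without compression.
        zf.writestr("mimetype", EPUB_MIMETYPE, compress_type=zipfile.ZIP_STORED)
        zf.writestr(
            "META-INF/container.xml",
            "<container/>",
            compress_type=zipfile.ZIP_DEFLATED,
        )
    return buf.getvalue()


if __name__ == "__main__":
    print(looks_like_epub(make_minimal_container()))
    print(looks_like_epub(b"plain text, not a ZIP file"))
```

A check like this only answers the question "does this look like an EPUB container?". Validation and feature extraction require inspecting the package contents in much more detail, which is what dedicated tools are for.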

To assess the overall suitability of EPUB for preservation, the format was evaluated against a set of widely used criteria (mainly from The National Archives and the Library of Congress). The final chapter wraps up the main conclusions and offers a number of recommendations.

Community input

Since it appears that not much has been published on EPUB within an archival preservation context so far, we would really appreciate hearing your thoughts on the report. Is anything important missing? Did I overlook any relevant tools? Is there anything in particular that you strongly disagree with? Please use the comment fields below to let us know!

In addition, the final chapter contains two subsections with Community Recommendations and Tool Recommendations. These are all things we can do as a community to simplify the use of EPUB in archival settings. Please consider getting involved if you feel you could make a contribution.

EPUB for archival preservation, KB / National Library of the Netherlands


Originally published at the Open Preservation Foundation blog



Comments

Post a comment by replying to this post using your ActivityPub (e.g. Mastodon) account.
