PDF – Inventory of long-term preservation risks

26 July 2012

In this blog post I’ll be dusting off some old stuff for a change. The occasion for this is the following question, posted by Paul Wheatley on the Libraries and Information Science Stack Exchange website a few days ago:

What preservation risks are associated with the PDF file format?

Report

This reminded me of a report I wrote on this very subject back in 2009. (Incidentally this was my very first foray into the wacky world of digital preservation, but that’s another story.) Originally this document was intended for internal use at the KB, but looking at it again, I think it may be of interest to a wider audience. It also aligns quite nicely with the upcoming work on a knowledge base of file-format related risks that will be done as part of the SCAPE project. The main idea here is to take a file format, identify its main (preservation-related) risks, and describe how “risky” features can be detected by existing (characterisation) tools. In fact I was envisaging something along these lines when I wrote PDF report in 2009, but other things got in the way, and I never got round to the final step. The SCAPE work should finally make this happen.

Although the work on the knowledge base is still in its early stages, some very first results can be found here. The initial focus will be on JPEG 2000 (JP2/JPX) and PDF.

As for the report, I should add that some of it is a little rough around the edges, and you may note some gaps and not-quite-finished bits. This is also why we never released this first time around. Also, one aspect that is not well covered is PDF’s potential for transmitting viruses and other malware. Nevertheless, as a general introduction to the format and an overview of its main risks I think it’s not too shabby, but I’ll let you be the judge of that! As always, feel free to use the comment fields for you feedback and suggestions.

Adobe Portable Document Format - Inventory of long-term preservation risks, KB/ National Library of the Netherlands


Originally published at the Open Preservation Foundation blog




Search

Tags

Archive

2024

December

November

October

March

2023

June

May

March

February

January

2022

November

June

April

March

2021

September

February

2020

September

June

April

March

February

2019

September

April

March

January

2018

July

April

2017

July

June

April

January

2016

December

April

March

2015

December

November

October

July

April

March

January

2014

December

November

October

September

August

January

2013

October

September

August

July

May

April

January

2012

December

September

August

July

June

April

January

2011

December

September

July

June

2010

December

Feeds

RSS

ATOM