Update on jpylyzer

23 April 2012

Introduction

In this blog post I will give a brief update of the latest jpylyzer developments. Jpylyzer is a validation and feature extraction tool for the JP2 (JPEG 2000 Part 1) still image format.

History of jpylyzer

Around mid-summer 2011, the KB started initial preparations for migrating 146 TB of TIFF images from the Dutch Metamorfoze program to JP2. We realised that the possibility of hardware failure (e.g. short network interruptions) during the migration process would imply a major risk for the creation of malformed and damaged files. Around the same time, we received some rather worrying reports from the British Library, who were confronted with JP2 images that contained damage that couldn’t be detected with existing tools such as JHOVE.

This prompted me to have a go at writing a rudimentary software tool that was able to detect some simple forms of file corruption in JP2. A blog post I wrote on this resulted in quite a bit of feedback, and several people asked about the possibility to extend the tool’s functionality to a full-fledged JP2 validator and feature extractor. Since this fitted in nicely with some SCAPE work that was envisaged on quality assurance in imaging workflows, I started work on a first prototype of jpylyzer, which saw the light of day in December.

In the remainder of this blog post I will outline the main developments that have happened since then.

Refactoring of existing code

Shortly after the release of the first prototype, my KB colleague René van der Ark spontaneously offered to do a refactoring job on my original code, which was clumsy and unnecessarily lengthy in places. This has resulted in a code that is more modular, and which adheres more closely to established programming practices. As a result, the refactored code is significantly more maintainable than the original one, which makes it easier for other programmers to contribute to jpylyzer. This should also contribute to the long-term sustainability of the software.

New features

Since the first prototype the following functionality was added to jpylyzer:

  • New validator functions were added for XML boxes, UUID boxes, UUID Info boxes, Palette boxes and Component Mapping boxes.
  • A check was added that verifies whether the number of tiles in an image matches the image- and tile size information in the codestream header.
  • Another check was added that verifies whether all tile-parts within each tile exist.
  • The ICC profile feature extraction function has been given an overhaul, and it now extracts all ICC header items. In addition, the output is now reported in a more user-friendly format.
  • For codestream comments, all characters from the Latin character set are now supported.
  • The reporting of the validation results has been made more concise. By default, jpylyzer now only reports the results of validation tests that failed (previously all test results were reported). This behaviour can be overruled with a new –verbose switch.

In addition to the above, various bugs and minor issues have been addressed as well.

Debian packages

During the SCAPE Braga meeting in February, work started on the creation of Debian packages for jpylyzer. The availability of Debian packages greatly simplifies jpylyzer’s installation on Linux-based systems. This work was done by Dave Tarrant (University of Southampton), Miguel Ferreira, Rui Castro, Hélder Silva (KEEP Solutions) and Rainer Schmidt (AIT).

Jpylyzer now hosted by OPF

In order to make a tool sustainable, it is important that its maintenance and development are not solely dependent on one single institution or person. Because of this, jpylyzer is now hosted by the Open Planets Foundation, which ensures the involvement of a wider community. Jpylyzer also has its own home page on the OPF site. It contains links to the source code, Windows executables, Debian packages and the User Manual.

Jpylyzer home page

http://jpylyzer.openpreservation.org/


Originally published at the Open Preservation Foundation blog




Search

Tags

Archive

2024

November

October

March

2023

June

May

March

February

January

2022

November

June

April

March

2021

September

February

2020

September

June

April

March

February

2019

September

April

March

January

2018

July

April

2017

July

June

April

January

2016

December

April

March

2015

December

November

October

July

April

March

January

2014

December

November

October

September

August

January

2013

October

September

August

July

May

April

January

2012

December

September

August

July

June

April

January

2011

December

September

July

June

2010

December

Feeds

RSS

ATOM