JPEG quality estimation: experiments with a modified ImageMagick heuristic

23 October 2024
[Photo: golden retriever Bailey sitting at a desk in front of a laptop, bashing her paws away at the keyboard while wearing a necktie. Bailey, AKA the "I have no idea what I'm doing" dog. License unknown.]

In this post I explore some of the challenges I ran into while trying to estimate the quality level of JPEG images. By quality level I mean the percentage (1-100) that expresses the lossiness that was applied by the encoder at the last “save” operation. Here, a value of 1 results in very aggressive compression with a lot of information loss (and thus a very low quality), whereas at 100 almost no information loss occurs at all [1].

More specifically, I focus on problems with ImageMagick’s JPEG quality heuristic, which become particularly apparent when applied to low quality images. I also propose a simple, tentative solution that applies some small changes to ImageMagick’s heuristic.

Context of this work

I’m currently working on an automated workflow for quality-checking scanned books and periodicals in PDF format. These PDFs are created by external suppliers for The Digital Library for Dutch Literature (dbnl), which has been managed by the KB since 2015.

Each book or periodical volume is scanned to PDF. For each publication, 2 versions are made:

  1. A “production master” PDF with scans that are encoded at 85% JPEG quality.
  2. A (relatively) small access PDF with scans encoded at 50% JPEG quality.

It’s important to have some way of verifying the approximate quality of both versions. The production master serves as input for derived EPUB, XML and unformatted text versions. If the scans are compressed too heavily, this adversely affects these derived products. On the other hand, too little compression on the access PDFs will result in files that are impractically large for access.

Estimating JPEG quality

Probably the best explainer on JPEG quality and its estimation is this tutorial on Neal Krawetz’s Fotoforensics site. The information under the “Estimating Quality” tab is particularly useful. I will return to this on various occasions later in this post.

Estimating JPEG quality with ImageMagick

ImageMagick is able to estimate the quality of a JPEG image. For example, let’s create a test image at 70% quality:

convert -quality 70 wizard: wizard-70.jpg

We can then get an estimate of the JPEG quality using:

identify -format '%Q\n' wizard-70.jpg

Which results in:

70
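As a side note, for a Python-based workflow this identify call can also be wrapped with subprocess. Below is a minimal sketch under that assumption; the helper name is mine, and the file name is simply the test image created above:

import subprocess

def imagemagick_quality(path):
    """Return ImageMagick's JPEG quality estimate for a file, as an integer."""
    result = subprocess.run(
        ["identify", "-format", "%Q", path],
        capture_output=True, text=True, check=True
    )
    return int(result.stdout.strip())

print(imagemagick_quality("wizard-70.jpg"))  # 70 for the test image above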

However, when I ran the identify command on some of our access scans, the results were not what I expected. An example is this image, which I extracted from one of our PDFs [2]. According to ImageMagick [3], this image is compressed at 92% quality. This JPEG was extracted from a small access PDF, for which I would expect a quality of 50% or less. Its file size is also much smaller than I would expect for a 92% quality image.

Check with Fotoforensics

To verify this unexpected result I uploaded the image to Neal Krawetz’s Fotoforensics service. The result is available here. Fotoforensics estimates the JPEG quality at a paltry 18%. So why does ImageMagick report a value of 92% here?

ImageMagick uses “92” as a fallback value

From a cursory look at its source code, it seems that ImageMagick uses 92 as a fallback value if it cannot come up with a quality estimate. This makes the interpretation of its quality output needlessly difficult, since it’s impossible to differentiate between images that have a true 92% quality, and images for which the quality cannot be established. I created a ticket on GitHub when I first came across this issue over a year ago, but thus far it hasn’t been fixed.

Even if it were fixed, this still leaves the question of why ImageMagick’s quality heuristic fails here in the first place. As I’m not proficient in C, I didn’t pursue things any further when I first ran into this issue.

ImageMagick’s JPEG quality heuristic

This changed when I recently came across this StackOverflow post, which points to a Python port of ImageMagick’s JPEG quality heuristic by one “Edward O”. It is based on ImageMagick’s original code, and estimates the JPEG compression quality from the quantization tables. This immediately grabbed my attention, since my quality-checking workflow is also implemented in Python. The ability to estimate JPEG quality natively in Python would remove the need to wrap any external tools for this.

A quick test showed that it produced results that were identical to ImageMagick in most cases, but like ImageMagick, it failed to come up with a quality estimate for my problematic JPEG. To find out why, I incorporated his code into this test script.

How it works

The algorithm reads the image’s quantization tables (usually 2), each of which is a list of 64 quantization coefficients. It first adds up all of these numbers, resulting in the variable qsum. It then calculates qvalue, which is the sum of the quantization coefficients at 2 specific positions in each table, summed over all quantization tables. The main “meat and potatoes” of the heuristic is this loop at the very end of the function:

for i in range(100):
    if ((qvalue < hashes[i]) and (qsum < sums[i])):
        continue
    if (((qvalue <= hashes[i]) and (qsum <= sums[i])) or (i >= 50)):
        return i+1

For each iteration, qvalue and qsum are evaluated against the corresponding values in two hard-coded numerical lists (hashes[i] and sums[i]). Although I couldn’t find any documentation, a little digging showed that the values in the sums and hashes lists are derived from the “standard” quantization tables defined in Annex K of the JPEG standard [4]. The first if block makes sure that as long as qvalue is smaller than hashes[i] and qsum is smaller than sums[i], the code will immediately jump to the next iteration, skipping the second if block. The second if block (which reports the quality estimate as i+1) is only evaluated if the test condition in the first block fails.
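To make this a bit more concrete, here is a rough sketch of how qsum and qvalue can be computed from an image’s quantization tables using Pillow. The index positions (2 and 53 in the first table, plus 0 and 63 in the second) are the ones I believe ImageMagick’s source uses, so treat this as an approximation of the port rather than a verbatim copy; note also that the coefficient ordering reported by Pillow may differ from libjpeg’s internal order, which would shift these positions:

from PIL import Image

def qsum_and_qvalue(path):
    """Compute the two aggregate measures used by the heuristic."""
    # Pillow exposes the quantization tables as a dict of table id -> 64 coefficients
    qtables = Image.open(path).quantization
    tables = [qtables[k] for k in sorted(qtables)]

    # qsum: sum of all coefficients in all tables
    qsum = sum(sum(t) for t in tables)

    # qvalue: coefficients at two fixed positions per table
    # (positions taken from ImageMagick's source; treat as an assumption)
    if len(tables) > 1:
        qvalue = tables[0][2] + tables[0][53] + tables[1][0] + tables[1][63]
    else:
        qvalue = tables[0][2] + tables[0][53]

    return qsum, qvalue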

Tracing all loop variables

To get a better impression of why the heuristic cannot come up with a meaningful quality estimate for my problematic JPEG, I added a line of code that prints out all variables at the start of each iteration. This gave the following output:

i: 0, qvalue: 586, hashes[i]: 1020, qsum: 24028, sums[i]: 32640
i: 1, qvalue: 586, hashes[i]: 1015, qsum: 24028, sums[i]: 32635
i: 2, qvalue: 586, hashes[i]: 932, qsum: 24028, sums[i]: 32266
i: 3, qvalue: 586, hashes[i]: 848, qsum: 24028, sums[i]: 31495
i: 4, qvalue: 586, hashes[i]: 780, qsum: 24028, sums[i]: 30665
i: 5, qvalue: 586, hashes[i]: 735, qsum: 24028, sums[i]: 29804
i: 6, qvalue: 586, hashes[i]: 702, qsum: 24028, sums[i]: 29146
i: 7, qvalue: 586, hashes[i]: 679, qsum: 24028, sums[i]: 28599
i: 8, qvalue: 586, hashes[i]: 660, qsum: 24028, sums[i]: 28104
i: 9, qvalue: 586, hashes[i]: 645, qsum: 24028, sums[i]: 27670
i: 10, qvalue: 586, hashes[i]: 632, qsum: 24028, sums[i]: 27225
i: 11, qvalue: 586, hashes[i]: 623, qsum: 24028, sums[i]: 26725
i: 12, qvalue: 586, hashes[i]: 613, qsum: 24028, sums[i]: 26210
i: 13, qvalue: 586, hashes[i]: 607, qsum: 24028, sums[i]: 25716
i: 14, qvalue: 586, hashes[i]: 600, qsum: 24028, sums[i]: 25240
i: 15, qvalue: 586, hashes[i]: 594, qsum: 24028, sums[i]: 24789
i: 16, qvalue: 586, hashes[i]: 589, qsum: 24028, sums[i]: 24373
i: 17, qvalue: 586, hashes[i]: 585, qsum: 24028, sums[i]: 23946
quality: -1

Here we see that at i=17, the value of sums[i] drops below qsum (and hashes[i] drops below qvalue), so the test condition in the first if block no longer holds and we end up in the second if block. This reports the quality factor as i+1, but only if either of the following conditions is met:

  1. both qvalue is smaller than or equal to hashes[i], and qsum is smaller than or equal to sums[i], or:
  2. the value of i is larger than or equal to 50.

In this case, qvalue (586) is not smaller than or equal to hashes[i] (585), and qsum (24028) is not smaller than or equal to sums[i] (23946), so the first condition is not met. Since i equals 17, the second condition is not met either, meaning that the code doesn’t come up with a meaningful quality estimate, and reports the fallback value (-1) instead.
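Plugging the traced values at i=17 into both conditions makes this explicit:

# Values taken from the i=17 line of the trace above
i, qvalue, qsum = 17, 586, 24028
hashes_i, sums_i = 585, 23946

# First if block: only continues while BOTH aggregates are below the reference values
print((qvalue < hashes_i) and (qsum < sums_i))                   # False -> falls through

# Second if block: reports i+1 only if one of these holds
print(((qvalue <= hashes_i) and (qsum <= sums_i)) or (i >= 50))  # False -> no estimate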

Effect of quality threshold

It’s not clear why the heuristic uses the quality threshold value of 50, although Neal Krawetz points out that “the JPEG Standard changes algorithms at quality values below 50%” [5]. As a test, I tried changing the threshold to 0:

if (((qvalue <= hashes[i]) and (qsum <= sums[i])) or (i >= 0)):
    return i+1

With this change, the code now reports a quality value of 18. Incidentally this is identical to the Fotoforensics estimate.

Modified ImageMagick heuristic

Setting the threshold to 0 effectively makes the first condition in the second if block superfluous, which means we could simplify things further to:

for i in range(100):
    if ((qvalue < hashes[i]) and (qsum < sums[i])):
        continue
    else:
        return i+1

I also noticed that the original ImageMagick code includes a variable that indicates whether the quality estimate is “exact” or “approximate” (here). I initially assumed that an “exact” match implies a perfect agreement with the standard JPEG quantization tables. This would be useful information for assessing the accuracy of the quality estimate. I created a test script with a modified version of the ImageMagick heuristic that incorporates the following changes (see the sketch after this list):

  1. Removal of the quality threshold.
  2. Added reporting of the “exactness” flag, based on the original ImageMagick code.
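
Put together, the modified loop boils down to something like the sketch below. The function signature is mine; hashes and sums are the hard-coded reference lists from the original port (not reproduced here), and the exactness flag mirrors the condition ImageMagick uses for its “exact”/“approximate” log message:

def estimate_quality(qsum, qvalue, hashes, sums):
    """Modified heuristic: no 50% threshold, plus an 'exactness' flag."""
    for i in range(100):
        # Keep scanning while both aggregates stay below the reference values
        if (qvalue < hashes[i]) and (qsum < sums[i]):
            continue
        # The estimate is flagged as "exact" when both aggregates are at or
        # below the reference values for this quality level
        exact = (qvalue <= hashes[i]) and (qsum <= sums[i])
        return i + 1, exact
    return -1, False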

Tests with Pillow and ImageMagick JPEGs

As a first test I created a set of small test images with known quality values. I did this using Python’s Pillow library and ImageMagick. In both cases I generated test images with quality values 5, 10, 25, 50, 75 and 100. The images are available here. Running the script with my modified ImageMagick heuristic gave the following result:

Qenc   Qest(Pillow)   Exact(Pillow)   Qest(IM)   Exact(IM)
5      5              True            5          True
10     10             True            10         True
25     25             True            25         True
50     50             True            50         True
75     75             True            75         True
100    100            True            100        True

Here Qenc is the encoding quality, and Qest(Pillow) and Qest(IM) represent the script’s estimates for the Pillow and ImageMagick images, respectively. Exact(Pillow) and Exact(IM) represent the reported values of the “exactness” flag. The table shows that the script was able to reproduce the encoding quality with an “exact” match for all test images.
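For reference, the test images were generated roughly as follows. This is a sketch under my own assumptions: the Pillow source image and the output file names are illustrative, not the actual names from my scripts:

import subprocess
from PIL import Image

QUALITIES = [5, 10, 25, 50, 75, 100]

# Pillow-encoded test images (source.png stands in for any source image)
src = Image.open("source.png").convert("RGB")
for q in QUALITIES:
    src.save("test_pil_{:03d}.jpg".format(q), quality=q)

# ImageMagick-encoded test images, using the built-in wizard: image as before
for q in QUALITIES:
    subprocess.run(
        ["convert", "-quality", str(q), "wizard:", "test_im_{:03d}.jpg".format(q)],
        check=True)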

I also ran the script on some of the problematic JPEGs from our dbnl access PDFs. Here, it estimated the JPEG quality at 18%, but without an “exact” match. The JPEGs from the corresponding master PDFs resulted in an 84% quality estimate (which is slightly less than the expected quality of 85%), but again without an “exact” result.

Although the results of these (very limited!) tests look encouraging at first sight, this exercise left me with some doubts and reservations.

Limitations of ImageMagick’s heuristic

Most importantly, ImageMagick’s heuristic is based on a comparison of aggregated coefficients of the image’s quantization tables, which makes it potentially vulnerable to collisions. As an example, for any value of qsum (the sum of all coefficients in the quantization table), many possible combinations of quantization coefficients exist that will add up to the same value. The qvalue check (which is based on coefficients at specific positions in the quantization table) partially overcomes this, but it does this in a pretty crude way using yet another aggregate measure.
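A contrived example illustrates the collision problem: two different tables can yield identical aggregates as long as the differences cancel out away from the positions the heuristic inspects. The tables below are purely hypothetical:

# An arbitrary 64-coefficient quantization table...
table_a = list(range(1, 65))

# ...and a different table obtained by shifting 5 units between two positions
# that are not among the ones the heuristic inspects (such as 2 and 53)
table_b = table_a.copy()
table_b[10] += 5
table_b[20] -= 5

assert table_a != table_b                                    # genuinely different tables
assert sum(table_a) == sum(table_b)                          # identical qsum contribution
assert table_a[2] + table_a[53] == table_b[2] + table_b[53]  # identical qvalue contribution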

Another thing that bothered me is that the reasoning behind certain aspects of ImageMagick’s heuristic isn’t entirely clear to me. Examples are the 50% threshold, the qvalue check (again), and the precise meaning of its “exact” vs “approximate” designation.

Further down the JPEG rabbit hole

When I wrote the first draft of this post, I hoped it would provoke some response from people who are better versed than me in the inner workings of JPEG compression. As happens so often with these things, only a few days after publishing it I came across this 2008 paper which explains how quantization tables in JPEG work. This inspired me to do some more in-depth testing, and ultimately this resulted in an alternative, more straightforward quality estimation method. This is the subject of a follow-up blog post, which is out now!

Acknowledgment

Thanks are due to Eddy O (AKA “eddygeek”) for creating the Python port of ImageMagick’s JPEG quality heuristic from which most of the results in this post are derived.

Annex: other JPEG quality estimation tools and methods

While working on this, I came across a few alternative tools and methods for JPEG quality estimation. Here’s a brief overview, which is largely for my own reference (but I imagine others may find it useful as well).

ExifTool

ExifTool also reports a JPEG quality estimate if its -JPEGQualityEstimate option is invoked. For example:

exiftool -JPEGQualityEstimate test_im_050.jpg

Results in:

JPEG Quality Estimate           : 50

A peek at the source code shows it uses a ported version of ImageMagick’s heuristic, so it largely has the same limitations. As an example, running it on my problematic JPEG results in:

JPEG Quality Estimate           : <unknown>

Approximate Ratios method

This method is outlined on Neal Krawetz’s Fotoforensics site, which also links to an implementation in C. A more detailed explanation can be found in Section 3.3.3 of his Digital Image Analysis and Forensics whitepaper. As Krawetz explains, this method can become unreliable at quality values below 50%, so I didn’t consider it suitable for my use case.

Cogranne method

A 2018 paper by Rémi Cogranne describes an alternative method for estimating JPEG quality. It claims to overcome some of the limitations of established methods, such as the ImageMagick heuristic. However, as it is only valid for a quality factor greater than 49, this also isn’t ideally suited to my use case.

Scripts and test data

Revision history

  • 24 October 2024: re-arranged introductory section, and added an explanation on the difference between quality level and image quality.

  • 30 October 2024: revised earlier draft in preparation for the follow-up post.

  1. Note that the quality level does not necessarily reflect the image quality! As an example, if an image was first compressed at 20% quality and subsequently re-saved at 90%, the image quality will be very low (relative to the source image), despite the high quality level at the last save operation. So the quality level only says something about the compression process that was used on the last save. 

  2. I used Poppler’s pdfimages tool to extract the images from the PDF, using the -all switch which ensures images are kept in their original format. 

  3. I used ImageMagick 6.9.10-23. 

  4. More precisely, each value in the sums lists represents the sum of all “standard” quantization coefficients for a particular quality level. Similarly, each value in the hashes lists represents a value of qvalue for a particular quality level in the “standard” tables. 

  5. He mentions this in the context of the “Approximate Ratios” quality estimation method (which indeed becomes unreliable for low qualities). It’s not clear to me if other methods such as the one used by ImageMagick are also affected by this. 



