JPEG quality estimation using simple least squares matching of quantization tables

30 October 2024

Photograph of faded sign on building front showing the word 'Quality'. Adapted from Quality Coal by Greenville Daily Photo. Used under CC0 1.0 license.

In my previous post I addressed several problems I ran into when I tried to estimate the “last saved” quality level of JPEG images. It described some experiments based on ImageMagick’s quality heuristic, which led to a Python implementation of a modified version of the heuristic that improves the behaviour for images with a quality of 50% or less.

I still wasn’t entirely happy with this solution. This was partially because ImageMagick’s heuristic uses aggregated coefficients of the image’s quantization tables, which makes it potentially vulnerable to collisions. Another concern was that the reasoning behind certain details of ImageMagick’s heuristic seems rather opaque (at least to me!).

In this post I explore a different approach to JPEG quality estimation, which is based on a straightforward comparison with “standard” JPEG quantization tables using least squares matching. I also propose a measure that characterizes how similar an image’s quantization tables are to its closest “standard” tables. This could be useful as a measure of confidence in the quality estimate. I present some tests where I compare the results of the least squares matching method with those of the ImageMagick heuristics. I also discuss the results of a simple sensitivity analysis.

JPEG quality and standard quantization tables

ImageMagick’s JPEG quality heuristic is based on the “standard” quantization tables that are defined in Annex K of the JPEG standard. Its overall objective seems to be to match an image’s quantization tables with the most similar “standard” quantization tables. Since the quality level of each of these “standard” tables is known, this then provides the quality estimate.

ImageMagick’s heuristic does this in an indirect (and to me somewhat opaque) way, possibly to avoid computational cost. This post explores a more straightforward approach, which simply compares the coefficients in an image’s quantization tables against the corresponding coefficients in the “standard” tables, and then returns the quality level of the best match¹.

Scaling of standard tables to quality levels

To understand how this works, it’s first important to know that Annex K in the JPEG standard describes two “standard” quantization tables, one for luminance and one for chrominance. Here’s the luminance table:

16	11	10	16	24	40	51	61
12	12	14	19	26	58	60	55
14	13	16	24	40	57	69	56
14	17	22	29	51	87	80	62
18	22	37	56	68	109	103	77
24	35	55	64	81	104	113	92
49	64	78	87	103	121	120	101
72	92	95	98	112	100	103	99

This is the base table with coefficients that are valid for quality level 50. The tables for all other quality levels can be derived from this base table using Equations 1 and 2 from this paper by Kornblum (2008)². First, for each quality level Q we can calculate a corresponding scaling factor S:

S = \frac{5000}{Q} \quad (Q < 50)

S = 200 - 2Q \quad (Q \geq 50)

S is then used to calculate scaled quantization coefficients Tⁱ_s from the base coefficients Tⁱ_b using the following equation:

T_{s}^{i} = \max\left(\left\lfloor \frac{S \cdot T_{b}^{i} + 50}{100} \right\rfloor, 1\right)

Here i denotes the ith element in the table. Note the floor brackets, which mean the expression inside them is rounded down to the nearest integer. For 8-bit quantization tables (which is the most common situation) the scaled coefficients also need to be capped at a maximum of 255.

As an example, applying these equations to Q=75 results in a scaling factor S of 50, and the quantization coefficients become:

8	6	5	8	12	20	26	31
6	6	7	10	13	29	30	28
7	7	8	12	20	29	35	28
7	9	11	15	26	44	40	31
9	11	19	28	34	55	52	39
12	18	28	32	41	52	57	46
25	32	39	44	52	61	60	51
36	46	48	49	56	50	52	50

This way it is possible to calculate the quantization tables for all 100 quality levels. For the chrominance tables the same procedure can be used.
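As a rough illustration, the scaling procedure could be sketched in Python as follows. This is a minimal sketch and not the code from the test script: the names `scaling_factor` and `scale_table` are made up for this example, and the use of integer division for S at quality levels below 50 is an assumption (it mirrors what libjpeg does, but Kornblum's equation does not spell this out).

```python
# Base luminance table from Annex K (quality level 50), row by row.
BASE_LUMINANCE = [
    16, 11, 10, 16, 24, 40, 51, 61,
    12, 12, 14, 19, 26, 58, 60, 55,
    14, 13, 16, 24, 40, 57, 69, 56,
    14, 17, 22, 29, 51, 87, 80, 62,
    18, 22, 37, 56, 68, 109, 103, 77,
    24, 35, 55, 64, 81, 104, 113, 92,
    49, 64, 78, 87, 103, 121, 120, 101,
    72, 92, 95, 98, 112, 100, 103, 99,
]


def scaling_factor(quality):
    """Return the scaling factor S for a quality level between 1 and 100."""
    if not 1 <= quality <= 100:
        raise ValueError("quality must be between 1 and 100")
    if quality < 50:
        return 5000 // quality  # assumption: integer division
    return 200 - 2 * quality


def scale_table(base_table, quality, max_value=255):
    """Scale a base table to a quality level, clamping to 1..max_value."""
    s = scaling_factor(quality)
    # Integer floor division implements the floor brackets in the equation.
    return [min(max((s * t + 50) // 100, 1), max_value) for t in base_table]


table_75 = scale_table(BASE_LUMINANCE, 75)
print(scaling_factor(75))  # scaling factor S for Q=75
print(table_75[:8])        # first row of the scaled table
```

For Q=75 the first row printed here should match the first row of the table shown above, and scaling to Q=50 simply returns the base table.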

Estimating quality from scaled tables

For a given JPEG file, the quality can then be estimated by comparing its quantization tables against each of the scaled tables that are derived from the standard tables. As a basis for this comparison we can calculate, for each quality level, the sum of squared errors between the coefficients in the image’s quantization table and the corresponding coefficients in the scaled standard table. For an image with 2 quantization tables (one for luminance and another one for chrominance) this is given by:

SSE = \sum_{i = 1}^{64} {(T_{lum}^{i} - T_{s, lum}^{i})}^{2} + {(T_{chrom}^{i} - T_{s, chrom}^{i})}^{2}

Here, Tⁱ_lum and Tⁱ_chrom represent the coefficients for luminance and chrominance from the image’s quantization table, and Tⁱ_s,lum and Tⁱ_{s, chrom} are the corresponding coefficients from the (scaled) standard tables.

Repeating this for all quality levels results in 100 SSE values. The quality level with the smallest SSE value is then our best estimate for the quality of the image. An SSE value of exactly 0 means the image uses the standard JPEG quantization tables. In that case we can be confident that our estimate is the exact quality level at which the image was compressed. Larger values indicate the use of non-standard quantization tables, in which case the quality estimate may be less accurate.
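The matching step could be sketched along the following lines. Again, this is only an illustration under a few assumptions: the function names are hypothetical, S is computed with integer division below quality 50, and for brevity the usage example compares a single luminance table only (an image with two tables would pass both its tables and both base tables).

```python
BASE_LUMINANCE = [
    16, 11, 10, 16, 24, 40, 51, 61,
    12, 12, 14, 19, 26, 58, 60, 55,
    14, 13, 16, 24, 40, 57, 69, 56,
    14, 17, 22, 29, 51, 87, 80, 62,
    18, 22, 37, 56, 68, 109, 103, 77,
    24, 35, 55, 64, 81, 104, 113, 92,
    49, 64, 78, 87, 103, 121, 120, 101,
    72, 92, 95, 98, 112, 100, 103, 99,
]


def scale_table(base_table, quality, max_value=255):
    """Scale a base table to the given quality level (1..100)."""
    s = 5000 // quality if quality < 50 else 200 - 2 * quality
    return [min(max((s * t + 50) // 100, 1), max_value) for t in base_table]


def sum_squared_errors(image_tables, base_tables, quality):
    """SSE between the image's tables and the standard tables at one quality."""
    sse = 0
    for image_table, base_table in zip(image_tables, base_tables):
        scaled = scale_table(base_table, quality)
        sse += sum((t - ts) ** 2 for t, ts in zip(image_table, scaled))
    return sse


def estimate_quality(image_tables, base_tables):
    """Return (quality, sse) for the quality level with the smallest SSE."""
    errors = {
        q: sum_squared_errors(image_tables, base_tables, q)
        for q in range(1, 101)
    }
    best = min(errors, key=errors.get)  # lowest quality wins in case of a tie
    return best, errors[best]


# Usage: a table that was generated at quality 75 is recovered exactly.
image_table = scale_table(BASE_LUMINANCE, 75)
print(estimate_quality([image_table], [BASE_LUMINANCE]))
```

An SSE of 0 in the result corresponds to the "exact match with the standard tables" case described above.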

Characterizing similarity to standard tables

For images that don’t use the standard quantization tables, it would be useful to have some measure that expresses how much the quantization tables deviate from the best matching standard table. This could be used as a measure of confidence in the quality estimate. By itself, SSE is not a good measure of this. Firstly, it is influenced by the number of quantization tables in the image. We could remove this influence by transforming the SSE value to a root mean squared error (RMSE):

RMSE = \sqrt{\frac{SSE}{tables \cdot 64}}

Here, tables is the number of quantization tables (which is either 1 or 2). The interpretation of these RMSE values is still complicated by the fact that the quantization coefficients are significantly larger at lower quality levels. In practice this has the effect that the RMSE values are generally much larger at low quality levels relative to higher quality levels, even for images with a similar overall “fit” to a standard quantization table.

One “goodness of fit” measure that does not have this drawback is the Nash–Sutcliffe efficiency coefficient (NSE). It is mostly used in the field of hydrology to characterize how well the output of hydrological simulation models agrees with observations. Here, we will use it to characterize how well the coefficients in the standard table agree with those in the image’s quantization table. Rewritten for our quantization coefficients, NSE is given by:

NSE = 1 - \frac{\sum_{i = 1}^{N} {(T^{i} - T_{s}^{i})}^{2}}{\sum_{i = 1}^{N} {(T^{i} - \bar{T})}^{2}}

Here Tⁱ represents the ith coefficient from the image’s quantization tables, and Tⁱ_s is the corresponding coefficient from the (scaled) standard tables. N is the total number of coefficients in the image’s quantization tables. Note that, unlike in the SSE equation, the luminance and chrominance coefficients are lumped here for simplicity. Finally, T̄ is the mean of all coefficients Tⁱ in the image’s quantization tables. The interpretation of NSE is quite straightforward:

A value of 1 indicates a perfect agreement between the image quantization tables and the corresponding standard tables.
For a value of 0, the standard tables are as good (or rather, bad) an approximation of the image’s quantization tables as the mean T̄.
Negative values indicate an extremely poor agreement: in that case the mean T̄ would be a better approximation than the standard tables.
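To make the two measures concrete, here is a small Python sketch that computes SSE, RMSE and NSE for two equally long lists of coefficients. The function name `fit_metrics` is hypothetical. One thing the equations don't cover is the case where all image coefficients are identical (for instance a table of all ones at quality 100), which makes the NSE denominator zero; the handling of that case below is purely an assumption for this example.

```python
import math


def fit_metrics(image_coeffs, standard_coeffs):
    """Return (sse, rmse, nse) for lumped image and standard coefficients."""
    n = len(image_coeffs)
    if n == 0 or n != len(standard_coeffs):
        raise ValueError("both lists must be non-empty and equally long")

    sse = sum((t - ts) ** 2 for t, ts in zip(image_coeffs, standard_coeffs))
    rmse = math.sqrt(sse / n)  # n equals tables * 64 for whole tables

    mean = sum(image_coeffs) / n
    spread = sum((t - mean) ** 2 for t in image_coeffs)
    if spread == 0:
        # Assumption: NSE is undefined here, so report 1.0 for an exact
        # match and minus infinity otherwise.
        nse = 1.0 if sse == 0 else float("-inf")
    else:
        nse = 1 - sse / spread
    return sse, rmse, nse


print(fit_metrics([2, 4, 6, 8], [2, 4, 6, 8]))  # perfect agreement
print(fit_metrics([2, 4, 6, 8], [5, 5, 5, 5]))  # no better than the mean
```

The second call uses the mean of the image coefficients as the "standard" values, which is exactly the situation where NSE equals 0.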

As an example, the scatter plot below shows the quantization coefficients (T) from one of our dbnl master images, plotted against the corresponding coefficients (T_s) from the best matching standard table. It also shows the line of perfect agreement (green, dashed), the quality estimate, the root mean squared error, and the Nash-Sutcliffe Efficiency:

scatter plot of T against Ts for file mul-master.jpg, with Quality = 84%, RMSE = 1.057 and NSE = 0.997.

The plot shows that, aside from one outlier in the chrominance table, the image’s quantization coefficients are closely approximated by those in the standard quantization tables. This is reflected by the NSE value, which is close to 1. Now compare this with the plot I made for the corresponding access image:

scatter plot of T against Ts for file mul-access.jpg, with Quality = 18%, RMSE = 6.244 and NSE = 0.999.

Visually, the agreement between the standard quantization table coefficients and those of the image looks very similar to the previous plot. But note how the RMSE value is much larger here, which is caused by the much larger overall coefficients in the quantization tables. However, this doesn’t have any effect on NSE, which is even marginally closer to 1 than for the master image.

By contrast, things are very different for this image:

scatter plot of T against Ts for file image-177.jpg, with Quality = 81%, RMSE = 20.509 and NSE = 0.732.

The plot shows that the standard JPEG tables are a relatively poor approximation here, and this is reflected by the low NSE value. A similar case is this image:

scatter plot of T against Ts for file sample-jpg-files-sample-4.jpg, with Quality = 89%, RMSE = 12.294 and NSE = 0.734.

Despite the different quality estimate and RMSE value, this visually looks like it’s in the same ballpark as the previous image, and the similar NSE value confirms this.

Based on these examples, NSE appears to give a good indication of the fit between the image and standard quantization coefficients. This makes it useful as a measure to assess the confidence in the method’s quality estimates.

Python implementation of least squares matching method

I created a test script with a Python implementation of the method. Apart from the quality estimate, it also reports the corresponding RMSE and NSE values.

Tests with Pillow and ImageMagick JPEGs

As a first test I ran the script on the Pillow and ImageMagick JPEGs I discussed in my previous post. This gave the following result:

Q_enc	Q_est(PIL)	NSE(PIL)	Q_est(IM)	NSE(IM)
5	5	1.0	5	1.0
10	10	1.0	10	1.0
25	25	1.0	25	1.0
50	50	1.0	50	1.0
75	75	1.0	75	1.0
100	100	1.0	100	1.0

Here Q_enc is the encoding quality, and Q_est(PIL) and Q_est(IM) are the script’s estimates for the Pillow and ImageMagick images, respectively. The script correctly reproduced the encoding quality for all test images. The values RMSE=0 and NSE=1 also indicate that all images use the standard JPEG quantization tables.

Comparison of quality estimation methods

Far more interesting is the behaviour for images that don’t use the standard tables. The previous section already showed the results for the dbnl master and access JPEGs that started this work. To test the method on a more diverse selection of images, I downloaded JPEGs from a variety of sources³. I then ran them through this test script that estimates the JPEG quality using my Python port of the original ImageMagick heuristic, the modified ImageMagick heuristic from my previous post, and the least squares matching method. I then identified images for which one or more of these methods came up with different results, and added a selection of them to this dataset. The following table shows the results of the script for this dataset:

File	Q (im)	Q (im, mod)	Exact (im, mod)	Q (lsm)	RMSE (lsm)	NSE (lsm)
psgradient.jpg	92	92	False	93	4.438	0.747
hopper_16bit_qtables.jpg	na	1	False	13	0.795	1.0
image-177.jpg	63	63	True	81	20.509	0.732
image-98.jpg	63	63	True	81	20.509	0.732
jpeg420exif.jpg	90	90	False	89	0.424	0.999
jpeg422jfif.jpg	96	96	False	97	0.0	1.0
jpeg444.jpg	60	60	False	75	0.0	1.0
sample-birch-400x300.jpg	95	95	False	94	3.188	0.838
sample-jpg-files-sample-4.jpg	78	78	True	89	12.294	0.734
tapedeck1.jpg	90	90	False	93	0.753	0.992
tapedeck2.jpg	92	92	False	95	0.421	0.996

Here Q(im) is the quality estimate from the original ImageMagick heuristic, Q(im, mod) is the quality estimate from the modified ImageMagick heuristic, and Q(lsm) is the quality estimate from the least squares matching method. In addition, Exact(im, mod) is the “exactness” indicator of the modified ImageMagick heuristic, and RMSE(lsm) and NSE(lsm) are the root mean squared error and Nash-Sutcliffe Efficiency values reported by the least squares matching method, respectively.

Below I highlight some of the more interesting results.

jpeg444.jpg

For this image, both the original and modified ImageMagick heuristics estimate the quality at 60%, with no “exact” match. By contrast, the least squares matching method came up with a quality of 75%, with RMSE and NSE indicating a perfect match with the standard JPEG quantization tables. This is confirmed by plotting the coefficients from the quantization tables against the standard coefficients:

I double-checked the result by uploading the image to FotoForensics, which also came up with 75% quality, and an exact match with the standard tables.

image-98.jpg

Here, the quality is estimated at 63% by both the original and modified ImageMagick heuristics, with an “exact” match. However, the least squares matching method results in a much higher (81%) quality, but the relatively low NSE value of 0.732 indicates a poor fit to the standard JPEG tables. This is confirmed by the scatter plot of T against T_s :

The FotoForensics result also indicates a quality of 81%.

hopper_16bit_qtables.jpg

This image is interesting for a number of reasons. ImageMagick’s original heuristic fails to come up with a quality estimate, while the modified ImageMagick heuristic returns a 1% estimate. Meanwhile, the least squares matching method estimates the quality at 13%. Here’s the corresponding scatter plot:

scatter plot of T against Ts for file hopper_16bit_qtables.jpg, with Quality = 13%, RMSE = 0.795 and NSE = 1.0.

The coefficients in the JPEG quantization tables are usually stored as 8-bit unsigned integers, which means the highest possible value is 255. This particular image uses 16-bit values instead, which we can see from the range of T values in the plot, which goes all the way up to 380! ImageMagick’s heuristic is unable to deal with this⁴, which results (for the modified version) in an unrealistically low value. The least squares matching method explicitly checks for coefficients outside the 8-bit range, and adjusts its calculations accordingly. Its quality estimate corresponds to the assessment by FotoForensics.
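One way such a check could look is sketched below. This is an assumption about the general idea, not a copy of the script's logic: the function names are hypothetical, the upper limit of 65535 for 16-bit tables is chosen for illustration only, and S is again computed with integer division.

```python
def coefficient_limit(image_tables):
    """Pick the cap for scaled coefficients based on the image's tables."""
    largest = max(max(table) for table in image_tables)
    # Assumption: anything above 255 means 16-bit tables, capped at 65535.
    return 255 if largest <= 255 else 65535


def scale_coefficient(base_value, quality, max_value):
    """Scale one base coefficient to a quality level, clamped to 1..max_value."""
    s = 5000 // quality if quality < 50 else 200 - 2 * quality
    return min(max((s * base_value + 50) // 100, 1), max_value)


# A base coefficient of 99 at quality 13, with and without the 8-bit cap.
print(scale_coefficient(99, 13, 255))
print(scale_coefficient(99, 13, coefficient_limit([[16, 380]])))
```

With the 8-bit cap the scaled value gets clipped at 255, so the comparison against a 16-bit table would be distorted for every coefficient above that limit.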

One detail that caught my attention is the non-zero RMSE value, which at first sight seems at odds with the reported NSE value of 1.0. On closer inspection, it turned out that NSE is actually marginally smaller than 1 here, but this is obscured by rounding the reported values to 3 decimals⁵.

sample-jpg-files-sample-4.jpg

Both ImageMagick heuristics estimate the quality of this image at 78% with an “exact” match, whereas the least squares matching method gives a much higher estimate of 89% (with quite a poor fit with the standard tables). The corresponding FotoForensics estimate is marginally different from this at 88%, but still very close. For completeness here’s its scatter plot (again):

Conclusions from this comparison

Even though the tests presented here are quite limited, the differences that can occur between the quality estimates by the least squares matching method and the ImageMagick heuristics are quite striking. What surprised me in particular was that even for JPEGs that use the standard quantization tables, ImageMagick’s heuristic may still provide quality estimates that are quite inaccurate. Of course, a major limitation here is the lack of reliable “ground truth” in the form of known quality settings at the time the test images were created. However, the good agreement between the quality estimates of the least squares matching method and the FotoForensics service does inspire some confidence in the methodology.

Another surprise was that ImageMagick’s “exactness” flag isn’t actually indicative of an exact match with the standard JPEG tables. None of the test images for which it returned a “True” value actually contains standard quantization tables, whereas the two images that do contain standard tables resulted in a “False” value!

Sensitivity analysis

To get a better impression of how the method behaves with non-standard quantization tables, I did a simple sensitivity analysis using the cjpeg utility that is part of libjpeg. This tool supports compression with separate quality levels for the luminance and chrominance tables. I used this to compress one source image to all possible quality level combinations. This resulted in 10,000 images, which I then ran through the least squares matching method.
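A batch like this could be prepared with a few lines of Python. The sketch below only builds the command lines; it assumes that cjpeg accepts a comma-separated pair of values for its `-quality` option (luminance first, chrominance second), that the quality levels run from 1 to 100, and that the source image is in a format cjpeg can read, such as PPM. The file names and the output naming scheme are made up for this example.

```python
from itertools import product
from pathlib import Path


def build_cjpeg_commands(source, out_dir):
    """Build one cjpeg command per (luminance, chrominance) quality pair."""
    commands = []
    for q_lum, q_chrom in product(range(1, 101), repeat=2):
        target = Path(out_dir) / f"q_{q_lum:03d}_{q_chrom:03d}.jpg"
        commands.append([
            "cjpeg",
            "-quality", f"{q_lum},{q_chrom}",
            "-outfile", str(target),
            str(source),
        ])
    return commands


commands = build_cjpeg_commands("source.ppm", "out")
print(len(commands))  # number of quality combinations
print(commands[0])

# To actually create the files, each command could be passed to
# subprocess.run(command, check=True), provided cjpeg is installed.
```

Keeping the command construction separate from the execution makes it easy to inspect the full list before launching ten thousand compression runs.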

The plot below shows the distribution of estimated quality values Q_lsm against Q_av, which are the corresponding averages of encoding qualities Q_lum and Q_chrom⁶:

scatter plot of average encoding quality versus estimated encoding quality.

At low average encoding qualities, the bandwidth of corresponding estimates by the least squares matching method is very wide, but this narrows considerably towards higher quality levels.

Another, probably more useful view on the results appears if we plot the quality estimation errors, here expressed as absolute differences between Q_av and Q_lsm, against the Nash-Sutcliffe Efficiency coefficients:

scatter plot of prediction errors (absolute differences between estimated and average encoding), versus Nash-Sutcliffe Efficiency.

This shows a strong association between NSE values in the upper range and small prediction errors. At lower NSE values, the range of prediction errors becomes progressively wider. Thus, this demonstrates how NSE can be useful as a measure of confidence in the quality estimate.

Performance

One potential concern about the least squares matching method might be that it is computationally not very efficient: for each image, the analysis typically involves 200 comparisons of 64-element tables. Out of interest I did a little performance test using this collection of 700 JPEGs. I analyzed these files with my scripts with the Python ports of the original and modified ImageMagick heuristics, and the least squares matching method.

For each script run, I first ran this command to empty the cache memory:

sudo sysctl vm.drop_caches=3

I then ran each script like this:

(time python3 ./jpeg-quality-demo/jpegquality-lsm.py ./sample-images/images/*.jpg > sample-images.txt) 2> time-lsm.txt

The “time” command results in 3 performance metrics, the most important of which are:

“real” - the actual amount of time passed between starting the script and its termination.
“user” - actual CPU time used in executing the process⁷.

The table below shows these metrics for the three scripts⁸:

Method	time (real)	time (user)
im original	0m7,624s	0m3,321s
im modified	0m7,215s	0m2,968s
lsm	0m13,134s	0m8,673s

This shows the least squares matching method is almost 3 times slower than the ImageMagick heuristics in terms of “user” time, and about 2 times slower in terms of “real” time. This translates to an average processing time of 0.01 to 0.02 s per file. Therefore, the reduced performance shouldn’t be any problem in practical terms.
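For reference, these ratios and per-file times follow directly from the table. The snippet below just redoes the arithmetic, with the timings converted to seconds and the collection size of 700 files taken from the text above.

```python
N_FILES = 700

# (real, user) times in seconds, taken from the table above.
timings = {
    "im original": (7.624, 3.321),
    "im modified": (7.215, 2.968),
    "lsm": (13.134, 8.673),
}

lsm_real, lsm_user = timings["lsm"]

for name in ("im original", "im modified"):
    real, user = timings[name]
    print(f"lsm vs {name}: real x{lsm_real / real:.2f}, user x{lsm_user / user:.2f}")

per_file_real = lsm_real / N_FILES
per_file_user = lsm_user / N_FILES
print(f"lsm per file: real {per_file_real:.3f} s, user {per_file_user:.3f} s")
```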

Final thoughts

I originally wrote the least squares matching method code in an attempt to better understand JPEG quality estimation, and to make a more informed assessment of how ImageMagick’s heuristic works. Based on the tests described here, I think I ended up with something that might actually be quite useful, and preferable to either the original or modified ImageMagick heuristic.

By itself the method isn’t in any way novel, as it’s basically just another implementation of the “Approximate Quantization Tables” quality estimation method as described by Neal Krawetz⁹. I expect many other, very similar implementations exist that I’m simply not aware of, particularly in the digital forensics domain. This makes it all the more surprising that these apparently haven’t made it to popular image processing and analysis software like ImageMagick. The use of the Nash-Sutcliffe Efficiency as a measure of confidence in the quality estimate may be somewhat novel, but I wouldn’t be surprised if other (and possibly better) methods for this exist.

Finally, it’s important to be aware that the characterization of JPEG quality using the 1 - 100 scale that follows from the “standard” quantization tables is by itself pretty arbitrary¹⁰. Essentially, the corresponding “quality” values are just pointers to sets of quantization tables that only have ordinal significance (i.e. higher values mean better quality), but not much more. Due to its wide use, and the lack of any better alternative, it’s still a useful benchmark. This is also why I think it’s important to provide some information on the similarity of an image’s quantization tables to the “standard” ones, as this helps to assess the confidence in the quality estimate.

As always, any feedback and suggestions in response to this post are very welcome!

Scripts and test data

jpeg-quality-demo GitHub repository - GitHub repo with all scripts and test data that were used in this analysis.
jpegquality-lsm.py - Python implementation of the least squares matching method.

Important note on Python Pillow version

The Python implementation of the least squares matching method (and most of the other scripts as well) requires a recent version of the Pillow Imaging Library. This is because around the release of version 8.3 (I think) Pillow changed the order in which it returns the values inside JPEG quantization tables (details here). All scripts in the repo expect the current/new behaviour, and they will give very wrong results when used with older Pillow versions!
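A simple guard against this could be a version check at the start of a script. The sketch below is a hypothetical helper, not something taken from the repo; the threshold of 8.3 reflects the uncertain version number mentioned above and should be treated as an assumption.

```python
def pillow_is_recent(version, minimum=(8, 3)):
    """Return True if a Pillow version string is at least the minimum."""
    # Only the major and minor components are compared.
    parts = tuple(int(part) for part in version.split(".")[:2])
    return parts >= minimum


print(pillow_is_recent("8.2.0"))
print(pillow_is_recent("10.4.0"))

# In a real script the installed version could be checked like this:
#   import PIL
#   if not pillow_is_recent(PIL.__version__):
#       raise SystemExit("Pillow is too old, quality estimates would be wrong")
```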

Revision history

1 November 2024: added paragraph on significance of JPEG quality scale; added sensitivity analysis.

¹ This corresponds to the “Approximate Quantization Tables” method that is mentioned on (and used by) Neal Krawetz’s FotoForensics site (but the site doesn’t provide any details about the implementation).
² See also this post on StackOverflow.
³ Most notably sample-images, samplelib.com, toolsfairy.com, w3.org and the Pillow source repository.
⁴ This is because its hard-coded sums and hashes lists (see my previous post) are based on 8-bit values.
⁵ In this case the numerator of the NSE equation (the SSE value) was 81, and the denominator (the sum of squared deviations of the quantization coefficients from their mean) 7743557. This results in NSE = 1 - (81/7743557) = 0.99998954, which is reported as 1.0 when rounded to 3 decimals. Meanwhile RMSE = √(81/128) = 0.795.
⁶ The actual significance of Q_av is somewhat questionable, especially for extreme high/low combinations of Q_lum and Q_chrom.
⁷ For a more in-depth explanation of these metrics see: https://stackoverflow.com/a/556411/1209004
⁸ I ran the test on a pretty low spec machine with an Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz with 4 cores.
⁹ In response to this post, Krawetz let me know that FotoForensics uses a different algorithm that isn’t based on least squares.
¹⁰ See e.g. the JPEG FAQ for an explanation.


