What is Wavelet Compression?

Last updated: 12 September 2008
Published in: Creating new digital media

Summary

The wavelet transform has been around for some time, but has only recently been applied to image compression. It is now used in several file formats, of which the best known is JPEG 2000.

You’ll find a simple overview of compression techniques in the JISC Digital Media Advice document: File Formats and Compression. This current FAQ offers a fuller explanation of wavelet compression, whilst still trying to avoid using too much hard maths.

The wavelet transform has been around for some time, but has only recently been applied to image compression. It is now used in several file formats, of which the best known is JPEG 2000. We’ll use JPEG 2000 as an example of wavelet compression, comparing it with the common JPEG/JFIF format, which uses an inferior, non-wavelet compression.

Several of the fundamental differences between the common JPEG and JPEG 2000 are directly related to the different approaches they take to compression. These include the option of lossless compression in JPEG 2000 (unavailable in the common JPEG), the smoothness of highly compressed JPEG 2000 images (compared to the ‘blockiness’ of common JPEGs), and the additional display functionality, including zooming, offered by JPEG 2000.

The common JPEG’s compression (based on the DCT - Discrete Cosine Transform) cuts the image into blocks of 64 pixels (8x8) and processes each block independently, shifting and simplifying the colours so there is less information to encode. Pixels are changed only in relation to the other pixels within their block, so two identical pixels that are next to each other, but in different blocks, could be transformed in different ways. Consequently, when high compression (maximum simplification) is used, the block boundaries become obvious, causing the ‘blockiness’ or ‘blocking artefact’ frequently observed in the common JPEG.
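The effect of processing blocks independently can be sketched in a few lines of Python. This is not a DCT: as a stand-in for very heavy compression, where little more than each block's average survives, the hypothetical `simplify_blocks` function below simply replaces every block of 8 values in a single row with its mean. The row of pixel values is invented for illustration.

```python
BLOCK = 8  # the common JPEG works on 8x8 blocks; one row of 8 is used here

def simplify_blocks(row, block=BLOCK):
    """Replace each block of values with the block's own average."""
    out = []
    for start in range(0, len(row), block):
        chunk = row[start:start + block]
        mean = sum(chunk) / len(chunk)
        out.extend([mean] * len(chunk))
    return out

# A smooth ramp of grey values: neighbouring pixels always differ by 1.
row = list(range(16))
flat = simplify_blocks(row)

# Pixels 7 and 8 are neighbours, but they sit in different blocks.
jump_before = row[8] - row[7]    # step across the boundary in the original
jump_after = flat[8] - flat[7]   # step across the boundary after simplification
print(jump_before, jump_after)
```

The two neighbouring pixels were almost identical in the original, yet after each block has been simplified on its own they end up far apart. That sudden step at the block boundary is the blocking artefact.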

In contrast, JPEG 2000’s wavelet compression (known as DWT - Discrete Wavelet Transform) treats the image as a signal or wave, rather than small sets of discrete pixel values. The transform organises the image information into a continuous wave - typically with many peaks and dips - and centres it on zero. Strictly speaking, it regards the image as a series of waves, one for each colour channel (e.g. Red, Green, Blue), and it does sometimes break up big images into large tiles for ease of processing, but it’s simpler to talk about one wave here - the process is exactly the same for each.

Having centred the wave, the transform records the distances from the zero line to points along the wave (these distances are known as coefficients) and then takes the average between adjacent coefficients to produce a simplified version of the wave - in effect, it reduces the image’s resolution or detail by half.  The averages are then averaged again, and so on, producing progressively simpler waves. This process is known as ‘decomposition’.
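The centring and repeated averaging described above can be sketched as follows. This is a simplified illustration rather than the JPEG 2000 algorithm itself: real codecs use longer filters than a plain two-point average. The function names and sample values are invented for the example, and the sketch assumes 8-bit samples and a row whose length is a power of two.

```python
def centre(samples, bit_depth=8):
    """Shift unsigned samples so they are centred on zero (0..255 becomes -128..127)."""
    offset = 2 ** (bit_depth - 1)
    return [s - offset for s in samples]

def halve(wave):
    """Average each adjacent pair, giving a wave with half as many points."""
    return [(wave[i] + wave[i + 1]) / 2 for i in range(0, len(wave), 2)]

def decompose_averages(wave):
    """Return the progressively simpler waves, most detailed first."""
    levels = [list(wave)]
    while len(levels[-1]) > 1:
        levels.append(halve(levels[-1]))
    return levels

pixels = [130, 134, 120, 116, 200, 204, 90, 94]  # one row of grey values
levels = decompose_averages(centre(pixels))
for level in levels:
    print(len(level), level)
```

Each pass halves the number of points, so eight values become four, then two, then a single overall average.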

This is only part of the story. In the process of decomposing a wave, the wavelet transform identifies significant and less significant variations in the image. How does it do this? As the coefficients are averaged, the differences are recorded. Smaller differences between adjacent coefficients (flatter areas of the wave) represent slight variations within the image - they are prime candidates for smoothing and simplification. Larger differences (steeper rises or falls) represent more significant detail, typically lines or edges within the picture - aspects of the image that need to be preserved.

As the wavelet transform takes place it generates progressively lower resolution versions of the wave, approximating the general shape and colour of the image. In addition to this, it has all the information necessary to reconstruct the wave in finer detail.

This can be quite hard to visualise, but the Haar function (see box below) provides a simple working model of wavelet decomposition.

The Haar function: a simple wavelet example

Let us assume we have a one-dimensional image with just four values (i.e. four pixels in a row, each a different shade of grey):

[Illustration of 4-pixel image]

1, 3, 8, 6

We could take the average of the first pair and then the average of the second pair to give:

[Illustration of 2-pixel image]

2, 7

We have simplified the image and, in effect, halved its resolution. But we’ve also lost some information along the way. In order to restore the detail, we would need to record the differences as well as the averages. These are known as the ‘detail coefficients’: each is half the difference between the two values in a pair, and it is added to and subtracted from the pair’s average in order to reconstruct the original values. In the example above, the detail coefficients are -1 and 1. This is because:

2 + -1 = 1

2 - -1 = 3

7 +  1 = 8

7 -  1 = 6
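The same arithmetic can be written as a short Python sketch. The function names `haar_step` and `haar_unstep` are made up for this example; the detail coefficient is taken as half of (first value minus second value), which matches the signs used above.

```python
def haar_step(values):
    """Split a row into pairwise averages and detail coefficients."""
    averages, details = [], []
    for i in range(0, len(values), 2):
        a, b = values[i], values[i + 1]
        averages.append((a + b) / 2)  # the simplified, half-resolution value
        details.append((a - b) / 2)   # what must be added/subtracted to get a and b back
    return averages, details

def haar_unstep(averages, details):
    """Rebuild the original row: average + detail, then average - detail."""
    values = []
    for avg, d in zip(averages, details):
        values.extend([avg + d, avg - d])
    return values

averages, details = haar_step([1, 3, 8, 6])
restored = haar_unstep(averages, details)
print(averages, details, restored)
```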

If we keep on decomposing the image we will soon arrive at one overall average and one detail coefficient. In the simple example above, we only need to decompose the image once more, taking the average of 2 and 7 to give an overall average of 4.5 and a detail coefficient of -2.5 (since 4.5 + -2.5 = 2 and 4.5 - -2.5 = 7).

This table shows how our image has been decomposed:

Resolution                     Averages      Detail coefficients
4 (original, detailed image)   1, 3, 8, 6    -
2 (half resolution)            2, 7          -1, 1
1 (coarsest image)             4.5           -2.5

If we recorded the overall average (4.5) and then the detail coefficients in order of increasing resolution, we would have enough information to reconstruct the image at half resolution (2, 7) or full resolution (1, 3, 8, 6). All we’d need to write down is:

4.5, -2.5, -1, 1
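As a rough sketch, the whole decomposition and its reversal might look like this in Python. The names `haar_decompose` and `haar_reconstruct` are hypothetical, and the code assumes the row length is a power of two.

```python
def haar_decompose(values):
    """Return [overall average, then details from coarsest to finest]."""
    current = list(values)
    details_by_level = []
    while len(current) > 1:
        pairs = range(0, len(current), 2)
        averages = [(current[i] + current[i + 1]) / 2 for i in pairs]
        details = [(current[i] - current[i + 1]) / 2 for i in pairs]
        details_by_level.insert(0, details)  # keep the coarsest level first
        current = averages
    coeffs = current
    for details in details_by_level:
        coeffs = coeffs + details
    return coeffs

def haar_reconstruct(coeffs):
    """Undo haar_decompose, doubling the resolution at each pass."""
    current = list(coeffs[:1])
    pos = 1
    while pos < len(coeffs):
        details = coeffs[pos:pos + len(current)]
        pos += len(current)
        current = [v for avg, d in zip(current, details) for v in (avg + d, avg - d)]
    return current

coeffs = haar_decompose([1, 3, 8, 6])
print(coeffs)
print(haar_reconstruct(coeffs))
```

The list holds the same number of values as the original image, so nothing has been saved yet. The transform only rearranges the information into a form that is easier to compress.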

This representation is lossless, since the image can be reconstructed from it exactly as it was. If we threw away any of the detail in an image this small, the loss would be very obvious.

However, when dealing with larger, naturalistic images, the detail coefficients are often less significant (i.e. close to 0) across broad areas of the image. If we just kept the significant detail and a simplified version of the rest we would be able to produce smaller files that appear as good as the original when they are decompressed. The result would be a lossy compressed image.

It is important to note that the process of shifting and decomposing the wave (the Discrete Wavelet Transform) does not in itself lose or compress any image information. Any savings in file size and loss of information occur in later stages of the compression. During these stages all of the information necessary to reconstruct the complete wave can be kept and encoded, providing a lossless compression.

Alternatively, the significant detail identified by the decomposition process can be retained, while the insignificant variations can be smoothed over and simplified (a process known as quantisation) and then encoded as a lossy compressed image.
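One common way to carry out this kind of simplification is a ‘dead-zone’ quantiser, sketched below in Python. The function names, the step size and the sample coefficients are all assumptions chosen for illustration: small coefficients collapse to zero, while larger ones are kept, though only to the nearest step.

```python
import math

def quantise(coeffs, step):
    """Map each coefficient to an integer index; values smaller than step become 0."""
    return [int(math.copysign(math.floor(abs(c) / step), c)) for c in coeffs]

def dequantise(indices, step):
    """Approximate the original values, placing each non-zero one mid-way through its step."""
    return [0.0 if q == 0 else math.copysign((abs(q) + 0.5) * step, q) for q in indices]

# Hypothetical detail coefficients: mostly slight variation, plus two strong edges.
details = [0.4, -0.7, 12.3, -0.2, -9.8, 0.1]
indices = quantise(details, step=2)
approx = dequantise(indices, step=2)
print(indices)
print(approx)
```

The long runs of zeros are what later encoding stages can store very compactly. The two large coefficients survive, slightly altered, which is where the loss in a lossy compression comes from.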

JPEG 2000’s wavelet compression is superior to the common JPEG’s compression because it is able to treat larger areas of the image at once, and in a more discriminating way. The image can be compressed more tightly, while at the same time preserving the detail and avoiding any blockiness.

In addition to improving the quality and efficiency of compression, the products of a wavelet transform can be used to enhance the delivery of an image. The decomposition process produces a series of increasingly simplified versions of the image (either smaller or less detailed, depending on how they are encoded). If these are ‘played back’ in reverse as the image is reconstructed and displayed, the result is a picture that grows progressively in size (i.e. resolution) or in detail (fidelity). JPEG 2000 offers both of these among its display options, and software developers could exploit them further to provide zooming and panning facilities.
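This ‘playing back’ can be illustrated with the four-value Haar example from earlier. The Python sketch below is a minimal model, not how a JPEG 2000 viewer is actually written: the hypothetical `progressive_views` generator hands back a usable image after each batch of detail coefficients has been read, so a viewer could display something before the whole file has arrived.

```python
def progressive_views(coeffs):
    """Yield the image at each resolution, coarsest first.

    coeffs is the overall average followed by the detail
    coefficients in order of increasing resolution.
    """
    current = list(coeffs[:1])
    yield current
    pos = 1
    while pos < len(coeffs):
        details = coeffs[pos:pos + len(current)]
        pos += len(current)
        # Each average expands into two values: average + detail, average - detail.
        current = [v for avg, d in zip(current, details) for v in (avg + d, avg - d)]
        yield current

views = list(progressive_views([4.5, -2.5, -1, 1]))
for view in views:
    print(len(view), view)
```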
