
Practical JPEG Error Level Analysis

At the heart of the Newslinn platform is image validation. One technique we are researching is Error Level Analysis (ELA). It is specific to the JPEG image file format and is a niche area of fraudulent-image detection.

I’m going to attempt to describe ELA and how Newslinn is hoping to use it within our platform. We will need to cover some ground first…

What’s a JPEG?

So a JPEG is a type of file format for saving photographs. There’s a lot of history to it, but in short, it was designed specifically for storing photographs, it became the standard photo format on the internet, and it is the main file format that digital cameras and smartphones use to store photos.

A JPEG file is a really cool file format – because it’s cool, it does things that other file formats don’t.

As a side note, other image file formats are GIF, PNG, BMP, TIFF.

JPEG Compression

The JPEG file format really came into its own on the internet back when the internet was “low bandwidth”. Everything on the internet needed to be small in file size, and the JPEG file format has a ‘compression’ feature: when a photograph is stored as a JPEG file, the file is much smaller than if it were stored in another image file format, e.g. a Bitmap (BMP) file.

So for this photograph

original

JPEG File Size = 377 KB

BMP File Size = 5,343 KB

The JPEG file format is able to do that file size reduction because it uses compression.
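To put those two numbers in perspective, here is the arithmetic as a few lines of Python. The variable names are just for illustration.

```python
# File sizes quoted above, in kilobytes.
jpeg_kb = 377
bmp_kb = 5343

# How many times smaller the JPEG is than the BMP.
ratio = bmp_kb / jpeg_kb

# Share of the BMP's size that the JPEG avoids storing.
saving = 1 - jpeg_kb / bmp_kb

print(f"JPEG is about {ratio:.1f}x smaller ({saving:.0%} less data)")
```

So the JPEG is roughly 14 times smaller than the BMP, which is about 93% less data for the same photograph.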

How does JPEG Compression work?

So for the JPEG file format to compress a photo, it splits up the photo into tiny squares of 8×8 pixels. When a JPEG is saved with heavy compression (a low quality setting), you can see the tiny squares.

Capture

What it does is quite genius.

To lower the file size, the JPEG file format reduces the number of colours in the photograph. So say the original example photograph contains 45,265 unique colours; the JPEG file format might reduce those to only 20,000 colours.

How it goes about that is connected with these tiny 8×8 squares. In simplified terms, the JPEG file format takes one of the 8×8 squares, figures out which colours are in it, decides what the average colour is and uses that average to replace similar colours in the square, thus reducing the number of colours. (Strictly speaking, JPEG transforms each square with a discrete cosine transform and rounds off the fine detail, but the averaging picture is close enough for our purposes.)
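The averaging idea can be sketched in a few lines of plain Python. This only mirrors the simplified description above and is not the real JPEG algorithm. The function names, the greyscale list-of-rows image and the 16×16 test pattern are all assumptions made for illustration.

```python
def average_blocks(pixels, block=8):
    """Replace every block x block square with its average value.

    `pixels` is a list of rows of greyscale values (0-255).
    """
    height, width = len(pixels), len(pixels[0])
    out = [row[:] for row in pixels]
    for top in range(0, height, block):
        for left in range(0, width, block):
            # Coordinates of this square (squares at the edge may be smaller).
            cells = [(y, x)
                     for y in range(top, min(top + block, height))
                     for x in range(left, min(left + block, width))]
            mean = round(sum(pixels[y][x] for y, x in cells) / len(cells))
            for y, x in cells:
                out[y][x] = mean
    return out


def count_unique(pixels):
    """Number of distinct values in the image."""
    return len({value for row in pixels for value in row})


# A synthetic 16x16 test image with many distinct values.
image = [[(x * 7 + y * 13) % 256 for x in range(16)] for y in range(16)]
smoothed = average_blocks(image)

print(count_unique(image), "->", count_unique(smoothed))
```

A 16×16 image holds four 8×8 squares, so after averaging there can be at most four distinct values left, however many there were to begin with.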

So that’s all well and good. It’s easy enough to understand how the JPEG file format makes a photograph’s file size small.

It is this compression method that enables ‘Error Level Analysis’ on the JPEG file format.

What is Error Level Analysis (ELA)?

Error Level Analysis is a way to see what areas of a photograph have been changed.

So if someone took a photo with their smartphone, opened it up in Photoshop and changed something about that photo – Error Level Analysis is a way to try and detect what was changed.

It’s not an exact science yet, but it’s useful for raising suspicion if nothing else.

How does ELA work?

ELA works because of how JPEG compresses photographs in tiny 8×8 squares – and it works because each time a JPEG image is saved, it gets compressed again.

So that is where the magic is.

When a photograph is first saved as a JPEG, it is compressed for the first time.

If the image is then opened in Photoshop, edited and saved again as a JPEG, it gets compressed again.

What this means is that the “original” parts of the photographic image have been compressed twice – once by the camera that took the photo and again by Photoshop.

The “edited” part of the photographic image, by contrast, was only compressed once, by Photoshop.

To the human eye, the difference is not noticeable just by looking at the image. However, you can reveal it by re-saving the image as a JPEG at a known quality and looking at the differences between the image and its re-saved copy.

This is the basis of JPEG Error Level Analysis.
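The reasoning above rests on one property of JPEG: content that has already been compressed changes much less when it is saved again than content that has never been compressed. Here is a rough way to see that for yourself, assuming the Pillow imaging library is installed. The helper names, the quality setting of 90 and the random-noise test image are my own choices for illustration, and the exact numbers will vary with the Pillow build.

```python
import io
import random

from PIL import Image, ImageChops, ImageStat


def save_as_jpeg(image, quality):
    """Round-trip an image through JPEG compression in memory."""
    buffer = io.BytesIO()
    image.save(buffer, format="JPEG", quality=quality)
    buffer.seek(0)
    reopened = Image.open(buffer)
    reopened.load()
    return reopened.convert("RGB")


def mean_change(before, after):
    """Average absolute per-channel difference between two images."""
    diff = ImageChops.difference(before, after)
    return sum(ImageStat.Stat(diff).mean) / 3


# Synthetic "never compressed" picture: seeded random noise, used here
# only so the example needs no image file.
rng = random.Random(0)
size = (64, 64)
noise = bytes(rng.randrange(256) for _ in range(size[0] * size[1] * 3))
fresh = Image.frombytes("RGB", size, noise)

once = save_as_jpeg(fresh, quality=90)   # first save
twice = save_as_jpeg(once, quality=90)   # second save

first_save_change = mean_change(fresh, once)
second_save_change = mean_change(once, twice)
print(f"first save: {first_save_change:.2f}, second save: {second_save_change:.2f}")
```

The first save changes the picture far more than the second one does. ELA looks for exactly this kind of mismatch inside a single image: regions whose "error level" on re-saving does not match the rest.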

Practical Example

original

Original JPEG Photograph, saved as a JPEG, “compressed once”

 

blue-car-no-compression

Edited JPEG photograph, “compressed twice”

(The editing of the car is crude, but just go with it, imagine it was perfect)

original_diff_smaller

ELA on original image

 

blue-car-no-compression_diff_smaller

ELA on edited image

 

Exact Art

While ELA isn’t an exact science, it’s a useful tool to add into the mix for fraud verification. It still requires a trained eye as the resulting ELA images can produce a wide range of variations that might trigger a level of suspicion.

But combining it with other verification factors can be quite interesting – this is what the Newslinn platform does.

Tips

If the edit moves parts of the image around instead of overlaying a new image, then it is very hard to detect, as the compression levels are all the same.

car-direction

Edited image, car direction changed

2car-direction_diff

ELA of edited photo

The same can be said if the image is airbrushed and part of it removed.

airbrushed

Edited image, car removed

2airbrushed_diff

ELA of image with car removed

 

  1. ELA can complement existing verification techniques.
  2. ELA has problems with images that have high-contrast colours beside each other.
  3. Understanding the output of ELA takes time and experience.
  4. ELA is an interactive verification technique. It’s not enough to just look at a single static image; you need to be able to adjust settings and see the output in real time.
  5. You can get around ELA once you know how to.
  6. ELA will not work on JPEG files that were saved without being ‘compressed’.
  7. If something gets removed from an image and replaced with another part of the same image, this will be very hard to detect.

Other things similar to ELA

I’ll write up later on Edge Detection, Histogram Analysis and Air Brush Detection: three other ways to investigate an image to see what might have changed.

Some Python Code

Let’s finish this off with some code.

code
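As a starting point, here is a minimal sketch of ELA in Python using the Pillow imaging library. It is not the Newslinn code base. The function name `error_level_analysis`, the default `quality=90` and the automatic brightness scaling are assumptions chosen for illustration.

```python
import io

from PIL import Image, ImageChops, ImageEnhance


def error_level_analysis(image, quality=90, scale=None):
    """Return an ELA image: the amplified difference between `image`
    and a copy of it re-saved as JPEG at `quality`.
    """
    original = image.convert("RGB")

    # Re-save at a known quality, entirely in memory.
    buffer = io.BytesIO()
    original.save(buffer, format="JPEG", quality=quality)
    buffer.seek(0)
    resaved = Image.open(buffer).convert("RGB")

    # Per-pixel absolute difference: this is the "error level".
    diff = ImageChops.difference(original, resaved)

    if scale is None:
        # Stretch the largest difference towards full brightness so that
        # small errors become visible. An all-black diff is left as it is.
        max_diff = max(high for _, high in diff.getextrema())
        scale = 255.0 / max_diff if max_diff else 1.0
    return ImageEnhance.Brightness(diff).enhance(scale)
```

To try it, something like `error_level_analysis(Image.open("photo.jpg")).save("photo_ela.png")` would write the ELA picture next to the original (both file names are placeholders). Passing a different `quality` or an explicit `scale` is the kind of setting you end up adjusting while inspecting an image.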

If you want access to our actual code base / shared GitHub, just get in touch: contact at newslinn dot com.