Figure 1.12. In a), an area has been copied four times. The original image is shown in b). The quantization value is q = 6
COMMENT ON FIGURE 1.12.– In (c) and (d), the color indicates the origin of the JPEG block grid detected locally in the falsified and original images, respectively. The navy blue color corresponds to the main detected grid, of origin (0, 0). In (c), the areas whose block origin does not match the rest of the image are clearly visible. This detection is made automatic by the a contrario method, whose result can be seen in (e) and (f): no anomaly is detected in the original image (f), while in the falsified image (e) the four altered areas are detected. The original and falsified images come from the database of Christlein et al. (2012).
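To illustrate the underlying idea, the following sketch estimates the origin of the 8×8 grid from the blocking artifacts left by JPEG compression. It is a minimal Python illustration, not the a contrario method used in the figure, and all names are ours.

```python
# A minimal sketch (not the a contrario method): estimating the origin of the
# 8x8 JPEG grid from blocking artifacts. Assumes a 2D grayscale numpy array.
import numpy as np

def estimate_jpeg_grid_origin(gray):
    """Return (ox, oy): the most likely horizontal/vertical grid offsets in [0, 8)."""
    g = gray.astype(np.float64)
    # Mean discontinuity at each boundary between neighboring columns / rows.
    dx = np.abs(g[:, 1:] - g[:, :-1]).mean(axis=0)
    dy = np.abs(g[1:, :] - g[:-1, :]).mean(axis=1)
    # Accumulate boundary strength for each of the 8 candidate offsets:
    # the boundary after pixel column j belongs to a grid of origin (j + 1) % 8.
    score_x = np.zeros(8)
    score_y = np.zeros(8)
    for j, v in enumerate(dx):
        score_x[(j + 1) % 8] += v
    for i, v in enumerate(dy):
        score_y[(i + 1) % 8] += v
    # JPEG blocking makes discontinuities strongest on the true block boundaries.
    return int(np.argmax(score_x)), int(np.argmax(score_y))
```

Applied to local windows rather than to the whole image, the same measurement reveals regions whose grid origin disagrees with the main grid, as in Figure 1.12(c).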
Likewise, the quantization matrix can be estimated in order to check whether it is consistent across the blocks of the image, and whether it matches the global quantization matrix stored in the associated file header, which is needed to decompress the image (Thai et al. 2017).
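As a rough illustration of such an estimation, and not of the actual method of Thai et al. (2017), the following sketch estimates the quantization step of a single DCT frequency, exploiting the fact that the coefficients of a decompressed JPEG image cluster around multiples of the step. All names are ours.

```python
# A minimal sketch: estimating the quantization step of DCT frequency (u, v)
# from a decompressed image, given as a 2D grayscale numpy array.
import numpy as np
from scipy.fft import dctn

def estimate_quant_step(gray, u, v, q_max=32):
    h, w = gray.shape
    coeffs = []
    for i in range(0, h - 7, 8):
        for j in range(0, w - 7, 8):
            block = gray[i:i+8, j:j+8].astype(np.float64) - 128.0
            coeffs.append(dctn(block, norm='ortho')[u, v])
    c = np.array(coeffs)
    c = c[np.abs(c) > 0.5]        # coefficients quantized to zero are uninformative
    if c.size == 0:
        return None               # not enough information at this frequency
    # Comb score: coefficients quantized with step q make cos(2*pi*c/q) close to 1.
    scores = [np.mean(np.cos(2 * np.pi * c / q)) for q in range(1, q_max + 1)]
    return int(np.argmax(scores)) + 1
```

Running this estimator block by block, instead of globally, exposes regions whose estimated step disagrees with the rest of the image.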
1.6. Internal similarities and manipulations
Finally, we will study the so-called internal manipulations, which modify an image by directly using parts of itself, such as inpainting (Arias et al. 2011) and copy–paste.
Unlike other forgeries, these manipulations do not necessarily disturb the residual traces of an image, because the parts used for the modification come from the same image. Specific methods are therefore necessary to detect them.
The main difficulty in detecting internal manipulations is the natural self-similarity of images. A specialized database was created specifically to measure the false detection rate between altered images and authentic images that contain similar content in different regions (Wen et al. 2016).
A first family of methods relies on dense patch matching, as in Cozzolino et al. (2015a). Other methods detect and compare key points, such as those obtained with SIFT (Lowe 2004), which makes it possible to link similar content; however, this is often too permissive for copy–paste detection. This is why specialized methods, such as that of Ehret (2019), compare descriptors so as not to flag naturally similar objects, which often remain distinguishable, as shown in Figure 1.13. An example of copy–paste detection can be found in Figure 1.14.
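As an illustration of the keypoint-based approach, the following sketch matches an image against itself with SIFT, assuming OpenCV is available. It is a minimal illustration, not the method of Ehret (2019), and the thresholds are arbitrary.

```python
# A minimal keypoint-based sketch: matching an image against itself with SIFT
# to reveal duplicated regions. Assumes OpenCV (cv2) with SIFT support.
import cv2
import numpy as np

def find_internal_matches(image_path, min_dist=40, ratio=0.6):
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    sift = cv2.SIFT_create()
    kpts, desc = sift.detectAndCompute(gray, None)
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    # k=3: the best match of a keypoint in the same image is usually itself.
    matches = matcher.knnMatch(desc, desc, k=3)
    pairs = []
    for m in matches:
        if len(m) < 3:
            continue
        _, best, second = m       # m[0] is the trivial self-match
        p1 = np.array(kpts[best.queryIdx].pt)
        p2 = np.array(kpts[best.trainIdx].pt)
        # Lowe's ratio test against the next candidate, plus a minimum spatial
        # distance so that overlapping keypoints are not reported.
        if best.distance < ratio * second.distance and np.linalg.norm(p1 - p2) > min_dist:
            pairs.append((tuple(p1), tuple(p2)))
    return pairs
```

Pairs linking distant locations suggest duplicated content; in practice a geometric consistency step follows and, as noted above, keypoint matching alone remains too permissive when the scene naturally contains similar objects.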
Neural networks can also be used to detect copy-move manipulations, as in Wu et al. (2018), where a first branch of the network detects the source and altered regions, while a second branch determines which of the two is the forgery; most other methods cannot distinguish the source from the copy.
Figure 1.13. The image in a) represents two similar but distinct objects, while the image in b) represents two copies of the same object. Both images come from the COVERAGE database (Wen et al. 2016)
COMMENT ON FIGURE 1.13.– The patches in (c) and (d) correspond to the descriptors used by Ehret (2019), associated with the points of interest represented by the red dots, for the authentic image (a) and the falsified image (b), respectively. Differences are visible when the objects are only similar, whereas in the case of an internal copy–paste the descriptors are identical. It is through these differences that internal copy–paste detection methods can distinguish internal copies from objects that are naturally similar.
Figure 1.14. Example of detection of copy–paste type modification on the images in Figure 1.13. The original and altered images are in (a) and (d), respectively, the ground-truth masks in (b) and (e), and the connections (Ehret 2019) between the areas detected as too similar in (c) and (f)
1.7. Direct detection of image manipulation
To detect a particular manipulation, one must first be aware that this type of manipulation exists. As new manipulation techniques keep appearing, detection methods must be continually adapted, otherwise they quickly become outdated. To break out of this cycle, several methods seek to detect manipulations without prior knowledge of their nature.
Recently, generative adversarial networks (GANs) have shown their ability to synthesize convincing images. A GAN is made up of two neural networks competing against each other: the first network seeks to create new images that the second one fails to identify as synthetic, while the second network seeks to differentiate original images from those generated by the first.
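The following sketch makes this adversarial game concrete, assuming PyTorch; the architectures and sizes are purely illustrative, not those of any real model.

```python
# A minimal sketch of the adversarial game: a generator G and a
# discriminator D trained against each other. Sizes are illustrative.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 784), nn.Tanh())
D = nn.Sequential(nn.Linear(784, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

def train_step(real):                      # real: (batch, 784) images in [-1, 1]
    z = torch.randn(real.size(0), 64)
    fake = G(z)
    # Discriminator: label real images 1 and generated images 0.
    loss_d = bce(D(real), torch.ones(real.size(0), 1)) + \
             bce(D(fake.detach()), torch.zeros(real.size(0), 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()
    # Generator: fool the discriminator into labeling its images 1.
    loss_g = bce(D(fake), torch.ones(real.size(0), 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```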
Finally, the most common example concerns the automatic filters offered by image editing software such as Photoshop. Simple to use and able to produce realistic results, they are widely applied. Neural networks can learn to detect the use of these filters, or even to reverse them (Wang et al. 2019); the training data can be generated automatically, but the method must cope with the immense variety of filters available in such software.
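The following sketch illustrates the idea of automatically generated training data, with a random smooth warp standing in for an editing filter; it is an assumption-laden illustration, not the scripted-filter pipeline of Wang et al. (2019).

```python
# A minimal sketch: generating (original, altered) training pairs by applying
# a random smooth warp, as a stand-in for an automatic editing filter.
import cv2
import numpy as np

def random_warp(img, strength=8.0, grid=8):
    """Deform an image with a smooth random displacement field."""
    h, w = img.shape[:2]
    # Coarse random displacements, upsampled into a smooth per-pixel field.
    coarse = np.random.uniform(-strength, strength, (grid, grid, 2)).astype(np.float32)
    flow = cv2.resize(coarse, (w, h), interpolation=cv2.INTER_CUBIC)
    xs, ys = np.meshgrid(np.arange(w, dtype=np.float32),
                         np.arange(h, dtype=np.float32))
    warped = cv2.remap(img, xs + flow[..., 0], ys + flow[..., 1],
                       interpolation=cv2.INTER_LINEAR,
                       borderMode=cv2.BORDER_REFLECT)
    # The displacement field is a free ground truth for learning to reverse
    # the alteration, in the spirit of the "un-filtering" task.
    return warped, flow

# Training pairs: (img, label 0) and (random_warp(img)[0], label 1).
```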
Figure 1.15. Structure of the network of Mayer and Stamm (2019) used to compare the sources of two patches. The same first network A is applied to each patch to extract a residue. These residues are then passed to a second network B, which compares their sources and decides whether or not the patches come from the same image
Recently, Siamese networks have also been used for falsification detection (Mayer and Stamm 2019). They are made of two parts, as shown in Figure 1.15: a first convolutional network is applied independently to two image patches to extract hidden information from each, and a second network then compares the information extracted from the two patches to determine whether they come from the same image. A big advantage of these methods is the ease of obtaining training data: it is enough to have non-falsified images available and to train the network to predict whether or not two patches were taken from the same image. An example of detection with Siamese networks can be found in Figure 1.16.
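The following sketch illustrates this two-branch structure, assuming PyTorch; the layers are illustrative and do not reproduce the network of Mayer and Stamm (2019).

```python
# A minimal sketch of the two-branch structure of Figure 1.15.
import torch
import torch.nn as nn

class SiameseComparator(nn.Module):
    def __init__(self):
        super().__init__()
        # Network A: shared feature extractor, applied to each patch.
        self.extractor = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        # Network B: compares the two residues and outputs a same-source logit.
        self.comparator = nn.Sequential(
            nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, patch1, patch2):
        f1 = self.extractor(patch1)          # same weights for both patches
        f2 = self.extractor(patch2)
        return self.comparator(torch.cat([f1, f2], dim=1))

# Training labels come for free: pairs of patches cut from the same image are
# positives, patches taken from two different images are negatives.
```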
1.8. Conclusion
In this chapter, we have described methods that analyze an image's formation pipeline. This analysis takes advantage of the alterations applied by the camera from the initial raw image to its final form, usually a compressed JPEG. We have reviewed the transformations undergone by the raw image and shown that each operation leaves traces. These traces can be used to reverse engineer the camera pipeline, reconstructing the history of the image. They can also help detect and localize inconsistencies caused by forgeries, as regions whose pipeline appears locally different from that of the rest of the image.

With that in mind, it is usually impossible to guarantee that an image is authentic. Indeed, a perfect falsification, which would leave no traces, is not impossible, although it would require great expertise to directly forge a raw image, or to revert the image to a raw-like state, and to simulate a new processing chain after the forgery has been done. Falsifiers rarely have the patience or the skills needed to carry out this task; however, one cannot exclude that software to automatically make forged images appear authentic may emerge in the future.
Figure 1.16. Example of modification detection with the Siamese network of Mayer and Stamm (2019)