Multimedia Security, Volume 1. William Puech
(this can vary if the image is cropped). If a grid is present, among the 8 × 8 = 64 different original possibilities, only one is correct.
Here, two families of methods are presented: methods based on block artifacts and methods based on the impact of quantization on the DCT coefficients.
1.5.2.1. Compression artifacts
These methods rely on the detectable traces left by compression. Minami and Zakhor (1995) propose a way of detecting JPEG grids with the aim of removing blocking artifacts. Later, Fan and de Queiroz (2003) use the same ideas to decide whether an image has undergone JPEG compression, depending on whether such traces are present. These methods use filters to bring out the traces of compression (Chen and Hsu 2008; Li et al. 2009). The simplest method computes the gradient magnitude of the image (Lin et al. 2009), and others use the absolute value of second-order derivatives (Li et al. 2009). However, these two filters can respond strongly to edges and textures present in the image and can therefore lead to faulty grid detections. To reduce the interference of scene details, the cross-difference filter proposed by Chen and Hsu (2008) is more suitable. This filter, represented in Figure 1.10, amounts to calculating the absolute value of the convolution of the image with a 2 × 2 kernel. The differentiating filter makes the grid visible in the compressed image, and the stronger the compression, the more pronounced the grid.
Recently, approaches such as that of Nikoukhah et al. (2020) have made this kind of detection automatic and unsupervised thanks to statistical validation.
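As an illustration of the cross-difference idea described in this section, the following Python sketch applies a 2 × 2 kernel with entries (+1, −1; −1, +1) and takes the absolute value. The second function, which averages the response by row and column position modulo 8 to guess the grid origin, is an assumption made for this example and is not the procedure of any of the cited papers; the function names are hypothetical.

```python
import numpy as np

def cross_difference(img):
    """Absolute response of the 2x2 cross-difference kernel [[1, -1], [-1, 1]]."""
    a = np.asarray(img, dtype=float)
    return np.abs(a[:-1, :-1] - a[:-1, 1:] - a[1:, :-1] + a[1:, 1:])

def grid_origin_from_artifacts(img):
    """Guess the 8x8 grid origin from the periodicity of the filter response.

    Assumption for this sketch: block boundaries produce the strongest mean
    response, so the row/column phase with the highest energy is taken as the
    last row/column of a block.
    """
    d = cross_difference(img)
    row_energy = np.array([d[r::8, :].mean() for r in range(8)])
    col_energy = np.array([d[:, c::8].mean() for c in range(8)])
    # The response at index r straddles pixels r and r + 1, so a boundary
    # detected at phase r means that blocks start at phase (r + 1) mod 8.
    oy = (int(row_energy.argmax()) + 1) % 8
    ox = (int(col_energy.argmax()) + 1) % 8
    return oy, ox

# Toy usage: an image made of constant 8x8 blocks is an extreme case of blocking.
rng = np.random.default_rng(0)
blocky = np.kron(rng.integers(0, 256, size=(8, 8)), np.ones((8, 8)))
origin_full = grid_origin_from_artifacts(blocky)
origin_cropped = grid_origin_from_artifacts(blocky[3:, 5:])
```

On natural images the response also contains edges and textures, which is why such a raw estimate would still need the kind of validation discussed later in the chapter.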
1.5.2.2. DCT coefficients
These are methods based on the impact of compression on the DCT coefficients. After quantization, the compression makes the size of the image file smaller by setting many of the DCT values to zero. As illustrated in Figure 1.5, the quantization leads to setting a lot of the high-frequency coefficients to zero. The values of the quantization matrix are generally larger in high frequencies. The stronger the compression, the more values are set to zero.
Figure 1.10. Derivative filter and vote map applied to the same image without compression in a) and after JPEG compression of quality 80 in b)
COMMENT ON FIGURE 1.10.– The compressed image features a grid structure. The saturated zone on the right of the image hides any traces of compression. In the vote map, each pixel is associated with the grid for which it voted, in other words, the grid with the most zeros. For the compressed image, one color is dominant: it corresponds to the position (0, 0).
Based on the example of CAGI (Iakovidou et al. 2018), the ZERO method determines the origin of the grid by testing the 64 possibilities and selecting the one for which the DCT coefficients of the blocks have the most zeros (Nikoukhah et al. 2019). In other words, given an image, all of its pixels vote for the grid they think they belong to. In the event of a tie, the vote is not taken into account.
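The principle of counting zeros over the 64 candidate grids can be sketched in Python as follows. This is a simplified, global version written for illustration: it scores each candidate origin by the mean number of DCT coefficients per block that round to zero, instead of building a per-pixel vote map, and it is not the published ZERO implementation. The helper `toy_compress` uses a single quantization step for all coefficients, which is an assumption made to keep the example self-contained; all names are hypothetical.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II matrix D, so that D @ block @ D.T is the 2D DCT."""
    k = np.arange(n)[:, None]
    x = np.arange(n)[None, :]
    d = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    d[0, :] = np.sqrt(1.0 / n)
    return d

D = dct_matrix()

def to_blocks(img, oy=0, ox=0):
    """Cut img into 8x8 blocks on the grid of origin (oy, ox): shape (ny, nx, 8, 8)."""
    h, w = img.shape
    ny, nx = (h - oy) // 8, (w - ox) // 8
    crop = np.asarray(img[oy:oy + 8 * ny, ox:ox + 8 * nx], dtype=float)
    return crop.reshape(ny, 8, nx, 8).transpose(0, 2, 1, 3)

def zeros_per_block(img, oy, ox):
    """Mean number of DCT coefficients per block that round to zero."""
    coeffs = D @ to_blocks(img, oy, ox) @ D.T
    return np.mean(np.sum(np.abs(coeffs) < 0.5, axis=(2, 3)))

def estimate_grid_origin(img):
    """Return the (row, column) origin with the most zeros, or None on a tie."""
    scores = np.array([[zeros_per_block(img, oy, ox) for ox in range(8)]
                       for oy in range(8)])
    winners = np.argwhere(scores == scores.max())
    if len(winners) != 1:
        return None  # a tie is treated as a non-valid vote
    return int(winners[0][0]), int(winners[0][1])

def toy_compress(img, q=16):
    """Toy JPEG-like round trip: uniform quantization step q, then pixel rounding."""
    coeffs = D @ to_blocks(img) @ D.T
    rec = D.T @ (np.round(coeffs / q) * q) @ D
    ny, nx = rec.shape[:2]
    return np.round(rec.transpose(0, 2, 1, 3).reshape(8 * ny, 8 * nx))

# Toy usage: compress a random image, then crop it by 3 rows and 5 columns.
rng = np.random.default_rng(0)
compressed = toy_compress(rng.integers(64, 192, size=(96, 96)))
origin_full = estimate_grid_origin(compressed)
origin_cropped = estimate_grid_origin(compressed[3:, 5:])
```

After cropping by 3 rows and 5 columns, the block boundaries fall at rows congruent to 5 and columns congruent to 3 modulo 8, which is the origin the sketch is expected to recover.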
Figures 1.10(e) and 1.10(f) present the vote map: each pixel’s color represents which of the 64 possible grids the pixel votes for. Navy blue corresponds to the original grid (0, 0), and red to a non-valid vote, in the event of a tie. At the top right of the image, the saturated zone is not used to detect JPEG traces since it does not contain any information.
The derivative filter presented in Figures 1.10(c) and 1.10(d) makes it possible to highlight the compression artifacts, and the vote map presented in Figures 1.10(e) and 1.10(f) is a colormap where each color is associated with a grid origin. In both cases, there is a clear difference between the image that has not undergone compression and the one that has. However, these filters, which are part of the tools used by journalists and police experts today, lack a validation step: as presented, their outputs must be interpreted by the user, who needs to understand why a filter detects one area rather than another. The goal would be to obtain a binary result.
In the case of an uncompressed image, no “vote” stands out significantly compared to the others. In the case of the compressed image in Figure 1.10(f), navy blue is dominant: it corresponds to position (0, 0).
Whether it is the cross-difference or the pixel vote map, some areas remain difficult to interpret, therefore justifying the need for a statistical validation. For example, the saturated parts have no visible JPEG grid and therefore cannot be used to reach a decision.
1.5.3. Detecting the quantization matrix
The histogram of each of the 64 DCT coefficients makes it possible to determine the quantization step, that is, the corresponding value in the quantization matrix. Quantization has a very clear effect on the DCT coefficient histograms of an image, as visible in Figure 1.11 before and after compression. DCT coefficients generally follow a Laplacian distribution (Clarke 1985; Wallace 1992), except for the first coefficient, which represents the average of the block.
The JPEG quantization step transforms each DCT coefficient into an integer multiple of the quantization value (Fridrich et al. 2001). During decompression, these values lead to real values for each pixel, which are then rounded off to integers. Due to this second rounding, the DCT coefficients of the decompressed image are no longer exact multiples, but show a narrow distribution around the multiples of the quantization value, as shown in Figure 1.11. The quantization value in Figure 1.11 is q = 6, and so the coefficients of the decompressed image are centered around the values 0, 6, −6, 12, −12, and so on. Once a quantization model has been obtained for the DCT coefficients, forgery detection methods such as that of Ye et al. (2007) establish a stochastic model and then look for inconsistencies in the histograms.
For example, Bianchi et al.’s method first estimates the quantization matrix used by the first JPEG compression, and then tries to model the frequencies of the histogram of each DCT coefficient (Bianchi et al. 2011).
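To make the link between the histogram structure and the quantization step concrete, here is a deliberately naive Python sketch. It keeps the largest candidate step q for which most coefficients lie within a small tolerance of a multiple of q. The function name, the tolerance of 0.5 and the 85% threshold are assumptions chosen for this example; this is not the estimator of Ye et al. (2007) or Bianchi et al. (2011).

```python
import numpy as np

def estimate_quantization_step(coeffs, max_step=64, tol=0.5, min_fraction=0.85):
    """Largest q such that most coefficients are within tol of a multiple of q.

    Returns 1 when no periodic structure is found (q = 1 always passes).
    Limitation of this sketch: if almost all coefficients are zero, every
    candidate passes and max_step is returned, so the step is undetermined.
    """
    c = np.asarray(coeffs, dtype=float)
    best = 1
    for q in range(1, max_step + 1):
        residual = np.abs(c - q * np.round(c / q))  # distance to nearest multiple
        if np.mean(residual <= tol) >= min_fraction:
            best = q
    return best

# Toy usage: Laplacian coefficients quantized with q = 6, then perturbed by a
# small noise standing in for the effect of pixel rounding (an assumption).
rng = np.random.default_rng(1)
raw = rng.laplace(scale=10.0, size=5000)
after = 6 * np.round(raw / 6) + rng.normal(0.0, 0.29, size=5000)
step_after = estimate_quantization_step(after)
step_raw = estimate_quantization_step(raw)
```

Taking the largest passing candidate matters because every divisor of the true step (here 2 and 3) fits the histogram equally well.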
1.5.4. Beyond indicators, making decisions with a statistical model
Block artifacts, the number of zeros and the frequency interval of the histograms can be seen as compression detectors. However, a statistical validation is needed to determine whether the observations are indeed caused by compression or are simply due to chance. This validation can be carried out by the a contrario approach (Desolneux et al. 2008).
Applied to the whole image, these methods make it possible to know whether an image has undergone JPEG compression and, if so, to know the position of the grid. The position of the grid origin indicates whether the image has been cropped after compression, as long as the crop is not aligned with the initial grid, which happens by chance in one out of 64 cases.
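A minimal sketch of how an a contrario decision could look for grid votes is given below in Python. It assumes a background model in which each of n independent votes falls on any of the 64 grids with probability 1/64, and computes a number of false alarms (NFA) as the number of tested grids times the binomial tail of the winning vote count; a detection is declared when the NFA is below a threshold ε, here 1. The independence assumption and the function names are simplifications made for this example and do not reproduce the exact procedure of the cited methods.

```python
import math

def log_binomial_tail(n, k, p):
    """Natural log of P(X >= k) for X ~ Binomial(n, p), computed in log space."""
    if k <= 0:
        return 0.0
    if k > n:
        return -math.inf
    terms = [
        math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
        + i * math.log(p) + (n - i) * math.log1p(-p)
        for i in range(k, n + 1)
    ]
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))

def log_nfa_grid(n_votes, k_best, n_grids=64):
    """log(NFA) = log(number of tested grids) + log of the binomial tail."""
    return math.log(n_grids) + log_binomial_tail(n_votes, k_best, 1.0 / n_grids)

def is_meaningful(n_votes, k_best, epsilon=1.0):
    """Binary decision: the winning grid is detected if NFA < epsilon."""
    return log_nfa_grid(n_votes, k_best) < math.log(epsilon)

# Toy usage with 1000 valid votes: about 16 votes per grid are expected by chance.
weak = is_meaningful(1000, 20)     # slightly above chance level
strong = is_meaningful(1000, 100)  # far above chance level
```

Working with logarithms avoids numerical underflow, since the tail probabilities involved in a strong detection are far too small to be represented directly.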
To verify an image, it is important to make the previous analysis methods local by checking the consistency of each part of the image with the global model. Several methods detect forgeries in areas having a different JPEG history than the rest of the image (Iakovidou et al. 2018; Nikoukhah et al. 2019).
Figure 1.12 illustrates a method that highlights an area where the JPEG grid origin is different from the rest of the image. In fact, the vote map in Figure 1.12(c) shows that it is already possible to visually distinguish the objects of the image having voted for a different grid than the rest of the image. A statistical validation automates the decision by giving a binary mask of the detection, as illustrated in Figures 1.12(e) and 1.12(f).
Figure 1.11. Histogram of a DCT coefficient for an image before and after compression. There is a clear structure after quantization