Multimedia Security, Volume 1. William Puech
in the image and estimate noise in high frequencies, where noise dominates over signal.
Figure 1.5. An example of the impact of quantization on a DCT block. Each DCT coefficient is quantized by a value found in a quantization matrix. Rounding to the nearest integer results in many of the high-frequency coefficients being set to zero. Each block is zig-zagged to be encoded as a vector with a sequence of zeros
We limit ourselves to discussing the method recognized as the best estimator for homoscedastic noise in the review by Lebrun et al. (2013): Ponomarenko et al.’s method (Ponomarenko et al. 2007). This method computes the variance of overlapping 8 × 8 pixel blocks. To avoid the effects of textures and edges, blocks are sorted according to their low-frequency energy, and only a small percentile (typically 0.5%) of them, those whose low- and medium-frequency energy is lowest, is retained. The final noise estimate is the median of the variances in the high frequencies of these blocks.
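The selection and median steps described above can be sketched as follows. This is a simplified illustration in the spirit of the method, not the published algorithm: the DCT-based frequency split (`freq_split`), the helper names and the synthetic test scene are assumptions made for this example.

```python
import numpy as np


def dct_matrix(n=8):
    """Return the orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n)[:, None]   # frequency index (rows)
    x = np.arange(n)[None, :]   # sample index (columns)
    d = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    d[0, :] = np.sqrt(1.0 / n)
    return d


def estimate_noise_std(image, block=8, percentile=0.5, freq_split=8):
    """Estimate the std of homoscedastic white noise from a single image.

    freq_split is an assumed threshold: coefficients with i + j below it
    count as low/medium frequencies, the others as high frequencies.
    """
    d = dct_matrix(block)
    # All overlapping block x block windows, flattened to (N, block, block).
    windows = np.lib.stride_tricks.sliding_window_view(
        np.asarray(image, dtype=np.float64), (block, block))
    blocks = windows.reshape(-1, block, block)
    coeffs = d @ blocks @ d.T                    # 2D DCT of every block

    i, j = np.indices((block, block))
    low = (i + j > 0) & (i + j < freq_split)     # DC coefficient excluded
    high = i + j >= freq_split

    # Keep the small percentile of blocks with the lowest low-frequency energy.
    low_energy = np.sum(coeffs[:, low] ** 2, axis=1)
    n_keep = max(1, int(round(len(blocks) * percentile / 100.0)))
    keep = np.argsort(low_energy)[:n_keep]

    # Variance of each high-frequency coefficient over the kept blocks,
    # then the median over coefficients.
    variances = np.mean(coeffs[keep][:, high] ** 2, axis=0)
    return float(np.sqrt(np.median(variances)))


# Synthetic scene (assumption): two flat regions separated by a step edge.
rng = np.random.default_rng(0)
scene = np.full((128, 128), 100.0)
scene[:, 64:] = 180.0
noisy = scene + rng.normal(0.0, 5.0, scene.shape)
sigma_hat = estimate_noise_std(noisy)
```

On this synthetic scene, blocks that straddle the edge have a large low-frequency energy and are discarded by the selection, so `sigma_hat` should come out close to the simulated standard deviation of 5.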
Homoscedastic white noise estimation algorithms can be adapted to estimate an arbitrary signal-dependent noise curve, as pointed out by Colom et al. (2014). However, after undergoing the complete camera processing chain detailed in section 1.3.2, noise depends not only on signal but also on frequency. A multi-scale estimation is needed in order to estimate highly correlated frequency-dependent noise (Lebrun et al. 2015). Following this observation, Colom and Buades extended Ponomarenko et al.’s method (Ponomarenko et al. 2007) to incorporate such a multi-scale approach (Colom and Buades 2013).
1.3.2. Transformation of noise in the processing chain
This section examines how noise is affected at each step of the camera processing chain (see section 1.2). Noise curves obtained with the extended Ponomarenko et al.’s method (Colom and Buades 2013) along the processing chain (raw image, demosaicing, white balance, gamma correction and JPEG encoding) are presented in Colom (2014) and compared to the temporal estimates.
Temporal estimates of noise curves are non-parametric and accurate enough to be treated as ground truth. Ground-truth noise curves are essential for evaluating the performance of estimation methods. These temporal estimates are built from burst photographs of the same scene, a calibration pattern with large flat zones (Figure 1.6), taken under constant lighting with a steady camera. Under these conditions, the variation of a pixel value over time can only be explained by noise. Thus, computing the standard deviation of each temporal series yields the ground-truth noise curve. These noise curves depend on the camera used as well as on the particular processing chain, including the ISO level and the exposure time.
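A minimal sketch of such a temporal estimate is given below, assuming a burst stored as an array of shape (frames, height, width). The binning by mean intensity, the function name `temporal_noise_curve` and the simulated burst are choices made for this illustration only.

```python
import numpy as np


def temporal_noise_curve(frames, n_bins=4):
    """Return (intensity, noise std) pairs estimated from a static burst."""
    frames = np.asarray(frames, dtype=np.float64)
    mean = frames.mean(axis=0)            # per-pixel intensity estimate
    var = frames.var(axis=0, ddof=1)      # per-pixel temporal variance

    # Group pixels of similar mean intensity into uniform bins.
    edges = np.linspace(mean.min(), mean.max(), n_bins + 1)
    idx = np.clip(np.digitize(mean, edges) - 1, 0, n_bins - 1)

    intensities, stds = [], []
    for b in range(n_bins):
        mask = idx == b
        if mask.any():                    # skip empty bins
            intensities.append(mean[mask].mean())
            stds.append(np.sqrt(var[mask].mean()))
    return np.array(intensities), np.array(stds)


# Simulated burst (assumption): four flat patches, constant noise std of 3.
rng = np.random.default_rng(1)
patches = np.repeat(np.array([50.0, 100.0, 150.0, 200.0]), 16)
scene = patches[None, :] * np.ones((32, 1))
burst = scene + rng.normal(0.0, 3.0, size=(50,) + scene.shape)
u, sigma = temporal_noise_curve(burst)
```

Because the scene is static, the per-pixel variance over the burst measures noise only; averaging it within each intensity bin gives one point of the noise curve per bin.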
Figure 1.6. Calibration model used for the construction of the temporal series
1.3.2.1. Raw image acquisition
The value at each pixel generated by the process described in section 1.2.1 can be modeled as a Poisson variable, whose expectation is the real value of the pixel. The noise measured at the CCD or CMOS sensor has several components; Table 1.1 describes the main sources.
Figure 1.2 shows the noise curve obtained by temporal series (ground truth) and the estimate obtained from a single image using Ponomarenko et al.’s method (Colom and Buades 2013) with a simplified pipeline. The estimate is accurate, since all curves match. At this step, all channels have the same noise curve. As noise follows a Poisson distribution, the noise variance follows a simple linear relation σ² = a + bu, where u is the intensity of the ideal noiseless image, and a and b are constants. Consequently, the noise curves are strictly increasing. Moreover, although the noise curves do not account for it, the noise characteristics reported above suggest that the noise is uncorrelated, that is, the noise at a certain pixel is not related to noise at any other pixel with the same signal intensity.
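The linear relation σ² = a + bu can be checked on simulated data. The sketch below assumes a simple sensor model that is not taken from the text: a Poisson electron count scaled by a hypothetical `gain`, plus signal-independent Gaussian noise of standard deviation `read_std`. Under this model, b equals the gain and a equals the variance of the signal-independent noise.

```python
import numpy as np

rng = np.random.default_rng(2)
gain = 0.5        # hypothetical digital units per electron
read_std = 2.0    # hypothetical signal-independent noise (std)

levels = np.linspace(20.0, 200.0, 10)   # ideal noiseless intensities u
variances = []
for u in levels:
    # Poisson-distributed electron count whose expectation is u / gain.
    electrons = rng.poisson(u / gain, size=200_000)
    values = gain * electrons + rng.normal(0.0, read_std, size=electrons.shape)
    variances.append(values.var(ddof=1))

# Least-squares fit of variance = a + b * u (polyfit returns slope first).
b, a = np.polyfit(levels, variances, 1)
```

The fitted slope `b` should be close to `gain` and the intercept `a` close to `read_std ** 2`, and the fitted variance increases with u, as stated above.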
Table 1.1. Description of the main sources of noise during the acquisition process
Type of noise | Description |
---|---|
Shot noise | Due to the physical nature of light. It describes the fluctuations in the number of photons detected, which arise because photons are emitted independently of one another. |
Dark noise | Some electrons accumulate in the potential well due to thermal effects. These electrons are known as dark current because they are present and will be detected even in the absence of light. |
Photo response non-uniformity (PRNU) | It describes the way in which the individual pixels in the sensor array respond to uniform light sources. Due to variations in pixel geometry, substrate material, and micro-lenses, different pixels do not produce the same number of electrons from the same number of photons hitting them. |
Readout noise | During the readout phase of the acquisition process, a voltage value is read at each pixel. This voltage is computed as a potential difference from a reference level which represents the absence of light. Thermal noise, inherent in the readout circuit, affects the output values. |
Electronic noise | It is caused by the absorption of electromagnetic energy by the semiconductors of the camera circuits and the cross-talk phenomenon. |
1.3.2.2. Demosaicing
Demosaicing is presented in more detail in section 1.2.2. After this step, the noise at each pixel is correlated with its neighbors. After demosaicing, each channel has a different noise curve since channels are processed differently by the demosaicing algorithm.
In addition, the noise curves calculated using Ponomarenko et al.’s method and those obtained from the temporal series no longer match. This is because Ponomarenko et al.’s algorithm assumes white noise and estimates noise at high frequencies, which are affected by demosaicing. As the image processing chain is sequential, this mismatch between the temporal noise curves and those measured on a single image persists in all steps that follow demosaicing.
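The correlation introduced by demosaicing can be illustrated with a toy interpolation. The sketch below is not one of the demosaicing algorithms of section 1.2.2: it assumes a single channel sampled on a checkerboard (as for the green channel of a Bayer pattern) and fills each missing pixel with the mean of its four sampled neighbors.

```python
import numpy as np


def neighbor_correlation(img):
    """Correlation coefficient between horizontally adjacent pixels."""
    return float(np.corrcoef(img[:, :-1].ravel(), img[:, 1:].ravel())[0, 1])


rng = np.random.default_rng(3)
noise = rng.normal(0.0, 1.0, size=(256, 256))   # white noise on the sensor

# Checkerboard of sampled positions (assumption: green-like channel).
rows, cols = np.indices(noise.shape)
sampled = (rows + cols) % 2 == 0

# Fill each missing pixel with the mean of its four sampled neighbors.
# np.roll wraps around the borders, which is enough for this illustration.
neighbors_mean = (np.roll(noise, 1, 0) + np.roll(noise, -1, 0)
                  + np.roll(noise, 1, 1) + np.roll(noise, -1, 1)) / 4.0
demosaiced = np.where(sampled, noise, neighbors_mean)

corr_before = neighbor_correlation(noise)
corr_after = neighbor_correlation(demosaiced)
```

Before interpolation, `corr_before` stays near zero; afterwards `corr_after` is clearly positive, because each interpolated pixel shares noise with the neighbors it was computed from. The same averaging attenuates the high frequencies on which a white-noise estimator relies.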
1.3.2.3. Color correction
White balance increases the intensity range of the image. Since the weights are different for each color channel, as mentioned in section 1.2.3, the three noise curves are less correlated after this step. Then, gamma correction greatly increases the noise and the dynamic range of the image, due to the power law function. Furthermore, the noise curves are no longer monotonically increasing after this step. Indeed, if we denote by γ the function applied during the gamma correction step, a first-order Taylor expansion around the intensity u yields γ(u + n) ≈ γ(u) + γ′(u)n, where n is the noise at the intensity u. The noise standard deviation at intensity u is therefore multiplied by approximately γ′(u), a factor that itself varies with u.
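The first-order approximation can be compared with a simulation. The sketch below assumes a pure power law γ(u) = u^(1/2.2) on intensities normalized to [0, 1], Gaussian noise, and hypothetical coefficients a and b for the input curve σ² = a + bu; none of these values come from the text.

```python
import numpy as np

GAMMA = 2.2              # illustrative exponent, intensities in [0, 1]
a, b = 1e-5, 1e-4        # hypothetical coefficients of sigma^2 = a + b*u


def gamma_curve(u):
    """Power-law gamma correction."""
    return u ** (1.0 / GAMMA)


def gamma_derivative(u):
    """Derivative of the gamma correction with respect to u."""
    return (1.0 / GAMMA) * u ** (1.0 / GAMMA - 1.0)


levels = np.linspace(0.1, 0.9, 9)
sigma_in = np.sqrt(a + b * levels)                 # increasing input curve
predicted = gamma_derivative(levels) * sigma_in    # first-order prediction

# Monte Carlo check: add noise, apply gamma, measure the output std.
rng = np.random.default_rng(4)
measured = np.array([
    gamma_curve(u + rng.normal(0.0, s, size=200_000)).std(ddof=1)
    for u, s in zip(levels, sigma_in)
])
```

In this example, `sigma_in` increases with the intensity while `predicted` and `measured` decrease: the factor γ′(u) shrinks faster than the input noise grows, which is one way the monotonicity of the curve is lost.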
1.3.2.4. JPEG compression
The dynamic range remains unchanged after JPEG compression. However, noise is reduced