How to build a general demosaicing algorithm for RGBX-format image sensors

The Bayer-format image sensor samples only one of the three primary colors (red, green, blue) at each pixel position. In some applications, one G (green) filter of the Bayer color filter array is replaced by another filter, such as a W (white) or IR (infrared) filter. Most published demosaicing algorithms target the classic Bayer format. This paper proposes a general demosaicing algorithm that applies to all image sensors with a 2×2 minimum sampling period and also achieves a high color peak signal-to-noise ratio (CPSNR).


1 Introduction

The Bayer-format image sensor samples only one of the three primary colors (red, green, blue) at each pixel position [1-5]. To improve the resolution of the demosaiced output, the Bayer pattern samples green at the two diagonal positions of its 2×2 minimum sampling period. Certain applications require the imaging device to provide both an RGB image and an infrared image of the scene [6]. To reduce cost, this requires the image sensor itself to capture infrared light while sampling the three primary colors. The simplest modification of the existing Bayer sensor replaces one of the two green filters on the diagonal of the 2×2 period with an infrared filter. The industry calls this image sensor the RGBIR format.

S. Süsstrunk and C. Fredembach proposed many other applications of RGBIR sensors in [7], including dehazing and skin smoothing. Other applications require the sensor to deliver better low-light performance while still capturing color. Here one green filter of the Bayer pattern is simply left clear, so that the channel collects light of all wavelengths; the industry calls this the RGBW format [8]. For convenience, we refer to these modified sensors collectively as RGBX sensors.

To serve the applications above, other arrangements of color filters have been discussed in the literature [9,10]. Considering manufacturing cost, however, this paper focuses only on non-Bayer CFA patterns with a 2×2 minimum sampling period. Figure 1 shows such a modified image sensor.

Figure 2 shows a typical quantum-efficiency curve of this modified image sensor: the abscissa is the wavelength of the incident light, and the ordinate is the quantum efficiency at each wavelength.

Because the response curves of the X channel and the green channel differ greatly, the samples at the two diagonal positions (green and X) are generally different. The traditional demosaicing method for the Bayer format exploits the correlation of the two green samples on the diagonal to interpolate a full-resolution green channel first, and then interpolates the full-resolution red and blue channels under the assumption of locally constant color difference. Once one green channel is replaced by an X channel, all four channels have the same sampling rate, so the approach of first interpolating a densely sampled green channel is clearly no longer suitable for the modified sensors.

Ordinary interpolation methods such as bilinear or bicubic can be applied independently to the four channels of an RGBX sensor to complete the demosaicing operation and output four full-resolution images. However, such channel-independent interpolation ignores inter-channel correlation entirely, and the demosaiced output suffers from low resolution and severe color aliasing. To the best of our knowledge, no high-performance RGBX demosaicing algorithm that is fully compatible with the Bayer format has yet been published.

2 Algorithm proposed in this paper

Since all channels of an RGBX image sensor have the same sampling rate (the Bayer format can be regarded as a special case of the RGBX format), the algorithm proposed in this paper need not follow the traditional framework of interpolating the G channel first. The algorithm proceeds as follows.

We regard the raw CFA image as the sum of the four full-frame RGBX images, each multiplied by its sampling function, expressed as equation (1).

f_CFA(n1, n2) = Σ_{C ∈ {R,G,B,X}} f_C(n1, n2) · m_C(n1, n2)        (1)

where m_C(n1, n2) ∈ {0, 1} is the sampling function of channel C, equal to 1 at the positions where channel C is sampled and 0 elsewhere.

Here f denotes a full-frame image: f_CFA is the full-frame RAW image, f_R the full-frame R image, and so on.

n1 = 1, 2, ..., H and n2 = 1, 2, ..., W are the vertical and horizontal pixel positions; H and W are the full-frame image height and width.
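The sampling model of equation (1) can be sketched as follows. This is an illustrative implementation only; it assumes the 2×2 layout G R / B X (G and X on the diagonal, as for an RGBIR sensor derived from Bayer), and the function names are ours, not the paper's.

```python
import numpy as np

def sampling_masks(H, W):
    """Build the four 0/1 sampling functions m_C for a 2x2 CFA with the
    assumed layout G R / B X repeated over the frame."""
    n1, n2 = np.mgrid[0:H, 0:W]                # vertical, horizontal indices
    even1, even2 = (n1 % 2 == 0), (n2 % 2 == 0)
    return {
        "G": (even1 & even2).astype(float),    # top-left of each 2x2 period
        "R": (even1 & ~even2).astype(float),   # top-right
        "B": (~even1 & even2).astype(float),   # bottom-left
        "X": (~even1 & ~even2).astype(float),  # bottom-right
    }

def mosaic(full):
    """Equation (1): f_CFA = sum over channels of f_C * m_C."""
    H, W = next(iter(full.values())).shape
    m = sampling_masks(H, W)
    return sum(full[c] * m[c] for c in "GRBX")
```

Since exactly one mask is 1 at every position, the masks partition the frame and the sum in `mosaic` simply picks the sampled channel at each pixel.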

Define the conversion between the luminance component L, the chrominance components C1, C2, C3, and the RGBX channels, written in matrix form as equation (2).

    [ L  ]         [ 1   1   1   1 ] [ G ]
    [ C1 ]  = 1/4  [ 1  -1   1  -1 ] [ R ]        (2)
    [ C2 ]         [ 1   1  -1  -1 ] [ B ]
    [ C3 ]         [ 1  -1  -1   1 ] [ X ]

(taking the 2×2 pattern of Figure 1 with G at the top-left, R at the top-right, B at the bottom-left and X at the bottom-right of the sampling period).

Therefore, the full-frame RAW (CFA) image can equivalently be regarded as the sum of the sampled full-frame luminance and chrominance components, as shown in equation (3).

f_CFA(n1, n2) = L(n1, n2) + (-1)^{n2} C1(n1, n2) + (-1)^{n1} C2(n1, n2) + (-1)^{n1+n2} C3(n1, n2)        (3)

Here -1 = e^{jπ}; the chrominance component C1 can therefore be regarded as a high-frequency signal modulated to the centre frequency (0.5, 0), C2 as modulated to (0, 0.5), C3 as a high-frequency signal modulated to (0.5, 0.5), and the luminance component L as a baseband signal.
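A small numeric check of equations (2) and (3), again assuming the hypothetical layout G R / B X within the 2×2 period: the transform followed by remodulation must reproduce the CFA sample at every position.

```python
def luma_chroma(g, r, b, x):
    # Equation (2): orthogonal (Hadamard-type) transform of the four channels.
    L  = (g + r + b + x) / 4.0
    C1 = (g - r + b - x) / 4.0   # modulated horizontally, centre (0.5, 0)
    C2 = (g + r - b - x) / 4.0   # modulated vertically,   centre (0, 0.5)
    C3 = (g - r - b + x) / 4.0   # modulated diagonally,   centre (0.5, 0.5)
    return L, C1, C2, C3

def cfa_from_components(L, C1, C2, C3, n1, n2):
    # Equation (3): baseband luma plus the chroma components remodulated
    # by (-1)^n2, (-1)^n1 and (-1)^(n1+n2).
    return L + (-1) ** n2 * C1 + (-1) ** n1 * C2 + (-1) ** (n1 + n2) * C3
```

At (n1, n2) = (0, 0) the signs are all +1 and the expression collapses to L + C1 + C2 + C3 = G, and similarly the other three positions of the 2×2 period recover R, B and X.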

Figure 3 is a schematic diagram of the energy distribution of the luminance and chrominance components in the frequency domain, with zero frequency shifted to the centre. The abscissa is horizontal frequency and the ordinate vertical frequency. As Figure 3 shows, the luminance signal L occupies the low-frequency region of the two-dimensional plane, while the chrominance signals C1 to C3 occupy the high-frequency regions: C3 at the four corners (diagonal chrominance high frequencies), C1 on the left and right sides (horizontal chrominance high frequencies), and C2 on the top and bottom sides (vertical chrominance high frequencies). Demosaicing can therefore recover the luminance signal with a low-pass filter and the chrominance signals with high-pass filters, and then convert back to RGBX by the matrix transform.

Before starting the demosaicing operation, the fourth channel needs pre-correction. The fourth channel is the non-RGB channel among the four: for an RGBIR sensor it is the infrared channel; for an RGBW sensor, W denotes a colorless filter that responds to all visible and infrared wavelengths, and the fourth channel is the W channel.

Take the RGBIR sensor as an example again. Before performing the luminance and chrominance convolutions described above, it is best to pre-correct the infrared channel. The quantum-efficiency curve of the infrared channel differs greatly from those of the RGB channels, so in some scenes the infrared and visible-light signals can differ strongly even between adjacent areas. Moreover, the filter design is limited in order, so its response exhibits ripple.

After convolution, differences between adjacent input samples in the RAW image may be amplified, and a "grid" pattern may appear in flat areas of the demosaiced output. The purpose of infrared pre-correction is to reduce the difference between the infrared and visible (RGB) channels in the neighborhood of the current pixel, especially in flat areas.

Pre-correction of the infrared (fourth) channel multiplies the infrared sample by a correction factor, which combines a global factor and a local factor. The global factor is the ratio of the mean of the RGB channels to the mean of the infrared channel over the whole RAW image. The local factor is the same ratio computed over the neighborhood of the current infrared pixel. The two factors are fused with a weighting coefficient obtained by looking up the gradient of the neighboring RGB samples in a table: the larger the gradient, the smaller the weight of the local factor, and vice versa.
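The pre-correction step above can be sketched as follows. The lookup-table thresholds and weights are purely illustrative (the paper does not give concrete values), and the function name is ours.

```python
def precorrect_ir(ir_val, rgb_mean_local, ir_mean_local,
                  rgb_mean_global, ir_mean_global, grad_rgb,
                  weight_lut=((0.0, 1.0), (10.0, 0.7), (40.0, 0.3), (1e9, 0.0))):
    """Pre-correct one infrared sample (illustrative sketch).

    global factor = global RGB mean / global IR mean
    local  factor = local  RGB mean / local  IR mean
    The two are blended with a weight looked up from the local RGB gradient:
    the larger the gradient, the smaller the weight of the local factor.
    """
    eps = 1e-6
    k_global = rgb_mean_global / max(ir_mean_global, eps)
    k_local  = rgb_mean_local  / max(ir_mean_local,  eps)
    # piecewise-constant LUT: the first entry whose threshold covers the
    # gradient supplies the local-factor weight
    w_local = next(w for thr, w in weight_lut if grad_rgb <= thr)
    k = w_local * k_local + (1.0 - w_local) * k_global
    return ir_val * k
```

In a perfectly flat area (zero gradient) the local factor dominates, pulling the IR sample toward the local RGB level; near strong edges the more stable global factor takes over.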

After pre-correction, interpolation can start. The luminance component is recovered by convolving the full-frame RAW data with a luminance filter. The filter parameters can be preset or adjusted dynamically according to the local characteristics of the current pixel in the RAW image.

Dynamic adjustment works as follows. First compute the gradients (or equivalent information) in all directions within an n×n window centred on the current pixel (for better results, n should be at least 3). Then use the gradient values as keys into a predefined lookup table to obtain the filter parameters. Alternatively, compute the differences or ratios of the gradients between directions and compare them with predefined thresholds to decide whether the current pixel lies in a flat area, a boundary area, or a detail area.

If the current pixel is in a flat area, select luminance filter parameters with a good low-frequency response. If it is in a boundary area, determine the boundary direction and select parameters that respond well along that direction. If it is in a detail area, determine the local frequency band and select parameters that respond well in that band.

A flat area is detected by computing the differences or ratios of the gradients between directions and comparing them with predefined thresholds: the pixel is flat when these results, and the absolute gradients in all directions, are all below their thresholds.

A boundary area is detected likewise: when a gradient difference or ratio exceeds its threshold, and the absolute gradient in the dominant direction also exceeds its threshold, the pixel is on a boundary. Generally, if the current pixel is in neither a flat nor a boundary area, it is in a detail area, and its frequency band must then be estimated. The band is approximated by convolving the neighborhood of the current pixel with several predefined band-pass filters of different centre frequencies; the pass band of the filter giving the maximum response is taken as the frequency band of the current pixel.
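The flat/boundary/detail decision can be sketched as below. The thresholds are illustrative stand-ins for the paper's predefined values, and the ratio test replaces the difference test for compactness.

```python
def classify_region(grads, flat_thr=4.0, edge_ratio_thr=2.5):
    """Classify the current pixel from its directional absolute gradients
    (illustrative thresholds).  `grads` maps direction name -> |gradient|.

    flat     : all gradients small
    boundary : one direction's gradient dominates the others
    detail   : everything else
    Returns (label, boundary_direction_or_None); a boundary runs along the
    direction of the *smallest* gradient.
    """
    ranked = sorted(grads.items(), key=lambda kv: kv[1])
    (d_min, g_min), (d_max, g_max) = ranked[0], ranked[-1]
    if g_max < flat_thr:
        return "flat", None
    if g_max > edge_ratio_thr * max(g_min, 1e-6) and g_max > flat_thr:
        return "boundary", d_min
    return "detail", None
```

The returned direction then selects a luminance filter that responds well along the boundary; for a detail pixel the band-pass probe described above would follow.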

The chrominance components are recovered by convolving the full-frame RAW data with three high-pass filters. Their parameters can be preset; obtained from a predefined lookup table using the directional gradients of the n×n neighborhood (or equivalent information) as keys; or adjusted dynamically from the local characteristics of the current pixel together with the luminance filter parameters chosen above.

The idea behind adjusting the chrominance filters according to the local characteristics and the chosen luminance filter is to reduce aliasing (false color) between luminance and chrominance in the high-frequency region. Specifically, when the pass band of the luminance filter at the current pixel is close to the pass band of the default chrominance filter, the amplitude of that chrominance filter is suppressed; otherwise it is kept.

Specifically, if the luminance filter passes high horizontal frequencies, the amplitude of the C1 filter at the current pixel is suppressed; if it passes high vertical frequencies, the amplitude of the C2 filter is suppressed; and if it passes high diagonal frequencies, the amplitude of the C3 filter is suppressed. Suppressing a chrominance filter means dynamically computing an adjustment factor and multiplying it into the filter coefficients.
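The per-component suppression rule maps naturally onto the direction/component pairing of Figure 3. The gain values and the "high frequency" threshold below are illustrative, not from the paper.

```python
def chroma_gain(luma_pass, default=1.0, suppressed=0.3):
    """Per-component gain multiplied into the chroma filter taps
    (illustrative values).  `luma_pass` holds the luminance filter's pass
    frequency per direction, normalised to [0, 0.5].

    When the luminance filter passes high frequencies in a direction, the
    chroma component modulated in that direction (C1 horizontal, C2
    vertical, C3 diagonal) is attenuated to limit luma/chroma aliasing.
    """
    high = 0.35   # assumed threshold for a "high" pass frequency
    return {
        "C1": suppressed if luma_pass["horizontal"] > high else default,
        "C2": suppressed if luma_pass["vertical"]   > high else default,
        "C3": suppressed if luma_pass["diagonal"]   > high else default,
    }
```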

The same adjustment can instead be achieved by shifting the cutoff frequency of the chrominance filter: when the luminance pass band at the current pixel is close to the default chrominance pass band, the chrominance cutoff frequency is raised. Concretely, a high horizontal luminance pass frequency raises the cutoff of the C1 filter, a high vertical pass frequency raises the cutoff of the C2 filter, and a high diagonal pass frequency raises the cutoff of the C3 filter.

Note that the demosaiced luminance can also be obtained as a residual: after the three chrominance components have been demosaiced, subtract the sum of the remodulated full-frame chrominance components from the original RAW data. Similarly, any one chrominance component can be obtained by subtracting the sum of the full-frame luminance and the other two chrominance components from the original RAW data, once those three have been demosaiced.
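The residual recovery of luminance follows directly from equation (3). A minimal sketch, using the same modulation signs as assumed earlier for the G R / B X layout:

```python
import numpy as np

def luma_by_residual(f_cfa, c1, c2, c3):
    """Recover full-frame luminance as the residual of equation (3):
    subtract the remodulated full-frame chroma estimates from the RAW image."""
    H, W = f_cfa.shape
    n1, n2 = np.mgrid[0:H, 0:W]
    s1, s2 = (-1.0) ** n2, (-1.0) ** n1      # C1 alternates horizontally, C2 vertically
    return f_cfa - (s1 * c1 + s2 * c2 + s1 * s2 * c3)
```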

Finally, the interpolated four-channel RGBX pixel values are computed from the luminance and chrominance components using the inverse conversion matrix, as in equation (4).

    [ G ]     [ 1   1   1   1 ] [ L  ]
    [ R ]  =  [ 1  -1   1  -1 ] [ C1 ]        (4)
    [ B ]     [ 1   1  -1  -1 ] [ C2 ]
    [ X ]     [ 1  -1  -1   1 ] [ C3 ]
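Because the transform of equation (2) is, up to the 1/4 scale, its own inverse (a Hadamard-type matrix), the back-conversion is just signed sums of the components. A minimal sketch, under the same assumed G R / B X layout:

```python
def rgbx_from_luma_chroma(L, C1, C2, C3):
    # Equation (4): signed sums of the luminance and chrominance components
    g = L + C1 + C2 + C3
    r = L - C1 + C2 - C3
    b = L + C1 - C2 - C3
    x = L - C1 - C2 + C3
    return g, r, b, x
```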

3 Test results

Since there is currently no standard RGBX test set in academia or industry, the objective test uses the 24 RGB images of the Kodak set, the most common benchmark for Bayer demosaicing performance.

Because the proposed algorithm is fully compatible with the Bayer format, our experiments sample each image with the Bayer CFA pattern to obtain a RAW image, demosaic it, and evaluate the interpolated result against the original.

We use the color peak signal-to-noise ratio (CPSNR) as the objective performance metric. Table 1 records the comparison. The proposed demosaicing algorithm exceeds the algorithm of [11] by nearly 0.8 dB on average, the EIG interpolation algorithm of [12] by more than 1 dB, and the linear interpolation methods of [13-15] by more than 4 dB.
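For reference, CPSNR is the PSNR computed over all pixels and all three color channels jointly:

```python
import numpy as np

def cpsnr(ref, out, peak=255.0):
    """Colour peak signal-to-noise ratio in dB:
    CPSNR = 10 * log10(peak^2 / MSE), with the MSE taken over every pixel
    of every channel of the image pair."""
    ref = np.asarray(ref, dtype=np.float64)
    out = np.asarray(out, dtype=np.float64)
    mse = np.mean((ref - out) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
```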

4 Conclusion

The main contribution of this paper is a general demosaicing framework applicable to all Bayer and non-Bayer image sensors with a 2×2 minimum sampling period. Experimental results show that the method achieves better CPSNR than competing algorithms. With minor modifications, it can also handle raw signals with a 4×4 minimum sampling period.
