I. INTRODUCTION
Conventionally, noise is thought to be a nuisance that degrades a system. Stochastic resonance (SR), on the contrary, is a counter-intuitive phenomenon in which the presence of noise in a non-linear system is essential for optimal system performance, so that noise enhances rather than hinders the system. Here, this approach has been extended to use SR to enhance the contrast of an image. The first experimental work on visualization of SR was reported by Simonotto et al. [Reference Simonotto, Riani, Charles, Roberts, Twitty and Moss1]. More recent works on the application of SR to image enhancement are reported in [Reference Ye, Huang, He and Zhang2–Reference Chouhan, Jha and Biswas13]. In this paper, a contrast enhancement technique based on dynamic stochastic resonance (DSR) and scaling of discrete cosine transform (DCT) coefficients using a bistable double-well potential model has been proposed. The unique feature of this technique is the use of internal noise instead of externally added noise, together with adaptive processing to reach the target optimal performance. Performance parameters have been computed to assess contrast enhancement, perceptual quality, and color enhancement of the output image. It is observed that the DSR-based enhancement technique surpasses some of the conventional techniques of image enhancement in both spatial and frequency transform domains.
Many images have a very low dynamic range of intensity values due to insufficient illumination and therefore need to be processed before being displayed. Enhancement is required for better visualization of dark images so as to improve visual perception. Many techniques for contrast enhancement that operate in the spatial domain exist in the literature [Reference Lim14–Reference Farbman, Fattal, Lischinski and Szeliski18]. Tone mapping methods [Reference Mantiuk, Myszkowski and Seidel19,Reference Fattal, Agrawala and Rusinkiewicz20] attempt to avoid halos by manipulating gradients. Farbman et al. [Reference Farbman, Fattal, Lischinski and Szeliski18] propose the use of an edge-preserving smoothing operator, based on the weighted least-squares optimization framework, which is particularly well-suited for progressive coarsening of images and for multi-scale detail extraction. Recently, novel techniques based on exposure fusion [Reference Mertens, Kautz and van Reeth21,Reference Song, Tao, Chen, Bu, Luo and Zhang22] have been reported. Mertens et al. [Reference Mertens, Kautz and van Reeth21] have proposed a technique for fusing a bracketed exposure sequence into a high-quality image without converting to HDR first, by blending multiple exposures in a multiresolution fashion. Song et al. [Reference Song, Tao, Chen, Bu, Luo and Zhang22] have reported a technique to handle HDR scenes by integrating locally adaptive scene detail capture and suppressing gradient reversals introduced by the local adaptation.
Many algorithms found in the literature have been designed for both color and grayscale images in the block DCT domain [Reference Bockstein23–Reference Tang, Peli and Acton26]. However, processing images using block DCT has some disadvantages. Because the blocks are, in most cases, processed independently, blocking artifacts may become more visible in the processed data. Sometimes superfluous edges may appear at the image boundaries due to sharp discontinuities in the intensity distribution.
The display of a color image depends on three important factors, namely: (i) brightness, (ii) contrast, and (iii) original color composition. Interestingly, most previous works have considered either the brightness (such as adjustment of dynamic ranges) or the contrast (such as image-sharpening operations), and in some cases a combination of both attributes. Little consideration, however, has been given to the preservation of colors in the enhanced image. In this paper, all three of these factors have been considered in designing a simple, computationally efficient algorithm that enhances a dark or low-contrast image. Performance parameters have been computed to assess contrast enhancement, perceptual quality, and color enhancement of the output image.
The concept of SR was introduced in 1981–1982 in the context of the evolution of the Earth's climate. Statistical data show that interglacial transitions are random variables displaying an average periodicity of around $10^{5}$ years. The only known time scale in this range is that of the changes in the eccentricity of the Earth's orbit around the Sun, which result from the perturbing action of the other bodies of the solar system. This perturbation modifies the total amount of solar energy received by the Earth, but the magnitude of this astronomical effect is exceedingly small, about 0.1%. To explain this phenomenon, the concept of SR was introduced with the assertion that it is not the periodic eccentricity alone, but eccentricity in resonance with a weak disturbance, that causes this colossal change in temperature. In the model of Benzi et al. [Reference Benzi, Sutera and Vulpiani27,Reference Benzi, Parisi, Sutera and Vulpiani28] to explain the recurrence of ice ages on Earth, the global climate is represented by a double-well potential, where one minimum represents a low temperature corresponding to a largely ice-covered Earth. The small modulation of the Earth's orbital eccentricity is represented by a weak periodic force. Short-term climate fluctuations, such as the annual fluctuations in solar radiation, are modeled by Gaussian white noise. At an optimally tuned amount of noise, synchronized hopping between the cold and warm climates could significantly enhance the response of the Earth's climate to the weak perturbations caused by the Earth's orbital eccentricity, according to arguments by Benzi et al. [Reference Benzi, Sutera and Vulpiani27]. Specifically, glaciation cycles are viewed as transitions between glacial and interglacial states that somehow manage to capture the periodicity of the astronomical signal, even though they are actually made possible by the environmental noise rather than by the signal itself.
Starting in the late 1980s the ideas underlying SR were taken up, elaborated, and applied in a wide range of problems in physical and life sciences [Reference Rouvas-Nicolis and Nicolis29–Reference Gammaitoni, Hanggi, Jung and Marchesoni31].
The first experiment on SR for image visualization was reported in [Reference Simonotto, Riani, Charles, Roberts, Twitty and Moss1]. The authors reported the outcome of a psychophysics experiment, which showed that the human brain can interpret details present in an image contaminated with time-varying noise and that the perceived image quality is determined by the noise intensity and its temporal characteristics. Piana et al. [Reference Piana, Canfora and Riani32] described two experiments related to the visual perception of noisy letters. The first experiment found an optimal noise level at which the letter is recognized for a minimum threshold contrast. In the second experiment, they demonstrated that a dramatic increase in the ability of the visual system to recognize letters occurs in an extremely narrow range of noise intensity. Ye et al. [Reference Ye, Huang, He and Zhang4] have used the SR phenomenon for enhancement of low-contrast sonar images. Their technique showed that an additional amount of noise, besides the noise of the image itself, helps to enhance low-contrast images. Peng et al. [Reference Peng, Chen and Varshney5] reported a preprocessing approach using SR to improve low-contrast medical images, in which the contrast is improved by adding a suitable amount of noise to the input image.
Recently, SR-based techniques in wavelet and Fourier domain for the enhancement of unclear diagnostic ultrasound and MRI images, respectively, have been reported [Reference Rallabandi6,Reference Rallabandi and Roy7]. These methods can readily enhance the image by fusing a unique constructive interaction of noise and signal, and enable improved diagnosis over conventional methods. The approach illustrates the potential of using a small amount of Gaussian noise to improve the image quality. Ryu et al. [Reference Ryu, Konga and Kimb8] have developed a new approach for enhancing feature extraction from low-quality fingerprint images using SR.
In this paper, we have used internal stochastic fluctuation to enhance an image. The objective of this investigation is to neutralize the noise due to lack of illumination, and enhance the dark regions of an image in a double-well model analogous to that developed by Benzi et al. [Reference Benzi, Sutera and Vulpiani27]. Our approach is to maximize the performance of our algorithm in terms of contrast and color enhancement while assuring good perceptual quality (visual information). Noise itself is used to counter the effect of noise. In other words, in DSR-based enhancement, a small amount of extra noise rearranges the intrinsic noise that is already present in the image. The outcome of the process is both image enhancement and reduction of noise. DSR-based technique works effectively for images that are dark as well as those which have an overall dull appearance.
In this paper, we have chosen to work in the Hue-Saturation-Value (HSV) color model and have computed performance parameters to assess contrast enhancement, perceptual quality, and color enhancement of the output image. We have observed that the DSR-based enhancement technique surpasses the conventional methods of image enhancement in both spatial and frequency transform domains. As previous studies [Reference McNamara and Wiesenfeld30,Reference Gammaitoni, Hanggi, Jung and Marchesoni31] have shown that bistable SR can enhance weak one-dimensional (1D) noisy signals, we extend this approach here to use SR to enhance the contrast of a 2D signal, that is, an image.
II. KEY CONTRIBUTION
The work proposed in this paper differs from state-of-the-art SR-based techniques in the following aspects. The technique reported in [Reference Hongler, Meneses, Beyeler and Jacot3] deals with edge detection using vibrating noise. The technique reported in [Reference Peng, Chen and Varshney5] used non-dynamic SR to improve the performance of adaptive histogram equalization. The technique proposed by Ye et al. [Reference Ye, Huang, He and Zhang4] for sonar image enhancement relies on externally added noise on bileveled images. Both [Reference Peng, Chen and Varshney5] and [Reference Ryu, Konga and Kimb8] use the concept of non-dynamic SR, which adds N parallel frames of independent and identically distributed (i.i.d.) Gaussian noise and therefore also relies on externally added noise. All these techniques operate in the spatial domain. Our earlier work on suprathreshold SR [Reference Jha, Chouhan and Biswas9] used addition of external noise in the spatial domain, where random noise was added repeatedly to an image and the result was successively hard-thresholded, followed by overall averaging. By varying the noise intensity, noise-induced resonance was obtained at a particular optimum noise intensity. This approach is different from the one used in this paper, as the non-linearity in the former is due to thresholding, while that in the latter (proposed here) is due to the barrier height of the double well. Our other DSR-based investigations [Reference Jha and Chouhan10,Reference Chouhan, Jha and Biswas12] were based in the spatial and wavelet domains, respectively, while the concept of DCT-based enhancement was introduced in [Reference Jha, Chouhan, Biswas and Aizawa11].
The major difference between earlier SR-based techniques and the present approach is as follows: the focus of earlier SR-based work was on edge detection [Reference Hongler, Meneses, Beyeler and Jacot3] or on increasing feature interpretability [Reference Ye, Huang, He and Zhang4,Reference Peng, Chen and Varshney5,Reference Ryu, Konga and Kimb8]. However, contrast enhancement of dark images using dynamic SR, particularly in the DCT domain, was not addressed, and the suitability of the frequency domain for inducing it was unexplored. Unlike earlier techniques [Reference Rallabandi6,Reference Rallabandi and Roy7] that were based on addition of external noise and experimental selection of parameters, the parameter selection in this work is done by maximization of the signal-to-noise ratio (SNR) and imposition of a subthreshold condition on the input image. The focus of the investigation is on dark real-life images, and it is here that a novel noise-enhanced dynamic SR-based application in the DCT domain has been proposed. The applicability of DSR using intrinsic noise due to low illumination, and the preservation and enhancement of color, have been tested. A major and novel aspect of our approach is to maximize the performance of our algorithm in terms of contrast and color enhancement while assuring good perceptual quality (visual information). Here, noise itself is used to counter the effect of noise. In other words, in DSR-based enhancement, a small amount of extra noise rearranges the intrinsic noise that is already present in the image. The proposed approach exploits the nature of the DC and AC coefficients of a DCT block, and was found to implicitly enhance and preserve color accurately. By scaling the DCT coefficients using DSR in a discrete iterative equation, it has been observed that the overall contrast and luminance of an image are increased.
In the technique, an analogy to Benzi's double-well model for the recurrence of ice ages [Reference Benzi, Sutera and Vulpiani27] has been presented, treating one state (minimum) as the poor-contrast state and the other as the enhanced state. The low-contrast image is treated as a sum of a weak subthreshold signal and noise (due to lack of adequate illumination). A transition of the image from the low-contrast state to the high-contrast state is induced by a “noise-induced” resonance between the internal noise and the subthreshold signal after a certain number of iterations, following the dynamics of motion of a particle in a double well. Oscillations about the mean (minimum) of the double well are considered analogous to iterations of the discrete DSR equation. The proposed technique follows an iterative algorithm and selects the best output when the performance metrics are maximized. It selects parameters by maximization of SNR and further relates a DSR parameter to the statistical properties of the low-illuminated image itself. The applicability of DSR has been extended to a low-contrast image (to make it a subthreshold signal) by imposing a condition on another parameter so that the coefficients of the low-contrast image can be accepted as an input to the SR system (Section VI).
III. DYNAMIC STOCHASTIC RESONANCE
The word noise in general understanding is associated with the term hindrance. It was traditionally believed that the presence of noise can only make the system worse. However, recent studies have convincingly shown that in non-linear systems, noise can induce more ordered regimes, which cause the amplification of weak signals and increase the SNR [Reference Benzi, Sutera and Vulpiani27,Reference Gammaitoni, Hanggi, Jung and Marchesoni31,Reference Bulsara and Gammaitoni33]. In other words, noise can play a constructive role in enhancing weak signals.
The general behavior of SR mechanism shows that at lower noise intensities, the weak signal is unable to cross the threshold, thus giving a very low SNR. For large noise intensities, the output is dominated by the noise, also leading to a low SNR. But, for moderate noise intensities, the noise allows the signal to cross the threshold giving the maximum SNR at some optimum noise levels. Thus, a plot of SNR as a function of noise intensity shows a peak at an optimum noise level as shown in Fig. 1(a).
The bistable SR model conventionally used by physicists is explored and elaborated here in its application to contrast enhancement of a digital image. An image pixel is transformed when zero-mean Gaussian fluctuation noise is added, so that the pixel is transferred from a weak-signal state to an enhanced state. Such a change of state of a pixel under noise can be modeled by the Brownian motion of a particle placed in a double-well potential system, shown in Fig. 1(b).
A classic 1D non-linear dynamic system that exhibits SR is modeled with the help of Langevin equation of motion [Reference Risken34] in the form of Equation (1) given below.
This equation describes the motion of a particle of mass m moving in the presence of friction, γ. The restoring force is expressed as the gradient of some bistable or multistable potential function U(x). In addition, there is an additive stochastic fluctuation (noise) ξ(t) with intensity D. If the system is heavily damped, the inertial $m {d{^{2}}x(t) \over dt^{2}}$ term can be neglected. Rescaling the system in Equation (1) with the damping term γ gives the stochastic overdamped Duffing equation [Reference McDonnell, Stocks, Pearce and Abbott35] which is frequently used to model non-equilibrium critical phenomena as given in Equation (2).
where U(x) is a bistable quartic potential (Fig. 1(b)) given in equation (3).
Here, a and b are positive bistable double-well parameters. The double-well system is stable at $x_{m}=\pm{\sqrt{{a \over b}}}$, with the two wells separated by a barrier of height $\Delta{U}={a^{2} \over 4b}$ when ξ(t) is zero.
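For reference, Equations (1)–(3) can be reconstructed from the terms named in the text (inertia, friction, potential gradient, and a noise term of intensity D). The typesetting below is an assumption rather than a quotation; the quartic form of U(x) is the one consistent with the stated minima and barrier height.

```latex
% Reconstruction from the surrounding description; notation is assumed.
\begin{align}
m\frac{d^{2}x(t)}{dt^{2}} + \gamma\frac{dx(t)}{dt} &= -\frac{dU(x)}{dx} + \sqrt{D}\,\xi(t) \tag{1}\\
\frac{dx(t)}{dt} &= -\frac{dU(x)}{dx} + \sqrt{D}\,\xi(t) \tag{2}\\
U(x) &= -\frac{a}{2}x^{2} + \frac{b}{4}x^{4} \tag{3}
\end{align}
% Consistency check with the text:
%   U'(x) = -a x + b x^3 = 0  gives  x = 0  and  x_m = \pm\sqrt{a/b};
%   U(x_m) = -a^2/(2b) + a^2/(4b) = -a^2/(4b),
%   so \Delta U = U(0) - U(x_m) = a^2/(4b).
```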
Addition of a periodic input signal [B sin(ωt)] to the bistable system makes the potential time-dependent (as given in equation (4)), and its dynamics are governed by equation (5).
where B and ω are the amplitude and frequency of the periodic signal, respectively. It is assumed that the signal amplitude is small enough that, in the absence of noise, it is insufficient to force a particle to move from one well to another. Substituting U(x) from equation (3) into equation (5) gives equation (6).
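Equations (4)–(6) can be reconstructed in the same spirit. The sketch below assumes that the periodic signal enters the potential as a linear tilt, which is the sign convention that reproduces an additive B sin(ωt) forcing term alongside the noise term $\sqrt{D}\xi(t)$ referred to later in the text.

```latex
% Reconstruction; the sign convention of the tilt term in (4) is an assumption.
\begin{align}
U(x,t) &= U(x) - Bx\sin(\omega t) \tag{4}\\
\frac{dx(t)}{dt} &= -\frac{dU(x)}{dx} + B\sin(\omega t) + \sqrt{D}\,\xi(t) \tag{5}\\
\frac{dx(t)}{dt} &= \left[ax - bx^{3}\right] + B\sin(\omega t) + \sqrt{D}\,\xi(t) \tag{6}
\end{align}
% (6) follows from (5) because -dU/dx = a x - b x^3 for the quartic U(x) of (3).
```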
In the absence of a periodic force, the particle fluctuates around its local stable states. The rate of transition of the particle ($r_{k}$) between the potential wells under noise-driven switching is given by Kramers' rate [Reference Risken34] as in equation (7).
When a weak periodic force is applied to the unit mass particle in the potential well, noise-driven switching between the potential wells takes place. Resonance occurs when the average waiting time, $T_{k}(D) = {1 \over r_{k}}$, between two noise-driven inter-well transitions satisfies the time-scale matching condition between the signal frequency, ω, and the residence times of the particle in each well [Reference Gammaitoni, Hanggi, Jung and Marchesoni31,Reference Jung and Hanggi36]. In other words,
where $T_{\omega}$ is the period of the periodic force.
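For concreteness, the Kramers rate of equation (7) and the matching condition of equation (8) take the following standard forms for a quartic double well $U(x) = -{a \over 2}x^{2} + {b \over 4}x^{4}$. The noise normalization is an assumption: unit-strength white noise, $\langle\xi(t)\xi(t')\rangle = \delta(t-t')$, is taken here, so the effective diffusion constant is D/2; under the other common convention, $\langle\xi(t)\xi(t')\rangle = 2\delta(t-t')$, the exponent reads $-\Delta U/D$ instead.

```latex
% Prefactor: \omega_0 \omega_b / (2\pi) with \omega_0^2 = U''(x_m) = 2a and \omega_b^2 = |U''(0)| = a.
\begin{align}
r_{k} &= \frac{a}{\sqrt{2}\,\pi}\exp\!\left(-\frac{2\,\Delta U}{D}\right) \tag{7}\\
2\,T_{k}(D) &= T_{\omega} \tag{8}
\end{align}
% (8) states that the particle should hop once per half period of the forcing.
```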
One way of measuring how well the position of the particle represents the frequency of the input is to measure the power spectral density (PSD) of the position, and determine the SNR at ω. This will have a peak at a non-zero value of D, and hence SR occurs. The optimal value of D is the one that provides the best time-scale matching between ω and the residence times of the particle in each well. The most common quantifier of SR is SNR. For a symmetric bistable system, SNR is obtained from [Reference Gammaitoni, Hanggi, Jung and Marchesoni31,Reference Jung and Hanggi36].
Substituting the value of $r_{k}$ from equation (7) into equation (9), we get
The SNR expression for dynamic SR as derived in [Reference Ye, Huang, He and Zhang4] is given below.
Here σ1 is the standard deviation of the added noise in the SR-based system and σ0 is the internal noise standard deviation of the original bistable system.
IV. CHOICE OF DSR FOR IMAGE ENHANCEMENT
It has been observed in 1D signals that at an optimum “resonant” value of noise, the signal crosses the threshold and transits into another (enhanced) state. An analogy to the double-well model of Benzi et al. for global climate has been developed in this paper in the context of image enhancement. Here the double well represents the contrast of an image. The position of the particle is analogous to the state of the coefficient magnitude. The weak periodic forcing is constituted by the DCT coefficients, while the noise is constituted by the noise inherent in the DCT coefficients due to lack of illumination. The two stable states represent a low-contrast state and an enhanced state, respectively. The state at which the performance metrics are found to be maximum can be considered the enhanced state reached from the input state.
The quartic potential system is of particular interest because it represents the simplest bistable system. Under random periodic forcing, a particle will spend most of its time near the minima of the double well, oscillating about the mean position with progressively increasing excursions along the x-axis (Fig. 1(b)). Each of these oscillations corresponds to an iteration in our iterative DSR equation. At the optimum intrinsic noise density (or optimum number of oscillations), the particle makes a transition into the other well. In the proposed analogy, this optimum amount of noise is reached by a corresponding discrete iterative equation, because the number of iterations is directly proportional to the internal noise.
Dynamic SR is exhibited by a double-well potential valley characterized by parameters a and b, which govern the shape of the valley (Fig. 1(b)). The transform coefficients (here, DCT) of a low-contrast image are in a weak state in the sense that their distribution becomes random due to inherent noise in the form of lack of proper illumination. The application of DSR involves correlation of parameters a and b with these random coefficients. When the content of each frequency is correlated with the optimum bistable DSR system parameters, the spread of the DCT coefficient distribution increases, and so does the overall contrast of the image. The result is that an image in a poor-contrast state transits into an enhanced-contrast state after a certain optimum number of oscillations about the poor state.
A) Choice of DCT domain
The DCT coefficients, both within a block and in totality, are observed to follow a normal distribution. When both DC and AC coefficients are tuned using the iterative dynamic SR equation, the variance of the DCT coefficient distribution is found to increase with iterations. It is known that the DC coefficient represents the average brightness of an image, while the sum of squares of the normalized AC coefficients gives the variance of the image. Thus, modification of the DC coefficient of each block increases the local brightness, which is very useful for enhancement of dark images. Owing to block-wise operation, local contrast and brightness can be adjusted accordingly. Different algorithms have been reported for both color and graylevel images in the block DCT domain, such as multicontrast enhancement [Reference Tang, Peli and Acton26] and alpha rooting [Reference Aghagolzadeh and Ersoy37], which process the AC coefficients, and a modified form that processes both DC and AC coefficients [Reference Mukherjee and Mitra25,Reference Lee38].
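The relation between the DCT coefficients and the block statistics can be made explicit. The short derivation below assumes an orthonormal DCT of an M×N block (the normalization is an assumption, and other conventions change only the constant factors); μ and σ² denote the block mean and the population variance.

```latex
\begin{align}
I'(0,0) &= \frac{1}{\sqrt{MN}}\sum_{x=0}^{M-1}\sum_{y=0}^{N-1} I(x,y) = \sqrt{MN}\,\mu \\
\sum_{u,v} I'(u,v)^{2} &= \sum_{x,y} I(x,y)^{2} \qquad \text{(Parseval, orthonormal transform)} \\
\sum_{(u,v)\neq(0,0)} I'(u,v)^{2} &= \sum_{x,y} I(x,y)^{2} - MN\,\mu^{2} = MN\,\sigma^{2}
\end{align}
% Hence the DC term carries the mean, and the AC energy divided by MN is the variance.
```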
B) Distribution of DCT coefficients of image: mechanism of DSR for contrast enhancement
For a low-contrast input image, the histogram of the image, as well as that of its transformed coefficient distribution, is observed to be of low spread. Since the squared magnitude of the coefficients represents energy, a low-variance distribution implies that the energy is concentrated in only certain areas, confirming that the image in question is of low contrast.
Now if DSR is applied to these DCT coefficients, their variance is observed to increase with iterations (Figs. 2(a) and 2(d)). This is because the coefficients are being tuned by certain bistable system parameters. The sum of squares of the normalized AC coefficients provides the variance of the image. Hence, any change in the DC component does not have any bearing on its standard deviation. So under scaling or modification of the DCT coefficients, the mean and standard deviation of the processed image become some multiple of the original mean and standard deviation, respectively. As a result, the contrast of the processed image becomes a certain multiple of that of the original image [Reference Mukherjee and Mitra25]. Another interpretation is that the value of a DCT coefficient denotes the amount of a particular frequency present. An increase in the variance of the coefficients means that a greater range of frequency amounts is now occupied. In other words, there is significant variation in the amounts of the different frequencies present in the signal. This ensures that the balance between low and high frequencies is restored and that the image acquires more uniformly spaced graylevels, which implies a high-contrast output image. This is the basic mechanism by which DSR improves contrast.
V. MATHEMATICAL FORMULATION OF THE DCT-BASED DSR
The mathematical formulation of DSR for enhancement of a very dark image is discussed here. The 2D DCT is applied to the input image. Let us consider the 2D spatial representation of an M×N image, I(x, y), in an actual physical space (x, y), where the function value is the image pixel value. After applying the 2D DCT, I′(u,v) is obtained [Reference Gonzales and Woods15].
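For reference, the forward transform presumably takes the standard 2D DCT-II form shown below. The orthonormal scaling factors α(u) and α(v) are an assumption, since normalization conventions differ between texts.

```latex
\begin{align}
I'(u,v) &= \alpha(u)\,\alpha(v)\sum_{x=0}^{M-1}\sum_{y=0}^{N-1} I(x,y)\,
\cos\!\left[\frac{(2x+1)u\pi}{2M}\right]\cos\!\left[\frac{(2y+1)v\pi}{2N}\right] \\
\alpha(u) &= \begin{cases}\sqrt{1/M}, & u = 0\\ \sqrt{2/M}, & u = 1,\dots,M-1\end{cases}
\qquad
\alpha(v) = \begin{cases}\sqrt{1/N}, & v = 0\\ \sqrt{2/N}, & v = 1,\dots,N-1\end{cases}
\end{align}
```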
where u and v are the DCT frequency pair corresponding to the spatial coordinates x and y. DSR is now applied to the I′(u,v) coefficients, thereby obtaining the stochastically enhanced set of DCT coefficients given as
where the DSR operation can be expressed in differential form and in discrete form as given in equation (6) and equation (12), respectively. Here the noise term $\sqrt{D}\xi(t)$ and the input term B sin(ωt) are together replaced by the DCT coefficients of I(x, y), that is, I′(u,v). In equation (6), DSR is produced by the noise term $\sqrt{D}\xi(t)$, and the SNR is maximized at the double-well parameter value $a=2\sigma_{0}^{2}$ (as described in Section VI). The stochastic differential equation in equation (6) is solved using the Euler–Maruyama method, which gives the iterative discretized form as follows [Reference Gard39].
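A minimal sketch of this discretized update, assuming it takes the Euler–Maruyama form $x(n+1) = x(n) + \Delta t\,[a\,x(n) - b\,x^{3}(n) + Input]$ with each DCT coefficient acting as the constant Input and with x(0) = 0. The function name `dsr_iterate`, the step size, and the parameter values in the usage note are illustrative assumptions, not values taken from the paper.

```python
import statistics


def dsr_iterate(coeffs, a, b, dt, n_iter):
    """Iterate x <- x + dt * (a*x - b*x**3 + c) for every coefficient c.

    coeffs : list of transform coefficients (signal plus internal noise)
    a, b   : double-well parameters (assumed already selected)
    dt     : discretization step (hypothetical value chosen by the caller)
    n_iter : number of iterations
    """
    # Assumed initial state: every "particle" starts at x(0) = 0.
    state = [0.0] * len(coeffs)
    for _ in range(n_iter):
        state = [
            x + dt * (a * x - b * x ** 3 + c)
            for x, c in zip(state, coeffs)
        ]
    return state


if __name__ == "__main__":
    coeffs = [-0.02, -0.01, 0.0, 0.01, 0.02]
    for n in (50, 100, 200):
        tuned = dsr_iterate(coeffs, a=0.5, b=0.01, dt=0.01, n_iter=n)
        # The spread of the tuned coefficients grows with the iteration count.
        print(n, statistics.pstdev(tuned))
```

With these illustrative values the spread of the tuned coefficients increases with the number of iterations, which mirrors the variance growth described for the DCT coefficient distribution.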
Note that $Input=B\sin(\omega{t}) + \sqrt{D}\xi(t)$ denotes the sequence of input signal and noise. This notation is justified because the low-contrast image is a noisy image containing internal noise due to lack of illumination. This noise is inherent in its DCT coefficients, and therefore the DCT coefficients can be viewed as containing signal (image information) as well as noise. The final stochastic simulation is obtained after a number of iterations. Finally, the image is reconstructed in the spatial domain by applying the inverse DCT operation given below:
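The full transform, iterate, and invert loop on a single block can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes the orthonormal DCT-II and its inverse, intensities normalized to [0, 1], x(0) = 0, and hand-picked values of a, b, Δt, and the iteration count. All function names are hypothetical.

```python
import math


def _alpha(k, n):
    # Orthonormal DCT scaling factor (assumed normalization).
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)


def _basis(p, k, n):
    # Cosine basis value for sample index p and frequency index k.
    return math.cos((2 * p + 1) * k * math.pi / (2 * n))


def dct2(block):
    """Forward 2D DCT-II of a small block given as a list of rows."""
    m, n = len(block), len(block[0])
    return [[_alpha(u, m) * _alpha(v, n) * sum(
                block[x][y] * _basis(x, u, m) * _basis(y, v, n)
                for x in range(m) for y in range(n))
             for v in range(n)] for u in range(m)]


def idct2(coef):
    """Inverse of dct2 (2D DCT-III with the same orthonormal scaling)."""
    m, n = len(coef), len(coef[0])
    return [[sum(_alpha(u, m) * _alpha(v, n) * coef[u][v]
                 * _basis(x, u, m) * _basis(y, v, n)
                 for u in range(m) for v in range(n))
             for y in range(n)] for x in range(m)]


def dsr_block(coef, a, b, dt, n_iter):
    """Apply x <- x + dt*(a*x - b*x**3 + c) to every coefficient c."""
    state = [[0.0] * len(row) for row in coef]  # assumed x(0) = 0
    for _ in range(n_iter):
        state = [[x + dt * (a * x - b * x ** 3 + c)
                  for x, c in zip(srow, crow)]
                 for srow, crow in zip(state, coef)]
    return state


def enhance_block(block, a, b, dt, n_iter):
    """DCT -> DSR iterations -> inverse DCT for one block."""
    return idct2(dsr_block(dct2(block), a, b, dt, n_iter))


if __name__ == "__main__":
    dark = [[0.05, 0.06], [0.07, 0.08]]  # toy 2x2 block, values in [0, 1]
    out = enhance_block(dark, a=0.5, b=0.01, dt=0.01, n_iter=200)
    print(out)
```

The direct four-fold summation is chosen only for readability on tiny blocks; a practical implementation would use a fast DCT routine.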
VI. SELECTION OF PARAMETERS FOR IMAGE ENHANCEMENT
This section describes one of our key contributions – the approach for selection of double-well parameters a and b.
A) Selection of a
DSR is defined by equation (12) after proper selection of the double-well parameters a and b. These double-well parameters can be obtained by maximization of the SNR expression of DSR.
For SNR maximization, we differentiate equation (11) with respect to a and equate the result to zero. Either of the two DSR parameters, a or b, can be selected for this purpose; we have selected parameter a for our discussion.
This gives $a=2\sigma_{0}^{2}$ for maximum SNR. Thus SNR has maximum value at an intrinsic property a of the dynamic double-well system. The other parameter b can be obtained using parameter a.
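The maximization step can be written out as a short derivation. It assumes that the SNR of equation (11) depends on a only through the factor $a\exp(-a/2\sigma_{0}^{2})$, which is the form consistent with the stated optimum; the constant C, which gathers every factor independent of a, is an assumption introduced here for illustration.

```latex
\begin{align}
\mathrm{SNR}(a) &= C\,a\,\exp\!\left(-\frac{a}{2\sigma_{0}^{2}}\right), \qquad C > 0 \\
\frac{d\,\mathrm{SNR}}{da} &= C\left(1 - \frac{a}{2\sigma_{0}^{2}}\right)\exp\!\left(-\frac{a}{2\sigma_{0}^{2}}\right) = 0
\;\Longrightarrow\; a = 2\sigma_{0}^{2} \\
\left.\frac{d^{2}\,\mathrm{SNR}}{da^{2}}\right|_{a=2\sigma_{0}^{2}} &= -\frac{C}{2\sigma_{0}^{2}}\,e^{-1} < 0
\quad\text{(so the stationary point is a maximum)}
\end{align}
```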
B) Selection of b
To ensure that the low-contrast signal is subthreshold, we have derived a condition on the value of parameter b. As shown in equation (1), the restoring force is expressed as the gradient of some bistable potential function U(x). The periodic force alone is not responsible for the transition of the particle from one well to another. We therefore seek the maximum value of the periodic force for which the bistable system does not change its state due to this force alone. Let R=B sin ωt be the periodic forcing signal.
implying $x= \sqrt{{a \over 3b}}$. Evaluating R at this value gives the maximum force as $\sqrt{{4a^{3} \over 27b}}$. This is the maximum possible force at which the bistable system remains stable. For a force larger than this, the system would become unstable. Therefore,
Since our aim is to obtain a maximal signal, we let the sine term attain its maximum value, that is, unity, and take B=1.
Hence, for a weak input signal $b < {4a^{3} \over 27}$.
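The steps leading to this bound can be sketched as follows, assuming that the periodic force is compared against the magnitude of the restoring force $R(x) = ax - bx^{3}$ obtained from the quartic potential; the intermediate lines are a reconstruction consistent with the stated results.

```latex
\begin{align}
R(x) &= ax - bx^{3} \\
\frac{dR}{dx} &= a - 3bx^{2} = 0 \;\Longrightarrow\; x = \sqrt{\frac{a}{3b}} \\
R_{\max} &= \sqrt{\frac{a}{3b}}\left(a - \frac{a}{3}\right) = \frac{2a}{3}\sqrt{\frac{a}{3b}} = \sqrt{\frac{4a^{3}}{27b}} \\
B\sin(\omega t) &< \sqrt{\frac{4a^{3}}{27b}}, \qquad
\text{with } B\sin(\omega t) = 1:\quad 1 < \frac{4a^{3}}{27b} \;\Longrightarrow\; b < \frac{4a^{3}}{27}
\end{align}
```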
Thus, the values of these parameters for maximizing contrast enhancement or SNR (in a general sense) are taken to be $a=2\sigma_{0}^{2}$ and $b < {4a^{3} \over 27}$.
However, it has been found experimentally that, for the purpose of contrast enhancement, the best results are obtained by introducing a factor k, denoting image region dullness, in the determination of a.
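A small helper illustrating these two rules is sketched below. The function name and the `fraction` argument (how far below the bound b is placed) are hypothetical, and the estimate of the internal noise standard deviation σ0 is assumed to be supplied by the caller.

```python
def select_dsr_parameters(sigma0, fraction=0.5):
    """Return (a, b) with a = 2*sigma0**2 and b strictly below 4*a**3/27.

    sigma0   : internal noise standard deviation (estimated elsewhere)
    fraction : hypothetical knob in (0, 1) placing b below its upper bound
    """
    if sigma0 <= 0:
        raise ValueError("sigma0 must be positive")
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must lie strictly between 0 and 1")
    a = 2.0 * sigma0 ** 2            # SNR-maximizing value of a
    b_bound = 4.0 * a ** 3 / 27.0    # subthreshold condition: b < 4a^3/27
    return a, fraction * b_bound


if __name__ == "__main__":
    a, b = select_dsr_parameters(sigma0=0.5)
    print(a, b)
```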
VII. PROPOSED DSR-BASED ALGORITHM FOR CONTRAST ENHANCEMENT
The proposed algorithm performs contrast enhancement on color images by applying DSR iteratively to the DCT coefficients of the image in question. The algorithm addresses the requirements of dark and low-contrast images. The procedure comprises three basic steps.
In the first step, a block-wise DCT of input image is computed. By using block DCT space, the localized information can be modified or enhanced in successive steps. We have therefore adopted an adaptive selection of blocksize. This step ensures that blocking artifacts are suppressed in the output.
In the second step, areas that need different extent of enhancement are segregated using a threshold value in the neighborhood (blocksize) determined in the first step.
The third step is the most crucial one: the application of DSR selectively on DCT coefficients. For example, in a very dark image or a dull (low dynamic range) image, enhancement operator (DSR) will be applied to the entire image whereas in an image with high dynamic range, the enhancement operator would be applied on selected areas where the local brightness is below a certain threshold. This is to accentuate features that were not visible in the input image, without making the bright areas too bright. On such an image, the underilluminated areas should be iterated more than those areas that were overilluminated. This step ensures that background illumination is adjusted and local as well as global contrast is improved.
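The segregation in the second step can be illustrated with a simple block-mean rule. This is only a sketch: the fixed block size, the use of the block mean as the local brightness measure, and the threshold value are assumptions made for illustration, and the adaptive block-size selection of the first step is not reproduced.

```python
def block_means(image, size):
    """Mean intensity of each non-overlapping size x size block.

    image is a list of rows; partial blocks at the right and bottom
    edges are ignored in this sketch.
    """
    n_rows, n_cols = len(image) // size, len(image[0]) // size
    means = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            values = [image[r * size + i][c * size + j]
                      for i in range(size) for j in range(size)]
            row.append(sum(values) / len(values))
        means.append(row)
    return means


def select_dark_blocks(image, size, threshold):
    """Flag the blocks whose mean brightness falls below the threshold."""
    return [[m < threshold for m in row] for row in block_means(image, size)]


if __name__ == "__main__":
    image = [
        [10, 10, 200, 200],
        [10, 10, 200, 200],
        [200, 200, 200, 200],
        [200, 200, 200, 200],
    ]
    # Only the dark top-left block is marked for enhancement.
    print(select_dark_blocks(image, size=2, threshold=50))
```

Blocks flagged in this way would receive the DSR iterations, while bright blocks would be left unchanged or iterated fewer times.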
In order to investigate the ways in which the DSR operator modifies the transformed coefficients, and to study its effect on the preservation and enhancement of color in the enhanced image, the algorithm, in its current form, is designed to operate on each color band (R, G, B). The same approach can also be extended to the Y–Cb–Cr or H–S–V color spaces. The DSR operator also works on each DCT block independently and can therefore be considered suitable for parallel implementation. The parameters of the bistable DSR system have been tuned for optimal performance.
A dark image has a narrow histogram (intensity distribution) concentrated at the lower (darker) end of the intensity axis. A low dynamic range image also has a narrow histogram, but it may be centered anywhere along the available intensity axis. An image with high dynamic range is expected to have very dark areas along with some very bright areas. Enhancement of such images is challenging, as most of the available algorithms enhance dark areas at the cost of making bright areas too bright, leading to loss of information. The proposed DSR-based technique handles this aspect efficiently.
A) Quantitative performance metrics
To measure the efficiency of the proposed DSR-based technique, we need to compare its performance with that of other conventional methods. This comparison requires the quality of the enhanced image to be quantified. Since we need to gauge the performance of our technique in terms of contrast, perceptual quality, and degree of color enhancement with color preservation, we have chosen three metrics, F, PQM, and CEF, respectively, to characterize them.
The metric of contrast enhancement (F) is based on the global variance and mean of the original and enhanced images. When an image is enhanced and clearer heterogeneity in its structure is obtained, the degree of enhancement can be characterized by the variation of the Michelson contrast index (given by the ratio of spread to mean image intensity) [Reference Rallabandi and Roy7]. We have therefore used a descriptor called the image contrast quality index, Q, such that
$$Q = \frac{\sigma^{2}}{\mu},$$
where σ and μ are, respectively, the standard deviation and mean of the image. The relative contrast enhancement factor, F, is estimated as the ratio of the contrast quality indices post-enhancement (Q_B) and pre-enhancement (Q_A). Therefore,
$$F = \frac{Q_B}{Q_A}.$$
For evaluation of perceptual quality, we have used a no-reference metric that judges the quality of an image reconstructed from the block DCT space by taking into account visible blocking and blurring artifacts. We shall refer to it as the perceptual quality metric (PQM) [Reference Lee38]:
$$\mathrm{PQM} = \alpha + \beta\, B^{\gamma_1} A^{\gamma_2} Z^{\gamma_3},$$
where α, β, γ1, γ2, and γ3 are model parameters that were estimated from subjective test data as described in [Reference Wang, Sheikh and Bovik40]. B is the average blockiness, estimated as the average difference across block boundaries in the horizontal and vertical directions, A is the average absolute difference between in-block image samples, and Z is the zero-crossing rate. According to [Reference Mukherjee and Mitra25], the PQM value should be close to 10 for best perceptual quality.
Since we are also interested in the quality of color enhancement, we have used a no-reference metric called the colorfulness metric (CM), as suggested by [Reference Susstrunk and Winkler41]. If R, G, and B are the red, green, and blue components, respectively, of an image I, and α = R − G and β = (R + G)/2 − B, then the colorfulness of the image is defined as
$$\mathrm{CM}(I) = \sqrt{\sigma_\alpha^{2} + \sigma_\beta^{2}} + 0.3\sqrt{\mu_\alpha^{2} + \mu_\beta^{2}},$$
where σα and σβ are the standard deviations of α and β, and μα and μβ are their means. The color enhancement factor (CEF) is defined as the ratio of the colorfulness of the output image to that of the input image:
$$\mathrm{CEF} = \frac{\mathrm{CM}(I_{\mathrm{out}})}{\mathrm{CM}(I_{\mathrm{in}})}.$$
For good color and contrast enhancement, the values of CEF and F, respectively, should be greater than 1. Code obtained from [42] was used to compute PQM and CEF.
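The two ratio metrics can be sketched in a few lines of Python. This is a minimal illustration rather than the code from [42]: it assumes Q is the variance divided by the mean, that the second opponent axis takes the standard form β = (R + G)/2 − B, and that images are NumPy arrays with the color bands on the last axis. All function names are hypothetical. PQM is not covered here.

```python
import numpy as np

def contrast_quality_index(img):
    """Q = variance / mean of the image intensities (assumed form of Q)."""
    img = np.asarray(img, dtype=np.float64)
    return img.var() / img.mean()

def relative_contrast_enhancement(before, after):
    """F = Q_B / Q_A, the ratio of post- to pre-enhancement quality index."""
    return contrast_quality_index(after) / contrast_quality_index(before)

def colorfulness(rgb):
    """CM built from the opponent axes alpha = R - G and beta = (R + G)/2 - B."""
    rgb = np.asarray(rgb, dtype=np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    alpha = r - g
    beta = 0.5 * (r + g) - b
    spread = np.hypot(alpha.std(), beta.std())    # sqrt(sigma_a^2 + sigma_b^2)
    offset = np.hypot(alpha.mean(), beta.mean())  # sqrt(mu_a^2 + mu_b^2)
    return spread + 0.3 * offset

def color_enhancement_factor(before, after):
    """CEF = CM(output) / CM(input)."""
    return colorfulness(after) / colorfulness(before)
```

Under these definitions, multiplying every intensity of an image by a constant c scales Q and CM by c, so both F and CEF equal c. This gives a quick sanity check for an implementation.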
B) Algorithm
The proposed DSR-based iterative algorithm for contrast enhancement consists of the following steps.
1) STEP-1 BLOCK DCT
An input image of dimension M × N is considered, and its 8 × 8 block-wise DCT is calculated. A small blocksize is preferred to preserve continuity in the enhanced output image, because block-wise DCT processing is known to produce blocking artifacts. However, in areas where there is little change in luminance and no sharp transitions, a larger blocksize could be used to reduce computation. Since blocking artifacts are more visible in regions where brightness values vary significantly, especially near sharp transitions of luminance, an adaptive selection of blocksize has been adopted. To suppress the artifacts, the DCT blocks are decomposed into smaller blocks where necessary, and computations are performed on them. Later, these smaller blocks are merged back into the original blocksize.
Adaptive blocksize for DCT: Blocks having significant variations are identified by examining the standard deviation of their normalized AC coefficients. If the standard deviation exceeds a threshold, an 8 × 8 block is decomposed into four 4 × 4 subblocks, and so on. Because the blocksize is then not fixed for the entire image, this step also creates a non-uniform grid for the creation of the mask in Step-2. Later, the four enhanced subblocks are combined again into an 8 × 8 block.
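The adaptive decomposition described above can be sketched as a recursive quadtree split. The sketch rests on several assumptions that the text does not fix: AC coefficients are normalized by the DC magnitude, the split threshold `thresh` and the smallest tile `min_size` are illustrative values, and the image dimensions are multiples of the starting blocksize. The DCT is built from an explicit orthonormal DCT-II matrix so that the example depends only on NumPy.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix C, so that C @ x is the 1-D DCT of x."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def dct2(block):
    """2-D DCT of a square block."""
    c = dct_matrix(block.shape[0])
    return c @ block @ c.T

def idct2(coeffs):
    """Inverse 2-D DCT (C is orthogonal, so its inverse is its transpose)."""
    c = dct_matrix(coeffs.shape[0])
    return c.T @ coeffs @ c

def ac_spread(block):
    """Std of the AC coefficients, normalized by |DC| (assumed normalization)."""
    coeffs = dct2(block)
    dc_mag = abs(coeffs[0, 0]) + 1e-12
    return (coeffs.ravel()[1:] / dc_mag).std()

def adaptive_blocks(img, size=8, min_size=2, thresh=0.05):
    """Yield (row, col, size) tiles, splitting busy tiles into four quadrants."""
    def split(r, c, s):
        if s > min_size and ac_spread(img[r:r + s, c:c + s]) > thresh:
            h = s // 2
            for dr in (0, h):
                for dcol in (0, h):
                    yield from split(r + dr, c + dcol, h)
        else:
            yield (r, c, s)

    for r in range(0, img.shape[0], size):
        for c in range(0, img.shape[1], size):
            yield from split(r, c, size)
```

A flat region keeps its 8 × 8 tiles, while a tile containing a sharp luminance edge is subdivided. Because each tile is recorded with its position and size, the enhanced subblocks can later be written back to the same locations.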
2) STEP-2 SELECTION OF AREAS FOR ENHANCEMENT
The entire image might or might not need contrast enhancement. In many practical images, certain portions are already bright enough, while other areas require accentuation of features that are lost due to insufficient illumination. In such images, DSR is applied to a greater extent to the dark areas. To locate those areas, the local intensity in each m × n block of the image is observed, and blocks whose local brightness (mean) and contrast (variance) are above a certain threshold are processed separately. The DC coefficient of the DCT can be used to denote average brightness. The blocksizes (or neighborhoods of analysis) are those determined by the adaptive process in Step-1. Using the bilevel thresholding technique (proposed by Chao et al. [Reference Ming, Wu and Chen43]), a threshold is determined in each neighborhood, and pixels with intensity values greater than the threshold are separated from those below it using a binary mask. In this way, a binary mask separating dark and bright regions is created for each color band. Depending on the requirement of the image, the bright areas are also iterated using DSR, but fewer times than the dark ones, to achieve a moderate overall contrast. A very dark image can be considered a special case of this selection, in which the local mean of every block is below the threshold and the whole image requires enhancement.
Note: For a very dark low dynamic range image, where the maximum intensity level in the image is below graylevel 128 (in 8-bit representation) for each band, the threshold can be taken as the maximum graylevel in each color band. This means that all the DCT coefficients will be processed. For an image with under- and overilluminated areas, selection of coefficients can be done as stated above.
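A simplified version of this selection step is sketched below. It does not implement the bilevel thresholding of [Reference Ming, Wu and Chen43]: as a labeled stand-in, the threshold is the global mean of the band, and the decision is made per block on the block mean rather than per pixel. The function name, the fixed blocksize, and the `dark_limit` argument are illustrative assumptions.

```python
import numpy as np

def dark_region_mask(band, block=8, dark_limit=128):
    """Boolean mask for one color band; True marks blocks to be enhanced more."""
    band = np.asarray(band, dtype=np.float64)
    if band.max() < dark_limit:
        # Very dark band: the threshold is the maximum graylevel,
        # so every coefficient is selected for processing.
        return np.ones(band.shape, dtype=bool)
    thresh = band.mean()  # stand-in for the bilevel threshold (assumption)
    mask = np.zeros(band.shape, dtype=bool)
    for r in range(0, band.shape[0], block):
        for c in range(0, band.shape[1], block):
            tile = band[r:r + block, c:c + block]
            # The block mean plays the role of the DC-based average brightness.
            mask[r:r + block, c:c + block] = tile.mean() < thresh
    return mask
```

The complement of the mask identifies the bright region, which is iterated fewer times. One mask would be computed for each of the R, G, and B bands.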
3) STEP-3 APPLICATION OF DSR
Assuming initial values of the bistable parameters Δt, k, and m, the selected DCT coefficients are tuned using DSR as follows.
Assuming Δt = 0.001 s, the bistable parameters a and b are computed for each block i from its local variance $\sigma_{0i}^{2}$ as
$$a_i = k \times 2\sigma_{0i}^{2}, \qquad b_i = m \times \frac{4a_i^{3}}{27}.$$
Here, k is a factor denoting image region dullness (given by the inverse of (variance × dynamic range)) and m is a factor much less than 1 (to ensure that b is less than its maximum value, so that the input is a weak signal eligible for the application of DSR).
A matrix of dimension M × N is initialized to zero, that is, x(0) = 0. Using the bistable DSR parameters, the DCT coefficients (denoted Input below) are tuned according to equation (12) as
$$x(n+1) = x(n) + \Delta t\left[a\,x(n) - b\,x^{3}(n) + \mathrm{Input}\right],$$
where x denotes the set of tuned coefficients and n is the iteration count. It is important to note that the use of the DCT-block variance in computing parameter a follows from the assumption that the DCT coefficients of a low-contrast image contain both signal information and noise due to insufficient illumination. Since noise is inherent in the DCT coefficients, the standard deviation of the internal noise can be equated to the standard deviation of the DCT coefficients. Since the algorithm is framed in a block DCT scenario, the local DCT variance is taken as the local noise variance. The inverse DCT is then computed for the tuned set of coefficients. The performance of this intermediate output is optimized with respect to the bistable parameters as follows.
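The tuning of one block can be sketched as below. The sketch assumes that equation (12) has the discretized form x(n+1) = x(n) + Δt[a x(n) − b x³(n) + Input], with the block's DCT coefficients as the input term and the variance of those coefficients as the local noise variance. The function name and its defaults (taken from the initial values quoted in this section) are illustrative.

```python
import numpy as np

def dsr_tune(coeffs, k, m=1e-10, dt=0.001, n_iter=500):
    """Tune one block of DCT coefficients with the bistable DSR iteration.

    Assumed update rule: x <- x + dt * (a*x - b*x**3 + coeffs), with x(0) = 0.
    """
    coeffs = np.asarray(coeffs, dtype=np.float64)
    sigma2 = coeffs.var()          # local variance, taken as the noise variance
    a = k * 2.0 * sigma2           # a_i = k * 2 * sigma_0i^2
    b = m * 4.0 * a**3 / 27.0      # b_i = m * 4 * a_i^3 / 27, with m << 1
    x = np.zeros_like(coeffs)      # x(0) = 0
    for _ in range(n_iter):
        x = x + dt * (a * x - b * x**3 + coeffs)
    return x, a, b
```

Because m is very small, the cubic term is negligible in the early iterations and each coefficient grows roughly in proportion to its input value. The cubic term bounds that growth as the iteration count increases.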
4) STEP-4 OPTIMIZATION OF PARAMETERS
Optimization of parameters is done with respect to all the performance metrics F, PQM, and CEF to make the algorithm reach target performance. This is to ensure that output is not only of good contrast, but also has better visual perception along with color enhancement.
Four main parameters control the DSR operation: a and b, which govern the shape of the double well, and Δt and n, which govern the DSR difference equation.
The value of parameter a depends on k (of the image region) and on the standard deviation of the DCT coefficients in the local neighborhood (as derived by maximization in Section VI). In other words, its value is obtained from the image (and image transform) statistics and is therefore fixed. The values of m, Δt, and n are initialized as $m = 10^{-10}$, Δt = 0.001, and n = 500. The initial values of F, PQM, and CEF are assumed to be 0.01. To make the enhancement algorithm give optimal results, each performance metric is calculated for every enhanced output image, and each parameter is analyzed for the value at which the performance metrics (F, PQM, and CEF) are best. As stated in Section VIIA), for best performance F and CEF should be as large as possible, while PQM should be nearly 10. With these conditions on F and CEF and the constraint on PQM, the optimum value of a given parameter is chosen by linear maximization of F and CEF in the vicinity of PQM = 10. Each of the other parameters is then optimized similarly, keeping the remaining parameters constant.
The steps can be understood as described below:
(a) Optimize m keeping Δt and n as constant (initial values as stated above); that is, calculate F and CEF for enhanced outputs obtained by varying values of m. Optimum value of m is the one at which F(m) + CEF(m) is maximum for PQM(m) ~ 10 (say, 10±1).
(b) With this value of optimum m and initial value of n, the value of Δt is then similarly optimized.
(c) With these obtained optimized values of m and Δt, the iteration count n is then similarly optimized.
Note that although F, PQM, and CEF are not functions of m, n, or Δt, the notation F(m), PQM(n), etc. has been used to denote the metric value for a particular parameter value. For example, F(m) denotes the value of F computed from the enhanced image obtained with parameter value m.
The final enhanced output is the one with optimized values of m, Δt, and n, ensuring that it has the best possible contrast enhancement, perceptual quality, and color enhancement. With the above optimization process, optimized values of the bistable parameters are calculated per image. The parameters that maximize all the performance metrics and minimize the trade-off are used to display the final outputs.
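Steps (a) to (c) amount to a one-parameter-at-a-time search, which can be sketched as follows. The callable `run(m, dt, n)` is hypothetical: it stands for enhancing the image with the given parameters and returning the tuple (F, PQM, CEF). The candidate grids and the fallback to the initial value when no candidate satisfies the PQM constraint are also assumptions of this sketch.

```python
def pick_best(candidates, evaluate, pqm_target=10.0, pqm_tol=1.0):
    """Return the candidate maximizing F + CEF among those with PQM near target.

    `evaluate(value)` is a hypothetical callable returning (F, PQM, CEF).
    Returns None when no candidate satisfies the PQM constraint.
    """
    best_value, best_score = None, float("-inf")
    for value in candidates:
        f, pqm, cef = evaluate(value)
        if abs(pqm - pqm_target) <= pqm_tol and f + cef > best_score:
            best_value, best_score = value, f + cef
    return best_value


def optimize_parameters(run, m_grid, dt_grid, n_grid,
                        m0=1e-10, dt0=0.001, n0=500):
    """Optimize m, then dt, then n, holding the other parameters fixed."""
    # (a) vary m with dt and n at their initial values
    m = pick_best(m_grid, lambda v: run(v, dt0, n0))
    m = m0 if m is None else m
    # (b) vary dt with the optimized m and the initial n
    dt = pick_best(dt_grid, lambda v: run(m, v, n0))
    dt = dt0 if dt is None else dt
    # (c) vary n with the optimized m and dt
    n = pick_best(n_grid, lambda v: run(m, dt, v))
    n = n0 if n is None else n
    return m, dt, n
```

Since m spans many orders of magnitude, a logarithmically spaced `m_grid` is the natural choice. The same routine would be run separately for each masked region.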
Note: There are two regions in the image after creation of binary mask – one that needs serious enhancement and the other that needs to be preserved. This optimization procedure is followed independently for both the regions. So for each image, there are two values of k corresponding to two masked regions each depending on inverse of (variance × dynamic range) of that region. This would ensure that the two regions are optimized differently and enhanced as per requirement. The issue of selectively iterating areas of different illumination is therefore taken care of by the optimization. In this way, the final output image is maximally enhanced.
VIII. EXPERIMENTAL RESULTS
The proposed method has been tested on around forty different dark and low-contrast grayscale and colored images. The outputs have been compared with those of many existing contrast enhancement techniques and have been found to give noteworthy and better contrast enhancement. Performance metrics have been calculated and displayed for evaluation of output image quality. The best outputs are selected by the optimal values of the three performance metrics: relative contrast enhancement factor (F), PQM, and CEF.
Some of the test images are naturally low-contrast and were captured with a Sony DSC H9 camera in very poor illumination (Figs. 5(a) and 5(c)), while others have been made low-contrast by manipulation (Figs. 3(c), 6(a) and 10(a)). Some images with an already dark background have been obtained from the Internet (Figs. 3(a) and 4(a)).
A) Enhancement results
It can be observed that Figs. 3(a), 4(a), 6(a) and 7(a) are low-contrast images with certain well-lit areas in the foreground but totally dark and indistinguishable features in the background, for example, the buildings in the background of Fig. 3(a) and the trees in Fig. 7(a). Figs. 5(a) and 5(c) are dark, low-contrast images with a dull appearance (Fig. 5(c) is a grayscale image). Figs. 4(c) and 3(c) are low-contrast dull images.
For such images, a mask is selected for each color band after analyzing the average brightness in the local neighborhood, and the coefficients that need processing are selected, as shown in Fig. 7. It is apparent that after a certain number of iterations using optimized bistable parameter values, the dark portions are enhanced drastically while the bright portions are affected little. The enhanced output is observed to be smooth and free of artifacts.
Fig. 7 shows input and enhanced images for a test image along with the respective histograms of their color bands. The histograms that have a major portion toward the darker end are found to be shifted and broadened, with little modification in the regions near the brighter end. This implies that the local dynamic range in dark and dull areas is expanded while that in bright areas is not greatly disturbed.
Figs. 6(c) and 6(d) show zoomed-in areas of Figs. 6(a) and 6(b), respectively. Many details of the dark portions clearly appear in the DSR-enhanced output. Fig. 5(a) shows a naturally dark image, captured in a dark room, that is, in very poor illumination. Since the image is dark and its maximum intensity is less than graylevel 128, no selectivity is required and the entire image is processed.
Table 1 shows the metrics relative contrast enhancement factor (F), perceptual quality metric (PQM), and color enhancement factor (CEF) for three different types of input images: dark, low-contrast, and one with both over- and underilluminated areas. It is apparent that the best performance metrics are achieved by the proposed DCT-based DSR technique in terms of visual quality, contrast enhancement, and color enhancement.
B) Computational complexity
On an Intel Core 2 Duo CPU with 3.25 GB of RAM, 100 iterative steps on a general 512 × 512 colored image take approximately 1.5 s. Parameter a is derived from the statistics of the coefficient distribution, whereas the other two parameters, b and Δt, are calculated by linear maximization of the performance metrics. This optimization of two parameters takes around 40 s in total. If the average number of optimal iterations is around 250, then the cost of computation when the optimal parameters are known is (1.5/100) × 250 = 3.75 s. The computation time may be reduced by defining a logical relationship between all bistable parameters (including b and Δt) and the input statistics, as this may eliminate the iterative optimization of each parameter that the algorithm currently requires.
After the optimized performance metrics have been computed, the general computational complexity of the SR-based algorithm on an M × N image may be stated in terms of the iteration count, n, needed for optimal performance. The complexity of the algorithm is guided by two factors: the iteration count and the complexity of the transformation. For each iteration from 1 to n, the computation for the DSR step is O(MN), whereas that of the inverse DCT step is O(MN lg N). Since the inverse transformation is done after each iteration, the total complexity including the one-time DCT is O(MN lg N) + n · O(MN + MN lg N). Although the complexities of general histogram equalization (O(aL + bMN), where a and b are constants and L is the number of graylevels), gamma correction (one exponentiation operation), and many other techniques are lower than that of the SR-based algorithm, the superior overall performance of the SR-based algorithm makes it suitable for dark image enhancement.
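As a worked simplification of the total above, the per-iteration cost is dominated by the inverse DCT, so the expression collapses to a single term. This is a sketch that uses only the per-step costs stated in this section; the symbol T for the total cost is introduced here for illustration.

```latex
\begin{align*}
T(M,N,n) &= O(MN \lg N) + n \cdot O(MN + MN \lg N) \\
         &= O(MN \lg N) + n \cdot O(MN \lg N) \\
         &= O(n \, MN \lg N).
\end{align*}
```

For a fixed image size, the cost therefore grows linearly with the iteration count n, which is consistent with the linear scaling used in the timing estimate above.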
C) Comparison with other techniques
The response of the proposed technique has been compared with that of other image enhancement techniques, as shown in Figs. 8, 9, 10 and 11. In the spatial domain, comparison has been made with contrast limited adaptive histogram equalization (CLAHE) [Reference Zuiderveld44], gamma correction (Gamma), single-scale retinex (Retinex) [Reference Jobson, Rahman and Woodell16], multiscale retinex (MSR) [Reference Jobson, Rahman and Woodell45], modified high-pass filtering (MHPF) [Reference Yang46], and edge-preserving multiscale decomposition (EPMD) [Reference Farbman, Fattal, Lischinski and Szeliski18]. Additional comparison has been made with SR-based techniques in the spatial domain, namely suprathreshold SR (SSR) [Reference Jha, Chouhan and Biswas9] and dynamic SR on singular values (DSR-SVD) [Reference Jha and Chouhan10]. In the transform (DCT) domain, multicontrast enhancement (MCE) [Reference Tang, Peli and Acton26], multicontrast enhancement with dynamic range compression (MCEDRC) [Reference Lee38], color enhancement by scaling (CES) [Reference Mukherjee and Mitra25], SR in Fourier domain (Fourier-SR) [Reference Rallabandi and Roy7], and SR in wavelet domain (Wavelet-SR) [Reference Rallabandi6] have been used for comparison. Since the proposed technique is an automatic algorithm, a comparison has also been made with the output of the “Auto Contrast” control of Adobe Photoshop CS2. The medium-detail output of edge-preserving multiscale decomposition [Reference Farbman, Fattal, Lischinski and Szeliski18] was used, as obtained from [47]. Out of the many outputs of CES [Reference Mukherjee and Mitra25], the one obtained using the mapping function τ(x) with suppression of blocking artifacts was chosen for comparison, using the code obtained from their website [Reference Susstrunk and Winkler41]. The techniques reported in [Reference Rallabandi6,Reference Rallabandi and Roy7] were designed only for grayscale images and select parameters by experimentation over a range of values of a and b.
Here, the same approach has been applied to the value vector of the HSV color model. Implementing this experimental selection is computationally intensive, and the values of the other parameters, Δt and the added noise variance, are not clearly stated. Therefore, we cannot claim that this is the best possible output of the techniques reported in [Reference Rallabandi6,Reference Rallabandi and Roy7].
• The proposed DCT-based DSR technique has been found to give noteworthy contrast enhancement when compared with the outputs of other existing techniques. In the outputs of existing enhancement techniques, the brighter portions are brightened beyond sensible enhancement, with loss of information in those areas. This loss is not significant in the output of the proposed technique. When compared with the results of edge-preserving multiscale decomposition, the color vectors are observed to be more enhanced, although the details are only moderately enhanced (Fig. 11(n)).
• One of the most striking properties of the proposed DSR-based technique is the enhancement of very dark images (Fig. 5). The grayscale values of even very dark regions (nearly zero) can be modified to remarkably enhance the details of the image. This leads to very high F and CEF values. This property is not observed in any of the other techniques compared here.
• The performance values are tabulated in Table 1. The values show that the proposed DSR–DCT technique reaches contrast enhancement factor (F) values higher than most of the other techniques for all types of images. As stated in Section VIIA), PQM should be close to 10 for best perceptual quality. On darker images, and on images with both bright and very dark areas, the DSR–DCT technique keeps PQM closest to 10, signifying better perceptual quality than most of the other techniques. Among the compared techniques, DSR–DCT gives better or comparable color information on all types of images, especially dark ones. The poor performance of DSR–SVD and SSR (for dark images) on Fig. 3(a) is primarily because these algorithms were not designed to operate on local neighborhoods and therefore cannot efficiently process images with both dark and bright areas. DSR–DCT is also observed to reach much higher contrast and CM than most of the other SR-based techniques. For dark images, although greater values of color enhancement are observed for retinex and modified HPF, the corresponding perceptual quality is low. The DSR-based DCT technique is found to give remarkably high values of all performance metrics. The DFT- and DWT-based approaches presented by Rallabandi et al. [Reference Rallabandi and Roy7] seem to introduce a useful concept for medical imaging but do not appear to work very well for real-life dark colored images. The proposed DCT-based DSR technique gives performance comparable to that obtained by wavelet-based SR [Reference Rallabandi and Roy7] after certain modifications were made to the reported algorithm.
It may, therefore, be observed that although some of the other techniques reach higher values of F and/or CEF, they fall short in PQM. An optimal combination of all three metrics is required for a technique to be considered more suitable for image enhancement. The limitations of the DCT-based algorithm, seen in its underperformance against some of the techniques, may be overcome by a more suitable selection of parameters, research on which is still in progress. The performance metrics F and CEF may be increased further if the constraint on PQM is relaxed (but at the cost of a quantitative loss of visual quality). However, if the PQM calculation model is suitably modified for applicability to dark and low-contrast images only, even a strict constraint on its value might lead to better overall performance.
IX. GENERAL DISCUSSION
Various aspects of the proposed technique and their implications have been discussed in this section.
A) Optimization characteristics of bistable DSR parameters
An example of the characteristics of m, Δt, and n (with initially assumed values $m = 10^{-10}$, Δt = 0.001, and n = 500) with respect to the contrast enhancement factor (F), as explained earlier in the algorithm, is displayed in Fig. 12 for three kinds of test images. The corresponding graphs for the other performance metrics were obtained similarly. It is important to note that the optimized selection of any parameter can be done only after the corresponding characteristics of the parameter with respect to CEF and PQM have also been obtained. The values of the bistable system parameters play a crucial role in the process of contrast enhancement using DSR. Taking the block DCT also enhances local contrast, since the bistable parameters a and b are obtained for each block independently from the block's local variance. The expression for SR on any data set contains an additive term in multiples of k and a subtractive term in multiples of m (equation (12)). From the DSR iterative equation (12), it is apparent that an image that is very dark and has low dynamic range requires larger values of k to reach sensible contrast quality in fewer iterations, while images that are relatively less dark and cover an appreciable graylevel range require smaller values of k for proper enhancement. The opposite is true for m.
Therefore, it can be suggested that k is inversely related to the overall variance (signifying the contrast of the input image) and the dynamic range of the input image. Observing the statistics of the transformed coefficients of the input image shows that k is a non-linear, inversely proportional function of variance and dynamic range. We therefore take k to be the inverse of the product variance × dynamic range.
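The choice of k can be written as a one-line computation. In this sketch the variance and dynamic range are taken from the pixel intensities of a region, and the dynamic range is taken as maximum minus minimum; both are assumptions, since the text does not fix the exact statistics used. The function name is illustrative.

```python
import numpy as np

def dullness_factor(region):
    """k = 1 / (variance * dynamic range) of a region (assumed definitions)."""
    region = np.asarray(region, dtype=np.float64)
    dynamic_range = region.max() - region.min()
    variance = region.var()
    if variance == 0 or dynamic_range == 0:
        raise ValueError("flat region: k is undefined")
    return 1.0 / (variance * dynamic_range)
```

A darker, narrower region yields a larger k, matching the observation that such images need larger values of k to reach sensible contrast in fewer iterations.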
Optimization of m has been done on a logarithmic scale because of the large range of experimental values, so that it is comparable with k. The objective is to obtain an optimized value of the factor m by which the maximum value of b, $4a^{3}/27$, is multiplied, so that b stays below this maximum (ensuring the subthreshold condition). Fig. 12(a) shows that for a dark image the optimum m lies at the lower end, whereas for a low-contrast image it lies in the middle range.
Values of Δt have been observed to affect the number of iterations in a manner similar to k. Δt plays the role of an initial step size for tuning. If Δt is chosen to be large, fewer iterations are needed, but the refinement is limited, leading to poor tuning. A small value of Δt is therefore desired, and so it has also been optimized. The range of optimum values of Δt is 0.01–0.055.
B) Role of internal noise
Externally added noise can be used in the DSR iterative equation, but it increases the ambiguity of the system, as the system already contains noise in the form of low illumination. The distribution of DCT coefficients is itself Gaussian-like and can hence be considered to have the nature of white Gaussian noise [Reference Barni, Bartolini, Capellini and Piva48]. It is also known that the two types of noise in an SR system, internal and additive, should be similar in distribution [Reference Sun and Lei49]. Since additive Gaussian noise has been investigated on Fourier coefficients (which themselves follow a mixture of Gaussian distributions) by [Reference Rallabandi and Roy7], this indicates that the internal noise inherent in the frequency coefficient distribution also has a near-Gaussian nature. Similarly, for DCT coefficients, the internal noise can be considered inherent in the coefficients and can be scaled iteratively instead of adding external noise of similar distribution. Owing to the inherently noisy nature of the frequency transform coefficients of a low-contrast image, we preferred to study the behavior of DSR on these coefficients. The behavior of the performance metrics is observed to be analogous to that of the SNR of a bistable system. Iterative processing increases (scales) the internal noise inherent in the image. The performance metrics reach a maximum after a certain optimum number of iterations and start decreasing as the iteration count increases beyond the optimum. This is because the iteration count is directly proportional to the internal noise. This behavior is similar to that of adding external noise in a general DSR bistable system, where SNR is maximum at some optimum amount of additive noise.
C) Suppression of blocking artifacts
The adaptive selection of blocksize addresses two problems. The first is to suppress blocking artifacts introduced due to DCT. The second is related to preserving the continuity between the dark and bright portions after processing. Since selection of areas for enhancement is done based on per block intensity and contrast, this can create a problem of boundary continuity between the more processed and less processed blocks in the output. To reduce this problem, block size needs to be adaptively reduced in areas of sharp discontinuities. Thus, it serves a twofold purpose.
D) Preservation of color information
Preservation of color implies that, in the RGB color space, the color vector of a pixel in the processed image has the same direction as that in the original. The DCT has the property of energy compaction, so most of the energy resides within a small range of the coefficients. The variation in coefficient values in each band over successive iterations is such that the processed color vector is parallel to the original RGB vector.
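The direction criterion stated above can be checked numerically with the cosine of the angle between input and output color vectors. This is an illustrative check, not the validation used in the paper (which relies on PSNR of hue and saturation); the function name and the exclusion of near-black pixels are assumptions.

```python
import numpy as np

def mean_color_direction_cosine(before, after, eps=1e-12):
    """Mean cosine of the angle between per-pixel RGB vectors of two images.

    A value of 1.0 means every output color vector is parallel to its input.
    Pixels whose vectors are (near) zero are skipped; an image with no valid
    pixels would give NaN.
    """
    a = np.asarray(before, dtype=np.float64).reshape(-1, 3)
    b = np.asarray(after, dtype=np.float64).reshape(-1, 3)
    dot = (a * b).sum(axis=1)
    norms = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    valid = norms > eps
    return float((dot[valid] / norms[valid]).mean())
```

A uniform brightening of all three bands leaves the value at 1.0, whereas any change in the ratio between bands lowers it.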
The color preservation is implicit in the algorithm and has been validated by calculating the peak signal-to-noise ratio (PSNR) of hue (PSNR Hue) and saturation (PSNR Sat) for each of the test images. A subjective score based on visual appearance, the mean opinion score (MOS), on a scale of 1–10 and obtained from ten people, was also computed. It can be seen in Table 2 that the values of PSNR for almost all the outputs are greater than 12 dB, indicating a small mean-square error between the hue of the enhanced images and that of the originals. The subjective score for very dark images is low, owing to the difficulty of perceiving the hue of an object in a dark image; it is high for all the other images. One important point to note from these data is that the saturation mean-square error is larger, implying significant modification of the saturation of the images. This is precisely why there is noteworthy color enhancement in the images due to DSR processing: the colors have become slightly more saturated, displaying increased colorfulness. For all the color bands (RGB), the iteration count is the same and large. The averaged pixel value corresponding to the RGB bands therefore gives true color information due to this proportionate increment in coefficient values. Thus, there is no color shift in the output color vector. Similar results can be obtained by applying DSR only to luminance and leaving the chromatic vectors untouched.
X. CONCLUSIONS
In this paper, a novel technique using DSR in the DCT domain for the enhancement of low-contrast and dark images has been proposed. The unique feature of this technique is that it tunes the DCT coefficients according to the bistable double-well system parameters a and b and utilizes the internal noise of a low-contrast image that arises from lack of illumination. The DSR iterative process on the noisy (low-contrast) coefficients enhances the image energy; that is, the low-contrast image transits into an enhanced state, in analogy with the inter-well transition of a particle in a bistable double-well system. The performance of the proposed technique has been evaluated after optimization of the bistable parameters so that the output has maximum enhancement and the least iteration count. The DCT-based DSR technique is found to enhance very dark as well as low-contrast images very effectively, with negligible loss of information in the already bright areas (unlike most of the existing image enhancement techniques). It is an automatic process that not only adjusts background illumination but also improves the contrast while preserving and enhancing color information. Therefore, it can be inferred that the proposed DCT-based DSR technique gives remarkable performance over the existing image enhancement techniques in terms of contrast enhancement, visual information, color enhancement, and color preservation. It can be considered highly suitable for colored as well as grayscale images of varying dynamic ranges.