Signal denoising using the minimum-probability-of-error criterion

Jishnu Sadasivan; Subhadip Mukherjee; Chandra Sekhar Seelamantula

doi:10.1017/ATSIP.2019.27

Signal denoising using the minimum-probability-of-error criterion

Published online by Cambridge University Press: 20 January 2020

Jishnu Sadasivan

Subhadip Mukherjee and

Chandra Sekhar Seelamantula

Show author details

Jishnu Sadasivan: Affiliation:
Department of Electrical Engineering, Indian Institute of Science, Bangalore560012, India
Subhadip Mukherjee*: Affiliation:
Presently with Department of Mathematics, KTH Royal Institute of Technology, Stockholm, Sweden
Chandra Sekhar Seelamantula: Affiliation:
Department of Electrical Engineering, Indian Institute of Science, Bangalore560012, India
*: Corresponding author: Subhadip Mukherjee. Email: subhadipju@gmail.com

Article contents

Abstract
INTRODUCTION
THE PE RISK
EXPERIMENTAL RESULTS FOR MPE-BASED DENOISING
THE EXPECTED ℓ1 DISTORTION
PERFORMANCE OF THE EXPECTED ℓ1 DISTORTION-BASED POINTWISE SHRINKAGE ESTIMATOR
PERFORMANCE ASSESSMENT OF MPE AND ℓ1-RISK MINIMIZATION ALGORITHMS VERSUS BENCHMARK DENOISING ALGORITHMS
CONCLUSIONS
Footnotes
References

Abstract

We consider signal denoising via transform-domain shrinkage based on a novel risk criterion called the minimum probability of error (MPE), which measures the probability that the estimated parameter lies outside an ε-neighborhood of the true value. The underlying parameter is assumed to be deterministic. The MPE, similar to the mean-squared error (MSE), depends on the ground-truth parameter, and therefore, has to be estimated from the noisy observations. The optimum shrinkage parameter is obtained by minimizing an estimate of the MPE. When the probability of error is integrated over ε, it leads to the expected ℓ1 distortion. The proposed MPE and ℓ1 distortion formulations are applicable to various noise distributions by invoking a Gaussian mixture model approximation. Within the realm of MPE, we also develop a specific extension to subband shrinkage. The denoising performance of MPE turns out to be better than that obtained using the minimum MSE-based approaches formulated within Stein's unbiased risk estimation (SURE) framework, especially in the low signal-to-noise ratio (SNR) regime. Performance comparisons with three benchmarking algorithms carried out on electrocardiogram signals and standard test signals taken from the Wavelab toolbox show that the MPE framework results in SNR gains particularly for low input SNR.

Keywords

Minimum probability of error Shrinkage estimator Risk estimation Expected ℓ1 distortion Gaussian mixture model (GMM)

Information

Type: Original Paper
Information: APSIPA Transactions on Signal and Information Processing , Volume 9 , 2020 , e3

DOI: https://doi.org/10.1017/ATSIP.2019.27 [Opens in a new window]
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.
Copyright: Copyright © The Authors, 2020

1. INTRODUCTION

Signal denoising algorithms are often developed with the objective of minimizing a chosen distortion function that quantifies the distance between an estimate and the ground-truth. The most widely used distortion measure in the literature is the mean-squared error (MSE). The ground-truth parameter may be deterministic or stochastic with a known prior distribution. The latter formalism leads to Bayesian estimators. Within the deterministic signal estimation paradigm, which is also the formalism considered in this paper, one typically desires that the estimator has minimum variance and is unbiased (MVU) [Reference Poor1,Reference Kay and Eldar2]. An MVU estimator may not always exist, and if it does, it can be obtained using the theory of sufficient statistics. Eldar and Kay [Reference Kay and Eldar2] showed that, when it comes to minimizing the MSE, biased estimates can outperform the MVU estimate. For example, one could shrink the MVU estimate and optimally select the shrinkage parameter to minimize the MSE.

In this paper, we consider the problem of estimating a deterministic signal corrupted by additive white noise. The noise distribution is assumed to be known, but not restricted to be Gaussian. We propose a new distortion measure based on the probability of error and develop transform-domain shrinkage estimators. The proposed risk criterion leads to shrinkage estimators that are adapted to the noise type, meaning that the shrinkage operators corresponding to different noise distributions having the same variance turn out to be different. This behavior is in contrast with shrinkage estimators obtained using the MSE, wherein one obtains the same shrinkage profile corresponding to different noise distributions having the same variance. Before proceeding with the developments, we review important literature related to the problem at hand.

A) Prior art

The MSE is by far the most widely used measure for obtaining the optimum shrinkage parameter. Since the MSE is a function of the ground-truth, directly minimizing it results in a practically unrealizable estimate, in the sense that the estimate depends on the unknown parameter/or its statistics. However, in some cases, it is possible to find the optimum shrinkage parameter, for example, using a min-max approach [Reference Kay and Eldar2], where the parameter is constrained to a known set. An optimum shrinkage estimator, when the variance of the unbiased estimate is a scaled version of the square of the parameter, with a known scaling, is proposed in [Reference Kay and Eldar2].

Optimum shrinkage estimators have also been computed based on risk estimation, where an unbiased estimate of the MSE, which depends only on the noisy observations, is obtained and subsequently minimized to compute the optimal shrinkage. Under the assumption of Gaussian noise, an unbiased estimate of the MSE, namely Stein's unbiased risk estimator (SURE), was developed based on Stein's lemma [Reference Stein3], and has been successfully employed in numerous denoising applications. The shrinkage estimator of the mean of a multivariate Gaussian distribution with diagonal covariance matrix, obtained by minimizing SURE, dominates the classical least-squares estimate when the observation dimension exceeds three [Reference Eldar4].

A risk minimization approach for denoising using a linear expansion of elementary thresholding functions has been developed in [Reference Luisier, Blu and Unser5–Reference Zheng, Li, Blu and Lee9], wherein the combining weights are chosen optimally to minimize the SURE objective. SURE-optimized wavelet-domain thresholding techniques have been proposed in [Reference Donoho and Johnstone10–Reference Zhang and Desai12]. Atto et al. [Reference Atto, Pastor and Mercier13,Reference Atto, Pastor and Mercier14] have investigated the problem of signal denoising based on optimally selecting the parameters of a wavelet-domain smooth sigmoidal shrinkage function by minimizing the SURE criterion. Ramani et al. [Reference Ramani, Blu and Unser15] developed a Monte Carlo technique to select the parameters of a generic denoising operator based on SURE. An image denoising algorithm based on non-local means (NLM) is proposed in [Reference De Ville and Kocher16], where the NLM parameters are optimized using SURE. Notable denoising algorithms that aim to optimize the SURE objective include wavelet-domain multivariate shrinkage [Reference Hsung, Lun and Ho17], local affine transform for image denoising [Reference Qiu, Wang, Yu and Song18], SURE-optimized blockwise shrinkage for image denoising [Reference Wu, Tracey, Natarajan and P.Noonan19], SURE-optimized Savitzky-Golay filter [Reference Krishnan and Seelamantula20], etc. The SURE approach has also found applications in image deconvolution [Reference Pesquet, Benazza-Benyahia and Chaux21] and compressive sensing [Reference Zheng, Liu, Wanga and Maleki22].

The original formulation of SURE, which assumes independent Gaussian noise has been extended to certain distributions in continuous and discrete exponential families in [Reference Hudson23,Reference Hwang24], respectively, with the assumption of independence left unchanged. Eldar generalized SURE (GSURE) for distributions belonging to the non-i.i.d. multivariate exponential family [Reference Eldar4]. Giryes et al. [Reference Giryes, Elad and Eldar25] used a projected version of GSURE for selecting parameters in the context of solving inverse problems. Luisier et al. [Reference Luisier, Vonesch, Blu and Unser26] proposed an unbiased estimate of MSE under Poisson noise and minimized it to obtain the optimum wavelet-domain image denoising function. A detailed discussion of Gaussian parameter estimation using shrinkage estimators, together with a performance comparison of SURE with the maximum-likelihood (ML) and soft-thresholding-based estimators can be found in [Reference Johnstone27] (Chapter 2). It is shown in [Reference Johnstone27] that the soft-thresholding-based estimator dominates the James-Stein shrinkage estimator in terms of MSE if the parameter vector to be estimated is highly sparse, such as a spike. On the other hand, the shrinkage estimator dominates if the parameter vector has nearly equal entries.

B) This paper

We propose a signal estimation method based on the probability of error (PE) criterion, which we first introduced in [Reference Sadasivan, Mukherjee and Seelamantula28] and demonstrated application to electrocardiogram (ECG) signal denoising. Recently, in [Reference Sadasivan, Mukherjee and Seelamantula29] we proposed a speech denoising framework based on [Reference Sadasivan, Mukherjee and Seelamantula28]. In this paper, we expand further on the idea and carry out a detailed investigation on various aspects related to the PE criterion. In particular, we consider different noise distributions such as Laplacian, Student's-t, Gaussian, a Gaussian mixture model (GMM), expected ℓ₁ distortion, analysis and comparison of the implicit shrinkage function, and performance benchmarking with standard denoising techniques. The PE quantifies the probability of the estimate falling outside an ε-neighborhood of the true parameter value. Since the PE risk depends on the ground truth, we consider a surrogate, which may be biased, and minimize it to obtain the shrinkage parameter (Section 2). The optimization is carried out in the discrete cosine transform (DCT) domain, either in a pointwise fashion or on a subband basis. The resulting estimator is referred to as the shrinkage estimator based on the minimum probability of error (MPE). We derive the PE risk for Gaussian, Laplacian, and Student's-t noise distributions (Sections 2-2.1 and 2.5). In practical applications, where the noise distribution may be multimodal and not known explicitly, we propose to use a Gaussian mixture model (GMM) approximation [Reference Plataniotis and Hatzinakos30,Reference Sorenson and Alspach31] (Section 2-2.4). We show the performance of the MPE-based denoising technique on the Piece-Regular signals taken from the Wavelab toolbox in Gaussian, Student's-t, and Laplacian noise contaminations (Section 3). Proceeding further, we also consider the probability of error integrated over 0 < ε < ∞ (Section 4), which results in the expected ℓ₁distortion between the parameter and its estimate. The ℓ₁ distance has been used in the Bayesian setting and in regression problems as a robust distortion metric [Reference Poor1,Reference Sovic and Sersic32]. The estimators for the expected ℓ₁ distortion are also derived by invoking the GMM approximation (Section 4-4.1). The noise-adaptive shrinkage behavior resulting from the proposed PE and ℓ₁ distortion-based risk functions are demonstrated by plotting the shrinkage profile as a function of the parameter value (Section 4-4.3). We also assess the denoising performance of the shrinkage estimator obtained by minimizing the ℓ₁ distortion for different input SNRs and for different number of noisy realizations (Section 5).

To further boost the denoising performance of the ℓ₁ distortion-based estimator, we develop an iterative algorithm to successively refine the cost function and the resulting estimate, starting with the noisy signal as the initialization (Section 5). The iterations lead to an improvement of 2–3 dB in the output signal-to-noise ratio (SNR) (Section 5-5.1).

For performance evaluation, we conduct experiments on the Piece-Regular and the HeaviSine signals from the Wavelab toolbox [33], and ECG signals from the PhysioBank database [34]. We consider three techniques for comparison: (i) wavelet-domain soft-thresholding [Reference Donoho35]; (ii) SURE-based orthonormal wavelet thresholding using a linear expansion of thresholds (SURE-LET) [Reference Luisier, Blu and Unser5]; and (iii) SURE-based smooth sigmoid shrinkage (SS) in wavelet domain [Reference Atto, Pastor and Mercier13], all assuming Gaussian noise contamination (Section 6) in order to facilitate a fair comparison.

II. THE PE RISK

Consider the observation model x = s + w, where x ∈ ℝⁿ and s ∈ ℝⁿ denote the noisy and clean signals, respectively. The noise vector w is assumed to have i.i.d. entries with zero mean and variance σ². The goal is to estimate $\boldsymbol s$ from $\boldsymbol x$ by minimizing a suitable cost function. The signal model is considered in an appropriate transform domain, where the signal admits a parsimonious representation, but noise does not. We consider two types of shrinkage estimators: (i) pointwise, where a shrinkage factor a _i ∈ [0, 1] is applied to x _i in order to obtain the estimate $\widehat {s}_i = a_i x_i$, which means that $\widehat {s}_i$ does not depend on $\widehat {s}_j$ for i ≠ j; and (ii) subband-based estimator, wherein a single shrinkage factor a _J is applied to a group of coefficients {x _i, i ∈ J} in subband J ⊂ {1, 2, …, n}. Shrinkage estimators may also be interpreted as premultiplication of x by a diagonal matrix.

A) PE risk for pointwise shrinkage

Since the estimate of s _i does not depend on x _j, for j ≠ i, in the pointwise shrinkage scenario, we drop the index i for brevity of notation. The PE risk is defined as

(1)\begin{equation} \mathcal{R} = \mathbb{P}\left( \left\vert \,\widehat{s}-s \right\vert>\epsilon \right), \tag{1}\end{equation}

where ε > 0 is a predefined tolerance parameter. The risk $\mathcal {R}$ quantifies the estimation error using the probability measure and implicitly takes into account the noise distribution beyond the first- and second-order statistics. On the contrary, the MSE relies only on the first- and second-order statistics of noise for shrinkage estimators. Substituting $\widehat {s}=ax=a(s+w)$, the risk $\mathcal {R}$ evaluates to

\begin{align*} \mathcal{R}\left(s;a\right) &= \mathbb{P}\left( \left\vert a(s+w) -s \right\vert > \epsilon \right) \\ &= 1-F \left( \frac{\epsilon-(a-1)s}{a} \right) + F\left( - \frac{\epsilon+(a-1)s}{a} \right), \end{align*}

where F( · ) is the cumulative distribution function (c.d.f.) of the noise. Since $\mathcal {R}$ depends on s, which is the parameter to be estimated, it is impractical to optimize it directly over a. To circumvent the problem, we minimize an estimate of $\mathcal {R}$, which is obtained by replacing s with x (which is also the ML estimate of s). Such an estimate $\widehat { \mathcal {R}}=\mathcal {R}(x;a)$ takes the form

(2)\begin{equation} \widehat{\mathcal{R}}=1-F \left(\frac{\epsilon-(a-1)x}{a} \right) + F\left(-\frac{\epsilon+(a-1)x}{a}\right), \tag{2}\end{equation}

and correspondingly, the optimal shrinkage parameter is obtained as $a_{{\rm opt}} = {\rm arg}\,\underset {0 \leq a \leq 1}{\min }\, \widehat {\mathcal {R}}$. A grid search is performed to optimize $\widehat {\mathcal {R}}$ over a ∈ [0, 1], and the clean signal is obtained as $\widehat {s}=a_{{\rm opt}}x$. We next derive explicit formulae for the risk function for Gaussian, Laplacian, and Student's-t noise distributions.

(i) Gaussian distribution: In this case, the noisy observation x also follows a Gaussian distribution, and therefore, $\widehat {s}-s$ is distributed as $\mathcal {N}((a-1)s,a^2\sigma ^2 )$. The PE risk estimate is given as

(3)\begin{equation} \widehat{\mathcal{R}} = Q \left(\frac{\epsilon-(a-1)x}{a\sigma} \right) + Q \left( \frac{\epsilon+(a-1)x}{a\sigma} \right), \tag{3}\end{equation}

where $Q(u)=({1}/{\sqrt {2 \pi }})\int _{u}^{\infty }e^{-{t^2}/{2}}\,\mathrm {d}t$.

(ii) Student's-t distribution: Consider the case where the noise follow a Student's-t distribution with parameter λ > 2 and the probability density function (p.d.f.) of noise is given by

\[ f(w) = \frac{\Gamma \left( \frac{\lambda+1}{2} \right) }{\sqrt {\lambda \pi } \Gamma \left( \frac{\lambda}{2} \right) } \left( 1+\frac{w^2}{\lambda} \right)^{-{(\lambda+1)}/{2}}.\]

The variance of w is σ² = λ/(λ − 2). The expression for $\widehat {\mathcal {R}}$ is the one given in (2) with

\[ F(w) = \frac{1}{2}+ w\Gamma \left( \frac{\lambda+1}{2} \right) \frac{G_1 \left( \frac{1}{2},\frac{\lambda+1}{2};\frac{3}{2};- \frac{w^2}{\lambda} \right)} {\sqrt {\lambda \pi } \Gamma\left( \frac{\lambda}{2} \right) },\]

where G ₁ is the hypergeometric function defined as

\[ G_1 \left( a,b;c;z \right) = \sum_{k=0}^{\infty}\frac{(a)_k (b)_k}{(c)_k}\frac{z^k}{k!},\]

and (q)_k denotes the Pochhammer symbol:

\[ (q)_k \stackrel{\Delta}{=}\begin{cases} 1 & {\rm for}\ k=0, \\ q(q+1)(q+2)\cdots(q+k-1), & \mbox{for}\ k>0. \end{cases}\]

(iii) Laplacian distribution: Considering the noise to be i.i.d. Laplacian with zero-mean and parameter b (variance σ² = 2b ²), with the p.d.f. f(w) = (1/2b)exp ( − |w|/b), the PE risk can be obtained by using the following expression for F(w) in (2):

(4)\begin{equation} F(w) = \frac{1}{2} + \frac{1}{2} {\rm sgn}(w) \left( 1-\exp \left(-\frac{\vert w\vert }{b} \right) \right). \tag{4}\end{equation}

B) Closeness of $\widehat {\mathcal {R}}$ to $\mathcal {R}$

To measure the closeness of $\widehat {\mathcal {R}}$ to $\mathcal {R}$, consider the example of estimating a scalar s = 4 from a noisy observation x. The PE risk estimate $\widehat {\mathcal {R}}$ is obtained by setting s = x. In Figs 1(a),1(b), and 1(c), we show the variation of the actual risk $\mathcal {R}$ and its estimate $\widehat {\mathcal {R}}$ with a, averaged over 100 independent trials, for Gaussian, Student's-t, and Laplacian noise distributions, respectively. The noise has zero mean, and the variance is taken as σ² = 1 for Gaussian and Laplacian models, whereas for Student's-t model, the variance is σ² = 2. The value of ε is set equal to σ while computing the PE risk. We observe that $\widehat {\mathcal {R}}$ is a good approximation to $\mathcal {R}$, particularly in the vicinity of the minima. The deviation of the shrinkage parameter a _opt(x), obtained by minimizing $\widehat {\mathcal {R}}$, with respect to its true value a _opt(s) resulting from the minimization of $\mathcal {R}$, is shown in Fig. 1(d) for the three noise models under consideration. The central red lines in Fig. 1(d) indicate the medians, whereas the black lines on the top and bottom denote the 25 and the 75 percentile points, respectively. We observe that a _opt(x) is concentrated around a _opt(s), especially for Gaussian and Laplacian noise, barring a few outliers.

Fig. 1. (Color online) The PE risk averaged over 100 realizations for: (a) Gaussian, (b) Student's-t, and (c) Laplacian noise, versus the shrinkage parameter a; and (d) the percentiles of error in minima.

C) Perturbation probability of the location of minimum

The location of the minimum of the PE risk determines the shrinkage parameter. Therefore, one must ensure that it does not deviate too much from its actual value, with high probability, when s is replaced by x in the original risk $\mathcal {R}$. Let $a_{{\rm opt}}(s)=\arg \underset {0\leq a \leq 1}{\min } \mathcal {R}(s;a)$ denote the argument that minimizes the true risk $\mathcal {R}$. Consider the probability of deviation given by

(5)\begin{equation} P_e^{{\rm MPE}}=\mathbb{P}\left( \left\vert a_{{\rm opt}}\left(s\right) - a_{{\rm opt}}\left(x\right) \right\vert \geq \delta \right), \tag{5}\end{equation}

for some δ > 0. Using a first-order Taylor-series approximation of a _opt(x) about s, and substituting x = s + w, we obtain a _opt(x) ≈ a _opt(s) + w a′_opt(s), where ′ denotes the derivative. We would like to point out that the Taylor-series approximation is not rigorously justified here but it is reasonable to guide our analysis. The deviation probability $P_e^{{\rm MPE}}$ in (5) simplifies to $P_e^{{\rm MPE}}=\mathbb {P}( \left \vert w \right \vert \geq {\delta }/{\left \vert a'_{{\rm opt}}(s) \right \vert } ).$ For additive Gaussian noise w with zero mean and variance σ², placing the Chernoff bound [Reference Mitzenmacher and Upfal36,Reference Chang, Cosmon and Milstein37] on $P_e^{{\rm MPE}}$ leads to

\[ P_e^{{\rm MPE}} \leq 2\exp\left( -\frac{\delta^2}{2\sigma^2\left\vert a'_{{\rm opt}} \left(s\right) \right\vert ^2} \right).\]

To ensure that $P_e^{{\rm MPE}}$ is less than α, for a given α ∈ (0, 1), it suffices to have

(6)\begin{equation} \left\vert a'_{{\rm opt}}\left(s\right) \right\vert ^2 \leq \frac{\delta^2}{2\sigma^2 \log \left(\frac{2}{\alpha}\right)}, \tag{6}\end{equation}

which translates to a lower-bound on the input SNR. Since there is no closed-form expression available for $a'_{\rm opt}(s)$ in the context of the PE risk, we empirically obtain the range of input SNR values s ²/σ², for which (6) is satisfied.

Analogously, to satisfy an upper bound on the deviation probability $P_e^{{\rm SURE}}$ of the minimum in the case of SURE, for a given deviation δ > 0, one must ensure that

(7)\begin{equation} \frac{s^6}{8\sigma^6} \left( \delta - \frac{\sigma^4}{\left( s^2+\sigma^2 \right)s^2}\right)^2 \geq \log \left(\frac{2}{\alpha}\right). \tag{7}\end{equation}

The proof of (7) is given in Appendix A.

The minimum input SNR required to ensure P _e ≤ α for both SURE- and MPE-based shrinkage estimators is shown in Fig. 2, for different values of α and δ. The PE risk estimate is obtained by replacing s with x and setting ε = σ. We observe that reducing the amount of deviation δ for a given probability α, or vice versa, leads to a higher input SNR requirement for both SURE and MPE. We also observe from Fig. 2 that, for given δ and α, SURE requires a higher input SNR than MPE to keep the δ-deviation probability under α. Also, for a given input SNR, the δ-deviation probability of the estimated shrinkage parameter a _opt(x) from the optimum a _opt(s) is smaller for MPE than SURE, thereby indicating that the MPE-based shrinkage is comparatively more reliable and robust than the SURE-based one at low input SNRs.

Fig. 2. (Color online) Input SNR requirement for SURE (black) and MPE (blue) to ensure that the probability of δ-perturbation of the minima is less than or equal to α.

D) Unknown noise distributions

In practical applications, the distribution of noise may not be known in a parametric form and may also be multimodal. At best, one would have access to realizations of the noise, from which the distribution has to be estimated. In such cases, approximation of the noise p.d.f. using a GMM is a viable alternative [Reference Plataniotis and Hatzinakos30,Reference Sorenson and Alspach31], wherein one can estimate the parameters of the GMM using the expectation-maximization algorithm [Reference Redner and Walker38]. Gaussian mixture modeling is attractive as it comes with the guarantee that, asymptotically, as the number of Gaussians increases, the GMM approximation to a p.d.f. that has a finite number of discontinuities converges uniformly except at the points of the discontinuity [Reference Sorenson and Alspach31]. The GMM approximation can be used even for non-Gaussian, unimodal distributions. For the GMM-based noise p.d.f.

(8)\begin{equation} f(w)=\sum_{m=1}^{M}\frac{\alpha_m}{\sigma_m \sqrt{2\pi}}\exp \left(-\frac{\left( w-\theta_m \right)^2}{2\sigma_m^2} \right), \tag{8}\end{equation}

the PE risk turns out to be

(9)\begin{align} \widehat{\mathcal{R}} &= \sum_{m=1}^{M}\alpha_m \left[Q \left(\frac{\epsilon-(a-1)s-\theta_m}{a\sigma_m} \right)\right. \nonumber\\ &\quad + \left.Q \left(\frac{\epsilon+(a-1)s+\theta_m}{a\sigma_m} \right)\right], \tag{9}\end{align}

using (3). For illustration, consider the estimation of a scalar s = 4 in the transform domain from its noisy observation x. The additive noise is Laplacian distributed with zero mean and variance σ² = 1. The noise distribution is modeled using a GMM with M = 4 components and the corresponding PE risk estimate is obtained using (9) by setting s = x. In Fig. 3(a), we show a Laplacian p.d.f. and its GMM approximation. Fig. 3(b) shows the GMM approximation to a multimodal distribution. Figure 4(a) shows the PE risk based on the original Laplacian distribution as well as the GMM approximation, as a function of the shrinkage parameter a. The close match between the two indicates that the GMM is a viable alternative when the noise distribution is unknown or follows a complicated model. In Fig. 4(b), we plot the GMM-based PE risk and its estimate averaged over 100 independent trials. We observe that the locations of the minima of the actual risk and its estimate are in good agreement, thereby justifying the minimization of $\widehat {\mathcal {R}}$. The PE risk and its estimate are shown in Fig. 4(c) for the multimodal p.d.f. of Fig. 3(b).

Fig. 3. (Color online) Original noise distribution and a GMM approximation: (a) Laplacian p.d.f. and its approximation using a four-component GMM; and (b) A multimodal p.d.f. and its three-component GMM approximation.

Fig. 4. (Color online) The PE risk estimate versus the shrinkage parameter a: (a) PE risk for Laplacian noise, considering the Laplacian p.d.f. and its GMM approximation; (b) GMM-based PE risk estimate for Laplacian noise; and (c) PE risk estimate for multimodal noise; for ε = σ. The risk estimates are averaged over 100 Monte Carlo realizations.

E) PE risk for subband shrinkage

Let a _J be the shrinkage factor applied to the set of coefficients {x _i, i ∈ J} in subband J. The estimate $\,\widehat {\boldsymbol {s}}_J$ of the clean signal is obtained by $\,\widehat {\boldsymbol {s}}_J= a_J \boldsymbol {x}_J$, where x_J ∈ ℝ^|J| and a _J ∈ [0, 1]. For notational brevity, we drop the subscript J, as we did for pointwise shrinkage, and express the estimator as $\widehat {\boldsymbol {s}} = a \boldsymbol {x}$, where boldface letters indicate vectors. Analogous to pointwise shrinkage, the PE risk for subband shrinkage is defined as $\mathcal {R} = \mathbb {P}( \left \Vert \widehat {\boldsymbol {s}}-\boldsymbol {s} \right \Vert _2>\epsilon ),$ which, for $\,\widehat {\boldsymbol {s}} = a \boldsymbol {x}$, becomes $\mathcal {R} = \mathbb {P}( \left \Vert a \boldsymbol {w} + (a-1)\boldsymbol {s} \right \Vert _2>\epsilon )$. For $\boldsymbol {w} \sim \mathcal {N}(0,\sigma ^2 I)$,

(10)\begin{equation} \mathcal{R} = 1-F(\theta\vert k,\lambda), \tag{10}\end{equation}

where k = |J|, $\lambda = \sum _{j=1}^{k}({(1-a)^2 s_j^2}/{a^2\sigma ^2})$, θ = (ε/aσ)², and F(θ | k, λ) is the c.d.f. of the non-central χ² distribution, given by

\[ F(\theta\,\vert \,k,\lambda)=\sum_{m=0}^{\infty} \frac{\lambda^m e^{-{\lambda}/{2}}}{2^mm!} \frac{\gamma\left(\frac{\theta}{2},\frac{k+2m}{2}\right)}{\Gamma\left(\frac{k+2m}{2}\right)},\]

where $\Gamma (a)\,{=}\,\int _{0}^{\infty } \exp (-t) t^{a-1}\,\mathrm {d} t$ and $\gamma (x,a)\,{=}\, \int _{0}^{x} \exp\break (-t) t^{a-1}\,\mathrm {d} t$.

Similar to pointwise shrinkage, we propose to obtain an estimate $\widehat {\mathcal {R}}$ of $\mathcal {R}$ for subband shrinkage estimators by replacing s _j with x _j. The optimum subband shrinkage factor is obtained by minimizing $\widehat {\mathcal {R}}$.

Fig. 5 shows the subband PE risk and its estimate versus a, where the underlying clean signal s ∈ ℝ^|J| is corrupted by Gaussian noise and the subband size is chosen to be |J| = k = 8. The clean signal s is generated by drawing samples from $\mathcal {N}(2\times \bold 1_k, I_k)$, where $\bold 1_k$ and I _k denote a k-length vector of all ones and a k × k identity matrix, respectively. The observation x is obtained by adding zero-mean i.i.d. Gaussian noise to s, with an input SNR of 5 dB, where the input SNR is defined as ${\rm SNR}_{{\rm in}}= 10 \log _{10}(({1}/{k\sigma ^2}) \sum _{n=1}^{k}s_n^2)\,{\rm dB}$. The PE risk estimate is obtained by replacing s with x in (10), which does not drastically alter the minimum (cf. Fig. 5).

Fig. 5. (Color online) The PE risk and its estimate averaged over 100 Monte Carlo trials for the subband shrinkage estimator versus a; where $\epsilon = \sqrt {k}\sigma$, with k=8. The additive noise is Gaussian with SNR_in = 5 dB. In each trial, s is generated by drawing samples from $\mathcal {N}(2\times \bold 1_k, I_k)$.

III. EXPERIMENTAL RESULTS FOR MPE-BASED DENOISING

The performance of the MPE-based pointwise and subband shrinkage estimator is validated on a synthesized harmonic signal (of length N = 2048) in Gaussian noise and the Piece-Regular signal (of length N = 4096) in Gaussian, Student's-t, multimodal, and Laplacian noise. The Piece-Regular signal has both smooth and rapidly-varying regions, making it a suitable candidate for the assessment of denoising performance.

A) Performance of pointwise-shrinkage estimator

1) Harmonic signal denoising

Consider the signal

(11)\begin{equation} s_n = \cos\left( \frac{5\pi n}{2048} \right)+ 2 \sin\left( \frac{10\pi n}{2048} \right),\quad 0\leq n \leq 2047, \tag{11}\end{equation}

in additive white Gaussian noise, with zero mean and variance σ². Since the denoising is carried out in the DCT [Reference Rao and Yip39] domain, the Gaussian noise statistics remain unaltered. For the purpose of illustration, we assume that σ² is known. In practice, σ² may not be known a priori and could be replaced by the robust median estimate [Reference Mallat40] or the trimmed estimate [Reference Pastor and Socheleau41]. The clean signal is estimated using inverse DCT after applying the optimum shrinkage. The denoising performance of the MPE and SURE-based approaches is compared in Table 1. In case of the Wiener filter, the power spectrum of the clean signal is estimated using the standard spectral subtraction technique [Reference Boll42,Reference Loizou43]. We observe that MPE-based shrinkage with ε = 3.5σ is superior to SURE and Wiener filter (WF) by 8–12 dB. The comparison also shows that the performance of the MPE depends critically on the choice of ε.

Table 1. Comparison of MPE, SURE-based shrinkage estimator and Wiener filter (WF) for different input SNRs. The output SNR values are averaged over 100 noise realizations.

2) Piece-Regular signal denoising

We consider several noisy realizations of the Piece-Regular signal, taken from the Wavelab toolbox [33], under Gaussian, Student's-t, multimodal (Fig. 3(b)), and Laplacian noise contaminations. The noise variance is assumed to be known. Notably, the Gaussian and Student's-t distributions of noise are preserved by an orthonormal transform [Reference Kibria and Joarder44], unlike the Laplacian distribution. Therefore, the PE estimates for Laplacian noise and multimodal noise are computed based on a four-component and a three-component GMM approximation, respectively, in the DCT domain. The denoised signal corresponding to Laplacian noise is shown in Fig. 6 to serve as an illustration. The MPE estimates are better than SURE estimates. The SNR plots in Fig. 7 indicate that the MPE outperforms SURE for the noise types under consideration and that the gains are particularly high in the input SNR range of −5 to 20 dB and tend to reduce beyond 20 dB.

Fig. 6. (Color online) Denoising performance of the MPE- and SURE-based pointwise shrinkage estimators for the Piece-Regular signal corrupted by Laplacian noise. The PE risk is calculated by using a GMM approximation and setting ε = 3σ.

Fig. 7. (Color online) Output SNR versus input SNR corresponding to the MPE- and SURE-based pointwise shrinkages, under various noise distributions. The output SNR values are calculated by averaging over 100 independent noise realizations. (a) Gaussian noise, (b) Laplacian noise, (c) Student's-t noise, and (d) Multimodal noise.

3) Effect of ε on the denoising performance of MPE

Obtaining a closed-form expression for the ε that maximizes the output SNR is not straightforward. We determine the optimum ε empirically by measuring the SNR gain as a function of ε (cf. Fig. 8) for i.i.d. Gaussian noise. We observe that the output SNR exhibits a peak approximately at β = ε/σ = 3.5 for the harmonic signal in (11) and at β = 3 for the Piece-Regular signal. As a rule of thumb, we recommend ε = 3σ for pointwise shrinkage estimators. Some insights into the role of ε and its choice will be presented in Section 4-4.3.

Fig. 8. (Color online) Average output SNR of the pointwise MPE shrinkage as a function of β = ε/σ, for different values of the input SNR. The output SNR curves peak when β ≈ 3.

B) Performance of subband MPE shrinkage

To validate the performance of the MPE-based subband shrinkage estimator (cf. Section 2-2.5), we consider denoising of the Piece-Regular signal in additive Gaussian noise. The clean signal and its noisy measurement are shown in Fig. 9(a). Denoising is carried out by grouping k adjacent DCT coefficients to form a subband. The denoised signals obtained using SURE and MPE are shown in Fig. 9(b) and 9(c), respectively. The subband size k is chosen to be 16 and the parameter ε is set equal to $1.75\sqrt {k}\sigma$, a value that was determined experimentally and found to be nearly optimal. We observe that the MPE gives 1 dB improvement in SNR than the SURE approach.

Fig. 9. (Color online) Comparison of denoising performance of the subband shrinkage estimators using MPE and SURE, for the Piece-Regular signal corrupted by additive Gaussian noise. The subband size is taken as k = 16 and the value of ε is $1.75\sqrt {k}\sigma$.

The variation of the output SNR is also studied as a function of k (cf. Fig. 10). We experimented with ε = 3σ, $\epsilon =1.75\sqrt {k}\sigma$, and $\epsilon =1.25\sqrt {k}\sigma$ corresponding to subband sizes k = 1, k ∈ [2, 16], and k>16, respectively. For both SURE and MPE, as k increases, the output SNR also increases and eventually saturates for k ≥ 40. For input SNR below 15 dB, MPE gives a comparatively higher SNR than SURE, and the margin diminishes with increase in input SNR or the subband size k. The degradation in performance of SURE for low SNRs is due to the large error in estimating the MSE at such SNRs. The SURE-based estimate of MSE becomes increasingly reliable as k increases, thereby leading to superior performance.

Fig. 10. (Color online) Output SNR versus subband size k, averaged over 100 noise realizations, for different input SNRs. The output SNR of MPE is consistently superior to that obtained using SURE, especially when k ≤ 40 and the input SNR is below 15 dB.

IV. THE EXPECTED ℓ ₁ DISTORTION

To eliminate the dependence of MPE on ε, we consider the accumulated probability of error, namely $\int _{0}^{\infty }\mathbb {P}( \left \vert \widehat {s}-s \right \vert >\epsilon )\mathrm {d} \epsilon$ as the risk to be minimized. For a nonnegative random variable Y, we know that $\mathcal {E} \{Y\}=\int _{0}^{\infty }\mathbb {P}( Y>\epsilon )\mathrm {d} \epsilon.$ Therefore, the accumulated probability of error is the expected ℓ₁ distortion:

(12)\begin{equation} \mathcal{R}_{\ell_1}\left(a,s \right)=\mathcal{E}\{ \left\vert \widehat{s}-s \right\vert \} = \int_{0}^{\infty}\mathbb{P}\left( \left\vert \widehat{s}-s \right\vert >\epsilon \right) \mathrm{d} \epsilon. \tag{12}\end{equation}

For Gaussian noise distribution,

(13)\begin{align} \mathcal{R}_{\ell_1}\left(a,s \right)&=\int_{0}^{\infty} Q \left( \frac{\epsilon-(a-1)s}{a\sigma} \right)\mathrm{d} \epsilon \notag\\ &\quad + \int_{0}^{\infty} Q \left( \frac{\epsilon+(a-1)s}{a\sigma} \right)\mathrm{d} \epsilon. \tag{13}\end{align}

Denoting u = ε − (a − 1)s/aσ and μ = −(a − 1)s/aσ, the first integral in (13) is evaluated as

\begin{align} & \int_{0}^{\infty}Q \left(\frac{\epsilon-(a-1)s}{a\sigma} \right)\mathrm{d} \epsilon\notag\\ &\quad =a\sigma \int_{\mu}^{\infty} Q \left( u \right)\mathrm{d}u \nonumber\end{align}

\begin{align} &\quad = a\sigma \left( \int_{0}^{\infty} Q \left( u \right)\mathrm{d}u - \int_{0}^{\mu} Q \left( u \right)\mathrm{d}u \right)\nonumber\end{align}

\begin{align} &\quad = a\sigma \left( \frac{1}{\sqrt{2\pi}}- \mu Q \left( \mu \right)- \frac{1}{\sqrt{2\pi}}\left( 1-e^{-{\mu^2}/{2}} \right) \right)\nonumber\end{align}

(14)\begin{align} &\quad = a\sigma \left( \frac{e^{-{\mu^2}/{2}} }{\sqrt{2\pi}} - \mu Q \left( \mu \right) \right). \tag{14}\end{align}

The second term in (13) can be evaluated by replacing μ with −μ in (17). Combining both integrals, we obtain the expression for the expected ℓ₁distortion:

\begin{align*} \mathcal{R}_{\ell_1}\left(a,s \right) &= a\sigma \left[\sqrt{{2}/{\pi}}e^{-{\mu^2}/{2}} - \mu Q \left( \mu \right) + \mu Q \left( - \mu \right) \right] \\ &= a\sigma \left[ \sqrt{\frac{2}{\pi}}\exp\left(-\frac{(a-1)^2 s^2}{2a^2 \sigma^2}\right)+ 2\frac{(a-1)s}{a\sigma}\right. \\ &\quad \left.\times Q \left( -\frac{(a-1)s}{a\sigma} \right) -\frac{(a-1)s}{a\sigma} \right]. \end{align*}

An estimate of the expected ℓ₁ distortion is calculated by replacing s with x. In Fig. 11(a), we show the variation of the original ℓ₁ distortion and its estimate obtained by setting s = x, as functions of a, averaged over 100 independent realizations of $\mathcal {N}(0,1)$ noise. The actual parameter value is s = 4. The figure shows that the minimum of the expected ℓ₁ risk is close to that of its estimate.

Fig. 11. (Color online) The expected ℓ₁-risk and its estimate versus a, averaged over 100 noise realizations for: (a) Gaussian noise with σ² = 1; (b) three-component GMM; and (c) a four-component GMM approximation to the Laplacian distribution.

In principle, one could iteratively minimize the ℓ₁ distortion by starting with $\ \widehat {s}=x$ and successively refining it. Such an approach is given in Algorithm 1. An illustration of the denoising performance of the iterative algorithm is deferred to Section 5.

Algorithm 1: Iterative minimization of the expected ℓ₁ distortion.

The parentheses in $\hat{s}^{(j)}$ in the algorithm flowchart are not appearing properly.

A) Expected ℓ₁ risk using GMM approximation

For the GMM p.d.f. in (8), the expected ℓ₁ distortion evaluates to (cf. Appendix B for the derivation)

(15)\begin{equation} \mathcal{R}_{\ell_1}=\sum_{m=1}^{M} a \alpha_m \sigma_{m} \left(\sqrt{\frac{2}{\pi}}e^{-{\mu_{m}^2}/{2}} -2 \mu_{m} Q \left( \mu_{m} \right) + \mu_{m} \right), \tag{15}\end{equation}

where μ_m = −((a − 1)s + θ_m)/aσ_m. The expected ℓ₁ risk and its estimate for a multimodal (cf. Fig. 3(b)) and Laplacian noise p.d.f.s are shown in Figs 11(b) and 11(c), respectively. We observe that, in both cases, the locations of the minima of the true risk and its estimate are in good agreement.

B) Optimum shrinkage versus posterior SNR

We next study the behavior of a _opt for different input SNRs to compare the denoising capabilities of the MPE and the expected ℓ₁-distortion-based shrinkage estimators. The optimal pointwise shrinkage parameter a _opt for Gaussian noise statistics, obtained by minimizing SURE, PE risk estimate, and the estimated ℓ₁ risk, for different values of the a posteriori SNR x ²/σ² is plotted in Fig. 12(a). To illustrate the effect of ε, the variation of a _opt versus a posteriori SNR for MPE corresponding to Gaussian noise is shown in Fig. 12(b), for different ε. We observe that the shrinkage profiles are characteristic of a reasonable denoising algorithm, as Fig. 12(a) and 12(b) exhibit that the shrinkage parameters increase as the a posteriori SNR increases. Whereas in the case of MPE, the choice of ε is crucial, the expected ℓ₁ distortion does not require tuning such a parameter.

Fig. 12. (Color online) Shrinkage parameter profiles as a function of the a posteriori SNR, corresponding to different risk functions: (a) MPE, SURE, expected ℓ₁ distortion; and (b) MPE for different values of ε. The shrinkage factor a _opt is plotted on a log scale mainly to highlight the fine differences among various attenuation profiles.

C) Shrinkage profiles of estimators based on MPE/ℓ₁ vis-à-vis SURE

To gain insight into the denoising behavior of the MPE and the ℓ₁-minimization-based shrinkage estimators, it is instructive to study the resulting thresholding functions shown in Fig. 13. Figures 13(a) and 13(b) correspond to Gaussian noise. The comparison of the shrinkage functions based on MPE, ℓ₁ risk, and SURE (cf. Fig. 13(a)) reveals that MPE results in a hard-thresholding-type shrinkage, whereas the ℓ₁ risk and SURE produce shrinkage profiles similar to a soft-thresholding function (the threshold operator is called “soft” as it is a continuous function of the input data [Reference Johnstone27]). The shrinkage functions resulting from MPE with different ε are shown in Fig. 13(b). We observe that increasing ε is tantamount to applying a larger attenuation on the noisy signal. In Fig. 13(c), we demonstrate the variation of the MPE-based shrinkage with respect to two different noise distributions, namely, Gaussian and Laplacian, having the same first- and second-order moments. We observe that the MPE-based estimator corresponding to Laplacian noise attenuates larger amplitudes more than its Gaussian counterpart. Such an attenuation profile helps suppress Laplacian noise more effectively, considering the heavier tail of the Laplacian distribution. A distribution-specific shrinkage profile also distinguishes MPE from the conventional MSE-based estimators. In contrast, the MSE-based shrinkage estimator, given by [Reference Sadasivan, Mukherjee and Seelamantula28]: $\hat {s}_{{\rm MSE}}=\max \{0,1-{\sigma ^2}/{x^2}\}x,$ relies only on moments up to second order.

Fig. 13. (Color online) Thresholding functions corresponding to different risk functions: (a) MPE, SURE, expected ℓ₁ distortion; (b) MPE for various values of ε; and (c) MPE, MSE. In (a) and (b), the noise considered is Gaussian. In (c), both Laplacian and Gaussian noise types have been considered with σ = 1 and ε = 3σ.

An inspection of the shrinkage functions corresponding to different ε values in Fig. 13(b) reveals a particularly interesting property: Transform coefficients with magnitudes smaller than ε are set to zero by the MPE-based shrinkage function. Thus, the PE risk essentially results in a distribution-specific threshold operator parameterized by ε. In this context, we recall an observation made by Johnstone et al. in [Reference Johnstone and Silverman45]: A threshold value of 3σ to 4σ or higher works well for signals that are sparse in the transform domain, whereas a threshold value of 2σ or lower is more appropriate for a dense signal. The optimal ε values obtained empirically for the harmonic and the Piece-Regular signals (cf. Section 3-3.1.3.1.3, Fig. 8) also support this statement. Since the harmonic signal is sparser than the Piece-Regular signal in the DCT domain, the optimal ε tends to be higher for the harmonic signal as compared with that for the Piece-Regular signal. Thus, the shrinkage functions shown in Fig. 13(b) provide us with valuable insights on how to fix ε. If the underlying signal is sparse in the transform considered, one should fix ε in the range [3σ, 4σ], whereas for dense signals, one should set ε to about 2σ.

V. PERFORMANCE OF THE EXPECTED ℓ₁ DISTORTION-BASED POINTWISE SHRINKAGE ESTIMATOR

In a practical denoising application, we have only one noisy realization from which the clean signal has to be estimated. However, it is instructive to consider the case of multiple realizations as it throws some light on the performance comparisons vis-à-vis other estimators such as the ML estimator. Consider the observation model x^(m) = s + w^(m) in ℝⁿ, 1 ≤ m ≤ M, where one has access to M noisy realizations of the signal s, and the noise vectors w^(m) are drawn independently from the $\mathcal {N}(\bold 0,\sigma ^2 I_n)$ distribution. The ML estimator of the i ^th signal coefficient s _i is given by $\hat {s}_{{\rm ML},i}=({1}/{M}) \sum _{m=1}^{M} x_i^{(m)}$, where $x_i^{(m)}$ is the i ^th component of x^(m). Dropping the subscript i, as each coefficient is treated independently of the others, the shrinkage estimator takes the form $\,\widehat {s}=a_{{\rm opt}} \hat {s}_{{\rm ML}}$. To study the behavior of the estimate with respect to M, we consider two variants: (i) where a _opt is obtained by minimizing $\mathcal {R}_{\ell _1}(s,a)$, referred to as the oracle-ℓ₁; and (ii) where a _opt is chosen to minimize $\mathcal {R}_{\ell _1}(\hat {s}_{{\rm ML}},a)$, referred to as ML-ℓ₁. The output SNR as a function of M for the Piece-Regular signal, corresponding to an input SNR of 5 dB, is shown in Fig. 14(a). For all three estimators, namely, oracle-ℓ₁, ML-ℓ₁, and the ML estimate, the output SNR increases with M. However, for the oracle-ℓ₁ and the ML-ℓ₁ estimators, the output SNR stagnates as M increases beyond 40. For M ≤ 60, the oracle-ℓ₁ and the ML-ℓ₁ shrinkage estimators exhibit better performance compared with the ML estimator. As one would expect, the performance of the ML-ℓ₁ estimator matches with that obtained using the oracle-ℓ₁ as M becomes large, because the ML estimate converges in probability to the true parameter. For M = 1, which is often the case in practice, the ML-ℓ₁ estimate significantly dominates the ML estimator as seen in Fig. 14(a). The SNR gain over the ML estimator could be further improved by using the iterative minimization algorithm introduced in Section 4 (cf. Algorithm 1). The performance of the ML-ℓ₁ and the ML estimators, for different values of M and input SNR is shown in Fig. 14(b). The figures show that for small values of SNR and M, the ML-ℓ₁ estimate outperforms the ML estimator. This is of significant importance in a practical setting where we have only one noisy realization (M = 1).

Fig. 14. (Color online) Comparison of denoising performance as a function of number of observations for a Piece-Regular signal in Gaussian noise: (a) Output SNR corresponding to input SNR 5 dB; and (b) Output SNR as a function of the number of observations M and input SNR. The numerical values on the curves indicate the corresponding values of M. In both (a) and (b), the results are averaged over 100 independent noise realizations.

A) Iterative minimization of the expected ℓ₁-risk

When M = 1, the ML-ℓ₁ estimator is obtained by minimizing $\mathcal {R}_{\ell _1}(x,a)$, where x is the noisy version of s. We refer to this estimate as the non-iterative ℓ₁-based shrinkage estimator. Following Algorithm 1, one could iteratively refine the estimate, starting from x. We compare the non-iterative ℓ₁-based estimator with its iterative counterpart (ITER-ℓ₁), and present the results in Figs 15,16, and 17, corresponding to Gaussian, multimodal (Fig. 3(b)), and a GMM approximation to the Laplacian noise, respectively. The output SNR obtained using the oracle-ℓ₁ estimator, calculated by minimizing $\mathcal {R}_{\ell _1}(s,a)$, is also shown for benchmarking the performance.

Fig. 15. (Color online) Performance of ℓ₁-risk-minimization-based pointwise shrinkage estimator (Piece-Regular signal in Gaussian noise): (a) Output SNR versus iterations (input SNR 5 dB); and (b) Output SNR versus input SNR, averaged over 100 independent noise realizations. The number of iterations in Algorithm 1 is fixed at N _iter = 20.

Fig. 16. (Color online) Performance of pointwise shrinkage estimator based on ℓ₁ risk minimization: (a) Output SNR versus iterations, corresponding to the Piece-Regular signal contaminated by noise (SNR 5 dB) whose p.d.f. is given in Fig. 3(b); and (b) Output SNR versus input SNR (averaged over 100 independent noise realizations). The number of iterations in Algorithm 1 is set to N _iter = 20.

Fig. 17. (Color online) Performance of pointwise shrinkage estimator obtained by ℓ₁ risk minimization: (a) Output SNR versus iterations for the Piece-Regular signal in Laplacian noise with an input SNR of 5 dB; and (b) Output SNR versus input SNR for N _iter = 20. In (a) and (b), the Laplacian distribution is modeled using a four-component GMM to calculate the ℓ₁-risk estimate. The results displayed in (b) are obtained after averaging over 100 realizations.

We make the following observations from Figs 15,16, and 17: (i) the output SNR increases with iterations, albeit marginally after about 10 iterations; (ii) the iterative method consistently dominates the non-iterative one, with an overall SNR improvement of about 2–3 dB, for input SNR in the range −5 dB to 20 dB; and (iii) the SNR gain of the iterative technique also reduces for higher input SNR, similar to other denoising algorithms.

VI. PERFORMANCE ASSESSMENT OF MPE AND ℓ₁-RISK MINIMIZATION ALGORITHMS VERSUS BENCHMARK DENOISING ALGORITHMS

We compare the MPE and the ℓ₁-based shrinkage estimators with three denoising algorithms: (i) wavelet soft-thresholdingFootnote ¹ (ST) [Reference Donoho35]; (ii) the SURE-LET denoising algorithmFootnote ² [Reference Luisier, Blu and Unser5]; and (iii) smooth sigmoid shrinkage (SS) [Reference Atto, Pastor and Mercier13] in the wavelet domainFootnote ³. In [Reference Donoho35], a wavelet-based soft-thresholding scheme is used for denoising, with the threshold selected as $\tau =\sigma \sqrt []{2 \ln (N)}$ for an N-length signal. The SURE-LET technique employs a linear expansion of thresholds (LET), which is a linear combination of elementary denoising functions and optimizes for the coefficients by minimizing the SURE criterion. In [Reference Atto, Pastor and Mercier13], a smooth sigmoid shrinkage is applied on the wavelet coefficients to achieve denoising, and the parameters of the sigmoid, which control the degree of attenuation, are obtained by minimizing the SURE objective. We consider ECG signals taken from the PhysioBank database, and the HeaviSine and Piece-Regular signals taken from Wavelab toolbox for performance evaluation.

The noise is assumed to follow a Gaussian distribution and the output SNR values are averaged over 100 independent realizations. The noise variance is estimated using a median-based estimator [Reference Mallat40], which was also employed by Luisier et al.¹ and Donoho². In SURE-LET, SS, and ST techniques, denoising is performed using Symmlet-4, with three levels of decomposition, as these settings were found to be the best for the ECG signal (following [Reference Krishnan and Seelamantula20]). In the case of MPE and ℓ₁-based shrinkage estimators, denoising is performed in the DCT domain. We use the shorthand notations MPE and MPE-SUB to denote the pointwise and subband shrinkage estimators, respectively. The corresponding SURE-based subband shrinkage estimator is denoted as SURE-SUB. We set k = 16 and $\epsilon =1.75\sqrt {k}\sigma$ for computing the subband shrinkage parameters. These parameters have not been specifically optimized; however, experimentally they were found to work well. The output SNR as a function of the input SNR, obtained using various algorithms, is shown in Fig. 18.

Fig. 18. (Color online) Output SNR of various denoising algorithms, averaged over 100 noise realizations, corresponding to different input SNRs.

From the ECG signal denoising performance shown in Fig. 18(a), we observe that the MPE estimate consistently dominates soft-thresholding-based denoising for input SNRs ranging from −5 dB to 20 dB. The iterative ℓ₁-distortion-based shrinkage estimator (20 iterations) yields lower output SNR compared with the MPE-based estimate for input SNR values in the range −5 to 17.5 dB, but surpasses it for relatively higher values of input SNR (17.5–20 dB). The SURE-LET and SS algorithms dominate both MPE and ℓ₁-based shrinkage estimators, because they use more sophisticated denoising functions in the transform domain, thereby offering greater flexibility. For input SNR in the range 0–20 dB, the expected ℓ₁-distortion-based shrinkage estimator consistently outperforms the soft-thresholding-based technique. We observe from Fig. 18 that the MPE-subband estimator is better than the competing algorithms at low input SNR.

In the preceding discussion, we considered DCT-domain shrinkage for the proposed methods and wavelet-based denoising for the benchmark methods, because these are the best-case performance scenarios. In order to maintain uniformity in comparison, we have also considered the scenarios where all the techniques operate either in the DCT or wavelet domain. Figure 19 shows a performance comparison in the case of DCT. For input SNRs in the range −5 to 15 dB, MPE-SUB shows the best denoising performance. Next, we address wavelet-domain denoising using MPE. We consider the same denoising framework as that of Atto and Pastor [Reference Atto, Pastor and Mercier13], but replace their sigmoid-shrinkage function with MPE-based shrinkage in every subband. The parameter ε has been set to ε = 3σ for MPE and $\epsilon =1.25\sigma \sqrt {k}$ for MPE-SUB, where k is the subband signal length. Tables 2 and 4 show that for input SNR below 10 dB, MPE-SUB gives better results than the competing algorithms. For the HeaviSine signal (cf. Table 3), soft-thresholding technique gave a better performance than the other techniques. However, MPE-SUB leads to a better performance than SURE-LET, SURE-SUB, and SS. The wavelet-domain denoising results are shown in a tabular form and DCT-domain denoising in figures, since the relative margin of improvement is smaller in the case of wavelet. When all the techniques operate in the DCT domain, the margin of improvement is maximum in the case of MPE.

Fig. 19. (Color online) DCT-domain performance comparison of various denoising algorithms for different input SNRs, results are averaged over 100 noise realizations. (a) ECG signal, (b) HeaviSine signal, and (c) Piece-Regular signal

Table 2. Performance comparison of wavelet domain denoising of ECG signal. The output SNR values are averaged over 100 noise realizations.

Table 3. Performance comparison for wavelet domain denoising of HeaviSine signal. Results presented are averaged over 100 noise realizations.

Table 4. Performance comparison for wavelet domain denoising of the Piece-Regular signal. The output SNR values are averaged over 100 noise realizations.

VII. CONCLUSIONS

We have proposed a new framework for signal denoising based on a novel criterion, namely the probability of error. Our framework is applicable to scenarios where the noise is i.i.d. and additive. Denoising is performed by means of optimum transform-domain shrinkage in the sense of minimum probability of error. We have considered both pointwise and subband shrinkage estimators within the MPE paradigm. The performance of the proposed MPE estimators depends on the choice of the error-tolerance parameter ε, which was determined empirically considering SNR gain as the quantity of interest. The parameter ε also acts as the threshold for the shrinkage function. We also proposed an alternative by integrating the probability of error for ε ∈ (0, + ∞), which led to the expected ℓ₁ distortion, which was minimized iteratively. We demonstrated that iterations improve the denoising performance and the resulting shrinkage estimator outperforms the classical ML estimator when the number of observations is small or the input SNR is low.

For performance comparisons with benchmarking denoising algorithms, we considered real ECG signals and Wavelab signals in Gaussian noise. Experimentally, we have found that an increase in subband size leads to a higher output SNR, and saturates beyond a point. When the subband size is small or the input SNR is low, MPE outperforms SURE.

We demonstrated that the optimum shrinkage parameter obtained by minimizing estimates of the PE/ℓ₁ distortions increases monotonically with increase in the a posteriori SNR. This behavior of the shrinkage parameter is essential for denoising. A theoretical characterization of this behavior is needed and may lead to interesting inferences, which could lead to a rigorous convergence proof for the proposed iterative expected ℓ₁ distortion minimization technique. Another important observation is that, for lower input SNRs, the proposed denoising framework yields a higher output SNR than MSE-based techniques. The improvement may be attributed to the fact that the MPE framework incorporates knowledge of the distribution of the observations, which goes beyond the second-order statistics considered in conventional MSE-based optimization. To the best of our knowledge, this is the first attempt at demonstrating competitive denoising performance with probability of error chosen as the distortion measure in a non-Bayesian setting.

Following our preliminary contribution [Reference Sadasivan, Mukherjee and Seelamantula28], recently, Kudryavtsev and Shestakov considered the probability that the maximum error between the estimates and true wavelet transform coefficients exceeds a critical value and analyzed the asymptotic behavior of the resulting optimal minimax threshold value considering specifically hard- and soft-thresholding schemes [Reference Kudryavtsev and Shestakov46]. However, the advantages of such a strategy in the context of a denoising task are yet to be ascertained.

ACKNOWLEDGMENTS

This work was supported by Indian Space Research Organization – Indian Institute of Science (ISRO-IISc) Space Technology Cell, and an IMPRINT project (No. 6000) funded by the Ministry of Human Resource Development and the Ministry of Health and Family Welfare, Government of India.

APPENDIX

A) PERTURBATION OF SURE-BASED POINTWISE SHRINKAGE

To analyze the perturbation in the location of the minimum of the SURE cost, in comparison with the true MSE, one needs to evaluate

\begin{equation*} P_e^{{\rm SURE}}=\mathbb{P}\left\{ \left\vert a_{{\rm opt}} \left(s\right) - a_{{\rm opt}}\left(x\right) \right\vert \geq \delta \right\}, \end{equation*}

where a _opt(s) = s ²/(s ² + σ²) and a _opt(x) = 1 − σ²/x ². Let

\[ h\left(x\right)=a_{{\rm opt}}\left(s\right) - a_{{\rm opt}}\left(x\right)= \left(\frac{s^2}{s^2+\sigma^2}- 1+\frac{\sigma^2}{x^2}\right).\]

The Taylor-series expansion of h(x) about s yields

\[ h(x)=\frac{\sigma^4}{s^2\left(s^2+\sigma^2\right)}-2 \frac{w \sigma^2}{s^3}+\sum_{n=2}^{\infty}\frac{h^{\left(n\right)}\left(s\right) }{n!} w^n,\]

where h ⁽ⁿ⁾ is the nth derivative h. Using the first-order Taylor series approximation h(x) ≈ h(s) + w h ⁽¹⁾(s), we obtain

\[ h(x)\approx \frac{\sigma^4}{s^2\left(s^2+\sigma^2\right)}-2 \frac{w \sigma^2}{s^3},\]

which, in turn, leads to an approximation of the perturbation probability $P_e^{{\rm SURE}}$:

\begin{align*} P_e^{{\rm SURE}} &= \mathbb{P}\left\{ \left\vert h\left(x\right) \right\vert \geq \delta \right\} \\ &\approx \mathbb{P}\left\{ \left\vert \frac{\sigma^4}{s^2\left(s^2+\sigma^2\right)}-2 \frac{w \sigma^2}{s^3}\right\vert \geq \delta \right\}. \end{align*}

Invoking $w\sim \mathcal {N}(0,\sigma ^2)$, and using the Chernoff bound [Reference Mitzenmacher and Upfal36], we obtain

\[ P_e^{{\rm SURE}} \leq 2\exp\left( -\frac{s^6}{8\sigma^6} \left(\delta - \frac{\sigma^4}{\left( s^2+\sigma^2 \right)s^2} \right)^2 \right).\]

Consequently, to satisfy an upper bound on the deviation probability of the form $P_e^{{\rm SURE}} \leq \alpha$, for a given δ > 0, one must ensure that

(A.1)\begin{equation} \frac{s^6}{8\sigma^6} \left( \delta - \frac{\sigma^4}{\left( s^2+\sigma^2 \right)s^2} \right)^2 \geq \log \left(\frac{2}{\alpha}\right). \tag{A.1}\end{equation}

The condition in (A.1) translates to an equivalent condition on the minimum required SNR s ²/σ² to achieve a certain $P_e^{{\rm SURE}}$.

B) EXPECTED ℓ₁ RISK FOR GMM

For additive noise with the p.d.f. given in (8), we have

(A.2)\begin{align} \mathcal{E}\{ \left\vert \widehat{s}-s \right\vert \}&= \sum_{m=1}^{M}\alpha_m \left( \int_{0}^{\infty} Q \left( \frac{\epsilon-(a-1)s-\theta_m}{a\sigma_m} \right) \right. \mathrm{d} \epsilon\nonumber\\ &\quad + \int_{0}^{\infty}\left. Q \left( \frac{\epsilon+(a-1)s+\theta_m}{a\sigma_m} \right) \mathrm{d} \epsilon \right), \tag{A.2}\end{align}

using (9) and (12). Letting μ_m = −((a − 1)s + θ_m)/aσ_m and u _m = (ε − (a − 1)s − θ_m)/aσ_m, we get

(A.3)\begin{equation} \int_{0}^{\infty} Q \left(u_m \right)\mathrm{d} \epsilon = a\sigma_m \left( \frac{e^{-{\mu_{m}^2}/{2}} }{\sqrt{2\pi}} - \mu_{m} Q \left( \mu_{m} \right) \right). \tag{A.3}\end{equation}

Substituting (A.3) in (A.2) yields

\begin{align*} \mathcal{E}\{ \left\vert \widehat{s}-s \right\vert \}&=\sum_{m=1}^{M} a \alpha_m \sigma_{m} \\ &\quad \times \left( \sqrt{\frac{2}{\pi}}e^{-{\mu_{m}^2}/{2}} -2 \mu_{m} Q \left( \mu_{m} \right) + \mu_{m} \right), \end{align*}

which is the expression for the expected ℓ₁ distortion for noise following a GMM distribution.

Footnotes

¹ A Matlab implementation is included in the Wavelab toolbox available at: http://statweb.stanford.edu/~wavelab/.

² A MATLAB implementation of the SURE-LET algorithm is available at: http://bigwww.epfl.ch/demo/suredenoising.

³ Pastor et al. kindly provided the MATLAB implementation of their denoising technique [Reference Atto, Pastor and Mercier13], which facilitated the comparisons reported in this paper.

References

REFERENCES

1Poor, H.V.: An Introduction to Signal Detection and Estimation, Springer, New York, 1994.CrossRef Google Scholar

2Kay, S.; Eldar, Y.C.: Rethinking biased estimation. IEEE Signal Process. Mag., 25 (3) (2008), 133–136.CrossRef Google Scholar

3Stein, C.M.: Estimation of the mean of a multivariate normal distribution. Ann. Stat., 9 (6) (1981), 1135–1151.CrossRef Google Scholar

4Eldar, Y.C.: Generalized SURE for exponential families: applications to regularization. IEEE Trans. Signal Process., 57 (2) (2009), 471–481.CrossRef Google Scholar

5Luisier, F.; Blu, T.; Unser, M.: A new SURE approach to image denoising: interscale orthonormal wavelet thresholding. IEEE Trans. Image Process., 16 (3) (2007), 593–606.CrossRef Google Scholar PubMed

6Luisier, F.; Blu, T.; Unser, M.: SURE-LET for orthonormal wavelet-domain video denoising. IEEE Trans. Circ. Syst. Video Tech., 20 (6) (2010), 913–919.CrossRef Google Scholar

7Luisier, F.; Blu, T.: The SURE-LET multichannel image denoising: interscale orthonormal wavelet thresholding. IEEE Trans. Image Process., 17 (4) (2008), 482–492.CrossRef Google Scholar PubMed

8Blu, T.; Luisier, F.: The SURE-LET approach to image denoising. IEEE Trans. Image Process., 16 (11) (2007), 2778–2786.CrossRef Google Scholar PubMed

9Zheng, N.; Li, X.; Blu, T.; Lee, T.: SURE-MSE speech enhancement for robust speech recognition, in Proc. 7th Int. Symp. Chinese Spoken Language Process., November 2010, 271–274.CrossRef Google Scholar

10Donoho, D.L.; Johnstone, I.M.: Adapting to unknown smoothness via wavelet shrinkage. J. Amer. Stat. Assoc., 90 (432) (1995), 1200–1224.CrossRef Google Scholar

11Benazza-Benyahia, A.; Pesquet, J.C.: Building robust wavelet estimators for multicomponent images using Stein's principle. IEEE Trans. Image Process., 14 (11) (2005), 1814–1830.CrossRef Google Scholar PubMed

12Zhang, X.; Desai, M.D.: Adapting denoising based on SURE risk. IEEE Signal Process. Lett., 5 (10) (1998), 265–267.CrossRef Google Scholar

13Atto, A.M.; Pastor, D.; Mercier, G.: Smooth adaptation by sigmoid shrinkage. EURASIP J. Image and Video Process., 2009, 1-24, Article ID 532312, doi:10.1155/2009/532312.CrossRef Google Scholar

14Atto, A.M.; Pastor, D.; Mercier, G.: Optimal SURE parameter for sigmoidal wavelet shrinkage, in Proc. Eur. Signal Process. Conf., August 2009, 40–43.Google Scholar

15Ramani, S.; Blu, T.; Unser, M.: Monte Carlo SURE: a black-box optimization of regularization parameters for general denoising algorithms. IEEE Trans. Image Process., 17 (9) (2008), 1540–1544.CrossRef Google Scholar PubMed

16De Ville, D.V.; Kocher, M.: Nonlocal means with dimensionality reduction and SURE-based parameter selection. IEEE Trans. Image Process., 20 (9) (2011), 2683–2690.CrossRef Google Scholar PubMed

17Hsung, T.; Lun, D.; Ho, K.C.: Optimizing the multiwavelet shrinkage denoising. IEEE Trans. Signal Process., 53 (1) (2005), 240–251.CrossRef Google Scholar

18Qiu, T.; Wang, A.; Yu, N.; Song, A.: LLSURE: local linear SURE-based edge-preserving image filtering. IEEE Trans. Image Process., 22 (1) (2013), 80–90.Google Scholar PubMed

19Wu, Y.; Tracey, B.H.; Natarajan, P.; P.Noonan, J.: Fast blockwise SURE shrinkage for image denoising. Signal Process., 103, (2014), 45–59.CrossRef Google Scholar

20Krishnan, S.R.; Seelamantula, C.S.: On the selection of optimum Savitzky-Golay filters. IEEE Trans. Signal Process., 61 (2) (2013), 380–391.CrossRef Google Scholar

21Pesquet, J.C.; Benazza-Benyahia, A.; Chaux, C.: A SURE approach for digital signal/image deconvolution problems. IEEE Trans. Signal Process., 57 (12) (2009), 4616–4632.CrossRef Google Scholar

22Zheng, L.; Liu, Q.; Wanga, X.; Maleki, A.: ℓ_p-based complex approximate message passing with application to sparse stepped frequency radar. Signal Process., 134, (2017), 249–260.CrossRef Google Scholar

23Hudson, H.M.: A natural identity for exponential families with applications in multi-parameter estimation. Ann. Stat., 6 (3) (1978), 473–484.CrossRef Google Scholar

24Hwang, J.T.: Improving upon standard estimators in discrete exponential families with applications to Poisson and negative binomial cases. Ann. Stat., 10 (3) (1982), 857–867.CrossRef Google Scholar

25Giryes, R.; Elad, M.; Eldar, Y.C.: The projected GSURE for automatic parameter tuning in iterative shrinkage methods. Appl. Comput. Harmon. Anal., 30 (3) (2011), 407–422.CrossRef Google Scholar

26Luisier, F.; Vonesch, C.; Blu, T.; Unser, M.: Fast interscale wavelet denoising of Poisson-corrupted images. Signal Process., 90, (2010), 415–427.CrossRef Google Scholar

27Johnstone, I.M.: Gaussian estimation: sequence and wavelet models, June 2013. Available at: http://statweb.stanford.edu/imj/GE06-11-13.pdf. Last Accessed on: Dec. 17, 2019.Google Scholar

28Sadasivan, J.; Mukherjee, S.; Seelamantula, C.S.: An optimum shrinkage estimator based on minimum-probability-of-error criterion and application to signal denoising, in Proc. IEEE Intl. Conf. on Acoust. Speech and Signal Process., 2014, 4249–4253.CrossRef Google Scholar

29Sadasivan, J.; Mukherjee, S.; Seelamantula, C.S.: Speech enhancement using the minimum-probability-of-error criterion, in Proc. Interspeech, September 2018, 1141–1145.CrossRef Google Scholar

30Plataniotis, K.N.; Hatzinakos, D.: Gaussian Mixtures and Their Applications to Signal Processing, CRC Press, Canada, 2000.CrossRef Google Scholar

31Sorenson, H.W.; Alspach, D.L.: Recursive Bayesian estimation using Gaussian sums. Automatica, 7, (1971), 465–479.CrossRef Google Scholar

32Sovic, A.; Sersic, D.: Robustly adaptive wavelet filter bank using L1 norm, in Proc. 18th Intl. Conf. on Systems, Signals and Image Process., 2011.Google Scholar

33WAVELAB toolbox [Online]. Available at: https://statweb.stanford.edu/wavelab/. Last accessed on: Dec. 17, 2019.Google Scholar

34PhysioBank database [Online]. Available at: https://www.physionet.org/physiobank/database/aami-ec13/. Last accessed on: Dec. 17, 2019.Google Scholar

35Donoho, D.L.: Denoising by soft thresholding. IEEE Trans. Info. Theory., 41 (3) (1995), 613–627.CrossRef Google Scholar

36Mitzenmacher, M.; Upfal, E.: Probability and Computing: Randomized Algorithms and Probabilistic Analysis, Cambridge University Press, UK, 2005.CrossRef Google Scholar

37Chang, S.; Cosmon, P.C.; Milstein, L.B.: Chernoff-type bounds for the gaussian error function. IEEE Trans. Commun., 59 (11) (2011), 471–481.CrossRef Google Scholar

38Redner, R.; Walker, H.: Mixture densities, maximum likelihood and the EM algorithm. SIAM Rev., 26 (2) (1984), 195–239.CrossRef Google Scholar

39Rao, K.R.; Yip, P.: Discrete Cosine Transform: Algorithms, Advantages, Applications, Academic, Boston, MA, 1990.CrossRef Google Scholar

40Mallat, S.: A Wavelet Tour of Signal Processing, 3rd ed., Academic Press, Burlington, 2009.Google Scholar

41Pastor, D.; Socheleau, F.: Robust estimation of noise standard deviation in presence of signals with unknown distributions and occurrences. IEEE Trans. Signal Process., 60 (4) (2012), 1545–1555.CrossRef Google Scholar

42Boll, S.: Suppression of acoustic noise in speech using spectral subtraction. IEEE Trans. Acoust. Speech, Signal Process., 27 (2) (1979), 113–120.CrossRef Google Scholar

43Loizou, P.: Speech Enhancement – Theory and Practice, CRC Press, Boca Raton, 2007.CrossRef Google Scholar

44Kibria, B.M.G.; Joarder, A.H.: A short review of multivariate t-distribution. J. Stat. Res., 40 (1) (2006), 59–72.Google Scholar

45Johnstone, I.M.; Silverman, B.W.: Empirical Bayes selection of wavelet thresholds. Ann. Stat., 33 (4) (2005), 1700–1752.CrossRef Google Scholar

46Kudryavtsev, A.A.; Shestakov, O.V.: The asymptotic behavior of the optimal threshold minimizing the probability-of-error criterion. J. Math. Sci., 234 (6) (2018), 810–816.CrossRef Google Scholar

Fig. 2. (Color online) Input SNR requirement for SURE (black) and MPE (blue) to ensure that the probability of δ-perturbation of the minima is less than or equal to α.

Fig. 5. (Color online) The PE risk and its estimate averaged over 100 Monte Carlo trials for the subband shrinkage estimator versus a; where $\epsilon = \sqrt {k}\sigma$, with k=8. The additive noise is Gaussian with SNRin = 5 dB. In each trial, s is generated by drawing samples from $\mathcal {N}(2\times \bold 1_k, I_k)$.

Table 1. Comparison of MPE, SURE-based shrinkage estimator and Wiener filter (WF) for different input SNRs. The output SNR values are averaged over 100 noise realizations.

Fig. 6. (Color online) Denoising performance of the MPE- and SURE-based pointwise shrinkage estimators for the Piece-Regular signal corrupted by Laplacian noise. The PE risk is calculated by using a GMM approximation and setting ε = 3σ.

Fig. 8. (Color online) Average output SNR of the pointwise MPE shrinkage as a function of β = ε/σ, for different values of the input SNR. The output SNR curves peak when β ≈ 3.

Fig. 9. (Color online) Comparison of denoising performance of the subband shrinkage estimators using MPE and SURE, for the Piece-Regular signal corrupted by additive Gaussian noise. The subband size is taken as k = 16 and the value of ε is $1.75\sqrt {k}\sigma$.

Fig. 11. (Color online) The expected ℓ1-risk and its estimate versus a, averaged over 100 noise realizations for: (a) Gaussian noise with σ2 = 1; (b) three-component GMM; and (c) a four-component GMM approximation to the Laplacian distribution.

Algorithm 1: Iterative minimization of the expected ℓ1 distortion.

Fig. 12. (Color online) Shrinkage parameter profiles as a function of the a posteriori SNR, corresponding to different risk functions: (a) MPE, SURE, expected ℓ1 distortion; and (b) MPE for different values of ε. The shrinkage factor aopt is plotted on a log scale mainly to highlight the fine differences among various attenuation profiles.

Fig. 13. (Color online) Thresholding functions corresponding to different risk functions: (a) MPE, SURE, expected ℓ1 distortion; (b) MPE for various values of ε; and (c) MPE, MSE. In (a) and (b), the noise considered is Gaussian. In (c), both Laplacian and Gaussian noise types have been considered with σ = 1 and ε = 3σ.

Fig. 14. (Color online) Comparison of denoising performance as a function of number of observations for a Piece-Regular signal in Gaussian noise: (a) Output SNR corresponding to input SNR 5 dB; and (b) Output SNR as a function of the number of observations M and input SNR. The numerical values on the curves indicate the corresponding values of M. In both (a) and (b), the results are averaged over 100 independent noise realizations.

Fig. 15. (Color online) Performance of ℓ1-risk-minimization-based pointwise shrinkage estimator (Piece-Regular signal in Gaussian noise): (a) Output SNR versus iterations (input SNR 5 dB); and (b) Output SNR versus input SNR, averaged over 100 independent noise realizations. The number of iterations in Algorithm 1 is fixed at Niter = 20.

Fig. 16. (Color online) Performance of pointwise shrinkage estimator based on ℓ1 risk minimization: (a) Output SNR versus iterations, corresponding to the Piece-Regular signal contaminated by noise (SNR 5 dB) whose p.d.f. is given in Fig. 3(b); and (b) Output SNR versus input SNR (averaged over 100 independent noise realizations). The number of iterations in Algorithm 1 is set to Niter = 20.

Fig. 17. (Color online) Performance of pointwise shrinkage estimator obtained by ℓ1 risk minimization: (a) Output SNR versus iterations for the Piece-Regular signal in Laplacian noise with an input SNR of 5 dB; and (b) Output SNR versus input SNR for Niter = 20. In (a) and (b), the Laplacian distribution is modeled using a four-component GMM to calculate the ℓ1-risk estimate. The results displayed in (b) are obtained after averaging over 100 realizations.

Fig. 18. (Color online) Output SNR of various denoising algorithms, averaged over 100 noise realizations, corresponding to different input SNRs.

Table 2. Performance comparison of wavelet domain denoising of ECG signal. The output SNR values are averaged over 100 noise realizations.

Table 3. Performance comparison for wavelet domain denoising of HeaviSine signal. Results presented are averaged over 100 noise realizations.

Table 4. Performance comparison for wavelet domain denoising of the Piece-Regular signal. The output SNR values are averaged over 100 noise realizations.

Article contents

Signal denoising using the minimum-probability-of-error criterion

Abstract

Keywords

Information

1. INTRODUCTION

A) Prior art

B) This paper

II. THE PE RISK

A) PE risk for pointwise shrinkage

B) Closeness of $\widehat {\mathcal {R}}$ to $\mathcal {R}$

C) Perturbation probability of the location of minimum

D) Unknown noise distributions

E) PE risk for subband shrinkage

III. EXPERIMENTAL RESULTS FOR MPE-BASED DENOISING

A) Performance of pointwise-shrinkage estimator

1) Harmonic signal denoising

2) Piece-Regular signal denoising

3) Effect of ε on the denoising performance of MPE

B) Performance of subband MPE shrinkage

IV. THE EXPECTED ℓ 1 DISTORTION

A) Expected ℓ1 risk using GMM approximation

B) Optimum shrinkage versus posterior SNR

C) Shrinkage profiles of estimators based on MPE/ℓ1 vis-à-vis SURE

V. PERFORMANCE OF THE EXPECTED ℓ1 DISTORTION-BASED POINTWISE SHRINKAGE ESTIMATOR

A) Iterative minimization of the expected ℓ1-risk

VI. PERFORMANCE ASSESSMENT OF MPE AND ℓ1-RISK MINIMIZATION ALGORITHMS VERSUS BENCHMARK DENOISING ALGORITHMS

VII. CONCLUSIONS

ACKNOWLEDGMENTS

APPENDIX

A) PERTURBATION OF SURE-BASED POINTWISE SHRINKAGE

B) EXPECTED ℓ1 RISK FOR GMM

Footnotes

References

REFERENCES

Save article to Kindle

Save article to Dropbox

Save article to Google Drive

Reply to: Submit a response

Your details

You have entered the maximum number of contributors

Conflicting interests

IV. THE EXPECTED ℓ ₁ DISTORTION

A) Expected ℓ₁ risk using GMM approximation

C) Shrinkage profiles of estimators based on MPE/ℓ₁ vis-à-vis SURE

V. PERFORMANCE OF THE EXPECTED ℓ₁ DISTORTION-BASED POINTWISE SHRINKAGE ESTIMATOR

A) Iterative minimization of the expected ℓ₁-risk

VI. PERFORMANCE ASSESSMENT OF MPE AND ℓ₁-RISK MINIMIZATION ALGORITHMS VERSUS BENCHMARK DENOISING ALGORITHMS

B) EXPECTED ℓ₁ RISK FOR GMM