On the Performance Bound of Sparse Estimation with Sensing Matrix Perturbation

1 Introduction

The problem of sparse recovery from linear measurements has attracted a great deal of attention in recent years. Various practical sparse recovery algorithms have been proposed, and their performance in noisy scenarios has been theoretically analyzed. Most of these works consider only upper bounds on the estimation error. Theoretical lower bounds are also of great interest, because they set a limit that no sparse recovery algorithm can exceed. In addition, perturbed sensing matrices appear in many practical scenarios, so it is necessary to study the theoretical bounds of sparse recovery with a perturbed sensing matrix. One consequence of a perturbed sensing matrix is that it acts as multiplicative noise: the total noise on the measurement vector depends on the sparse vector itself, which introduces additional complexity compared to the perturbation-free scenario.

Dealing with the more general setting in which the sensing matrix is perturbed by additive random noise, we studied the theoretical lower bounds of sparse estimators via the constrained Cramér-Rao bound (CCRB) and the Hammersley-Chapman-Robbins bound (HCRB). Closed-form expressions of the CCRB are derived, and its quantitative behavior is discussed. For the HCRB, only the case of an identity sensing matrix is studied for simplicity, but the results are still instructive: the analysis is much simpler, yet it reveals much about the behavior of the theoretical lower bounds when the noise is large.

This work has been accepted for publication at IEEE Transactions on Signal Processing in the paper On the Performance Bound of Sparse Estimation with Sensing Matrix Perturbation by Yujie Tang, Laming Chen, and Yuantao Gu.

2 Problem Setting

In the case of general perturbation, the measurement vector is observed via a corrupted sensing matrix as
\mathbf{y}=(\mathbf{A}+\mathbf{E})\mathbf{x}+\mathbf{n},

where $\mathbf{x}$ is the deterministic parameter to be estimated, and $\mathbf{y}$ is the measurement vector. $\mathbf{E}$ represents the perturbation on the sensing matrix, whose elements are i.i.d. Gaussian random variables with zero mean and variance $\sigma_e^2$. The vector $\mathbf{n}\sim\mathcal{N}(0,\sigma_n^2\mathbf{U}_m)$ is the noise on the measurement vector $\mathbf{y}$ and is independent of $\mathbf{E}$, where $\mathbf{U}_m$ denotes the $m\times m$ identity matrix.
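As a concrete illustration, the measurement model can be simulated directly. The sketch below is a minimal NumPy example (the function name and dimensions are our own choices, not from the paper); note that each entry of $\mathbf{E}\mathbf{x}$ has variance $\sigma_e^2\|\mathbf{x}\|_2^2$, so the total noise level depends on $\mathbf{x}$ itself.

```python
import numpy as np

def perturbed_measurement(A, x, sigma_e, sigma_n, rng=None):
    """Draw one measurement y = (A + E) x + n from the model above.

    E has i.i.d. N(0, sigma_e^2) entries and n ~ N(0, sigma_n^2 I),
    drawn independently of each other.
    """
    rng = np.random.default_rng() if rng is None else rng
    m, n = A.shape
    E = sigma_e * rng.standard_normal((m, n))   # sensing-matrix perturbation
    noise = sigma_n * rng.standard_normal(m)    # additive measurement noise
    return (A + E) @ x + noise

# Example: a 10-dimensional 2-sparse vector observed through a 5x10 matrix.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 10))
x = np.zeros(10); x[0], x[3] = 1.0, -2.0        # support S = {0, 3}, |S| = 2
y = perturbed_measurement(A, x, sigma_e=0.1, sigma_n=0.1, rng=rng)
```

Averaging over many draws, the residual $\mathbf{y}-\mathbf{A}\mathbf{x}$ has per-entry variance $\sigma_n^2+\sigma_e^2\|\mathbf{x}\|_2^2$, exhibiting the $\mathbf{x}$-dependence of the total noise.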

The parameter $\mathbf{x}$ is assumed to be sparse, i.e. the size of its support is far less than its dimension. The support of $\mathbf{x}$ is denoted by $S$, and its size is assumed to satisfy $|S|=\|\mathbf{x}\|_{\ell_0}\leq s$. An estimator $\hat{\mathbf{x}}=\hat{\mathbf{x}}(\mathbf{y})$ is a function of the measurement vector and is essentially a random variable. In this work only unbiased estimators are considered, i.e. those that satisfy

\mathbb{E}[\hat{\mathbf{x}}(\mathbf{y})]=\mathbf{x},\quad\forall\,\mathbf{x}\in\mathcal{X}.

Here $\mathcal{X}$ denotes the set of all possible values of the parameter $\mathbf{x}$. In the sparse setting, the notation $\mathcal{X}_s$ is used for this set, which can be formulated as

\mathcal{X}_s=\{\mathbf{x}\in\mathbb{R}^n:\|\mathbf{x}\|_{\ell_0}\leq s\}.
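For concreteness, membership in $\mathcal{X}_s$ reduces to counting nonzero entries (the $\ell_0$ "norm"); a minimal check (our own illustrative helper, not from the paper) might read:

```python
import numpy as np

def in_sparse_set(x, s):
    """Membership test for X_s: true iff x has at most s nonzero entries,
    i.e. its l0 'norm' does not exceed s."""
    return np.count_nonzero(x) <= s

in_sparse_set(np.array([1.0, 0.0, -2.0, 0.0]), 2)  # True: only 2 nonzeros
```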

The set of all unbiased estimators will be denoted by $\mathcal{U}$. For every unbiased estimator, its MSE at a specific parameter value possesses a lower bound known as the Barankin bound (BB). Unfortunately, the BB often has no closed-form expression, or its computation is prohibitively complex. Consequently, two lower bounds of the BB, the constrained Cramér-Rao bound and the Hammersley-Chapman-Robbins bound, were studied for sparse estimation with general perturbation. As lower bounds of the BB, they can also be viewed as lower bounds on the MSE of unbiased estimators. Although they are not as tight as the BB, they usually possess simpler expressions and can provide insight into the properties of the BB. The following two sections interpret and discuss these two lower bounds in an intuitive way; please refer to Tang'13 for their detailed mathematical forms.

3 The Constrained CRB

The CCRB, which has a simple closed-form expression and is convenient to analyze, generalizes the original CRB to the case where the parameter is constrained to an arbitrary given set, e.g. the sparse parameter set. After relaxing the restriction on the estimators to unbiasedness in the neighborhood of a specific parameter value, we derived the CCRB for the cases where $\mathbf{x}$ has maximal and nonmaximal support. Interestingly, there is a gap between the maximal and nonmaximal support cases of the CCRB, which means that the CCRB is not a continuous function of the parameter $\mathbf{x}$.

This gap originates from the "discontinuity" of the restriction that the estimator should be unbiased in the neighborhood of a specific parameter value. The "neighborhood" of a parameter point having maximal support has an entirely different structure from that of a parameter point having nonmaximal support: the former is a subspace that is locally identical to $\mathbb{R}^s$, while the latter is a union of $s$-dimensional subspaces. Figure 1 is a geometric illustration of the structure of the neighborhood of $\mathbf{x}$. It can be seen that as the minimal entry approaches zero, the structure of the neighborhood of $\mathbf{x}$ will have an abrupt change: from being locally identical to $\mathbb{R}^s$ to being locally identical to $\mathbb{R}^n$. This is the cause of the gap between the maximal and nonmaximal cases.

Figure 1: A geometrical demonstration of the discontinuity between the neighborhood structures of the maximal and the nonmaximal support cases. The vectors $\mathbf{v}_i$ are base vectors of the neighborhood subspace $\mathcal{F}$.

On the other hand, if a stronger condition, global unbiasedness, is imposed on the considered estimators instead of unbiasedness in the neighborhood, i.e. the estimators are restricted to be unbiased for all sparse parameters, then this discontinuity should not occur. Thus the corresponding lower bound should also be continuous as $\mathbf{x}$ changes from having maximal support to having nonmaximal support, as will be revealed in the next section. It is further demonstrated that the CCRB for maximal support is not sufficiently tight for estimators in $\mathcal{U}$, especially when the support of $\mathbf{x}$ is nearly nonmaximal, i.e. at least one of the non-zero entries is small compared to the others.

Based on the above analysis, we further studied the asymptotic behavior of the CCRB. We found that as $s$ approaches infinity, the CCRB tends to the MSE of the oracle estimator, which knows the support set a priori. It can be concluded that as $s$ increases, less information can be extracted from the dependence of the total noise on $\mathbf{x}$ to help reduce the estimation error, and eventually this dependence can be ignored.
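To make the oracle estimator concrete: for an identity sensing matrix, a natural oracle simply reads off the measurements on the known support. The Monte Carlo sketch below is our own illustrative code (it uses this naive read-off oracle, not necessarily the paper's exact oracle estimator); under the model $\mathbf{y}=\mathbf{x}+\mathbf{E}\mathbf{x}+\mathbf{n}$, its MSE should come out close to $s(\sigma_n^2+\sigma_e^2\|\mathbf{x}\|_2^2)$, the total noise variance accumulated over the $s$ known entries.

```python
import numpy as np

def oracle_mse(x, support, sigma_e, sigma_n, trials=50_000, seed=0):
    """Monte Carlo MSE of the read-off oracle estimator under
    y = x + E x + n (identity sensing matrix, support known a priori)."""
    rng = np.random.default_rng(seed)
    n = x.size
    E = sigma_e * rng.standard_normal((trials, n, n))            # batch of perturbations
    Y = x + E @ x + sigma_n * rng.standard_normal((trials, n))   # batch of measurements
    err = (Y[:, support] - x[support]) ** 2                      # error only on the support
    return err.sum(axis=1).mean()

x = np.zeros(5); x[0] = 1.0                                      # s = 1, support {0}
mse = oracle_mse(x, [0], sigma_e=0.1, sigma_n=0.1)
# Should be close to 1 * (0.1**2 + 0.1**2 * 1.0) = 0.02.
print(mse)
```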

4 The Hammersley-Chapman-Robbins Bound

The constrained CRB given in the above section possesses a simple closed form, but it takes into account only the unbiasedness in the neighborhood of a specific parameter value rather than the global unbiasedness for all sparse vectors. Therefore it can be anticipated that for estimators that are globally unbiased for all sparse parameter values, the CCRB is not a very tight lower bound.

Motivated by this, we derived the HCRB for sparse estimation under the setting of general perturbation.
However, the calculation of the HCRB for a general sensing matrix $\mathbf{A}$ is much more complicated, and therefore attention is focused on the simplest case of an identity sensing matrix. Nevertheless, the HCRB in this simple case is still instructive for a qualitative understanding of the HCRB in general cases.

After involved theoretical work, we found that the HCRB is actually a family of lower bounds for unbiased estimators, the tightest of which is their supremum. However, a closed-form expression for the supremum of the HCRB family is often unattainable, so we chose a particular member of the family and obtained a corresponding HCRB that is simple and easy to analyze. We revealed that the "worst-case entry SNR" plays a central role in the transition from maximal to nonmaximal support. Furthermore, we found that the situation with general sensing perturbation is more complicated than that with only measurement noise. One consequence is that as the smallest entry approaches infinity, the HCRB does not in general tend to the CCRB of maximal support unless there is no sensing matrix perturbation.
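The "family of bounds, take the supremum" structure can be illustrated with the classical scalar Chapman-Robbins bound, which the HCRB generalizes. For estimating the mean of $\mathcal{N}(\theta,\sigma^2)$, each test offset $h$ yields a valid lower bound $h^2/(e^{h^2/\sigma^2}-1)$ on the variance of any unbiased estimator, and the supremum over $h$ (attained as $h\to 0$) recovers the CRB $\sigma^2$. This toy case is our own illustration, not the paper's HCRB, but it shows why a tractable member of the family may be chosen when the supremum has no closed form.

```python
import numpy as np

def chapman_robbins(h, sigma2):
    """Chapman-Robbins lower bound on the variance of an unbiased estimator
    of the mean of N(theta, sigma2), using test point theta + h. The
    chi-square divergence between N(theta+h, sigma2) and N(theta, sigma2)
    is exp(h^2/sigma2) - 1, which appears in the denominator."""
    return h**2 / np.expm1(h**2 / sigma2)

sigma2 = 1.0
hs = np.array([2.0, 1.0, 0.5, 0.1, 0.01])
bounds = chapman_robbins(hs, sigma2)
# Every h yields a valid lower bound; as h -> 0 the bound tightens
# toward the classical CRB, sigma2.
print(bounds)
```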

This phenomenon reveals that global unbiasedness has an essential effect in the sensing matrix perturbation case. Fortunately, the HCRB converges to the CCRB of nonmaximal support as the smallest entry approaches zero, so the gap between the maximal and nonmaximal cases is eliminated.

5 Numerical Results

We performed various numerical simulations to substantiate our theoretical results on the CCRB and HCRB. Here we only compare the derived bounds with the performance of existing estimators. Because the HCRB is derived only for the identity sensing matrix case, the following analysis and simulations focus on this particular situation. One of the estimators to be compared is the maximum likelihood (ML) estimator; it should be noted that when the noise is large, this estimator is severely biased. The other is the estimator given by Eldar et al. for the special case of $s=1$, which is globally unbiased.

Figure 2: Comparison of the performance of ML estimator and the unbiased estimator by Eldar et al. with HCRB and CCRB.

We test the case $s=1$ and $n=5$. The sparse vector $\mathbf{x}$ is set to $[1,0,\ldots,0]^\mathrm{T}$. The MSEs of the two estimators, together with the HCRB and CCRB, are numerically computed for different settings of $\sigma_n$ and $\sigma_e$. The results are shown in Figure 2. We first analyze the case $\sigma_e=0.1$, which represents a relatively small matrix perturbation. When $\sigma_n$ is small, the MSEs of the ML estimator and the unbiased estimator are both lower bounded by the HCRB and the CCRB. As $\sigma_n$ increases, the ML estimator undergoes an evident performance degradation before the unbiased estimator does. When $\sigma_n$ is considerably large, the MSE of the ML estimator is no longer lower bounded by the HCRB, owing to the severe bias of the ML estimator in such situations. The MSE of the unbiased estimator, in contrast, is always lower bounded by the HCRB.
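The shape of such an experiment can be sketched in a few lines. The code below uses a naive largest-entry estimator as a stand-in (it is neither the ML nor the Eldar et al. estimator, and the function names and $\sigma_n$ grid are our own) to show how MSE-vs-$\sigma_n$ curves are generated for $s=1$, $n=5$, $\mathbf{x}=[1,0,\ldots,0]^\mathrm{T}$ with an identity sensing matrix.

```python
import numpy as np

def largest_entry_estimate(Y):
    """Naive 1-sparse estimator (an illustrative stand-in, not the ML or
    Eldar et al. estimator): keep each row's largest-magnitude entry and
    zero out the rest."""
    rows = np.arange(Y.shape[0])
    idx = np.abs(Y).argmax(axis=1)          # estimated support per trial
    Xhat = np.zeros_like(Y)
    Xhat[rows, idx] = Y[rows, idx]
    return Xhat

def mse_sweep(x, sigma_e, sigma_ns, trials=20_000, seed=0):
    """Monte Carlo MSE over a grid of sigma_n values, identity sensing
    matrix: y = x + E x + n."""
    rng = np.random.default_rng(seed)
    n = x.size
    out = []
    for sigma_n in sigma_ns:
        E = sigma_e * rng.standard_normal((trials, n, n))
        Y = x + E @ x + sigma_n * rng.standard_normal((trials, n))
        out.append(((largest_entry_estimate(Y) - x) ** 2).sum(axis=1).mean())
    return np.array(out)

x = np.zeros(5); x[0] = 1.0
print(mse_sweep(x, sigma_e=0.1, sigma_ns=[0.05, 0.1, 0.2]))  # MSE grows with sigma_n
```

Plotting such curves against the HCRB and CCRB over the same $\sigma_n$ grid yields comparisons of the kind shown in Figure 2.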

For the case $\sigma_e=1$, it can be seen that when $\sigma_n$ is small, the MSEs of both the ML estimator and the unbiased estimator are much larger than the HCRB due to the large matrix perturbation level. As explained before, the ML estimator is severely biased when the noise level is large, and thus for certain values of $\sigma_n$, its MSE will not be lower bounded by the HCRB. For the unbiased estimator, its MSE is always lower bounded by the HCRB.

6 Future Works

There are several future directions to be explored. First, the HCRB is obtained only for the identity sensing matrix case, which is useful for qualitative comprehension but is clearly limited. In many cases the sensing matrix cannot be assumed to be the identity, and a more precise quantitative study of the performance bound requires generalizing the HCRB to the general sensing matrix case. There are two practical approaches to this problem: one is to derive a closed-form lower bound that is convenient to analyze and understand; the other is to find a tractable way to compute the lower bound numerically. Both approaches need further research.

Second, the HCRB provides a lower bound for globally unbiased estimators. However, recovery algorithms are quite likely to be biased in the sparse setting, and the bias usually depends on the noise variance. Moreover, there are occasions where biased estimators achieve a lower MSE than unbiased ones. These issues point out another possible direction for future study.