RBF-Network Based Sparse Signal Recovery Algorithm for Compressed Sensing Reconstruction

Vidya L. a, Vivekanand V. a,1, Shyamkumar U. a, Deepak Mishra b

a Vikram Sarabhai Space Centre, ISRO, Thiruvananthapuram, India
b Indian Institute of Space Science and Technology, Department of Space, Thiruvananthapuram, India

1 Corresponding author: v [email protected], +918547569592, VSSC, ISRO, Thiruvananthapuram, India.

Abstract

The approach of applying a cascaded network consisting of radial basis function nodes and a least-square-error minimization block to Compressed Sensing recovery of sparse signals is analyzed in this paper, with the aim of improving the computation time and convergence of an existing ANN-based recovery algorithm. The proposed radial basis function - least square error projection cascade network for sparse signal recovery (RASR) uses smoothed L0-norm optimization, L2 least-square-error projection and a feedback network model to improve the signal recovery performance over the existing CSIANN algorithm. The use of an ANN architecture in the recovery algorithm gives a marginal reduction in computational time compared to an existing L0-relaxation-based algorithm, SL0. The simulation results and experimental evaluation of the algorithm performance are presented here.

Key words: Compressed Sensing; radial basis function; convergence rate; measurement matrix

1 Introduction

Compressed Sensing (CS) is a new framework for signal acquisition and smart sensor design. CS-based acquisition can be used to undersample a sparse signal without losing its information content. A scenario where CS becomes an effective data acquisition technique is when the communication bandwidth is limited. Different from classical methods, where the Nyquist-rate sampled signal is compressed prior to transmission, CS optimizes the acquisition process itself and reduces the number of samples [1]. For signals like Synthetic Aperture Radar, where fine-resolution signal characteristics can be captured only at a high sampling rate that may not always be feasible because of engineering or timing constraints, CS-based acquisition and recovery techniques can be used to capture these signals at a sampling rate well below the recommended Nyquist rate [2]. In general, CS can be used in applications where the signal can be obtained only from a linear projection space [3]. Recent research on compressed sensing has widened the scope of its application from biomedical imaging to astronomy [4]; for each application domain, unique recovery algorithms are designed specifically to meet the requirements. CS recovery algorithms draw their theoretical foundation from various techniques of function optimization, regression, iterative methods, machine learning and artificial neural networks (ANN). The objective of developing new CS recovery algorithms has shifted from general algorithms, which deal with all kinds of data, to application-specific algorithms that take advantage of the inherent features of the signals under consideration. To mention one such development, the paper [5] describes a CS recovery algorithm designed with extensive assumptions on the signal timing features and pulse characteristics. As more and more signal restrictions are imposed in each application-specific algorithm, the performance comparison of these algorithms becomes increasingly difficult just by evaluating the results. A comprehensive analysis of various CS recovery algorithms for telemetry applications is presented in the paper [6]. The paper [7] describes the CSIANN algorithm for Compressed Sensing reconstruction using an Artificial Neural Network (ANN) architecture, which is a modified version of the L0-LMS algorithm presented in the paper [8]. In this paper we propose the "radial basis function least square error projection cascade network for sparse signal recovery" (RASR), an iterative signal estimation technique based on a radial basis function (RBF) network and an ANN architecture; the development of the RBF-based algorithm takes inspiration from the paper [9].

The rest of the paper is organized as follows. In section 2 the Compressed Sensing theory [10] is summarized. In section 3, some of the existing benchmark CS recovery algorithms, basis pursuit [11], iterative reweighted least squares [12] and smoothed L0 [9], are summarized for completeness of the discussion. Also, the ANN-based CS recovery algorithm CSIANN [7] is presented for comparison of the architecture. In section 4 the proposed algorithm, the "radial basis function least square error projection cascade network for sparse signal recovery" (RASR), is presented. Section 5 presents the simulation, the performance enhancement and the application results of RASR on synthetic and naturally sparse signals. A brief discussion on the advantages of RASR and the conclusion about the research is given in section 6.

2 Compressed Sensing Theory

The sampling theorem states that to avoid information loss and aliasing while sampling any signal, the signal must be sampled at least two times faster than its bandwidth. In many emerging applications such as magnetic resonance imaging and synthetic aperture radar, the abundance of data generated by the data acquisition systems at the high Nyquist sampling rate demands compression prior to storage or transmission, in order to save communication bandwidth. CS combines sampling and compression into a single process and uses non-adaptive linear projections to preserve the time-domain characteristics of the signal; the signal is later reconstructed from these projections using optimization techniques [13]. The transformation of CS from an algebraic theory to a practical technique for sparse signal processing began with the work of Donoho, Candes, Romberg, Tao and others. Their work on CS theory shows that a finite-dimensional signal having a sparse or compressible representation can be recovered from a small set of linear, non-adaptive measurements [14].

Consider an arbitrary signal $x \in \Re^N$ and suppose there exists a suitable basis $B$ that provides a $K$-sparse representation $\alpha$ of the signal, $x = B\alpha$, such that

$\alpha := B^{-1} x, \quad \|\alpha\|_0 = K, \quad \alpha \in \Re^N, \quad B \in \Re^{N \times N}$    (i)

Based on CS theory, the $K$ significant samples of $\alpha$ or the $N$ complete samples of $x$ need not be measured directly; rather, it is sufficient to measure and encode $M \ll N$ projections of the signal from its sparse representation $\{\alpha_i, i = 1 \ldots N\}$ onto a linear projection space:

$y := \{\langle A_j, \alpha \rangle\} \quad \forall j = 1 \ldots M, \quad A_j \in \Re^N$    (ii)

In matrix notation, the measurement is $y = AB^{-1}x$, where $y \in \Re^M$ is the CS measurement, $A \in \Re^{M \times N}$ is the CS measurement matrix (sensing matrix), $A_j$ are the rows of $A$, and $B$ is the basis transformation matrix. When the signal $x$ is $K$-sparse in the sampling domain ($\|x\|_0 = K$), the basis transformation of $x$ to obtain a sparse representation is not required ($B = I$), and the compressed sensing measurement simplifies to the following form:

$y = Ax, \quad x \in \Re^N, \quad A \in \Re^{M \times N}$    (iii)

In the following discussion and mathematical formulation of compressed sensing and recovery algorithms we consider this time-domain sparse signal case. (A detailed analysis of CS theory and algorithms can be found in the reference books [10] and [13].) Since $M \ll N$, the recovery of the signal from the under-determined set of measurements is an ill-posed problem in general. However, the additional assumption about the sparse nature of the signal makes the recovery possible and practical. The necessary conditions for recovery of the original signal from compressed sensed data are given in [15]. It is shown that the signal recovery becomes possible when the columns of the measurement matrix are i.i.d. Gaussian vectors and the number of measurements $M$ satisfies the lower limit in equation (iv):

$M \geq O(K \log(N/K))$    (iv)

The paper [16] describes the necessary properties of the measurement matrix $A$, such as the restricted isometry property (RIP). This is represented as equation (v), where $\delta_K$ is a small constant that restricts the upper bound of the variation of the Euclidean norm of the signal $x$ when it is projected using the matrix $A$ and measured as $y$ [17]:

$(1 - \delta_K)\|x\|_2^2 \leq \|Ax\|_2^2 \leq (1 + \delta_K)\|x\|_2^2$    (v)

The second condition to be satisfied is the null-space property of the signal, represented as $\Sigma_{2K} \cap \mathrm{Null}(A) = \emptyset$, where $\Sigma_{2K}$ is the set of all $2K$-sparse signals $\{x\}_{2K}$; this restricts the set of all $K$-sparse signals $\{x\}_K$ that can be measured using the measurement matrix $A$. There is also a lower limit on the smallest number of linearly dependent columns of $A$, designated as the spark:

$\mathrm{spark}(A) > 2K$    (vi)

The condition given in equation (vi) is very difficult to compute in real applications. In many cases it is preferable to use properties of $A$ that are easily computable to provide more concrete recovery guarantees [18]. The coherence of the matrix is one such property and is computed according to equation (vii), where $A_i$, $A_j$ are column vectors of $A$:

$\mu(A) = \max_{i \neq j} \dfrac{|\langle A_i, A_j \rangle|}{\|A_i\|_2 \|A_j\|_2}, \quad 1 \leq i, j \leq N$    (vii)

If all of these conditions are satisfied, then compressed sensing theory guarantees that the $K$-sparse signal $x$ can be acquired according to equation (iii) and that the original signal $x$ can be recovered from the measurement $y$ using the optimization $x = \arg\min_x \|x\|_0$, subject to $y = Ax$.
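As a concrete illustration of the acquisition model (iii) and the coherence measure (vii), the following minimal Python/NumPy sketch generates a K-sparse signal, an i.i.d. Gaussian sensing matrix and the compressed measurement (the dimensions follow the test condition used later in section 5; the variable names and the choice K=5 are ours):

    import numpy as np

    rng = np.random.default_rng(0)
    N, M, K = 50, 30, 5                        # N-length signal, M measurements, sparsity K

    # K-sparse signal: K non-zero entries at random support locations (eq. (iii) case, B = I)
    x = np.zeros(N)
    support = rng.choice(N, size=K, replace=False)
    x[support] = rng.standard_normal(K)

    # i.i.d. Gaussian sensing matrix and compressed measurement y = Ax
    A = rng.standard_normal((M, N)) / np.sqrt(M)
    y = A @ x                                  # M << N linear projections

    # mutual coherence of A (eq. (vii)): largest normalized inner product of distinct columns
    G = A.T @ A
    norms = np.linalg.norm(A, axis=0)
    C = np.abs(G) / np.outer(norms, norms)
    np.fill_diagonal(C, 0.0)
    print("coherence mu(A) =", C.max())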

3 Recovery Algorithms

Consider the problem

$(L0): \quad \min_x \|x\|_0 \quad \text{subject to} \quad y = Ax$    (1)

The unknown $x$ is composed of two effective parts: (i) the support of the vector $x$, $\mathrm{supp}(x) = \{i : x_i \neq 0\}$, and (ii) the non-zero values over these support locations. Two alternatives for finding the solution of the above problem are:

(1) Greedy algorithms: here the focus is on finding the signal support; the non-zero values of $x$ at the support locations are then easily determined by the least squares method. These methods build up an approximation one step at a time by making locally optimal choices at each step.

(2) Relaxation methods: this second class of methods does not consider the signal support and takes the unknown $x$ as a vector ($x \in \Re^N$). By a smooth approximation of the objective norm function $\|x\|_0$, the solution of $\min \|x\|_0$ can be obtained using continuous function optimization techniques. These techniques solve a convex optimization problem whose minimizer approximates the target vector $x$ [13].

We focus our research on ANN-based relaxation algorithms; however, for completeness, some of the existing norm minimization algorithms are discussed here. The difficulty with the above problem (1) is the L0 norm. Basis Pursuit (BP) [11] replaces the L0 norm with the L1 norm to make the problem easier to work with:

$(BP): \quad \min_x \|x\|_1 \quad \text{subject to} \quad y = Ax$    (2)

Under the right conditions it can be guaranteed that BP will find a sparse solution or even the sparsest solution. The Lasso [19] is similar to BP and is in fact known as Basis Pursuit De-Noising (BPDN) in some areas [13]. The Lasso, rather than trying to minimize the L1 norm as BP does, places a restriction on its maximum value. There are differences between the two formulations: BP works for under-determined systems only, while the Lasso is more tailored to over-determined systems by minimizing the squared reconstruction error $\|y - Ax\|$ rather than constraining it to be equal to zero.

$(\text{Lasso}): \quad \min_x \|Ax - y\|_2 \quad \text{subject to} \quad \|x\|_1 < \lambda$    (3)
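The BP problem (2) can be recast as a linear program. The short sketch below (Python/NumPy with SciPy's linprog) shows the standard positive/negative split; it is an illustrative reformulation, not the L1-magic solver used later in the experiments:

    import numpy as np
    from scipy.optimize import linprog

    def basis_pursuit(A, y):
        """Solve min ||x||_1 s.t. Ax = y by the standard split x = u - v with u, v >= 0."""
        M, N = A.shape
        c = np.ones(2 * N)                       # objective: sum(u) + sum(v) = ||x||_1
        A_eq = np.hstack([A, -A])                # A (u - v) = y
        res = linprog(c, A_eq=A_eq, b_eq=y, bounds=[(0, None)] * (2 * N), method="highs")
        u, v = res.x[:N], res.x[N:]
        return u - v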

It is proved that if BP finds a solution $x$ and if the L0 norm of $x$ satisfies the condition $\|x\|_0 < 0.5\,(\mu(A)^{-1} + 1)$, then $x$ is the sparsest possible solution, where $\mu(A)$ is the coherence of the matrix $A$ [11]. BP can be thought of as a least squares problem with an L1 regularizer. Another relaxation algorithm is the Iterative Re-weighted Least Squares (IRLS) algorithm; this method is also called FOCUSS (FOcal Under-determined System Solver) [12]. The main principle of this approach is to smooth the L0 norm and replace it with a continuous function that can be handled using classic optimization techniques. One of the most successful approaches is to substitute the L0 norm with the Lp norm for some fixed $p \in (0, 1]$. In this iterative algorithm, given a current approximate solution $x(k-1)$, set $X_{k-1} = \mathrm{diag}(|x(k-1)|^q)$. If the matrix is invertible, then $\|X_{k-1}^{-1} x(k-1)\|_2^2 = \|x(k-1)\|_{2-2q}^{2-2q}$: every entry in $x$ is raised to the power $2-2q$ and then they are summed. Thus, by choosing $q = 1 - p/2$, this expression imitates the Lp norm $\|x\|_p^p$.

$(\text{IRLS}): \quad \min_x \|X_{k-1}^{\dagger} x\|_2^2 \quad \text{subject to} \quad y = Ax$    (4)

This problem is solvable using standard linear algebra, and the corresponding Lagrangian is given in equation (5):

$\mathcal{L}(x) = \|X_{k-1}^{\dagger} x\|_2^2 + \lambda^T (y - Ax)$    (5)

If $X_{k-1}^{\dagger}$ is invertible, this implies that $X_{k-1}^{\dagger}$ has no zero entries on the main diagonal, and thus $(X_{k-1}^{\dagger})^{-1} = X_{k-1}$ and $x(k) = 0.5\,X_{k-1}^2 A^T \lambda$. Plugging this into the constraint $Ax = y$ gives (6):

$x(k) = X_{k-1}^2 A^T \left(A X_{k-1}^2 A^T\right)^{\dagger} y$    (6)

The IRLS algorithm starts with the approximation $x(0) = \mathbf{1}$ and the initial weight matrix $X_0 = I$, the identity matrix. The iterations solve (6) either directly or iteratively. The second step in the iteration is to update the diagonal weights of the matrix $X_k$ using $x(k)$ ($X(j, j) = |x(j)|^{1-p/2}$). The iteration stops when $\|x(k) - x(k-1)\| < \epsilon$, where $\epsilon$ is the convergence limit.

Another algorithm which solves the L0 minimization problem (1) directly is the Smoothed L0 algorithm (SL0) [9]. The SL0 algorithm approximates the L0 norm of the vector $x = [x_1 \ldots x_N]^T$ using a smooth spline function $F_\sigma(x)$ and minimizes it for a small value of $\sigma$, where $\sigma$ determines the quality of the approximation. However, for small values of $\sigma$ there is a possibility of a local-minimum solution. To avoid these local minima, the iteration starts with a large $\sigma$ and gradually reduces it to a minimum value using a steepest descent approach. The minimizer of $F_{\sigma(k)}(x)$ is used as the initial value for the next iteration with a smaller $\sigma(k+1)$. Since the value of $\sigma(k+1)$ has only slightly decreased, the minimizer of $F_{\sigma(k)}(x)$ is not too far from the minimizer of $F_{\sigma(k+1)}(x)$. SL0 is generally implemented using the Gaussian function $F_\sigma(x) = 1 - \exp(-x^2/2\sigma^2)$ as the approximation of the L0 norm. Since the minimization of $1 - \exp(-x^2/2\sigma^2)$ is equivalent to the maximization of $\exp(-x^2/2\sigma^2)$, the SL0 algorithm is represented as (7):

$(\text{SL0}): \quad \max_x \sum_{i=1}^{N} \exp\left(\dfrac{-x_i^2}{2\sigma(k)^2}\right) \quad \text{subject to} \quad y = Ax, \quad \sigma(k) \rightarrow 0$    (7)
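A minimal sketch of the SL0 iteration implied by (7) is given below (Python/NumPy; the step size, the number of inner iterations and the sigma schedule are illustrative choices, not the settings of [9]):

    import numpy as np

    def sl0(A, y, sigma_decrease=0.5, sigma_min=1e-4, L=3, mu=2.0):
        """Smoothed-L0 sketch: ascent on sum(exp(-x^2/2sigma^2)) plus projection onto Ax = y."""
        A_pinv = np.linalg.pinv(A)
        x = A_pinv @ y                                # minimum-L2 starting point
        sigma = 2.0 * np.max(np.abs(x))
        while sigma > sigma_min:
            for _ in range(L):
                x = x - mu * x * np.exp(-x**2 / (2 * sigma**2))   # steepest-ascent step on the surrogate
                x = x - A_pinv @ (A @ x - y)                      # project back onto the constraint set
            sigma *= sigma_decrease
        return x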

3.1 ANN Based CS Recovery Algorithm

Artificial neural networks are similar to biological neural networks in the way they process input signals and generate output signals. ANN-based algorithms for function approximation and signal estimation are emerging research topics because of their advantages such as adaptability, fault tolerance and self-organization. ANN algorithms have applications in signal processing, pattern recognition, machine learning and control system design. The simplest ANN architecture can be mathematically represented as

$(f_{ANN}): \quad X \rightarrow Y, \quad X \in \Re^N, \quad Y \in \Re^M$    (8)

where $f$ is a nonlinear weighted sum $f(x) = T\left(\sum_i w_i\, g(x_i)\right)$ and $T$ is an output activation function. The network is assigned a cost function $g(x)$ depending on the problem to be solved, and the weights $w_i$ in the network are updated to optimize the cost function based on the gradient descent rule. Based on the various advantages of ANNs, the paper [7] suggests ANN implementations of two sparse recovery algorithms, CSANN and CSIANN. The algorithm CSANN, which uses the error function (9), gives only the L2 solution of the L0 problem and takes a very large number of iterations (> 50000) to converge to a good recovery precision (< 10^-5):

$E_{CSANN} = \dfrac{1}{2}\left(y_j - \sum_{i=1}^{N} A_{ji} x_i\right)^2, \quad j = 1 \ldots M$    (9)

A modified version of the algorithm, named CSIANN, is in fact an ANN interpretation of the L0-LMS adaptive algorithm described in the paper [8], obtained by redefining the error function given in equations (17-19) of [8] as the output error of the ANN with the L0 norm as an optimization penalty function (10):

$E_{L0\text{-}LMS} = \dfrac{1}{2}\left(y_j - \sum_{i=1}^{N} A_{ji} x_i\right)^2 + \gamma \|x\|_0, \quad j = 1 \ldots M \ \text{and} \ \gamma \neq 0$    (10)

This ANN interpretation of L0-LMS is presented here (Algorithm 1) for comparison and completeness of the discussion. The single-layer perceptron model adopted for this algorithm has $N$ inputs and one output. According to the theory of CS, $y = Ax$, $x \in \Re^N$, $y \in \Re^M$; for the ANN, $x$ is used as the weight coefficients of the perceptron, $A_j$, the row vector of $A \in \Re^{M \times N}$, is the input, and $y_j$ is the reference threshold value of the perceptron. Each $y_j$ is a correlation between $x \in \Re^N$ and $A_j$. The ordered set $\langle A_j, y_j \rangle$, $j = 1 \ldots M$, is used for adjusting the weight coefficients $x$ based on the gradient descent method, and the output error decision rule (10) uses the least squares method. The parameter $\gamma$ balances the penalty function, but the computation of this function is an NP-hard problem and hence it is approximated by $\|x\|_0 \approx \sum_{i=1}^{N} \left(1 - e^{-\beta |x_i|}\right)$. Using the gradient descent method, the new weight coefficient vector $x(k+1)$ is updated as given in equation (11), which is equation (20) of [8]:

$x(k+1) = x(k) + \mu\,(y_j - A_j x(k))\,A_j - \mu\gamma\beta\, e^{-\beta|x(k)|}\, \mathrm{sgn}(x(k))$    (11)

The learning factor $\mu$ controls the rate of change of the weight coefficients $x$. The parameters $\beta$ and $\gamma$ are given empirical values ($\beta = 10$ and $\gamma = 0.01$). In the implementation of the algorithm given in [7], the function $\exp(-\beta|x|)$ is approximated as $1 - \beta|x|$ when $|x| \leq 1/\beta$ and 0 elsewhere. The iteration given in equation (11) is carried out until $\|x(k+1) - x(k)\| < \epsilon$, where $\epsilon$ determines the convergence of the weight coefficients. For example, to get a recovery precision of $\|x^* - x_{org}\| < 10^{-8}$, $\epsilon$ should be $< 10^{-5}$, and the algorithm takes more than 20000 iteration cycles to converge. However, when $\epsilon < 10^{-6}$, it was observed that the iteration might sometimes remain at a local minimum, making convergence impossible. To avoid this, the iterations need to be stopped after a predetermined number of cycles $N_{max}$, but this value has to be extremely high, making this a computationally costly algorithm.

Task: $\min_x \|x\|_0$ subject to $y = Ax$
Input: $A$, $y$, $N_{max}$, error threshold $\epsilon$
Initialization: set $\mu$, $\gamma$, $\beta$, $K = \mu\gamma$, $x(0) = A^{\dagger} y$, $k = 1$, $j = 1$
while $\|x(k) - x(k-1)\|_2 > \epsilon$ or $k < N_{max}$ do
    feed samples $A_j$ and $y_j$ to the ANN
    $g_k = \mu\,(y_j - A_j x(k))\,A_j$
    $f_\beta(x(k)) = \beta\,\mathrm{sgn}(x(k))\, e^{-\beta|x(k)|}$
    $x(k+1) = x(k) + g_k - K f_\beta(x(k))$
    Update $k = k+1$, $j = \mathrm{mod}(k-1, M) + 1$
end
Output: $x(k)$

Algorithm 1: L0-LMS algorithm description

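A compact sketch of the update (11), following the structure of Algorithm 1, is shown below (Python/NumPy; the exact exponential penalty is used here rather than the piecewise-linear approximation of [7], and the parameter values are the empirical ones quoted above):

    import numpy as np

    def csiann_l0lms(A, y, mu=0.01, gamma=0.01, beta=10.0, eps=1e-5, n_max=20000):
        """Sketch of the L0-LMS / CSIANN update (eq. (11)), cycling over the rows of A."""
        M, N = A.shape
        x = np.linalg.pinv(A) @ y              # x(0) = A† y
        for k in range(n_max):
            j = k % M                          # feed one <A_j, y_j> pair per iteration
            x_new = (x + mu * (y[j] - A[j] @ x) * A[j]
                       - mu * gamma * beta * np.exp(-beta * np.abs(x)) * np.sign(x))
            if np.linalg.norm(x_new - x) < eps:
                return x_new
            x = x_new
        return x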
4 RBF Cascade Network Based CS Recovery

Radial basis function (RBF) networks are widely used as a regression framework for function approximation, and they resemble neural networks in architecture. It has been proved in [20] that RBF networks can be used for universal function approximation; this property is attributed to the linear superposition of localized basis functions used in these networks. Also, interpolation using the RBF kernel is well established [21]. It is used to perform a non-linear mapping from the input space to the hidden space. The interpolation problem can be stated as follows: given a set of $N$ different points $\{x_i \in \Re^M \mid i = 1 \ldots N\}$ and a corresponding set of real numbers $\{d_i \in \Re \mid i = 1 \ldots N\}$, find the function $f: \Re^M \rightarrow \Re$ that satisfies the interpolation condition $f(x_i) = d_i$, $i = 1 \ldots N$; for strict interpolation, the interpolation surface is constrained to pass through all the points [22]. The radial basis function approach optimizes a suitable function $f$ that has the following form:

$f(x) = \sum_{i=1}^{N} w_i\, \phi(\|x - x_i\|)$    (12)

where $\phi(\|x - x_i\|)$ is a set of $N$ arbitrary nonlinear functions known as radial basis functions. The aim is to design a suitable positive definite and non-singular interpolation matrix to obtain the weight vector $w = \Phi^{-1} d$. If $\{x_i\}_{i=1}^{N}$ is a set of distinct points in $\Re^M$, then the $N \times N$ interpolation matrix $\Phi$, whose $(i,j)$-th element is $\Phi_{i,j} = \phi(\|x_i - x_j\|)$, is non-singular [23]. There are many radial basis functions which satisfy the requirement of non-singularity; among them we have chosen the Gaussian kernel $\phi(x) = \exp(-x^2/2\sigma^2)$ for some $\sigma > 0$. Moreover, this Gaussian kernel is a localized function and hence is also positive definite. These two important properties, the non-singularity and the positive definiteness of the Gaussian RBF kernel, made us choose it for our application. A further reason is that optimization/regularization with the RBF kernel is equivalent to minimizing a quadratic cost function, which results in a smooth solution.

A new CS recovery algorithm using a cascade network for L0 minimization is proposed. This algorithm applies a radial basis function, least square error minimization, gradient projection [24] and a feedback network structure to reconstruct the sparse signal from compressed sensed data. The signal estimation error $(y - Ax)$ of the least square error minimization block is added to the RBF-approximated L0 norm function to accelerate the convergence. The proposed algorithm is named the Radial basis function least square error cascade network for Sparse signal Recovery (RASR). Here the $\|x\|_0$ norm is approximated using a linear combination of Gaussian basis functions (13), and this function is also used as the activation function of the first ANN layer:

$f_\sigma(x) = 1 - \exp\left(-x^2/2\sigma^2\right)$    (13)

For a vector $x = [x_1 \ldots x_N]^T$, the L0 norm of $x$ is determined as $\|x\|_0 = \sum_{i=1}^{N} v(x_i)$, where $v(x_i) = 1$ if $x_i \neq 0$ and $v(x_i) = 0$ if $x_i = 0$. Replacing $v(x_i)$ with the Gaussian functions $\lim_{\sigma \rightarrow 0} f_\sigma(x_i) = 1 - \exp(-|x_i^2|/2\sigma^2)$ gives a smooth and continuous function approximation of the L0 norm. The minimization of the L0 norm (13) is equivalent to the maximization of $\exp(-|x^2|/2\sigma^2)$. For small values of $\sigma$, the maximizer of $\exp(-|x^2|/2\sigma^2)$ subject to $A_j x = y_j$, $j = 1 \ldots M$, is the minimum L0 norm solution. In the first layer of RASR this function is used for approximating $x$. At each iteration, the value of $\sigma$ determines the quality of the L0 norm approximation at that level. The minimum L0 solution using RASR is defined in equation (14):

$(\text{RASR}): \quad \max_x \sum_{i=1}^{N} \exp\left(\dfrac{-x_i^2}{2\sigma^2}\right) \quad \text{s.t.} \quad \|y_j - A_j x\|_2 = 0, \ j = 1 \ldots M, \ \sigma \rightarrow \sigma_{min}$    (14)

Incorporating the Gaussian RBF function with decreasing $\sigma$ in the iterative process determines the higher amplitude components of $x$ and their support locations at the beginning, and gradually determines the low amplitude components and their support locations as the algorithm converges ($\sigma \rightarrow \sigma_{min}$). Thus the RASR algorithm's signal approximation changes from a coarse to a finer estimation. Figure 1 shows the variation in the signal support values for various $\sigma$.

Fig. 1. Signal support determined using the radial basis function $1 - \exp(-x^2/2\sigma^2)$ for $\sigma$ values of 1, 0.1 and $10^{-6}$. The x-axis is the value of the components of $x$ and the y-axis is the corresponding support value. When the value of $\sigma$ is high, only high amplitude values are given support 1, and the smaller values are given fractional support. When $\sigma \rightarrow \sigma_{min}$, all non-zero values are given support 1.

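The support curves of figure 1 can be reproduced with a few lines (Python/NumPy; purely illustrative):

    import numpy as np

    # Support value 1 - exp(-x^2 / 2 sigma^2) assigned to each component, as in figure 1:
    # large sigma -> only large-amplitude entries get support close to 1,
    # sigma -> 0  -> every non-zero entry gets support close to 1.
    x = np.linspace(-1.0, 1.0, 401)
    for sigma in (1.0, 0.1, 1e-6):
        support = 1.0 - np.exp(-x**2 / (2.0 * sigma**2))
        print(f"sigma={sigma:g}: support at x=0.05 -> {support[np.argmin(np.abs(x - 0.05))]:.3f}")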
In this new RBF network architecture we have used the smoothed L0 minimization technique with the Gaussian function $1 - \exp(-|x^2|/2\sigma^2)$ to accelerate the computational speed and convergence rate compared to the CSIANN / L0-LMS algorithm. Figure 2 illustrates the proposed two-layer cascade network architecture of RASR. The first layer provides a coarse support estimation of $x(k)$ using the continuous function approximation of the L0 norm, and the second layer gives a finer magnitude estimation of $x(k)$ using L2 minimization of the recovery error $\frac{1}{2}\|y_j - A_j x\|_2^2$ for $j = 1 \ldots M$.

Fig. 2. The two-layer cascade network architecture of RASR. The measurement matrix A and the CS measurement y are the inputs of the network. The RBF network block is initialized using the initial weights x(0), and σ0 is the initial value of the RBF span. The parameter δ determines the rate of convergence.

4.1 Objective of the RASR algorithm

(1) L0 norm approximation using an RBF function, to obtain accelerated convergence and minimum reconstruction error compared to the CSIANN / L0-LMS algorithm.
(2) Element-wise L2 mean squared error, to reduce the computational cost involved in the computation of the matrix inverse compared to the SL0 algorithm.

4.2 Optimization problem

In this new approach, the cascaded minimization of the problem defined in (14) using the weighted $\|x\|_0$ norm followed by L2 minimization of the signal recovery error is used, and the problem is redefined as (M+1) separate unconstrained problems:

$L: \quad \sigma^2 \min \|x\|_0 \quad \text{subject to} \quad \sigma \rightarrow 0$    (15)

$E: \quad \min \dfrac{1}{2}\|y_j - A_j x\|_2^2, \quad j = 1 \ldots M$    (16)

The solution to each part is computed separately in the two stages of the cascade network. Approximating the L0 norm with the RBF function $1 - \exp(-|x^2|/2\sigma^2)$ gives

$L = \sigma^2\left(N - \sum_{i=1}^{N} \exp\left(\dfrac{-x_i^2}{2\sigma^2}\right)\right)$    (17)

The new updated value of $x$ based on L0 minimization is determined using the gradient of the objective function $L$:

$\dfrac{\partial L}{\partial x_i} = x_i \exp\left(\dfrac{-x_i^2}{2\sigma^2}\right), \quad i = 1 \ldots N$    (18)

$x_i(k+1) = x_i(k) - \mu\, x_i(k) \exp\left(\dfrac{-x_i^2(k)}{2\sigma_k^2}\right)$    (19)

where $\mu$ is a positive constant and the index $i = 1 \ldots N$ describes the computation for every component of the vector $x \in \Re^N$. The first layer of the network minimizes $x$ by iterating according to equation (19). The span of the radial basis function $\exp(-|x^2|/2\sigma^2)$ is reduced in each step of the iteration according to equation (20):

$\sigma_{k+1} = \delta \sigma_k, \quad \delta \in (0, 1), \quad \sigma_{k+1} \geq \sigma_{min}$    (20)
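A minimal sketch of the first-layer update (19) and the span reduction (20) is given below (Python/NumPy; the function names are ours):

    import numpy as np

    def rasr_first_layer(x, sigma, mu):
        """One pass of the first (RBF) layer, eq. (19): shrink components that the
        current sigma still regards as small, leaving large-amplitude entries untouched."""
        return x - mu * x * np.exp(-x**2 / (2.0 * sigma**2))

    def next_sigma(sigma, delta, sigma_min):
        """Geometric reduction of the RBF span, eq. (20)."""
        return max(delta * sigma, sigma_min)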

The solution of the optimization problem (16) is computed using the gradient descent least squares method. Although the error function defined in equation (16) is a scalar value, the gradient of the error $E$ spans an N-dimensional subspace ($\frac{\partial E}{\partial x} \in \Re^N$). Also, there are M gradient equations, because each row of the measurement matrix $A$ is considered separately:

$\left[\dfrac{\partial E}{\partial x}\right]_j = -A_j^T (y_j - A_j x) \in \Re^N, \quad j = 1 \ldots M$    (21)

$\left[\dfrac{\partial E}{\partial x_i}\right]_j = -A_{ji} (y_j - A_j x) \in \Re, \quad i = 1 \ldots N, \ \text{and} \ j = 1 \ldots M$    (22)

The optimization of $x$ based on the minimum recovery error criterion results in M sets of equations:

$x(k+1) = x(k) - \eta \left[\dfrac{\partial E}{\partial x}\right]_j = x(k) + \eta A_j^T (y_j - A_j x(k)), \quad j = 1 \ldots M$    (23)

where $\eta$ is a constant. The iteration given in (23) needs to be done from $j = 1$ to $j = M$. The optimal value of $\eta$ is determined by equating two consecutive gradient equations from the iteration cycle corresponding to $j$ and $j+1$. Define the $x(k+1)$ of equation (23) as $x'(k)$, determine the new expression for $x(k+1)$, and equating this new gradient equation to equation (23) yields the value of $\eta$:

$x(k+1) = x'(k) - \eta \dfrac{\partial E}{\partial x'(k)}$
$\phantom{x(k+1)} = x'(k) + \eta A_j^T \left(y_j - A_j x'(k)\right)$
$\phantom{x(k+1)} = x'(k) - \eta A_j^T A_j x'(k) + \eta A_j^T y_j$
$\phantom{x(k+1)} = \left(I - \eta A_j^T A_j\right) x'(k) + \eta A_j^T y_j$
$\phantom{x(k+1)} = x(k) + \left(2I - \eta A_j^T A_j\right) \eta A_j^T (y_j - A_j x(k))$    (24)

But unlike $\sigma_k$ of the first layer, here $\eta$ is kept constant throughout the M iterations. From the comparison of the two gradient equations (23) and (24) it is evident that $2I - \eta A_j^T A_j := I$, and this equality gives the value of $\eta$:

$2I - \eta A_j^T A_j := I \ \Rightarrow \ \eta = \dfrac{1}{\|A_j\|^2}$    (25)

Based on this value of $\eta$, the iteration carried out in the second network layer is given in equation (26):

$x(k+1) = x(k) + \dfrac{A_j^T}{\|A_j\|^2}\,(y_j - A_j x(k)), \quad j = 1 \ldots M$    (26)
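The second layer therefore amounts to a sweep of normalized row projections; a minimal sketch (Python/NumPy, the function name is ours):

    import numpy as np

    def rasr_second_layer(x, A, y):
        """One sweep of the least-square-error layer, eq. (26): for every row A_j the
        estimate is corrected along A_j^T, scaled by 1/||A_j||^2 (the eta of eq. (25))."""
        x = x.copy()
        for j in range(A.shape[0]):
            a = A[j]
            x += a * (y[j] - a @ x) / (a @ a)
        return x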

Combining the solutions of $\min(L)$ and $\min(E)$ in a series iteration, the computations to be carried out in the RASR algorithm reduce to the following set of equations:

$x_i(k+1) = x_i(k) - \mu\, x_i(k) \exp\left(\dfrac{-x_i^2(k)}{2\sigma_k^2}\right), \quad i = 1 \ldots N$    (27)

$x'(k+1) = x(k+1) + \dfrac{A_j^T}{\|A_j\|^2}\,(y_j - A_j x(k+1)), \quad j = 1 \ldots M$    (28)

The multiplication and accumulation carried out according to (27) is an $\Re^N$ vector operation, and (28) results in $M \times \Re^N$ vector operations. The iterations are carried out until the stopping criterion $\sigma_k < \sigma_{min}$ is reached. The gradual reduction in the support span of the Gaussian basis function is shown in figure 1. A pictorial representation of how the cascade minimization of the unconstrained $L_0(x)$ followed by the unconstrained $L_2(Ax - y)$ leads to the minimum of the Lagrangian function $L_0(x) + \lambda L_2(Ax - y)$ is illustrated in figure (3).

Fig. 3. Cascade minimization of unconstrained L0 (x) followed by unconstrained L2 (Ax − y) leads to minimum of the Lagrangian function L0 (x) + λL2 (Ax − y)

Here $x_0$ is the result of the first layer of the network and $x_1$ is the optimal value after M iterations done according to (28). This value is given as the new input to Net1 and the process continues till $\sigma = \sigma_{min}$.

4.3 Polynomial Approximation of RASR

To reduce the computational load, the RBF function in exponential form is replaced with a polynomial curve which closely approximates the RBF function; the resulting algorithm is denoted Polynomial approximated RASR (PASR). In this alternative formulation, the cascade architecture is the same as described in the previous section. Here the polynomial function approximates the L0 norm, and the least square error minimization (LSE) and feedback network structure are the same as in RASR. A linear combination of polynomial functions $f_\sigma(x) = 1/(1 + x^2/2\sigma)$ is used; as in the previous case, the span of the curve is reduced progressively as the iteration proceeds. The objectives of this new approach are summarized as:

(1) L0 norm approximation using a polynomial function, to obtain less computational time compared to the RASR algorithm.
(2) Element-wise L2 mean squared error using the cascade network, to reduce the computational load of matrix inverse operations.

The optimization problem is a cascaded minimization of the $\|x\|_0$ norm followed by L2 minimization of the signal recovery error:

$L: \ \min \|x\|_0 \ \text{s.t.} \ \sigma \rightarrow \sigma_{min} \quad \cap \quad E: \ \min \dfrac{1}{2}\|y_j - A_j x\|_2^2, \quad j = 1 \ldots M$    (29)

The parameter settings are the same as in the case of the RASR algorithm, and similarly the solution to each part is computed separately in the two layers of the cascade network. As in the case of RASR, the following equations hold for very small values of $\sigma$ for the PASR algorithm:

$L_0(x) \approx N - \sum_{i=1}^{N} \dfrac{1}{1 + x_i^2/2\sigma}$    (30)

The new updated value of $x$ based on L0 minimization is determined using the gradient of the objective function $L$:

$\dfrac{\partial L}{\partial x_i} = \dfrac{x_i}{\sigma}\left(1 + \dfrac{x_i^2}{2\sigma}\right)^{-2}, \quad i = 1 \ldots N$    (31)

$x_i(k+1) = x_i(k) - \mu_k\,\dfrac{x_i(k)}{\sigma_k}\left(1 + \dfrac{x_i^2(k)}{2\sigma_k}\right)^{-2}, \quad i = 1 \ldots N$    (32)

Using an approach similar to that given in (21) to (26), the L2 error can be optimized, and the computations to be carried out in the PASR algorithm reduce to the following equations (34)-(35) after eliminating higher order terms. The $\sigma_k$ is reduced according to (20). The gradient scale factor $\mu_k$ is also reduced in each iteration; for simplicity of expression, $\mu_k$ is expressed in terms of $\sigma_k$ as (33):

$\mu_k = \mu_0 \sigma_k$    (33)

$x(k+1) = x(k) - \mu_0\, x(k) \Big/ \left(1 + \dfrac{x^2(k)}{\sigma_k}\right)$    (34)

$x'(k+1) = x(k+1) + \dfrac{A_j^T}{\|A_j\|^2}\,(y_j - A_j x(k+1)), \quad j = 1 \ldots M$    (35)
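A sketch of the simplified polynomial update (34) (Python/NumPy; the function name is ours):

    import numpy as np

    def pasr_first_layer(x, sigma, mu0):
        """Polynomial variant of the first-layer step, eq. (34): the exponential kernel
        is replaced by the rational kernel 1/(1 + x^2/sigma), which is cheaper to evaluate."""
        return x - mu0 * x / (1.0 + x**2 / sigma)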

4.4 Parameter initialization

Even in the most accurate data acquisition system, the error due to thermal noise cannot be avoided; it appears as a noise floor in the acquired signal. In the case of sparse signals, when the signal amplitude is zero, the data acquisition system reads the noise-floor value, and this will appear in the recovery as well. The original signal can be recovered perfectly if this noise-floor value of the data acquisition system ($x_{nf}$) is known. In the case of the RASR algorithm, the fine amplitude values are estimated when the span $\sigma$ of the RBF function is very close to zero; the value of $\sigma$ corresponding to the noise floor can be found by setting the update (27) to zero:

$0 = x_{nf} - \mu\, x_{nf} \exp\left(\dfrac{-x_{nf}^2}{2\sigma_{min}^2}\right)$    (36)

$\sigma_{min} = \sqrt{\dfrac{x_{nf}^2}{2 \log(\mu)}}$    (37)

By setting the maximum iteration count to $T$, the $\sigma$ reduction scale factor $\delta$ can be determined as

$\delta = (\sigma_{nf}/\sigma_0)^{1/T}$    (38)

Assuming the acquired data is normalized, the initial values of $\mu$ and $\sigma$ are empirically determined as follows:

$x_0 = A^{\dagger} y$    (39)

$\sigma_0 = 2 \max\{x_0\}$    (40)

$\mu = 1.2 \max\{x_0\}$    (41)

When the iteration reaches $\sigma_k = \sigma_{nf}$, the values of $x(k)$ less than $x_{nf}$ become 0. But to fine-tune the higher amplitude values of $x(k)$, the iteration is continued for $\sigma_k < \sigma_{nf}\, \delta^{\xi T}$, where $\xi \in (0.2, 0.4)$, and the value of $\sigma_k$ used in (27) is set to $\sigma_{nf} + \sigma_k$ to avoid spurious low amplitude values. The RASR algorithm approximates the higher amplitude spikes first and gradually estimates the values of the lower amplitude spikes as the iteration proceeds. The solution space of RASR changes from the minimum L2 error space to the minimum L0 norm space and finally converges to the $\min \|x\|_0$ s.t. $\|y - Ax\|_2 = 0$ solution. Based on the theoretical framework described in (17) to (40), an algorithm for the implementation of RASR is proposed. The initial value of $x(0) \in \Re^N$ can be any i.i.d. random vector selected from a normal distribution; however, $x(0) = A^{\dagger} y$ is used to make the initial value close to the global minimum. The initial value of $\sigma$ is given in equation (40). The functional block diagram for the network implementation of the RASR algorithm is illustrated in figure 4. Algorithm 2 gives the steps involved in the software implementation of the RASR algorithm.
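The initialization of equations (37)-(41) can be sketched as follows (Python/NumPy; the names are ours; the formula for sigma_nf assumes normalized data, so that mu > 1 and log(mu) > 0):

    import numpy as np

    def rasr_init(A, y, x_nf, T, xi=0.3):
        """Parameter initialization sketched from eqs. (37)-(41); x_nf is the known
        noise-floor level of the acquisition chain, T the maximum iteration count."""
        x0 = np.linalg.pinv(A) @ y                              # eq. (39)
        sigma0 = 2.0 * np.max(np.abs(x0))                       # eq. (40)
        mu = 1.2 * np.max(np.abs(x0))                           # eq. (41), assumes max|x0| ~ 1
        sigma_nf = np.sqrt(x_nf**2 / (2.0 * np.log(mu)))        # eq. (37)
        delta = (sigma_nf / sigma0) ** (1.0 / T)                # eq. (38)
        sigma_min = (sigma_nf / sigma0) ** xi * sigma_nf        # as in Algorithm 2
        return x0, sigma0, mu, sigma_nf, delta, sigma_min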

Fig. 4. The feedback architecture of radial basis function cascade network for sparse signal recovery (RASR)

5 Simulation, Analysis and Results

All the simulations were carried out using Mathworks Matlab R2010a on an Intel(R) Core i5(TM) CPU 680 @ 3.60 GHz with 1 GB RAM. The other CS recovery algorithms used here for comparison of the results are implemented based on published Matlab codes. Two of the greedy algorithms for CS recovery, orthogonal matching pursuit [25] and CoSaMP [26], are taken from the respective authors' sites. The basis pursuit algorithm [11] is implemented using the L1-magic code [27]. The IRLS code is implemented based on the paper [12], the SL0 algorithm is taken from the site mentioned in the paper [9], and the CSIANN algorithm is implemented based on the L0-LMS algorithm described in [8]. The stopping criterion set for the RASR and SL0 algorithms is $\sigma < 10^{-8}$, and $\mathrm{mse}(\|y - Ax\|) < 10^{-10}$ is set as the stopping criterion for CSIANN. It should be noted that the stopping criterion set in the experiments is a very fine value, of the order of $10^{-8}$. However, if the acceptable error limit were set in the order of $(10^{-3}, 10^{-4})$, the relative performance comparison of the algorithms would be different.

$\mathrm{MSE} = \dfrac{1}{N}\sum_{i=1}^{N} |x_i - x_i^*|^2$    (42)

$\text{Relative L2 error} = \|x - x^*\|_2 / \|x\|_2$    (43)

$\text{Probability of support error}(x^*) = \|\mathrm{supp}(x) - \mathrm{supp}(x^*)\|_0 / \|\mathrm{supp}(x)\|_0$    (44)
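These metrics can be computed directly; a short sketch (Python/NumPy; the tolerance used to decide the support is our choice):

    import numpy as np

    def recovery_metrics(x, x_rec, tol=1e-6):
        """Evaluation metrics of eqs. (42)-(44); tol decides which entries count as non-zero."""
        mse = np.mean(np.abs(x - x_rec)**2)                                   # eq. (42)
        rel_l2 = np.linalg.norm(x - x_rec) / np.linalg.norm(x)                # eq. (43)
        s, s_rec = np.abs(x) > tol, np.abs(x_rec) > tol
        support_err = np.count_nonzero(s != s_rec) / max(np.count_nonzero(s), 1)   # eq. (44)
        return mse, rel_l2, support_err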

The performance comparison of the sparse recovery algorithms is done based on the parameters given in (43) and (44). The average relative L2 error computed using equation (43) gives a measure of the correctness of the signal magnitude recovery (where $x$ is the original signal and $x^*$ is the signal recovered from $y$ using the CS recovery algorithms). The estimate of the signal support recovery is computed as a probability of error in the exact recovery of the signal support, determined using equation (44). The function $\mathrm{supp}(x)$ gives a vector corresponding to $x$ where the elements are non-zero ($x_i \neq 0$, $i = 1 \ldots N$). The test conditions for algorithm evaluation, if not specified explicitly, are N=50, M=30, and $A \in \Re^{M \times N}$ an i.i.d. random matrix with coherence $\mu(A) < 0.3$. The algorithm evaluation for each cardinality and noise level is carried out 100 times to draw conclusions based on the mean value.

Task: $\min_x \|x\|_0$ subject to $\|y - Ax\|_2 = 0$
Input: $A \in \Re^{M \times N}$, $y$, $x_{nf}$, $T$, $\xi$
Initialization: $x_0 = A^{\dagger} y$, $\sigma_0 = 2\max\{x_0\}$, $\mu_0 = 1.2\max\{x_0\}$,
    $\sigma_{nf} = \left(x_{nf}^2 / 2\log(\mu_0)\right)^{1/2}$, $\sigma_{min} = (\sigma_{nf}/\sigma_0)^{\xi}\, \sigma_{nf}$, $\delta = (\sigma_{nf}/\sigma_0)^{1/T}$
for $k = 1 \ldots T$ do
    minimize L2 error
    for $j = 1 \ldots M$ do
        $x'(k) = x(k) + \dfrac{A_j^T}{\|A_j\|^2}\,(y_j - A_j x(k))$
        update $x(k) \leftarrow x'(k)$
    end
    minimize L0 error
    for $i = 1 \ldots N$ do
        $x_i'(k) = x_i(k) - \mu_0\, x_i(k) \exp\left(\dfrac{-x_i^2(k)}{2\sigma_k^2}\right)$
    end
    update $x(k+1) \leftarrow x'(k)$
    $\sigma_{k+1} \leftarrow \sigma_k \delta$
end
Output: $x(T)$

Algorithm 2: RASR algorithm with parameter initialization

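Putting the two layers and the parameter initialization together, a minimal end-to-end sketch of Algorithm 2 follows (Python/NumPy; an illustration under the normalized-data assumption, not the authors' Matlab implementation):

    import numpy as np

    def rasr(A, y, x_nf=1e-4, T=80):
        """Sketch of Algorithm 2: LSE sweep (eq. (28)) followed by the RBF/L0 step (eq. (27)),
        with the geometric sigma schedule of eq. (20)."""
        M, N = A.shape
        x = np.linalg.pinv(A) @ y
        sigma = 2.0 * np.max(np.abs(x))
        mu = 1.2 * np.max(np.abs(x))                         # assumes max|x| ~ 1, so mu > 1
        sigma_nf = np.sqrt(x_nf**2 / (2.0 * np.log(mu)))
        delta = (sigma_nf / sigma) ** (1.0 / T)
        for _ in range(T):
            for j in range(M):                               # second layer: row-wise LSE correction
                a = A[j]
                x += a * (y[j] - a @ x) / (a @ a)
            x -= mu * x * np.exp(-x**2 / (2.0 * sigma**2))   # first layer: RBF-weighted shrinkage
            sigma *= delta                                   # anneal the RBF span
        return x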
The parameter $\alpha_k = \dfrac{A_j^T}{\|A_j\|^2}\,(y_j - A_j x(k))$, which appears in the RASR algorithm description, gives the relative error in the estimation of the signal $x^*$. Experimental results on the convergence analysis of the algorithm parameters ($\sigma_k$, $\alpha_k$) for cardinality 1 to 20 are given in figures (5) and (6). In figure (5), the convergence of the RASR algorithm parameter $\alpha_k$ for signal sparsity k=5, 10, 15, 20 vs. the number of iterations is plotted. It shows a general asymptotic behavior, with oscillations during the initial iteration cycles, and converges to its limiting value above 80 iterations. In figure (6), the RASR algorithm parameter $\sigma_k$ for signal sparsity k=5, 10, 15, 20 vs. the number of iterations is plotted.

Fig. 5. RASR algorithm parameter αk

Fig. 6. RASR algorithm parameter (σk )

As the algorithm stopping criterion is determined by the value of $\sigma_{min}$ alone, the parameter convergence curve for any value of N will be similar. Figure (7) shows the average relative error in the recovery of the signal $x$ computed using equation (43), and also the probability of error in the signal support computed using equation (44), for RASR and various other algorithms (test conditions as in section 5). It can be seen from the figures that the recovery performance of RASR is better than or comparable to the benchmark algorithms for low cardinality signals. However, BP, IRLS and SL0 recover the signal support better at high cardinality compared to RASR, while the recovery performance is similar at low cardinality.


Fig. 7. Average relative error and the probability of error in signal support recovery using RASR, SL0, IRLS, BP and the two greedy algorithms OMP and CoSaMP, when the CS measurement is not corrupted and the cardinality is increased from K=1 to K=15.

The convergence of the algorithm parameter $g_k$ of CSIANN described in Algorithm 1 and of $\alpha_k$ of RASR described in Algorithm 2 is shown in figure (8) for comparison (the test conditions are the same as in section 5). The reduction in the convergence time of RASR with respect to CSIANN is attributed to the gradual reduction of $\sigma$ in the L0 approximation function (13) from a high value to a value close to zero ($\sigma_k \rightarrow 0$), followed by the magnitude correction using the L2 error minimization technique.


Fig. 8. The convergence of algorithm parameter (a) gk of CSIANN and (b) αk of RASR. The error parameter gk of CSIANN shows a random convergence behavior and takes more than 2000 iterations to converge to acceptable value of mse < 10−5 . The convergence of error parameter αk of RASR algorithm has a general asymptotic behavior with oscillations during initial iterations.

The effect of these two processes can be clearly seen in the convergence curve of RASR as oscillations over a general asymptotic curve.

$y_n = Ax + n_L\,\dfrac{\|Ax\|}{\|w_n\|}\, w_n$    (45)

To study the robustness of the RASR algorithm, a noise injection test on the CS measurement is done (test conditions as in section 5). The CS measurement $y = Ax$ is corrupted with white noise $w_n$ (mean = 0, var. = 0.1) of relative strength from -40 dB to -10 dB ($n_L$ = 0.01 to 0.20). The original signal $x$ is recovered from $y_n$ (45) instead of $y$ using RASR, SL0, IRLS, BP, OMP and CoSaMP, and the signal recovery performance of the algorithms is compared based on the relative L2 error. The CSIANN algorithm is not used in this evaluation because it takes an extremely high number of iteration cycles to converge in the presence of noise. In the presence of acquisition noise, the recovery performance of RASR and SL0 is comparable, with RASR slightly better; the result is shown in figure (9). All the algorithms evaluated here for comparison with RASR gave good results for low cardinalities; however, when the data acquisition noise level is more than -20 dB, the signal recovery performance of all the CS recovery algorithms starts to degrade. Table 1 summarizes the average relative error in signal recovery and the probability of error in the recovery of the signal support for the various algorithms, compared at noise levels -35 dB and -20 dB. This shows that the RASR algorithm performs better at relatively high noise levels.
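The corruption model (45) can be sketched as follows (Python/NumPy; the dB-to-n_L conversion assumes the 20·log10 convention implied by the -40 dB to 1% mapping above):

    import numpy as np

    def add_measurement_noise(Ax, noise_db, rng=None):
        """Corrupt the measurement as in eq. (45): white noise scaled to a relative level n_L,
        where noise_db = 20*log10(n_L) (e.g. -40 dB -> n_L = 0.01)."""
        rng = np.random.default_rng() if rng is None else rng
        n_l = 10.0 ** (noise_db / 20.0)
        w = rng.standard_normal(Ax.shape)
        return Ax + n_l * np.linalg.norm(Ax) * w / np.linalg.norm(w)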


Fig. 9. Noise immunity test of RASR and other algorithms. The performance of the algorithms when measurement is injected with white noise of strength -40dB to -10dB and (a) cardinality K=1 (b) K=10

To determine the combined effect of noise and the number of measurements, the data acquisition noise is increased from -40 dB to -20 dB and the number of measurements M is decreased from 30 to 20. Figure 10 shows the SNR in signal recovery using the RASR algorithm when N=50 and the sparsity K is varied from 1 to 15. An SNR of 80 dB is obtained for M=30 with -40 dB noise.

Table 1: Algorithm performance comparison

             cardinality K=10              cardinality K=1
             -20dB        -35dB            -20dB        -35dB
Algorithm    avg. err.    err. prob.       avg. err.    err. prob.
OMP          0.5          0.6              0.03         0.6
CoSaMP       0.6          0.2              0.03         0.1
IRLS         0.08         0.68             0.02         0.7
BP           0.08         0.68             0.02         0.8
SL0          0.08         0.65             0.025        0.8
RASR         0.05         0.6              0.025        0.7

To further analyze the robustness of the RASR algorithm, the cardinality K of the sparse signal $x$ and the white noise relative signal strength $n_L$ are taken as ordered pairs $\langle K, n_L \rangle$, $K \in \{1 \ldots 20\}$, $n_L \in \{1 \ldots 20\}$. The test conditions are the same as earlier, and the measurement matrix $A \in \Re^{M \times N}$ is a normalized i.i.d. random Gaussian matrix with coherence $\mu(A) < 0.3$. The recovery performance of the algorithms is compared based on the relative L2 error. The CSIANN iteration limit is set to 5000 instead of $\mathrm{mse}(\|y - Ax\|) < 10^{-10}$. A total of 100 experiments for each $\langle K, n_L \rangle$ pair were carried out to get an average estimate. It is seen that the averaged relative L2 error of the signal recovered using RASR is better compared to CSIANN, and the result is comparable with IRLS and BP for sparse signals with small cardinality.


Fig. 10. SNR in signal recovery using RASR algorithm for N=50 and various measurement numbers M as cardinality is varied from 1 to 15

Figure 11 shows the average relative error in signal recovery using RASR, CSIANN, BP and IRLS when the CS measurement is corrupted with white noise of relative signal strength -40 dB to -13.9 dB (1% to 20%) and the cardinality is increased from 1 to 20 along with the increase in noise level. Figure 12 shows the computational cost of CSIANN, SL0 and RASR, determined by multiplying the number of MAC instructions (multiply-and-accumulate unit instructions) needed to complete one loop by the number of iterations needed to reach the required recovery precision. RASR takes only half the computational cost of SL0 to get a comparable recovery precision at low cardinalities. The RASR algorithm converges in 80 iterations (stopping criterion $\sigma_{min} \leq 10^{-8}$), but the CSIANN algorithm convergence time increases exponentially with cardinality when the iterations are done using the stopping criterion $\mathrm{mse}(\|y - Ax\|) < 10^{-10}$. The computational cost of recovering the signal using the CSIANN algorithm is very large compared to RASR. Under the same test conditions given in section 5 and cardinality above 15, CSIANN fails to converge. At cardinality K=2, RASR, SL0 and CSIANN take 3, 6 and 95 million MAC instructions respectively. The comparison of the architectural and computational features of the RASR, CSIANN and SL0 algorithms is summarized in Table 2. Considering the experimental observation that the RASR and SL0 algorithms converge in 80 iterations for $\sigma_{min} = 10^{-8}$, and that CSIANN converges above 10000 iterations, an empirical estimate of the net computational load of these algorithms can be determined. The computational load of RASR is of the order of 240MN, for SL0 480MN, and for CSIANN the value is exorbitantly large, 50000MN.


Fig. 11. $\langle K, n_L \rangle$ vs. relative error

Fig. 12. Cardinality vs. number of MAC instructions

Table 2: RASR feature comparison with CSIANN and SL0

Parameter            CSIANN                                   SL0                              RASR
Model                perceptron                               L0                               L0 + ANN
No. of gradients     1                                        2                                2
Optimization         $\|y_j - A_j x\|_2 + \gamma\|x\|_0$      $\|x\|_0 + \lambda\|y - Ax\|$    $\min\|x\|_0$, $\min\|y_j - A_j x\|_2$
Conv. guarantee      $N_{max}$                                $\sigma < \sigma_{min}$          $\sigma < \sigma_{min}$
Conv. time           increases with cardinality               determined by $\sigma_{min}$     half of SL0
Vector operations    $N_{max} \times A_j$, $j = 1 \ldots M$   matrix inverse $A^{\dagger}$     vec. opt. on $A_j$, $j = 1 \ldots M$
MACC instructions    5MN+M                                    6MN+3M+3N+1                      3MN+3M+4N+1

5.1 Results

To compare the sparse recovery performance of the benchmark algorithms and RASR in a real application, four types of sparse pulse data streams are used to analyze the performance: type-A is a 1-sample-duration spike, type-B is a 5-sample-duration pulse with shape {1, 0.9, 0.8, 0.7, 0.6}, type-C is a 5-sample-duration pulse with shape {-0.1, 0.6, 1, 0.6, -0.1}, and the last one is a radar pulse with a low amplitude reflected pulse. The cardinality of the total non-zero samples is K=50 in all the cases. Table 3 presents the relative convergence time of the algorithms considered with respect to the RASR algorithm's convergence time, along with the mean square error of the recovered signal (42) averaged over 100 trials. It is seen that the convergence time of RASR is better compared to the benchmark algorithms BP and IRLS when the measurement is corrupted. The value of the signal-to-error ratio is much improved over CSIANN. It is seen that with the stopping condition $\sigma_{min} \leq 10^{-8}$, RASR gives an mse in recovery better than $10^{-30}$.


Fig. 13. Comparison of recovery performance of RASR, CSIANN, IRLS and SL0 on synthetic spike data (a): RASR takes 0.45s and achieves mse of 10−31 (b): signal recovery using CS-IANN with 100 iterations takes 0.71s and fails to converge (c): signal recovery using IRLS takes 11.56s and achieves mse of 10−8 (d): signal recovery using SL0 takes 0.89s and achieves mse of 10−17

The stopping criterion for CSIANN as described in the paper [7], $\mathrm{mse}(\|y - Ax^*\|) < 10^{-10}$, does not always give an acceptable result, as the mse is computed for the CS measurement $y$ and not for the recovered signal $x^*$. For the practical application of the CSIANN algorithm, setting $\mathrm{mse}(\|x - x^*\|) < 10^{-10}$ as the stopping criterion is not feasible, because prior information about the original signal $x$ is not available. To get the required recovery precision from the CSIANN algorithm, the iteration count $N_{max}$ should be set to an extremely large value (> 10000), making it computationally inefficient. Figure (13) shows the result of applying RASR, CSIANN, BP and IRLS for the recovery of a synthetic sparse signal (N=500, M=200) from CS measurements; it is evident from the figure that RASR gives better performance compared to CSIANN. To demonstrate the feasibility of using the RASR algorithm in recovering realistic data, combined relay-driver current and pyro-pulse current measurements from an on-board telemetry system are used. The data has pulse-stream characteristics and is captured using simulated compressed sensing hardware in Simulink. The CS encoded data is recovered using RASR; recovery using the SL0, CSIANN, BP and IRLS algorithms is used for comparison, and the results are shown in figure 14. CSIANN fails to converge, and the SL0 recovery precision is not as accurate as that of RASR. Setting $\mathrm{mse}(\|x_{org} - x\|) < 10^{-10}$ as the stopping criterion for CSIANN is not feasible in a practical application, because the information about the original signal $x_{org}$ is not available. To get the required precision from CSIANN, the iteration count $N_{max}$ should be set to an extremely large value, or the stopping criterion set as $\mathrm{mse}(\|x(k) - x(k+1)\|) < 10^{-10}$, but this condition sometimes keeps the iteration at a local minimum. The results are tabulated in Table 4. In all the simulations, the RASR algorithm's recovery precision is better compared to SL0 and CSIANN/L0-LMS, the two sparse recovery algorithms that were used as the basis for the development of this new RASR algorithm.

Table 3A: Algorithm performance comparison (Time = convergence time relative to RASR)

                                           -33dB                    -26dB
Algorithm    Time     mse                  Time     mse             Time     mse

spike, δ = 0.9
RASR         1.00     5.30×10^-22          1.00     8.52×10^-4      1.00     8.05×10^-3
PASR         4.35     2.60×10^-14          4.37     1.37×10^-3      4.06     4.26×10^-3
IRLS         28.35    2.71×10^-8           32.86    1.40×10^-3      30.96    6.86×10^-3
SL0          1.19     9.75×10^-17          1.30     1.95×10^-3      1.19     7.35×10^-3
OMP          0.28     7.43×10^-30          11.33    1.50×10^-3      12.64    6.42×10^-3
BP           9.27     1.53×10^-19          14.61    1.35×10^-3      12.99    6.07×10^-3
CoSaMP       0.10     1.15×10^-30          4.99     2.86×10^-4      4.82     1.19×10^-3
CSIANN       1.80     4.44×10^-1           2.08     4.82×10^-1      1.84     5.39×10^-1

pulse(A), δ = 0.9
RASR         1.00     6.41×10^-15          1.00     6.28×10^-4      1.00     4.81×10^-3
PASR         3.93     1.86×10^-8           3.80     7.63×10^-4      3.88     7.47×10^-3
IRLS         27.6     1.11×10^-27          29.8     1.33×10^-3      30.9     1.13×10^-2
SL0          1.09     9.34×10^-17          1.09     1.49×10^-3      1.17     9.83×10^-3
OMP          0.28     8.07×10^-30          9.77     1.15×10^-3      12.6     1.28×10^-2
BP           8.78     2.64×10^-24          12.6     1.31×10^-3      11.9     1.06×10^-2
CoSaMP       0.92     3.90×10^-1           1.19     3.54×10^-1      1.01     4.04×10^-1
CSIANN       1.74     4.44×10^-1           1.76     4.90×10^-1      1.83     5.22×10^-1

pulse(B), δ = 0.98
RASR         1.00     1.79×10^-30          1.00     1.56×10^-3      1.00     8.86×10^-3
PASR         0.53     7.98×10^-14          0.56     1.26×10^-3      0.54     8.32×10^-3
IRLS         3.98     5.56×10^-8           4.54     2.20×10^-3      4.36     7.63×10^-3
SL0          0.15     1.42×10^-16          0.16     2.33×10^-3      0.15     1.14×10^-2
OMP          0.04     3.95×10^-30          1.61     2.16×10^-3      1.92     1.17×10^-2
BP           1.08     2.67×10^-20          4.27     2.13×10^-3      1.69     7.22×10^-3
CoSaMP       0.03     7.23×10^-1           0.03     8.05×10^-1      0.03     5.94×10^-1
CSIANN       0.25     5.10×10^-1           0.27     4.79×10^-1      0.25     4.64×10^-1

It is seen that the convergence time of RASR is better compared to the L0 relaxation based benchmark algorithms BP and IRLS; this fast convergence of RASR is attributed to the simpler iterative steps involved in RASR. It is seen experimentally that with the stopping condition $\sigma < 10^{-9}$, RASR gives a recovery precision $\mathrm{mse}(x)$ better than $10^{-11}$.

Table 3B: Algorithm performance comparison (Time = convergence time relative to RASR)

                                           -33dB                    -26dB
Algorithm    Time     mse                  Time     mse             Time     mse

pulse(C), δ = 0.9
RASR         1.00     4.82×10^-23          1.00     7.40×10^-4      1.00     4.51×10^-3
PASR         3.34     1.96×10^-8           3.28     1.06×10^-3      3.28     6.38×10^-3
IRLS         27.68    1.20×10^-8           28.1     1.63×10^-3      29.7     6.57×10^-3
SL0          1.63     1.61×10^-14          1.72     1.96×10^-3      1.73     1.13×10^-2
OMP          0.14     2.49×10^-6           7.17     1.39×10^-3      9.27     9.18×10^-3
BP           40.9     3.22×10^-25          11.7     1.43×10^-3      11.9     6.52×10^-3
CoSaMP       0.17     3.06×10^-1           0.16     4.30×10^-1      0.17     3.08×10^-1
CSIANN       1.68     4.49×10^-1           1.66     4.65×10^-1      1.66     4.46×10^-1

Radar pulse, δ = 0.98
RASR         1.00     3.61×10^-30          1.00     8.81×10^-4      1.00     6.73×10^-3
PASR         0.51     4.30×10^-11          0.45     7.95×10^-4      0.46     6.48×10^-3
IRLS         4.06     4.08×10^-8           4.24     1.23×10^-3      4.21     8.89×10^-3
SL0          0.24     7.05×10^-32          0.24     1.30×10^-3      0.25     8.95×10^-3
OMP          0.04     4.89×10^-30          0.65     1.03×10^-3      1.10     9.67×10^-3
BP           1.19     1.71×10^-20          1.79     1.08×10^-3      1.73     8.58×10^-3
CoSaMP       0.02     3.40×10^-1           0.02     3.45×10^-1      0.02     3.39×10^-1
CSIANN       0.23     2.86×10^-1           0.23     2.75×10^-1      0.23     3.49×10^-1

Table 4: Algorithm performance comparison

              relay pulse                  pyro pulse
Algorithm     Time(s)   mse                Time(s)   mse
RASR          1.480     1.23×10^-11        9.43      1.63×10^-11
SL0           0.220     1.65×10^-8         1.27      4.06×10^-10
IRLS          30.85     1.60×10^-33        196.70    1.77×10^-33
BP            67.69     8.32×10^-21        498.56    1.60×10^-31
CSIANN        4.29      1.43×10^-2         27.00     1.24×10^-10

To demonstrate the recovery efficiency of the RASR algorithm in the case of signals sparse in some basis B, gray-scale image data is used. The image is transformed using the DCT and threshold-masked to generate a sparse matrix, the CS data acquisition technique is applied on each row of the image matrix, and the image is recovered using CSIANN, RASR, BP and the greedy algorithm OMP.

Fig. 14. Combined relay-driver current measurement from Telemetry recovered using BP, CSIANN, SL0 and RASR

The image compression and recovery using JPEG is also shown for comparison, where the input image is divided into 8 × 8 blocks and the 2D DCT is computed for each block. The DCT coefficients are then masked using a 4 × 4 upper triangular matrix (10 low-frequency coefficients out of 64), quantized and coded to get the JPEG compression. (The Matlab functions dctmtx, dct and blockproc are used to implement JPEG.)

Considering CS data acquisition where the data $x$ is sparse in a basis B, if an i.i.d. Gaussian matrix $A$ is used then the measurement matrix $AB^{-1}$ is also i.i.d. Gaussian. In the case of the image data $x$, if $T$ is the DCT matrix then the sparse signal corresponding to $x$ is $\alpha = T x T^{\intercal}$ and the CS measurement will be $y = A T x T^{\intercal}$. However, the image is acquired as $y = ATx$, and for reconstruction with the CS recovery algorithms $\hat{y} = y T^{\intercal}$ is used instead of $y$, with $A$ as the measurement matrix instead of $AT$. The sparse recovered signal $\alpha$ in the DCT basis is transformed back as $\hat{x} = T^{\intercal} \alpha T$, since $T^{-1} = T^{\intercal}$. In order to simulate the sparse data acquisition, the measurement matrix $AT$ can be selected such that $\|AT_j\|_0 < M$; if this is not the case, it can be enforced by setting $\{AT_{ij} = 0 : |AT_{ij}| < \epsilon,\ i = 1 \ldots M,\ j = 1 \ldots M\}$. The result is shown in figure (15). The parameters $\sigma = 2$ and $\mu_0 = 1.2$ are set instead of using (37), (38), since the sparse representation $\alpha$ obtained after the DCT transformation is not normalized.

$T_{i,j} = \begin{cases} \sqrt{1/M} & i = 0,\ 0 \leq j \leq M-1 \\ \sqrt{2/M}\,\cos\left(\dfrac{i(2j+1)\pi}{2M}\right) & 1 \leq i \leq M-1,\ 0 \leq j \leq M-1 \end{cases}$    (46)

Fig. 15. Algorithm performance on image recovery (rice image data). The image is transformed using DCT and threshold masked to generate sparse matrix. CS acquisition is done and recovered using RASR (f). The image recovery using OMP (c), BP (d), CSIANN(e). The image compression and recovery using JPEG are done for the comparison (b) with the other algorithms
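A simplified sketch of this transform-domain procedure for a single image row is given below (Python/NumPy; dct_matrix follows eq. (46), and recover_sparse stands for any of the sparse solvers discussed above, e.g. the RASR sketch; this is an illustration, not the exact Matlab pipeline used for figure 15):

    import numpy as np

    def dct_matrix(M):
        """Orthonormal DCT matrix of eq. (46), so that T @ T.T = I."""
        T = np.zeros((M, M))
        for i in range(M):
            for j in range(M):
                c = np.sqrt(1.0 / M) if i == 0 else np.sqrt(2.0 / M)
                T[i, j] = c * np.cos(i * (2 * j + 1) * np.pi / (2 * M))
        return T

    def cs_image_row(A, x_row, recover_sparse):
        """CS acquisition/recovery of one image row that is sparse in the DCT basis.
        A has shape (m, n) with n = len(x_row); recover_sparse(A, y) is any sparse solver."""
        n = x_row.size
        T = dct_matrix(n)
        alpha = T @ x_row                    # sparse (or compressible) DCT representation
        y = A @ alpha                        # compressed measurement of the coefficients
        alpha_hat = recover_sparse(A, y)     # RASR, SL0, BP, OMP, ...
        return T.T @ alpha_hat               # back to the pixel domain (T^-1 = T^T)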

6 Conclusion

It is clear from the experimental analysis of the test results and the convergence time that the RASR algorithm outperforms CSIANN/L0-LMS in CS signal recovery precision, convergence and computational time. It is shown experimentally that the data acquisition noise immunity of RASR is better compared to the similar sparse recovery algorithms studied and evaluated. In the presence of CS data acquisition noise, the CSIANN algorithm fails to recover the original signal. The convergence time of RASR is much faster compared to the benchmark algorithms BP and IRLS when the recovery precision is of the order of 10^-10. RASR takes half the computational cost of SL0 while giving comparable recovery precision. In this paper we have presented the theoretical framework of RASR and an experimental analysis of parameter convergence, recovery precision and noise immunity. Further research can be carried out on the theoretical analysis of the convergence and robustness of the RASR algorithm in the presence of CS acquisition noise. The recovery result on the image shows that the RASR algorithm also works on data that is sparse in a transform domain. The effect of the coherence of the measurement matrix on convergence has not been analyzed; however, in all the experiments carried out, a measurement matrix with $\mu(A) < 0.3$ was used. In the convergence curve of $\alpha_k$, the algorithm parameter of RASR, second-order oscillations are observed between iteration cycles 5 and 15 over a general asymptotic curve. The effect of the finer L2 magnitude approximation over the coarse L0 support estimation on the $\alpha_k$ convergence curve can be studied further. In summary, this paper presents an improved CS recovery algorithm based on L0-L2 cascade optimization using an artificial neural network architecture, with less computational cost compared to CSIANN and comparable precision and speed compared to the L0 relaxation based CS recovery algorithms.

Acknowledgements The authors would like to thank R. Lakshmi Narayanan of Indian Institute of Space Science and Technology, Trivandrum for reviewing the paper.

References

[1] D. L. Donoho, "Compressed sensing", IEEE Trans. Inf. Theory 52 (2006) 1289-1306.
[2] E. Candes, J. Romberg, T. Tao, "Stable signal recovery from incomplete and inaccurate measurements", Comm. Pure Appl. Math. 59 (2006) 1207-1223.
[3] E. Candes, M.B. Wakin, "An introduction to compressive sensing", IEEE Signal Process. Mag. (2008) 21-30.
[4] Jerome Bobin, Jean-Luc Starck, Roland Ottensamer, "Compressed sensing in astronomy", IEEE J. Sel. Top. Signal Process. 2 (2008) 718-726.
[5] Chinmay Hegde, Richard G. Baraniuk, "Sampling and recovery of pulse streams", IEEE Trans. Signal Process. 59 (2011) 1505-1516.
[6] Vidya L, Vivekanand V, Shyamkumar U, Deepak Mishra, Lakshminarayanan R, "Feasibility study of applying compressed sensing recovery algorithms for launch vehicle telemetry", AICER IEEE Conf. (2013), doi: 10.1109/AICERA-ICMiCR.2013.6576044.
[7] Chun-hui Zhao, Yun-long Xu, "An improved compressed sensing reconstruction algorithm based on artificial neural network", ICECC IEEE Conf. (2011) 1860-1863.
[8] Jian Jin, Yuantao Gu, Shunliang Mei, "A stochastic gradient approach on compressive sensing signal reconstruction based on adaptive filtering framework", IEEE J. Sel. Top. Signal Process. (2009).
[9] G. Hosein Mohimani, Massoud Babaie-Zadeh, Christian Jutten, "A fast approach for overcomplete sparse decomposition based on smoothed L0 norm", IEEE Trans. Signal Process. 57 (2009) 289-301.
[10] Y.C. Eldar, G. Kutyniok, "Compressed Sensing: Theory and Applications", Cambridge University Press, 2012.
[11] S. B. Chen, D. L. Donoho, M. A. Saunders, "Atomic decomposition by basis pursuit", SIAM J. Sci. Comput. 20 (1998) 33-61.
[12] I.F. Gorodnitsky, B. D. Rao, "Sparse signal reconstruction from limited data using FOCUSS: a re-weighted minimum norm algorithm", IEEE Trans. Signal Process. 45 (1997) 600-616.
[13] Michael Elad, "Sparse and Redundant Representations: From Theory to Applications in Signal and Image Processing", Springer, 2010.
[14] E. Candes, J. Romberg, Terence Tao, "Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information", IEEE Trans. Inf. Theory 52 (2006) 489-509.
[15] E. Candes, J. Romberg, "Sparsity and incoherence in compressive sampling", Inverse Prob. 23 (2007) 969-985.
[16] R. DeVore, "Deterministic constructions of compressed sensing matrices", J. Complex. 23 (2007) 918-925.
[17] E. J. Candes, "The restricted isometry property and its implications for compressed sensing", C. R. Acad. Sci. I 346 (2008) 589-592.
[18] R.G. Baraniuk, M. Davenport, R.A. DeVore, M. Wakin, "A simple proof of the Restricted Isometry Property for random matrices", Constr. Approx. 28 (2008) 253-263.
[19] R. Tibshirani, "Regression shrinkage and selection via the lasso", J. Royal Statist. Soc. B 58 (1996) 267-288.
[20] E.J. Hartman, J.D. Keeler, J.M. Kowalski, "Layered neural networks with Gaussian hidden units as universal approximations", Neural Comput. 2 (1990) 210-215.
[21] S. Haykin, "Neural Networks and Learning Machines", PHI Pvt. Ltd., Ed. 3, 2009.
[22] B.J.C. Baxter, "The interpolation theory of radial basis functions", PhD Thesis, Cambridge University, 1992.
[23] M. J. D. Powell, "The theory of radial basis function approximation", in Advances in Numerical Analysis II, 1990.
[24] M.A.T. Figueiredo, R.D. Nowak, S. J. Wright, "Gradient projection for sparse reconstruction: Application to compressed sensing and other inverse problems", IEEE J. Sel. Top. Signal Process. 1 (2007) 586-597.
[25] J. A. Tropp, A. C. Gilbert, "Signal recovery from random measurements via orthogonal matching pursuit", IEEE Trans. Inf. Theory 53 (2007) 4655-4666.
[26] D. Needell, J.A. Tropp, "CoSaMP: Iterative signal recovery from incomplete and inaccurate samples", Appl. Comput. Harmonic Anal. 26 (2008) 301-321.
[27] E. Candes, J. Romberg, "L1-MAGIC: Recovery of sparse signals via convex programming", Technical report, Caltech, 2005.
