Accepted Manuscript Title: WASICA: An effective wavelet-shrinkage based ICA model for brain fMRI data analysis Author: Nizhuan Wang Weiming Zeng Yingchao Shi Tianlong Ren Yanshan Jing Jun Yin Jiajun Yang PII: DOI: Reference:
S0165-0270(15)00094-1 http://dx.doi.org/doi:10.1016/j.jneumeth.2015.03.011 NSM 7178
To appear in:
Journal of Neuroscience Methods
Received date: Revised date: Accepted date:
16-2-2014 8-3-2015 9-3-2015
Please cite this article as: Wang N, Zeng W, Shi Y, Ren T, Jing Y, Yin J, Yang J, WASICA: An effective wavelet-shrinkage based ICA model for brain fMRI data analysis, Journal of Neuroscience Methods (2015), http://dx.doi.org/10.1016/j.jneumeth.2015.03.011 This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Title Page
Title:
ip t
WASICA: An effective wavelet-shrinkage based ICA model for brain fMRI data analysis
Author names and affiliations:
201306, China;
2
us
Lab of Digital Image and Intelligent Computation, Shanghai Maritime University, Shanghai, Department of Neurology, Shanghai Jiao Tong University Affiliated Sixth
People’s Hospital, Shanghai, 200233, China.
M
ponding author:
an
1
cr
Nizhuan Wang1, Weiming Zeng1, Yingchao Shi1, Tianlong Ren1, Yanshan Jing1, Jun Yin1, Jiajun Yang2
ce pt
Type of the article:
ed
Weiming Zeng, College of Information Engineering, Shanghai Maritime University, Shanghai, 201306, CHINA; Tel: 86-021-38282873; Fax: 86-021-38282800; E-mail address:
[email protected] Ac
Full-length research paper
Page 1 of 78
Highlights
1.
Wavelet shrinkage leads to sparse approximation of the sources and suppresses
ip t
noise; Sparse approximation coefficients set satisfies the super-Gaussian distribution;
3.
WP reconstruction generates an effective approximation with nuisance exclusion;
4.
WASICA exhibits the superiority in signal recovery and temporal dynamics
us
cr
2.
extraction;
an
WASICA could recover the super-Gaussian sources or some Gaussian-like
ce pt
ed
M
sources;
Ac
5.
Page 2 of 78
WASICA: An effective wavelet-shrinkage based ICA model for
cr
ip t
brain fMRI data analysis
1
an
Jiajun Yang2*
Lab of Digital Image and Intelligent Computation, Shanghai Maritime University, Shanghai,
M
201306, China.
Department of Neurology, Shanghai Jiao Tong University Affiliated Sixth People’s Hospital,
ce pt
Shanghai, 200233, China.
ed
2
us
Nizhuan Wang1, Weiming Zeng1*, Yingchao Shi1, Tianlong Ren1, Yanshan Jing1, Jun Yin1,
Corresponding author at: College of Information Engineering, Shanghai Maritime University, 201306,
China;
Tel:
86-021-38282873;
Fax:
86-021-38282800;
E-mail
Ac
Shanghai,
address:
[email protected] (W. Zeng). Department of Neurology, Shanghai Jiao Tong University Affiliated Sixth People’s Hospital, Shanghai, 200233, China; E-mail address:
[email protected] (J. Yang).
1
Page 3 of 78
Abstract Background: Researches declared that the super-Gaussian property contributed to the success of some spatial independent component analysis (ICA) algorithms in brain
ip t
fMRI source separation (e.g., Infomax and FastICA), which implied that sparse
cr
approximation transforming the sources (super-Gaussian or Gaussian-like) with
stronger super-Gaussian feature would possibly improve the separation performance
us
of these algorithms.
an
New method: This paper presented a novel wavelet-shrinkage based ICA (WASICA) model, an extension of our previous SACICA, for single-subject analysis. In contrast,
M
two main aspects had been effectively enhanced: (1) sparse approximation
ed
coefficients set formation, made up of two sub-procedures: the wavelet-shrinkage of wavelet packet (WP) tree nodes, and the automatic nodes selection and integration
ce pt
based on the relative WP energy; (2) ICA-based decomposition and reconstruction, composed of temporal dynamics extraction using ICA, WP reconstruction based on the sparse approximation coefficients set and least-square-based functional networks
Ac
reconstruction.
Results: The wavelet-shrinkage and the automatic nodes selection and integration simultaneously transformed both the mixtures and underlying sources into effective sparse approximation coefficients sets, exhibiting stronger super-Gaussian distribution; WP projected-back approximation with nuisance-exclusion contributed to networks reconstruction. Comparison with existing methods: Simulation 1 revealed WASICA successfully 2
Page 4 of 78
recovered super-Gaussian and some Gaussian-like sources. Simulation 2 and hybrid data experiments showed that WASICA with good temporal performance had stronger source recovery ability and signal detection sensitivity spatially than FastICA,
ip t
Infomax and SACICA did; the higher intra-consistency in task-related experiments denoted WASICA occupied stronger spatial robustness against smooth kernels.
cr
Conclusions: WASICA was a promising brain signal separation model with charming
us
spatial-temporal performance.
Keywords: fMRI; wavelet packet decomposition; sparse approximation; wavelet
Ac
ce pt
ed
M
an
shrinkage; ICA
3
Page 5 of 78
1. Introduction 1.1. ICA and fMRI source separation Recently, the independent component analysis (ICA, spatial/temporal ICA) has
ip t
been widely used to analyze the blood oxygen level dependent (BOLD) functional
cr
magnetic resonance imaging (fMRI) data, aiming at detecting functional connectivity
among discrete cortical brain regions at single-subject level or multi-subject level, due
us
to its fully data-driven feature without the prior neuroscience knowledge or
an
experience (McKeown et al., 1998; Biswal et al., 1999; Beckmann & Smith, 2004, 2005; Calhoun et al., 2001a, 2001b; Schöpf et al., 2010; Wang et al., 2012a). In
M
contrast to the univariate general linear model (GLM), which is performed on a
ed
voxel-by-voxel basis (Bagarinao et al., 2003), the ICA method is intrinsically a multivariate approach, which could extract multiple brain networks that are engaged
ce pt
in various elements of cognitive processing. This capability makes ICA an increasingly attractive exploratory tool to study functional networks at rest (Beckmann et al., 2005; Damoiseaux et al., 2006; Schöpf et al., 2010; Wang et al.,
Ac
2012a), during a cognitive task (Calhoun et al., 2008), or under the mental disorder condition, for example, autism spectrum disorder (Ambrosino et al., 2014) and dementia (Rytty et al., 2013). 1.2. Combining wavelets with ICA Very recently, Calhoun et al. (2013) has fully demonstrated that the unimodal super-Gaussian distributions of the sources’ probability density functions (pdf) 4
Page 6 of 78
contribute to the success of FastICA (Hyvärinen et al., 1997) and Infomax (Bell & Sejnowski, 1995) with standard parameter settings. This verification by Calhoun and his colleagues implies that if the pdfs of the underlying sources or the transformed
ip t
sources occupy the stronger unimodal super-Gaussian distributions, the possibly better source separation performance will be achieved by ICA with regular parameters
cr
setting, e.g., Infomax with sigmoid nonlinearity assuming a unimodal super-Gaussian
us
source. Recently, some studies have suggested that the separation performance of FastICA can be further improved by sparse approximation of the fMRI mixtures, so
an
that the coefficients of the transformed data (and underlying sources) exhibit stronger
M
super-Gaussian (Wang et al., 2012b, 2013). Fortunately, according to the former researches, the sparse approximation process can be fast achieved by the wavelet or
ed
wavelet packet transformation, exploiting the sparseness of the wavelet coefficients (Kisilev et al., 2000, 2003; Bronstein et al., 2005; Khullar et al., 2011a, 2011b; Wang
ce pt
et al., 2013). For example, Khullar et al. (2011a) recently have presented a wavelet-domain based framework (w-ICA) for de-noising and source separation on
Ac
the fMRI mixtures at single-subject level, which utilized the multidirectional de-noising scheme with three dimensional stationary discrete wavelet transform. In comparison to the Gaussian smoothing de-noising, this method has demonstrated significant improvement and preserved more accurate signal shape. As a development of w-ICA, Khullar et al. (2011b) have extended this framework to de-noise the multi-group (healthy/patient) fMRI data, gaining higher specificity and increasing shape accuracy compared with Gaussian smoothing. The authors have pointed that the 5
Page 7 of 78
de-correlated nature of the denoised wavelet coefficients was an aid to the more accurate source estimation in w-ICA framework in the wavelet domain. However, in our opinion about this point, the sparseness of the denoised wavelet coefficients is
ip t
also probably beneficial to the separation, owing to that the threshed wavelet coefficients are expected to have the stronger super-Gaussian distribution. Further,
cr
Wang et al. (2012b, 2013) have reported that the sparse approximation coefficients set
us
formed based on the Lp norm of the wavelet packet tree nodes was useful to blind source separation of fMRI data using ICA. Besides the fMRI signal separation field,
an
particularly, Kisilev et al. (2000, 2003) have demonstrated that adopting the selected
M
wavelet coefficients with higher sparsity yielded better separation performance while adopting all wavelet coefficients yielded intermediate sparsity and separation
ed
performance. Bronstein et al. (2005) also have reported that the sparse ICA (SPICA) achieved better separation performance in separating the transmitted and reflected
ce pt
images blindly in comparison to the standard ICA. The above studies imply that the wavelet-based processing combining with a source separation technique such as ICA
Ac
could be possibly promising to the functional connectivity detection on human fMRI data.
1.3. What do we propose? Founding
on
the
researchers’
former
studies,
in
this
research,
a
wavelet-shrinkage based ICA model (WASICA) with more relaxed sources’ pdfs assumptions is proposed, aiming at separating the super-Gaussian sources and some Gaussian-like sources, which is a further development of our previous SACICA 6
Page 8 of 78
model (Wang et al., 2013). In contrast to SACICA, the proposed model is with main validity and novelty in two procedures: the sparse approximation coefficients set formation procedure and ICA-based decomposition and reconstruction procedure. The
ip t
sparse approximation coefficients set formation takes advantage of the following two operations: the wavelet shrinkage of the wavelet packet tree nodes and an automatic
cr
nodes selection and integration based on the relative wavelet packet energy (RWPE),
us
transforming the wavelet packet coefficients into the sparse approximation coefficients set with obviously stronger super-Gaussian distribution than the original
an
spatial domain fMRI data. Meanwhile, the formed sparse approximation coefficients
M
set forces the underlying sources in the mixtures to be sparser or stronger super-Gaussian. Thus, as the aforementioned research (Calhoun et al., 2013) implied,
ed
the sparse approximation coefficients set about the fMRI mixtures is expected to enhance the estimation accuracy of the sources (e.g., super-Gaussian and
ce pt
Gaussian-like sources) as well as the mixing matrix when combining with ICA method. Also, the unmixing operation in ICA-based decomposition and reconstruction
Ac
procedure is based on the denoised reconstruction approximation data, generated by inverse wavelet packet transformation of the sparse approximation coefficients set, which is beneficial to generate more accurate functional networks. The remainder of this paper is organized as follows: the theory and framework of WASICA will be firstly presented, and followed by the simulation and hybrid data experimental tests, which are designed to evaluate the signal recovery ability, signal detection sensitivity and temporal dynamics extraction performance concerning 7
Page 9 of 78
WASICA in comparison to FastICA, Infomax and SACICA. Meanwhile, the real fMRI datasets such as the task-related data and resting-state data are also applied to verify the effectiveness and some merits of WASICA. Finally, results and analysis
advantages and limitations of this new data analysis model.
cr
2. Theory and Method
ip t
will be presented together with interpretations, discussions and conclusions related to
us
2.1. Sparse approximation theory
an
The functional connectivity detection of the human brain activities can be formulated as a kind of blind source separation (BSS) problem, assuming that there
M
exists the functional integration of activity in multiple macroscopic loci (McKeown et
ed
al., 1998; Biswal et al., 1999; Beckmann & Smith, 2004). This BSS problem aims at retrieving the underlying sources, namely the functional networks, denoted in vector
ce pt
s1 s 2 notation as S with the size Q M , assuming the observed mixtures, namely sQ
Ac
x1 x the fMRI mixing data, X 2 with the size P M , can be written as: xP
X AS ,
(1)
where A is a mixing matrix (or temporal dynamics) with the size P Q from RQ to R P , the row vectors si and xi are both of size 1 M (Comon & Jutten, 2010). P is the number of timepoints, Q is the number of hidden sources, and M is the
number of voxels. Next, it is assumed that each source can be (approximately) 8
Page 10 of 78
represented by a linear combination of a few row atoms m with the size 1 M , of be unit energy. Specifically, each (unknown) source signal can be written as: K
si ci (m)m
(2)
m 1
where only few entries ci (m) of row vector cs are significant, most of them being
ip t
i
cr
negligible. Informally, a set of coefficients ci (m) are considered sparse if most of the coefficients are zero or very small and only a few of them are significant. In statistical
are said to have a sparse or
us
(or probabilistic) terms, coefficients ci (m)
an
super-Gaussian distribution if their histogram (or probability density) has a strong peak at the origin with heavy tails. Meanwhile, as the scholars Calhoun et al. (2013)
M
have pointed out, the appropriate choice of origin for the coordinate system in which one foresees to evaluate the sparseness of the signal is quite essential and should be
ed
ce pt
noted. These atoms m
1 are called a complete dictionary 2 with the size K
K M if these atoms can be utilized to reconstruct any signal si which lives in a
Ac
spatial domain of dimension R M . Thus, the sparse model of source si can be written in matrix form:
where csi ci (1), ci (2),
si csi , ci ( K -1), ci ( K )
is a row vector of
(3) K
coefficients.
Furthermore, the sparse representation model of the Q sources S can be formulated as follow: S Cs ,
(4)
9
Page 11 of 78
where the C s satisfies:
, Cs ( K ) , where Cs with the size Q K .
ip t
cs1 cs Cs 2 Cs (1), Cs (2), csQ
When the dictionary is a basis for the signal space, as is the discrete orthonormal
cr
Fourier basis, or a bi-orthogonal wavelet basis (Mallat, 1998), the only transform
us
allowing perfect reconstruction is 1 .
the dictionary takes the following form:
an
Very similarly, a sparse approximation process of the mixtures X considering
, C X ( K ) is a matrix with dimensions P K in
ce pt
ed
cx1 cx2 where C X C X (1), C X (2), cxi c xP
(5)
M
X CX ,
which each c xi is a row vector of K coefficients cx (m) . The C X is called as the i
Ac
sparse approximation coefficients of X subjecting to the dictionary . In this study, the K is assumed equal to M , in which case equations (2)-(4) become equalities instead of approximations. Combining formula (1), (4) and (5), the resultant formula (6) is deduced as follow: CX ACs ,
(6)
when a proper dictionary is selected. For example, if is an orthonormal basis, a unique representation exists for the mixtures (and the sources) and can thus be 10
Page 12 of 78
calculated simply by inverse linear transform, necessarily satisfying CX ACs . According to formula (6), the original BSS problem of X is reduced to a separation problem of sparse C X with likely sparser Cs . Comparing formula (1) with (6), the
ip t
mixing matrix A is unchanged if the approximation CX ACs is with sufficient accuracy by a certain sparse approximation algorithm; for example, the matching
cr
pursuit algorithm (Tropp, 2006; Tropp et al., 2006) was shown to guarantee the
us
relation CX ACs when Cs was sufficiently sparse (Gribonval & Nielsen, 2008). Namely, the mixing matrix A could be estimated on the sparse approximation
an
coefficients C X .
M
Sparse approximation coefficients C X of the mixtures X can facilitate both the estimation of the mixing matrix A and that of the sources by blind source
ed
separation methods such as ICA with Infomax or FastICA algorithms, owing to both C X and Cs being sparser or with stronger super-Gaussian distribution, enhancing
ce pt
the separation performance of these ICA algorithms (Calhoun et al., 2013). Good sparse approximation is challenging in fMRI because the set of significant
Ac
coefficients is different for each volume. Our approach using RWPE guarantees "good quality" estimation of the sparse approximation coefficients using a common C X across volumes (see section 2.2.2.2). Meanwhile, the sparse approximation process of the mixtures could be implemented effectively by wavelet or wavelet packet transformation, which can yield sources with diverse levels of sparsity (Kisilev et al., 2000, 2003; Bronstein et al., 2005; Wang et al., 2012b, 2013). Thus, founding on the above sparse approximation theory, the promising model, namely WASICA, with 11
Page 13 of 78
regard to the sparse coefficients set C X formation and the extraction of temporal dynamics A and functional networks S will be presented in the next section. 2.2. Framework of WASICA
ip t
The framework of WASICA is presented in Fig.1 vividly, which consists of three
cr
main procedures. The wavelet packet decomposition (P1, in Fig.1) regarding to each
volume of mixtures X is firstly implemented, which yields a series of wavelet
us
packet trees. The sparse approximation coefficients set formation (P2, in Fig.1) is
an
followed immediately. This procedure mainly consists of two sub-procedures: the wavelet shrinkage of the wavelet packet tree nodes of each volume (P2.1, in Fig.1)
M
and the automatic nodes selection and integration based on the RWPE (P2.2, in Fig.1).
ed
The ICA-based decomposition and reconstruction (P3, in Fig.1) comes lastly, which consists of three sub-procedures, namely temporal dynamics extraction using ICA
ce pt
(P3.1, in Fig.1), wavelet packet reconstruction based on the sparse approximation coefficients set (P3.2, in Fig.1) and least-square-based functional network reconstruction (P3.3, in Fig.1) orderly. Next, the details of P1, P2 and P3 are
Ac
illustrated in the following 2.2.1, 2.2.2 and 2.2.3 sections, respectively. 2.2.1. Wavelet packet decomposition Sparsity means that only a small fraction of coefficients differ significantly from zero when the signals are properly represented (Kisilev et al., 2000, 2003; Wang et al., 2013), whose pdfs exhibit the stronger super-Gaussian distributions having an obvious peak at the origin with heavy tails. This property will play a great role in 12
Page 14 of 78
blind source separation using Infomax and FastICA algorithms, with regular parameters setting (Calhoun et al., 2013). Considering this, to obtain the sparse approximation coefficients set of the fMRI mixtures X , the 1D wavelet packet
ip t
transformation is firstly implemented on the volumes of the fMRI mixtures, which is similar to the SACICA model. The main reason for the use of 1D wavelet is that each
cr
volume data embedding too many zero leads to that the noise level i about the
us
mixture xi regarding to the following wavelet shrinkage procedure (see section 2.2.2.1) could not be accurately estimated by means of the proposed interquartile
an
range measure when using 3D wavelet under the proposed WASICA framework;
the discussion of Section 5.3).
M
however, 1D wavelet works well enough for handling this condition in WASICA (see
ed
For fMRI sources having oscillatory components, wavelet packets might be appropriate to express these signals. Also, using wavelet packets allows one to
ce pt
analyze given signals not only with a scale-oriented decomposition but also on frequency sub-bands. In the following, a brief theory of wavelet packet transform will
Ac
be presented (the details for Kisilev et al., 2000; Wang et al., 2013). Particularly, the wavelet packet library consists of the triple-indexed family of functions:
jlk (t ) 2 j /2 l (2 j t k ), j,k , l .
(7)
As in the case of the wavelet packet transform, j , k are the scale and shift parameters, respectively, and l is the frequency parameter, related to the number of oscillations of a particular generating function l (t) . The set of functions jl (t) forms a
j, l
wavelet packet and this set of functions can be split into two parts at a 13
Page 15 of 78
coarser scale: j 1,2l (t) and j 1,2l +1 (t) . It follows that these two form an orthonormal basis of the subspace which spans jl (t) . A family of wavelet packet functions on a binary tree could be formed (as shown in Fig.2). The nodes of this tree are numbered , J -1, J
( J
denoting the
ip t
j 0, 1,
by two indices: the depth of the level
, 2 j -1 at the specific
decomposition level) and the number of nodes l = 0, 1, 2,
cr
level j (Mallat, 1998; Kisilev et al., 2000). Furthermore, for the ith source signal
(or functional network) si , the decomposition coefficients csjlk could be expressed as:
(8)
,
denotes the inner product operation and the coefficients csjlk are spilt
j, l
sets corresponding to the nodes of the wavelet packet tree. Further,
i
M
where into
an
csjlk = si , jlk , i
us
i
ed
equation (8) can be written in matrix form as CS S T , where T denotes the transpose operation of a matrix. Similarly, the decomposition coefficients cxjlk of xi i
ce pt
(the ith volume of the fMRI mixtures) could be written as: cxjlki = xi , jlk .
(9)
Ac
Thus, the further matrix form of equation (9) can be expressed as CX X T . In terms of the volume xi , the different wavelet packet tree nodes yield different degrees of source sparsity, and are naturally de-correlated. In particular, the scholars Kisilev et al. (2003) demonstrated that selecting wavelet coefficients which yield higher source sparseness also leads to better separation performance while utilizing all wavelet coefficients yield intermediate separation performance. This research denotes that the proper selection of the wavelet packet tree nodes is a key issue and has 14
Page 16 of 78
obvious impact on the signal source separation performance. Further, Wang et al. (2013) proposed a SACICA model for fMRI analysis, which adopted the FastICA to extract the mixing matrix on a single wavelet packet node selected based on a Lp
ip t
norm, generating better spatial source recoverability and signal detection sensitivity than FastICA. In contrast to SACICA, in this study, the wavelet shrinkage and novel
cr
automatic node selection and integration operations over the wavelet packet tree
us
nodes are proposed to form the more effective sparse approximation coefficients set
an
of the fMRI mixtures. The concrete procedures will be presented in the next section. 2.2.2. Sparse approximation coefficients set formation
M
2.2.2.1. Wavelet shrinkage of wavelet packet tree nodes
ed
For convenience and simplicity, suppose that there are P volumes for the fMRI mixtures and each volume with M voxels is analyzed by the 1D wavelet packet
ce pt
decomposition (which works well for noise level estimation, see Section 5.3) with J levels, yielding 2 j nodes at level j , totally generating P wavelet packet trees. The lth node at level j of the ith volume is denoted as Node(i, j , l ) , where
Ac
1 i P , 0 j J and 0 l 2 j 1 , respectively. Specifically, the node indexes at
the level J is recorded in the vector INDEX J 0,1,
, 2 J 1 with the size
1 2 J .
So far, the wavelet shrinkage has been extensively studied by many scholars and presents state-of-the-art performance in signal denoising (Donoho & Johnstone, 1994,1995; Donoho, 1995), power spectrum estimation (Moulin,1994), etc. The success of wavelet shrinkage is mainly owing to the fact that the wavelet transform is 15
Page 17 of 78
good at energy compression and the assumption that the small coefficients are more likely due to noise, while large coefficients are important signal features (Donoho & Johnstone, 1995). Gao (1998) has pointed out that shrinkage of the empirical wavelet
ip t
coefficients worked best when the underlying set of true coefficients was sparse. Meanwhile, we have noted that the wavelet shrinkage not only suppressed the noise in
cr
wavelet coefficients, but also simultaneously contributed to a sparse approximation to
us
both the mixtures X explicitly and the underlying sources S implicitly. In wavelet shrinkage, common shrinkage functions include the hard thresholding, soft
an
thresholding (Donoho & Johnstone, 1994, 1995; Donoho, 1995), Semisoft (Gao &
M
Bruce, 1997), Garotte (Gao, 1998), and Hard-Soft (Zeng et al., 2004). In this study, the hard shrinkage function is chosen to perform the thresholding operation for each
ed
node at the J level of each wavelet packet tree, except the approximation node (node index l 0 ). The corresponding formulas are expressed as follows:
ce pt
Node H (i,J,l ) Hi ( Node(i,J,l )), l INDEX J l 0 ,
H ( x)
(11)
i i 2log(n) ,
(12)
i
Ac
0 x i , x x i
(10)
i IQR Node(i,J,1: 2J 1) *0.7413 ,
(13)
where 1 i P , IQR represents the calculation of interquartile range (a robust variability measure in statistics) of a given vector, and n represents the length of the vector Node(i,J,1: 2 J 1) , which is a concatenation of the nodes at the J level of the ith wavelet packet tree except the approximation node Node(i,J,0) . After the wavelet shrinkage is implemented on the level J nodes Node(1: P,J,1: 2 J 1) , the 16
Page 18 of 78
corresponding
threshed
nodes
Node H (1: P,J,1: 2 J 1)
are
retrieved,
where
NodeH (1: P,J,0) Node(1: P,J,0) .
2.2.2.2. Automatic nodes selection and integration based on the RWPE
ip t
After the nodes NodeH (1: P,J,0 : 2J 1) are retrieved, how to select and
cr
integrate these wavelet packet tree nodes and to form the sparse approximation coefficients set should be investigated, which has quite important effects on the
us
separation performance of WASICA. Fortunately, the wavelet-based information tools
an
are popular to apply in brain signal for quantitative analysis. For example, wavelet energy, relative wavelet energy, wavelet entropy, relative wavelet entropy and wavelet
M
statistical complexity have been often used in the characterization of scalp EEG
ed
records corresponding to secondary generalized tonic-clonic epileptic seizures (Rosso et al., 2001, 2006). In this study, the relative wavelet packet energy (RWPE) criterion
ce pt
is used to select the threshed wavelet tree nodes and form the sparse approximation coefficients set. Next, the formulation definition of RWPE of the threshed wavelet packet tree nodes is given for convenience. According to the above denotation, the
Ac
energy of the lth node at level J of the ith volume can be formulated as follow:
where
F
Energy i,J,l Node H (i,J,l ) , 2
(14)
F
denotes the Frobenius norm of a vector. The total energy of ith volume
can be expressed as: 2 J 1
Energytot i Node H (i,J,l ) l 0
2 F
.
(15)
Then, the RWPE of the lth node at level J of the ith volume has the following 17
Page 19 of 78
formation: RWPE i,J,l
Energy i,J,l . Energytot i
(16)
Further, the mean relative wavelet packet energy (MRWPE) and percentage of
calculated as the next respective expressions:
cr
MRWPEJ l 2 1 J
MRWPE l
,
(17)
(18)
J
an
PMRWPEJ l
1 P RWPE i,J,l , P i 1
us
MRWPEJ l
ip t
MRWPE (PMRWPE) for the lth node at level J across all P volumes could be
l 0
where l INDEX J . Thus, according to the formula (18), the measuring vector
M
PMRWPEJ 0 : 2 J 1 for each node at level J regarding to all the P wavelet
ed
packet trees is retrieved. In the follow, the vector PMRWPEJ 0 : 2 J 1 is sorted in descending order, and select the top q elements in PMRWPEJ 0 : 2 J 1 such that q 1
l 0
ce pt
PMRWPE l Thresh J
is satisfied, where Thresh is an arbitrary threshold that
represents the proportion of total energy retained; meanwhile, the correspondingly
Ac
selected nodes indexes are recorded in the vector INDEX with the size of q . Immediately, the nodes selection operation can be formulated as: H Node 1: P, J , l , l INDEX Node 1: P, J , l , 0, l INDEX l INDEX J H sel
(19)
where the bold 0 denotes the zero vector. In the last, the sparse approximation coefficients set regarding to the fMRI mixtures can be integrated as follow: H CX Nodesel 1: P, J , INDEX J .
(20)
18
Page 20 of 78
Hereunto, the sparse approximation coefficients set C X is formed, which not only transforms the mixtures into one with sparser feature (or with stronger super-Gaussian property), but also is expected to lead the underlying sources C s in C X to stronger
ip t
sparsity (or stronger super-Gaussian distribution) than the si in X did. According to the research (Calhoun et al., 2013), the sparse approximation coefficients set C X
cr
is expected to be beneficial to the super-Gaussian or Gaussian-like sources separation
us
using ICA with FastICA and Infomax algorithms, whose performance is validated in
an
the Results and Analysis section. 2.2.3. ICA-based decomposition and reconstruction ICA-based
decomposition
and
reconstruction
M
The
consists
of
three
ed
sub-procedures: (i) temporal dynamics extraction using ICA, similar to SACICA; (ii) wavelet packet reconstruction based on the sparse approximation coefficients set; (iii)
ce pt
least-square-based functional networks reconstruction. After the sparse approximation coefficients set C X is formed, the first sub-procedure takes place. Thinking back to formula (6) CX ACs , ICA could be utilized to extract the temporal dynamics A
Ac
from C X , where the estimator of A by ICA is denoted as A . Compared with formula (1) X AS , in presence of noise, formula (6) is expected to retrieve the more accurate estimation A of A due to the sparser feature (or stronger super-Gaussian distribution) of C X and C s , in contrast to X and S , respectively. In the follow, the approximation xi for the fMRI mixture xi could be achieved by inverse wavelet packet transformation immediately, based on the sparse H approximation coefficients cxi , where cxi Nodesel i, J , INDEX J . Thus, the
19
Page 21 of 78
approximation of the ith volume could be formulated as follow: xi cxi , i 1, 2,
,P.
(21)
Further, the wavelet packet reconstruction approximation of the fMRI mixtures X is
cr
ip t
x1 achieved as X xi . xP
us
Since S CS , X CX , and CX ACS , then it follows that X AS and,
A
T
A
1
A
T
X,
(22)
M
s1 s2 S sQ
an
thus, S can be estimated using least squares as shown in following equation:
ed
where X preserves the denoising feature in contrast to X , and A is the estimator temporal dynamic A determined by ICA. Meanwhile, since X is considered a
ce pt
denoised version of X , it should improve estimation of the sources in the least squares procedure and the related experimental validation is presented in Section Appendix B.
Ac
2.3. Implementation of WASICA
In this section, the implementation of WASICA will be briefly described. The
mainly involved operations such as the wavelet packet decomposition, wavelet packet nodes thresholding, nodes energy calculation and inverse wavelet packet transformation were implemented by the in-house functions wpdec, wpthcoef, wenergy and wprec, respectively, provided by the Matlab software (Matlab 2012b, 20
Page 22 of 78
Mathworks Inc., Sherborn, MA, USA). Meanwhile, the Infomax (Makeig et al., 1996) implemented in ICA Toolbox for Psychophysiological Research (version 3.4) or FastICA
(Hyvärinen
et
al.,
1997)
in
FastICA
2.5
toolbox
ip t
(http://research.ics.aalto.fi/ica/fastica/code/dlcode.shtml) was utilized in WASICA to perform the ICA decomposition.
cr
Based on the above description, we have established all the theory, methods and
us
implementation details about WASICA; and, next, we will give some experimental
proposed WASICA in following sections.
M
3. Experimental Tests
an
tests with analysis and discussion to verify the effectiveness and merits of the
ed
The designed experimental tests consist of simulation data, hybrid data, task-related data and resting-state data experiments. The performance evaluation
ce pt
includes the following aspects: the separation performance of WASICA regarding to super-Gaussian sources or Gauss-like sources is firstly investigated on Simulation 1 dataset; then, the recovery ability and detection sensitivity of functional signal and the
Ac
accuracy of temporal dynamics extraction as to WASICA, compared with FastICA, Infomax and SACICA, are assessed on Simulation 2 dataset and the hybrid dataset; the effectiveness and merits validation of WASICA on task-related dataset are further investigated with temporal correlation analysis and spatial intra-consistency analysis, in comparison to GLM and other three above mentioned methods; at last, the effectiveness verification of WASICA on the resting-state dataset is also implemented.
21
Page 23 of 78
3.1. Simulation dataset 3.1.1. Simulation 1: Super-Gaussian and Gauss-like sources dataset These sources dataset was provided by the scholars Vince D. Calhoun and
ip t
Rogers F. Silva, which was recently utilized to explore the really successful nature of some ICA algorithms, e.g., FastICA and Infomax (Calhoun et al., 2013). These
cr
sources dataset includes four cases of simulation sources, i.e., the case of unimodal
us
super-Gaussian sources with small box size (Case A), the one of bimodal close-to-Gaussian sources (in the sense of kurtosis) with medium box size (Case B),
an
the one of bimodal very-close-to-Gaussian sources (in the sense of kurtosis) with
M
large box size (Case C) and the one of unimodal super-Gaussian sources with very large box size (Case D); each case consists of 100 realizations and each realization is
ed
with two simulated sources (S1 and S2) (for details, please see Figure 1 and Table 1 of Calhoun et al., 2013). On the basis of known paired sources S1 and S2 of every
ce pt
realization regarding to each case, the randomly generated mixing matrix with size of 2 2 , whose values belonged to
0,1 ,
was used to form the simulated mixtures
Ac
(formed 100 mixture realizations for each case, totally 400 mixtures realizations for all above four cases), which could be applied to completely assess the performance of WASICA.
3.1.2. Simulation 2: A comprehensive simulation data The recently released toolbox SimTB (http://mialab.mrn.org/software; Erhardt et al., 2012) was used to generate a comprehensive simulation dataset to evaluate the performance of the FastICA, Infomax, SACICA and the proposed WASICA in this 22
Page 24 of 78
study. The SimTB toolbox could provide the ground truth of the simulated spatial maps (SM) and time courses (TC) that different estimation methods could be compared against, which was used in evaluating the capability of the temporal domain
ip t
concatenated group ICA (Calhoun et al., 2001b) under conditions of spatial, temporal, and amplitude variability (Allen et al., 2012) and the intrinsic networks extracting
cr
performance of first-level ICA and second-level ICA (Calhoun & Allen, 2013). Thus,
us
one subject data was generated, with V=148×148 voxels, 12 spatial sources without expansion or contraction (selected from the default source template provided by
an
SimTB) and 120 time points at TR (time of repeat)=2 seconds. The baseline intensity
M
was set to 800, and the baseline map was shown in Fig.3A. Each source, depicted in Fig.3B, represented a spatial pattern that underwent certain activation over time. Two
ed
sources (10 and 12) shared task-related modulation in addition to having unique fluctuations. For source 10, the strength of task-modulation (expressed as the ratio
ce pt
between task event amplitude and unique event amplitude) was set to 4, while task-relatedness was smaller for source 12, set to 2. Task-modulation was introduced
Ac
with a block design (24 seconds (s) off, 24s on, five blocks), which was convolved with a canonical hemodynamic response function to simulate the slow dynamics of the vascular response (Friston et al., 1995). Activation for the other ten sources was described solely by unique hemodynamic fluctuations with no task-related variation. All sources had unique events that occurred with a probability of 0.2 at each TR. For task-modulated sources (10 and 12), unique events were added with smaller amplitudes (0.2 and 0.4, respectively). For sources not of interest (no task modulation), 23
Page 25 of 78
the unique amplitude was 1. For all sources, the percent signal change was centered at 3 with a standard was deviation of 0.25. Additive noise was included to reach a specified contrast-to-noise ratio of 1. The TCs corresponding to the selected 12
ip t
sources were depicted in Fig.3C.
cr
3.2. Hybrid data
To assess the source recovery ability of WASICA at a deeper level, the hybrid
us
data experimental test was designed. One 7×7×7-voxel simulated signal S (marked in
an
Fig.4A) was added to the normalized resting-state fMRI data, at an amplitude equal to 0.3 times of the average amplitude of the signal of in-brain voxels within the mean
ed
mixing function, as shown in Fig.4B.
M
brain volume over all time-points. The signal changed over time, corresponding to
Furthermore, in order to evaluate the detection sensitivity of functional signal
ce pt
concerning WASICA, the weak signal change in contrast to the background was constructed by the smoothing technique on hybrid data at variant smooth kernels (namely full width at half maximum, FWHM). The illustration was shown in Fig.5,
Ac
where Area A denoted the added original simulated signal S, and Area B denoted the subtle signal change area (generated by smooth technique) related to the simulated signal S on the surrounding area of signal S. We called union of Area A and Area B the affected area of signal S due to smoothing, denoted as Area A B . If a
certain analysis method could discover more voxels in Area A B influenced by smooth operation, we said that this method had better performance of detection sensitivity of functional signal. Namely, for a given method, the discovered region 24
Page 26 of 78
closer to Area A B , the stronger functional signal detection sensitivity. 3.3. Task-related Data One participant took part in this designed visual stimulus task experimental test,
ip t
informed about the purpose of this study and given the written informed consent prior
cr
to inclusion. The visual paradigm was OFF-ON-OFF-ON-OFF-ON-OFF in 20 seconds each block. The visual stimulus was a radial blue/yellow checkerboard,
us
reversing at 7 Hz, corresponding to the “ON” state. At the “OFF” state, the participant
an
was required to focus on the cross at the center of the screen. The BOLD fMRI dataset was acquired on a Philips 3.0 Tesla scanner using a multi-element receiver coil
M
to allow partially parallel image acquisition, a single-shot SENSE gradient echo EPI
ed
with TR of 2.0s. There were 40 slices providing the whole-brain coverage, with a SENSE acceleration factor of 3.0 and scan resolution of 80×80; the in-plane
ce pt
resolution was 3mm×3mm; slice thickness was 3 mm; slice gap was 1 mm. 3.4. Resting-state Data
Ac
The resting-state dataset was downloaded from the public neuroimaging database (http://www.nitrc.org/projects/fcon 1000/). This dataset was released by Dr. James J. Pekar and Dr. Stewart H. Mostofsky, including twenty-three healthy participants (8 males and 15 females). The participants’ ages ranged from 20 to 40. BOLD fMRI data were acquired on a Philips 3.0 Tesla scanner using a multi-element receiver coil to allow partially parallel image acquisition with single-shot SENSE gradient echo EPI with the following acquisition parameters: a SENSE acceleration factor=2.0, TR=2.5s, 25
Page 27 of 78
scan resolution=96×96, in-plane resolution=2.67mm×2.67mm, slice thickness=3mm, 47 slices providing the whole-brain coverage and time points=123. In the resting-state experimental test, data from a single subject was selected to evaluate the performance
ip t
of the proposed WASICA.
4. Results and Analysis
cr
4.1. Data processing
us
4.1.1. General data processing description
an
All computation was implemented on a server with Intel(R) Xeon(R) E5606 2.13 GHz processor and 32 GB RAM. The operating system platform was Windows Server
ed
running on the Matlab software.
M
Enterprise Service Pack1. All steps for preprocessing or processing algorithms were
For the simulation data experiments, no preprocessing step was involved. For all experiments,
the
common
preprocessing
was
performed
in
SPM8
ce pt
other
(www.fil.ion.ucl.ac.uk/spm/), including slice-timing, motion correction, spatial normalization. To assess the functional signal detection sensitivity of WASICA, the
Ac
smooth technique with varying Gaussian kernel sizes (FWHM ranged from 1mm to 8mm with step 1mm) was used to generate subtle functional signal area surrounding the simulated signal S on the hybrid data. Similarly, for the task-related data, the smooth Gaussian kernels also ranged from 1mm to 8mm with step 1mm. On the resting-state dataset, the normalized and unsmoothed data was used in Appendix B, and the smoothed data with FWHM equal to 8mm was used in Section 4.6. All the voxels' TCs except the simulation experiments were detrended by a first-degree 26
Page 28 of 78
polynomial (i.e., linear trend) and slow oscillations (i.e., discrete cosine transform (DCT) basis function up to cut-off frequency of 1/250 Hz). In simulation and hybrid data experimental tests, the four methods, namely
ip t
FastICA, Infomax, SACICA, and WASICA, were applied to do the performance comparison, while the GLM was adopted in task-related experimental test besides the
cr
above four methods. In resting-state experimental test, only WASICA was used to
us
explore the resting-state networks (RSN). The order number of simulation data was set to 13 (twelve simulated sources and one baseline component) for the above four
an
data-driven methods. The component order of the other three kinds of experimental
M
tests, for the above four data-driven methods, were automatically estimated by Laplace approximation method (Minka, 2000), which was also used in estimating the
ed
number of sources in the probabilistic ICA model of fMRI data analysis (Beckman & Smith, 2004). For SACICA, the 1D wavelet packet decomposition with three levels
ce pt
(one level for Simulation 1) using db4 wavelet basis and Shannon entropy was applied to the experimental dataset, and the FastICA on the sparse approximation
Ac
coefficients set was implemented (Wang et al., 2013). In WASICA, the 1D wavelet packet decomposition with four levels (one level and three levels for Simulation 1 and Simulation 2, respectively), db4 wavelet basis and Shannon entropy was also applied to the experimental data, where the out-of-brain areas of this data were excluded (no exclusion in the simulation data); for the automatic nodes selection and integration based on the RWPE procedure, the thresholds Thresh for all the experimental tests were set to 0.99. Particularly, the GLM analysis using SPM8 software was applied to 27
Page 29 of 78
the task-related dataset, with the p-value and extent threshold equal to 0.05 (corrected for multiple comparisons) and 10 voxels, respectively. In addition, the location and display of these normalized networks were assessed
ip t
by the PickAtlas software toolbox (Maldjian et al., 2003, 2004) and MRIcroN software (http://www.mricron.com), respectively.
cr
4.1.2. Reliability consideration of ICA estimates
us
In our four designed experimental tests, the stochastic nature of the involved
an
ICAs (e.g., FastICA, Infomax) or ICA-based algorithms (e.g., SACICA, WASICA) was considered. Regarding to Simulation 1, each of the randomly generated 400
M
mixtures (100 mixture realizations for each case, four cases in total) was run 12 times
ed
by each testing algorithm. Meanwhile, for Simulation 2, hybrid, task-related, and resting-state experiments, the stable run results with the maximum stability metric
ce pt
based on the intra-cluster similarity matrix after ICASSO (Himberg et al., 2004) was used, which was firstly proposed by Ma et al. (2011), and was later developed as the default
approach
in
the
GIFT toolbox
(http://mialab.mrn.org/software/gift/,
Ac
GIFTv2.0a). The corresponding parameters of ICA running times, minimum cluster size and max cluster size were set to 12, 2 and 13, respectively. 4.2. Measures for performance assessment To the beginning, the Pearson correlation coefficient and normalized mean square error (NMSE) metrics used in this research could be defined as follows: corrcoef ( A,B)
cov(vec( A), vec( B)) , var (vec( A))var (vec( B))
(23)
28
Page 30 of 78
NMSE ( A,B)
vec( A) vec( B) vec B
2
2 F
,
(24)
F
where A and B were the estimation and reference vectors (or matrices), respectively; cov( ) , var ( ) and vec( ) represented the covariance operation,
ip t
variance operation and vector transform operation (transforming a matrix into a
cr
column vector), respectively.
Further, the normalized inter-symbol interference (ISI) (Moreau & Macchi, 1994;
us
Anderson et al., 2012; Calhoun et al., 2013) was considered to evaluate the separation
an
performance in this study, which was invariant to the scaling and permutation ambiguities:
Q Q Q Q Ga,b Ga,b 1 1 1 , b 1 a 1 max c Gc,b 2Q Q 1 a 1 b 1 max c Ga,c
M
ISI G
(25)
ed
where G WA , Ga,b was the ath and bth element of G and Q represented the number of sources. This performance metric was bounded between 0 and 1 and the
ce pt
lowest the ISI value the better the separation performance. In addition, the receiver operating characteristic (ROC) curve and ROC power
Ac
analysis (Skudlarski et al., 1999) were taken to evaluate the performance among different algorithms. In our hybrid data experiments, ROC curve was utilized to assess the performance of signal recovery ability corresponding to FastICA, Infomax, SACICA and WASICA. Each point determined by the true positive rate (TPR) and false positive rate (FPR) of the ROC curve was calculated on the basis of the real active template (see Fig.5, illustrating the template with FWHM=6mm) and the z-thresh values of the z-score maps varying from 1.0 to 3.0 with
2 step length. The 30
29
Page 31 of 78
sizes of the real active templates of simulated signals in hybrid data experimental test were determined by the varying Gaussian smoothing kernels. The ROC power analysis was designed to evaluate the detection sensitivity of functional signal with
ip t
regard to FastICA, Infomax, SACICA and WASICA across the variant FWHM values. Each ROC power value was the area surrounded by the ROC curve subjected to a
cr
certain FWHM level.
us
4.3. Simulation data result
an
4.3.1. Results for Simulation 1
For each simulated mixture in Simulation 1, the FastICA, Infomaxsuper,
M
Infomaxsub, SACICA, WASICAsuper, WASICAsub and WASICAfast1 were run 12 times
ed
each to alleviate the result random effects of individual runs. Thus, for each method mentioned above, we obtained 1200 separation results with regard to each case, i.e.,
ce pt
Case A, Case B, Case C and Case D. Further, the normalized ISI values were calculated according to equation (25) for each case, and the corresponding mean values and SDs (standard deviation) of ISIs for each case were recorded in Table 1. As
Ac
presented in Table 1, it was easily concluded that (1) FastICA, Infomaxsuper and all versions of WASICA perform well for Case A (the case of unimodal super-Gaussian sources with small box size); (2) WASICAsuper and WASICAsub perform well, and all others mostly fail for Case B (the case of bimodal close-to-Gaussian sources with
1
Infomaxsuper and Infomaxsub denote Infomax with sigmoid nonlinearity that assumes a unimodal super-Gaussian
source and with a nonlinearity which assumes a sub-Gaussian source, respectively. WASICAsuper, WASICAsub and WASICAfast represent WASICA combining with Infomaxsuper, Infomaxsub and FastICA, respectively. 30
Page 32 of 78
medium box size); (3) only Infomaxsub performs well for Case C (the one of bimodal very-close-to-Gaussian sources with large box size); (4) FastICA, Infomaxsuper, WASICAsuper and WASICAsub perform well for Case D (the one of unimodal
ip t
super-Gaussian sources with very large box size). In addition, we also could observe that (1) WASICA embedding with Infomax was better for bimodal Gauss-like sources
cr
estimation than the others did (Case B); WASICA combining with Infomax
us
demonstrated better separation performance than WASICA with FastICA; WASICAsuper with mostly lower mean values and less SDs of ISIs exhibited better and
an
stable performance than WASICAsub did. Thus, we selected WASICAsuper as the one
M
for the following experimental analysis, and we signified WASICAsuper as WASICA for simplicity. Meanwhile, Infomaxsuper was also selected as the representative of
ed
Infomax for the following experimental tests due to the spatial super-Gaussian
ce pt
property of brain functional networks (McKeown et al., 1998). 4.3.2. Results for Simulation 2
In Simulation 2 experimental test, the FastICA, Infomax (namely Infomaxsuper),
Ac
SACICA, WASICA (namely WASICAsuper) were run 12 times each to alleviate the result from random effects of individual runs, and the most stable run results (see Section 4.1.2) of each method were used as the representative results. Meanwhile, the intensity of all the detected SMs and TCs was normalized to the range 1, +1 and +0.5 , -0.5,
respectively, for display purposes. The SMs determined by the above
four methods and the ground truth sources were depicted in Fig.6, denoting that all four methods successfully detected the twelve simulated sources. Generally, the SMs 31
Page 33 of 78
extracted by WASICA were visually with less noise and variation, compared with the estimates by the others. Further, to perform a quantitative performance comparison of the four methods in both the spatial and temporal domains, the Pearson correlation
ip t
coefficients and NMSEs between the spatial/temporal ground truths and the corresponding spatial/temporal estimates determined by the four methods were
0.9826 ) 0.7843,
denoted that all the 12 SMs
us
(with zoom-in correlation range
cr
calculated and plotted in Fig.7. Fig.7A (with normal correlation range 0, 1 ) and B
determined by WASICA had the stronger spatial correlation with their corresponding
an
ground truth sources, compared with the corresponding ones by the other three
M
methods. Further, the minimum one among the spatial correlation coefficients corresponding to WASICA was 0.9403, whereas the minimum one among the spatial
ed
correlation coefficients corresponding to FastICA, Infomax and SACICA were 0.7841, 0.7965 and 0.7843, respectively. This phenomenon also denoted that the WASICA had
ce pt
the more accuracy of source recovery capability in spatial domain, in contrast to the other three methods. In addition, the spatial NMSE curves against the source number
Ac
as to the four methods were depicted in Fig.7E, and it was clearly shown that the spatial NMSE curve corresponding to the WASICA lay at the bottom among the four curves, which further declared that all the twelve SMs extracted by WASICA were indeed closer to the ground truth sources than the corresponding ones by the other three methods. Besides the spatial domain analysis, the temporal domain comparison analysis was also investigated. To the beginning, the temporal Pearson correlation coefficients 32
Page 34 of 78
between the ground truth TCs of all simulated sources and the corresponding TCs determined by the above four methods were calculated, and the temporal correlation curves against different sources corresponding to the above four methods were depicted in Fig.7C (with normal correlation range
1 ) 0,
and D (with zoom-in
ip t
correlation range 0.9360, 1 ). Similar to spatial domain, it also could be observed
cr
that the temporal correlation curve as to WASICA lay at the top among all the four
us
curves. Meanwhile, the mean value and the SD value of each temporal correlation curve were respectively calculated and recorded as 0.9753 and 0.0209 for FastICA,
an
0.9884 and 0.0082 for Infomax, 0.9699 and 0.0203 for SACICA, 0.9959 and 0.0031
M
for WASICA, denoting that WASICA had the largest mean value and least SD value. Furthermore, the temporal NMSE curves against the source number as to the four
ed
methods were also illustrated in Fig.7F, and it was clearly shown that the temporal NMSE curve regarding to WASICA lay at the bottom among the four curves. In
ce pt
addition, the mean ISI values (see equation (25)) and the SD values based on the 12 runs for the above four methods were calculated and recorded in Table 2; As expected,
Ac
the proposed WASICA performed the best with the least mean value and SD value of ISI. To sum up, all the above quantitative analysis in temporal domain revealed that the proposed method had better temporal dynamics extraction performance than the other three methods did on Simulation 2. 4.4. Hybrid data result The ROC curves of signal S on the unsmoothed hybrid data experimental test corresponding to FastICA, Infomax, SACICA and WASICA were depicted in Fig.8A 33
Page 35 of 78
(with normal TPR range
0,1 )
and B (with zoom-in TPR range
0.9329,1 ),
respectively. The ROC analysis demonstrated that WASICA occupied the stronger signal recovery ability in spatial domain than the other three methods did, specifically
ip t
obvious for FPR less than 0.05. Furthermore, in order to assess detection sensitivity of the functional signal regarding to the above four methods, the ROC power analysis
cr
was performed on the hybrid dataset with varying FWHM values (from 0mm to 8mm
power range
0.4284, 0.9968 )
us
with 1mm steps). Fig.8C (with normal power range 0,1 ) and D (with zoom-in presented the evaluated results of the detection
an
sensitivity performance using ROC power analysis, which denoted that the proposed
M
method had the better detection sensitivity of functional signal than the other three methods did. Specifically, the predominance in signal detection sensitivity as to
ed
WASICA was more obvious when the FWHM value was less than 6mm. According to Fig.8D, it was easily observed that FastICA had slightly weaker functional signal
ce pt
detection sensitivity than both SACICA and Infomax did, while these two methods had the similar performance in signal sensitivity detection.
Ac
4.5. Task-related data result
The visual stimulus dataset at different FWHM values was explored by GLM,
FastICA, Infomax, SACICA and WASICA, where the results were displayed in Fig.9. On the basis of Fig.9A, in terms of each row (at a specific FWHM value), the SMs (spatial map) detected by the five methods were very similar, which obviously denoted the effectiveness of the proposed method in spatial domain. Meanwhile, in terms of each column of Fig.9A, it could be observed that the SMs detected by 34
Page 36 of 78
WASICA across the varying FWHM values (the fifth column) were with more consistency and less variation than the ones by the other four methods visually. A quantitative comparison about the spatial consistency is discussed further below in
ip t
detail. Furthermore, the visual stimulus reference block regarding to GLM and the corresponding TCs (note: each method gave one TC for the entire SM with regard to
cr
visual stimulus) respectively given by FastICA, Infomax, SACICA and WASICA on
us
the smoothed data (FWFM=8mm) were exhibited in Fig.9B, and the high temporal correlation (the correlation coefficients marked in Fig.9B) between the TC extracted
an
by WASICA and the other TCs (or reference block for GLM) by the other four
M
methods further demonstrated the temporal domain effectiveness of WASICA for the visual stimulus task data. Similarly, the good temporal domain effectiveness of
ed
WASICA was also supported by the results from the visual task-related data under different FWHM values (see Fig.S1).
ce pt
In addition, the spatial map consistency analysis across variant FWHM values (called intra-consistency) for the visual stimulus task experiment was investigated.
Ac
Suppose that the SMs related to the given stimulus at variant smoothing kernels (ranging from 0mm to V mm with 1mm step) were extracted by a certain method, for example, WASICA, and we denoted the SMs with FWHM size equal to u mm as SM smoothu . Thus, the Pearson correlation coefficients between any two different spatial
maps SM smoothu and SM smoothv could be calculated, where variant smoothing kernel ranged from 0mm to V mm with 1mm step. Thus, on the basis of formula (23), the formulation expression could be represented as: 35
Page 37 of 78
CCSM u,v abs corrcoef ( SM smoothu ,SM smoothv ) , which subjected to u, v 0,1, 2,
(26)
,V 1,V u v . Further, the function vec()
was used to transform the matrix
CCSM
to a vector
V V 1 ' . Based on CCSM , 2
ip t
' ' CCSM vec CCSM , and the length n0 of CCSM was
' CCSM , namely
cr
two intra-consistency metrics across the variant smooth kernels, namely mean
follow:
CC h 1
' SM
(h)
(27)
1
2 CC (h) MCL . h 1 n0
' SM
2
(28)
M
1 VCL n0
n0
an
1 MCL n0
us
consistency level (MCL) and variant consistency level (VCL) could be expressed as
ed
The higher MCL value and lower VCL value, the better intra-consistency of the SMs across the variant smooth kernels. On the visual stimulus experimental test, the MCL
ce pt
and VCL values of the SMs extracted by GLM, FastICA, Informax, SACICA and WASICA, with V equal to 8, were calculated and recorded in Table 3, respectively. As expected, it exhibited that WASICA had the highest MCL values and lowest VCL
Ac
values, which demonstrated that the SMs extracted by WASICA occupied the best intra-consistency against the smooth kernel change among the five methods. The least VCL value of WASICA also supported that the SMs (the fifth column, shown in Fig.9A, detected by WASICA) visually held the strongest intra-consistency and least variation across the variant FWHM values. Last but not the least, from the first row of Fig.9A (unsmoothed), it could be obviously observed that the proposed method discovered more active voxels on the 36
Page 38 of 78
occipital lobe related to the visual stimulus than the other four methods did, where the number of active voxels of each SM (the first row of Fig.9A) determined by GLM, FastICA, Infomax, SACICA and WASICA were 4655, 8227, 8152, 8107 and 9108,
ip t
respectively. Thus, the voxels only discovered by WASICA (on unsmoothed data) in contrast to the GLM, FastICA, Infomax and SACICA, were extracted and depicted in
cr
Fig.10A. It could be seen that the very many voxels (shown in Fig.10A) really
us
belonged to the visual stimulus related occipital cortex area. Furthermore, the TCs corresponding to the depicted voxels were extracted from the original normalized and
an
unsmoothed visual stimulus task data, and the mean TC of these voxels were
M
calculated, normalized and plotted in Fig.10B. Meanwhile, the left circular shift operation with 4 time points (the estimated delay of the BOLD response to neuronal
ed
stimulation) was performed on the normalized mean TC, and the corresponding shifted mean TC was also presented in Fig.10B. It could be clearly observed that the
ce pt
mean TC and shifted mean TC (shown in Fig.10B) of these voxels were highly correlated to the reference block with the Pearson correlation coefficients equal to
Ac
0.3519 and 0.8203 (marked in Fig.10B), respectively. Thus, these results in terms of both the spatial and temporal domains demonstrated the effectiveness of the voxels only detected by WASICA, in comparison to GLM, FastICA, Infomax and SACICA. 4.6. Resting-state data result The proposed WASICA was lastly utilized to assess the RSNs in the resting-state experimental test. As expected, some typical RSNs reported by many previous researches (Beckmann et al., 2005; Damoiseaux et al., 2006; Schöpf et al., 2010; 37
Page 39 of 78
Wang et al., 2012a, 2013) were successfully discovered such as VIN (a predominantly occipital network, Fig.11A), LVN (a lateral visual network involving the lateral and superior occipital gyrus, Fig.11B), DMN (a default mode network involving primarily
ip t
precuneus and prefrontal lobe and parietal regions, Fig.11C), AUN (a network involving the auditory regions, Fig.11D), ECN (an executive control network
cr
including superior and middle prefrontal cortex, anterior cingulate and paracingulate
us
gyri, Fig.11E), DPN1 and DPN2 (networks mainly involving dorsal parietal cortex, Fig.11F and J), LWMN and RWMN (networks including the left and right hemisphere
an
regions known to be involving in working memory, Fig.11G and H), and SMN (a
M
network including bilateral sensorimotor cortex, Fig.11I), being exhibited in Fig.11. In addition, Fig. 11K depicted a network including the superior temporal (BA 22) and
ed
superior frontal (BA 9) areas; Fig. 11L showed part of the prefrontal (BA 10/11) areas and limbic lobe, which seemed to be part of DMN shown in Fig. 11C (Damoiseaux et
ce pt
al., 2006). According to RSNs shown in Fig.11, it clearly demonstrated that WASICA was also effective to perform resting-state data analysis to extract the resting-state
Ac
functional brain sources.
5. Discussion and Conclusion 5.1. Merits of WASICA In this present study, we demonstrated an effective functional connectivity detection model──WASICA, aiming at separating the super-Gaussian sources and some Gaussian-like sources, which demonstrated the superiority of sparse approximation in functional network dynamics detection. Similar to our previous 38
Page 40 of 78
SACICA, the proposed model was composed of three procedures: the wavelet packet decomposition, the sparse approximation coefficients set formation and the ICA-based decomposition and reconstruction. In terms of theory, the proposed model
ip t
first fully took advantage of the sparsity of the wavelet packet tree nodes, which was one of the essential aspects of WASICA success. Then, the wavelet shrinkage
cr
operation of the wavelet packet tree nodes played a great role in not only contributing
us
to a sparser approximation of the fMRI mixtures explicitly and of the sources implicitly, but also suppressing the noise in wavelet coefficients. In the follow, the
an
automatic nodes selection and integration based on the RWPE acted a great part in
M
forming the effective sparse approximation coefficients set as to the mixtures. The features of the wavelet packet transform, wavelet shrinkage operation and the
ed
automatic nodes selection and integration operation were explored in Appendix A, which denoted that these operations could transform the mixtures X (representing
ce pt
the simulation dataset in Appendix A) with sub-Gaussian distribution in spatial domain into the sparse approximation coefficients set C X with strong super-Gaussian
Ac
distribution (or strong sparsity) in wavelet domain. According to the research (Calhoun et al. 2013) and sparse approximation theory, this could contribute to the accuracy of temporal dynamics model estimation for both super-Gaussian and Gaussian-like sources when using ICA (i.e., Infomax) with the regular parameters setting. Furthermore, the approximation set X reconstructed based on the sparse approximation coefficients set C X was expected to exclude much nuisance noise in original data and to enhance the functional network recovery ability and detection 39
Page 41 of 78
sensitivity. Combining with the de-noised approximation dataset
X , the
least-square-based functional networks reconstruction ensured that the estimated functional networks were achieved with more accuracy than the ones on the basis of
ip t
original mixture X , where the related performance validation was presented in Appendix B.
cr
In practice, to begin with, WASICA showed good separation performance for all
us
super-Gaussian cases (Case A and D) and close-to-Gaussian sources case (Case B), and failed in very-close-to-Gaussian sources case (Case C, discussed in Section 5.4)
an
according to the results presented in Table 1. It was worth noting that: on one hand,
M
only WASICA succeeded in Case B where other methods, i.e., FastICA, Infomax and SACICA, mostly failed, which revealed the superiority of WASICA in Gaussian-like
ed
sources separation; on other hand, compared with Infomax, the WASICA embedding with Infomax was less subject to the prerequisite that a nonlinearity assumption
ce pt
should satisfy a unimodal super-Gaussian or a sub-Gaussian source. This merits originated from the effective super-Gaussian property of the formed sparse
Ac
approximation coefficients set in WASICA. Thus, combining WASICA with Infomaxsup was a better choice for signal separation. Further, WASICA had shown the superiority in both the spatial and temporal domain performance in the designed experimental tests. For example, in Simulation 2 experiment, the spatial correlation analysis and spatial NMSE comparison (depicted in Fig.7A, B and E) clearly demonstrated the stronger source recovery ability of WASICA in contrast to the FastICA, Infomax and SACICA; further, the temporal correlation analysis, temporal 40
Page 42 of 78
domain NMSE analysis and the normalized ISI metric evaluation also revealed that the proposed method had the best temporal dynamic model extraction performance among these four methods (shown in Fig.7C, D, F and Table 2). In the follow, in the
ip t
hybrid data experimental tests, the ROC analysis on the unsmoothed hybrid data (shown in Fig.8A and B) denoted that the WASICA held the superiority in the source
cr
recovery ability compared with the other three methods; meanwhile, the ROC power
us
analysis (i.e., the area under the curve, AUC) of the hybrid dataset with different smooth kernels presented that the WASICA was clearly possessed of the stronger
an
signal detection sensitivity than the other three methods (results shown in Fig.8C and
M
D). Thus, the Simulation 2 and hybrid data experimental results revealed that the proposed WASICA had the best spatial and temporal domain separation performance
ed
among the four data-driven methods. Furthermore, the task-related data experimental results also presented the validity and merits of the proposed model. For example,
ce pt
Fig.9 revealed that WASICA had both the spatial domain and temporal domain effectiveness of functional network detection regarding to the visual stimulus data
Ac
across the variant FWHMs; meanwhile, the higher intra-consistency of the visual networks by WASICA revealed that the proposed method held the stronger spatial robustness against the smooth kernels than the GLM, FastICA, Infomax and SACICA did (shown in Table 3); especially, from the first row of Fig.9A (unsmoothed), it was clearly observed that the proposed method discovered more active voxels (depicted in Fig.10A) on the occipital lobe related to the visual stimulus than the other four methods did; the mean TC (calculated based on the time series of voxels shown in 41
Page 43 of 78
Fig.10A from the normalized and unsmoothed task-related data) and the corresponding shifted mean TC (shown in Fig.10B) of these voxels only detected by WASICA were highly correlated to the reference block design related to the visual
ip t
stimulus, which declared that the voxels (depicted in Fig.10A) were located correctly. Lastly, it was also verified that the proposed model could powerfully extract the RSNs
cr
on the resting-state dataset. Some typical RSNs such as VIN, LVN, DMN, AUN, ECN,
us
DPN1, DPN2, LWMN, RWMN, SMN and so on, were successfully extracted (exhibited in Fig.11). In a word, all the theoretical and experimental features and
an
merits mentioned above demonstrated that WASICA was a promising blind functional
M
signal separation model with stronger source recovery ability, signal detection sensitivity, better temporal dynamic model extraction and stronger spatial robustness
ed
against the smoothing kernels for fMRI data analysis.
ce pt
5.2. Selection of decomposition level & PMRWPE Thresh The parameters such as decomposition level and PMRWPE Thresh in WASICA should be noted. The authors recommend that the number of wavelet packet
Ac
decomposition level ( J ) should be set slightly larger, as compared to the simulated single-slice data, for the volume data with multi-slices and huge quantity voxels, to achieve better blind separation performance. For example, in our Simulation 1 and Simulation 2 dataset, the decomposition level was set to J 1 and J 3 , respectively, for the simulated volume data with just one slice; in the other three kinds of experimental tests, the decomposition level was set to J 4 for the normalized volume data with 91 slices. Further, though the setting of the parameter PMRWPE 42
Page 44 of 78
Thresh was empirical, the authors recommend Thresh 0.99 for fMRI data
analysis on the basis of our experimental tests, which usually generated good-enough results. In addition, on the basis of the above given PMRWPE Thresh , the automatic
ip t
nodes selection and integration procedure of WASICA can adaptively choose the different number (namely the q value ) of the shrunk wavelet nodes on the
cr
simulation dataset, hybrid dataset and real dataset, respectively (the automatically
us
determined q values and the total number of nodes at the level J on different
an
experimental tests were recorded in Table S1 of the supplementary material). 5.3. 1D wavelet vs. 3D wavelet
M
Under the WASICA’s framework, the authors recommend that the 1D wavelet
ed
packet analysis should be utilized to form the sparse approximation coefficients set, rather than the 3D wavelet or wavelet packet analysis, although each volume of the
ce pt
fMRI mixtures was 3D data. The main reason was that each data volume contained too many zeros and the noise level i about the mixture xi could not be accurately estimated based on the interquartile range (see formula (13)) or the median estimator
Ac
(proposed by Donoho & Johnstone, 1994). For example, the common normalized volume data was usually with the dimensions 91109 91 , the non-zero voxel number was approximate to 274599, while the zero value voxel number was approximate to 628030. This phenomenon meant the noise level i was always estimated as 0 based on the interquartile range or median measure. Thus, the incorrect noise level estimation could put the wavelet shrinkage sub-procedure and the automatic nodes selection and integration sub-procedure out of action, in the 3D 43
Page 45 of 78
wavelet case. However, as to the 1D wavelet packet transformation based WASICA model, the non-brain area voxels were excluded for the real fMRI data analysis in the implementation procedure, which was beneficial to noise level i estimation.
ip t
Meanwhile, our experimental results also denoted the WASICA using 1D wavelet for C X formation outperformed the WASICA using 3D wavelet for C X formation
cr
under the current WASICA framework (shown in Fig.S2 of the supplementary
us
material). In spite of that, 3D wavelets plus better handling of coefficients related to areas out of the brain would possibly yield better signal compression and spatial
an
characterization (i.e., make C X more super-Gaussian). For example, Kullar et al.
M
(2011b) successfully implemented the denoising on 3D wavelets by "masking" sub-bands, which could be a potentially better way to handle zero-valued coefficients
ed
corresponding to areas out of the brain.
ce pt
5.4. Limitation and Future work
According to the results shown in Table 1, one limitation is that the Case C of bimodal very-close-to-Gaussian sources with large box size was not successfully
Ac
recovered by all versions of WASICA. However, in real fMRI data analysis, spatial functional networks usually depicted stronger super-Gaussian property (McKeown et al., 1998), which implies that this proposed WASICA still could be effective for fMRI data analysis. In spite of that, the bimodal very-close-to-Gaussian sources separation under WASICA is one of our future works. Meanwhile, another future research could be involved in the multi-subjects-based WASICA for group fMRI mixtures separation and distinguishing inference between different control groups. 44
Page 46 of 78
Acknowledgments The authors greatly appreciate the insightful comments of anonymous reviewers, which have helped the authors to improve the manuscript. The authors are highly
ip t
grateful to Dr. Jianbiao Chen and Dr. Elena Allen for help with some discussions
cr
about sparse representation and the usage of SimTB toolbox, respectively. The authors also deeply appreciate the essential help from Dr. Vince D. Calhoun and Mr. Rogers F.
us
Silva for providing the Simulation 1 dataset. Research supported by the National
an
Natural Science Foundation of China (Grant No.31170952), the Innovation Program of Shanghai Municipal Education Commission (Grant No. 11ZZ143), the Research
M
Fund for the Doctoral Program of Higher Education of China (Grant No.
ed
20113121120004), the Program for Professor of Special Appointment (Eastern Scholar) at Shanghai Institutions of Higher Learning, the Project Sponsored by the
ce pt
Scientific Research Foundation for the Returned Overseas Chinese Scholars, State Education Ministry, Shanghai Municipal Natural Science Foundation of China (Grant No. 13ZR1455600) and the Programs for Graduate Special Endowment Fund for
Ac
Innovative developing (Grant No. 2013ycx072) and Excellent Doctoral Dissertation Cultivation (Grant No. 2013bxlp003) of Shanghai Maritime University.
Appendix A. Wavelet shrinkage & automatic nodes selection and integration: the effect investigation In WASICA, the sparse approximation coefficients set formation consisted of two sub-procedures, namely the wavelet shrinkage of wavelet packet tree nodes and 45
Page 47 of 78
the automatic nodes selection and integration based on the RWPE. In this part, the performance of the two sub-procedures was further investigated to declare the merits of WASICA. On Simulation 2, the kurtosis values of each volume data xi , 1 i P ,
coefficients
Node H (i,J,0 : 2J 1)
and
the
selected
and
ip t
the corresponding wavelet coefficients Node(i,J,0 : 2 J 1) , the threshed wavelet integrated
wavelet
cr
H (i, J ,0 : 2 J 1) (namely c xi ) were calculated based on the coefficients Nodesel
us
Matlab in-house function kurtosis, respectively. Please note: the implemented kurtosis function was without subtracting 3, which made the Gaussian distribution with
an
kurtosis value equal to 3, super-Gaussian distribution with value greater than 3 and
M
sub-Gaussian distribution with value less than 3. The kurtosis curves corresponding to H xi , Node(i,J,0 : 2 J 1) , Node H (i,J,0 : 2J 1) and Nodesel (i, J ,0 : 2 J 1) , against
ed
the time points ( 1 i 120 ), were plotted in Fig.A-1 E. The mean value and the SD value as to each kurtosis curve were marked in Fig.A-1 E. It was easily observed that
ce pt
(i) the mixture data X of Simulation 2 in spatial domain had a sub-Gaussian distribution with kurtosis value less than 3; (ii) all the three kinds of the wavelet
Ac
coefficients, satisfying super-Gaussian distribution, had greater kurtosis values than X did;
(iii)
the
kurtosis
values
of
the
original
wavelet
coefficients
( Node(i,J,0 : 2 J 1) ), threshed wavelet coefficients ( Node H (i,J,0 : 2J 1) ) and H (i, J ,0 : 2 J 1) ) demonstrated an selected and integrated wavelet coefficients ( Nodesel
increasing trend, in that order. Further, Fig.A-1 F (with zoom-in mode) also depicted that the kurtosis values of Node H (i,J,0 : 2J 1) was slightly greater than the one of Node(i,J,0 : 2 J 1) , where 1 i 120 . Meanwhile, the histograms with regard to 46
Page 48 of 78
18th volume data and its three corresponding kinds of wavelet coefficients H (18, J ,0 : 2J 1) ) were ( Node(18,J,0 : 2J 1) , Node H (18,J,0 : 2J 1) and Nodesel
presented in order in Fig. A.1 (A-D), with labeling the respective kurtosis values,
ip t
which implied that the wavelet coefficients embedding with more zero were really sparser (or stronger super-Gaussian distribution) than the spatial domain, specifically
cr
H (18, J ,0 : 2J 1) . for Nodesel
us
To sum up, the above mentioned two sub-procedures were really beneficial to form the sparse approximation coefficients set, which we anticipated would help
an
identify transformed underlying sources C s with stronger super-Gaussian distribution
M
and, thus, give more accurate estimation of temporal dynamics according to the research (Calhoun et al., 2013) and sparse approximation theory (described in section
ed
2.1). As was expected, the sparse approximation coefficients set C X regarding to the original X indeed contributed to the temporal dynamic A estimation, as was fully
ce pt
validated in Section 4.3.2 Results for Simulation 2.
Appendix B. Brain networks reconstruction: Mixture
X Vs.
Ac
Approximation set X
The brain networks reconstruction in the last procedure of WASICA (i.e.,
ICA-based decomposition and reconstruction) is based on the unmixing operation of the approximation set X , which is generated by the inverse wavelet packet transformation as to the sparse approximation coefficients set C X (see Eq. (21)). The denoised feature of X in contrast to X could be beneficial to the quality of reconstructed functional networks. In order to validate this assumption, the 47
Page 49 of 78
normalized and unsmoothed resting-state BOLD fMRI data (the same subject used in Section 4.6) was analyzed by WASICA with only different reconstruction procedures, i.e., reconstruction strategies from mixture X or its approximation set X . Also, the
ip t
reliability consideration of WASICA was involved by the most stable run procedure (see Section 4.1.2). Based on two groups of results (one for X , the other for X )
cr
generated by WASICA, seven relatively obvious functional networks were selected
us
and presented by slice-by-slice (shown in Fig. B.1 A-G). Meanwhile, the obviously different regions of the equivalent SMs (spatial map) from two comparable groups
an
were marked by green box for X and blue box for X , respectively. By contrast, it
M
could be observed that the active regions of SMs reconstructed from X with relatively less noise variation visually were more clustered and obvious than the ones
ed
from X . This investigation supports the above assumption, which the approximation set X with denoised feature contributes to the quality of the separated brain
ce pt
networks in contrast to X .
References
Ac
Allen EA, Erhardt EB, Wei Y, Eichele T, Calhoun VD. Capturing inter-subject variability with group independent component analysis of fMRI data: a simulation study. Neuroimage 2012; 59(4):4141–4159. Anderson M, Li X, Rodriguez P, Adali T. An effective decoupling method for matrix optimization and its application to the ICA problem. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2012; pp:1885–1888. Ambrosino S, Bos DJ, van Raalten TR, Kobussen NA, van Belle J, Oranje B, Durston 48
Page 50 of 78
S. Functional connectivity during cognitive control in children with autism spectrum disorder: an independent component analysis. J Neural Transm 2014; pp:1–11. Bagarinao E, Matsuo K, Nakai T, Sato S. Estimation of general linear model
ip t
coefficients for real-time application. Neuroimage 2003; 19(2):422–429. Beckmann CF, Smith SM. Probabilistic independent component analysis for
cr
functional magnetic resonance imaging. IEEE Trans Med Imaging 2004;
us
23(2):137–152.
Beckmann CF, Smith SM. Tensorial extensions of independent component analysis
an
for multisubject FMRI analysis. Neuroimage 2005; 25(1):294–311.
M
Beckmann CF, DeLuca M, Devlin JT, Smith SM. Investigations into resting-state connectivity using independent component analysis. Philos Trans R Soc B 2005;
ed
360(1457):1001–1013.
Bell AJ, Sejnowski TJ. An information-maximization approach to blind separation
ce pt
and blind deconvolution. Neural Comput 1995; 7(6):1129–1159. Biswal BB, Ulmer JL. Blind source separation of multiple signal sources of fMRI
Ac
data sets using independent component analysis. J Comput Assist Tomo 1999; 23(2):265–271.
Bronstein AM, Bronstein MM, Zibulevsky M, Zeevi YY. Sparse ICA for blind separation of transmitted and reflected images. Int J Imag Syst Tech 2005; 15(1):84–91. Calhoun VD, Adali T, Pearlson GD, Pekar JJ. Spatial and temporal independent component analysis of functional MRI data containing a pair of task-related 49
Page 51 of 78
waveforms. Hum Brain Mapp 2001a; 13(1):43–53. Calhoun VD, Adali T, Pearlson GD, Pekar JJ. A method for making group inferences from functional MRI data using independent component analysis. Hum Brain Mapp
ip t
2001b; 14(3):140–151. Calhoun VD, Kiehl KA, Pearlson GD. Modulation of temporally coherent brain
cr
networks estimated using ICA at rest and during cognitive tasks. Hum Brain Mapp
us
2008; 29(7):828–838.
Calhoun VD, Potluru VK, Phlypo R, Silva RF, Pearlmutter BA, Caprihan A, Plis SM,
an
Adalı T. Independent component analysis for brain fMRI does indeed select for
M
maximal independence. PloS one 2013; 8(8):e73309.
Calhoun VD, Allen E. Extracting intrinsic functional networks with feature-based
ed
group independent component analysis. Psychometrika 2013; 78(2):243–259. Comon P, Jutten C. Handbook of Blind Source Separation, Independent Component
ce pt
Analysis and Applications. New York: Academic, Elsevier, 2010. Damoiseaux J, Rombouts S, Barkhof R, Scherltens P, Stam CJ, Smith SM, Beckmann
Ac
CF. Consistent resting state networks across healthy subjects. Proc Natl Acad Sci U S A 2006; 103(37):13848–13853. Donoho DL, Johnstone IM. Ideal spatial adaption via wavelet shrinkage. Biometrika 1994; 81(3):425–455. Donoho DL, Johnstone IM. Adapting to unknown smoothness via wavelet shrinkage. J Am Stat Assoc 1995; 90(432):1200–1224. Donoho DL. De-noising via soft-thresholding. IEEE Trans Inform Theory 1995; 50
Page 52 of 78
41(3):613–627. Erhardt EB, Allen EA, Wei Y, Eichele T, Calhoun VD. SimTB, a simulation toolbox for fMRI data under a model of spatiotemporal separability. NeuroImage 2012;
ip t
59(4):4160–4167. Friston KJ, Frith CD, Turner R, Frackowiak RS. Characterizing evoked
cr
hemodynamics with fMRI. NeuroImage 1995; 2:157–165.
us
Gao HY, Bruce AG. WaveShrink with firm shrinkage. Stat Sinica 1997; 7(4):855–874. Gao HY. Wavelet shrinkage denoising using the non-negative garrote. J Comput
an
Graph Stat 1998; 7(4):469–488.
M
Gribonval R, Nielsen M. Beyond sparsity: Recovering structured representations by L1 minimization and greedy algorithms. Adv Comput Math 2008; 28(1):23–41.
ed
Himberg J, Hyvärinen A, Esposito F. Validating the independent components of neuroimaging time series via clustering and visualization. Neuroimage 2004; 22(3):
ce pt
1214–1222.Hyvärinen A, Oja E. A Fast Fixed-Point Algorithm for Independent Component Analysis. Neural Comput 1997; 9(7):1483–1492. Khullar S, Michael A,
Ac
Correa N, Adali T, Baum S, Calhoun VD. Wavelet-based fMRI analysis: 3-D denoising,
signal
separation,
and
validation
metrics.
NeuroImage
2011a;
54(4):2867–2884.
Khullar S, Michael A, Correa N, Adali T, Baum S, Calhoun VD. Wavelet-based denoising and independent component analysis for improving multi-group inference in fMRI data. IEEE International Symposium on Biomedical Imaging: From Nano to Macro, 2011b; pp:456–459. 51
Page 53 of 78
Kisilev P, Zibulevsky M, Zeevi YY, Pearlmutter BA. Multiresolution framework for blind source separation, CCIT Report no. 317, Technion Press, 2000. Kisilev P, Zibulevsky M, Zeevi YY. Multiscale Framework For Blind Separation of
ip t
Linearly Mixed Signals. J Mach Learn Res 2003; 4:1339–1363. Ma S, Correa NM, Li XL, Eichele T, Calhoun VD, Adali T. Automatic identification
cr
of functional clusters in FMRI data using spatial dependence. IEEE Trans BIO-MED
us
ENG 2011; 58(12): 3406–3417.
Makeig S, Bell AJ, Jung TP, Sejnowski TJ. Independent component analysis of
an
electroencephalographic data. Advances in neural information processing systems
M
1996; pp:145–151.
Maldjian JA, Laurienti PJ, Burdette JB, Kraft RA. An Automated Method for
ed
Neuroanatomic and Cytoarchitectonic Atlas-based Interrogation of fMRI Data Sets. NeuroImage 2003; 19(3):1233–1239.
ce pt
Maldjian JA, Laurienti PJ, Burdette JH. Precentral Gyrus Discrepancy in Electronic Versions of the Talairach Atlas. Neuroimage 2004; 21(1):450–455.
Ac
Mallat S. A Wavelet Tour of Signal Processing, Academic Press, San Diego, CA, 1998.
McKeown MJ, Makeig S, Brown GG, Jung TP, Kindermann SS, Bell AJ, Sejnowski TJ. Analysis of fMRI data by blind separation into spatial independent component analysis. Hum Brain Mapp 1998; 6:160–188. Minka TP. Automatic choice of dimensionality for PCA. Technical Report 514, MIT, 2000. 52
Page 54 of 78
Moreau E, Macchi O. A one stage self-adaptive algorithm for source separation. IEEE International Conference on Acoustics, Speech, and Signal Processing, 1994; pp: III-49–III-52.
ip t
Moulin P. Wavelet thresholding techniques for power spectrum estimation. IEEE Trans Signal Proces 1994; 42(11):3126–3136.
cr
Rytty R, Nikkinen J, Paavola L, Elseoud AA, Moilanen V, Visuri A, Tervonen O,
us
Renton AE, Traynor BJ, Kiviniemi V, Remes AM. GroupICA dual regression analysis of resting state networks in a behavioral variant of frontotemporal dementia. Front
an
Hum Neurosci 2013; 7:461.
M
Rosso OA, Blanco S, Yordanova J, Kolev V, Figliola A, Schürmann M, Başar E. Wavelet entropy: a new tool for analysis of short duration brain electrical signals. J
ed
Neurosci Methods 2001; 105(1):65–75.
Rosso OA, Martin MT, Figliola A, Keller K, Plastino A. EEG analysis using
ce pt
wavelet-based information tools. J Neurosci Methods 2006; 153(2):163–182. Schöpf V, Kasess CH, Lanzenberger R, Fischmeister F, Windischberger C, Moser E.
Ac
Fully exploratory network ICA (FENICA) on resting-state fMRI data. J Neurosci Methods 2010; 192(2):207–213. Skudlarski P, Constable RT, Gore JC. ROC analysis of statistical methods used in functional MRI: individual subjects. NeuroImage 1999; 9(3):311–329. Tropp JA, Gilbert AC, Strauss MJ. Algorithms for simultaneous sparse approximation. Part I: Greedy pursuit. Signal Process 2006; 86(3):572–588. Tropp JA. Algorithms for simultaneous sparse approximation. Part II: Convex 53
Page 55 of 78
relaxation. Signal Process 2006; 86(3):589–602. Wang N, Zeng W, Chen L. A Fast-FENICA method on resting state fMRI data. J Neurosci Methods 2012a; 209(1):1–12.
ip t
Wang N, Zeng W. A combination model of ICA and sparsity prior with respect to fMRI signal analysis. Front Comput Neurosci Conference Abstract: Bernstein
cr
Conference 2012b. doi: 10.3389/conf.fncom.2012.55.00014.
us
Wang N, Zeng W, Chen L. SACICA: A sparse approximation coefficients based ICA model for functional magnetic resonance imaging data analysis. J Neurosci Methods
an
2013; 216(1):49–61.
M
Zeng W, Luo L, Liang X, Wang S, Guo S. The pre-processing of de-noising in fMRI data based on wavelet analysis. Journal of applied sciences 2004; 22(1):118–123. (In
Ac
ce pt
ed
Chinese)
54
Page 56 of 78
Table Legends Table 1. Source estimates for four cases on Simulation 1. Table 2. The normalized ISI ( mean SD ) values as to FastICA, Infomax, SACICA and
ip t
WASICA on Simulation 2 (12 runs for each method).
cr
Table 3. Intra-consistency metric values as to GLM, FastICA, Infomax, SACICA and
Ac
ce pt
ed
M
an
us
WASICA on visual task dataset.
55
Page 57 of 78
Table 1. Source estimates for four cases on Simulation 1 Results Algorithm
FastICA SACICA Source s1 (excess) Kurtosis 0.8840 0.1459 Infomaxsuper Infomaxsub Source s2 (excess) Kurtosis 0.8118 0.1455 WASICAsuper WASICAsub WASICAfast Summary: FastICA, Infomaxsuper and all versions of WASICA perform well.
an
cr
Unimodal, super-Gaussian sources
ISI( mean SD ) 0.1359 0.1274 0.7595 0.1969 0.7316 0.2054 0.4151 0.2701 0.0593 ± 0.0120 0.0627 ± 0.0521 0.2164 0.2530
Algorithm
ISI( mean SD )
B. Medium:
Algorithm
FastICA SACICA Source s1 (excess) Kurtosis 0.2572 0.1095 Infomaxsuper Infomaxsub Source s2 (excess) Kurtosis 0.0885 0.1131 WASICAsuper WASICAsub WASICAfast Summary: WASICAsuper,WASICAsub performs well. All others mostly fail.
M
us
Bimodal, close to Gaussian sources
C. Large:
good is ISI