Artificial Intelligence in Medicine 61 (2014) 53–61

Contents lists available at ScienceDirect

Artificial Intelligence in Medicine journal homepage: www.elsevier.com/locate/aiim

Unsupervised tissue segmentation from dynamic contrast-enhanced magnetic resonance imaging Gabriele Chiusano a,∗ , Alessandra Staglianò a , Curzio Basso b , Alessandro Verri a a b

Dipartimento di Informatica, Bioingegneria, Robotica e Ingegneria dei Sistemi, Università degli Studi di Genova, Via Dodecaneso 35, 16146 Genova, Italy CAMELOT Srl, Via Greto di Cornigliano 6R, 16152 Genova, Italy

a r t i c l e

i n f o

Article history: Received 5 February 2013 Received in revised form 6 February 2014 Accepted 20 February 2014 Keywords: Dictionary learning Sparse adaptive representation Dynamic contrast enhanced magnetic resonance imaging Unsupervised tissue segmentation

a b s t r a c t Objective: Design, implement, and validate an unsupervised method for tissue segmentation from dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI). Methods: For each DCE-MRI acquisition, after a spatial registration phase, the time-varying intensity of each voxel is represented as a sparse linear combination of adaptive basis signals. Both the basis signals and the sparse coefficients are learned by minimizing a functional consisting of a data fidelity term and a sparsity inducing penalty. Tissue segmentation is then obtained by applying a standard clustering algorithm to the computed representation. Results: Quantitative estimates on two real data sets are presented. In the first case, the overlap with expert annotation measured with the DICE metric is nearly 90% and thus 5% more accurate than state-of-the-art techniques. In the second case, assessment of the correlation between quantitative scores, obtained by the proposed method against imagery manually annotated by two experts, achieved a Pearson coefficient of 0.83 and 0.87, and a Spearman coefficient of 0.83 and 0.71, respectively. Conclusions: The sparse representation of DCE MRI signals obtained by means of adaptive dictionary learning techniques appears to be well-suited for unsupervised tissue segmentation and applicable to different clinical contexts with little effort. © 2014 Elsevier B.V. All rights reserved.

1. Introduction Dynamic contrast-enhanced magnetic resonance imaging (DCEMRI) is a powerful imaging modality which, by allowing for the investigation of biological processes along the temporal axis, provides information about perfusion, capillary permeability, and tissue vascularity [1]. Typically, DCE-MRI is obtained by acquiring a series of T1-weighted sequences before, during, and after the injection of a contrast agent, such as gadoterate meglumine (Gd-DOTA). The contrast agent uptake is higher where the vascularization is stronger, resulting in signal enhancement and brighter image intensities after the injection. Clinically relevant, quantitative information can be extracted by a voxel-wise analysis of the time-varying intensity signal, also known as enhancement curve, which shows a stereotyped behavior across voxels of the same tissue.

∗ Corresponding author. Tel.: +39 0103536617; fax: +39 0103536699. E-mail addresses: [email protected], [email protected] (G. Chiusano), [email protected] (A. Staglianò), [email protected] (C. Basso), [email protected] (A. Verri). http://dx.doi.org/10.1016/j.artmed.2014.02.001 0933-3657/© 2014 Elsevier B.V. All rights reserved.

From the computational viewpoint the analysis of DCE-MRI poses several problems arising from the large amount of noise affecting the signal, patient movements during acquisition, and the need of discriminating between different tissues. Recently, several computational methods based on DCE-MRI have been proposed for quantitative assessment of several diseases like prostate cancer [2], breast cancer [3,4], cardiac and cerebral ischemia [5], renal dysfunction [6], and rheumatoid arthritis [7,8]. This paper presents a data driven method for unsupervised tissue segmentation from DCE-MRI acquisition. Aside from the manual selection of a region of interest (ROI) the method is automatic. Given a DCE-MRI acquisition, after a preliminary stage in which signal distortions due to patient movement are attenuated by means of a motion compensation technique, a sparse representation is obtained from a dictionary of basis signals learned from the data. Since the basis signals resemble the prototypical behavior of the enhancement curves corresponding to different tissues, tissue segmentation is effectively achieved by applying standard clustering techniques on the obtained representation. By computing a different dictionary of basis signals for each dataset, our method exploits in full the adaptivity of dictionary learning.

54

G. Chiusano et al. / Artificial Intelligence in Medicine 61 (2014) 53–61

Fig. 1. An example of motion compensation. (a) Image at time t1 . (b) Image at time t2 , in which a wide movement of right kidney in the ROI is highlighted. (c) The computed displacement field in the ROI, displayed at a larger scale. (d) The image at time t2 after motion compensation.

The rest of this paper is organized as follows: in Section 2 the literature on DCE-MRI analysis is overviewed. Section 3 describes the proposed technique, while Section 4 presents the experimental results obtained on synthetic and real data. Finally, we draw our conclusions in Section 5.

2. Related work In the literature, DCE-MRI analysis is tackled by means of two different parametric approaches: the first approach relies on a pharmacokinetic model of the contrast agent dynamics tuned to the specific process (or disease) under study [2,4,5]. Consequently, the estimated parameters have a direct physiological interpretation. The second approach parametrizes the shape of the enhancement curves with no direct link to the problem physiology [3,7–9]. Segmentation is achieved after fitting a geometric model on the acquired enhancement curves. In other methods, feature vectors extracted from the raw data [10], or the raw data directly [6], are used as a basis for discriminating among different tissues, by means of supervised and unsupervised classification methods respectively. In all these cases, the proposed algorithms are fine-tuned to the specific medical context and often require a fair amount of work in the construction of the feature models. Over the last decades, the signal processing community has shown a growing interest on adaptive sparse coding, starting from the seminal work of Olshausen and Fields [11]. Instead of using over-complete, fixed dictionaries, like Wavelets [12], an adaptive dictionary and the corresponding sparse-codes are learnt from data within an optimization framework (see [13–19] for example). Very good results have been reported in denoising [13], compression [13], scene categorization, object recognition (see [20,15] for more examples) and image super-resolution. Other works cast the dictionary learning problem as a factor-analysis problem, with the factor loading corresponding to the number of the dictionary elements used: in this case the number of atoms is automatically obtained. Other works use non-parametric Bayesian methods [21,22] and the Indian buffet process [23,24]. The use of temporal-curves instead of image patches has been proposed by [25] on electromyographic data and, independently, by us in a preliminary conference version of this paper, [26], on DCEMRI data. In [25] the proposed method computes dictionary and sparse codes of 1-D dimensional signals acquired over time, in order to learn interpretable spatio-temporal primitives from motion capture data and to differentiate between spatio-temporal primitives by using the obtained atoms. The problem is formulated as a tensor factorization problem with tensor group norm constraints over the atoms as well as smoothness constraints.

3. The proposed method In this section we describe the three stages of the proposed method: motion compensation, learning the representation, and tissue segmentation.

3.1. Motion compensation During the past three decades several registration techniques have been developed and widely applied to medical imaging. Motion correction of DCE-MRI time series is a particular case of image registration, in which tissue motion and deformation are the result of breathing and sudden, sussultatory movements combined with perfusion of the contrast agent: strong deformation and large displacements are especially prominent in dynamic cardiac imaging [27,28], breast imaging [29], and abdominal imaging [30]. In all these cases, a registration procedure is mandatory and may pose difficult computational problems. In this work, even if the acquisition process lasts from 6 to 20 min, deformation and displacement are usually quite small. Spatial registration is necessary to compensate for the slight motion artifacts produced by the presence of soft tissues in the kidney, whilst is almost never needed in the wrist. Consequently, in this work, we adopt a simple motion compensation method based on standard optical flow computation: in particular, we employ a 2-dimensional optical flow method available in the Insight Toolkit Library (ITK). Given the simplicity of the given registration tasks, the adopted procedure produces results adequate for our purpose. As shown in Fig. 1, motion compensation is only performed within a ROI outlined by hand.

3.2. Learning the representation The idea behind sparse and adaptive dictionary learning is to represent a certain family of signals, in our case enhancement curves, as linear combination of a few elements selected from a dictionary of basic signals, called atoms. As already mentioned, both the atoms and the coefficients of the linear combinations are learned from the input data. Let us now briefly review dictionary learning in the special case of time-varying signals like DCE-MRI data. We denote an enhancement curve sampled at times t = 1, . . ., p by means of a  p-dimensional vector x = (x1 , . . ., xp ) . Without loss of generality t we also assume that, for all t, x measures the difference between the samples at time t and 1. This is equivalent to set x1 = 0 for every x.

G. Chiusano et al. / Artificial Intelligence in Medicine 61 (2014) 53–61

55

Given n signals, {x1 , . . ., xn } ∈ Rp , we aim at finding a dictionary of K atoms {d1 , . . ., dK } ∈ Rp such that xim , the mth component of the vector xi , is given approximately by xim ≈

K 

j

ui djm

(1)

j=1

with only a few non-null uij for each xi and where the superscripts run through rows and subscripts through columns. Thus  ui = (u1i , . . ., uKi ) is the coefficient vector for the curve xi corresponding to the ith row of the K × n code matrix U. In a more convenient matrix notation, Eq. (1) can thus be rewritten as X ≈ DU with X denoting the p × n data matrix and D the p × K dictionary matrix. The idea is to learn both D and U from the available examples by solving the optimization problem minX − DU2F + U1 D,U

(2)

for  > 0 and subject to the normalization constraints dj  2 ≤ 1 with j = 1, . . ., K for each atom dj . In Eq. (2)  ·  F and  ·  1 denote the Frobenius and the 1 -norm respectively. This problem has been proposed in [19] and in the literature it is referred to as 1 − dictionary learning. The first term in the summation is a data-fidelity term that ensures a small average reconstruction error. The second term is a penalty term inducing sparsity on the coefficients. The parameter  determines the tradeoff between the two terms, whilst the number of atoms K is fixed a priori. The joint minimization in Eq. (2) is non-convex and nondifferentiable, but both the subproblems obtained by keeping U, or D, fixed and minimizing with respect to D, or U, are convex. Several block coordinate descent schemes have thus been proposed – see [15,19], for example. These schemes proceed by alternate minimization according to the following steps: (i) (ii) (iii) (iv)

(b)

initialize D and U; solve problem in Eq. (2) for fixed D; solve problem in Eq. (2) for fixed U; go to step (ii) until convergence.

In this work we use PADDLE1 , an optimization algorithm proposed in [31] and based on first-order proximal methods, for both steps (ii) and (iii). The overall scheme stops either upon reaching the maximum number of iterations, 10 in our case, or if the functional value changes by less than a fixed tolerance, here equal to 10−5 . The columns of the data matrix X correspond to the relevant voxels within the ROI. A voxel x is considered relevant depending on the value of the standard deviation  computed on the samples x1 , x2 , . . ., xp . Since the uptake of the contrast agent is responsible for high standard deviation, voxels are considered relevant, or irrelevant, if the standard deviation is higher, or lower, than a fixed threshold T. We compute the threshold T by means of the Otsu method [32] under the assumption that, in the ROI, the standard deviation follows a bimodal distribution with the first peak very close to 0. Empirical evidence corroborates the validity of this assumption in all of our experiments. A further advantage of this preliminary pruning stage is a considerable speed up of the entire procedure. The dictionary matrix D is initialized by randomly picking K columns from the data matrix X in the first loop of alternate optimization between D and U. In our experiments, this choice led to a

1 The open-source Python implementation is freely available http://slipguru.disi.unige.it/research/PADDLE, Retrieved October 12, 2011.

(a)

at

(c) Fig. 2. Synthetically generated enhancement curves. (a) Templates of the enhancement curves corresponding to the five different types of simulated tissue. (b) The five atoms most used in the reconstruction of synthetic data corrupted by noise. (c) The five principal components computed through principal component analysis. The vertical axes have been rescaled manually.

56

G. Chiusano et al. / Artificial Intelligence in Medicine 61 (2014) 53–61

100 90 80 70 60 50

0

20

40

60

80

100

Fig. 3. Percentage of segmentation accuracy of our method on synthetic data as a function of the percentage of non-null entries in the code matrix U. The accuracy, computed as in Eq. (5), has been averaged over ten runs obtained by adding zeromean Gaussian noise with  = 20 and for K = 70. The grey band shows the standard deviation.

good trade-off between effective convergence of the algorithm and adaptiveness to the data of the resulting dictionary. No significant dependency of the obtained solution on the initialization step has been noted. As a measure of the effectiveness of the obtained reconstruction, we compute the Peak Signal to Noise Ratio (PSNR) defined as



PSNR = 20 · log10

maxi xi 2 X − DU2F /n



.

(3)

where the xi  2 is the Euclidean norm of the vector xi with i = 1, . . ., n. 3.3. Tissue segmentation In each given medical context, the physician is required to identify the number of different tissues of interest, each corresponding to a different type of enhancement curve. Relying on the faithful reconstruction obtained in the previous stage, segmentation of the various tissues can be effectively computed through standard clustering techniques. In this work we use k-means with the number of classes k set a priori by an expert. The k centroids are computed by minimizing the k-means distortion measure min

rij ,j

k n   i=1 j=1

Fig. 4. Denoising and signal reconstruction on synthetic data. The dashed curve in light grey is the noise corrupted version of the template curve depicted in black. The dashed-dotted and dashed grey curves are the reconstructions obtained by our method and by the method described in [3] respectively.

in which the rij ∈ {0, 1} with j = 1, . . ., k indicate which cluster j the vector ui belongs to, while j denotes the jth cluster centroid. For the initialization of the k centroids we apply spectral clustering [33]. We assess segmentation accuracy by means of two different metrics. In the case of synthetic data the measure accuracy as the percentage of the correctly classified voxels or Acc.(%) = 100 ×

 TP + TN  N

(5)

where TP is the number of true positives, TN the number of true negatives, and N the total number of voxels. In the case of real data we compare two different binary segmentations X and Y through the dice metric [34] DICE =

2|X



Y|

|X| + |Y |

(6)

where | · | denotes the number of voxels in the set. 4. Experiments

rij ui − j 2

(4)

Before discussing the results obtained on real data, we describe some preliminary validation we performed on synthetic data.

Fig. 5. Qualitative assessment of kidney segmentation. The black and white curves denote the manual segmentation and the segmentation obtained by our method on the left (a), and right (b), kidney respectively. Exact superposition is shown in grey.

G. Chiusano et al. / Artificial Intelligence in Medicine 61 (2014) 53–61

4.1. Synthetic data We generate 2-D images of 300 × 300 voxels simulating a phantom consisting of five tissues each characterized by different uptaking of the contrast agent. As detailed in [35] for the case of rheumatological pathology, we consider two types of synovial volume (inflamed and healthy), vessels, and two different enhancements, type 3 and type 6. For each of these five tissues we define an enhancement curve template as in Fig. 2(a). We create four sequences of 200 frames corrupted by a zero mean Gaussian noise with standard deviation  equal to 5, 10, 15, and 20 respectively.

57

The choice of the values of K and  was driven by the quality of the segmentation performance. Consistently with what reported in [31], Fig. 3 and Table 1 shows that K = 70 and a value for  corresponding to a sparsity rate around 30% lead to the best segmentation accuracy. As we will see in the case of real data, a value for K about an order of magnitude greater than the number of tissues to be distinguished effectively captures the intraclass variability of the enhancement curves relative to the same tissue and thereby leading to almost perfect segmentation. From Fig. 3 it can also be seen that a quite wide range of sparsity rates are sufficient to achieve accurate segmentation.

Fig. 6. Qualitative assessment of parenchyma segmentation. (a) Manual segmentation. (b) Segmentation obtained by our method.

0. 2

0. 1

0

0

10

20 Atoms

30

40

50

(a)

( b)

0. 3

0. 2

0. 1

0 0

10

20

Atoms

(c)

30

40

50

( d)

Fig. 7. Renal dysfunction dataset dictionary. Histogram of atom usage and plot of the two atoms most used in the reconstruction of the enhancement curves of the clusters corresponding to the parenchymal (a) and (b), and the conductive duct tissue (c) and (d), respectively.

G. Chiusano et al. / Artificial Intelligence in Medicine 61 (2014) 53–61

K=7

K = 14

K = 28

K = 70

80% 63% 63% 61%

83% 80% 68% 62%

100% 82% 79% 75%

100% 100% 99% 97%

As shown in similar settings (see [13,36], for example) denoising is the first effect of sparse coding. In order to asses the denoising performance of our method, we compute the PSNR over the synthetic data as in Eq. (3). The obtained results are both qualitatively and quantitatively very similar for different amount of noise. Therefore, in what follows, we only discuss the case of  = 10. An example of the denoising effect in the obtained reconstruction can be seen in Fig. 4. The PSNR score of the noisy enhancement curves is 21.82 dB, while the PSNR for the denoised version 31.46 dB. As a comparison, the score of the parametric model described in [3] is 27.28 dB. From Fig. 4 it can also be appreciated that the reconstruction obtained by means of our method does not smooth out the relevant features of the underlying signal. The atoms most frequently used in reconstruction are shown in Fig. 2(b). It is interesting to observe that these atoms, up to a scale factor, resemble the different types of simulated enhancement curves, see Fig. 2(a). In contrast, Fig. 2(c) shows the first 5 principal components of the matrix XX that cannot be readily interpreted in terms of enhancement curves. Let us now assess the potential of the proposed method on real data. 4.2. Renal dysfunction dataset DCE-MRI plays an important clinical role in the assessment of kidneys damage caused by renal dysfunction. In order to obtain reliable estimates of functional parameters an accurate segmentation of the parenchymal tissue within each kidney needs to be computed. A DCE-MRI dataset of 26 pediatric patients with renal dysfunction was acquired using a 1.5T MR system at the Istituto Giannina Gaslini (IGG). The images, in the kidney anatomical region, were coronal 3D fast field echo (FFE) each 256 × 256 × 9 voxels (7 mm of slice thickness and 1.33 mm of pixel spacing) and with a time resolution between 108 and 132 time-points, depending on the acquisition. The temporal sampling rate was 5 s. Fig. 5 shows an example of segmentation of both kidneys as a whole obtained by means of our method. Using as reference the manual segmentation performed by a trained expert on the entire dataset, we compare the segmentation obtained by our method against the segmentation described in [6]. Table 3 reports the results in terms of the dice metrics of Eq. (6). It can be seen that the average performance of our method is approximately 5% better with a considerably smaller standard deviation. The segmentation of the parenchyma can be obtained by refining the segmentation step. Instead of two clusters – i.e., kidney vs. non-kidney as in Fig. 5 – in this set of experiments we consider eight clusters. An example of the obtained results for the parenchyma is shown in Fig. 6. The figure clearly shows the very good agreement between the segmentation obtained by the proposed method with the manual segmentation performed by a trained expert. Fig. 7(a) and (b) shows the histograms of atom usage in the reconstruction of two of the eight tissues: the parenchyma and the collective duct. Fig. 7(c) and (d), instead, shows the atoms most used in the reconstruction of both tissues. According to the physicians

30 Di en re nce between SRF

 =5  = 10  = 15  = 20

40

20 10 0 10 20 30

0

20

40 60 Mean SRF

80

100

40 60 Mean SRF

80

100

80

100

(a )

30 20 Di en re nce between SRF

Table 1 Segmentation accuracy of the proposed method on synthetic data for different noise and dictionary size. With the sparsity rate kept fixed at 30%, each entry shows the percentage accuracy – computed as in Eq. (5) – for additive zero-mean Gaussian noise with  = 5, 10, 15 and 20 and for the number of atoms K = 7, 14, 28 and 70.

10 0 10 20 30 40

0

20

(b) 30 20 Di en re nce between SRF

58

10 0 10 20 30 40

0

20

40 60 Mean SRF

(c) Fig. 8. Bland–Altman plots of the SRF between our method and the first expert (a), our method and the second expert (b), and the first and second expert (c). In each plot the continuous line is the average of all the differences, whilst the two dashed lines delimit the confidence interval, computed as the average plus or minus ±1.96 × , with  the corresponding standard deviation.

G. Chiusano et al. / Artificial Intelligence in Medicine 61 (2014) 53–61

59

Fig. 9. Qualitative comparison of the segmentation obtained by means of our method (a), and by the method in [3] (b). Vessels and synovial volume are in white and black respectively.

these atoms resemble the behavior of the enhancement curves of the corresponding tissues. In agreement with the experiments performed on synthetic data, it can be seen that an effective representation is obtained by employing a small number of similar atoms able to capture the intraclass variability among the enhancement curves of the same tissue. An indirect measure of the parenchyma segmentation accuracy can be obtained by measuring the split renal function (SRF) [37]. Following the protocol in [37,38] the SRF is computed in terms of the average of all the enhancement curves from the voxels within the parenchyma. Let left-avg-curve and right-avg-curve be the average curves computed for the left and the right parenchyma respectively. If we denote with al and ar the area beneath left-avgcurve and right-avg-curve respectively computed in a time interval of one minute starting from the wash out in vessels for, the SRF is defined as 100 × al /(al +ar ). In our work we compare the SRF obtained from the segmentation results of our method against the SRF computed by two experts employing the MRU plug-in for ImageJ,2 , a semi-automatic tool specifically designed for evaluating functional parameters of renal dysfunction. Fig. 8 shows the Bland–Altman plots of the computed SRF values. Fig. 8(a) and (b) depicts the comparison between our method and the estimates obtained by the two experts using MRU, whilst Fig. 8(c) the comparison between the two experts estimates. Notice that apart from one case, see Fig. 8(a), all the estimates obtained by our method vs. the experts fall within the corresponding confidence intervals. Interestingly, the same happens when comparing the two experts estimates. As a further measure of consistency, we report both the Pearson and Spearman coefficients in Table 4. According to the opinion of medical experts, these results indicate that our method leads to results which are both very accurate and reliable. 4.3. Juvenile idiopathic arthritis dataset In order to verify the versatility of the proposed method we also considered a dataset of pediatric patients affected by juvenile idiopathic arthritis (JIA). In the current clinical practice JIA diagnosis is based on conventional radiology, but DCE-MRI appears to be very useful for reaching an accurate and reproducible quantitative estimate of the inflamed synovial compartment [39,40]. From the computational viewpoint the problem is the separation of synovial volumes from vessels since the signal intensity of the enhancement curves of both tissues is sustained. As shown in the simulation of Fig. 2(a), the main

2 http://www.univ-rouen.fr/med/MRurography/accueil.htm November 30, 2011.

Retrieved

Table 2 Comparison of the signal recovery performance and computational time between our method (top row) and two other approaches (bottom rows), on the wrist dataset. From the leftmost to the rightmost column: PSNR average and standard deviation, avgPSNR and  PSNR , and computational time, avgtime and  time , over the entire dataset. Method

avgPSNR (dB)

 PSNR (dB)

avgtime (s)

 time (s)

Our method Agner [3] Guo [9]

37 34 19

4.1 4.2 6.1

14 12 305

2.4 2.4 76

Table 3 Segmentation accuracy for kidneys as a whole. Comparison of the dice metric computed as in Eq. (6) for our method (top row) and for the technique based on raw data and described in [6]. From left to right the columns report the dice metric plus or minus its standard deviation for the left (L), the right (R), and both kidneys (L+R).

Our method Zöllner [6]

L

R

L +R

0.88 ± 0.06 0.83 ± 0.09

0.78 ± 0.09 0.75 ± 0.11

0.83 ± 0.09 0.78 ± 0.11

Table 4 Quantitative assessment of the correlation between SRF scores. Top row: Pearson coefficient for our method (OM) vs. the first expert, 1st, our method vs. the second expert, 2nd, and first vs. second expert. Bottom row: same for the Spearman coefficient.

Pearson coeff. Spearman coeff.

OM vs. 1st

OM vs. 2nd

1st vs. 2nd

0.83 0.83

0.87 0.71

0.86 0.78

difference appears to be in the transient phase in which the response in vessels peaks before leveling off. A dataset of 12 pediatric patients with active JIA was acquired using the same MR system of above. The images, in the wrist anatomical region, were coronal 2D FFE each 160 × 160 voxels (10 mm of slice thickness and 1.06 mm of pixel spacing) and with a time resolution between 60 and 120 time-points, depending on the acquisition. The temporal sampling rate was 3 s. Table 2 shows the obtained results in terms of the PSNR computed as in Eq. 3. From the two leftmost columns of Table 2 it can be seen that our method achieves better average PSNR with a smaller or similar standard deviation with respect to both the models in [3,9] respectively. Furthermore, from the two rightmost columns of Table 2, it can be seen that from the computational viewpoint our method is almost as good as the fastest of the alternative methods. An example of the segmentation obtained by means of our method is shown in Fig. 9(a). Overall the computed results have been considered very accurate by the medical doctors involved in the study and a thorough quantitative analysis on a larger number of patients is under way. The results achieved by the method

60

G. Chiusano et al. / Artificial Intelligence in Medicine 61 (2014) 53–61

0.14

conditions. Finally, the method can be extended to different medical contexts with little effort leveraging on the adaptiveness of the whole framework and the minimal request of supervision from the physician side. Future work focuses on the implementation of the proposed scheme on distributed architecture. In conclusion, we believe that the proposed method can constitute the core of a computer aided diagnosis system for DCE-MRI studies, at least in the domains in which motion compensation is not a too difficult problem.

0.12 0.10 0.08 0.06

Acknowledgements

0.04 0.02 0.00 0

20

40

60 80 Time Step

100

120

The authors wish to gratefully thank Gianmarco Ghiggeri, Beatrice Damasio and Chiara Mattiuz of IGG for their invaluable help in data acquisition, preparation and analysis. G.C. was supported by a fellowship from IGG. References

Fig. 10. JIA dataset dictionary. The black and grey atoms, mostly used in the reconstruction of the enhancement curves in the clusters corresponding to synovial volume and vessel tissue, resembles the expected behavior of the contrast uptake as a function of the time steps.

described in [3] are shown in Fig. 9(b). Notice the visibly larger percentage of false negatives for the synovial volumes and false positives for vessels with respect to the segmentation of Fig. 9(a). Even if the ground truth is not readily available, an indirect measure of the effectiveness of our method can be obtained by looking at the shape of the obtained atoms. The black and the grey curve in Fig. 10 depicts the atoms most frequently used in the reconstruction of the enhancement curves corresponding to synovial volumes and vessels respectively. As confirmed by trained physicians, the two atoms capture the main features of the corresponding tissue response and constitute a very good starting point for a segmentation stage the quality of which greatly depends on the effectiveness of the obtained representation. 5. Conclusions In this work we proposed and discussed an unsupervised method for tissue segmentation in DCE-MRI which consists of three steps. In the first step the acquired images are spatially registered to compensate for patient movements. Then, the time-varying signals measured at each voxel are represented as a sparse linear combination of adaptive basis signals. Both the coefficients of the combination and the basis signals are learned from the available data by solving a suitable optimization problem of adaptive and sparse dictionary learning. Finally in the third step the identification of the different tissues is obtained by means of standard clustering algorithms applied to the computed representation. The presented analysis shows that the proposed method has several interesting features. First, the basis signals resemble prototypes of the acquired enhancement curves. This property leads to a naturally sparse representation of the original enhancement curves which can be easily interpretable in terms of the used atoms. Experiments indicate that the intraclass variability of the enhancement curves corresponding to the same tissue is captured by a dictionary size an order of magnitude larger than the number of different tissues of interest. Second, the obtained reconstructions are denoised version and retain all the relevant features of the underlying curves without the need of ad hoc hypothesis on the shape of the acquired signals. Consequently, the segmentation which is obtained by means of standard clustering techniques is very accurate. Third, the method is computationally efficient and provides results which appear to be independent of the initialization

[1] Eida S, Ohki M, Sumi M, Yamada T, Nakamura T. MR factor analysis: improved technology for the assessment of 2D dynamic structures of benign and malignant salivary gland tumors. Journal of Magnetic Resonance Imaging 2008;27(6):1256–62. [2] Alonzi R, Padhani AR, Allen C. Dynamic contrast enhanced MRI in prostate cancer. European Journal of Radiology 2007;63(3):335–50. [3] Agner S, Xu J, Fatakdawala H, Ganesan S, Madabhushi A, Englander S, et al. Segmentation and classification of triple negative breast cancers using DCEMRI. In: IEEE international symposium on biomedical imaging: from nano to macro, 2009, ISBI’09. 2009. p. 1227–30. [4] Schmid VJ, Whitcher B, Padhani AR, Yang GZ. Quantitative analysis of dynamic contrast-enhanced MR images based on Bayesian P-splines. IEEE Transactions on Medical Imaging 2009;28(6):789–98. [5] Harris N, Gauden V, Fraser P, Williams S, Parker G. MRI measurement of blood–brain barrier permeability following spontaneous reperfusion in the starch microsphere model of ischemia. Magnetic Resonance Imaging 2002;20(3):221–30. [6] Zöllner FG, Sance R, Rogelj P, Ledesma-Carbayo MJ, Rorvik J, Santos A, et al. Assessment of 3D DCE-MRI of the kidneys using non-rigid image registration and segmentation of voxel time courses. Computerized Medical Imaging and Graphics 2009;33(3):171–81. [7] Kubassova O, Boesen M, Boyle R, Cimmino M, Jensen K, Bliddal H, et al. Fast and robust analysis of dynamic contrast enhanced MRI datasets. In: Ayache N, Ourselin S, Maeder A, editors. Medical image computing and computer-assisted intervention – MICCAI 2007; vol. 4792 of lecture notes in computer science. Berlin/Heidelberg: Springer; 2007. p. 261–9. [8] Kubassova O, Boyle R, Pyatnizkiy M. Bone segmentation in metacarpophalangeal MR data. In: Singh S, Singh M, Apte C, Perner P, editors. Pattern recognition and image analysis; vol. 3687 of lecture notes in computer science. Berlin/Heidelberg: Springer; 2005. p. 726–35. [9] Guo J, Reddick W. DCE-MRI pixel-by-pixel quantitative curve pattern analysis and its application to osteosarcoma. Journal of Magnetic Resonance 2009;30(1):177–84. [10] Levman J, Leung T, Causer P, Plewes D, Martel AL. Classification of dynamic contrast-enhanced magnetic resonance breast lesions by support vector machines. IEEE Transactions on Medical Imaging 2008;27(5):688–96. [11] Olshausen B, Field D. Sparse coding with an overcomplete basis set: a strategy employed by v1? Vision Research 1997;37(23):3311–25. [12] Mallat S. A wavelet tour of signal processing. 2nd ed. New York: Academic Press; 1999. [13] Aharon M, Elad M, Bruckstein A. K-svd: an algorithm for designing overcomplete dictionaries for sparse representation. IEEE Transactions on Signal Processing 2006;54(11):4311–22. [14] Bruckstein AM, Donoho DL, Elad M. From sparse solutions of systems of equations to sparse modeling of signals and images. SIAM Review 2009;51(1):34–81. [15] Mairal J, Bach F, Ponce J, Sapiro G. Online learning for matrix factorization and sparse coding. Journal of Machine Learning Research 2010;11(1):19–60. [16] Mairal J, Sapiro G, Elad M. Learning multiscale sparse representations for image and video restoration. In: Proceedings of international conference of computer vision. 2007. p. 2272–9. [17] Ranzato M, Puoltney C, Chopra S, Yann L. Efficient learning of sparse representations with an energy-based model. In: Advances in neural information processing systems. 2006. p. 1137–44. [18] Ranzato M, Huang FJ, Boureau YL, LeCun Y. Unsupervised learning of invariant feature hierarchies with applications to object recognition. In: IEEE conference on computer vision and pattern recognition, 2007, CVPR ’07. 2007. p. 1–8. [19] Lee H, Battle A, Raina R, Ng AY. Efficient sparse coding algorithms. In: Schölkopf B, Platt J, Hoffman T, editors. Advances in neural information processing systems, vol. 19. Cambridge, MA: MIT Press; 2007. p. 801–8.

G. Chiusano et al. / Artificial Intelligence in Medicine 61 (2014) 53–61 [20] Raina R, Battle A, Lee H, Packer B, Ng AY. Self-taught learning: transfer learning from unlabeled data. In: Proceedings of the 24th international conference on machine learning, ICML ’07. ACM; 2007. p. 759–66. [21] Zhou M, Chen H, Paisley J, Ren L, Sapiro G, Carin L. Non-parametric Bayesian dictionary learning for sparse image representations. In: Neural information processing systems. 2009 December. p. 1–9. [22] Paisley J, Carin L. Nonparametric factor analysis with beta process priors. In: Proceedings of the 26th annual international conference on machine learning – ICML ’09. New York, NY, USA: ACM Press; 2009. p. 1–8. [23] Knowles D, Ghahramani Z. Infinite sparse factor analysis and infinite independent components analysis. In: Independent component analysis and signal. Z X. 2007. p. 381–8. [24] Griffiths TL, Gaharamani Z. Infinite latent feature models and the Indian buffet process infinite latent feature models and the indian buffet process. In: Advances in neural information processing systems. 2005. p. 475–82. [25] Kim T, Shakhnarovich G, Urtasun R. Sparse coding for learning interpretable spatio-temporal primitives. In: Lafferty J, Williams C, Shawe-taylor J, Zemel R, Culotta A, editors. Advances in neural information processing systems, 23. Cambridge, MA: MIT Press; 2010. p. 1117–25. [26] Chiusano G, Staglianò A, Basso C, Verri A. DCE-MRI analysis using sparse adaptive representations. In: Suzuki K, Wang F, Shen D, Yan P, editors. Machine learning in medical imaging; vol. 7009 of lecture notes in computer science. Berlin/Heidelberg: Springer; 2011. p. 67–74. [27] Cordero-Grande L, Merino-Caviedes S, Aja-Fernandez S, Alberola-Lopez C. Groupwise elastic registration by a new sparsity-promoting metric: application to the alignment of cardiac magnetic resonance perfusion images. IEEE Transactions on Pattern Analysis and Machine Intelligence 2013;35(11): 2638–50. [28] Makela T, Clarysse P, Sipila O, Pauna N, Pham QC, Katila T, et al. A review of cardiac image registration methods. IEEE Transactions on Medical Imaging 2002;21(9):1011–21. [29] Yujun G, Radhika S, Cheng-Chang L, Jasjit SS, Swamy L. Breast image registration techniques: a survey. Medical and Biological Engineering and Computing 2006;44(1–2):15–26.

61

[30] Torsten R, Calvin RMJ, Walter GO, Jianhui Z. Modeling liver motion and deformation during the respiratory cycle using intensity-based nonrigid registration of gated MR images. Medical Physics 2004;31(3):427–32. [31] Basso C, Santoro M, Verri A, Villa S. Paddle: proximal algorithm for dual dictionaries learning. In: Honkela T, Duch W, Girolami M, Kaski S, editors. Artificial neural networks and machine learning – ICANN 2011; vol. 6791 of lecture notes in computer science. Berlin/Heidelberg: Springer; 2011. p. 379–86. [32] Otsu N. A threshold selection method from gray-level histograms. Automatica 1975;20(1):62–6. [33] Luxburg U. A tutorial on spectral clustering. Statistics and Computing 2007;17(4):395–416. [34] Crum W, Camara O, Hill D. Generalized overlap measures for evaluation and validation in medical image analysis. IEEE Transactions on Medical Imaging 2006;25(11):1451–61. [35] Lavini C, de Jonge MC, van de Sande MGH, Tak PP, Nederveen AJ, Maas M. Pixelby-pixel analysis of DCE MRI curve patterns and an illustration of its application to the imaging of the musculoskeletal system. Magnetic Resonance Imaging 2007;25(5):604–12. [36] Staglianò A, Chiusano G, Basso C, Santoro M. Learning adaptive and sparse representations of medical images. In: Menze B, Langs G, Tu Z, Criminisi A, editors. Medical computer vision. Recognition techniques and applications in medical imaging; vol. 6533 of lecture notes in computer science. Berlin/Heidelberg: Springer; 2011. p. 130–40. [37] Rohrschneider WK, Haufe S, Clorius JH, Tröger J. MR to assess renal function in children. European Radiology 2003;13(5):1033–45. [38] Vivier PH, Dolores M, Taylor M, Dacher JN. MR urography in children. Part 2: how to use Imagej MR urography processing software. Pediatric Radiology 2010;40:739–46. [39] Damasio MB, Malattia C, Martini A, Tomà P. Synovial and inflammatory diseases in childhood: role of new imaging modalities in the assessment of patients with juvenile idiopathic arthritis. Pediatric Radiology 2010;40(6):985–98. [40] Malattia C, Damasio MB, Basso C, Verri A, Magnaguagno F, Viola S, et al. Dynamic contrast-enhanced magnetic resonance imaging in the assessment of disease activity in patients with juvenile idiopathic arthritis. Rheumatology (Oxford, England) 2010;49(1):178–85.

Unsupervised tissue segmentation from dynamic contrast-enhanced magnetic resonance imaging.

Design, implement, and validate an unsupervised method for tissue segmentation from dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI)...
2MB Sizes 0 Downloads 4 Views