IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 23, NO. 3, MARCH 2014

1143

Convex-Relaxed Kernel Mapping for Image Segmentation Mohamed Ben Salah, Ismail Ben Ayed, Jing Yuan, and Hong Zhang

Abstract— This paper investigates a convex-relaxed kernel mapping formulation of image segmentation. We optimize, under some partition constraints, a functional containing two characteristic terms: 1) a data term, which maps the observation space to a higher (possibly infinite) dimensional feature space via a kernel function, thereby evaluating nonlinear distances between the observations and segments parameters and 2) a total-variation term, which favors smooth segment surfaces (or boundaries). The algorithm iterates two steps: 1) a convex-relaxation optimization with respect to the segments by solving an equivalent constrained problem via the augmented Lagrange multiplier method and 2) a convergent fixed-point optimization with respect to the segments parameters. The proposed algorithm can bear with a variety of image types without the need for complex and application-specific statistical modeling, while having the computational benefits of convex relaxation. Our solution is amenable to parallelized implementations on graphics processing units (GPUs) and extends easily to high dimensions. We evaluated the proposed algorithm with several sets of comprehensive experiments and comparisons, including: 1) computational evaluations over 3D medical-imaging examples and high-resolution large-size color photographs, which demonstrate that a parallelized implementation of the proposed method run on a GPU can bring a significant speed-up and 2) accuracy evaluations against five state-of-theart methods over the Berkeley color-image database and a multimodel synthetic data set, which demonstrates competitive performances of the algorithm. Index Terms— Image segmentation, kernel mapping, convex relaxation, augmented Lagrangian method, fixed-point optimization, graphics processing unit.

I. I NTRODUCTION

I

MAGE segmentation is a frequent low-level pre-processing step whose purpose is to partition an image into meaningful and coherent segments answering a given description. This step is generally essential in the development of computer vision and image understanding algorithms because it provides a simpler portrayal of the input image. As demonstrated by several recent studies, e.g., [1]–[7], among others, functionaloptimization formulations can lead to the most effective Manuscript received May 17, 2013; revised October 15, 2013; accepted December 4, 2013. Date of publication January 2, 2014; date of current version February 4, 2014. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Nikolaos V. Boulgouris. M. Ben Salah and H. Zhang are with the University of Alberta, Edmonton, AB T6G 2C5, Canada (e-mail: [email protected]; [email protected]). I. Ben Ayed is with General Electric Canada, London, ON N6A 4V2, Canada, and also with the University of Western Ontario, London, ON N6A 3K7, Canada (e-mail: [email protected]). J. Yuan is with the University of Western Ontario, London, ON N6A 3K7, Canada, and also with Robarts Research Institute, London, ON N6A 5C1, Canada (e-mail: [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TIP.2013.2297019

segmentation algorithms, mainly because they state the problem in a principled and flexible way. In this setting, the solution is obtained by optimizing an objective functional, which embeds various constraints on image segments. The most common are constraints that evaluate conformity of image data within each of the segments to some parametric statistical models [1], [2], [5]–[10], and are often referred to as data terms. In the context of active contour and level set methods, and following the seminal works of Chan and Vese [9] and Zhu and Yuille [10], data terms based on the Gaussian model [10] (and its simpler piecewise constant variant [9]) have been widely adopted to images acquired by conventional cameras, mainly because they lead to computationally simple updates of the segments and their Gaussian parameters (mean and variance) [6], [8]–[10]. Although useful in some cases, the Gaussian model is not descriptive in general. Indeed, images acquired by sensors other than conventional cameras, e.g., medical [2], [11], radar [1], [12], [13] and sonar images [14], often do not follow Gaussian models. For instance, several recent studies derived application-specific curve evolution flows that account for much more complex models, which are more descriptive for specific types of imaging noise, e.g., the G 0A model for synthetic aperture radar (SAR) images [1], Gamma for mammography images [2], and Fisher-Tippett for ultrasound data [7]. In some cases, images may require different models for each of its segments, which further complicate statistical modeling. For instance, in sonar imagery, the luminance within shadow regions is well modeled by a Gaussian whereas Rayleigh is more accurate in the reverberation regions [14]. There are three main difficulties inherent to most of the existing statistical level set and active contour methods. First, statistical modeling may be difficult and time consuming [15]. The updates of the statistical parameters of such models cannot generally be written in a simple closed form, which enhances significantly the computational burden. Second, and independently of the choice of a specific data term (or statistical model), level set methods become computationally very demanding and difficult to implement in the case of N segments with N > 2, often referred to as multiphase segmentation [8]. The complexity of the ensuing coupled partial differential equations behaves exponentially with respect to N [5], [8], which precludes the use of multiphase level sets in applications where the number of segments and image size are large, e.g., high-resolution color photographs (See the example in Fig. 1). Third, these methods rely on standard gradient-descent procedures, which yield

1057-7149 © 2014 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

1144

IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 23, NO. 3, MARCH 2014

Fig. 1. A typical high-resolution color photograph used for computational comparisons: (a) the image (4320 × 3240 pixels); (b) segmentation into 20 regions using the proposed method.

sub-optimal solutions and slow evolutions toward the final segments. The recent study in [3] attempted to address some of these difficulties with a kernel graph cut formulation. By mapping the image data via a kernel function into data of a higher dimension, the authors in [3] built a functional which contains two terms: (1) a multi-label, unary-potential term that enforces homogeneity of the higher-dimensional data within each segment (each segment is defined by a label); and (2) a pairwise-interaction term, which encourages neighboring pixels to have the same label. The functional is amenable to optimization by multi-label graph cuts (e.g., α-expansion) [16], which is more efficient than multiphase level sets. The results in [3] showed that kernel mapping can be a quite effective and flexible alternative to several statistical models (e.g., Gamma, exponential, Gaussian), and can handle various types of images. This is consistent with several pattern classification studies, which have shown evidence that kernel mapping is capable of properly clustering data of non-linear structure [17]–[19]. Multi-label graph cuts [3] can be efficient in the context of 2D grids, with a relatively small number of labels (or segments). However, the extension of [3] to applications where the number of segments and/or image size are large, as is the case of typical high-resolution color photographs, or when the image dimension is high (3D or higher) as is the case of medical data, may result in a very heavy computation and memory load (See Figs. 2 and 3). The complexity of multilabel graph cuts increases rapidly with the number of segments because the optimization involves computations for all pairs of labels. Furthermore, graph cuts are not amenable to parallel computations [20]. In practice, it is well known that graph cuts can yield an excellent performance in the case of 2D grids with 4-neighborhood systems [21]. However, the efficiency may decrease considerably when moving from 2D to 3D (or higher-dimensional) grids and when using larger-neighborhood grids [22]. Another major limitation of such graph-based approaches is the well-known grid bias (or metrication error) [23]. This problem can be addressed by considering larger set of neighboring graph nodes, which unfortunately results in heavy computation and memory loads [24]. Many recent graph-based methods attempted to overcome some of these inherent problems. For instance, the authors of [25], [26] built on the nonnegative matrix factorization (NMF) framework to come up with efficient variants for image segmentation, surveillance, as well as face recognition. This study investigates a convex-relaxed kernel mapping formulation of image segmentation. We optimize, under some

Fig. 2. Execution time vs. the size of the image for the proposed method and the graph cut (GC)-based method. The number of segmentation regions is 20.

Fig. 3. Execution time of the proposed method and graph cuts versus the number of regions.

partition constraints, a functional containing two characteristic terms: (1) a data term which maps the observation space to a higher (possibly infinite) dimensional feature space via a kernel function, thereby evaluating non-linear distances between the observations and segments parameters; and (2) a total-variation (TV) term which favors smooth segment surfaces (or boundaries). The algorithm iterates two steps: (1) a convex-relaxation optimization with respect to the segments by solving an equivalent constrained problem via the augmented Lagrange multiplier method; and (2) a convergent fixed-point optimization with respect to the segments parameters. The proposed algorithm can bear with a variety of image types (because it does not assume a specific statistical model), while having the computational benefits of convex relaxation. Our solution is amenable to parallelized implementations on graphics processing units (GPU), extends easily to high dimensions and does not have the grid-bias problem, unlike graph cuts. A conference version of this work has been presented at ICIP 2012 [27]. This journal version expands on [27] with a broader, more informative discussion of the subject and more rigorous, wider investigation which includes computational evaluations with regard to graph cut in terms of speed up. Several new experiments with radar, 3D cardiac and brain data have been added to enhance the computation appraisal. Moreover, the theoretic part has been considerably enhanced by including detailed proofs of the paper’s propositions. We evaluate the proposed algorithm with three sets of comprehensive experiments and comparisons, including: • Computational evaluations, which demonstrate that the proposed convex-relaxation solution can bring a substantial speed-up in comparison to the recent kernel graph cut method in [3];

BEN SALAH et al.: CONVEX-RELAXED KERNEL MAPPING FOR IMAGE SEGMENTATION





Comparative evaluations against five state-of-the-art methods [3], [6], [28]–[30] over two different datasets: (1) The multi-model synthetic dataset in [3] and (2) the Berkeley color-image database; and Examples with 3D medical-imaging data, including brain and cardiac data, as well as with radar data corrupted with a strong multiplicative noise. We report computational evaluations over the 3D examples, which demonstrate that a parallelized implementation of the proposed method run on a graphics processing unit (GPU) can bring a significant speed-up. II. C ONVEX -R ELAXED K ERNEL M APPING

Image data within different segments may not be separable linearly in many applications. This is due, for instance, to very high levels of noise, e.g., a strong multiplicative noise in the case of SAR data (See the example in Fig. 6) [1], [12]. Rather than modeling explicitly such nonlinearity with complex statistical models, we map the image data to a higher (possibly infinite) dimensional feature space via a kernel function, so that linear separation becomes applicable. Let I :  ⊂ Rn → I (n is an integer) the image function, with I the set of image values and  is the image domain. Let φ(.) be a non-linear mapping from the observation space I to a higher (possibly infinite) dimensional feature space J . We propose to find a partition Ri , i = 1 . . . N, and a set of segments parameters mi , i = 1 . . . N, minimizing the following functional: F K (Ri , i = 1 . . . N; mi , i = 1 . . . N) N   φ(I (x)) − φ(mi )2 dx + = i=1



Ri

 Dat a



λ |∂R | (1)   i Regularizat ion

t erm

where |∂Ri | measures the perimeter of region Ri . F K measures kernel-induced non-Euclidean distances between the observations and the segments parameters. In machine learning, the kernel trick consists of using a linear classifier to solve a non-linear problem by mapping the original non-linear data into a higher-dimensional feature space. Following the Mercer’s theorem [31], which states that any continuous, symmetric, positive semi-definite kernel function can be expressed as a dot product in a higherdimensional space, we do not have to know explicitly the mapping φ. Instead, we can use a kernel function, K (y, z), satisfying: K (y, z) = φ(y)T · φ(z), ∀ (y, z) ∈ I 2 ,

(2)

where “·” is the dot product in the feature space. Substitution of the kernel functions in the data term yields the following non-Euclidean distance measure in the original data space:  JK I, mi  φ(I ) − φ(mi )2 = (φ(I ) − φ(mi ))T · (φ(I ) − φ(mi ))    = K I, I + K mi , mi − 2K I, mi

(3)

1145

TABLE I E XAMPLES OF P REVALENT K ERNEL F UNCTIONS

Let u i :  → {0, 1} denotes the indicator function of region Ri :

1 ∀x ∈ Ri u i (x) = 0 ∀x ∈ / Ri . It is well-known that the perimeter of region Ri ⊂  can be evaluated by the total-variation (TV-L1) term corresponding to its indicator function [32]:  |∇u i | dx |∂Ri | = (4) 

N Thus, writing F K as a function of {u i }i=1 and ignoring constant terms, our problem amounts to the following optimization: N     arg min u i K mi , mi − 2K I, mi dx 

u i ,i=1...N;mi ,i=1...N i=1



+λ s.t.



N 

|∇u i | dx u i (x) = 1, ∀x ∈ 

(5)

i=1

The constraint in the second line in (5) ensures that indicator functions u i , i = 1 . . . N, define a partition of . F K depends both on the indicator functions u i , i = 1 . . . N, and on the segments parameters mi , i = 1 . . . N. Therefore, we adopt an iterative two-step algorithm. The first step consists of fixing the partition and optimizing F K with respect to the parameters via fixed-point computations. As the totalvariation term does not depend on segments parameters, this is equivalent to optimizing the data term. The second step consists of a convex-relaxation optimization over the indicator functions, with the parameters fixed. A. Fixed-Point Optimization With Respect to the Segments Parameters For a fixed set of indicator functions u i , i = 1 . . . N, the derivatives of F K with respect to mi , i = 1 . . . N, yield the following equations   ∂ JK I, mi ∂F K = dx ∂mi ∂mi Ri  ∂  K mi , mi − 2K I, mi dx = (6) Ri ∂mi Table I lists some common kernel functions. In all our experiments we used the radial basis function (RBF) kernel, a kernel which has been prevalent in pattern data clustering [18], [33]. With an RBF kernel, the necessary conditions for a minimum of F K with respect to region parameters are: mi − gu i (mi ) = 0, i ∈ [1, . . . , N]

(7)

1146

IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 23, NO. 3, MARCH 2014

where

u i I K (I, mi )dx gu i (mi ) =  , i ∈ [1, . . . , N]  u i K (I, mi )dx

with + the simplex set, i.e., (8)

The solution of (7) can be obtained by fixed-point iterations. This consists of iterating (n is the iteration index) mi,n+1 = gu i (mi,n ), i ∈ [1, . . . , N], n = 1, 2, . . .

(9)

Proposition 1: For i ∈ [1, . . . , N], sequence {mi,n }n=1,2,... opt converges and its limit, mi , is a solution of (7). A detailed convergence proof is given in Appendix A. In the case that mi,n is convergent, proving that its limit at convergence is a minimum of the functional is straightforward. In fact, we have: opt

mi

= li m n→+∞ mi,n+1 = li m n→+∞ g(mi,n ) opt

= g(li m n→+∞ mi,n ) = g(mi )

(10)

opt mi

Consequently, is a solution of the necessary condition obtained in (7). The RBF kernel, although simple, yields outstanding results on a large set of experiments with images of various types (see the experimental section). Also, note that, obviously, this RBF kernel method does not reduce to the Gaussian (or piecewise constant) model methods commonly used in the literature [9], [10], [34]. The difference can be clearly seen in the fixed point update of the segment parameters in (9) which, in the case of the Gaussian model, would be the simple mean update inside each segment [9], [10], [34]. B. Convex-Relaxation Optimization With Respect to the Indicator Functions For a fixed set of parameters mi , i = 1 . . . N, the optimization problem in (5) becomes:  N   |∇u i | dx arg min u i ρi (., mi )dx + λ u i ∈{0,1},i=1...N i=1 N 





u i (x) = 1, ∀x ∈ 

s.t.

(11)

i=1

where functions ρi (., mi ) :  → R are defined as follows   (12) ρi (x, mi ) = K mi , mi − 2K I (x), mi Model (11) is non-convex due to the binary configuration of each function u i ∈ {0, 1}. We relax such binary constraints to the convex interval [0, 1], and approximate (11) by the following convex optimization problem:  N   |∇u i | dx arg min u i ρi (., mi )dx + λ u∈S

s.t.

i=1 N 





u i (x) = 1, ∀x ∈ 

(13)

u(x) = (u 1 (x), ..., u N (x)) ∈ + when ∀x ∈  N  u i (x) = 1, u i (x) ∈ [0, 1], i = 1, . . . , N

(15)

i=1

It is worth noting that, in the case of two regions (N = 2), a thresholding of the solution of the relaxation in Eq. (13) yields a global optimum of the initial binary-valued problem in Eq. (11). This follows directly from the thresholding theorem introduced in the work Nikolova et al. [35]. In the multiregion case (N > 2), convex relaxation (13) is an approximation of (11), i.e., there is no theoretical guarantee that the solution of (13) equals exactly the optimum of (11). However, although they yield only an approximate solution for N > 2, such convex relaxations have been recently shown effective in solving the classical Potts model (e.g. [36]). Also, in the case of our kernel mapping formulation, we will show in the experimental section that such an approximation is quite effective. To solve efficiently (13) via convex optimization [37], [38], let us first consider the following result: Proposition 2: For a fixed set of segment parameters mi , i = 1 . . . N, the optimization problem in (13) is equivalent to the following constrained optimization problem:  max ps dx s.t. ps ,p,q



div qi (x) − ps (x) + pi (x) = 0 , ∀x ∈ , i = 1, . . . , N pi (x) ≤ ρi (x, mi ), ∀x ∈ , i = 1, . . . , N |qi (x)| ≤ λ, ∀x ∈ , i = 1, . . . , N (16) where p(x) = ( p1 (x), . . . , p N (x)) :  → R N and q(x) = (q1 (x), . . . , q N (x)) :  → R N are two vector-valued functions, and ps :  → R is a scalar function. A detailed proof is given in Appendix B. Proposition 2 allows us to derive an efficient multiplierbased algorithm for optimizing (13) with respect to u i , i = 1, . . . , N. The algorithm is based on the standard augmented Lagrangian method [39], [40] whose convergence properties can be found in [37], [41]. We define the following augmented Lagrangian function (c > 0):   N  ps + u i (div qi − ps + pi ) L c ( ps , p, q; u) = 

i=1

N  c div qi − ps + pi 2 − 2 

 dx (17)

i=1

Here follows, in Algorithm 1, a summary of the algorithm (k is the iteration number and the steps below are repeated until convergence). The final partition is obtained as follows: x ∈ Ri if i = maxk=1,...,N u k (x).

i=1

where S is the convex constrained set of u(x) (u 1 (x), ..., u N (x)): S = {u(x)|(u 1 (x), ..., u N (x)) ∈ + ∀x ∈ },

= (14)

C. Parallel Structure of Algorithm 1 In the numerical steps of Algorithm 1, optimization w.r.t qi in (18) and w.r.t pi in (19), i = 1, . . . N, can be handled

BEN SALAH et al.: CONVEX-RELAXED KERNEL MAPPING FOR IMAGE SEGMENTATION

Algorithm 1 Augmented-Lagrangian Optimization

independently for each region i . Therefore, (18) and (19) can be implemented in a parallel way. After computation of (18) and (19), variables ps and segmentation functions u i , i = 1, . . . , N, are updated. D. Complexity of Algorithm 1 w.r.t N and Differences in Complexity With Other Optimizers From the steps detailed in Algorithm 1, one can see that the complexity of the proposed method is linear w.r.t the number of segments N. Such linear behavior is due to the fact that, for each region i, i = 1, . . . , N, updates (20) of pixel-membership functions u i and updates (18)–(19) w.r.t variables qi and pi are independent of the other regions j, j = i . This contrasts with the graph cut method in [3], which requires pairwise interactions between the different regions. We will confirm later in the experiments the linear behavior of our method (Fig. 3). For segmentation methods based on curvature flows and coupled partial differential equations, e.g., active curves and level sets [5], [8], the time complexity often behaves exponentially with respect to N [5], [8]. Enforcing smoothness with curvature flows is difficult to extend to the multiregion case, i.e., N > 2 [5], [8]. For N > 2, curvature flows are often implemented by evolving multiple curves (or level sets), which require coupling constraints that are computationally very demanding and difficult to implement, e.g., the multiphase segmentation algorithms in [5] and [8]. The complexity of the ensuing coupled partial differential equations often behaves exponentially with respect to N, which precludes the use of curvature flows in applications where the number of segments and image size are very large, e.g., high-resolution color photographs (See the example in Fig. 1).

1147

III. E XPERIMENTAL R ESULT We evaluated the proposed algorithm with three sets of comprehensive experiments, including: • Computational evaluations which demonstrate that the proposed convex-relaxation solution can bring a substantial speed-up in comparison to the recent kernel graph cut method in [3], more so when the image size and/or the number of segments are large, as is the case of typical high-resolution color photographs, or when the image dimension is high (3D or higher) as is the case of medical data; • Comparative evaluations against five state-of-the-art methods: [3], [28], NCuts [29], Mean-Shift [30] and [6]. These comparisons were carried out over two different datasets: (1) the multi-model synthetic dataset in [3] and (2) the Berkeley color-image database. We used the following acronyms to refer to each of the competing methods: – The proposed method: CRKM (convex-relaxed kernel mapping) – The graph cut method in [3]: PKGC (parametric kernel graph cuts) – The method in [28]: PRIF – The method in [6], which uses the piecewise Gaussian model in a convex optimization framework: CPGM – Normalized cuts [29]: NCuts – Mean shift [30]: Mean-Shift; and • Examples with 3D medical-imaging data, including brain and cardiac data, as well as with radar data corrupted with a strong multiplicative noise. We report computational evaluations over the 3D examples, which demonstrate that a parallelized implementation of the proposed method run on a graphics processing unit (GPU)1 can bring a significant speed-up. The value of parameter λ for the CRKM method is chosen by varying it in an interval about the value of 2. We made this choice following prescriptions for prior works based on the level set method [43]. The study in [43] showed that a value of smoothness weight approximately equal to 2 is optimal for distributions from the exponential family such as the piecewise Gaussian model [43] and has an interesting minimum description length (MDL) interpretation. A. Computational Comparisons Using a typical high-resolution color photograph of size 4320 × 3240 pixels (Fig. 1), we assessed the computational behavior of the proposed method (CRKM) and the kernel graph cut method in [3] (PKGC) with regard to the number of segmentation regions as well as the image size. We used the parallel-implementation version of CRKM. Fig. 2 plots the execution time as a function of the image size for CRKM 1 We evaluated two implementations (with and without parallelization), one based on a graphics processing unit (GPU) and the other on a central processing unit (CPU). The parallelized implementation was run on a GeForce 310M GPU, and the non-parallelized version on a Intel Pentium 1.73 GHz CPU.

1148

IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 23, NO. 3, MARCH 2014

TABLE II P ERFORMANCE OF THE CRKM M ETHOD A GAINST S EVERAL R ECENT

TABLE III AVERAGE P ERFORMANCES OF THE CRKM M ETHOD A GAINST THE

A LGORITHMS IN T ERMS OF THE PRI AND VoI M EASURES ON THE

C OMPETING A LGORITHMS ON THE B ERKELEY D ATABASE [46]

S YNTHETIC M ULTI -M ODEL D ATASET I NTRODUCED IN [3]

and PKGC by varying the image size from 10% to 80% of the original size. For every size, the image is segmented into 20 regions. The computational time increases much more rapidly for the graph cut technique, with an exponentially growing gap between the two methods as the image size increases. Fig. 3 plots the evolution of the execution time as a function of the number of segmentation regions, which is varied between 4 and 20. Starting from 8 regions, PKGC yielded a much steeper slope. Recall that the graph cut method cannot be parallelized mainly because of the interactions between the different labels modeled by the regularization term of the graph cut energy. B. Comparative Evaluations in Regard to Accuracy In this section, we report a comparative verification of CRKM against several state-of-the-art methods: PKGC [3], PRIF [28], NCuts [29], Mean-Shift [30], and CPGM [6]. To demonstrate the ability of the proposed method to adapt to various types of images, we performed comparisons over two completely different datasets: (1) the multi-model synthetic dataset introduced in [3] and (2) the well known Berkeley color-data benchmark. The tuning parameters of all the competing algorithms are set to their optimal values or to the values suggested by the authors. The evaluations are based on two accuracy measures: the Probabilistic Rand Index (PRI) [44] and the Variation of Information (VoI) [45]. The values of VoI are in [0 ∞[, lower is better, and PRI ranges in [0 1], higher is better. 1) The Multi-Model Synthetic Data: In this part, we report quantitative and comparative performance evaluations over the large multi-model synthetic dataset introduced in [3]. This dataset set contains more than 100 synthetic two-segment images, each corresponding to a different combination of a noise model and contrast level between the segments. The added noise models include the Gaussian, exponential, and Gamma distributions. The contrast levels are evaluated by the Bhattacharyya measure between intensity distributions within the segments [3]. The rationale behind using this dataset is to evaluate the ability of the competing algorithms as to handling various types of imaging noise. More details about this dataset and a sample of images are given in section III-C. Table II reports the performances in terms of PRI and VoI for CRKM, PKGC [3], PRIF [28] and CPGM [6]. The proposed method yielded a very competitive performance, and obtained the best results in terms of both criteria. Note that the kernel graph cut method (PKGC) yielded the second best performance. This is expected because PKGC uses a

kernel-mapping formulation that can handle various types of imaging noise. It is also worth noting that PRIF [28] obtained the worst performance on this multi-model data, although it yielded a very competitive performance of the Berkeley colordata benchmark [28] (PRIF is the state-of-the-art method on the Berkeley dataset). This is also expected because PRIF is designed specifically for natural color images. The proposed method presents also lower variance compared to the other methods which is expected. Indeed, this is another confirmation that CRKM handles better data with various statistical models and is not model specific. Since this data set is generated from three different combinations of noise models, the competing techniques perform poorly when applied to the subset of images with models they are not designed for. This results in a higher variance of the performance measures VoI and PRI. The same statement holds for the Berkeley dataset presented in the following. 2) The Berkley Color-Image Data: We run CRKM, PKGC, PRIF, NCuts and Mean-Shift on the Berkeley database. We used both the training and testing images of this database, which amounts to 300 color images with different number of regions. For each image, a set of benchmark segmentation results are provided by 4 to 7 human observers [46]. The evaluations are based on the same accuracy measures as the previous experiments (PRI and VoI). Note that the PRI takes into account the variability of segmentations between the human observers by averaging, with regard to human segmentations, the number of pairs of pixels consistently labeled in the automatic and ground-truth segmentations. Table III reports the performance of CRKM and the competing algorithms in terms of VoI and PRI. The first line of Table III corresponds to the performances of human observers. Although not specifically developed for natural color images, CRKM outperforms PKGC, Mean-Shift, and NCuts in terms of the VoI index. In terms of the PRI index, CRKM outperforms NCuts and yields a performance similar to Mean-Shift. Only PRIF yields the best results in regard to both measures. For visual inspection, further examples of the segmentations obtained by the CRKM on the Berkeley database are shown in Fig. 10. C. Robustness With Respect to the Noise Model In this section we evaluate the robustness of the proposed method with respect to the noise model over the large multimodel synthetic dataset we used earlier. To further illustrate the effect of the choice of a specific noise model, we further support our evaluations by comparisons with two methods

BEN SALAH et al.: CONVEX-RELAXED KERNEL MAPPING FOR IMAGE SEGMENTATION

1149

Fig. 6. Segmentation of (a) ERS SAR image (size 250× 250) corrupted with a strong multiplicative noise and (b) a Monolook SAR image (size 361×151). Fig. 4. (a) Sample images from the synthetic multi-model dataset with Gamma (first row) and exponential (second row) noises; (b) segmentation results with PGM; (c) segmentation results with CPGM; and (d) the CRKM. Image size: 241 × 183.

Fig. 5. Study of the effect of the contrast between regions on the segmentation error for different methods over a large number of experiments: comparisons over the subset of synthetic images perturbed with a Gaussian noise in (a), the exponential noise in (b), and the Gamma noise in (c).

based on the commonly used Gaussian Model: the first one, referred to by PGM, uses a Gaussian-model data term optimized via active contour/level sets as in [47]–[49]; the second is CPGM [6], which uses the Gaussian model but rather in a convex optimization framework. A sample of images from the multi-model dataset is shown in Fig. 4. We display two different noisy versions of a piecewise constant two-region image perturbed respectively with a Gamma (first row) and exponential noises (second row). The overlap between the intensity profiles within regions is relatively important in both images of Fig. 4(a). Fig. 4(b)–(d) depict the segmentation results obtained respectively by PGM, CPGM and CRKM. The segmentation boundaries are drawn in black for CRKM, in white for CPGM and in yellow for PGM. In the first set of experiments, all the images perturbed with a Gaussian noise were segmented using CRKM, PGM and CPGM. Fig. 5(a) displays the corresponding percentage of mis-classified pixels (PMP) as a function of the contrast level evaluated by the Bhattacharyya measure [3]. As the actual noise is Gaussian as assumed in both PGM and CPGM, the three methods yielded satisfactory results (PMP less than 5%). Although unsupervised (i.e., does not assume any knowledge of the noise model), the proposed method yielded the lowest PMP (PMP less than 1.6%) for very low contrast values and a similar performance as the two other methods for a large range of Bhattacharyya distance values. Second, we run the three algorithms on the set of images perturbed with the exponential noise and plotted the segmentation error as a function of the contrast in Fig. 5(b). Both PGM and CPGM produce accurate segmentations for high contrasts, then undergo high error gradient at some Bhattacharyya distance when the contrast becomes relatively low. When Bhattacharyya distance is superior to 0.52, all methods

Fig. 7. Short axis cardiac MRI and 3D segmentations of the endocardium of three distinct subjects. First two rows: two sample slices from each subject’s left ventricle where the region of interest is delineated by a yellow box. Last row: 3D segmentations obtained using the proposed method.

produce accurate segmentations (PMP < 4.5%). When the contrast is relatively low, however, the proposed method outperformed both Gaussian-based methods. In the third experiment depicted by Fig. 5(c), similar experiments were run over the portion of images perturbed with Gamma noise, and a similar behavior was noticed. These experiments demonstrate that the proposed method can deal with various types of image noises and contrasts, without prior assumptions/learning as to image models and model parameters. The advantages of the CRKM include also the ability to segment multi-model images, i.e, images whose regions require different models. This is the case with synthetic aperture radar (SAR) images where the intensity follows a Gamma distribution in a zone of constant reflectivity and a K distribution in a zone of textures reflectivity. Fig. 6(a) depicts a segmentation of an ERS-1 250×250 intensity 1-look SAR image of an agricultural scene into four regions with the proposed method. It is generally recognized that the presence of a strong multiplicative speckle noise is a major complicating factor in SAR segmentation [1], [4], [12]. Fig. 6(b) is a segmentation of a mono-look 361 × 151 SAR image. For both images in Fig. 6(a), (b), we display the original image with the segments enclosed by different colored contours. D. 3D Medical Data In this section, we provide further examples with 3D medical scans in order to demonstrate the flexibility and efficiency of the proposed method.

1150

IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 23, NO. 3, MARCH 2014

TABLE IV E XECUTION T IMES ( IN S ECONDS ) FOR G RAPH C UTS AND THE P ROPOSED M ETHOD (B OTH THE GPU AND CPU I MPLEMENTATIONS ) O BTAINED U SING THE S ET OF 20 F RAMES FOR EACH OF THE T HREE C ONSIDERED S UBJECTS

TABLE V C OMPUTATIONAL GPU/CPU T IMES OF THE P ROPOSED S OLUTION FOR THE

4-R EGION B RAIN S EGMENTATION E XAMPLE

Fig. 10. Examples of segmentations obtained by the CRKM algorithm on color images from the Berkeley database.

Fig. 8. Segmentations of brain 3D data (256 × 156 × 256 voxels): 4 segmentation volumes represented by the (dark) red surfaces, each enclosed by the brain exterior surface (light red). (a) to (d) represent the four brain segments.

Fig. 9. The error estimation versus the algorithmic iterations for the 3D brain data.

1) Cardiac Data: The first experiment is performed using a dataset of cardiac 3D + time MR images [50] and the purpose is to provide the 3D left ventricles’ endocardial segmentation. The MR images were acquired from 33 subjects, most of them display a variety of heart abnormalities. Each subject’s image sequence consists of 20 frames and between 8 to 15 slices along the long axis (each image slice consists originally of 256 × 256 pixels). We show in Fig. 7 a sample of the results obtained by applying the proposed technique on the

images corresponding to subjects number 2, 10 and 33. For each subject, the region of interest is initially delineated (the yellow boxes in the two first rows of Fig. 7) and used for all the corresponding frames and slices. In each of the considered cases, regions parameters were initialized randomly (mi , i = 1 . . . N, where N = 2 for this case). The last row of Fig. 7 shows the 3D volume of the segmented left ventricle endocardial region. We did not make any special efforts towards filtering the unwanted neighboring segments because the purpose is rather showing that the proposed method handles such complex data in a very short execution time. For this reason, we report in Table IV the running times corresponding to two implementations of the proposed method (with and without parallelization), as well as the running time required by graph cuts (first row). For these 3D examples, the parallelized version of the proposed method brought substantial speed-ups, and yielded real-time solutions (the third row in Table IV). 2) Brain Data: The second experiment is performed using a 3D brain data, and the purpose is to provide the 3D foursegment volumes which form the brain. The data consists of 256×156×256 voxels, a typical size which is very consuming in terms of time and memory for non-parallel implementations on standard personal computers. Fig. 8 depicts the four 3D-segmentation volumes, each represented by a dark red surface enclosed by the exterior brain segment (shown in light red). The set of parameters (mi , i = 1 . . . 4) is initialized with a simple k-means clustering of the brain volume intensities. Table V reports the GPU/CPU times corresponding to the brain example. The GPU (parallelized) implementation demonstrates that the proposed algorithm requires about 41.34 seconds for this typical brain volume, a speed-up of more than 4 times compared to the CPU (non-parallelized) implementation.

BEN SALAH et al.: CONVEX-RELAXED KERNEL MAPPING FOR IMAGE SEGMENTATION

Region parameters mi (i = 1 . . . 4) converged to m = [10−3 0.161 0.398 0.532]. Such parameters correspond the limit of the sequence we obtained in Eqs. (8)–(9), which can be interpreted as a kernel-weighted mean of intensities/colors within each of the segmentation regions. Fig. 9 plots the error estimation versus the algorithmic iterations, which illustrates the convergence behavior of the proposed method when applied to the brain data. IV. C ONCLUSION This study investigated a convex-relaxed kernel mapping formulation of image segmentation. The formulation can bear with a variety of image types without the need for complex and application-specific statistical modeling, while having the computational benefits of convex relaxation. Many experimental results were shown to support this by considering a multi-model dataset on which the proposed method reached the best performance in regard to accuracy. In addition, we demonstrated that the parallel structure of the method allows to outperform a recent graph-cut technique in regard to speed over 3D medical-imaging examples and high-resolution largesize color photographs. There are several directions of future work derived from this study. First, the kernel mapping technique could be benefited from in a better way by using more discriminative image-based features of higher dimension. Second, in order to further enhance the computational efficiency resulting from the use of parallel structure of the algorithm, joint optimization of the number of regions could be investigated. This would be more interesting when dealing with high dimensional medical data when the number of regions is not known but needs to be small. Finally, more application specific kernel functions could be investigated for better discrimination performance. A PPENDIX A Let gu , u ∈ {u i , i = 1 . . . N} be the fixed-point update function defined in (8). In this section, we prove the convergence of the following sequence:

u I K (I, t j )dx , j = 1, 2, . . . t j +1 = gu (t j ) =   u K (I, t j )dx where K is the RBF kernel. With h(x) = exp(−x), sequence {t j } j =1,2,... can be written as follows:  

I −t j 2 u I h   dx  σ   , j = 1, 2, . . . (A-1) t j +1 = I −t j 2  uh  σ  dx Define sequence { f σ ( j )} j =1,2,... as follows:    I −t j 2  dx, j = 1, 2, . . . uh  fσ ( j ) = σ  First, we demonstrate that sequence f σ converges, and con I −t sequently, is a Cauchy sequence. Because 0 < h  σ j 2 ≤ 1 and u ∈ {0, 1}, we have:   0 < fσ ( j ) ≤ udx ≤ dx (A-2) 



1151

Thus f σ is bounded by the area of the image domain and it suffices to show that f σ is strictly monotonic increasing. Now for t j +1 = t j , j = 1, 2, . . ., consider the following expression:   I − t j +1 2  f σ ( j + 1) − f σ ( j ) = u h   σ    I −t j 2 (A-3)  dx. −h  σ Using the fact that h is convex, then for all u, v ∈ R+ such that u = v, we have h(v) ≥ h(u) + h (u)(v − u).

(A-4)

As function h verifies −h (u) = h(u), then (A-4) becomes h(v) − h(u) ≥ h(u)(u − v).

(A-5)

Now, combining (A-3) and (A-5) gives f σ ( j + 1) − f σ ( j )    I −t 1 j 2  I − t j 2 − I − t j +12 dx uh  ≥ 2 σ  σ   I −t  1 j 2 = 2  2t Tj+1 I −t j +12 −2t Tj I +t j 2 dx uh  σ  σ     I −t 1 T j 2 2  dx − t j +1  uIh  uh = 2 2t j +1 σ σ     I −t  I −t   j 2 j 2  dx − 2t Tj  dx + t j 2 ×  uIh  σ σ     I −t j 2 (A-6)  dx × uh  σ  Using (A-1) in (A-6) yields, after some manipulations    I −t 1 j 2 f σ ( j + 1)− f σ ( j ) ≥ 2 t j +1 − t j 2 uh   dx. σ σ  (A-7) Because t j +1 = t j for j = 1, 2, . . ., the right term in (A-7) is strictly positive and, consequently, sequence fσ is strictly increasing. With inequalities (A-2), this concludes that f σ is convergent. Summing both sides in inequality (A-7) over j, j + 1, . . . , j + m − 1 gives f σ ( j + m) − fσ ( j )   I −t  1 j +m−1 2  dx ≥ 2 t j +m − t j +m−1 2 uh  σ σ    I −t 1 j 2 2  dx uh  + . . . + 2 t j +1 − t j  σ σ  1 ≥ 2 t j +m − t j +m−1 2 + . . . + t j +1 − t j 2 M σ M ≥ 2 t j +m − t j 2 , (A-8) σ  

I −t where M is the minimum of the integral  uh  σ j 2 dx with respect to {t j } j =1,2,... . Note that M is strictly positive. Sequence f σ is convergent and, consequently, is a Cauchy sequence. This result combined with (A-8) concludes that {t j } j =1,2,... is a Cauchy sequence. As a result, {t j } j =1,2,... converges in the Euclidean space.

1152

IEEE TRANSACTIONS ON IMAGE PROCESSING, VOL. 23, NO. 3, MARCH 2014

A PPENDIX B Let us introduce the multiplier functions u i (x), i = 1 . . . N, to the equality constraint in (16). This yields the following formulation equivalent to (16):  ps dx max min E( ps , p, q; u) = ps ,p,q u

+





u i (div qi − ps + pi ) dx s.t.

E( ps , p, q; u) =



+

{(1 −

N 

u i pi +

u i ) ps

i=1

N 

u i div qi }dx (B-2)

i=1

For the constrained optimization problem in (B-1), the conditions of the minmax theorem ([51] , Chapter 6, Proposition 2.4) are all verified: the constraints are convex and the energy function is linear in both functions ps , p, q and multiplier u (Therefore, E is convex l.s.c. for fixed u and concave u.s.c. for fixed ps , p and q. It follows that (i) there exists at least one saddle point [51]; and (ii) the min/max operators in (B-1) can be interchanged: 

  max min E( ps , p, q; u) = min max E( ps , p, q; u) u

u

ps ,p,q

(B-3) Now let us consider the following optimization problem f (v) = max v · w , w≤C

(B-4)

where v, w and C are scalars. When v < 0, w can be negative infinity in order to maximize the value v · w, i.e. f (v) = +∞. Therefore, we should have v ≥ 0 to obtain a meaningful function f (v). It can also be easily seen that for w ≤ C, ⎧ if v = 0 , then w ≤ C maximizes the function and ⎪ ⎪ ⎨ f (v) = 0, if v > 0 , then w = C maximizes the function and ⎪ ⎪ ⎩ f (v) = v · C. (B-5) In summary, we have f (v) = v · C , and v ≥ 0 .

(B-6)

In view of (B-4) and (B-6), it follows that the optimization of E( ps , p, q; u) over pi (x), i = 1, . . . , N, constrained by inequality pi (x) ≤ ρi (x, mi ), amounts to   sup u i (x) pi (x)dx = u i (x)ρi (x, mi )dx pi (x) ≤ ρi (x,mi ) 

ui = 0

(B-9)

Combining (B-7), (B-8), and (B-9) proves the equivalence in proposition 2. R EFERENCES

i=1

N 

N  i=1

Rearranging energy function E, we obtain 

(B-8)

Now notice that ps is not constrained and the optimization with respect to ps leads simply to: 1−

pi (x) ≤ ρi (x, mi ), ∀x ∈ , i = 1, . . . , N |qi (x)| ≤ λ, ∀x ∈ , i = 1, . . . , N (B-1)

ps ,p,q

|qi (x)|≤λ 



N   i=1

Also, it is well-known that [32] :   u i div qi d x = λ |∇u i | d x . max



u i (x) ≥ 0, i = 1, . . . , (B-7)

[1] R. Marques, F. Medeiros, and J. Nobre, “SAR image segmentation based  model,” IEEE Trans. Pattern Anal. Mach. on level set approach and GA Intell., vol. 34, no. 10, pp. 2046–2057, Oct. 2012. [2] P. Rahmati, A. Adler, and G. Hamarneh, “Mammography segmentation with maximum likelihood active contours,” Med. Image Anal., vol. 16, no. 6, pp. 1167–1186, 2012. [3] M. Ben Salah, A. Mitiche, and I. B. Ayed, “Multiregion image segmentation by parametric kernel graph cuts,” IEEE Trans. Image Process., vol. 20, no. 2, pp. 545–557, Feb. 2011. [4] M. B. Salah, A. Mitiche, and I. B. Ayed, “Effective level set image segmentation with a kernel induced data term,” IEEE Trans. Image Process., vol. 19, no. 1, pp. 220–232, Jan. 2010. [5] A. Mitiche and I. Ben Ayed, Variational and Level Set Methods in Image Segmentation, 1st ed. New York, NY, USA: Springer-Verlag, Sep. 2010. [6] T. Goldstein, X. Bresson, and S. Osher, “Geometric applications of the split Bregman method: Segmentation and surface reconstruction,” J. Sci. Comput., vol. 45, no. 1, pp. 272–293, 2009. [7] G. Slabaugh, G. Unal, M. Wels, T. Fang, and B. Rao, “Statistical region-based segmentation of ultrasound images,” Ultrasound Med. Biol., vol. 35, no. 5, pp. 781–795, 2009. [8] L. Vese and T. Chan, “A multiphase level set framework for image segmentation using the Mumford and Shah model,” Int. J. Comput. Vis., vol. 50, no. 3, pp. 271–293, 2002. [9] T. Chan and L. Vese, “Active contours without edges,” IEEE Trans. Image Process., vol. 10, no. 2, pp. 266–277, Feb. 2001. [10] S. C. Zhu and A. Yuille, “Region competition: Unifying snakes, region growing, and bayes/MDL for multiband image segmentation,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 118, no. 9, pp. 884–900, Sep. 1996. [11] R. Archibald, J. Hu, A. Gelb, and G. E. Farin, “Improving the accuracy of volumetric segmentation using pre-processing boundary detection and image reconstruction,” IEEE Trans. Image Process., vol. 13, no. 4, pp. 459–466, Apr. 2004. [12] I. B. Ayed, A. Mitiche, and Z. Belhadj, “Multiregion level set partitioning on synthetic aperture radar images,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 27, no. 5, pp. 793–800, May 2005. [13] M. Di Bisceglie and C. Galdi, “CFAR detection of extended objects in high-resolution SAR images,” IEEE Trans. Geosci. Remote Sens., vol. 43, no. 4, pp. 833–843, Apr. 2005. [14] M. Mignotte, C. Collet, P. Perez, and P. Bouthemy, “Sonar image segmentation using an unsupervised hierarchical MRF model,” IEEE Trans. Image Process., vol. 9, no. 7, pp. 1216–1231, Jul. 2000. [15] S. C. Zhu, “Statistical modeling and conceptualization of visual patterns,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 25, no. 6, pp. 691–712, Jun. 2003. [16] Y. Boykov, O. Veksler, and R. Zabih, “Fast approximate energy minimization via graph cuts,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 23, no. 11, pp. 1222–1239, Nov. 2001. [17] A. Jain, P. Duin, and J. Mao, “Statistical pattern recognition: A review,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 22, no. 1, pp. 4–37, Jan. 2000. [18] I. S. Dhillon, Y. Guan, and B. Kulis, “Weighted graph cuts without eigenvectors: A multilevel approach,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 29, no. 11, pp. 1944–1957, Nov. 2007. [19] B. Schölkopf, A. Smola, and K.-R. Müller, “Nonlinear component analysis as a kernel eigenvalue problem,” Neural Comput., vol. 10, no. 5, pp. 1299–1319, 1998.

BEN SALAH et al.: CONVEX-RELAXED KERNEL MAPPING FOR IMAGE SEGMENTATION

[20] J. Yuan, E. Bae, and X.-C. Tai, “A study on continuous max-flow and min-cut approaches,” in Proc. CVPR, 2010, pp. 2217–2224. [21] Y. Boykov and V. Kolmogorov, “An experimental comparison of mincut/max-flow algorithms for energy minimization in vision,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 26, no. 9, pp. 1124–1137, Sep. 2004. [22] M. Klodt, T. Schoenemann, K. Kolev, M. Schikora, and D. Cremers, “An experimental comparison of discrete and continuous shape optimization methods,” in Proc. ECCV, vol. 1. 2008, pp. 332–345. [23] T. Pock, D. C. A. Chambolle, and H. Bischof, “A convex relaxation approach for computing minimal partitions,” in Proc. CVPR, Jun. 2009, pp. 810–817. [24] V. Kolmogorov and Y. Boykov, “What metrics can be approximated by geo-cuts, or global optimization of length/area and flux,” in Proc. 10th IEEE ICCV, Oct. 2005, pp. 564–571. [25] N. Guan, D. Tao, Z. Luo, and B. Yuan, “Non-negative patch alignment framework,” IEEE Trans. Neural Netw., vol. 22, no. 8, pp. 1218–1230, Aug. 2011. [26] N. Guan, D. Tao, Z. Luo, and J. Shawe-Taylor, “MahNMF: Manhattan non-negative matrix factorization,” CoRR abs/1207.3438, 2012. [27] M. Ben Salah, I. B. Ayed, J. Yuan, W. Wang, and H. Zhang, “Convex relaxation for image segmentation by kernel mapping,” in Proc. IEEE Int. Conf. Image Process., Oct. 2012, pp. 289–292. [28] M. Mignotte, “A label field fusion Bayesian model and its penalized maximum rand estimator for image segmentation,” IEEE Trans. Image Process., vol. 19, no. 6, pp. 1610–1624, Jun. 2010. [29] J. Shi and J. Malik, “Normalized cuts and image segmentation,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 22, no. 8, pp. 888–905, Aug. 2000. [30] D. Comaniciu and P. Meer, “Mean shift: A robust approach toward feature space analysis,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 24, no. 5, pp. 603–619, May 2002. [31] K.-R. Müller, S. Mika, G. Ratsch, S. Tsuda, and B. Schölkopf, “An introduction to kernel-based learning algorithms,” IEEE Trans. Neural Netw., vol. 12, no. 2, pp. 181–202, Mar. 2001. [32] E. Giusti, Minimal Surfaces and Functions of Bounded Variation. Canberra, Australia: Notes on Pure Math., 1977. [33] P. J. Huber, Robust Statistics. New York, NY, USA: Wiley, 1981. [34] X. Bresson, S. Esedoglu, P. Vandergheynst, J. Thiran, and S. Osher, “Fast global minimization of the active contour/snake model,” J. Math. Imag. Vis., vol. 28, no. 2, pp. 151–167, 2007. [35] M. Nikolova, S. Esedoglu, and M. Nikolova, “Algorithms for finding global minimizers of image segmentation and denoising models,” SIAM J. Appl. Math., vol. 66, no. 5, pp. 1632–1648, 2006. [36] J. Yuan, E. Bae, X. Tai, and Y. Boykov, “A continuous max-flow approach to potts model,” in Proc. ECCV, 2010, pp. 379–392. [37] D. P. Bertsekas, Nonlinear Programming. Belmont, MA, USA: Athena Scientific, 1999. [38] R. T. Rockafellar and R. J.-B. Wets, Variational Analysis (Fundamental Principles of Mathematical Sciences), vol. 317. Berlin, Germany: Springer-Verlag, 1998. [39] R. T. Rockafellar, “The multiplier method of hestenes and powell applied to convex programming,” J. Optimizat. Theory Appl., vol. 12, no. 6, pp. 555–562, 1973. [40] R. T. Rockafellar, “Augmented Lagrangians and applications of the proximal point algorithm in convex programming,” Math. Oper. Res., vol. 1, no. 2, pp. 97–116, 1976. [41] D. P. Bertsekas, Constrained Optimization and Lagrange Multiplier Methods (Computer Science and Applied Mathematics). New York, NY, USA: Academic, 1982. [42] A. Chambolle, “An algorithm for total variation minimization and applications,” J. Math. Imag. Vis., vol. 20, nos. 1–2, pp. 89–97, 2004. [43] P. Martin, P. Refregier, F. Goudail, and F. Guerault, “Influence of the noise model on level set active contour segmentation,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 26, no. 6, pp. 799–803, Jun. 2004. [44] A. Y. Yang, J. Wright, Y. Ma, and S. S. Sastry, “Unsupervised segmentation of natural images via lossy data compression,” Comput. Vis. Image Understand., vol. 110, no. 2, pp. 212–225, 2008. [45] R. Unnikrishnan, C. Pantofaru, and M. Hebert, “A measure for objective evaluation of image segmentation algorithms,” in Proc. IEEE Int. Conf. Comput. Vis. Pattern Recognit., vol. 3. Jun. 2005, pp. 34–41. [46] (2003). Berkeley Segmentation and Boundary Detection Benchmark and Dataset [Online]. Available: http://www.cs.berkeley.edu/projects/vision/grouping/segbench [47] C. Samson, L. Blanc-Feraud, G. Aubert, and J. Zerubia, “A level set model for image classification,” Int. J. Comput. Vis., vol. 40, no. 3, pp. 187–197, 2000.

1153

[48] C. Lenglet, M. Rousson, and R. Deriche, “DTI segmentation by statistical surface evolution,” IEEE Trans. Med. Imag., vol. 25, no. 6, pp. 685–700, Jun. 2006. [49] Y. Dong, B. Forster, and A. Milne, “Comparison of radar image segmentation by Gaussian- and gamma-Markov random field models,” Int. J. Remote Sens., vol. 24, no. 4, pp. 711–722, 2003. [50] A. Andreopoulos and J. K. Tsotsos, “Efficient and generalizable statistical models of shape and appearance for analysis of cardiac MRI,” Med. Image Anal., vol. 12, no. 3, pp. 335–357, 2008. [51] I. Ekeland and R. Téman, Convex Analysis and Variational Problems. Philadelphia, PA, USA: SIAM, 1999.

Mohamed Ben Salah received the State Engineering degree from the Polytechnic School of Tunisia in 2003, and the M.Sc. and Ph.D. degrees in computer science from the National Institute of Scientific Research, University of Quebec, Montreal, QC, Canada, in 2006 and 2011, respectively. Since 2011, he has been a Post-Doctoral Fellow with the Department of Computing Science, University of Alberta. His research interests include image and motion segmentation with focus on level sets and graph cuts and have applications in mining problems and medical image analysis. He received the Doctoral Scholarship from the Government of Tunisia, the FQRNT Doctoral Scholarship and the FQRNT Post-Doctoral Fellowship from the Government of Quebec.

Ismail Ben Ayed received the Ph.D. (Hons.) degree in computer vision from the National Institute of Scientific Research, University of Quebec, Montreal, QC, Canada, in 2007. Since then, he has been a Research Scientist with GE Healthcare, London, ON, Canada. He is an Adjunct Professor with the University of Western Ontario. His research interests are in computer vision, optimization, machine learning, and their applications in medical image analysis. He has co-authored a book on image segmentation, over 40 papers in reputable journals and conferences, and six patent applications. He received the GE Innovation Award and two NSERC Fellowships. He serves regularly as a reviewer for the major vision and medical imaging venues.

Jing Yuan is currently a Research Scientist with the Robarts Research Institute, University of Western Ontario, Canada. He is an Adjunct Assistant Professor with the Department of Medical Biophysics, Schulich Medical School, University of Western Ontario. His main research interests include modern optimization methods and developing efficient algorithms for computer vision and medical image processing.

Hong Zhang received the B.Sc. degree from Northeastern University, Boston, USA, in 1982, and the Ph.D. degree from Purdue University, West Lafayette, IN, USA, in 1986, both in electrical engineering. He conducted post-doctoral research at the University of Pennsylvania from 1986 to 1987 before joining the Department of Computing Science, University of Alberta, Canada, where he is currently a Professor and Director of the Centre for Intelligent Mining Systems. He is an NSERC Industrial Research Chair. His current research interests include robotics, computer vision, and image processing.

Convex-relaxed kernel mapping for image segmentation.

This paper investigates a convex-relaxed kernel mapping formulation of image segmentation. We optimize, under some partition constraints, a functional...
6MB Sizes 1 Downloads 3 Views