Interactive Image Segmentation Framework Based On Control Theory

Liangjia Zhu (a), Ivan Kolesov (a), Vadim Ratner (a), Peter Karasev (b), and Allen Tannenbaum (a)

(a) Departments of Computer Science and Applied Mathematics/Statistics, Stony Brook University, Stony Brook, New York, USA; (b) Atlanta Agilent Technologies, Atlanta, Georgia, USA

ABSTRACT

Segmentation of anatomical structures in medical imagery is a key step in a variety of clinical applications. Designing a generic, automated method that works for various structures and imaging modalities is a daunting task. Instead of proposing a new specific segmentation algorithm, in this paper we present a general design principle on how to integrate user interactions from the perspective of control theory. In this formulation, Lyapunov stability analysis is employed to design an interactive segmentation system. The effectiveness and robustness of the proposed method are demonstrated.

Keywords: Interactive Image Segmentation, Control Theory, Lyapunov Stability, Active Contours

1. INTRODUCTION

Image segmentation has been an active research field over the past several decades and remains a very challenging task [1]. It is frustrating that, in certain scenarios, human users can recognize and extract target objects instantly, while computers still struggle to produce satisfactory results automatically. Effectively integrating experts' prior knowledge into a segmentation design has become a basic principle underlying numerous state-of-the-art segmentation methods [2]. However, to the best of our knowledge, there have been very few attempts to model the interactive segmentation process as a closed-loop system [3, 4]. In our previous work [5], we formulated interactive image segmentation as a feedback control framework based on single-object region-based active contour models. In this work, we generalize that formulation so that it seamlessly handles both region- and distance-based criteria for multi-object image segmentation.

2. METHODS

An overview of the proposed framework is shown in Figure 1. It can be regarded as a dual control system: at the top level, a user adaptively applies inputs to guide the system towards an expected segmentation; at the lower level, the dynamical system reacts accordingly via a standard feedback control loop driven by an estimate of the segmentation error.

A large class of segmentation algorithms can be considered as evolutionary dynamical processes. Starting from some given regions, these algorithms evolve the regions based on certain quantifiable criteria. Examples include region growing/competition, classical active contour models, and distance-based segmentation. Typically, the evolutionary process can be described by a dynamical system driven by the optimization of certain energy function(al)s. As an example, the level set formulation of active contours is employed here.

Let $I : \Omega \to \mathbb{R}^m$ be an image function defined on $\Omega \subset \mathbb{R}^n$, where $m \geq 2$ and $n \geq 2$. Suppose an image to be segmented consists of $N$ regions. At any given time $t \in \mathbb{R}^+$, each region $\Omega_i(x,t)$ is associated with a level set function $\phi_i(x,t)$ and evolves from an initial state $\phi_i(x,0) = \phi_i^0(x)$. The Heaviside function, denoted by $H(\phi(x))$, is used to indicate the exterior and interior regions, and its derivative is denoted by $\delta(\phi(x))$.

Further author information: E-mails: {liangjia.zhu, ivan.kolesov, vadim.ratner, allen.tannenbaum}@stonybrook.edu

Medical Imaging 2015: Image Processing, edited by Sébastien Ourselin, Martin A. Styner, Proc. of SPIE Vol. 9413, 941343 · © 2015 SPIE · CCC code: 1605-7422/15/$18 · doi: 10.1117/12.2082359

Proc. of SPIE Vol. 9413 941343-1 Downloaded From: http://proceedings.spiedigitallibrary.org/ on 08/15/2015 Terms of Use: http://spiedigitallibrary.org/ss/TermsOfUse.aspx

[Figure 1: block diagram. Blocks: Expert Knowledge (ideal $\phi^*(x)$); Visualization ($I(x)$, $H(\phi(x,t))$); User Control (user input $\{t_i^k, x_i^k\}$, signal $u(x,t)$); Input Processing ($u_i^k(x,t)$); Estimator ($\hat{\xi}_i(x,t) = S(U_i(x,t))$); Dynamical System Segmentation ($\partial \phi_i / \partial t = G_i(x,t) + F(\phi_i, \hat{\phi}_i^*)$, output $\phi(x,t)$).]

Figure 1. Diagram of the control-based segmentation framework. The feedback compensates for deficiencies in automatic segmentation by utilizing the expert's knowledge.

Suppose the user has an ideal segmentation of the image in mind: $\{\phi_i^*(x)\}$, $i = 1, \cdots, N$. Then, the goal is to design a feedback control system

\[
\begin{cases}
\dfrac{\partial \phi_i(x,t)}{\partial t} = \left[ G_i(x,t) + F(\phi_i(x,t), \phi_i^*(x)) \right] \delta(\phi_i(x,t)), \\[4pt]
\phi_i(x,0) = \phi_i^0(x),
\end{cases}
\tag{1}
\]

such that $\lim_{t \to \infty} \phi_i(x,t) = \phi_i^*(x)$ for $i = 1, \cdots, N$, where $G_i(x,t)$ is the image "force" that determines the evolution of region $\Omega_i(x,t)$ and $F(\phi_i(x,t), \phi_i^*(x))$ is the control signal to be determined. We decompose each $G_i(x,t)$ into two competing components as

\[
G_i(x,t) = -\left( g_i(x,t) - g_i^c(x,t) \right),
\tag{2}
\]

where $g_i : \mathbb{R}^m \times \mathbb{R}^+ \to \mathbb{R}$ represents the evolution force from $\Omega_i(x,t)$ and $g_i^c : \mathbb{R}^m \times \mathbb{R}^+ \to \mathbb{R}$ is that from all other regions $\Omega_j(x,t)$, $j \neq i$. This decomposition has been used to model multiple active contours in different region-based algorithms [6–8]. On the other hand, since distance information from given points in an image can be computed using the level set formulation [9], segmentation algorithms that cluster pixels according to the minimal distance to given seeds naturally fit into this formulation. That is, the presented framework works for: 1) region-based active contour models and 2) distance-based clustering. Examples of the image force $G(x,t)$ are given in the following sections.

2.1 Region-based Active Contour Models

The function $g_i(x,t)$ may be defined as the statistics for the region $\Omega_i(x,t)$. A simple example is the first order statistics defined as

\[
g_i(x,t) := \left[ I(x) - \mu_i(t) \right]^2 \delta(\phi_i(x,t)),
\tag{3}
\]

where $I(x)$ is the image intensity at point $x$ and $\mu_i(t)$ is the mean intensity of region $\Omega_i$ at time $t$ [10, 11]; and

\[
g_i^c(x_k,t) := \min_{j \neq i} g_j(x_k,t) \quad \text{for each } x_k \in \Omega_i.
\tag{4}
\]

Other region-based energies [8, 12, 13] can be used in a similar way.
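As a rough illustration of equations (2) to (4), the following Python sketch computes the first-order-statistics forces for a labeled image. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `region_forces` is hypothetical, regions are given as an integer label map rather than as level set functions, the $\delta(\phi_i)$ factor of equation (3) is omitted, and $g_i^c$ is evaluated at every pixel rather than only for $x_k \in \Omega_i$.

```python
import numpy as np

def region_forces(image, labels, n_regions):
    """Return (g, g_c, G), each of shape (n_regions, *image.shape).

    g[i]   = (I(x) - mu_i)^2, the first-order statistic of Eq. (3)
             (the delta(phi_i) factor is left out in this sketch).
    g_c[i] = min over j != i of g[j], as in Eq. (4).
    G[i]   = -(g[i] - g_c[i]), the decomposition of Eq. (2).
    Assumes n_regions >= 2 and that every region is non-empty.
    """
    image = np.asarray(image, dtype=float)
    # Mean intensity mu_i of each region under the current labeling.
    means = np.array([image[labels == i].mean() for i in range(n_regions)])
    # Broadcast the means against the image to get one map per region.
    g = (image[None, ...] - means.reshape(-1, *([1] * image.ndim))) ** 2
    g_c = np.empty_like(g)
    for i in range(n_regions):
        # Competing force: the best-fitting region among all others.
        g_c[i] = np.delete(g, i, axis=0).min(axis=0)
    G = -(g - g_c)
    return g, g_c, G

# Toy example: the pixel with intensity 0.9 is (wrongly) labeled as region 0.
image = np.array([[0.0, 0.1, 0.9, 1.0]])
labels = np.array([[0, 0, 0, 1]])
g, g_c, G = region_forces(image, labels, 2)
print(G[:, 0, 2])  # force on the mislabeled pixel, one entry per region
```

In this toy case the force for region 1 is positive at the mislabeled pixel and the force for region 0 is negative, so the two regions compete for that pixel in opposite directions; how the sign maps to contour motion depends on the level set sign convention, which is an assumption here.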

2.2 Distance-based Clustering

Let $\Omega_i^x$ be the set of seed points for a region $\Omega_i$. The distance between a point $x \in \Omega$ and the seed region is defined as

\[
d(x, \Omega_i^x) = \min_{y \in \Omega_i^x} \; \min_{C \in \theta(x,y)} \int_0^1 g_\gamma(C(p)) \, \|C'(p)\| \, dp,
\tag{5}
\]

where $\theta(x,y)$ is the family of all paths connecting points $x$ and $y$, and $p \in [0,1]$ is the parametrization of a specific path $C : [0,1] \to \mathbb{R}^m$ weighted by an image-based function $g_\gamma : \mathbb{R}^m \to \mathbb{R}^+$. The distance $d(x, \Omega_i^x)$ may also be computed using the level set formulation by interpreting it as a front propagation problem with an image-dependent speed function $1/g_\gamma$, where $g_\gamma(I) = 1 + \|\nabla I\|_2^2$ [9]. After computing the distance from a point to each seed region, the point is assigned to the closest region.

With a slight abuse of notation, let $\phi_i^{-1}(x,t)$, $\phi_{i^c}^{-1}(x,t)$, and $\phi_{\min}^{-1}(x,t)$ be the distance between point $x$ and region $\Omega_i$, the shortest distance between the point $x$ and any region other than $\Omega_i(x,t)$, and the shortest distance between the point and all regions, respectively. An example of the evolution force acting on $\phi_i$ is defined as

\[
g_i(x,t) :=
\begin{cases}
g_\gamma(I) & \text{if } \phi_i^{-1}(x,t) \neq \phi_{\min}^{-1}(x,t), \\
0 & \text{otherwise},
\end{cases}
\tag{6}
\]

and

\[
g_i^c(x,t) :=
\begin{cases}
g_\gamma(I) & \text{if } \phi_{i^c}^{-1}(x,t) \neq \phi_{\min}^{-1}(x,t), \\
0 & \text{otherwise}.
\end{cases}
\tag{7}
\]

This formulation is essentially a clustering process based on the shortest distance from a point to all regions.
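The clustering rule can be illustrated with a small Python sketch. This is not the front-propagation solver referred to in the text: as a stated substitution, it approximates the weighted distance of equation (5) with Dijkstra's algorithm on a 4-connected pixel grid, using $g_\gamma(I) = 1 + \|\nabla I\|_2^2$ as the per-pixel cost. The function names and the toy image are hypothetical.

```python
import heapq
import numpy as np

def geodesic_distance(cost, seeds):
    """Dijkstra distance on a 4-connected 2D grid from a set of seed pixels.

    The weight of a step between neighbours is the average of their costs,
    a discrete stand-in for the path integral in Eq. (5).
    """
    dist = np.full(cost.shape, np.inf)
    heap = []
    for s in seeds:
        dist[s] = 0.0
        heapq.heappush(heap, (0.0, s))
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if d > dist[r, c]:
            continue  # stale heap entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            rr, cc = r + dr, c + dc
            if 0 <= rr < cost.shape[0] and 0 <= cc < cost.shape[1]:
                nd = d + 0.5 * (cost[r, c] + cost[rr, cc])
                if nd < dist[rr, cc]:
                    dist[rr, cc] = nd
                    heapq.heappush(heap, (nd, (rr, cc)))
    return dist

def cluster_by_distance(image, seed_sets):
    """Assign each pixel to the seed set with the smallest weighted distance."""
    gy, gx = np.gradient(np.asarray(image, dtype=float))
    cost = 1.0 + gx**2 + gy**2  # g_gamma(I) = 1 + ||grad I||^2
    dists = np.stack([geodesic_distance(cost, s) for s in seed_sets])
    return dists.argmin(axis=0), dists

# Toy image: dark left half, bright right half, one seed pixel in each half.
image = np.zeros((5, 8))
image[:, 4:] = 10.0
labels, dists = cluster_by_distance(image, [[(2, 0)], [(2, 7)]])
print(labels)
```

Because crossing the intensity edge is expensive under $g_\gamma$, each half of the toy image is assigned to the seed on its own side. A fast marching solver would give a more isotropic approximation of the same distance.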

2.3 Existence of Control Law

Given the ideal segmentation, we define the point-wise error for each region at time $t$ as

\[
\xi_i(x,t) := \varepsilon(\phi_i(x,t), \phi_i^*(x)),
\tag{8}
\]

where the function $\varepsilon$ measures the point-wise difference between $\phi_i(x,t)$ and $\phi_i^*(x)$. A simple choice of $\varepsilon$ is $H(\phi_i(x,t)) - H(\phi_i^*(x))$. Following the derivation in our previous work [5], we have the following theorem for the dynamical system defined in equation (1):

Theorem 1. The control law

\[
F(\phi_i(x,t), \phi_i^*(x)) = -\alpha_i^2(x,t) \, \xi_i(x,t),
\tag{9}
\]

where $\alpha_i^2(x,t) \geq g_M(x)$, asymptotically stabilizes the system (1) from $\{\phi_i(x,t)\}$ to $\{\phi_i^*(x)\}$, $i = 1, \cdots, N$. Furthermore, the control law exponentially stabilizes the system with a convergence rate of $e^{-\nu t}$ when $\xi$ is large in the sense of

\[
\rho \int_\Omega \delta^2(\phi_i(x,t)) \, \xi_i^2(x,t) \, dx \leq \int_\Omega \xi_i^2(x,t) \, dx, \quad i = 1, \cdots, N,
\tag{10}
\]

for given constants $\nu > 0$, $\rho > 0$. Here $g_M(x)$ is a bound on the image-dependent term $G_i(x,t)$.
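To make the control law concrete, the following Python sketch integrates equation (1) with the control signal of equation (9) for a single region on a 1D domain. Everything beyond those two equations is an assumption made for illustration: the arctan-based smoothed Heaviside and its derivative, the width `EPS`, a constant shrinking image force $G = -g_M$, the gain $\alpha^2 = 10\, g_M$, and explicit Euler time stepping. It is a toy experiment and not the system evaluated in this paper.

```python
import numpy as np

EPS = 0.05  # smoothing width for H and delta (assumption)

def heaviside(phi):
    """Smoothed Heaviside function (one common arctan-based choice)."""
    return 0.5 + np.arctan(phi / EPS) / np.pi

def delta(phi):
    """Derivative of the smoothed Heaviside function above."""
    return (EPS / np.pi) / (EPS**2 + phi**2)

def simulate(phi0, phi_star, G, alpha2, dt=5e-4, steps=4000):
    """Explicit Euler integration of Eq. (1) with the control law of Eq. (9)."""
    phi = phi0.copy()
    for _ in range(steps):
        xi = heaviside(phi) - heaviside(phi_star)  # point-wise error, Eq. (8)
        F = -alpha2 * xi                           # control signal, Eq. (9)
        phi = phi + dt * (G + F) * delta(phi)      # Eq. (1)
    return phi

x = np.linspace(-1.0, 1.0, 201)
phi_star = 0.5 - np.abs(x)   # ideal region: |x| < 0.5 (hypothetical)
phi0 = 0.2 - np.abs(x)       # initial region: |x| < 0.2 (hypothetical)
g_M = 1.0
G = -g_M * np.ones_like(x)   # image force that keeps shrinking the region
phi_T = simulate(phi0, phi_star, G, alpha2=10.0 * g_M)

def mislabeled(phi):
    """Number of grid points whose binary label differs from the ideal one."""
    return int(np.count_nonzero((phi > 0) != (phi_star > 0)))

def lyapunov(phi):
    """Discrete version of V = 1/2 * integral of xi^2."""
    xi = heaviside(phi) - heaviside(phi_star)
    return 0.5 * np.sum(xi**2) * (x[1] - x[0])

mis_init, mis_final = mislabeled(phi0), mislabeled(phi_T)
v_init, v_final = lyapunov(phi0), lyapunov(phi_T)
print(mis_init, mis_final, v_init, v_final)
```

In this setup the feedback term overrides the opposing image force wherever the labels disagree, so the region grows towards the ideal one even though $G$ pushes the other way. With a smoothed Heaviside and a nonzero $G$, a small residual error near the ideal boundary may remain; the sketch is meant to show the mechanism, not to reproduce the guarantees of Theorem 1.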

3. RESULTS

To demonstrate the effectiveness of the proposed method, the localized region-based active contour energy [12] was implemented for the region-based active contour model, and a gradient-based distance measure [9] was used for the distance-based clustering. Note that, in practice, the ideal segmentation $\phi^*(x)$ may not be completely accessible on the fly. An estimate of the segmentation error, $\hat{\xi}(x,t)$, was therefore used in place of $\xi(x,t)$ (see our previous work [5] for details).

Two orthopedic images were used to quantitatively compare the presented method with the popular GrabCut algorithm [14]. The structures being segmented, the epiphysis and physis, are shown in Figure 2. User input via mouse click-and-drag was implemented and measured identically for each algorithm. A location through which the cursor was dragged is defined as an "actuated voxel"; the extent around the cursor that marks seed regions in GrabCut is not counted towards this total. Locations in the image whose assigned label changes between background and foreground are tracked over time and are referred to as "reclassified" voxels.

Figure 2. Two test images are used in a quantitative comparison of GrabCut and the proposed algorithm. Manual segmentations are marked in yellow for the epiphysis (second) and physis (the last).

The total number of actuated voxels needed to complete the segmentation is presented in Figure 3. It shows that both the region- and distance-based interactive segmentation methods require less user input than GrabCut in segmenting these two structures. Segmenting the physis is more difficult with GrabCut due to the elongated shape, the nearly identical-looking fluid around the bone, and the bimodal appearance of cortical bone above and spongy bone below the physis. A GrabCut iteration can change the segmentation dramatically; when this change is erroneous, significant corrective effort is required. In Figure 3, this is manifested by the large increases in actuated voxels during the first few rounds of GrabCut user input. In contrast, the proposed algorithm provides rapid, continuous visual feedback for the user; small corrections are made before a large error can develop.
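The two interaction measures used in this comparison can be sketched in a few lines of Python. The function names are hypothetical, and counting each dragged-through location only once is an assumption; the text does not say how repeated visits to the same location are handled.

```python
import numpy as np

def count_actuated(stroke_path):
    """Number of distinct locations the cursor was dragged through.

    stroke_path is a sequence of integer pixel/voxel coordinates.
    Counting each location once is an assumption of this sketch.
    """
    return len(set(map(tuple, stroke_path)))

def count_reclassified(prev_labels, curr_labels):
    """Number of locations whose foreground/background label changed."""
    return int(np.count_nonzero(np.asarray(prev_labels) != np.asarray(curr_labels)))

# Toy example: a three-point stroke that revisits one location,
# and a segmentation update that flips two labels.
stroke = [(1, 1), (1, 2), (1, 2), (1, 3)]
before = np.array([[0, 0, 0, 0],
                   [0, 1, 0, 0]])
after = np.array([[0, 0, 0, 0],
                  [0, 1, 1, 1]])
n_actuated = count_actuated(stroke)
n_reclassified = count_reclassified(before, after)
print(n_actuated, n_reclassified)  # prints "3 2"
```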

[Figure 3: two line plots, panels (a) and (b). x-axis: time (scaled), 0 to 1; y-axis: # voxels, 0 to 800; series: GrabCut, Region-based, Distance-based.]

Figure 3. Comparison of actuated voxels over time after initialization for (a) epiphysis (b) physis. The proposed algorithm has both a lower mean actuated count and tighter clustering across repeated segmentations.

Predictability of how the segmentation changes in response to mouse strokes is a criterion for practical ease of use. Two scatterplots quantify the predictability in Figure 4; the dynamic response is characterized in terms of the number of reclassified voxels (Y-axis) and the number of newly actuated voxels (X-axis). Each mark corresponds to one iteration when new user input was applied. Linear regression lines are overlaid on the data. All algorithms have a similar dynamic response in the epiphysis segmentation in Figure 4(a). Two issues become apparent for the juvenile physis segmentation in Figure 4(b). First, the distribution of GrabCut data points is quite broad; second, some of the GrabCut data points are below the dashed pink line, indicating a waste of user effort since more voxels are actuated than reclassified. The dynamic response of GrabCut makes it hard for a user to predict how much change new mouse strokes will cause.
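A rough Python sketch of this kind of analysis is given below. It assumes the regression is fitted in log-log space (the axes of Figure 4 are logarithmic) and that the dashed line is the identity line, which is what the description of "more voxels actuated than reclassified" suggests; neither detail is stated explicitly. The function name and the numbers are made up purely for illustration and are not data from the experiments.

```python
import numpy as np

def response_summary(actuated, reclassified):
    """Summarize dynamic response from per-iteration counts.

    Returns the slope and intercept of a least-squares line fitted to
    log10(reclassified) versus log10(actuated), the spread of the
    residuals around that line, and the fraction of iterations in which
    fewer voxels were reclassified than actuated ("wasted" effort).
    """
    a = np.asarray(actuated, dtype=float)
    r = np.asarray(reclassified, dtype=float)
    log_a, log_r = np.log10(a), np.log10(r)
    slope, intercept = np.polyfit(log_a, log_r, 1)
    spread = float(np.std(log_r - (slope * log_a + intercept)))
    wasted = float(np.mean(r < a))
    return slope, intercept, spread, wasted

# Illustrative (made-up) counts: a predictable response...
slope, intercept, spread, wasted = response_summary(
    [10, 20, 40, 80], [100, 200, 400, 800])
# ...and an erratic one, where two of the four strokes are "wasted".
_, _, spread_erratic, wasted_erratic = response_summary(
    [10, 20, 40, 80], [5, 200, 30, 800])
print(slope, intercept, spread, wasted, wasted_erratic)
```

A small residual spread corresponds to a predictable response, while a broad spread or a nonzero wasted fraction corresponds to the two issues described above for GrabCut.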

4. CONCLUSIONS

This paper has presented a systematic way of applying control theory to design an interactive medical image segmentation system. Preliminary results show the effectiveness and robustness of the proposed method. Though the examples used in this paper are based on the level set formulation, the design principle generalizes to other interactive segmentation systems that can be described by dynamical systems. It is extensible to discrete systems as well.

[Figure 4: two log-log scatterplots, panels (a) and (b). x-axis: change in actuated voxels; y-axis: change in binary labels; series: GrabCut, Region-based, Distance-based.]

Figure 4. Comparison of dynamic response to user input; data points and linear fit lines for (a) epiphysis (b) physis. Points below the dashed pink line indicate wasted user effort since more additional voxels were actuated than reclassified.

Acknowledgments

This project was supported in part by grants from the National Center for Research Resources (P41-RR013218) and the National Institute of Biomedical Imaging and Bioengineering (P41-EB-015902) of the National Institutes of Health. This work was also supported by NIH grants R01 MH82918 and 1U24CA18092401A1 as well as AFOSR grants FA9550-09-1-0172 and FA9550-15-1-0045.

REFERENCES

[1] Suri, J., "Computer vision, pattern recognition and image processing in left ventricle segmentation: The last 50 years," Pattern Analysis and Applications 3(3), 209–242 (2000).
[2] Peng, B., Zhang, L., and Zhang, D., "A survey of graph theoretical approaches to image segmentation," Pattern Recognition 46(3), 1020–1038 (2013).
[3] Cremers, D., Fluck, O., Rousson, M., and Aharon, S., "A probabilistic level set formulation for interactive organ segmentation," in [Proceedings of the SPIE Medical Imaging], (Feb. 2007).
[4] Ben-Zadok, N., Riklin-Raviv, T., and Kiryati, N., "Interactive level set segmentation for image-guided therapy," in [IEEE International Symposium on Biomedical Imaging: From Nano to Macro (ISBI)], 1079–1082, IEEE (2009).
[5] Karasev, P., Kolesov, I., Fritscher, K. D., Vela, P. A., Mitchell, P., and Tannenbaum, A., "Interactive medical image segmentation using PDE control of active contours," IEEE Transactions on Medical Imaging 32(11), 2127–2139 (2013).
[6] Vázquez, C., Mitiche, A., and Laganière, R., "Joint multiregion segmentation and parametric estimation of image motion by basis function representation and level set evolution," IEEE Transactions on Pattern Analysis and Machine Intelligence 28(5), 782–793 (2006).
[7] Vázquez, C., Mitiche, A., and Ayed, I. B., "Image segmentation as regularized clustering: a fully global curve evolution method," in [International Conference on Image Processing], 3467–3470 (2004).
[8] Gao, Y., Kikinis, R., Bouix, S., Shenton, M. E., and Tannenbaum, A., "A 3D interactive multi-object segmentation tool using local robust statistics driven active contours," Medical Image Analysis 16(6), 1216–1227 (2012).
[9] Cohen, L. D. and Kimmel, R., "Global minimum for active contour models: A minimal path approach," International Journal of Computer Vision 24, 57–78 (Aug. 1997).
[10] Chan, T. F. and Vese, L. A., "Active contours without edges," IEEE Transactions on Image Processing 10, 266–277 (Feb. 2001).
[11] Yezzi, A., Jr., Tsai, A., and Willsky, A., "A fully global approach to image segmentation via coupled curve evolution equations," Journal of Visual Communication and Image Representation 13(1-2), 195–216 (2002).
[12] Lankton, S. and Tannenbaum, A., "Localizing region-based active contours," IEEE Transactions on Image Processing 17(11), 2029–2039 (2008).
[13] Michailovich, O. V., Rathi, Y., and Tannenbaum, A., "Image segmentation using active contours driven by the Bhattacharyya gradient flow," IEEE Transactions on Image Processing 16(11), 2787–2801 (2007).
[14] Rother, C., Kolmogorov, V., and Blake, A., "GrabCut: Interactive foreground extraction using iterated graph cuts," ACM Transactions on Graphics 23, 309–314 (Aug. 2004).
