
IEEE TRANSACTIONS ON MEDICAL IMAGING, VOL. 34, NO. 8, AUGUST 2015

Regression Segmentation for

Spinal Images

Zhijie Wang*, Xiantong Zhen, KengYeow Tay, Said Osman, Walter Romano, and Shuo Li

Abstract—Clinical routine often requires analyzing spinal images of multiple anatomic structures in multiple anatomic planes from multiple imaging modalities. Unfortunately, existing methods for segmenting spinal images are still limited to one specific structure, in one specific plane, or from one specific modality. In this paper, we propose a novel approach, Regression Segmentation, that is for the first time able to segment such diverse spinal images in one single unified framework. This approach formulates the segmentation task innovatively as a boundary regression problem: modeling a highly nonlinear mapping function from substantially diverse images directly to desired object boundaries. Leveraging the advancement of sparse kernel machines, regression segmentation is fulfilled by a multi-dimensional support vector regressor (MSVR) which operates in an implicit, high-dimensional feature space where diversity and specificity can be systematically categorized, extracted, and handled. The proposed regression segmentation approach was thoroughly tested on images from 113 clinical subjects including both disc and vertebral structures, in both sagittal and axial planes, and from both MRI and CT modalities. The overall result reaches a high dice similarity index (DSI) of 0.912 and a low boundary distance (BD) of 0.928 mm. With our unified and expandable framework, an efficient clinical tool for spinal image segmentation can be easily achieved, and will substantially benefit the diagnosis and treatment of spinal diseases.

Index Terms—Computed tomography (CT), disc, magnetic resonance imaging (MRI), multi-kernel, segmentation, spine, support vector regression, vertebra.

I. INTRODUCTION

PATHOLOGICAL conditions affecting the spine, such as low back pain (LBP), osteoporosis, spondylolisthesis, or herniation, have become acute problems of modern society. LBP, as an example, is the second most common neurological ailment in the United States, with $50 billion spent annually on its rehabilitation and health care [1]. Diagnosis and treatment of these pathologies often require integrated quantitative analyses of spinal images of multiple anatomic structures in multiple anatomic planes from multiple imaging modalities

Manuscript received September 30, 2014; accepted October 23, 2014. Date of publication October 29, 2014; date of current version July 29, 2015. Asterisk indicates corresponding author. *Z. Wang is with GE Healthcare, London, ON, N5Y 0A3 Canada (e-mail: [email protected]). X. Zhen is with the Department of Biophysics, Western University, London, ON, N6A 3L7 Canada (e-mail: [email protected]). K. Tay is with the London Health Sciences Centre, London, ON, N6A 5W9 Canada (e-mail: [email protected]). S. Osman and W. Romano are with the Department of Medical Imaging, Western University, London, ON, N6A 3K7 Canada (e-mail: [email protected]; [email protected]). S. Li is with GE Healthcare, London, ON, N5Y 0A3 Canada (e-mail: [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TMI.2014.2365746

Fig. 1. Extreme diversity of spinal images including multiple anatomic structures, in multiple anatomic planes, from multiple imaging modalities. A novel approach, regression segmentation, is proposed in this paper to successfully tackle the diversity in one single unified framework. Segmentation results are shown by red contours.

([2]–[4]). This poses a great challenge to both manual processing and computer processing methods. 1) Manual processing is bound to be infeasible for such diverse images because of its known tediousness, inefficiency, and inconsistency [5]. For example, it takes an operator up to 15 min to assess vertebral fracture (an indicator of osteoporosis) even in a single spinal image of one specific structure, in one specific plane, or from one specific modality [5]. 2) Computer processing methods, although they have achieved preliminary success in processing spinal images efficiently and consistently ([6]–[9]), are still unable to handle such diverse images, since automatically segmenting these images in one single framework is extremely challenging due to the diversity (shown in Fig. 1), including: 1) totally different appearances of different structures such as vertebra and disc; 2) completely different intensity profiles from different imaging modalities such as MRI, CT, and X-ray; 3) substantially different shapes in different planes, such as sagittal and axial ones, even for the same anatomic structure. Such diversity mixed together brings challenges at multiple levels to conventional segmentation methods. Among many, two typical challenges are shown in Fig. 2: 1) A dehydrated disc shows a dark surface as in Fig. 2(a), while a normal disc clearly shows the nucleus pulposus with stronger edges (which can easily trap edge-based methods) than the true disc boundary as shown in Fig. 2(b); 2) The foreground distributions of a vertebra image [as in Fig. 2(c)] and the background

0278-0062 © 2014 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

WANG et al.:

SPINAL IMAGE SEGMENTATION

Fig. 2. Typical challenges brought by the diversity. (a) Dehydrated disc showing a dark surface. (b) Normal disc clearly showing the nucleus pulposus with stronger edges (which can easily trap edge based methods) than the true disc boundary shown by the red contour. (c) Vertebra image with its true boundary shown by the red contour. (d) Completely overlapped intensity distributions of the foreground in Fig. 2(c) and the background in Fig. 2(b), which can easily confuse intensity based methods.

distribution of a disc image [as in Fig. 2(b)] can be completely overlapped as shown in Fig. 2(d), and this can easily confuse intensity-based methods when they try to segment both in one framework.

In this paper, we propose a novel regression segmentation approach which is able to accurately segment such diverse spinal images in a unified framework (as illustrated in Fig. 1). This approach, for the first time, formulates the segmentation task as a boundary regression problem. By leveraging the strength of sparse kernel machines [30], regression segmentation learns a model to associate extremely diverse images directly with desired boundaries. This learned model is highly nonlinear and holistic in two respects. 1) Holistic regression output: The locations of all the boundary points are regressed simultaneously instead of separately, and therefore they are guided by the global shape prior learned from the training data. 2) Holistic regression input: Each boundary point is regressed using the full image as a signal, and therefore the boundary regression model is able to capture the full context of a boundary point.

In summary, the proposed work contributes in the following three categories. 1) Application: It achieves cross-structure, cross-modality, and cross-plane image segmentation in a unified framework. 2) Approach: It formulates the segmentation task as a boundary regression problem to leverage the advancement of machine learning in a holistic fashion. 3) Methodology: It proposes an original multi-dimensional support vector regressor (MSVR) with a multi-kernel learning technique to fulfill the regression segmentation approach.

II. RELATED WORK

In this section, the novelty of the proposed work is further illustrated by comparison with the existing works on computer processing of spinal images. As listed in Table I, these works can be roughly categorized into three groups: 1) localization focused; 2) localization and segmentation focused; 3) segmentation focused. The three groups are reviewed in turn below, together with each work's key idea and its applicable images.

Localization: Among the existing works, some specifically focus on localization of spinal anatomic structures including vertebra, disc, and cord. For example, the works in


[10], [11] rely on a two-level probabilistic model using both pixel-level and object-level features to detect/localize intervertebral discs in sagittal MRI slices. Similarly, the work in [12] uses the Hough transform to detect the spinal centerline and then applies a period detection to localize the centers of both vertebrae and discs. A different strategy is proposed in [13] to detect both discs and vertebrae in 3-D MRI volumes by training specific detection models for their own defined anchor vertebrae, bundle vertebrae, and discs. A local articulated model is then employed to effectively model the spatial relations across vertebrae and discs. The localization works discussed above normally serve as a preprocessing step to construct a ROI for the following segmentation step, which is the focus of most of the spinal image processing works including this paper.

Localization + Segmentation: In addition to localization, some works solve the segmentation problem as well. For example, the work in [1] relies on both an intensity model and spatial information to localize and segment vertebrae in sagittal CT slices. Differently, the work in [14] uses a learning-based method to detect vertebrae first and then uses normalized cut to segment them in sagittal MRI slices. Prior shape and spatial knowledge are modeled in a two-scale framework in [15] to detect as well as segment vertebrae in 3-D CT volumes. Other works such as [16], [17] use an intensity model or the Hough transform to localize intervertebral discs and then obtain a refined segmentation by edge detection in sagittal MRI slices. Another interesting work was recently introduced in [18] to localize and segment both discs and vertebrae in either MRI or CT 3-D volumes. Unfortunately, instead of providing a unified framework, it requires training individual models for each anatomic structure and modality.

Segmentation: Since the detection/localization of vertebra or disc has been quite robust, with detection accuracy over 98% and center error under 3 mm (such as in [12], [14], [16]–[19]), many works assume that a ROI is available and directly focus on the segmentation of spinal anatomic structures. For example, the works in [6], [19] use 3-D shape-based Graph Cuts to segment vertebral bodies in CT images. Similarly, the works in [7], [20] use 3-D balloon snake and level set deformation techniques to segment vertebrae in 3-D CT images. The work in [21] is able to segment vertebrae in either CT or MRI images by fitting a 31-parameter 3-D superquadric model. However, instead of providing a unified framework, it requires an individual fitting similarity measurement for each modality. Other models such as the polynomial model and the active appearance model are also used, as in [8], [22], to fit vertebrae in sagittal CT slices. Besides vertebrae, many works have focused on segmenting intervertebral discs. Graph cuts integrated with an object interaction prior have been used in [23] for segmenting discs in sagittal MRI slices. Differently, the work in [24] uses a learning method to classify the watershed regions into disc or background in sagittal MRI slices. The work in [25] proposed a novel energy functional based on anisotropic oriented flux descriptors and obtained disc segmentation by minimizing the energy functional with a level set. A probabilistic atlas is used in [31] to constrain the fuzzy clustering result and achieves disc segmentation in sagittal MRI slices. Other than MRI, peripheral quantitative CT is used to visualize discs in [26], and a multi-scale level set is employed to segment discs in the axial slices. There are


TABLE I COMPREHENSIVE REVIEW OF THE RELATED WORKS ON SPINAL IMAGE PROCESSING

Fig. 3. Flowchart of regression segmentation framework where the segmentation task is formulated as a boundary regression problem.

also some works focused on spinal canal/cord segmentation. For example, the work in [27] employed basic techniques of watershed plus morphological operations to segment the spinal canal in axial MRI slices. Sophisticated methods such as active surface and active boundary are used as well in [28], [29] to segment the cord in MRI or CT images.

All the works reviewed above have at least one of the following limitations (as shown in Table I): 1) limited to one specific anatomic structure, e.g., either vertebra or disc; 2) limited to one specific modality, e.g., either MRI or CT; 3) limited to one specific plane, e.g., either sagittal plane or axial plane. However, even for a single patient, diverse spinal images with different structures, modalities, and planes generally have to be processed together for diagnosis and treatment. It is extremely inconvenient if a structure-, modality-, and plane-dependent tool has to be chosen every time a different type of spinal image is analyzed. Our proposed unified framework allows physicians to quantitatively and accurately analyze any type of spinal structure in any plane and modality without switching tools. This will greatly improve the convenience and efficiency of the current clinical routine processes.

III. UNIFIED REGRESSION SEGMENTATION FRAMEWORK

The overview of the regression segmentation framework for spinal images is shown in Fig. 3. The training process learns a highly nonlinear boundary regression model (illustrated by


Fig. 4. Illustration of the point based boundary representation. (a) All the boundary points composing a complicated shape with the first point indicated by a star. (b) Highlighted rectangular portion in (a) showing the boundary points clearly.

the symbolic surface in Fig. 3) that maps image features to desired boundaries. The testing process directly predicts the holistic boundary in the test image based on the learned regression model. A boundary in this framework is represented by the coordinates of a set of points (as detailed in Section III-A). With this representation, the boundary regression can be naturally fulfilled by an MSVR (as described in Section III-B) integrated with a multi-kernel learning technique (as in Section III-C). The flexible boundary representation and the highly nonlinear multi-kernel regressor are seamlessly integrated into an effective approach for handling spinal images.

A. Boundary Representation

An object boundary is represented by a set of points as shown in Fig. 4(a), each described by its coordinates in the image. To ensure the points are in a consistent coordinate system across different images, a scale normalization step is applied to all the input images so that they have the same size in both training and testing processes. Note that a rescaling step is required to transform the predicted boundary obtained in the testing process back into the original image space. For a better visualization of the points, the rectangular portion in Fig. 4(a) is enlarged and shown in Fig. 4(b). For each boundary, the points are obtained in a consistent way: the first point is selected at the location where the boundary is crossed by the horizontal center line as shown in Fig. 4(a), and the rest are evenly sampled along the boundary in the clockwise direction. Throughout our experiments, 100 points were used empirically for boundary representation, and they were able to smoothly approximate complicated shapes such as the example in Fig. 4(a). Compared to other boundary representation methods such as the widely employed PCA shape ([32]), the point-based representation makes fewer assumptions and is therefore more flexible.
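The sampling procedure described above can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names `resample_boundary` and `to_regression_target`, the choice of the leftmost crossing as the starting point, the use of the contour's mid-height as the horizontal center line, and the normalized size `norm_size` are all hypothetical choices that the text does not specify.

```python
import numpy as np

def resample_boundary(contour, n_points=100, center_y=None):
    """Resample a closed contour to n_points evenly spaced boundary points.

    contour: (K, 2) array of (x, y) polygon vertices in image coordinates
             (x to the right, y downward), first vertex not repeated.
    """
    pts = np.asarray(contour, dtype=float)
    x, y = pts[:, 0], pts[:, 1]
    # Shoelace sum: positive means clockwise on screen because y points down.
    if np.sum(x * np.roll(y, -1) - np.roll(x, -1) * y) < 0:
        pts = pts[::-1]
    if center_y is None:
        # Assumption: take the mid-height of the contour as the center line.
        center_y = 0.5 * (pts[:, 1].min() + pts[:, 1].max())

    nxt = np.roll(pts, -1, axis=0)
    best = None
    for i in range(len(pts)):
        y0, y1 = pts[i, 1], nxt[i, 1]
        if y0 == y1:
            continue  # a horizontal edge cannot cross the center line
        t = (center_y - y0) / (y1 - y0)
        if 0.0 <= t < 1.0:
            cross = pts[i] + t * (nxt[i] - pts[i])
            # Assumption: among all crossings, start from the leftmost one.
            if best is None or cross[0] < best[1][0]:
                best = (i, cross)
    if best is None:
        raise ValueError("contour does not cross the center line")

    i, start = best
    # Closed path that starts and ends at the crossing point.
    order = np.roll(pts, -(i + 1), axis=0)
    path = np.vstack([start, order, start])
    seg = np.linalg.norm(np.diff(path, axis=0), axis=1)
    cum = np.concatenate([[0.0], np.cumsum(seg)])
    # Evenly spaced arc-length positions; the end point duplicates the start.
    s = np.linspace(0.0, cum[-1], n_points, endpoint=False)
    return np.column_stack([np.interp(s, cum, path[:, 0]),
                            np.interp(s, cum, path[:, 1])])

def to_regression_target(points, image_shape, norm_size=64):
    """Scale points to a common norm_size x norm_size frame and flatten."""
    h, w = image_shape
    scaled = points * np.array([norm_size / w, norm_size / h])
    return scaled.reshape(-1)  # [x1, y1, x2, y2, ...]
```

With 100 points the flattened target has 200 entries, one per coordinate. Predictions made in the normalized frame would be mapped back by dividing by the same scale factors.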
The flexibility of the point-based representation fits the segmentation problem well, where object shapes come from multiple different classes, vary substantially, and must be handled by a flexible representation.

B. Boundary Regression

Regression segmentation learns a nonlinear mapping function

$$f:\ \mathbf{x} \in \mathbb{R}^{M} \mapsto \mathbf{y} \in \mathbb{R}^{Q} \tag{1}$$

where $\mathbf{x}$ is the feature vector extracted from the input image $I$, and $\mathbf{y}$ is the boundary coordinate vector which corresponds to the object segmentation in image $I$. This formulation is fulfilled by an MSVR ([33], [34]), which is a sparse kernel machine and has the ability to model a highly nonlinear mapping function. The advancement of MSVR ensures regression segmentation's capability of tackling extremely diverse images in one single unified framework. MSVR has the additional advantage of predicting all the dimensions in the output vector simultaneously and dependently. This property exactly fits the boundary prediction problem, because the boundary spatial coherence can be fully exploited to achieve more accurate and efficient prediction than using a single-dimensional regressor separately for each point. Formally, the objective of the MSVR is to learn a kernel regressor

$$\mathbf{y} = \mathbf{W}^{T}\phi(\mathbf{x}) + \mathbf{b} \tag{2}$$

where $\phi(\cdot)$ is a nonlinear transformation to a higher H-dimensional space, $\mathbf{x}$ is the M-dimensional image feature input, and $\mathbf{y}$ is the Q-dimensional boundary coordinate output. The direct goal is to learn the optimal parameters $\mathbf{W} = [\mathbf{w}^{1}, \dots, \mathbf{w}^{Q}]$ and $\mathbf{b}$ from a labeled training data set $\{(\mathbf{x}_i, \mathbf{y}_i)\}_{i=1}^{N}$ by solving the optimization problem [33], [34]

$$\min_{\mathbf{W},\,\mathbf{b}}\ \frac{1}{2}\sum_{j=1}^{Q}\left\|\mathbf{w}^{j}\right\|^{2} + C\sum_{i=1}^{N} L(u_i) \tag{3}$$

$$L(u) = \begin{cases} 0, & u < \varepsilon \\ u^{2} - 2u\varepsilon + \varepsilon^{2}, & u \ge \varepsilon \end{cases} \tag{4}$$

where $u_i = \|\mathbf{e}_i\|$, with $\mathbf{e}_i = \mathbf{y}_i - \mathbf{W}^{T}\phi(\mathbf{x}_i) - \mathbf{b}$, is the prediction error, $C$ is the penalty factor, and $\varepsilon$ is the width of the insensitive zone. The best solution of $\mathbf{W}$ can be expressed based on the Representer Theorem [35] as a linear combination of the training samples in the transformed feature space

$$\mathbf{W} = \sum_{i=1}^{N}\phi(\mathbf{x}_i)\,\boldsymbol{\beta}_i^{T} \tag{5}$$

with coefficients $\boldsymbol{\beta}_i \in \mathbb{R}^{Q}$, and it can be solved by an iterative re-weighted least square (IRWLS) algorithm [36]. Once solved, the Q-dimensional output (i.e., the boundary coordinate vector) for each new input $\mathbf{x}$ can be computed as

$$\mathbf{y} = \sum_{i=1}^{N} K(\mathbf{x}_i, \mathbf{x})\,\boldsymbol{\beta}_i + \mathbf{b} \tag{6}$$

where $K(\mathbf{x}_i, \mathbf{x}) = \phi(\mathbf{x}_i)^{T}\phi(\mathbf{x})$ is a kernel function representing the dot product between a training sample $\mathbf{x}_i$ and the new input vector $\mathbf{x}$ in the feature space. The regressor in (6) is constructed taking into account the prediction errors of all the output dimensions and therefore incorporates the boundary spatial coherence.

C. Multi-Kernel Learning

To establish accurate boundary regression for spinal images with multiple structures, modalities, and planes, a comprehensive set of image features has to be employed. Unfortunately, combining multiple image feature sets into one kernel


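was not investigated in the original MSVR work. In this subsection, we extend the MSVR with a multi-kernel learning technique in order to take advantage of an optimally combined feature representation. The complete feature set employed in this paper consists of the following features, chosen for their capability of capturing the global image context and for their mutual complementarity.
• Texture: WI-SIFT [37] captures the texture information in images by computing a histogram of oriented gradients in image blocks.
• Intensity pattern: WI-SURF [37] describes the nature of the underlying intensity pattern based on sums of 2-D Haar wavelet responses.
• Semantic context: GIST [38] summarizes an image's semantic context by computing its response to a Gabor filter bank.
• Shape: HOG [39] captures the main shape of the anatomic structure in an image based on the occurrences of gradient orientation in localized portions of the image.

Consequently, the whole feature set extracted from an input image is represented as

$$\mathbf{x} = \left[\mathbf{x}^{(1)}, \mathbf{x}^{(2)}, \mathbf{x}^{(3)}, \mathbf{x}^{(4)}\right] \tag{7}$$

where $\mathbf{x}^{(1)}, \dots, \mathbf{x}^{(4)}$ are the WI-SIFT, WI-SURF, GIST, and HOG feature vectors, respectively. Each type of feature captures a specific image characteristic and constructs a specific kernel function for MSVR. To achieve an optimal kernel combination, inspired by [40] we extend MSVR with a multi-kernel learning strategy. The extended MSVR employs an adaptively combined kernel in the form of the product of RBF (radial basis function) kernels

$$K_{\mathbf{d}}(\mathbf{x}_i, \mathbf{x}_j) = \prod_{k=1}^{4}\exp\left(-d_k\left\|\mathbf{x}_i^{(k)} - \mathbf{x}_j^{(k)}\right\|^{2}\right) \tag{8}$$

where $\mathbf{d} = [d_1, \dots, d_4]$ is the weighting parameter vector. Each individual RBF kernel is constructed based on one feature out of the four employed features in (7). Correspondingly, the original MSVR objective in (2) is changed into

$$\mathbf{y} = \mathbf{W}^{T}\phi_{\mathbf{d}}(\mathbf{x}) + \mathbf{b} \tag{9}$$

where the feature is transformed into the space parametrized by $\mathbf{d}$ and the dot product in the space is determined by the adaptively combined kernel in (8), $\phi_{\mathbf{d}}(\mathbf{x}_i)^{T}\phi_{\mathbf{d}}(\mathbf{x}_j) = K_{\mathbf{d}}(\mathbf{x}_i, \mathbf{x}_j)$. The goal is then changed into learning the optimal parameters $\mathbf{W}$, $\mathbf{b}$, and $\mathbf{d}$ from the training data, and it can be solved by a nested two-step optimization procedure [40] which can be formally represented as

$$\min_{\mathbf{d}}\ T(\mathbf{d}) \quad \text{s.t.}\ \mathbf{d} \ge 0, \qquad T(\mathbf{d}) = \min_{\mathbf{W},\,\mathbf{b}}\ \frac{1}{2}\sum_{j=1}^{Q}\left\|\mathbf{w}^{j}\right\|^{2} + C\sum_{i=1}^{N} L(u_i) + r(\mathbf{d}) \tag{10}$$

where $r(\mathbf{d})$ is a differentiable regularizer function of $\mathbf{d}$. As shown in Algorithm 1, the inner loop holds the kernel fixed and learns the value of $\boldsymbol{\beta}$ that optimizes the MSVR parameters as in the previous subsection, while the outer loop optimizes the kernel parameter $\mathbf{d}$ using a gradient descent technique. The descent step is chosen based on the Armijo rule to guarantee convergence. The descent direction is extended from the single-dimensional output format in [40] to the multi-dimensional output format

$$\frac{\partial T}{\partial d_k} = \frac{\partial r}{\partial d_k} - \frac{1}{2}\,\mathrm{tr}\!\left(\mathbf{B}^{T}\,\frac{\partial \mathbf{K}_{\mathbf{d}}}{\partial d_k}\,\mathbf{B}\right) \tag{11}$$

where $\mathbf{K}_{\mathbf{d}}$ is the $N \times N$ kernel matrix of the training samples and $\mathbf{B} \in \mathbb{R}^{N \times Q}$ stacks the coefficients $\boldsymbol{\beta}$ obtained by the inner MSVR solver.

Algorithm 1 Multi-kernel MSVR learning
  Initialize $\mathbf{d}$ randomly.
  repeat
    Use the MSVR solver in III-B to solve the single kernel problem with kernel $K_{\mathbf{d}}$ and obtain $\boldsymbol{\beta}$.
    Update $\mathbf{d}$ along the descent direction in (11), with the step size chosen by the Armijo rule.
  until converged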
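The nested procedure of Algorithm 1 can be illustrated with a compact NumPy sketch. It is a simplified illustration under several assumptions rather than the authors' code: the inner solver is a bias-free IRWLS in the spirit of [33], [36]; the regularizer is assumed to be $r(\mathbf{d}) = \sigma\sum_k d_k$; the kernel-weight gradient is assumed to take the form used in [40] summed over the output dimensions; and a fixed, normalized gradient step replaces the Armijo rule. All function names and default parameter values are hypothetical.

```python
import numpy as np

def sq_dists(A, B):
    """Pairwise squared Euclidean distances between rows of A and rows of B."""
    d = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.maximum(d, 0.0)

def combined_kernel(feats_a, feats_b, d):
    """Product of RBF kernels, one per feature type, weighted by d (d_k >= 0)."""
    total = sum(dk * sq_dists(Fa, Fb) for dk, Fa, Fb in zip(d, feats_a, feats_b))
    return np.exp(-total)

def objective(K, Y, beta, C, eps):
    """Regularizer plus epsilon-insensitive quadratic loss on the error norms."""
    u = np.linalg.norm(Y - K @ beta, axis=1)
    loss = np.where(u > eps, (u - eps) ** 2, 0.0).sum()
    return 0.5 * np.trace(beta.T @ K @ beta) + C * loss, u

def msvr_fit(K, Y, C=100.0, eps=0.01, n_iter=50):
    """Simplified, bias-free IRWLS; returns the coefficient matrix (N x Q)."""
    beta = np.zeros(Y.shape)
    obj, u = objective(K, Y, beta, C, eps)
    for _ in range(n_iter):
        sv = u > eps                       # samples outside the insensitive zone
        if not sv.any():
            break
        a = 2.0 * C * (u[sv] - eps) / u[sv]   # IRWLS sample weights
        cand = np.zeros_like(beta)
        cand[sv] = np.linalg.solve(K[np.ix_(sv, sv)] + np.diag(1.0 / a), Y[sv])
        eta = 1.0
        while eta > 1e-6:                  # backtracking line search
            trial = beta + eta * (cand - beta)
            new_obj, new_u = objective(K, Y, trial, C, eps)
            if new_obj < obj:
                break
            eta *= 0.5
        else:
            break                          # no decrease found: stop iterating
        beta, obj, u = trial, new_obj, new_u
    return beta

def kernel_weight_gradient(feats, d, beta, sigma=1e-3):
    """Assumed gradient of T(d) with r(d) = sigma * sum(d); dK/dd_k = -D_k * K."""
    K = combined_kernel(feats, feats, d)
    return np.array([sigma + 0.5 * np.sum(beta * ((sq_dists(F, F) * K) @ beta))
                     for F in feats])

def multi_kernel_msvr(feats, Y, n_outer=5, step=0.05, seed=0):
    """Outer loop over kernel weights d, inner loop solving the MSVR."""
    rng = np.random.default_rng(seed)
    d = rng.uniform(0.1, 1.0, size=len(feats))   # "Initialize d randomly"
    for _ in range(n_outer):
        beta = msvr_fit(combined_kernel(feats, feats, d), Y)
        grad = kernel_weight_gradient(feats, d, beta)
        # Fixed normalized step (not Armijo), projected onto d >= 0.
        d = np.maximum(d - step * grad / (np.linalg.norm(grad) + 1e-12), 0.0)
    beta = msvr_fit(combined_kernel(feats, feats, d), Y)
    return d, beta

def predict(feats_train, feats_new, d, beta):
    """Kernel expansion over the training samples (bias omitted in this sketch)."""
    return combined_kernel(feats_new, feats_train, d) @ beta
```

Here `feats` is a list with one array per feature type (one row per image), and `Y` holds one boundary coordinate vector per row. Because a single kernel matrix and a single linear solve serve all Q outputs, the cost of the inner loop does not grow with the number of boundary points in the way that training Q separate regressors would.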

The seamless combination of MSVR and multi-kernel learning unites the strengths of both and yields a powerful regressor that is able to learn various appearance and geometric characteristics of spinal structures in one single model. The proposed method approaches the segmentation problem from a new perspective, and is therefore able to tackle the large diversity among spinal images.

IV. EXPERIMENT

The experiments demonstrated the strength of the regression segmentation approach on our own as well as publicly available [41] datasets, which include both disc and vertebra structures (of lumbar and thoracic sections), in both sagittal and axial planes, and from both MRI and CT modalities. The proposed approach produced accurate segmentation results for all the categories of spinal images as shown in Fig. 5. In spite of the variations within the dataset, such as the shape variation between disc and vertebra (rows 1 and 2) and the intensity profile variation between MRI and CT modalities (rows 3 and 5), the segmentation results are highly consistent with human experts' delineation, with an average dice similarity index (DSI) [42] as high as 0.9119 and an average boundary distance (BD) [43] as low as 0.9280 mm. DSI measures the closeness of boundaries between manually segmented (M) and automatically segmented (A) spinal anatomic structures, while BD assesses the difference in boundary shape

$$\mathrm{DSI} = \frac{2\,(A_M \cap A_A)}{A_M + A_A} \tag{12}$$

$$\mathrm{BD} = \frac{1}{2}\left(\frac{1}{N_A}\sum_{i=1}^{N_A} d(a_i, M) + \frac{1}{N_M}\sum_{j=1}^{N_M} d(m_j, A)\right) \tag{13}$$

where $A_M$ is the area of the manual segmentation, $A_A$ is the area of the automatic segmentation, $d(a_i, M)$ is the distance of pixel $a_i$ on boundary $A$ to the closest one on boundary $M$, and $N_A$ is the total number of boundary points on $A$, while $d(m_j, A)$ is the distance of pixel $m_j$ on boundary $M$ to the closest one on boundary $A$, and $N_M$ is the total number of boundary points on $M$. DSI ranges from 0 (worst match) to 1 (best match), and BD ranges from $\infty$ (worst match) to 0 (best match).

Fig. 5. Visualization of the segmentation results. Each segmentation is represented as a red solid contour, and its corresponding ground truth is represented as a blue dashed contour. The five rows correspond, respectively, to sagittal disc MRI images, sagittal vertebra MRI images, axial vertebra MRI images, sagittal vertebra CT images, and axial vertebra CT images.

A. Data

The whole data set contains the ROIs cropped from three image sets collected on 113 clinical subjects (including disc degeneration cases and vertebra fracture cases). Although ROIs can be cropped automatically based on the detection results of a localization method, manually cropped ROIs are employed in this paper to guarantee the quality of inputs to the proposed method. In this way, the performance in tackling diversity can be evaluated independently.

1) The first image set consists of 93 subjects (46 men, 47 women, average age 49 ± 15 years), scanned using a sagittal T2-weighted MRI with repetition time (TR) of 4000 ms and echo time (TE) of 85 ms under a magnetic field of 1.5 T. The scans cover the entire lumbar spine. The in-plane resolution is 0.5 mm × 0.5 mm with slice thickness of either 1 mm or 1.6 mm.

2) The second image set consists of 10 subjects (six men, four women, average age 48 ± 9 years), scanned using an axial T1-weighted MRI with TR of 500 ms and TE of 11 ms under a magnetic field of 1.5 T. The scans cover the entire lumbar spine. The in-plane resolution is 0.4 mm × 0.4 mm with slice thickness of 4 mm.

3) The third image set is publicly available [41] at http://csi-workshop.weebly.com/challenges.html and it consists of CT scans from 10 young adults (16–35 years old). The scans cover the entire thoracic and lumbar spine. The in-plane resolution is between 0.31 and 0.45 mm, and the slice thickness is 1 mm.

All the data are trained and tested using a leave-one-subject-out strategy based on the clinical subject ID, i.e., images from a certain subject, regardless of their modalities, anatomic planes, or anatomic structures, are tested with a model trained on the images from all the other subjects.

B. Comprehensive Performance Analysis

Table II reports the results on the whole data set, which is grouped into five categories based on the images' shared characteristics. Specifically, the five categories are 1) intervertebral discs in sagittal plane MRI images; 2) vertebrae in sagittal plane MRI images; 3) vertebrae in axial plane MRI images; 4) vertebrae in sagittal plane CT images; 5) vertebrae in axial plane CT images. The average results reached a high DSI of 0.9119 and a low BD of 0.9280 mm, which demonstrates good conformity between the automatically estimated spinal structure boundaries and those manually delineated by experts. Table II also clearly shows that the proposed regression segmentation framework successfully handles each category, with its unique characteristics, due to its ability to systematically categorize and extract the diversity and specificity.
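For readers who wish to reproduce this kind of evaluation, the two measures can be sketched as follows. This is a minimal sketch under assumptions, not the evaluation code used in the paper: DSI is computed on binary masks, BD is assumed to be the symmetric average of closest-point distances between the two boundaries, and the function names and the `spacing` parameter (pixel size in mm) are hypothetical.

```python
import numpy as np

def dice_similarity(mask_auto, mask_manual):
    """DSI between two binary masks: 2 * overlap / (area_A + area_M)."""
    a = np.asarray(mask_auto, dtype=bool)
    m = np.asarray(mask_manual, dtype=bool)
    total = a.sum() + m.sum()
    if total == 0:
        return 1.0  # convention chosen here: two empty masks match perfectly
    return 2.0 * np.logical_and(a, m).sum() / total

def boundary_distance(pts_auto, pts_manual, spacing=1.0):
    """Symmetric mean closest-point distance between two boundary point sets.

    pts_auto, pts_manual: (N, 2) arrays of boundary coordinates in pixels.
    spacing: assumed isotropic pixel size used to convert pixels to mm.
    """
    a = np.asarray(pts_auto, dtype=float)
    m = np.asarray(pts_manual, dtype=float)
    # Full pairwise distance matrix between the two boundaries.
    d = np.linalg.norm(a[:, None, :] - m[None, :, :], axis=2)
    a_to_m = d.min(axis=1).mean()   # each automatic point to the manual boundary
    m_to_a = d.min(axis=0).mean()   # each manual point to the automatic boundary
    return 0.5 * (a_to_m + m_to_a) * spacing
```

Under a leave-one-subject-out protocol, these functions would be applied to every held-out image and the values averaged per category.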


TABLE II STATISTICAL APPRAISAL OF THE CONFORMITY BETWEEN THE AUTOMATICALLY ESTIMATED AND MANUALLY DRAWN SPINAL STRUCTURE BOUNDARIES

TABLE III PERFORMANCE OF THE PROPOSED METHOD HANDLING VARIOUS SPINAL STRUCTURES

TABLE IV PERFORMANCE OF THE PROPOSED METHOD HANDLING VARIOUS IMAGE MODALITIES

TABLE V PERFORMANCE OF THE PROPOSED METHOD HANDLING VARIOUS IMAGE PLANES

TABLE VI COMPARISON BETWEEN LEARNING BASED ON AXIAL VERTEBRA IMAGES OF A SINGLE MRI MODALITY (CATEGORY 3 DESCRIBED IN SECTION IV-B) AND BOTH MRI AND CT MODALITIES (CATEGORIES 3 AND 4 DESCRIBED IN SECTION IV-B)

In the following, the strength of the regression segmentation approach in processing such diverse images is further analyzed for each specific structure, modality, and plane.

Multi-structure analysis: Table III shows the performance of handling the two representative spinal structures: disc and vertebra. The results on all 465 disc images reached a DSI of 0.8720 and a BD of 0.8737 mm, and the results on all 855 vertebra images reached a DSI of 0.9336 and a BD of 0.8865 mm. Both structures were successfully handled although they have totally different shapes. As a consequence, physicians can analyze arbitrary spinal structures without switching tools.

Multi-modality analysis: Table IV reports the performance of handling the two representative image modalities: MRI and CT. Both modalities were processed with high DSIs over 0.9 and low BDs under 1 mm. This demonstrates that the proposed approach successfully handles the different intensity profiles of MRI and CT images, which is a big challenge to the existing methods.

Multi-plane analysis: Table V summarizes the performance of handling spinal images in two representative planes: sagittal and axial. Even the same structure, such as a vertebra, may show totally different shapes in different planes, which imposes a tremendous challenge on segmentation. Both planes are successfully processed with high DSIs and low BDs, which demonstrates that the regression segmentation approach is able to segment spinal structures in arbitrary planes.

C. Unified Regression Segmentation Framework Advantage Analysis

In this subsection, we conduct experiments to validate the advantages of the unified framework versus framework, the regression segmentation approach versus conventional segmentation methods, the MSVR formulation versus single dimensional support vector regressor (SSVR) formulation, and the multi-kernel MSVR versus individual kernel MSVR. Unified framework advantage}: Besides the demonstrated flexibility, processing spinal images in our proposed unified framework has an additional advantage that the reciprocal information among different image categories can be exploited to learn an accurate prediction model. For example, the training images of the same structure from more than one modality can result in a more accurate model than using images from a single modality. As shown in Table VI, compared to using the 50 axial vertebra MRI images alone from the category 3 described in Section IV-B, the segmentation result is improved with mm BD decrease, by including the 170 axial vertebra CT images from the category 4 described in Section IV-B. Therefore, multiple types of spinal images should be modeled in a unified framework in order to benefit small datasets or reduce the level of data requirement. Regression segmentation advantage: Table VII demonstrates the advantages of the regression segmentation approach compared to the existing popular medical image segmentation methods including Graph cuts [44], Active contours [45], and classification based methods [46], [47]. These existing methods have been widely used in various areas including spinal [6], [23], cardiac [48], [49], and brain [50], [47] images. However, the diversity among spinal images brings formidable challenges to these existing methods. For example, Active contours tend to be mistakenly attracted by high gradient edges of disc nucleus pulposus as shown in Fig. 6(a), and it also may not converge to the object boundary as shown in Fig. 6(b) if the initialization is not close enough. 
Both Graph cuts and classification based methods may totally fail due to their dependence on intensity profile which unfortunately shows

WANG et al.:

SPINAL IMAGE SEGMENTATION

TABLE VII COMPARISON BETWEEN REGRESSION SEGMENTATION APPROACH AND EXISTING SEGMENTATION METHODS INCLUDING ACTIVE CONTOURS, GRAPH CUTS, AND CLASSIFICATION-BASED METHODS

1647

TABLE VIII COMPARISON BETWEEN MSVR AND SSVR. BOTH COMPUTATIONAL TIMES ARE OBTAINED USING MATLAB

TABLE IX COMPARISON BETWEEN THE MULTI-KERNEL MSVR SINGLE KERNEL MSVRS

Fig. 6. Illustration of the failures from Active contours in (a) and (f), Graph cuts in (g) and (h), and classification based methods in (i) and (j), due to the spinal images. Output boundaries in (a) and (f) are shown large diversity in by red contours, and foreground/background in (g)–(j) is shown by white/black pixels.

AND

Multi-kernel MSVR advantage: Table IX demonstrates that the multi-kernel MSVR achieves a better performance than any other MSVRs with a single kernel based on an individual feature. Especially the BD is greatly lowered by 1 mm on average. Therefore, by fusing multiple feature sets into an adaptively combined kernel, multi-kernel MSVR is able to take advantage of complementary features to improve the segmentation accuracy without tuning parameters. V. CONCLUSION

Fig. 7. Intensity distributions of foreground and background in spinal images are completely overlapped, which brings a tremendous challenge to segmentation methods relying on intensity information.

completely overlapped foreground and background distributions (Fig. 7) in images. As a consequence, foreground and background can be labeled into exact opposite as shown in Fig. 6(g) and (i) where white/black pixels correspond to foreground/background, and results with unrealistic shapes may be produced as shown in Fig. 6(h) and (j). These conventional methods failed completely with DSI results much lower than 0.7. Only with the diversity extracted, categorized and embedded into the learning model, the proposed regression segmentation approach can successfully handle the images and achieves a high DSI over 0.9. MSVR advantage: Table VIII demonstrates the advantages of MSVR exploiting the spatial coherence of object boundary to SSVR. Compared to using a SSVR separately for each boundary point, MSVR is able to improve not only the prediction accuracy ( 0.7 mm BD decrease) but also the computational efficiency ( 200 times faster). The tremendously improved computational efficiency of MSVR comes from the fact that the model computation is only done once in MSVR while times ( points multiplying 2 coordinates each point) in SSVR, and it makes regression segmentation a practically feasible approach for image segmentation.

This paper proposed a unified framework that is, for the first time, able to segment spinal images with a single learning model. The framework is based on a novel regression segmentation approach, which formulates the segmentation task as a boundary regression problem. The approach learns domain knowledge in a holistic fashion and can therefore handle the extreme diversity that poses a tremendous challenge to conventional segmentation methods. Experiments on data from 113 clinical subjects demonstrated that the proposed approach produced results consistent with those of human experts on images of both disc and vertebral structures, in both sagittal and axial planes, and from both CT and MRI modalities. As a result, the novel and expandable segmentation framework would allow flexible and integrated quantitative analysis of spinal structures in clinical settings.
