Machine learning methods for the classification of gliomas: Initial results using features extracted from MR spectroscopy.


Original article

Machine learning methods for the classification of gliomas: Initial results using features extracted from MR spectroscopy

The Neuroradiology Journal 2015, Vol. 28(2) 106–111 © The Author(s) 2015 Reprints and permissions: sagepub.co.uk/journalsPermissions.nav DOI: 10.1177/1971400915576637 neu.sagepub.com

G Ranjith1, R Parvathy1, V Vikas2, Kesavadas Chandrasekharan1 and Suresh Nair1

Abstract
Context: With the advent of new imaging modalities, radiologists are faced with handling increasing volumes of data for diagnosis and treatment planning. The use of automated and intelligent systems is becoming essential in such a scenario. Machine learning, a branch of artificial intelligence, is increasingly being used in medical image analysis applications such as image segmentation, registration and computer-aided diagnosis and detection. Histopathological analysis is currently the gold standard for classification of brain tumors. The use of machine learning algorithms along with extraction of relevant features from magnetic resonance imaging (MRI) holds promise of replacing conventional invasive methods of tumor classification.
Aims: The aim of the study is to classify gliomas into benign and malignant types using MRI data.
Settings and design: Retrospective data from 28 patients who were diagnosed with glioma were used for the analysis. WHO Grade II (low-grade astrocytoma) was classified as benign while Grade III (anaplastic astrocytoma) and Grade IV (glioblastoma multiforme) were classified as malignant.
Methods and materials: Features were extracted from MR spectroscopy. The classification was done using four machine learning algorithms: multilayer perceptrons, support vector machine, random forest and locally weighted learning.
Results: Three of the four machine learning algorithms gave an area under the ROC curve (AUC) in excess of 0.80. Random forest gave the best performance in terms of AUC (0.911) while sensitivity was best for locally weighted learning (86.1%).
Conclusions: The performance of different machine learning algorithms in the classification of gliomas is promising. An even better performance may be expected by integrating features extracted from other MR sequences.

Keywords
machine learning, magnetic resonance spectroscopy, feature extraction, multilayer perceptrons, support vector machine, random forest

1 SCTIMST, Sri Chitra Tirunal Institute of Medical Sciences and Technology, Trivandrum, Kerala, India
2 NIMHANS, Sri Chitra Tirunal Institute of Medical Sciences and Technology, Trivandrum, Kerala, India

Corresponding author: Mr G Ranjith, Device Testing Lab, Sri Chitra Tirunal Institute of Medical Sciences and Technology, BMT Wing, Poojappura, SCTIMST, Trivandrum, Kerala 695012, India. Email: [email protected]

Introduction
A brain tumor is an abnormal growth of tissue in the brain. A tumor can be life-threatening depending on its location, size and state of development. Brain tumors can be cancerous (malignant) or non-cancerous (benign). Benign brain tumors grow slowly, are confined within distinct boundaries and rarely spread (Figure 1). Malignant brain tumors usually grow rapidly, are invasive and can often be life-threatening (Figure 2).

Gliomas are one of the most common types of brain tumor. They grow from glial cells, which are supportive cells in the brain. Astrocytes and oligodendrocytes are the two main types of supportive cells in the brain. Astrocytoma is the most common type of glioma and begins in cells called astrocytes in the cerebrum or cerebellum. There are four grades of astrocytoma according to the WHO classification of tumors in the central nervous system:1

Grade I: Pilocytic astrocytoma is a slow-growing tumor that is most often benign and rarely spreads into nearby tissue.
Grade II: Low-grade astrocytoma is slow growing, rarely spreads and has a well-defined border.
Grade III: Anaplastic astrocytoma is a cancerous tumor that can quickly grow and spread to nearby tissues.
Grade IV: Glioblastoma multiforme is malignant and the most invasive grade; it grows rapidly and spreads to nearby tissues.

Imaging by magnetic resonance imaging (MRI) and histopathological examination have been the mainstay of diagnostic testing for intracranial tumors, especially gliomas. Histopathological analysis has been the gold standard for diagnosis and prognostication. The decision on adjuvant therapy following surgery has thus depended on the histopathological diagnosis.

Figure 1. T2-weighted images and spectra of a patient with grade II glioma.

Figure 2. T2-weighted images and spectra of a patient with grade IV glioma.

Histopathological analysis, however, involves an invasive biopsy procedure, and this is a key motivation for developing methods for diagnosis based on MRI alone, especially when the risks involved in the biopsy are high. The diagnostic and therapeutic applications of radiological imaging are increasing rapidly. This has been triggered by the advent of new modalities and sequences in image acquisition, which enable the capture of smaller anatomical structures at higher resolution. But this has increased the workload of radiologists, who have to deal with a larger number of images per patient to make a diagnosis. This has naturally led to the development of automated systems for image processing and diagnosis. The success of these automated schemes depends heavily on the existence of relevant, discriminative features in the data and on the ability to extract them.

Figure 3. Receiver operating characteristic curve for classification using multilayer perceptrons.

Figure 4. Receiver operating characteristic curve for classification using support vector machine.

Figure 5. Receiver operating characteristic curve for classification using random forest.

Figure 6. Receiver operating characteristic curve for classification using locally weighted learning.

Computer-aided diagnosis (CAD) methods are being used increasingly in medical diagnosis, including the detection and diagnosis of breast cancers, lung cancers, colon cancers and brain tumors.2 Machine learning is the study of computer algorithms that can learn complex relationships or patterns from empirical data and make accurate decisions. It provides an effective way to automate the analysis and diagnosis of medical images. In the United States, CAD using machine learning is a standard procedure in the initial screening for lung cancer using computed tomography (CT) images.3,4 Machine learning algorithms have also been used effectively for the detection of pulmonary embolisms.5 Computer-aided polyp detection for the diagnosis of colon cancers has been reported to be highly accurate.6 Computer-aided diagnosis has also been used to classify breast tumors into benign and malignant types using texture features extracted from digital mammograms.7 For brain tumors, CAD has been used in tumor grading and in distinguishing primary tumors from metastases using features extracted from different MRI sequences such as MR spectroscopy, diffusion-weighted and perfusion MRI.8–11

The aim of the study was to develop automated methods for the classification of gliomas into benign and malignant types using machine learning algorithms. The results from histopathology were considered as the ground truth. Only data from MR spectroscopy were used for feature extraction.

Materials and methods
Retrospective data from 28 patients diagnosed with glioma in the neurosurgery department of our institute were considered for the study. Patients who were graded as WHO Grade II (low-grade astrocytoma), Grade III (anaplastic astrocytoma) or Grade IV (glioblastoma multiforme) on histopathological examination were included. There were 16 patients diagnosed with low-grade astrocytoma, six with anaplastic astrocytoma and six with glioblastoma multiforme. Grade II astrocytoma cases were considered benign, and Grade III and Grade IV cases were considered malignant.

Table 1. Performance of the four machine learning algorithms.

Algorithm                  Sensitivity (%)  Specificity (%)  PPV (%)  NPV (%)  AUC (95% CI)
Multilayer perceptrons     75.0             89.8             84.4     83.0     0.876 ± 0.08
Support vector machine     83.3             75.5             71.4     86.0     0.794 ± 0.10
Random forest              80.6             85.7             80.6     85.7     0.911 ± 0.06
Locally weighted learning  86.1             75.5             72.1     88.1     0.817 ± 0.09

Images were acquired on a 1.5 Tesla Siemens MRI machine using multivoxel spectroscopy (chemical shift imaging) at TE 135 ms. The MR spectrum was acquired at different areas of the tumor. Since data were available for only 28 patients, the analysis was done using multiple spectra acquired at different areas of the tumor in each patient. A minimum of two and a maximum of four spectra were selected per patient, depending on the availability of data. The maximum was fixed at four to avoid bias toward a single patient or group of patients. Overall there were 85 cases (spectra), of which 49 were benign and 36 were malignant.

The peak integral values of four metabolite peaks, creatine (two resonant peaks at 3.0 and 3.9 ppm), choline (3.2 ppm) and N-acetyl aspartate (NAA; 2.0 ppm), as computed by the Siemens MRI software Syngo, were used for the analysis. Since the metabolite peak integral values varied widely across patients, ratios of the peak integral values were taken as features for the classification. The ratios taken were NAA/Cr, Cho/Cr, Cho/NAA, NAA/Cr2 and Cho/Cr2, where NAA is N-acetyl aspartate (2.0 ppm), Cho is choline (3.2 ppm), Cr is creatine (3.0 ppm) and Cr2 is creatine (3.9 ppm).

The classification was done using the WEKA open source software.12 Four machine learning algorithms were used: multilayer perceptrons, support vector machines, random forests and locally weighted learning. For multilayer perceptrons, the back-propagation algorithm was used for learning and a sigmoid activation function was used for the nodes. The number of hidden layers was fixed at one. For support vector machines, the polynomial kernel was used.
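As a rough illustration of the feature construction described above, the following Python sketch computes the five ratios from the four peak integrals of one spectrum. The function name, the dictionary keys and the numeric values are hypothetical; in the study the peak integrals came from Syngo and the classification was run in WEKA.

```python
from typing import Dict

# The five metabolite ratios used as features, as (numerator, denominator).
# "Cr" is the creatine peak at 3.0 ppm and "Cr2" the creatine peak at 3.9 ppm.
RATIOS = [("NAA", "Cr"), ("Cho", "Cr"), ("Cho", "NAA"), ("NAA", "Cr2"), ("Cho", "Cr2")]


def metabolite_ratios(peaks: Dict[str, float]) -> Dict[str, float]:
    """Turn the four peak integrals of one spectrum into the five ratio features."""
    features = {}
    for num, den in RATIOS:
        if peaks[den] == 0:
            raise ValueError(f"peak integral for {den} is zero")
        features[f"{num}/{den}"] = peaks[num] / peaks[den]
    return features


# Hypothetical peak integrals for one spectrum (arbitrary units, not study data).
example = {"NAA": 12.0, "Cho": 9.0, "Cr": 6.0, "Cr2": 4.0}
print(metabolite_ratios(example))
```

Using ratios rather than raw integrals removes the per-patient scale of the spectrum, which is the motivation the text gives for this choice.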
A fivefold cross-validation scheme was used to validate the analysis. The available data were divided into five sets; in each iteration, four sets were used for training and the fifth for testing. The testing set was then swapped with one of the training sets for the next iteration, and so on until every set had been used as the testing set.
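The fold rotation described above can be sketched in a few lines of Python. This is a minimal illustration, not the WEKA procedure used in the study: the function name and seed are hypothetical, and the sketch shuffles individual spectra without stratifying by class or keeping spectra from one patient in the same fold, since the source does not say how that was handled.

```python
import random
from typing import List, Tuple


def k_fold_splits(n_cases: int, k: int = 5, seed: int = 0) -> List[Tuple[List[int], List[int]]]:
    """Return k (train_indices, test_indices) pairs that cover all cases."""
    indices = list(range(n_cases))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        # Every fold except the i-th one is used for training.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


# 85 spectra, as in the study, give five test folds of 17 spectra each.
splits = k_fold_splits(85, k=5)
for train, test in splits:
    print(len(train), len(test))
```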

Results
The four classification algorithms were evaluated by sensitivity/specificity analysis based on the error rate. The error rate can be described in terms of true and false positives and true and false negatives, as follows:

True Positive (TP): The test result is ‘malignant’ in the presence of malignancy.
True Negative (TN): The test result is ‘benign’ in the absence of malignancy.
False Positive (FP): The test result is ‘malignant’ in the absence of malignancy.
False Negative (FN): The test result is ‘benign’ in the presence of malignancy.

The performances of the different machine learning algorithms are summarized in Table 1.
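The four counts defined above determine the measures reported in Table 1. The Python sketch below shows the standard formulas. The counts used in the example are an assumption: they are not reported in the paper but were back-calculated from the multilayer perceptron percentages and the class totals of 36 malignant and 49 benign spectra.

```python
from typing import Dict


def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Sensitivity, specificity, PPV and NPV (in percent) from confusion counts."""
    return {
        "sensitivity": 100.0 * tp / (tp + fn),  # malignant cases correctly flagged
        "specificity": 100.0 * tn / (tn + fp),  # benign cases correctly cleared
        "ppv": 100.0 * tp / (tp + fp),          # 'malignant' results that are right
        "npv": 100.0 * tn / (tn + fn),          # 'benign' results that are right
    }


# Assumed counts for the multilayer perceptron (inferred, not reported):
# 36 malignant spectra -> TP = 27, FN = 9; 49 benign spectra -> TN = 44, FP = 5.
mlp = diagnostic_metrics(tp=27, fn=9, tn=44, fp=5)
print({name: round(value, 1) for name, value in mlp.items()})
```

With these assumed counts the four rounded values agree with the multilayer perceptron row of Table 1.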

Figures 3–6 show the receiver operating characteristic (ROC) curves for the four schemes. The areas under the ROC curve (AUC), with 95% confidence intervals (CI), are as follows: multilayer perceptrons (AUC = 0.876, 95% CI = [0.795, 0.956]), support vector machine (AUC = 0.794, 95% CI = [0.693, 0.894]), random forest (AUC = 0.911, 95% CI = [0.842, 0.979]) and locally weighted learning (AUC = 0.817, 95% CI = [0.721, 0.912]). Sensitivity was best for locally weighted learning (86.1%) and specificity for multilayer perceptrons (89.8%).
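The source does not state how the 95% confidence intervals were computed. One common choice, shown here purely as an assumption, is the normal approximation of Hanley and McNeil, which needs only the AUC and the number of positive (malignant) and negative (benign) cases. The function name is hypothetical.

```python
import math
from typing import Tuple


def hanley_mcneil_ci(auc: float, n_pos: int, n_neg: int, z: float = 1.96) -> Tuple[float, float]:
    """Approximate confidence interval for an AUC (Hanley and McNeil approximation)."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    half_width = z * math.sqrt(variance)
    return auc - half_width, auc + half_width


# Multilayer perceptrons: AUC = 0.876 on 36 malignant and 49 benign spectra.
low, high = hanley_mcneil_ci(0.876, n_pos=36, n_neg=49)
print(f"{low:.3f} to {high:.3f}")
```

For the multilayer perceptron this approximation gives an interval close to the reported [0.795, 0.956], but that agreement does not establish which method the authors actually used.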

Discussion
Magnetic resonance spectroscopy is an analytical technique used for studying the metabolic changes in brain tissue that accompany conditions such as brain tumor, stroke, Alzheimer’s disease, seizures and depression. In healthy tissue, metabolites are present in steady-state concentrations typical of that specific tissue. Metabolite concentrations may shift due to stress, functional disturbance or illness. These changes in concentration are detectable with MR spectroscopy. Diagnosis using MR spectroscopy can substantially improve the non-invasive categorization of human brain tumors, especially gliomas.13

The major peaks observed in 1.5 Tesla MRS of brain tumors include creatine (two resonant peaks at 3.0 and 3.9 ppm), choline (3.2 ppm) and N-acetyl aspartate (NAA; 2.0 ppm). NAA is considered a neuronal marker since it is synthesized and stored mainly in the neurons. The concentration of NAA is an indicator of neuronal density and viability.14 A reduced NAA concentration is indicative of the presence of brain tumor.15 NAA concentration levels are also useful for distinguishing primary brain tumors from metastases and for grading gliomas. It has been reported that glioblastoma, a malignant brain tumor of glial origin, has very low NAA concentration levels.16 Choline, another metabolite in the brain, is a marker of cell membrane integrity and density. Low-grade gliomas are characterized by relatively low choline concentrations. An elevation in the choline level is indicative of progression in glioma grade.17 Creatine is relatively stable in the brain and is therefore used as a reference metabolite. However, creatine levels may drop in tumorous tissue because of its higher metabolic demands.

Since the absolute values of the metabolites at different regions vary between patients, it is standard practice to calculate metabolite concentration ratios for comparison. In our analysis the metabolite ratios were taken as features for the classification algorithms.

Table 2. Feature ranking based on information gain evaluation.

Rank  Feature  Information gain
1     NAA/Cr2  0.408
2     Cho/NAA  0.211
3     Cho/Cr2  0.2
4     NAA/Cr   0.131
5     Cho/Cr   0.126

A ranking of the features based on information gain was performed on the whole data set. Table 2 shows the relative ranks and the information gains of the five features used for the classification. NAA/Cr2 was found to be the most discriminating feature between benign and malignant tumors. Lower values of NAA/Cr2 were invariably associated with malignant tumors, while larger values of NAA/Cr2 were mostly associated with benign tumors. This result matches well with the literature, where a malignant tumor is associated with a drop in NAA levels. Sahin et al.18 reported that the Cho/Cr ratio was a clear marker for tumor grading: lower values of Cho/Cr (less than 1.3) were indicative of low-grade gliomas, while higher values of Cho/Cr (greater than 1.3) were indicative of progression to higher grades. In our analysis, Cho/Cr was not found to be a good discriminative feature and was ranked fifth in the information gain evaluation.
Cho/NAA was found to be a better discriminating feature and was ranked second in the information gain evaluation.

Our results are comparable with those obtained by other investigators who have undertaken tumor classification using features extracted from MR spectroscopy. Zeng et al.,19 in their work using metabolite ratios as features, conducted ROC analysis to arrive at thresholds for each feature. They obtained a sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) of 84.00%, 83.33%, 91.30% and 71.43%, respectively, using a threshold value of 2.04 for Cho/Cr. A threshold value of 2.20 for the Cho/NAA ratio resulted in a sensitivity, specificity, PPV and NPV of 88.00%, 66.67%, 84.62% and 72.73%, respectively. Sahin et al.18 used features extracted from MR spectroscopy and perfusion-weighted imaging, arrived at thresholds for the features using ROC analysis, and reported a specificity of 100% and a sensitivity of 71.4%.

A major limitation of our study is that, because of the paucity of data, we combined Grade III and Grade IV gliomas under a single heading, ‘malignant’ gliomas. We acknowledge that these two grades may have distinctive features of their own, and that combining the two may not be fully justified.

This work was done as part of a larger study in which features from different MR sequences, such as MR spectroscopy, diffusion and perfusion MRI, are to be extracted and integrated to arrive at a classification scheme for gliomas. The work presented here uses features extracted from MR spectroscopy alone. Feature extraction from other sequences and their integration into a classification scheme is proposed for the future.
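The information gain ranking reported in Table 2 measures how much knowing a feature reduces the uncertainty (entropy) about the class label. The Python sketch below illustrates the idea for one feature split at a single threshold. The threshold, the function names and the example values are hypothetical, and the paper does not describe how the continuous ratios were discretized for the evaluation, so this is a simplification rather than the procedure used in the study.

```python
import math
from collections import Counter
from typing import Sequence


def entropy(labels: Sequence[str]) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(values: Sequence[float], labels: Sequence[str], threshold: float) -> float:
    """Reduction in label entropy after splitting one feature at a threshold."""
    low = [lab for v, lab in zip(values, labels) if v <= threshold]
    high = [lab for v, lab in zip(values, labels) if v > threshold]
    # Weighted entropy of the two groups produced by the split.
    remainder = sum(len(part) / len(labels) * entropy(part) for part in (low, high) if part)
    return entropy(labels) - remainder


# Hypothetical NAA/Cr2 values (not study data): low ratios malignant, high ratios benign.
ratios = [0.4, 0.6, 0.7, 1.5, 1.8, 2.1]
classes = ["malignant", "malignant", "malignant", "benign", "benign", "benign"]
print(information_gain(ratios, classes, threshold=1.0))
```

A feature that separates the classes perfectly recovers the full label entropy, while a feature whose split leaves both groups as mixed as the whole data set has a gain of zero.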

Conclusion
The paper presents an efficient method of classifying gliomas into benign and malignant types using features extracted from MR spectroscopy. The performance of different machine learning algorithms in the classification of gliomas is promising. An area under the ROC curve in excess of 0.80 was obtained with three of the four algorithms used. The results presented here were obtained using features extracted from MR spectroscopy alone. By combining these features with features extracted from other MR modalities, it might be possible to achieve an even better performance. This method of automatic detection and classification of gliomas into benign and malignant types holds promise of supplementing or even replacing conventional invasive methods of tumor grading such as histopathological analysis.

Funding
This work was financially supported by the Kerala State Council for Science, Technology and Environment.

Conflict of interest
The authors declare no conflict of interest.

References
1. Louis DN, Ohgaki H, Wiestler OD, et al. The 2007 WHO Classification of tumours of the central nervous system. Acta Neuropathol 2007; 114: 97–109.
2. Doi K. Computer-aided diagnosis in medical imaging: Historical review, current status and future potential. Comput Med Imaging Graph 2007; 31: 198–211.
3. Kaneko M, Eguchi K, Ohmatsu H, et al. Peripheral lung cancer: Screening and detection with low-dose spiral CT versus radiography. Radiology 1996; 201: 798–802.
4. Swensen SJ, Jett JR, Sloan JA, et al. Screening for lung cancer with low-dose spiral computed tomography. Am J Respir Crit Care Med 2002; 165: 508–513.
5. Schoepf UJ, Schneider AC, Das M, et al. Pulmonary embolism: Computer-aided detection at multidetector row spiral computed tomography. J Thorac Imaging 2007; 22: 319–323.
6. Summers RM, Yao J, Pickhardt PJ, et al. Computed tomographic virtual colonoscopy: Computer aided polyp detection in a screening population. Gastroenterology 2005; 129: 1832–1844.
7. Cheng HD, Cai XP, Chen XW, et al. Computer-aided detection and classification of microcalcifications in mammograms: A survey. Pattern Recogn 2003; 36: 2967–2991.
8. El-Dahshan ESA, Mohsen HM, Revett K, et al. Computer-aided diagnosis of human brain tumor through MRI: A survey and a new algorithm. Expert Syst Appl 2014; 41: 5526–5545.
9. Quratul Ain M, Jaffar A and Choi T-S. Fuzzy anisotropic diffusion based segmentation and texture based ensemble classification of brain tumor. Appl Soft Comput 2014; 21: 330–340.
10. Svolos P, Tsolaki E, Kapsalaki E, et al. Investigating brain tumor differentiation with diffusion and perfusion metrics at 3T MRI using pattern recognition techniques. Magn Reson Imaging 2013; 31: 1567–1577.
11. Al-Okaili RN, Krejza J, Wang S, et al. Advanced MR imaging techniques in the diagnosis of intraaxial brain tumors in adults. Radiographics 2006; 26(Suppl 1): S173–S189.
12. Hall M, Frank E, Holmes G, et al. The WEKA data mining software: An update. ACM SIGKDD Explorations Newsletter 2009; 11: 10–18.
13. Fan G. Magnetic resonance spectroscopy and gliomas. Cancer Imaging 2006; 6: 113–115.
14. Urenjak J, Williams SR, Gadian DG, et al. Specific expression of N-acetylaspartate in neurons, oligodendrocyte-type-2 astrocyte progenitors, and immature oligodendrocytes in vitro. J Neurochem 1992; 59: 55–61.
15. Warren KE, Frank JA, Black JL, et al. Proton magnetic resonance spectroscopic imaging in children with recurrent primary brain tumors. J Clin Oncol 2000; 18: 1020–1026.
16. Soares DP and Law M. Magnetic resonance spectroscopy of the brain: Review of metabolites and clinical applications. Clin Radiol 2009; 64: 12–21.
17. Shimizu H, Kumabe T, Shirane R, et al. Correlation between choline level measured by proton MR spectroscopy and Ki-67 labeling index in gliomas. Am J Neuroradiol 2000; 21: 659–665.
18. Sahin N, Melhem ER and Wang S. Advanced MR imaging techniques in the evaluation of nonenhancing gliomas: Perfusion-weighted imaging compared with proton magnetic resonance spectroscopy and tumor grade. Neuroradiol J 2013; 26: 531–541.
19. Zeng Q, Liu H, Zhang K, et al. Noninvasive evaluation of cerebral glioma grade by using multivoxel 3D proton MR spectroscopy. Magn Reson Imaging 2011; 29: 25–31.
