Lasers Med Sci DOI 10.1007/s10103-013-1447-6
Cervical cancer detection based on serum sample Raman spectroscopy
José Luis González-Solís · Juan Carlos Martínez-Espinosa · Luis Adolfo Torres-González · Adriana Aguilar-Lemarroy · Luis Felipe Jave-Suárez · Pascual Palomares-Anda
Received: 1 June 2013 / Accepted: 18 September 2013 © Springer-Verlag London 2013
J. L. González-Solís
Biophysics and Biomedical Sciences Laboratory, Centro Universitario de los Lagos, Universidad de Guadalajara, Enrique Díaz de León 1144, Paseo de la Montaña, 47460 Lagos de Moreno, Jalisco, Mexico
e-mail: [email protected]

J. C. Martínez-Espinosa
Mathematics and Biotechnology Academy, Instituto Politécnico Nacional-UPIIG, 36275 Silao de la Victoria, Mexico

L. A. Torres-González
Departamento de Ciencias Básicas, Universidad Iberoamericana León, Blvd. Jorge Vértiz Campero, Fracciones Canadá de Alfaro, 37238 León, Guanajuato, Mexico

L. F. Jave-Suárez · A. Aguilar-Lemarroy
Centro de Investigación Biomédica de Occidente, Instituto Mexicano del Seguro Social, Sierra Mojada 800, Col. Independencia, 44340 Guadalajara, Jalisco, Mexico

P. Palomares-Anda
Instituto Mexicano del Seguro Social, Av. P. los Insurgentes S/N, Col. los Paraísos, 37320 León, Guanajuato, Mexico

Abstract The use of Raman spectroscopy to analyze the biochemical composition of serum samples and hence distinguish between normal and cervical cancer serum samples was investigated. The serum samples were obtained from 19 patients clinically diagnosed with cervical cancer, 3 patients with precancer, and 20 healthy volunteer controls. The imprint was placed under an Olympus microscope, and several points were chosen for Raman measurement. All spectra were collected with a Horiba Jobin-Yvon LabRAM HR800 Raman spectrometer using a laser of 830-nm wavelength and 17-mW power irradiation. Raw spectra were processed by baseline correction, smoothing, and normalization to remove noise, fluorescence, and shot noise and were then analyzed using principal component analysis (PCA). The control serum spectrum showed the presence of higher amounts of carotenoids, indicated by peaks at 1,002, 1,160, and 1,523 cm−1, and intense peaks associated with protein components at 754, 853, 938, 1,002, 1,300–1,345, 1,447, 1,523, 1,550, 1,620, and 1,654 cm−1. The Raman bands assigned to glutathione (446, 828, and 1,404 cm−1) and tryptophan (509, 1,208, 1,556, 1,603, and 1,620 cm−1) were higher in cervical cancer than in the control samples, suggesting that their presence may also play a role in cervical cancer. Furthermore, weak bands in the control samples attributed to tryptophan (545, 760, and 1,174 cm−1) and amide III (1,234–1,290 cm−1) seem to disappear and decrease in the cervical cancer samples, respectively. It is shown that the serum samples from patients with cervical cancer and from the control group can be discriminated with high sensitivity and specificity when the multivariate statistical method of PCA is applied to the Raman spectra. PCA allowed us to define the wavelength differences between the spectral bands of the control and cervical cancer groups, confirming that the main molecular differences between the control and cervical cancer samples involved glutathione, tryptophan, β carotene, and amide III. The preliminary results suggest that Raman spectroscopy could be a highly effective technique with strong potential to support current techniques such as the Papanicolaou smear by reducing the number of these tests; moreover, with the construction of a data library comprising a large number of cervical cancer and control Raman spectra obtained from a wide range of healthy and cervical cancer populations, the Raman–PCA technique could become a new technique for noninvasive real-time diagnosis of cervical cancer from serum samples.
Keywords Cervical cancer · Serum · Raman spectroscopy · Principal component analysis
Introduction

According to the World Health Organization (WHO) and Surveillance Epidemiology and End Results (SEER), cervical cancer is the second most common cancer in women worldwide, with about 500,000 new cases and 250,000 deaths each year. Although the median age at diagnosis for cancer of the cervix uteri is 49 years, women as young as 17 can contract the disease. Almost 80 % of cases occur in low- and middle-income countries. Cervical cancer is the most common cancer for women in Central America and Southern Africa. The Caribbean, other parts of Africa, South America, and South Eastern Asia also have very high incidences of this disease. Human papillomavirus (HPV) infection is the major risk factor for development of cervical cancer; nevertheless, not all women with HPV infection develop cervical cancer. Diagnostic methods have therefore been developed in an attempt to overcome the innate limitations of conventional cytology. Screening techniques for cervical cancer include conventional exfoliative cervicovaginal cytology, i.e., the cervical Papanicolaou (Pap) smear, fluid sampling techniques with automated thin-layer preparation (liquid-based cytology), automated cervical screening techniques, neuromedical systems, HPV testing, polar probe, laser-induced fluorescence, visual inspection of the cervix after applying Lugol’s iodine (VILI) or acetic acid (VIA), speculoscopy, and cervicography. If further testing after a Pap smear is needed, a colposcopy is usually performed. Results for colposcopy are reported by the pathologist as cervical intraepithelial neoplasia (CIN), carcinoma in situ (CIS), or squamous cervical carcinoma (SCC). CIN precancerous changes are categorized according to severity as CIN I, CIN II, and CIN III. CIN III is considered the same as CIS or stage 0 cervical cancer. The cancer has not yet invaded deeper tissues.
However, if it is not surgically removed, there is a high chance that it will progress to invasive cervical cancer (SCC). The screening strategies mentioned above, though applicable to the developed world, may not be cost-effective enough for widespread application in developing countries. Currently, cervical cytology is widely regarded as the gold standard for cervical cancer screening in all developed countries. It is, however, not feasible to implement a systematic cytology-based screening program in several countries, mainly because of severe restrictions on the availability of infrastructure, resources, and funding. There is therefore a need to develop low-cost screening strategies for cervical cancer. This will necessarily involve the use of a very simple technique that can be easily taught to and
practiced by paramedical personnel. Such techniques will need to be faster, less invasive, and cost-effective while retaining adequate sensitivity and specificity to perform as practical screening techniques. Raman spectroscopy is fast emerging as a promising alternative in biology and medicine, including cancer diagnosis. It is a nondestructive analytical technique that provides fingerprint spectra with the spatial resolution of an optical microscope and with almost no sample preparation. Raman spectroscopy is based on the measurement of the vibrational energy levels of chemical bonds by measuring the inelastically scattered light following excitation. Biologically associated molecules such as nucleic acids, proteins, lipids, and carbohydrates all generate strong signals in Raman spectra. Therefore, the Raman spectroscopic method can be used to generate whole-cell fingerprints for the differentiation of biological samples. Because each molecule has unique vibrations, the Raman spectrum of a tissue consists of a series of peaks or bands characteristic of the biochemical composition of the tissue. Many Raman spectroscopic studies performed on adults suggest that it is possible to differentiate normal, benign, premalignant, and malignant lesions in the breast, brain, gastrointestinal tract, and cervix. In addition, these studies have shown Raman spectroscopy to be a real-time in vivo tool that can differentiate normal and diseased tissue. Such a device can guide surgical resections, as recently shown in in vivo testing in adults [8, 9], reduce the amount of time needed for pathologic examination by frozen section or routine histology, and help reduce operation time and cost. Recently, multivariate analysis has been applied to Raman spectroscopy to classify epithelial precancers and cancers.
In particular, principal component analysis (PCA) has been used to differentiate between epithelial precancers and cancers, to detect breast cancer in serum samples, and to detect and monitor leukemia in patients under chemotherapy treatment. The goal of this research is to show that serum samples from patients with cervical cancer and from the control group can be discriminated when PCA is applied to their Raman spectra. In addition, PCA allows the wavelength differences between the spectral bands of the control and cervical cancer spectra to be defined. The preliminary results suggest that Raman spectroscopy could be a highly effective technique with strong potential to support current techniques such as the Pap smear by reducing the number of these tests; moreover, with the construction of a data library comprising a large number of cervical cancer and control Raman spectra obtained from a wide range of healthy and cervical cancer populations, the Raman–PCA technique could become a new technique for noninvasive real-time diagnosis of cervical cancer from serum samples. To the
best of our knowledge, this is the first report of preliminary results evaluating the usefulness of Raman spectroscopy in the diagnosis of cervical cancer using serum samples.
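The inelastic scattering described above is reported as a Raman shift in wavenumbers, i.e., the difference between the reciprocal wavelengths of the exciting and scattered light. As a rough illustration, the sketch below converts a Raman shift into the wavelength of the Stokes-scattered light for the 830-nm excitation quoted in the abstract. The function name and the choice of example shifts are ours and are not part of the original study.

```python
def scattered_wavelength_nm(excitation_nm: float, shift_cm1: float) -> float:
    """Wavelength (nm) of Stokes-scattered light for a given Raman shift (cm^-1)."""
    excitation_cm1 = 1e7 / excitation_nm         # nm -> absolute wavenumber in cm^-1
    scattered_cm1 = excitation_cm1 - shift_cm1   # Stokes light loses energy to the vibration
    return 1e7 / scattered_cm1                   # back to nm


EXCITATION_NM = 830.0  # laser wavelength quoted in the abstract

# Illustrative shifts: the ends of a 400-1,800 cm^-1 window and the 1,002 cm^-1 band.
for shift in (400.0, 1002.0, 1800.0):
    wavelength = scattered_wavelength_nm(EXCITATION_NM, shift)
    print(f"{shift:6.0f} cm^-1 -> {wavelength:.1f} nm")
```

Under these assumptions, shifts of 400 and 1,800 cm−1 correspond to scattered light at roughly 858.5 and 975.8 nm, so the whole fingerprint region lies in the near infrared.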
Methodology

Samples

Fresh serum samples were obtained from 3 and 19 patients who were clinically diagnosed by the pathologist with CIN I and SCC, respectively, and from 20 healthy volunteer controls. All patients were from the western central region of Mexico and had similar ethnic and socioeconomic backgrounds. The age at diagnosis for the three groups was between 20 and 55 years. None of the patients were under chemotherapy treatment. The cervical cancer and control serum samples were obtained through the Human Ethical Committees of Mexican hospitals (Instituto Mexicano del Seguro Social in the cities of Guadalajara and León). Written consent was obtained from the subjects, and the study was conducted according to the Declaration of Helsinki. Blood samples were obtained between 7:00 and 9:00 A.M. and were centrifuged to obtain the serum. All spectra were obtained on the same day. The samples were frozen at −189 °C in a liquid nitrogen dewar before Raman spectroscopy analysis was performed. To ensure statistically sound sampling, at least five spectra were collected from different regions of each serum sample. A total of 288 spectra were collected: 138 spectra from 20 control patients, 18 spectra from 3 CIN I patients, and 132 spectra from 19 cervical cancer patients. Details of the samples used in the study are shown in Table 1.

All spectra were collected with a Jobin-Yvon LabRAM HR800 Raman spectrometer using a laser of 830-nm wavelength. A drop of serum was placed onto an aluminum substrate, which was examined under an Olympus microscope coupled to the Raman system, and several points were chosen for Raman measurement with an exposure of 20 to 40 s. The laser beam was focused on the surface of the sample with a 50× objective. The radius of the beam was 1.0 μm, and the laser power irradiation over the samples was 17 mW. The Raman system was calibrated with a silicon semiconductor using the Raman peak at 520 cm−1. All spectra were taken in the region from 400 to 1,800 cm−1, with a resolution of 0.6 cm−1.

Data analysis

The average number of Raman spectra taken per patient in the control and cervical cancer groups was 6.9 and 6.8, respectively. Raw spectra were processed by baseline correction, smoothing, and normalization to remove noise, sample fluorescence, and shot noise from cosmic rays and were then analyzed using PCA. After the initial processing, the mean spectrum of each group was calculated. The mean spectra were analyzed to obtain general biochemical information for each data group [14, 15]. Sensitivity and specificity were used to judge diagnostic ability:

$$\text{sensitivity} = \frac{TP}{TP + FN}$$

and

$$\text{specificity} = \frac{TN}{TN + FP},$$

where TP is true positive, FN is false negative, TN is true negative, and FP is false positive. The main information obtained from the PCA is described by the first principal components. By plotting the loading vectors as a function of the wave number, the position of relevant differences between the control and cervical cancer groups could be determined. PCA and all the algorithms for data analysis were implemented in MATLAB commercial software.
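The text names three preprocessing steps (baseline correction, smoothing, and normalization) but does not state which algorithms were used. The following is a minimal sketch of one possible pipeline, assuming a straight-line baseline through the end points of the spectrum, a centered moving-average filter, and normalization to the maximum intensity. All three choices, the function names, and the synthetic spectrum are illustrative assumptions and should not be read as the authors' MATLAB implementation.

```python
import math


def subtract_linear_baseline(x, y):
    """Subtract the straight line joining the first and last points (assumed baseline model)."""
    slope = (y[-1] - y[0]) / (x[-1] - x[0])
    return [yi - (y[0] + slope * (xi - x[0])) for xi, yi in zip(x, y)]


def moving_average(y, window=5):
    """Centered moving average; the window shrinks near the two ends of the spectrum."""
    half = window // 2
    smoothed = []
    for i in range(len(y)):
        lo, hi = max(0, i - half), min(len(y), i + half + 1)
        smoothed.append(sum(y[lo:hi]) / (hi - lo))
    return smoothed


def normalize_to_max(y):
    """Scale the spectrum so that its largest absolute intensity equals 1."""
    peak = max(abs(v) for v in y)
    return [v / peak for v in y] if peak else list(y)


def preprocess(x, y, window=5):
    """Baseline correction, then smoothing, then normalization (the order given in the text)."""
    return normalize_to_max(moving_average(subtract_linear_baseline(x, y), window))


# Synthetic spectrum: sloping background plus one Gaussian band at 1,002 cm^-1.
x = [400 + 2 * i for i in range(701)]  # 400 ... 1,800 cm^-1
y = [0.002 * xi + 5.0 + 3.0 * math.exp(-((xi - 1002) / 8.0) ** 2) for xi in x]

clean = preprocess(x, y)
peak_position = x[clean.index(max(clean))]
print(peak_position)
```

In practice a polynomial or asymmetric least-squares baseline and a Savitzky-Golay filter are common alternatives; the simple versions here only show the order of operations.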
We collected 288 spectra, as shown in Table 1, from 20 control, 3 CIN I, and 19 SCC serum samples. The mean Raman spectra of the control and SCC samples showed significant differences (Fig. 1). Spectral differences between the SCC and CIN I samples were not observed. The control serum spectrum (Fig. 1) showed the presence of higher amounts of carotenoids, indicated by peaks at 1,002, 1,160, and 1,523 cm−1, and intense peaks associated with protein components at 754, 853, 938, 1,002, 1,300–1,345, 1,447, 1,550, 1,620, and 1,654 cm−1. The major differences between the cervical cancer and control spectra were an increase in the intensity of the bands at 446 (glutathione), 566, 622 (Phe), 642 (Tyr), 695, 828 (glutathione), and 1,404 (glutathione) cm−1 in the cervical cancer spectrum and a decrease at 938 (protein components), 955 (CH2 rock), 1,028 (Phe), 1,063 (Phe), 1,083 (lipids), 1,103 (Phe),
Table 1 Details of the serum samples used in the study

Spectrum number    Serum sample    Number of cases
1–132              SCC             19
133–150            CIN I           3
151–288            Control         20
Table 2 Main bands observed in the control and cervical cancer serum spectra and the corresponding assignment of biomolecules
Fig. 1 Mean Raman spectra of the control and cervical cancer serum samples
1,126 (protein, lipids), 1,174 (Trp, Phe), 1,447 (lipids), and 1,234–1,282 (amide III) cm−1. Minor differences occur at 509 (Trp), 1,523 (β carotene), 1,556 (Trp), and 1,587 cm−1 (vibrational modes of backbone and amino acid residues of proteins). The bands at 545 (Trp), 714 (polysaccharides), 742 (phospholipid), and 760 (Trp) cm−1 seem to disappear in the cervical cancer serum sample. Table 2 shows the main bands observed in the control and cervical cancer spectra and the corresponding assignment of biomolecules.

PCA results

One of the major advantages of spectroscopic diagnosis is high objectivity. This is facilitated by the fact that the spectra are amenable to multivariate statistical tools such as PCA, artificial neural networks (ANN), and hierarchical cluster analysis (HCA). In the present study, the spectra of all three classes of samples, namely, control, CIN I, and cervical cancer (SCC), were pooled and analyzed by PCA to obtain discrimination among the classes. In PCA, large spectral data sets are decomposed into a small number of independent variations known as principal components (PCs), and the contributions of these components are known as scores. Scores of components are among the most widely used parameters for classification. The selected region of 1,400–1,800 cm−1 gave the best discrimination. Plots of the first three principal components are shown in Fig. 2. All 150 spectra from CIN I and SCC cases were correctly separated from the control serum samples (100 % sensitivity) (Fig. 2). Only 4 of the 138 spectra from the control serum samples were misclassified as cervical cancer (97.1 % specificity). Because the cluster of CIN I spectra is fully contained in the cluster of cancer spectra, the CIN I samples could not be discriminated when PCA was applied.
Bands (cm−1)    Assignment                                   Serum sample where the biomolecules appear
446             Glutathione                                  Cancer
509             Trp                                          Cancer, control
545             Trp                                          Control
566             -                                            Cancer
622             Phe                                          Cancer
642             Tyr                                          Cancer
661             Glutathione                                  Cancer, control
695             -                                            Cancer, control
714             Polysaccharides                              Control
742             Phospholipid                                 Control
754             Protein                                      Cancer, control
760             Trp                                          Control
828             Glutathione                                  Cancer, control
853             Tyr                                          Cancer, control
875             Trp                                          Control
897             COC str                                      Cancer, control
938             Skeletal str α                               Control
955             CH2 rock                                     Control
1,002           Phe                                          Cancer, control
1,028           Phe                                          Control
1,063           Phe                                          Control
1,083           Phospholipids OPO and CC                     Control
1,103           Phe                                          Control
1,126           Protein, phospholipid CC str                 Cancer, control
1,160           β carotene                                   Cancer, control
1,174           Trp, Phe                                     Control
1,208           Trp                                          Cancer, control
1,230–1,282     Amide III                                    Control
1,300–1,345     Trp, α helix, phospholipids                  Cancer, control
1,404           Glutathione                                  Cancer
1,447           Phospholipid, CH scissor in CH2              Cancer, control
1,523           β carotene                                   Cancer, control
1,556           Trp                                          Cancer, control
1,587           Protein, Tyr                                 Cancer, control
1,603           Tyr, Phe                                     Cancer, control
1,620           Tyr, Trp C=C str                             Cancer, control
1,654           Proteins, amide I, α helix, phospholipids    Cancer, control
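The sensitivity and specificity quoted in the PCA results follow directly from the spectrum counts given there. The short check below applies the definitions from the data analysis section, taking the 150 CIN I and SCC spectra as positives and the 138 control spectra as negatives. The variable names are ours.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of positive (cancer) spectra classified as positive."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of negative (control) spectra classified as negative."""
    return tn / (tn + fp)


# Counts reported in the PCA results.
tp, fn = 150, 0        # all 150 CIN I + SCC spectra separated from the controls
fp = 4                 # control spectra misclassified as cervical cancer
tn = 138 - fp          # remaining control spectra

print(f"sensitivity = {100 * sensitivity(tp, fn):.1f} %")  # 100.0 %
print(f"specificity = {100 * specificity(tn, fp):.1f} %")  # 97.1 %
```

Note that these figures are computed per spectrum, not per patient, because several spectra were recorded from each serum sample.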
In this study, as we know a priori how many groups there are and which samples correspond to each group, we
Fig. 2 Scatter plot of the control, CIN I, and cervical cancer (SCC) serum samples
applied a multivariate technique, linear discriminant analysis (LDA), to our PCA result as a technique acting in a supervised manner. LDA identified the two most natural groups, and as observed in Fig. 2, these correspond to the two large groups of patients reported by the pathologist, one group containing all the blue points (control group) and another group containing all the green and red points (CIN I and SCC groups). However, LDA did not show clear discrimination between CIN I and SCC patients. PCA was also applied to discriminate between the Raman spectra of the serum from control and cervical cancer patients using cross-validation. In cross-validation, the data are split into two sets, a training set and a test set. In this approach, one sample (testing data) at a time was left out and PCA was applied after data reduction. Ten components for smoothed spectra without baseline correction and 12 components for smoothed spectra with baseline correction were considered for this analysis. In both cases, we were able to observe the same two large groups of spectra as obtained in Fig. 2. The sensitivity and specificity for data with smoothing and baseline correction and for smoothed data without baseline correction were 100 and 97 %, respectively. To bring out the differences in spectral profiles more clearly, the position of relevant spectral differences was computed by plotting the first principal component as a function of the wave number. According to custom, the principal differences between groups are represented by peaks with higher intensity. Nevertheless, several of these high peaks could represent natural biochemical differences among the control patients alone. In order to identify these natural differences, we plotted the first principal component versus the wave number for the 138 control spectra. Figure 3
shows control–control plots with the position of relevant differences between the control patients and control–cancer plots with the position of relevant differences between the control and cervical cancer patients using the first principal component, PC1. By discarding the most intense peaks that match between the control–control and control–cancer plots, we obtain the real biochemical differences between the control and cervical cancer serum samples. These differences appear at 451 (glutathione), 474, 520, 558, 566, 586, 622 (Phe), 642 (Tyr), 714 (polysaccharides), 735 (phospholipid), 760 (Trp), 828 (glutathione), 897, 943, 955, 1,028 (Phe), 1,053, 1,063 (Phe), 1,123 (protein), 1,149, 1,160 (β carotene), 1,167 (Trp), 1,274 (amide III), 1,342 (Trp), 1,404 (glutathione), 1,411, 1,447 (phospholipid), 1,481, 1,496, 1,523 (β carotene), 1,541, 1,588 (protein, Tyr), 1,602 (Tyr), 1,621 (Tyr, Trp), 1,654 (proteins), 1,703 (glutathione), and 1,715 cm−1. As Fig. 3 shows, this comparison offers an alternative way of viewing the intensity differences captured by the loading vectors of PC1.
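The comparison of control–control and control–cancer loadings described above can be sketched as follows. This is only an illustration under stated assumptions: the PC1 loading vector is obtained here by power iteration on the mean-centered data (the original analysis was done in MATLAB and its algorithm is not described), the six-channel spectra are synthetic, and the loading threshold of 0.3 used to call a channel a "peak" is an arbitrary choice of ours.

```python
import math
import random


def first_pc_loading(spectra, iterations=200, seed=0):
    """Approximate the PC1 loading vector of equal-length spectra by power iteration."""
    n, p = len(spectra), len(spectra[0])
    means = [sum(s[j] for s in spectra) / n for j in range(p)]
    centered = [[s[j] - means[j] for j in range(p)] for s in spectra]
    rng = random.Random(seed)
    v = [rng.uniform(-1.0, 1.0) for _ in range(p)]  # random start vector
    for _ in range(iterations):
        scores = [sum(row[j] * v[j] for j in range(p)) for row in centered]         # X v
        w = [sum(centered[i][j] * scores[i] for i in range(n)) for j in range(p)]   # X^T X v
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            break
        v = [c / norm for c in w]
    return v


def peak_channels(loading, threshold=0.3):
    """Channels whose absolute loading exceeds the (hypothetical) threshold."""
    return {j for j, c in enumerate(loading) if abs(c) > threshold}


def spectrum(ch1_offset, ch4_offset=0.0, channels=6):
    """Flat synthetic spectrum with adjustable intensity in channels 1 and 4."""
    s = [1.0] * channels
    s[1] += ch1_offset
    s[4] += ch4_offset
    return s


# Channel 1 carries sample-to-sample variability (slightly unbalanced between the
# groups, so it leaks into the pooled PC1); channel 4 is shifted only in "cancer".
controls = [spectrum(d) for d in (-0.6, -0.2, 0.2, 0.6)]
cancers = [spectrum(d, 1.0) for d in (-0.2, 0.2, 0.6, 1.0)]

control_control = peak_channels(first_pc_loading(controls))
control_cancer = peak_channels(first_pc_loading(controls + cancers))

# Discard peaks that also appear among the controls alone.
real_differences = control_cancer - control_control
print(sorted(real_differences))
```

In this toy data set, channel 1 shows up in both loading vectors and is therefore discarded, leaving channel 4 as the only difference attributed to the disease group. With real spectra the channels would be wavenumber bins, and the threshold would need to be chosen with care.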
Discussion

Our preliminary study suggests that Raman spectroscopy can differentiate the serum samples from cervical cancer and control patients with high sensitivity and specificity. With further construction of a large database, analysis, and hardware development, this technique has the potential to be a noninvasive real-time diagnostic tool for classifying the serum samples of cervical cancer patients. It requires no sample preparation and provides objective, specific, and fast results. Carotenoids have been shown to inhibit cancerous changes in several organs including the skin, mammary gland, lung, liver, and colon. In our study, PCA showed that two β carotene-related peaks differed between the control and cervical cancer groups (1,156 and 1,523 cm−1) (Fig. 3). This may also be confirmed in Fig. 1 by observing the presence of slightly higher amounts of carotenoids, indicated by peaks at 1,002, 1,160, and 1,523 cm−1, in the control serum samples. The Raman bands assigned to glutathione (446, 828, and 1,404 cm−1) and tryptophan (509, 1,208, 1,556, 1,603, and 1,620 cm−1) were higher in cervical cancer than in the control serum samples (Fig. 1), suggesting that their presence may also play a role in cervical cancer. Furthermore, weak bands in the control samples attributed to tryptophan (545, 760, and 1,174 cm−1) and amide III (1,234–1,290 cm−1) seem to disappear and decrease in the cervical cancer samples, respectively. Acquisition of spectral profiles from different serum samples is a necessary step to create a library of molecular fingerprints relevant to cancer patients. Collection of a large
Fig. 3 Plots of the first principal component as a function of the wave number. By discarding the most intense peaks matching between the control–control and control–cancer plots, we obtain real biochemical differences among the control and cervical cancer serum samples
amount of these types of data from different institutions will help in designing and optimizing a probe that can collect data in patients. With a larger control database, our results suggest that Raman spectroscopy and PCA could be tools with the potential to monitor cancer patients under chemotherapy treatment by offering a faster alternative technique and reducing subjectivity and human error. By following the biochemical changes throughout the chemotherapy treatment, the improvement of the cancer patient is observed when the cluster of the patient's spectra approaches the cluster of the control samples.
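One simple way to quantify how closely a patient's spectra approach the control cluster is to track the distance between cluster centroids in principal component score space. The sketch below assumes three-component scores and uses invented numbers for three hypothetical measurement sessions; neither the data nor the choice of Euclidean centroid distance comes from the study.

```python
import math


def centroid(points):
    """Mean position of a cluster of PC-score tuples."""
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(len(points[0])))


# Hypothetical (PC1, PC2, PC3) scores; values are made up for illustration.
control_scores = [(-1.0, 0.1, 0.0), (-1.2, -0.1, 0.1), (-0.8, 0.0, -0.1)]
patient_sessions = {
    "session 1": [(1.0, 0.2, 0.0), (1.2, 0.0, 0.1)],
    "session 2": [(0.2, 0.1, 0.0), (0.0, 0.1, 0.0)],
    "session 3": [(-0.7, 0.0, 0.0), (-0.9, 0.1, 0.0)],
}

control_center = centroid(control_scores)
distances = [
    math.dist(centroid(scores), control_center)
    for scores in patient_sessions.values()
]

for name, distance in zip(patient_sessions, distances):
    print(f"{name}: distance to control centroid = {distance:.2f}")
```

A steadily decreasing distance would correspond to the patient's cluster moving toward the control cluster, which is the behavior the text associates with improvement.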
Conclusion

Our preliminary results demonstrated that Raman spectroscopy and principal component analysis can be used to discriminate between the serum samples from cervical cancer patients and healthy controls with high sensitivity and specificity. The study confirmed that the main molecular differences involved glutathione, tryptophan, β carotene, and amide III. The presence of these biomolecules suggests that they may play an important role in the early detection of this cancer. Acquisition of spectral profiles from different serum samples is a necessary step to create a library of molecular fingerprints relevant to cancer patients. Collection of a large amount of these types of data from different institutions will help in designing and optimizing a probe that can collect data in patients. Raman spectroscopy could become a potentially useful clinical tool of support for current
techniques such as the Pap smear by reducing the number of these tests, and in the near future, it could be a technique for noninvasive real-time diagnosis of cancers based on probing changes at the molecular level. It requires no sample preparation and provides objective, specific, and fast results. This technique may also be a useful adjunct to other conventional techniques for guiding and directing cancer treatment.
Acknowledgments The authors wish to thank CONACYT for the financial support under grant number 45488. Also, we wish to acknowledge the financial support of the Research Network of CONACYT, Soft Condensed Matter.
References

1. Howlader N, Noone AM, Krapcho M, Garshell J, Neyman N, Altekruse SF, Kosary CL, Yu M, Ruhl J, Tatalovich Z, Cho H, Mariotto A, Lewis DR, Chen HS, Feuer EJ, Cronin KA (eds) (2013) SEER Cancer Statistics Review, 1975–2010. National Cancer Institute, Bethesda, MD. http://seer.cancer.gov/csr/1975_2010, based on November 2012 SEER data submission, posted to the SEER web site
2. Duraisamy K, Jaganathan KS, Bose JC (2011) Methods of detecting cervical cancer. Adv Biol Res 5(4):226–232
3. Choo-Smith LP, Edward MHG, Endtz HP et al (2002) Medical applications of Raman spectroscopy: from proof of principle to clinical implementation. Biopolymers 67:1–9
4. Chowdary MVP, Kalyan Kumar K, Kurien J, Mathew S, Murali Krishna C (2006) Discrimination of normal, benign, and malignant breast tissues by Raman spectroscopy. Biopolymers 83:556–569
5. Banerjee HN, Zhang L (2007) Deciphering the finger prints of brain cancer astrocytoma in comparison to astrocytes by using near infrared Raman spectroscopy. Mol Cell Biochem 295:237–240
6. Mahadevan-Jansen A, Mitchell MF, Ramanujam N, Malpica A, Thomsen S, Utzinger U, Richards-Kortum R (1998) Near-infrared Raman spectroscopy for in vitro detection of cervical precancers. Photochem Photobiol 68:123–132
7. Bohorfoush AG (2006) Tissue spectroscopy for gastrointestinal diseases. Endoscopy 28:372–380
8. Haka AS, Volynskaya Z, Gardecki J et al (2006) In vivo margin assessment during partial mastectomy breast surgery using Raman spectroscopy. Cancer Res 66:3317–3322
9. Rabah R, Weber R, Serhatkulu GK, Cao A, Dai H, Pandya A, Naik R, Auner G, Poulik J, Klein M (2008) Diagnosis of neuroblastoma and ganglioneuroma using Raman spectroscopy. J Pediatr Surg 43:171–176
10. Stone N, Kendall C et al (2002) Near-infrared Raman spectroscopy for the classification of epithelial pre-cancers and cancers. J Raman Spectrosc 33:564–573
11. Pichardo-Molina JL, Frausto-Reyes C, Barbosa-García O, Huerta-Franco R, González-Trujillo JL, Ramírez-Alvarado CA, Gutiérrez-Juárez G, Medina-Gutiérrez C (2006) Raman spectroscopy and multivariate analysis of serum samples from breast cancer patients. Lasers Med Sci 10103:432–438
12. González-Solís JL, Martínez-Espinosa JC, Salgado-Román JM, Palomares-Anda P (2013) Monitoring of chemotherapy leukemia treatment using Raman spectroscopy and principal component analysis. Lasers Med Sci (in submission)
13. Boelens HF, Eilers PH, Hankemeier T (2005) Sign constraints improve the detection of differences between complex spectral data sets: LC-IR as an example. Anal Chem 77(24):7998–8007
14. Stone N, Kendall C, Smith J et al (2004) Raman spectroscopy for identification of epithelial cancers. Faraday Discuss 126:141–157
15. De Gelder J, De Gussem K, Vandenabeele P, Moens L (2007) Reference database of Raman spectra of biological molecules. J Raman Spectrosc 38:1133–1147
16. Nogueira VG, Silveira L (2005) Raman spectroscopy study of atherosclerosis in human carotid artery. J Biomed Opt 10:031117
17. Hata TR, Scholz TA, Ermakov IV et al (2000) Non-invasive Raman spectroscopic detection of carotenoids in human skin. J Invest Dermatol 115:441–448