Applying Distance Histogram to retrieve 3D cardiac medical models
Leila C. C. Bergamasco, MS, Fátima L. S. Nunes, PhD.
Escola de Artes, Ciências e Humanidades - Universidade de São Paulo. São Paulo – Brazil

Abstract
Three-dimensional models are being extensively used in the medical area in order to improve clinical medical examinations and diagnosis. The Cardiology field deals with several types of image slices to compose the diagnosis. MRI (Magnetic Resonance Imaging) is a non-invasive technique to detect anomalies from internal images of the human body. It generates hundreds of images, which take the specialist a long time to analyze frame by frame, and the precision of the diagnosis can be affected. Many cardiac diseases can be identified through shape deformation, but systems intended to aid diagnosis usually identify shapes in two-dimensional (2D) images. Our aim is to apply a shape descriptor to retrieve three-dimensional cardiac models, obtained from a set of 2D slices, which were segmented and reconstructed from MRI images using their geometry information. Preliminary results show that shape deformation in 3D models can be a good indicator to detect Congestive Heart Failure, a very common heart disease.

1. Introduction
Content-based Image Retrieval (CBIR) uses information from the content of a given image as a query to provide the user with the most similar images contained in a database. In Medicine, CBIR can help physicians by retrieving exams similar to one presented as a query. CBIR can also be useful as a training tool for medical students as well as for research purposes. CBIR has been broadly applied in Computer Aided Diagnosis (CAD) systems in the two-dimensional (2D) context. Three-dimensional (3D) models have been used more frequently in many areas, such as Engineering, Multimedia and Health.
One of the factors responsible for the increase of CBIR use is the advance of technology, which provides mechanisms to reduce costs and improve hardware and software performance. 3D models can yield more information than 2D images, offering models with different variations in color, contrast and resolution. Due to those characteristics, many complex medical examinations use 3D models for diagnosis in Medicine. Magnetic Resonance Imaging (MRI) is a medical imaging modality which allows experts to identify anomalies, such as aneurysms, coronary artery disease and tumors, from internal images of the body without invasive methods. This medical examination is very accurate and the patient suffers practically no side effects. In cardiology, MRI is very useful to detect Congestive Heart Failure (CHF), a heart disease which consists of the inability of the heart to pump blood commensurate with metabolic needs. Currently, 2% of the United States population is affected by this disease. As a drawback, the exam generates hundreds of slices for each patient. These slices can be used to generate a 3D model, which can be used to compose a diagnosis. Therefore, techniques are needed to retrieve information accurately and quickly by using these models as queries.
There are currently two approaches to solve the access and retrieval problems regarding images: using 2D or 3D exam information. In the 2D approach, the system processes the slices to detect and extract abnormalities. This type of imaging is related to 2D projections obtained by modalities such as radiography and ultrasound, or to a set of slices from a Computed Tomography (CT) or MRI exam. The disadvantage of this approach is the number of 2D images generated for each patient, especially in MRI and CT, and consequently the time spent by the physician to analyze the whole set.
One solution for this problem is to generate a 3D model from these frames and to retrieve information from it. This solution can be faster and more accurate than the 2D approach. However, this approach also has limitations, since it requires that the volume or the surface model be available. The process to obtain this type of artifact is not trivial, since the frames must be segmented and a reconstruction algorithm is required. The use of CBIR with 3D models in the medical field is innovative, since most works on CBIR focus on 2D images or consider common models, such as simple domestic objects (lamps, chairs, tables) and animals. Many diseases, such as CHF, are diagnosed through the shape deformation of the cardiac muscle, observed in images generated by medical imaging modalities. Thus, shape descriptors can be an interesting approach to this retrieval.
This paper presents the Distance Histogram technique, a shape descriptor which measures the normalized distances between the 3D model surface and its center. It was applied to detect cardiac diseases such as CHF, the main symptom of which is the volume increase in the left ventricular chamber. This article is organized as follows: in Section 2, the main CBIR concepts are explained. Section 3 presents related works which deal with computational solutions to detect CHF. Section 4 describes the methodology used. Section 5 discusses the results and Section 6 presents the conclusions.

2. Technical Background
2.1. CBIR
CBIR systems retrieve images from large databases, where the search analyzes the content of the images rather than metadata. Some steps are desirable in this kind of system, such as Pre-Processing, Feature Extraction, Similarity Function and Relevance Feedback. In the Pre-Processing stage, algorithms can be applied to prepare images for the next step, feature extraction. Through pre-processing techniques, relevant features are highlighted and noise which might cause discrepancies in the results is discarded. In the Feature Extraction stage, descriptors are developed, which are essential in any CBIR system. They describe low-level visual features such as color, texture and shape. It is possible to develop more than one descriptor for the same image. This set of features is organized in a feature vector. Authors have studied faster and more robust extraction methods that improve the accuracy of CBIR systems. Similarity Functions calculate the content difference between two images based on their features. One of the images is given as the search parameter and the other is stored in the database and has its features previously extracted. There are different ways of measuring this difference, such as metric distances and Artificial Intelligence techniques. The Relevance Feedback method is an optional step in CBIR systems.
It consists of applying techniques to decrease the semantic gap between user and computer. Through user evaluation of the results presented, it is possible to refine the search and to improve the precision of the tool.

2.2. Model Retrieval
CBIR in a 3D environment is innovative and has different denominations, such as CBIR 3D, 3D Model Retrieval and Content Based 3D Model Retrieval. For reference purposes, the term 3D Model Retrieval will be used in this paper. The steps to retrieve 3D models are similar to those used to retrieve 2D images. Some Similarity Functions and Evaluation Metrics for 2D CBIR are also applied in the 3D context. For descriptors, authors propose different taxonomies. Bustos et al. (2005) proposed the following taxonomy:
• Statistics: statistical descriptors reflect object properties such as number of vertices and polygons, surface area, volume, bounding volume, and statistical moments. A variety of statistical descriptors are proposed in the literature for 3D model retrieval.
• Extension-based descriptors: extension-based methods build object descriptors from features sampled along certain spatial directions from an object center.
• Volume-based descriptors: these methods derive object features from volumetric representations obtained by discretizing the object surface into voxel grids.
• Surface geometry: these descriptors focus on characteristics derived from surface models.
• Image-based descriptors: these descriptors use 2D projections rendered from the 3D models.
For Similarity Functions, metric distances, such as Euclidean and Manhattan Distance, are frequently applied. Some researchers also implement Artificial Intelligence techniques such as Support Vector Machine (SVM) or Neural Network to calculate the degree of similarity among 3D objects. To evaluate the system, the Precision versus Recall metric is the most used.
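The two metric distances named above can be illustrated with a minimal sketch. The function names are hypothetical and the feature vectors are assumed to be plain lists of numbers of equal length; this is only an illustration of the definitions, not the implementation used in any of the cited systems.

```python
import math


def euclidean_distance(a, b):
    """Euclidean (L2) distance between two feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan_distance(a, b):
    """Manhattan (L1) distance between two feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))
```

In both cases a value of 0 means the two feature vectors are identical, and larger values mean less similar content.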
It is possible to verify that the concept of 3D is controversial, since descriptors can be applied to 3D surfaces, voxels or even 2D images. In this paper, the retrieval will consider geometric models, using their surfaces. As previously mentioned, the Model Retrieval approach is innovative and several studies have tested systems by using common 3D objects found in 3D model databases such as the Princeton Benchmark. Therefore, it is important to analyze the effectiveness of descriptors in a more specific environment in order to propose improvements that solve real problems.

3. Related Works
CHF originates mainly in the Left Ventricle (LV), where Systolic and Diastolic Dysfunction occurs, affecting the ventricle contractility and, consequently, blood pumping. In an advanced stage, a deformation of this structure occurs. Some systems were developed to detect CHF by using medical images. The EigenHearts method seeks to analyze changes in the contractility of heart tissue through contour patterns obtained in Echocardiography images. The authors calculated the average difference in circumference during systole (contraction) and diastole (dilatation) and applied it to new exams to identify contour variation. Yang et al. studied the right ventricle and blood flow velocity. To achieve this goal, they made a 3D reconstruction from MRI exams of volunteers and applied the Navier-Stokes equations to simulate the bloodstream and the arterial pressure. Figure 2 presents the maximum and minimum points of pressure found in a structure. With several simulations, the heart morphology and arterial pressure conditions were observed to be good indicators to detect CHF. Considering that the Electrocardiogram can be useful to detect CHF and to provide information to the CAD system, Elfadil and Ibrahim divided the power spectral density into six regions and calculated their average. The results were used as input to a Neural Network. The authors stated that it was possible to identify patients with CHF with 90% accuracy. Although previous approaches can be useful, the problem of processing a large set of images remains. Thus, 3D model retrieval can help to decrease processing, as shown in the next section.
4. Methodology
4.1. The Distance Histogram
The Distance Histogram Descriptor was chosen because it considers the 3D model surface and geometry to compute the characteristics of a 3D model descriptor. Khe, Feng and Ning previously tested this descriptor by using generic models to verify its efficiency, obtaining error rates lower than 20%. The Distance Histogram is a method which computes the distance between the center of the model and its surface at random points represented by vertex coordinates. The distances are normalized in relation to the maximum distance found and are divided into ranges that form the histogram bins. In spite of its simplicity, this method has several properties desirable for similarity retrieval:
• Invariance: the distance histogram has transformation invariance properties.
• Robustness: random sampling ensures that the distance histogram is robust to noise.
• Efficiency: construction of the distance histogram for a database of 3D models is generally fast and efficient.
Figure 1 shows the generic steps to calculate the Distance Histogram. Algorithm 1 details the computation used to obtain the descriptor.
Figure 1: Distance Histogram method: from a set of slices, the volume is extracted and reconstructed. Histograms are created based on the distance between model centroid and surface at random points.
The algorithm consists of two steps. The first one takes the surface coordinates at random points represented by vertex coordinates (coordinateSurface) and computes the Euclidean distance between each point and the model centroid. Each distance is stored in a vector here named dist. A variable maxDist is used to store the maximum distance found during the process. In the second step, distances are normalized to the range [0, 1] by dividing the vector values by the variable maxDist. From the normalized distances, the histogram is built by counting the number of points that fall within each distance range from the centroid.
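The two steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' Algorithm 1: the centroid is assumed to be the mean of the vertex coordinates, points are sampled with replacement, and each normalized distance is assigned to the nearest of n_bins evenly spaced levels in [0, 1] (with 101 bins, these levels are 0.00, 0.01, ..., 1.00). The function name and the seed parameter are hypothetical; dist, max_dist and coordinate_surface mirror the names used in the text.

```python
import math
import random


def distance_histogram(vertices, n_points=1000, n_bins=101, seed=0):
    """Sketch of the Distance Histogram for a list of (x, y, z) vertices."""
    if not vertices:
        raise ValueError("vertices must not be empty")
    rng = random.Random(seed)
    n = len(vertices)
    # Assumption: the model center is the mean of the vertex coordinates.
    centroid = tuple(sum(v[k] for v in vertices) / n for k in range(3))

    # Step 1: sample random surface points and store their distances.
    dist = []
    max_dist = 0.0
    for _ in range(n_points):
        coordinate_surface = rng.choice(vertices)  # sampling with replacement
        d = math.dist(coordinate_surface, centroid)
        dist.append(d)
        max_dist = max(max_dist, d)

    # Step 2: normalize to [0, 1] and count the points in each bin.
    histogram = [0] * n_bins
    for d in dist:
        normalized = d / max_dist if max_dist > 0 else 0.0
        # Assumption: nearest-level binning; 1.0 maps to the last bin.
        index = round(normalized * (n_bins - 1))
        histogram[index] += 1
    return histogram
```

The returned list of bin counts plays the role of the feature vector. Because every distance is divided by max_dist, the counts depend on the relative shape of the model rather than on its absolute size.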
4.2. Materials and Procedures
To test the Distance Histogram on medical images, we used a set of MRI cardiac exams provided by Instituto do Coração (InCor) of the University of São Paulo, one of the largest centers for treating heart diseases in Brazil. We tested the descriptor using 30 image sets, 53% of which presented CHF and 47% of which were normal cases; 76.6% of the patients were over 40 years old. Figure 2 shows the percentage of male and female patients in the two groups, with and without CHF. Each medical examination had about 45 frames of heart images obtained during diastole. Frames have 256x256 pixels and a contrast resolution of 16 bits per pixel.
Figure 2: Distribution of male and female patients in the two study groups.
Figure 3 shows the Model Retrieval system flow. In step 1, users provide a set of MRI slices from an exam in the axial view. In step 2, the slices are segmented with Seg3D software in order to isolate the internal edge of the left ventricle.
Figure 3: Flow of 3D Model Retrieval system.
The segmented frames were reconstructed in step 3 using ImageVis3D software. Figure 4 shows the model obtained after reconstruction. It is important to highlight that a surface reconstruction was performed. In step 4, the Distance Histogram is applied and the bin values are stored in a feature vector and indexed in a database. We used 1000 random points and 101 bins in the histogram. Figure 5 shows the distances found at the surface for each point selected. Note that the distances are not homogeneous. Each histogram bin is a feature of our descriptor. Thus, the feature vector of the Distance Histogram descriptor has 101 positions. In step 5, the Euclidean Distance was computed between the histogram of the query model and the histogram of each patient in the database. The more similar a patient's clinical case is to the query, the closer its Euclidean Distance value is to 0.
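The comparison and ranking step can be sketched as below. This is an illustrative assumption about how such a step might be organized: the database is modeled as a hypothetical dictionary mapping a case identifier to its stored histogram, and rank_by_similarity is not a name taken from the original system.

```python
import math


def rank_by_similarity(query_hist, database, top_k=3):
    """Return the top_k (case_id, distance) pairs closest to the query.

    database: dict mapping a case identifier to its stored histogram.
    """
    scored = []
    for case_id, hist in database.items():
        if len(hist) != len(query_hist):
            raise ValueError("histograms must have the same number of bins")
        # Euclidean Distance between the two feature vectors.
        scored.append((case_id, math.dist(query_hist, hist)))
    # Smaller distance means a more similar case; ties keep insertion order.
    scored.sort(key=lambda pair: pair[1])
    return scored[:top_k]
```

With top_k=3 the function returns the three most similar cases, which is the number of results the prototype presents to the user.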
Figure 4: Left Ventricle reconstructed.
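Figure 5: Distribution of distances at the random points.
In step 6, the three most similar cases are shown to the user and, finally, in step 7, the Precision versus Recall metric was chosen to evaluate our prototype. This metric is widely used in research in both two- and three-dimensional contexts. Equations 1 and 2 show Precision and Recall, respectively, in their usual form. From these two equations, the Precision versus Recall curve is derived, showing the evolution of precision for each recall value.

$$\text{Precision} = \frac{\text{number of relevant models retrieved}}{\text{total number of models retrieved}} \quad (1)$$

$$\text{Recall} = \frac{\text{number of relevant models retrieved}}{\text{total number of relevant models in the database}} \quad (2)$$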
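As an illustration of how a Precision versus Recall curve can be obtained from a ranked result list, the sketch below uses the standard definitions of the two measures (relevant retrieved over retrieved, and relevant retrieved over all relevant). The function name and the representation of results as lists of case identifiers are assumptions made for this example.

```python
def precision_recall_points(ranked_ids, relevant_ids):
    """Return (recall, precision) pairs, one per relevant item retrieved.

    ranked_ids: case identifiers ordered from most to least similar.
    relevant_ids: identifiers considered relevant for the query.
    """
    relevant = set(relevant_ids)
    if not relevant:
        raise ValueError("relevant_ids must not be empty")
    points = []
    hits = 0
    for rank, case_id in enumerate(ranked_ids, start=1):
        if case_id in relevant:
            hits += 1
            recall = hits / len(relevant)  # fraction of relevant items found
            precision = hits / rank        # fraction of retrieved items that are relevant
            points.append((recall, precision))
    return points
```

For a ranking in which the first and third results are the only two relevant cases, the function yields precision 1.0 at recall 0.5 and precision 2/3 at recall 1.0.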
5. Results and discussion
In order to verify the efficiency of the Distance Histogram descriptor for detecting shape alteration of 3D models in the medical environment, we chose CHF, whose primary symptom is the alteration of the heart muscle, which in turn alters the shape of the 3D object generated from 2D images. To test the prototype, we performed 20 queries. Half of the queries used a model with CHF as the query model and the other half used a model without the disease. In this work, we chose 1000 random points to compute the distance between the 3D model centroid and its surface. This number of points was defined after several empirical tests. Decreasing the number of points could reduce the accuracy of the system. The same principle was applied to the number of intervals of the histogram. We defined 101 bins because a smaller number could reduce processing time but might also compromise accuracy. In fact, the time to perform a search in our system ranged from 3 to 5 seconds, which we consider a good result. Euclidean Distance was used to measure the difference between the query model and each model from the database. Figure 6 shows the three most similar cases returned, from a 3D perspective, and Figure 7 shows their respective Distance Histograms.
Figure 6: Results of a query using a model surface of a patient with CHF. (a) exam with CHF and (b) the three most similar results.
Figure 7: Distance Histograms of query and models retrieved.
As seen in Figures 6 and 7, it is not trivial to identify which models are the most similar just by observing the 3D models and histograms. However, through the descriptor and its mathematical properties, it was possible to retrieve relevant models. The CBIR system also produced the Precision versus Recall curves for cases with CHF and without CHF, shown in Figure 8. The curves show precision greater than 70% for about 50% of recall and represent the mean precision obtained from the 10 queries performed for models with CHF and the 10 queries for models without CHF. For each recall value, the precision corresponds to the average computed over all the queries executed.
An important detail regarding heart deformations is the difference in heart size between men and women. For example, a tall man who exercises regularly can have a larger heart than an average woman. If this woman develops CHF, her heart size and weight will probably still be smaller than those of the man, whose heart is healthy. This situation could produce retrieval mistakes. Because of this factor, the Precision versus Recall curve does not reach 100% precision. Considering this difference between images from men and women, and aiming to test the algorithm thoroughly, we pre-selected some models with a more significant shape deformation, based on the Distance Histogram values. Thus the relevant models became better defined and their retrieval was more accurate, as Figure 9 shows. For CHF cases, the precision rate increased by 20 percentage points, from 60% to 80%, with an average precision of 70% for larger values of recall.
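The averaging of precision over queries at each recall value, as described above, can be sketched as follows. This is a hedged illustration: each query curve is assumed to be a dictionary mapping a recall level to the precision measured at that level, and mean_precision_curve is a hypothetical name.

```python
def mean_precision_curve(curves):
    """Average precision per recall level over several query curves.

    curves: list of dicts, each mapping recall level -> precision.
    """
    totals = {}
    counts = {}
    for curve in curves:
        for recall, precision in curve.items():
            totals[recall] = totals.get(recall, 0.0) + precision
            counts[recall] = counts.get(recall, 0) + 1
    # One averaged precision value per recall level, in increasing recall order.
    return {recall: totals[recall] / counts[recall] for recall in sorted(totals)}
```

Applying the function separately to the curves of the CHF queries and to those of the non-CHF queries would give one averaged curve per group.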
[Plot: Precision x Recall Curve; vertical axis: precision, from 0 to 1; horizontal axis: Recall]
Figure 8: Precision versus Recall curve for 10 queries, with and without CHF.
[Plot: Precision x Recall Curve; series: CHF and No-CHF; vertical axis: precision, from 0 to 1; horizontal axis: Recall, from 0.0625 to 1 in steps of 0.0625]
Figure 9: Precision versus Recall curve for 10 queries with pre-selected models and presenting CHF and normal cases.
This result shows that the Distance Histogram does not work well when the models present subtle changes in shape. This problem is caused by the generic nature of the descriptor: it analyzes general characteristics of the 3D models, considering the whole object. A possible solution is to analyze different parts of the heart model separately. We intend to develop this approach as a continuation of the present project.

6. Conclusions
In order to verify the efficiency of Model Retrieval in a medical database to help physicians in their diagnosis, we implemented the Distance Histogram technique and applied this descriptor to the 3D left ventricle reconstructed from MRI exams. The intention was to verify the efficiency of this descriptor in retrieving models similar to one provided as a query, considering models reconstructed from cardiac images with and without CHF. More exhaustive tests are necessary in order to evaluate the system's performance in larger databases. However, the initial results were very satisfactory, with precision higher than 70%. In an environment with considerable changes of shape, it is possible to obtain a precision rate of about 80%, indicating that the Distance Histogram technique may be useful in the medical area, specifically to detect heart problems involving shape alteration. Therefore, our approach can be useful to aid the diagnosis of this kind of heart disease, since the system can quickly retrieve similar cases from the database. From the models retrieved, the physician can obtain support to compose the diagnosis.

7. Acknowledgement
This research was supported by the State of São Paulo Research Foundation (Fundação de Amparo à Pesquisa do Estado de São Paulo) - Fapesp - Process 2010/15691-0 and 2011/15949-0, the Brazilian National Council of Scientific and Technological Development (Conselho Nacional de Desenvolvimento Científico e Tecnológico) - CNPq - Process 559931/2010-7 and the National Institute of Science and Technology Medicine Assisted by Scientific Computing (Instituto Nacional de Ciência e Tecnologia - Medicina Assistida por Computação Científica) - INCT-MACC.

8. References
Doi K. Current status and future potential of computer-aided diagnosis in medical imaging. In: The British Journal of Radiology. 78; 2005. p. 3–19.
Nunes FLS. Introdução ao processamento de imagens médicas para auxílio ao diagnóstico. Vol. Atualizações em informática. PUC-Rio, editor. Rio de Janeiro; 2006.
Pereira VM, Bijlenga P, Marcos A, Schaller K, Lovblad K. Diagnostic approach to cerebral aneurysms. European Journal of Radiology. 2012; Article in Press.
Heyn C, Sue-Chue-Lam D, Jhaveri K, Haider MA. MRI of the pancreas: Problem solving tool. Journal of Magnetic Resonance Imaging. 2012; 36(5): 1037–1051.
Kumar V, Abbas AK, Fausto N, Aster JC. Robbins e Cotran. Bases patológicas das doenças. 8th ed. Elsevier; 2010.
Vranic DV. 3D Model Retrieval [PhD Thesis]. University of Leipzig. Germany; 2001.
Chen ZQ, Zou KS, Ip WH, Chan CY. 3D Model Retrieval Based on Fuzzy Weighted Shape Distributions. In: Advanced Materials Research. vol. 201-203. Trans Tech Publications; 2011. p. 1678–1681.
Yang F, Leng B. OFS: A Feature Selection Method for Shape-based 3D Model Retrieval. In: Proceedings of the 10th IEEE International Conference on Computer-Aided Design and Computer Graphics. Beijing, China: IEEE Computer Society; 2007. p. 114–119.
Gong B, Xu C, Liu J, Tang X. Boosting 3D Object Retrieval by Object Flexibility. In: Proceedings of the 17th ACM International Conference on Multimedia. Beijing, China: ACM - Association for Computing Machinery; 2009. p. 525–528.
Smeulders AWM, Worring M, Santini S, Gupta A, Jain R. Content-based image retrieval at the end of the early years. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2000 Dec; 22(12): 1349–1380.
Qin Z, Jia J, Qin J. Content based 3D model retrieval: A survey. In: Proceedings of the 6th International Workshop on Content-Based Multimedia Indexing (CBMI). London, England: IEEE Computer Society; 2008. p. 249–256.
da Silva Torres R, Falcão AX. Content-Based Image Retrieval: Theory and Applications. In: Revista de Informática Teórica e Aplicada. Vol. 13; 2006. p. 165–189.
Bustos B, Keim DA, Saupe D, Schreck T, Vranic DV. Feature-based similarity search in 3D object databases. ACM Computing Surveys. 2005; 37(4): 345–387.
Yubin Y, Hui L, Yao Z. Content-Based 3-D Model Retrieval: A Survey. In: Proceedings of 7th IEEE Transactions on Systems, Man, and Cybernetics. vol. 36. IEEE Computer Society; 2007. p. 1081–1098.
Princeton Benchmark. Available from: http://shape.cs.princeton.edu/benchmark/
Martini FH. Fundamentals of Anatomy and Physiology. 17th ed. Pearson; 2006.
Ahanathapillai V, Hamilton DJ. Eigenhearts for Diagnosis of Congestive Heart Failure (CHF). In: Advances in Medical, Signal and Information Processing, 2006. MEDSIP 2006. IET 3rd International Conference On; 2006. p. 1–4.
Yang C, Tang D, Haber I, Geva T, del Nido PJ. In vivo MRI-based 3D FSI RV/LV models for human right ventricle and patch design for potential computer-aided surgery optimization. Comput Struct. 2007 Jun; 85(11-14): 988–997.
Elfadil N, Ibrahim I. Self-organizing neural network approach for identification of patients with Congestive Heart Failure. In: Multimedia Computing and Systems (ICMCS), 2011 International Conference on; 2011. p. 1–6.
Khe L, Feng Z, Ning H. An Effective Approach to Content-Based 3D Model Retrieval and Classification. In: Proceedings of the 1th International Conference on Computational Intelligence and Security (CIS). China: IEEE Computer Society; 2007. p. 361–365.
Seg3D: Volumetric Image Segmentation and Visualization. Scientific Computing and Imaging Institute (SCI); Available from: http://www.seg3d.org.
ImageVis3D: A Real-time Volume Rendering Tool for Large Data. Scientific Computing and Imaging Institute (SCI); Available from: http://www.imagevis3d.org.