This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TIP.2014.2351620, IEEE Transactions on Image Processing

Robust Volumetric Texture Classification of Magnetic Resonance Images of the Brain using Local Frequency Descriptor

Rouzbeh Maani, Sanjay Kalra, and Yee-Hong Yang

R. Maani and Y.-H. Yang are with the Department of Computing Science, University of Alberta, Canada. E-mail: [email protected]. S. Kalra is with the Departments of Medicine, Biomedical Engineering, and Computing Science, University of Alberta, Canada.

Abstract—This paper presents a method for robust volumetric texture classification. It also proposes 2D and 3D gradient calculation methods designed to be robust to imaging effects and artifacts. Using the proposed 2D method, gradient information is extracted on the XY, XZ, and YZ orthogonal planes at each voxel and used to define a local coordinate system. The local coordinate system and the local 3D gradient computed by the proposed 3D gradient calculator are then used to define volumetric texture features. It is shown that the presented gradient calculation methods can be efficiently implemented by convolving with 2D and 3D kernels. The experimental results demonstrate that the proposed gradient operators and texture features are robust to imaging effects and artifacts such as blurriness and noise in 2D and 3D images. The proposed method is compared with three state-of-the-art volumetric texture classification methods, the 3D Gray Level Cooccurrence Matrix (3D GLCM), 3D Local Binary Patterns (3D LBP), and the Second Orientation Pyramid (SOP), on magnetic resonance imaging (MRI) data of the brain. The experimental results show the superiority of the proposed method in accuracy, robustness, and speed.

Index Terms—Volumetric Texture Classification, Robust, Local Frequency Descriptor, Local Gradient, Three Dimensional, MRI.

I. INTRODUCTION

Texture analysis is an interesting and popular topic in computer vision and image processing, and has been used in different applications such as automated inspection [1], document processing [2], remote sensing [3], fingerprint identification [4], and medical image analysis [5]. While the majority of methods are developed for 2D images, the analysis of 3D data is considerably less developed due to its computational complexity. A good texture feature should be invariant to geometric transformations and robust to imaging effects and artifacts (e.g., noise, blurriness, illumination changes). Some well-known 2D texture methods include, but are not limited to, the gray level cooccurrence matrix (GLCM) [6], the gray level Aura matrix (GLAM) [7], the Run Length Matrix (RLM) [8], filter responses in the frequency [9] and spatial [10], [11] domains, Wavelets [12], the Orientation Pyramid [13], Markov Random Fields (MRFs) [14], Gaussian MRFs [15], spin images [16], and the Local Binary Patterns (LBP) [17] and its variants [18], [19], [20], [21].

Texture analysis of 3D data can be broadly divided into two categories: space-time texture and volumetric texture. Space-time texture considers phenomena evolving in time, such as fire flames, waterfalls, or even a group of runners on a street. The 3D data comprise a sequence of 2D images, each acquired at a different time, and are defined as (x, y, t), where x and y denote space and t represents time. Some recent examples of texture methods in this category include extensions of the LBP (e.g., the volume local binary patterns (VLBP) [22] and the LBP on three orthogonal planes (LBP-TOP) [23]), 3D steerable filters [24], free form deformations [25], wavelet decomposition [26], and distributions of space-time orientation structure [27]. The second group, volumetric texture, concerns data acquired from a 3D volume. The data are represented by (x, y, z), where x, y, and z denote space. Examples include medical images acquired by different modalities such as magnetic resonance imaging (MRI), computed tomography (CT), and ultrasound (US). There are few methods to analyze 3D volumetric data; the developed methods are usually extensions of popular 2D methods, such as the 3D extensions of the GLCM [28], [29], the RLM [30], and the LBP [31]. Some other 3D methods include texture anisotropy [32], the energy of the 3D wavelet transform [33], Second Orientation Pyramid (SOP) filtering [34], [35], Markov Random Fields (MRF) [36], and Gaussian MRFs [37].

Texture methods have received attention in medical imaging studies, particularly in recent years, including 3D texture analysis of MRI. Some important examples include the use of GLCM and Gabor filter responses for the diagnosis of dementia [38], extracting GLCM and RLM features to study the pathological changes of the hippocampus in patients with Alzheimer's disease and mild cognitive impairment [39], employing GLCM to study brain asymmetry [40] and to quantify pathological changes in multiple sclerosis [41], and using wavelet features to detect brain tumors [42] and to study epilepsy [43]. Nonetheless, the need for analysis of volumetric data, particularly in the domain of medical image analysis, encourages the development of new texture methods that can analyze volumetric data in a robust and accurate manner.

In this paper, we present a 3D method to analyze volumetric data. We use a newly developed gradient calculator based on the observations in our previous method, the local frequency descriptor (LFD) [44].


In LFD, we consider P samples on a circle of radius R at each pixel, apply the 1D Fourier transform to the samples, and use the first two frequency components (i.e., the low frequencies) to construct texture features. We observed that these low frequency components comprise the majority of the signal in textures, and therefore they are able to represent the texture around a pixel. In this paper we show that the second frequency component is closely related to the underlying edge structure and can therefore be used as a gradient calculator. We also present an efficient kernel-based implementation to replace the current LFD method, which applies the 1D Fourier transform to P samples; this implementation provides fast gradient computation by convolving images with the kernels. The information extracted by the gradient operators is used to construct 3D texture features. The method of computing local gradients using this approach is presented in Section II, where we also demonstrate how the computed gradients are used to construct a local coordinate system and to compute the texture features. Section III describes the implementation details. In Section IV, the experiments are explained and the results are discussed; Section V reports the run times. Finally, the conclusions are presented in Section VI.

Fig. 1. An edge is defined as the line separating a dark region from a bright region: a) an example of the sampling method around an edge, b) the function of the samples, and c) the DFT of the rectangular function, which is a sinc-shaped function.

II. THE PROPOSED METHOD

A. Local Edge Computation

In this paper we present a method of computing local gradients, inspired by the LFD method [44]. LFD considers P neighbors on a circle of radius R around each pixel, with gray level values f_1, ..., f_P, respectively. The local circular function at a pixel, LCF_{P,R} = (f_1, ..., f_P), provides useful information used by several methods. Some methods, such as the Local Binary Patterns (LBP) and its variants, apply a threshold (the gray value of the center pixel) and binomial factors 2^n to construct binary patterns from this function. However, it is argued by Maani et al. [45] that applying a threshold compromises some important textural information. They suggest applying the 1D Fourier transform to the LCF_{P,R} function and call the output the Local Frequency Descriptor (LFD):

$$LFD(n) = \sum_{k=1}^{P} f_k \, e^{-2\pi i (k-1)(n-1)/P}, \quad (n = 1, \ldots, P), \qquad (1)$$

where LFD(n) consists of P complex numbers representing the frequency components of the LCF_{P,R} function. Since the LCF_{P,R} function consists of real numbers, the frequency components are conjugate symmetric about the DC component. That is, the 2nd and the P-th components (and similarly the 3rd and the (P−1)-th, and so on) have the same magnitude but opposite phase. We demonstrate that the second frequency component (or, equivalently, the P-th component) is highly affected by the local edge around the pixel and can be used to find the magnitude and orientation of the local edge. Our motivation is based on the observation that the low frequency components (LFD(1), LFD(2), and LFD(P)) comprise more than 90% of the LCF_{P,R} signal in some well known texture datasets, and therefore can well represent the texture around a pixel [44].

To explain the edge detection ability of LFD(2), we need to show the characteristics of the LCF_{P,R} function around an edge. We define an edge as the line separating a dark region from a bright region, as shown in Figure 1(a). Now, let us traverse the circular samples in Figure 1(a). We start in the dark region (f_1) and, after a while, cross the edge into the bright region (f_i). We stay in the bright region for a while, cross the edge again (f_j), and come back to the starting point in the dark region. Indeed, the function of an edge can be characterized as a rectangular shape function in this circular sampling approach (Figure 1(b)). For simplicity, assume that the rectangular function has a value of one. The DFT of the rectangular shape function with width M (using Eq. 1) is a sinc shape function of the following form:

$$LFD_{rect}(n) = \frac{\sin\!\left(\pi(n-1)M/P\right)}{\sin\!\left(\pi(n-1)/P\right)} \times e^{-i\pi(n-1)(M-1)/P}. \qquad (2)$$

It can be easily observed that the magnitude of this sinc shape function, $\left|\frac{\sin(\pi(n-1)M/P)}{\sin(\pi(n-1)/P)}\right|$, is maximum at n = 1 and then at n = {2, P} (Figure 1(c)). As a result, an edge (a rectangular shape LCF_{P,R} function) manifests itself with high values in LFD(1), LFD(2), and LFD(P). This is consistent with the experimental results showing that these low frequency components comprise a large portion of textural signals (more than 90%) [44]. It is noteworthy that among these three components, LFD(1) (the DC component) is affected by the average intensity, indicating whether a pixel is located in a dark or bright region, while


LFD(2) (or LFD(P)) is more affected by the actual edge information around the pixel. Another possible interpretation of LFD(2) is that it approximates a rectangular shape function better than the other components. Now, we formally define the Local Frequency Descriptor Gradient (LFDG) by setting n = 2:

$$LFDG = \sum_{k=1}^{P} f_k \, e^{-2\pi i (k-1)/P}. \qquad (3)$$
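As a quick numerical sanity check of this analysis (our own sketch, not part of the original paper), one can construct a rectangular LCF_{P,R} and verify that LFD(1), LFD(2), and LFD(P) carry the three largest magnitudes, as predicted by Eq. 2; the values of P and M below are illustrative choices:

```python
import numpy as np

# Sketch: the idealized edge of Fig. 1(b) as a rectangular LCF, with
# illustrative P (number of samples) and M (width of the bright run).
P, M = 16, 6
lcf = np.zeros(P)
lcf[:M] = 1.0

# Eq. 1: LFD(n) = sum_k f_k * exp(-2*pi*i*(k-1)*(n-1)/P).  numpy's FFT
# uses exactly this convention; index 0 corresponds to n = 1.
mag = np.abs(np.fft.fft(lcf))

# The three largest components are LFD(1), LFD(2), and LFD(P), i.e., the
# DC component and the LFDG of Eq. 3 together with its conjugate.
assert set(np.argsort(mag)[-3:]) == {0, 1, P - 1}
print(np.round(mag, 2))
```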


The LFDG (i.e., the second component of the LFD) is a complex number, and therefore Eq. 3 can be further decomposed into real and imaginary parts:

$$\mathrm{Re}(LFDG) = \sum_{k=1}^{P} f_k \cos\!\left(\frac{2\pi(k-1)}{P}\right), \qquad (4)$$

$$\mathrm{Im}(LFDG) = -\sum_{k=1}^{P} f_k \sin\!\left(\frac{2\pi(k-1)}{P}\right). \qquad (5)$$

The magnitude and phase of the LFDG can be simply computed by $|LFDG| = \sqrt{\mathrm{Re}(LFDG)^2 + \mathrm{Im}(LFDG)^2}$ and $\angle LFDG = \mathrm{atan2}(\mathrm{Im}(LFDG), \mathrm{Re}(LFDG))$.¹ The magnitude of the LFDG represents the amount of rectangular shape function (i.e., the strength of an edge), while the phase indicates the starting location of the rectangular shape function (i.e., the edge orientation). One may note that the exact value of the phase depends on the neighbor ordering strategy. We start on a horizontal line and traverse the neighbors in the clockwise direction, as shown in Figure 1(a). Using this protocol results in the same orientation value computed by the conventional gradient orientation formula.²

The rectangular shape LCF_{P,R} function is analogous to the uniform patterns in the LBP method,³ which represent edges of varying positive and negative curvatures [17]. However, the uniform patterns are acquired by applying a threshold, which makes the patterns sensitive to noise, while the LFDG is a low frequency component which is less affected by noise. Similar to LBP, LFDG can be computed using different R and P, giving edge information at different scales.

Figure 2 demonstrates the capability of the LFDG to find edge information. It compares the magnitude and phase of the LFDG (P = 8, R = 1) with the magnitude and orientation of gradients computed by three common approaches: the central difference (i.e., $\Delta_h f(x) = f(x + h/2) - f(x - h/2)$, h = 2), the Sobel operator (of size 3 × 3), and the first order derivative of Gaussian (i.e., $\Delta f(x) = f(x) * \Delta G(\sigma)$, where * denotes convolution and $\Delta G(\sigma) = \frac{-2x}{\sqrt{2\pi}\sigma^3} e^{-\frac{x^2}{2\sigma^2}}$ is the normalized derivative of Gaussian, with σ = 0.25). As can be observed, the magnitude of LFDG is zero in flat regions, while it is maximized at pixels located on edges. The phase of LFDG faithfully represents the orientation of the local edge. The color map in the bottom right of the images shows the color associated with a given angle.

¹ atan2 is the operator that computes the arc tangent of two variables by considering the signs of both arguments.
² The conventional gradient orientation is computed by atan2(∆y, ∆x).
³ Uniform patterns refers to the binary patterns in which the number of transitions between 0 and 1 is not greater than 2.
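For illustration, a direct sample-based implementation of Eqs. 3-5 might look as follows. This is a sketch of our own (the paper's fast implementation is the kernel form of Section III); it assumes bilinear interpolation of the circle samples and defaults to the (P = 8, R = 1) setting of Figure 2:

```python
import numpy as np

def lfdg_2d(img, P=8, R=1.0):
    """Sketch of Eqs. 3-5: per-pixel LFDG magnitude and phase.

    P samples on a circle of radius R around every pixel are obtained by
    bilinear interpolation, then accumulated into the cosine/sine sums.
    """
    img = np.asarray(img, dtype=np.float64)
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    re = np.zeros((h, w))
    im = np.zeros((h, w))
    for k in range(P):
        ang = 2.0 * np.pi * k / P             # angle of sample k+1 in Eq. 3
        sx = np.clip(xs + R * np.cos(ang), 0, w - 1)
        sy = np.clip(ys + R * np.sin(ang), 0, h - 1)
        x0 = np.floor(sx).astype(int)
        y0 = np.floor(sy).astype(int)
        x1 = np.minimum(x0 + 1, w - 1)
        y1 = np.minimum(y0 + 1, h - 1)
        fx, fy = sx - x0, sy - y0
        fk = (img[y0, x0] * (1 - fx) * (1 - fy) + img[y0, x1] * fx * (1 - fy) +
              img[y1, x0] * (1 - fx) * fy + img[y1, x1] * fx * fy)
        re += fk * np.cos(ang)                # Eq. 4
        im -= fk * np.sin(ang)                # Eq. 5
    return np.hypot(re, im), np.arctan2(im, re)  # |LFDG| and its phase
```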


Fig. 2. Comparing LFDG with other gradient calculators. A sample image is shown in (a); its original size is 1152×253 pixels and its format is JPEG. The magnitude and orientation of gradients are computed by the central difference (b and c), the Sobel operator (d and e), the first order derivative of Gaussian (f and g), and LFDG (h and i). The upper right part of the number “7” in the orientation maps at its original size is shown in (j), computed by the central difference, Sobel, derivative of Gaussian, and LFDG, from left to right, respectively. The angular color map in the bottom right of the image shows the orientation.

Fig. 3. LFDG can be interpreted as the projection of the samples' values onto an axis (e.g., x, y in 2D). a) The angle between the samples and the y axis (θ_y) is π/2 larger than the angle between the samples and the x axis (θ_x) in 2D. b) Extension of LFDG to 3D by sampling on a sphere using the N = 4 and M = 8 setting.

It can be observed that the orientation values computed by LFDG are smoother and involve more pixels (see Figure 2(j)). Also, one may note that the Sobel operator generates some artifacts around some curvatures.

The LFDG operator has several advantages that make it favorable for computer vision and image processing applications. The first advantage is that it is robust to noise: noise appears in the high frequency components, whereas the LFDG is a low frequency component that is little affected by it. The robustness of low frequency components has been reported in several research studies [45], [44], and we also present experiments showing the robustness of LFDG in the presence of noise. Second, LFDG can be made invariant to linear changes of illumination. One can observe that any linear change of illumination changes the magnitude of the LCF_{P,R} function linearly. This effect, however, does not change the phase of the LFDG (because the ratio of Im(LFDG) to Re(LFDG) remains the same), and the magnitude of LFDG remains unchanged if the intensity of the image is normalized before computing the gradients. We normalize the intensity by setting the mean of the intensities to zero and the standard deviation to one. Finally, LFDG is robust to blurriness, because the blur effect mainly dampens the high frequency components while the low frequencies (including LFDG) are less affected. This property is shown in Section IV-A.

B. 3D Gradient Estimation

Eqs. 4 and 5 can be interpreted as the projections of the local gradient onto the x and y axes, i.e., g_x and g_y, respectively (G = [g_x, g_y]). We rewrite Eq. 5 as:

$$\mathrm{Im}(LFDG) = \sum_{k=1}^{P} f_k \cos\!\left(\frac{2\pi(k-1)}{P} + \frac{\pi}{2}\right). \qquad (6)$$

By comparing this equation with Eq. 4, one can see that the two equations differ only by a term of π/2 in the cos function. We may also note that the direction of the x axis can be obtained by adding π/2 to the direction of the y axis. Figure 3(a) illustrates the concept using eight samples. The angles of each sample point with the x and y axes are denoted by θ_x and θ_y, respectively. It can be observed that

$$g_x = \mathrm{Re}(LFDG) = f_1 \cos(0) + \ldots + f_8 \cos\!\left(\frac{14\pi}{8}\right),$$
$$g_y = \mathrm{Im}(LFDG) = f_7 \cos(0) + \ldots + f_6 \cos\!\left(\frac{14\pi}{8}\right). \qquad (7)$$

In other words, the projection of the gradient onto an axis can be computed by multiplying the intensity value of each sample (f_k) by cos(θ), where θ is the angle between the sample's location and the axis. Using this intuition, the LFDG can be easily extended to 3D:

$$LFDG_a = \sum_{k=1}^{P} f_k \cos(\theta_a), \qquad (8)$$

where θ_a is the angle between an arbitrary axis a and the k-th sample, and the samples are located on a sphere of radius R. To define the locations of the samples we divide the range [−1, 1] on axis a into N distances. For each distance, we use M samples on the sphere. By considering 2 additional samples at the poles (i.e., parallel and antiparallel to the axis), we have (N − 1)M + 2 samples on the sphere (Figure 3(b)). Figure 4 shows a 3D MRI brain image with the gradients computed for the x, y, and z axes (R = 2, N = 16, M = 4). The brain is shown from 3 views (axial, coronal, and sagittal), and the coordinate system is shown for each view.

Fig. 4. 3D LFDG on a sample MRI brain image. The rows show axial, coronal, and sagittal views. The first column is the original image, the next 3 columns are g_x, g_y, and g_z, and the last column is the coordinate system for each view.
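The sphere sampling and Eq. 8 can be sketched as follows (our own illustration; nearest-neighbor lookup stands in for the interpolation a full implementation would use, and the default parameters mirror the (R = 2, N = 16, M = 4) setting of Figure 4):

```python
import numpy as np

def sphere_samples(N=16, M=4):
    """(N-1)*M + 2 unit vectors: two poles plus N-1 rings of M samples,
    with ring heights dividing [-1, 1] into N distances (Fig. 3(b))."""
    pts = [(0.0, 0.0, 1.0), (0.0, 0.0, -1.0)]
    for j in range(1, N):
        z = -1.0 + 2.0 * j / N
        r = np.sqrt(1.0 - z * z)
        for m in range(M):
            phi = 2.0 * np.pi * m / M
            pts.append((r * np.cos(phi), r * np.sin(phi), z))
    return np.array(pts)

def lfdg_3d(vol, voxel, axis, R=2.0, N=16, M=4):
    """Sketch of Eq. 8 at a single voxel: LFDG_a = sum_k f_k cos(theta_a).

    `axis` is a unit 3-vector (e.g., [1, 0, 0] for g_x).  Nearest-neighbor
    lookup stands in for the interpolation a full implementation would use.
    """
    axis = np.asarray(axis, dtype=np.float64)
    hi = np.array(vol.shape) - 1
    total = 0.0
    for u in sphere_samples(N, M):
        p = np.clip(np.rint(np.asarray(voxel) + R * u).astype(int), 0, hi)
        total += vol[tuple(p)] * float(np.dot(u, axis))  # cos(theta_a)
    return total
```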

C. Local Coordinate System

To compute the 3D texture features, a local coordinate system is first defined at each voxel. We use the 2D LFDG operator to specify the local coordinate system. To do so, the 2D LFDG is applied to the XY, XZ, and YZ planes to give a local gradient on each plane. We call the resulting 2D local gradient vectors V_XY, V_XZ, and V_YZ and use them to define the local coordinate system. The V_XY, V_XZ, and V_YZ vectors are sorted based on their magnitude.


Fig. 5. Construction of the local coordinate system. On the left, the original coordinate system is shown; in this example, V_1 = V_XZ and V_2 = V_XY. The plane through V_1 and V_2 with normal vector k is shown. On the right, the new local coordinate system is shown.

Fig. 6. Tessellation of a unit sphere for N = 8 and M = 6. a) The original tessellation (histogram bins); b) the triangle bins around the pole are merged to form one bin.

Assume that V_1, V_2, and V_3 are the sorted versions of V_XY, V_XZ, and V_YZ:

$$|\vec{V}_1| \geq |\vec{V}_2| \geq |\vec{V}_3|, \qquad (9)$$

where |·| denotes the L2 norm operator. To avoid confusion with the original axes $\vec{x} = [1, 0, 0]^T$, $\vec{y} = [0, 1, 0]^T$, and $\vec{z} = [0, 0, 1]^T$, assume that the local coordinate system is represented by $\vec{i} = [x_i, y_i, z_i]^T$, $\vec{j} = [x_j, y_j, z_j]^T$, and $\vec{k} = [x_k, y_k, z_k]^T$ (using the right-hand rule), respectively. To define the local coordinate system, V_1 and V_2 are considered. The local coordinate system is constructed such that the i and j axes are located on the plane passing through V_1 and V_2. To do so, we set i in the direction of V_1. Then we set

$$\vec{k} = \frac{\vec{V}_1 \times \vec{V}_2}{|\vec{V}_1 \times \vec{V}_2|}. \qquad (10)$$

Using this strategy, k is perpendicular to the plane passing through V_1 and V_2. Therefore, i and j are located on the same plane passing through V_1 and V_2. Figure 5 illustrates the construction of the local coordinate system.

We compute the local 3D gradient $\vec{V} = (v_x, v_y, v_z)$ along the x, y, and z axes using Eq. 8. The directions of the gradient vectors are computed by normalizing the local gradients:

$$\vec{V}^{norm} = \frac{\vec{V}}{|\vec{V}|}. \qquad (11)$$

The normalization step makes the vector a point on the surface of a unit sphere.

D. Orientation Quantization

In the next step, the directions of the local gradients are quantized into 3D histogram bins using their local coordinate system. In order to quantize the vectors, we use the idea presented in [32] with some modifications. In this approach, a unit sphere is parameterized by the elevation (−1 ≤ z ≤ 1) and azimuth (0 ≤ φ < 2π). In order to have regions with equal area on the sphere, the elevation is divided into N equal distances and the azimuth into M equal intervals. This tessellation results in 2M spherical triangles at the poles and (N − 2)M spherical quadrangles on the rest of the sphere, each of which has the same solid angle of 4π/(NM). In this paper we merge the triangles around the poles, which results in (N − 2)M + 2 regions. Figure 6 illustrates the quantization for N = 8 and M = 6. We merge the regions around the two poles to avoid instabilities occurring when a vector is close to a pole; in such cases a small difference (e.g., noise) can change the vector's designated bin.

E. 3D Texture Features

Since we use a local coordinate system, we need to find an efficient way to assign vectors to the bins. One may note that Eq. 11 gives the local gradients in the original coordinate system. Therefore, we need to use the vectors i, j, and k to find the corresponding bin for each vector $\vec{V}^{norm}$. We compute the elevation (e) and azimuth (a) indices. The elevation of $\vec{V}^{norm}$ is found by projecting $\vec{V}^{norm}$ onto the k axis:

$$\vec{E} = (\vec{V}^{norm} \cdot \vec{k})\,\vec{k}, \qquad (12)$$

where the operator · is the dot product of the two vectors. The elevation index is then computed by:

$$e = \left\lfloor (N - \epsilon)\,\frac{\mathrm{Sign}(\vec{E})\,|\vec{E}| + 1}{2} \right\rfloor, \qquad (13)$$

where ⌊·⌋ denotes the floor operator, ε is a small positive number (e.g., 10^{-10}), and Sign(E) is the sign operator:

$$\mathrm{Sign}(\vec{E}) = \begin{cases} -1 & \text{if } \vec{V}^{norm} \cdot \vec{k} < 0 \\ 0 & \text{if } \vec{V}^{norm} \cdot \vec{k} = 0 \\ +1 & \text{if } \vec{V}^{norm} \cdot \vec{k} > 0. \end{cases} \qquad (14)$$

The range of the elevation index using Eq. 13 is [0, N − 1]. To compute the azimuth we first find the projection of $\vec{V}^{norm}$ onto the ij plane. This can be done by subtracting $\vec{E}$ from $\vec{V}^{norm}$:

$$\vec{V}^{ij} = \vec{V}^{norm} - \vec{E}. \qquad (15)$$

The angle between $\vec{V}^{ij}$ and $\vec{i}$ can be found by:

$$\phi = \mathrm{acos}\!\left( \frac{\vec{V}^{ij}}{|\vec{V}^{ij}|} \cdot \vec{i} \right), \qquad (16)$$

where acos denotes the arc cosine. One may note that this equation does not differentiate between the vectors that have


an angle of φ in the clockwise direction and those in the counter-clockwise direction on the ij plane. To correct this, we use the cross product of $\vec{V}^{ij}$ and $\vec{i}$:

$$\vec{C} = \vec{i} \times \vec{V}^{ij}. \qquad (17)$$

If $\vec{C}$ is in the direction of $\vec{k}$ (i.e., their dot product is greater than zero), then φ is a positive angle. Otherwise φ is replaced by 2π − φ:

$$\phi = \begin{cases} \phi & \text{if } \vec{C} \cdot \vec{k} \geq 0 \\ 2\pi - \phi & \text{otherwise.} \end{cases} \qquad (18)$$

The angle φ is in the range [0, 2π). The azimuth index is now defined as:

$$a = \left\lfloor \frac{\phi M}{2\pi} \right\rfloor, \qquad (19)$$

where a is in the range [0, M − 1]. The bin index is a number in the range [1, (N − 2)M + 2] computed based on the elevation and azimuth indices:

$$\mathrm{index}(\vec{V}^{norm}) = \begin{cases} 1 & \text{if } e = 0 \\ (N-2)M+2 & \text{if } e = N-1 \\ (e-1)M + a + 2 & \text{otherwise.} \end{cases} \qquad (20)$$

To construct the final texture features we use the intensity ordering approach [46]. Assume that a voxel with index n is represented by a quadruple $VOX_{(n)} = (I_{(n)}, \vec{V}_{(n)}, \vec{V}_{1(n)}, \vec{V}_{2(n)})$, where $I_{(n)}$ is the intensity of the voxel, $\vec{V}_{(n)}$ the 3D gradient, and $\vec{V}_{1(n)}$ and $\vec{V}_{2(n)}$ the two larger gradients computed by the 2D LFDG (i.e., the sorted “in-plane” gradients using Eq. 9). The voxels are sorted in ascending order of their intensity values, $VOX_{(1)}, \ldots, VOX_{(N)}$, where N is the total number of voxels in the volume and $I_{(1)} \leq I_{(2)} \leq \ldots \leq I_{(N)}$. The ordered voxels are divided into K partitions,

$$Pr(k) = \left\{ VOX_{(n)} \;\middle|\; I_{(\lceil N(k-1)/K + 1 \rceil)} \leq I_{(n)} \leq I_{(\lceil Nk/K \rceil)} \right\}, \qquad (21)$$

where ⌈·⌉ denotes the ceiling operator. In each partition Pr(k), the orientation histograms are computed separately:

$$H(k, i) = \frac{1}{|Pr(k)|} \sum_{\forall VOX_{(n)} \in Pr(k)} Bin(\vec{V}_{(n)}, i) \cdot \mu(k), \qquad (22)$$

where i is the index of the bins (i = {1, ..., (N − 2)M + 2}), |Pr(k)| is the cardinality of partition k,

$$Bin(\vec{V}_{(n)}, i) = \begin{cases} 1 & \text{if } \mathrm{index}\!\left(\frac{\vec{V}_{(n)}}{|\vec{V}_{(n)}|}\right) = i \\ 0 & \text{otherwise,} \end{cases} \qquad (23)$$

and μ(k) is the ratio of the magnitudes of the larger “in-plane” gradients in partition k with respect to the whole volume:

$$\mu(k) = \frac{\sum_{\forall VOX_{(n)} \in Pr(k)} |\vec{V}_{1(n)}| |\vec{V}_{2(n)}|}{\sum_{\forall VOX_{(n)}} |\vec{V}_{1(n)}| |\vec{V}_{2(n)}|}. \qquad (24)$$

The final feature vector has ((N − 2)M + 2) × K elements, constructed by concatenating the orientation histograms of all partitions:

$$Feat = \{H(1, 1), H(1, 2), \ldots, H(K, (N-2)M+2)\}. \qquad (25)$$
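Putting Eqs. 9-20 together, a compact sketch of the local frame construction and the binning step (our own code, not the authors' released implementation) could read:

```python
import numpy as np

def local_frame(v_xy, v_xz, v_yz):
    """Eqs. 9-10: local axes i, j, k from the three 2D in-plane gradients
    (each given as a 3-vector in the original x, y, z coordinates)."""
    v1, v2, _ = sorted([v_xy, v_xz, v_yz], key=np.linalg.norm, reverse=True)
    i = v1 / np.linalg.norm(v1)            # i along the largest gradient
    k = np.cross(v1, v2)                   # Eq. 10: normal of the V1-V2 plane
    k /= np.linalg.norm(k)
    j = np.cross(k, i)                     # completes a right-handed frame
    return i, j, k

def bin_index(v, i, j, k, N=8, M=6, eps=1e-10):
    """Eqs. 12-20: histogram bin of a 3D gradient v in the local frame."""
    v = v / np.linalg.norm(v)              # Eq. 11: point on the unit sphere
    proj = np.dot(v, k)
    E = proj * k                           # Eq. 12: projection onto k
    # Eq. 13 with the sign convention of Eq. 14
    e = int(np.floor((N - eps) * (np.sign(proj) * np.linalg.norm(E) + 1) / 2))
    if e == 0:                             # merged pole bins of Eq. 20
        return 1
    if e == N - 1:
        return (N - 2) * M + 2
    v_ij = v - E                           # Eq. 15: projection onto ij plane
    phi = np.arccos(np.clip(np.dot(v_ij / np.linalg.norm(v_ij), i), -1, 1))
    if np.dot(np.cross(i, v_ij), k) < 0:   # Eqs. 17-18: resolve the sign
        phi = 2 * np.pi - phi
    a = int(np.floor(phi * M / (2 * np.pi)))  # Eq. 19
    return (e - 1) * M + a + 2             # Eq. 20
```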

III. IMPLEMENTATION DETAILS

The proposed LFDG operator for gradient calculation can be efficiently implemented. The idea is to define a kernel (2D/3D) representing Eqs. 4, 6, and 8, so that the gradient is computed by convolving the defined kernel with the image. For a radius of R, the kernel has a size of N × N in 2D and N × N × N in 3D, where N = 2R + 1. For instance, for R = 1, the kernel is 3 × 3 in 2D and 3 × 3 × 3 in 3D. Two factors form the kernel: the values of the samples, f_k, in Eqs. 4, 6, and 8, and the cosine of the angle between each sample and the axis. Since the locations of the samples are known for a specific R and P, both factors can be found easily. The value of each sample is found using bilinear interpolation from its four nearest neighbors in 2D and eight nearest neighbors in 3D.

To compute the value of each sample, f_k, we consider a P × N² matrix. The rows of the matrix represent the samples and the columns the weight of each element in the kernel (assuming the elements of the kernel are put in a 1 × N² row vector). One may note that for those samples that are not located on a pixel's location and need bilinear interpolation, four elements in the corresponding row of the matrix have non-zero values; otherwise, only one element has the value of one and the rest of the elements on that row are zero. We call this matrix B, representing the bilinear weights of the kernel. Figure 7(a) illustrates the construction of a 3 × 3 kernel. For the construction of B we have two cases, in which the samples are either located at the centers of pixels or not. For example, since f_i is not located on a pixel, four elements on the i-th row of the B matrix have non-zero values. The values are w1 = (1 − d1)(1 − d4), w2 = (1 − d2)(1 − d4), w3 = (1 − d1)(1 − d3), and w4 = (1 − d2)(1 − d3), where the d_i are the distances shown in Figure 7. In the case of f_j, the sample is located exactly on pixel 2; hence, the entry (j, 2) in the matrix is 1 and the rest of the elements on the j-th row are zero.

Now, we consider a 1 × P row vector called C to represent the cosine weights in Eqs. 4, 6, and 8. Each element of C is the cosine of the angle between the sample f_k and the axis. We define Ker = C × B, which is a 1 × N² row vector. Ker is reshaped to N × N, and the values of Ker are reflected around the center for the convolution operation. The convolution of Ker with an image is then equivalent to the summations in Eqs. 4, 6, and 8. Figure 7(b) shows two kernels for the x axis computed for a radius of 1 but with different numbers of samples (P = {8, 16}). It can be observed that both kernels sum to 0 and have the differentiation property (i.e., computing the difference in the direction of the axis) which is common to all gradient calculators. Nonetheless, a different pattern is observed in each kernel.
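A sketch of this kernel construction, following our reading of Figure 7(a), is given below; since Eq. 4 is a correlation of the image with the weight map, the usage comment applies the kernel with correlation semantics:

```python
import numpy as np

def lfdg_kernel(R=1, P=8, axis="x"):
    """Sketch of Section III: Ker = C x B reshaped to (2R+1) x (2R+1).

    B (P x N^2) holds the bilinear weights of the P circle samples;
    C (1 x P) holds their cosine coefficients (Eq. 4 for x, Eq. 6 for y).
    """
    n = 2 * R + 1
    B = np.zeros((P, n * n))
    C = np.zeros(P)
    for k in range(P):
        ang = 2.0 * np.pi * k / P
        sx = R + R * np.cos(ang)         # sample location inside the kernel
        sy = R + R * np.sin(ang)
        x0, y0 = int(np.floor(sx)), int(np.floor(sy))
        fx, fy = sx - x0, sy - y0
        for dx, dy, w in ((0, 0, (1 - fx) * (1 - fy)), (1, 0, fx * (1 - fy)),
                          (0, 1, (1 - fx) * fy), (1, 1, fx * fy)):
            if w > 0:                    # skip zero weights (and out-of-range)
                B[k, (y0 + dy) * n + (x0 + dx)] += w
        C[k] = np.cos(ang + np.pi / 2) if axis == "y" else np.cos(ang)
    return (C @ B).reshape(n, n)         # rows sum to ~0, as in Fig. 7(b)

# Usage sketch (correlation implements the sums of Eqs. 4 and 6 directly):
#   gx = scipy.ndimage.correlate(image, lfdg_kernel(1, 8, "x"))
```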


Fig. 7. a) Construction of a 3 × 3 2D kernel; matrix B represents the bilinear interpolation weights and C the cosine coefficients. b) Two sample kernels for the x axis: top (R = 1, P = 8), and bottom (R = 1, P = 16).

IV. EXPERIMENTAL RESULTS

A. 2D Experiments

In this section the robustness of the 2D LFDG under different noise and blur conditions is evaluated. For these experiments we use the Kodak dataset (http://r0k.us/graphics/kodak/kodim08.html). This dataset includes 24 color images, which are converted to grayscale. In the experiments, random Gaussian noise with a specific standard deviation (σ = 1, 1.5, ..., 3) is added. To reduce the variability of randomness, the experiments are repeated fifty times for each noise level. We compare the LFDG operator with three commonly used methods: the central difference (CD), the Sobel operator, and the first order derivative of Gaussian (DG). To make the results of the methods comparable, the gradient vectors in image I are normalized as follows:

$$\vec{V}_i = \frac{\vec{V}_i}{\sum_{\forall \vec{V}_i \in I} |\vec{V}_i|}. \qquad (26)$$

Then we measure the normalized error as follows:

$$Err = \frac{1}{N} \sum_{i=1}^{N} \frac{|\vec{V}_i^{noise} - \vec{V}_i^{orig}|}{|\vec{V}_i^{orig}|}, \qquad (27)$$

where $\vec{V}_i^{noise}$ and $\vec{V}_i^{orig}$ are the gradient vectors in the noisy and the original image, respectively, and N is the number of pixels in the image. To avoid instability due to small values in the denominator, the vectors with $|\vec{V}_i^{orig}| < 10^{-7}$ are excluded. We use 16 samples on a radius of 2 for LFDG and a 5×5 window size for the CD, Sobel, and DG gradient operators, since this window size is considered a good compromise between gradient localization and noise [47]. Table I shows the normalized noise error for these methods. One can see that LFDG and DG provide very similar results. We further analyze the results to identify the methods that are statistically better than the others. We perform a two-tailed t-test with p < 0.05, starting with the best methods (i.e., lowest mean errors). We test whether the top two methods have a statistically significant difference. If there is no statistical significance between the top two methods (meaning they are equivalently good), we perform the t-test for the third method to identify whether it has a significant statistical difference with all of the best methods, and we continue this process until there is a significant difference between the best method(s) and the next top methods. The methods that are significantly better are shown in bold font. This analysis shows that LFDG, DG, and Sobel are equivalently the top methods for the different levels of noise.
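Read literally, Eqs. 26-27 amount to the following sketch (our interpretation; the mean is taken over the retained pixels):

```python
import numpy as np

def normalized_error(gx_o, gy_o, gx_n, gy_n, tol=1e-7):
    """Sketch of Eqs. 26-27: mean relative gradient error between the
    original (o) and noisy (n) gradient fields of one image."""
    def normalize(gx, gy):                 # Eq. 26: divide by total magnitude
        total = np.hypot(gx, gy).sum()
        return gx / total, gy / total
    gx_o, gy_o = normalize(gx_o, gy_o)
    gx_n, gy_n = normalize(gx_n, gy_n)
    mag_o = np.hypot(gx_o, gy_o)
    keep = mag_o >= tol                    # exclude near-zero denominators
    diff = np.hypot(gx_n - gx_o, gy_n - gy_o)
    return (diff[keep] / mag_o[keep]).mean()   # Eq. 27
```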

TABLE I
NORMALIZED NOISE ERROR FOR GRADIENT OPERATORS: CENTRAL DIFFERENCE (CD), SOBEL, DERIVATIVE OF GAUSSIAN (DG), AND LFDG. METHODS WHICH SIGNIFICANTLY OUTPERFORM ARE SHOWN IN BOLD.

Method | σ=1       | σ=1.5     | σ=2       | σ=2.5     | σ=3
CD     | 1.54±0.12 | 1.60±0.10 | 1.63±0.10 | 1.65±0.09 | 1.66±0.09
Sobel  | 1.27±0.15 | 1.40±0.12 | 1.47±0.10 | 1.51±0.09 | 1.54±0.08
DG     | 1.24±0.16 | 1.38±0.12 | 1.45±0.11 | 1.50±0.10 | 1.53±0.09
LFDG   | 1.24±0.16 | 1.38±0.13 | 1.46±0.11 | 1.50±0.10 | 1.53±0.09

TABLE II
NORMALIZED BLUR ERROR FOR GRADIENT OPERATORS: CENTRAL DIFFERENCE (CD), SOBEL, DERIVATIVE OF GAUSSIAN (DG), AND LFDG. METHODS WHICH SIGNIFICANTLY OUTPERFORM ARE SHOWN IN BOLD.

Method | σ=1       | σ=1.5     | σ=2       | σ=2.5     | σ=3
CD     | 0.89±0.05 | 1.13±0.08 | 1.31±0.08 | 1.40±0.08 | 1.47±0.07
Sobel  | 0.42±0.04 | 0.61±0.07 | 0.87±0.09 | 1.03±0.09 | 1.14±0.09
DG     | 0.38±0.04 | 0.56±0.06 | 0.80±0.08 | 0.97±0.09 | 1.09±0.09
LFDG   | 0.36±0.04 | 0.51±0.06 | 0.73±0.07 | 0.92±0.08 | 1.05±0.09

To produce the blur effect, we convolve the image with a Gaussian smoothing kernel defined as $K(x, y) = e^{-\frac{x^2+y^2}{2\sigma^2}}$ with a window size of W × W, where W = (1.5 × σ + 0.5) × 2 + 1. To evaluate the robustness against the blur effect we use σ = 1, 1.5, ..., 3. Table II compares the normalized blur error for the methods. As one can see, the LFDG operator provides the lowest error for the blur effect, and this robustness is statistically significant. The 2D experiments demonstrate that LFDG has performance comparable to DG and Sobel in the presence of Gaussian noise, while outperforming these methods in the presence of the blur effect.

B. 3D Datasets

In contrast to the 2D texture classification problem, for which there are several standard evaluation datasets, there is no standard 3D texture dataset for evaluation. As a result, it is common for each method to use its own synthetic or realistic data, which makes comparison between methods hard. To address this issue, in this paper we use four datasets (one synthetic and three realistic) that are publicly available. The synthetic dataset [48] (http://www.rfai.li.univ-tours.fr/fr/ressources/3Dsynthetic images database.html) is composed of 15 texture classes. Each class has ten 64×64×64 samples generated by Fourier texture synthesis [49].


TABLE III
SPECIFICATIONS OF 3D DATASETS.

Dataset     | Class No. | Images in Class | Type
Fourier     | 15        | 10              | Synthetic
Aging       | 2         | 20              | MRI
ISBR        | 3         | 18              | MRI
Brain Tumor | 2         | 10              | MRI

The realistic datasets are MRI images of the brain. In the first realistic dataset we evaluate the texture features in distinguishing young versus old brains. It is well known that brain structure changes with aging [50], and it is therefore useful if texture features can detect the changes. We picked 20 images of young brains (average age 18.6) and 20 of old brains (average age 86.8) from the OASIS brain database (http://www.oasis-brains.org/). In each group half of the images belong to females and half to males. The images are T1-weighted, with a size of 176×208×176 and a voxel size of 1×1×1 mm³. All images are preprocessed to remove the skull, registered to the Talairach atlas [51], and intensity-corrected (normalized).

The second realistic dataset aims to evaluate the capability of the texture features for the tissue classification problem. To do this, the segmentation dataset of ISBR [52] (http://www.nitrc.org/projects/ibsr) is used. This dataset includes 18 T1-weighted brain MRIs (4 females and 14 males) in which the gray matter (GM), white matter (WM), and cerebrospinal fluid (CSF) are segmented. The images are 256×128×256 with a voxel size of 0.94×0.94×1.5 mm³.

The third realistic dataset helps to evaluate the performance of the texture features in classifying tumor versus normal tissues in the brain. We use the pre-operative MRI (group 2) images from the BITE (Brain Images of Tumors for Evaluation) dataset [53] (http://www.bic.mni.mcgill.ca/Services/ServicesBITE). This brain tumor dataset consists of 14 T1-weighted MRI images with gadolinium. These images have been registered to a common coordinate space. The images are 394×466×378 with a voxel size of 0.5×0.5×0.5 mm³. The masks of the tumors are provided. To do the experiments we consider the location of the tumor and define a non-tumor region at the same location but in the other hemisphere. By this approach, for each tumor region we have a corresponding non-tumor region. For simplicity we considered brains that have only one tumor region on only one side of the brain; therefore, only 10 brains could be used. Table III shows the specifications of the datasets.

C. Tuning Parameters

There are three main parameters used to compute the texture features. The first and second are the scales at which the local coordinate system and the local gradients are computed (determined by (Rc, Pc) and (Rg, Pg), respectively). The third parameter is the number of partitions, K. We use half of the data in each dataset to learn these parameters. The radii of the scales (Rc and Rg) are searched over {1, 2, 3, 4} with Pc = 2 + 32Rc and Pg = 4 + 4Rg, and the best K is found from {1, 2, ..., 6}. Table IV shows the best parameters used for texture computation.

TABLE IV
THE PARAMETER VALUES USED FOR EACH DATASET.

Dataset     | Rg | Pg | Rc | Pc | K
Fourier     | 3  | 16 | 2  | 66 | 1
Aging       | 1  | 8  | 2  | 66 | 6
ISBR        | 3  | 16 | 2  | 66 | 2
Brain Tumor | 3  | 16 | 1  | 34 | 3

Fig. 8. Parameter tuning step. a) The classification rate for the best K for each dataset; the entry (i, j) in each 4×4 matrix shows the classification accuracy for Rc = i and Rg = j. b) The classification rate for different values of K with the best Rg and Rc.

Figure 8(a) shows the classification rate for the best K for each dataset (e.g., K = 2 for ISBR). The entry (i, j) in each 4×4 matrix shows the classification accuracy for Rc = i and Rg = j. For example, the classification rate for ISBR with Rg = 3 and Rc = 1 is 99.70%. Higher classification rates are shown with a brighter color. In Figure 8(b) the classification rate for different values of K and the best Rg and Rc is shown. As one can see, a change of parameters changes the classification rate; however, in most cases this change is quite small. In the Aging and ISBR datasets several parameter settings result in 100% accuracy; in these cases any of the values could be selected. The value of K = 1 in the Fourier dataset indicates that partitioning (increasing K) does not improve the results in this dataset.
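The tuning procedure reduces to a plain grid search over the three parameters. In the sketch below, `evaluate` is a hypothetical caller-supplied function that runs feature extraction and classification on the tuning half of a dataset and returns the accuracy:

```python
def tune(evaluate):
    """Sketch of the tuning protocol of Section IV-C.  `evaluate` is a
    hypothetical function mapping (Rc, Pc, Rg, Pg, K) to the classification
    accuracy obtained on the tuning half of a dataset."""
    best_acc, best_params = -1.0, None
    for Rc in (1, 2, 3, 4):
        for Rg in (1, 2, 3, 4):
            for K in range(1, 7):
                acc = evaluate(Rc, 2 + 32 * Rc, Rg, 4 + 4 * Rg, K)
                if acc > best_acc:
                    best_acc, best_params = acc, (Rc, Rg, K)
    return best_params, best_acc
```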


No specific pattern is observed for choosing the number of partitions (K); however, it seems that choosing a larger number (K ≥ 4) is a good option in the majority of cases. One may note that increasing K remarkably improves the performance in the Aging dataset. The best values for Rc and Rg change from dataset to dataset, but there are some settings which provide reasonably high classification rates for all datasets (e.g., Rc = 3, Rg = 1).

D. Texture Classification

We compare the proposed LFD method with three well-known texture classification methods: 3D GLCM [29], 3D LBP [31], and the multiresolution Second Orientation Pyramid (SOP) [34], [35]. We implemented the first two methods, since their codes are not publicly available, and we use the implementation provided by the authors of the SOP method (http://www.dcs.warwick.ac.uk/~creyes/m-vts/). We use the parameters recommended by the authors of these methods. For GLCM the parameters are set to 8 bins for intensity, 6 bins for gradient magnitude, 6 bins for gradient angle, and a distance of 4. For LBP, the parameters are: 2 for the radius, 10 for the number of samples, and 12 for the number of uniform patterns. For SOP we set the level of the pyramid to 4. For classification, we use the Nearest Neighbor (NN) classifier with the L2 distance:

$$\|f_1 - f_2\|_2 = \sqrt{\sum_{n=1}^{N} (f_1(n) - f_2(n))^2}, \qquad (28)$$

where f_1 and f_2 are the two given feature vectors, f_1(n) and f_2(n) are the n-th features in the vectors, and N is the total number of features. We choose half of the data randomly for training and the rest for testing. To reduce the variability of randomness, we repeat the classification 50 times. The classification accuracies of the methods (mean±standard deviation) are shown in Table V, and the methods that are statistically better than the others are shown in bold for each dataset. Since feature selection is considered by many texture methods, we also perform feature selection on these methods using the Bhattacharyya space of the features [13], [34]. The best 30 features from each method are selected. The methods after feature selection are marked by an asterisk and shown at the bottom of Table V. The performance of the methods with no feature selection shows the quality of the features of each method.
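The classification step itself (Eq. 28 with a 1-nearest-neighbor rule) is small enough to sketch in full; feature vectors are assumed to be stacked row-wise:

```python
import numpy as np

def nn_classify(train_feats, train_labels, test_feats):
    """1-nearest-neighbor classification with the L2 distance of Eq. 28."""
    train_feats = np.asarray(train_feats, dtype=np.float64)
    preds = []
    for f in np.asarray(test_feats, dtype=np.float64):
        d = np.sqrt(((train_feats - f) ** 2).sum(axis=1))  # Eq. 28
        preds.append(train_labels[int(np.argmin(d))])
    return np.array(preds)
```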


As one can see, the proposed method provides the highest classification accuracy in all four datasets. In the Tumor dataset LBP is equally as good as the proposed method. While GLCM is the second best method for the ISBR and Aging datasets, it is the third best method in the Fourier and Tumor datasets. SOP is the third best method for the ISBR and Aging datasets, but has the lowest performance in the Fourier and Tumor datasets. Finally, LBP is one of the best methods in the Tumor dataset and is the second best method in the Fourier dataset, but has the lowest performance in the Aging and ISBR datasets. Overall, the proposed method consistently outperforms the other methods, whereas the performance of the other methods depends highly on the dataset.

Feature selection has different impacts. It can be observed that feature selection improves the performance of SOP in the majority of cases. The reason is that the features computed by SOP include the low and high frequency responses, while, depending on the nature of the dataset, only some frequency responses are useful. As a result, feature selection is an important step to improve the performance of SOP. On the other hand, feature selection does not help improve the performance of LBP and LFD. This indicates that the features computed by these methods are equally important, and therefore feature selection is not helpful. Feature selection has inconsistent results for GLCM: it does not make a remarkable difference for the Fourier dataset, degrades the performance for ISBR, and improves the performance for the Aging and Tumor datasets. A possible reason is that the Bhattacharyya space feature selection does not effectively select the best features of GLCM in the ISBR dataset. Nevertheless, after feature selection, the LFD method is the best method in the Aging and ISBR datasets and is among the top methods, along with SOP and LBP, in the Fourier and Tumor datasets, respectively.

E. Robustness

To examine the robustness of the methods, we add noise and blur effects to the 3D images. In the first experiment, similar to the 2D experiment, random Gaussian noise with a specific standard deviation (σ = 1, 1.5, ..., 3) is added and the experiment is repeated 50 times. Figure 9 plots the average classification rate; σ = 0 represents the original classification with no noise. It can be observed that the LFD method has the highest classification rate at all levels of noise in the majority of cases. The most noise sensitive method in the Fourier and Aging datasets is 3D GLCM, which loses its performance more than the other methods. The ranking of the methods remains the same in the Fourier and ISBR datasets; however, in the Aging and Brain Tumor datasets the ranks of LBP and GLCM swap.

In the next experiment the 3D images are blurred with a 3D Gaussian kernel with different bandwidths (σ = 1, 1.5, ..., 3). Figure 10 shows the performance of the methods. The first difference between the noise and blur effects is that the methods behave inconsistently under blur. Indeed, the blur effect is a compromise between removing noise and losing details. The inconsistent behavior of the methods on different datasets is caused by the nature of the data and the approach of each method for computing features. The Fourier dataset includes a variety of texture classes, but the textures are homogeneous. The Aging dataset has only two texture classes, but the textures are heterogeneous. ISBR consists of three texture classes (i.e., WM, GM, and CSF), and the texture classes are quite separated. The Tumor dataset includes mainly homogeneous textures. GLCM uses coarse information (i.e., it quantizes the information into a small number of bins, e.g., 8 bins for intensities). The blur effect for this method removes noise but does not destroy the level of detail that GLCM


TABLE V
THE CLASSIFICATION ACCURACY (MEAN±STD) OF DIFFERENT METHODS FOR THE DATASETS. THE METHODS AFTER FEATURE SELECTION ARE MARKED BY AN ASTERISK (*). METHODS WHICH SIGNIFICANTLY OUTPERFORM ARE SHOWN IN BOLD.

Methods  | Fourier       | Aging          | ISBR           | Brain Tumor
3D GLCM  | (72.85±4.07)% | (91.30±5.96)%  | (96.67±3.28)%  | (67.80±10.16)%
3D LBP   | (79.60±3.95)% | (63.50±10.51)% | (87.70±5.26)%  | (82.20±10.55)%
SOP      | (70.16±5.04)% | (66.00±13.36)% | (91.85±3.59)%  | (55.60±13.87)%
LFD      | (98.51±1.19)% | (100.00±0.00)% | (100.00±0.00)% | (81.40±9.48)%
3D GLCM* | (72.59±7.78)% | (97.70±4.07)%  | (89.63±6.94)%  | (72.00±11.61)%
3D LBP*  | (77.92±3.95)% | (63.30±15.17)% | (87.33±5.55)%  | (82.20±10.93)%
SOP*     | (97.41±1.53)% | (72.70±11.26)% | (91.48±3.45)%  | (62.80±14.71)%
LFD*     | (97.23±1.37)% | (100.00±0.00)% | (100.00±0.00)% | (81.40±9.48)%

Fig. 9. Classification rate of the methods in the presence of different levels of noise in the four datasets: a) Fourier, b) Aging, c) ISBR, and d) Brain Tumor.

Fig. 10. Classification rate of the methods in the presence of different levels of blur effect in the four datasets: a) Fourier, b) Aging, c) ISBR, and d) Brain Tumor.

needs. Hence, the performance of GLCM is improved by the blur effect. LBP uses the difference between the central voxel and its surrounding region. In datasets with more homogeneous textures (i.e., Fourier and Tumor) the blur effect removes this type of detail. However, if the textures are quite heterogeneous (i.e., Aging) or well separated (i.e., ISBR), the blur effect helps remove noise with little effect on the LBP features. SOP uses information extracted from the low, mid, and high frequency subbands. In general, the blur effect changes the high frequency features; if the blur effect strikes a good balance between the required level of detail and the noisy information captured by the high frequency features, the performance of SOP improves. LFD behaves differently compared to the other methods: it uses detailed information, and the extracted LFD features are not sensitive to noise. As a result, the blur effect, which usually removes noise, does not improve the performance of LFD. The noise and blur

effects slightly degrade the performance of LFD; however, LFD keeps its rank (as the best method) under both effects.

V. RUN TIME

One important factor in practical usage is the run time of a method, particularly for 3D data, which usually require a large amount of processing. The efficient kernel-based implementation of the proposed method for gradient calculation makes it useful in practice. Table VI shows the run times of the methods. All methods have been efficiently implemented in Matlab (using vectorization for speed). The programs were run on a PC with an Intel quad core CPU and 16GB RAM running Windows 7 Professional.

It can be observed that LFD is the fastest method in all datasets. The next fastest method is 3D LBP in the Fourier, ISBR, and Brain Tumor datasets; however, it has a slower run time in the Aging dataset, which consists of large images. The slowest


TABLE VI
THE AVERAGE RUN TIME IN SECONDS OF DIFFERENT METHODS FOR EACH IMAGE IN DIFFERENT DATASETS.

Methods | Fourier | Aging  | ISBR   | Brain Tumor
3D GLCM | 18.78   | 317.21 | 590.59 | 93.92
3D LBP  | 1.84    | 137.14 | 46.98  | 7.63
SOP     | 2.41    | 59.70  | 104.41 | 21.00
LFD     | 0.91    | 9.51   | 22.91  | 3.79

method is 3D GLCM, since it has to compute the co-occurrence matrix for all directions with distances of 1 to 4. The proposed method is more than two times faster than 3D LBP in the small datasets, and more than 6 times faster than SOP (the second fastest method) in the Aging dataset. Indeed, for the Aging dataset, LFD can be computed in about 10 seconds, while it takes about 1 minute for SOP, more than 2 minutes for 3D LBP, and more than 5 minutes for 3D GLCM.

VI. CONCLUSIONS

This paper presents a robust method to compute local gradients in 2D and 3D images at different scales. The proposed gradient operator can be used in general applications where robust gradient computation is needed. The performance of the operator is equivalent to the Sobel operator and the first order derivative of Gaussian in the presence of Gaussian noise, but it is more robust in the presence of the blur effect. This paper also provides an accurate, robust, and fast method for 3D volumetric texture classification. The features are based on the proposed local gradients computed “in plane” (2D) and “in volume” (3D). The 2D local gradients are used to define a local coordinate system at each voxel; using the two larger gradients makes the local coordinate system robust. The directions of the 3D local gradient vectors are quantized into 3D orientation bins defined on the local coordinate system. The local gradient computation is efficiently implemented by 2D and 3D kernels, which makes the computation fast. The experimental results on synthetic and realistic 2D and 3D images demonstrate the accuracy, robustness, and speed of the proposed method compared to the state-of-the-art volumetric texture classification methods.

ACKNOWLEDGMENTS

The authors would like to thank the anonymous reviewers for their useful comments and the authors of the SOP method for sharing their code. Our work made use of the infrastructure and resources of AICT (Academic Information and Communication Technologies) of the University of Alberta. Financial support has been provided by NSERC, the Alberta Innovates Graduate Student Scholarship, the Killam Memorial Scholarship, the ALS Society of Canada, and the ALS Association.

REFERENCES

[1] W.-C. Li and D.-M. Tsai, “Wavelet-based defect detection in solar wafer images with inhomogeneous texture,” Pattern Recognition, vol. 45, no. 2, pp. 742–756, 2012.

[2] M. Zhao, S. Li, and J. Kwok, “Text detection in images using sparse representation with discriminative dictionaries,” Image and Vision Computing, vol. 28, no. 12, pp. 1590–1599, 2010.
[3] J. A. dos Santos, P. Gosselin, S. Philipp-Foliguet, R. da S. Torres, and A. Falao, “Multiscale classification of remote sensing images,” IEEE Transactions on Geoscience and Remote Sensing, vol. PP, no. 99, pp. 1–12, 2012.
[4] L. Nanni and A. Lumini, “Local binary patterns for a hybrid fingerprint matcher,” Pattern Recognition, vol. 41, no. 11, pp. 3461–3466, 2008.
[5] P. Lehana, S. Devi, S. Singh, P. Abrol, S. Khan, and S. Arya, “Investigations of the MRI images using aura transformation,” Signal & Image Processing: An International Journal (SIPIJ), vol. 3, no. 1, pp. 95–104, 2012.
[6] R. M. Haralick, K. Shanmugam, and I. H. Dinstein, “Textural features for image classification,” IEEE Transactions on Systems, Man and Cybernetics, no. 6, pp. 610–621, 1973.
[7] X. Qin and Y.-H. Yang, “Similarity measure and learning with gray level aura matrices (GLAM) for texture image retrieval,” in Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), vol. 1, 2004, pp. I–326.
[8] M. M. Galloway, “Texture analysis using gray level run lengths,” Computer Graphics and Image Processing, vol. 4, no. 2, pp. 172–179, 1975.
[9] X. Chu and K. Chan, “Rotation and scale invariant texture analysis with tunable Gabor filter banks,” in Advances in Image and Video Technology, ser. Lecture Notes in Computer Science, T. Wada, F. Huang, and S. Lin, Eds. Springer Berlin/Heidelberg, 2009.
[10] T. Leung and J. Malik, “Representing and recognizing the visual appearance of materials using three-dimensional textons,” International Journal of Computer Vision, vol. 43, no. 1, pp. 29–44, 2001.
[11] M. Varma and A. Zisserman, “A statistical approach to texture classification from single images,” International Journal of Computer Vision, vol. 62, no. 1, pp. 61–81, 2005.
[12] M. Unser and M. Eden, “Multiresolution feature extraction and selection for texture segmentation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 11, no. 7, pp. 717–728, 1989.
[13] C. Reyes-Aldasoro and A. Bhalerao, “The Bhattacharyya space for feature selection and its application to texture segmentation,” Pattern Recognition, vol. 39, no. 5, pp. 812–826, 2006.
[14] G. R. Cross and A. K. Jain, “Markov random field texture models,” IEEE Transactions on Pattern Analysis and Machine Intelligence, no. 1, pp. 25–39, 1983.
[15] F. S. Cohen, Z. Fan, and M. A. Patel, “Classification of rotated and scaled textured images using Gaussian Markov random field models,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 13, no. 2, pp. 192–202, 1991.
[16] S. Lazebnik, C. Schmid, and J. Ponce, “A sparse texture representation using local affine regions,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 27, no. 8, pp. 1265–1278, 2005.
[17] T. Ojala, M. Pietikäinen, and T. Mäenpää, “Multiresolution gray-scale and rotation invariant texture classification with local binary patterns,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 7, pp. 971–987, 2002.
[18] Z. Guo, L. Zhang, and D. Zhang, “A completed modeling of local binary pattern operator for texture classification,” IEEE Transactions on Image Processing, vol. 19, no. 6, pp. 1657–1663, 2010.
[19] S. Liao, M. W. K. Law, and A. C. S. Chung, “Dominant local binary patterns for texture classification,” IEEE Transactions on Image Processing, vol. 18, no. 5, pp. 1107–1118, 2009.
[20] M. Heikkilä, M. Pietikäinen, and C. Schmid, “Description of interest regions with local binary patterns,” Pattern Recognition, vol. 42, no. 3, pp. 425–436, 2009.
[21] H. Lategahn, S. Gross, T. Stehle, and T. Aach, “Texture classification by modeling joint distributions of local patterns with Gaussian mixtures,” IEEE Transactions on Image Processing, vol. 19, no. 6, pp. 1548–1557, 2010.
[22] S. Zhang, H. Yao, and S. Liu, “Dynamic background modeling and subtraction using spatio-temporal local binary patterns,” in IEEE International Conference on Image Processing, 2008, pp. 1556–1559.
[23] G. Zhao and M. Pietikäinen, “Dynamic texture recognition using local binary patterns with an application to facial expressions,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 29, no. 6, pp. 915–928, 2007.
[24] J. Zhang, J. Liang, and H. Zhao, “Local energy pattern for texture classification using self-adaptive quantization thresholds,” IEEE Transactions on Image Processing, vol. 22, no. 1, pp. 31–42, 2012.

[25] S. Koelstra, M. Pantic, and I. Patras, “A dynamic texture-based approach to recognition of facial actions and their temporal models,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 32, no. 11, pp. 1940–1954, 2010.
[26] S. Dubois, R. Péteri, and M. Ménard, “A comparison of wavelet based spatio-temporal decomposition methods for dynamic texture recognition,” Pattern Recognition and Image Analysis, pp. 314–321, 2009.
[27] K. G. Derpanis and R. Wildes, “Spacetime texture representation and recognition based on a spatiotemporal orientation analysis,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 6, pp. 1193–1205, 2012.
[28] A. Kurani, D. Xu, J. Furst, and D. Raicu, “Co-occurrence matrices for volumetric data,” in The Seventh IASTED International Conference on Computer Graphics and Imaging, 2004, pp. 426–443.
[29] V. Kovalev, F. Kruggel, H. Gertz, and D. Y. von Cramon, “Three-dimensional texture analysis of MRI brain datasets,” IEEE Transactions on Medical Imaging, vol. 20, no. 5, pp. 424–433, 2001.
[30] D. Xu, A. Kurani, J. Furst, and D. Raicu, “Run-length encoding for volumetric texture,” in The 4th IASTED International Conference on Visualization, Imaging, and Image Processing, 2004, pp. 6–8.
[31] L. Paulhac, P. Makris, and J. Ramel, “Comparison between 2D and 3D local binary pattern methods for characterisation of three-dimensional textures,” Image Analysis and Recognition, pp. 670–679, 2008.
[32] V. A. Kovalev, M. Petrou, and Y. S. Bondar, “Texture anisotropy in 3D images,” IEEE Transactions on Image Processing, vol. 8, no. 3, pp. 346–360, 1999.
[33] K. Jafari-Khouzani, H. Soltanian-Zadeh, K. Elisevich, and S. Patel, “Comparison of 2D and 3D wavelet features for TLE lateralization,” in SPIE Medical Imaging, 2004, pp. 593–601.
[34] C. Reyes-Aldasoro and A. Bhalerao, “Volumetric texture segmentation by discriminant feature selection and multiresolution classification,” IEEE Transactions on Medical Imaging, vol. 26, no. 1, pp. 1–14, 2007.
[35] ——, “Volumetric texture description and discriminant feature selection for MRI,” in Information Processing in Medical Imaging, 2003, pp. 282–293.
[36] E. Ranguelova and A. Quinn, “Analysis and synthesis of three-dimensional Gaussian Markov random fields,” in Proceedings of the 1999 International Conference on Image Processing (ICIP), vol. 3, 1999, pp. 430–434.
[37] S. Jain, M. Papadakis, S. Upadhyay, and R. Azencott, “Rigid-motion-invariant classification of 3-D textures,” IEEE Transactions on Image Processing, vol. 21, no. 5, pp. 2449–2463, 2012.
[38] T. Sivapriya, V. Saravanan, and P. R. J. Thangaiah, “Texture analysis of brain MRI and classification with BPN for the diagnosis of dementia,” in Trends in Computer Science, Engineering and Information Technology. Springer, 2011, pp. 553–563.
[39] X. Li, H. Xia, Z. Zhou, and L. Tong, “3D texture analysis of hippocampus based on MR images in patients with Alzheimer disease and mild cognitive impairment,” in 2010 3rd International Conference on Biomedical Engineering and Informatics (BMEI), vol. 1, 2010, pp. 1–4.
[40] V. A. Kovalev, F. Kruggel, and D. Y. von Cramon, “Structural brain asymmetry as revealed by 3D texture analysis of anatomical MR images,” in Proceedings of the 16th International Conference on Pattern Recognition, vol. 1, 2002, pp. 808–811.
[41] J. Mathias, P. Tofts, and N. Losseff, “Texture analysis of spinal cord pathology in multiple sclerosis,” Magnetic Resonance in Medicine, vol. 42, no. 5, pp. 929–935, 1999.
[42] A. Kharrat, N. Benamrane, M. Ben Messaoud, and M. Abid, “Detection of brain tumor in medical images,” in 2009 3rd International Conference on Signals, Circuits and Systems (SCS), 2009, pp. 1–6.
[43] K. Jafari-Khouzani, M.-R. Siadat, H. Soltanian-Zadeh, and K. Elisevich, “Texture analysis of hippocampus for epilepsy,” in Medical Imaging 2003, 2003, pp. 279–288.
[44] R. Maani, S. Kalra, and Y. H. Yang, “Rotation invariant local frequency descriptors for texture classification,” IEEE Transactions on Image Processing, vol. 22, no. 6, pp. 2409–2419, 2013.
[45] ——, “Noise robust rotation invariant features for texture classification,” Pattern Recognition, vol. 46, no. 8, pp. 2103–2116, 2013.
[46] B. Fan, F. Wu, and Z. Hu, “Rotationally invariant descriptors using intensity order pooling,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 10, pp. 2031–2045, 2012.
[47] E. Nezhadarya and R. K. Ward, “A new scheme for robust gradient vector estimation in color images,” IEEE Transactions on Image Processing, vol. 20, no. 8, pp. 2211–2220, 2011.
[48] L. Paulhac, P. Makris, J.-Y. Ramel et al., “A solid texture database for segmentation and classification experiments,” in 4th International Conference on Computer Vision Theory and Applications, 2009, pp. 135–141.
[49] J.-P. Lewis, “Texture synthesis for digital painting,” in ACM SIGGRAPH Computer Graphics, vol. 18, no. 3, 1984, pp. 245–252.
[50] A. M. Fjell and K. B. Walhovd, “Structural brain changes in aging: courses, causes and cognitive consequences,” Reviews in the Neurosciences, vol. 21, no. 3, pp. 187–222, 2010.
[51] J. Talairach and P. Tournoux, Co-planar Stereotaxic Atlas of the Human Brain: 3-Dimensional Proportional System: An Approach to Cerebral Imaging, 1988.
[52] T. Rohlfing, “Image similarity and tissue overlaps as surrogates for image registration accuracy: widely used but unreliable,” IEEE Transactions on Medical Imaging, vol. 31, no. 2, pp. 153–163, 2012.
[53] L. Mercier, R. F. Del Maestro, K. Petrecca, D. Araujo, C. Haegelen, and D. L. Collins, “Online database of clinical MR and ultrasound images of brain tumors,” Medical Physics, vol. 39, p. 3253, 2012.

Rouzbeh Maani is currently a PhD candidate in the Department of Computing Science at the University of Alberta. His research interests include computer vision, image processing, and medical image analysis; in particular, he works on texture analysis and its applications. He has published about 10 conference and journal papers, and has served as a reviewer for several journals, including Pattern Recognition and the IEEE Transactions on Image Processing.

Sanjay Kalra is an associate professor in the Division of Neurology, Department of Medicine, and an adjunct professor in the Department of Biomedical Engineering at the University of Alberta. After obtaining his MD from the University of Toronto, he completed his neurology training at McGill University and then a research fellowship at the Montreal Neurological Institute. Dr. Kalra’s research interests are in the development and application of advanced magnetic resonance imaging techniques to study neurological disorders.

Yee-Hong Yang (SM IEEE) received his B.Sc. (first honours) from the University of Hong Kong, his M.Sc. from Simon Fraser University, and his M.S.E.E. and Ph.D. from the University of Pittsburgh. He was a faculty member in the Department of Computer Science at the University of Saskatchewan from 1983 to 2001 and served as Graduate Chair from 1999 to 2001. Since July 2001, he has been a Professor in the Department of Computing Science at the University of Alberta, where he served as Associate Chair (Graduate Studies) from 2003 to 2005. His research interests cover a wide range of topics from computer graphics to computer vision, including physically based animation of Newtonian and non-Newtonian fluids, texture analysis and synthesis, human body motion analysis and synthesis, computational photography, stereo and multiple-view computer vision, and underwater imaging. He has published over 100 papers in international journals and conference proceedings in the areas of computer vision and graphics. He is a Senior Member of the IEEE and serves on the Editorial Board of the journal Pattern Recognition. In addition to reviewing for many international journals, conferences, and funding agencies, he has served on the program committees of many national and international conferences. In 2007, he was invited to serve on the expert review panel to evaluate computer science research in Finland.
