Using Relevance Feedback to Distinguish the Changes in EEG During Different Absence Seizure Phases Jing Li, Xianzeng Liu and Gaoxiang Ouyang Clin EEG Neurosci published online 21 September 2014 DOI: 10.1177/1550059414548721 The online version of this article can be found at: http://eeg.sagepub.com/content/early/2014/09/20/1550059414548721

Published by: http://www.sagepublications.com

On behalf of:

EEG and Clinical Neuroscience Society


OnlineFirst Version of Record - Sep 21, 2014

Downloaded from eeg.sagepub.com at UNIV OF SAN DIEGO on September 30, 2014


Original Article

Using Relevance Feedback to Distinguish the Changes in EEG During Different Absence Seizure Phases

Clinical EEG and Neuroscience 1–9 © EEG and Clinical Neuroscience Society (ECNS) 2014 Reprints and permissions: sagepub.com/journalsPermissions.nav DOI: 10.1177/1550059414548721 eeg.sagepub.com

Jing Li1,2, Xianzeng Liu3, and Gaoxiang Ouyang1,4

Abstract

We carried out a series of statistical experiments to explore the utility of using relevance feedback on electroencephalogram (EEG) data to distinguish between different activity states in human absence epilepsy. EEG recordings from 10 patients with absence epilepsy were sampled, filtered, selected, and dissected from seizure-free, preseizure, and seizure phases. A total of 112 two-second 19-channel EEG epochs from 10 patients were selected from each phase. For each epoch, multiscale permutation entropy of the EEG data was calculated. The feature dimensionality was reduced by linear discriminant analysis to obtain a more discriminative and compact representation. Finally, a relevance feedback technique, that is, direct biased discriminant analysis, was applied to 68 randomly selected queries over nine iterations. This study is a first attempt to apply the statistical analysis of relevance feedback to the distinction of different EEG activity states in absence epilepsy. The average precision in the top 10 returned results was 97.5%, and the standard deviation suggested that embedding relevance feedback can effectively distinguish different seizure phases in absence epilepsy. The experimental results indicate that relevance feedback may be an effective tool for the prediction of different activity states in human absence epilepsy. The simultaneous analysis of multichannel EEG signals provides a powerful tool for the exploration of abnormal electrical brain activity in patients with epilepsy.

Keywords

EEG, absence epilepsy, relevance feedback, classification, multiscale permutation entropy

Received April 16, 2014; revised July 20, 2014; accepted August 1, 2014.

Introduction

According to the World Health Organization, epilepsy affects more than 50 million individuals worldwide (ie, about 0.6% to 1% of the world’s population). It not only affects the patients themselves but also burdens their families. Consequently, it is important to predict seizures as early as possible so that clinicians can prescribe the medication needed to stop disease progression.1 During the past few decades, EEG signals have become one of the most useful tools for studying the processes involved in epileptic seizures.2-4 Currently, computational methods for analyzing EEG signals mainly consist of traditional linear methods such as Fourier transforms and spectral analysis5 and nonlinear algorithms such as Lyapunov exponents,6 correlation dimension,7,8 similarity,9 and power of scale freeness of visibility graph (PSVG).10 Understanding the transition of brain activity toward an absence seizure (ie, preseizure) is a very demanding task. EEG has become one of the most important diagnostic tools in clinical neurophysiology, most notably in epilepsy. Generally, the EEG is a recording of the mean electrical activity of the brain from different locations on the scalp (scalp EEG). More specifically, it is the sum of the extracellular current flows of a large group of neurons, and the EEG activity can be classified by its frequency, voltage, morphology, synchrony, and periodicity. Typical absence seizures are accompanied by an EEG hallmark of brief ictal and interictal 2.5- to 3-Hz spike-and-wave complexes with a maximum amplitude over the frontorolandic regions.11 A previous analysis of EEG dynamic changes in Genetic Absence Epilepsy Rats from Strasbourg (GAERS) has demonstrated that EEG epochs prior to seizures exhibit a higher degree of regularity/predictability than seizure-free EEG epochs, but a lower degree than seizure EEG epochs.12,13 These EEG precursors in rat models give us a clue for predicting human absence epilepsy via EEG signals.

1 State Key Laboratory of Cognitive Neuroscience and Learning & IDG/McGovern Institute for Brain Research, Beijing Normal University, Beijing, China
2 School of Information Engineering, Nanchang University, Nanchang, China
3 The Comprehensive Epilepsy Center, Departments of Neurology and Neurosurgery, Peking University People’s Hospital, Beijing, China
4 Center for Collaboration and Innovation in Brain and Learning Sciences, Beijing Normal University, Beijing, China

Corresponding Author: Gaoxiang Ouyang, Center for Collaboration and Innovation in Brain and Learning Sciences, Beijing Normal University, Beijing 100875, China. Email: [email protected]

Full-color figures are available online at http://eeg.sagepub.com


In this article, we propose a machine learning scheme to analyze EEG recordings and to explore how EEG data provide evidence for the existence of a preseizure phase in human absence epilepsy. Machine learning algorithms (eg, kernel machines including support vector machines [SVMs]14,15) have been used for epilepsy diagnosis based on EEG signals.16 Lima et al17 applied relevance vector machines (RVMs) to the detection of epileptic activity and found that, in terms of accuracy, the best-calibrated RVM models performed comparably to SVMs. Shoeb and Guttag18 used machine learning techniques to detect the onset of an epileptic seizure via the construction of patient-specific binary classifiers. Furthermore, Shoeb et al19 applied SVMs to the detection of seizure termination in scalp EEG and obtained satisfactory results. Similarly, Nandan et al20 adopted several types of SVMs to detect epileptic seizures in an animal model of chronic epilepsy and reported comparative results. However, the aforementioned algorithms mostly focused on classifier construction and did not consider human–computer interaction. In this article, we propose a machine learning algorithm based on relevance feedback (RF), a classical human–computer interaction technique in multimedia information processing. By embedding this interaction, promising results are achieved in distinguishing the changes in EEG during different absence seizure phases. The proposed scheme involves 3 stages: (a) signal processing (feature extraction), (b) dimensionality reduction, and (c) RF-based classification. Each stage is briefly described below. In this study, we collected 19 channels of EEG recordings from 10 patients (6 males and 4 females) with absence epilepsy. The EEG signals were sampled and filtered. After that, they were selected and dissected from seizure-free (data set I), preseizure (data set II), and seizure phases (data set III).
For each data set, a total of 112 two-second 19-channel EEG epochs from 10 patients were selected. Multiscale permutation entropy (MPE) explores the local order structure of successive coarse-grained time series. It is calculated at multiple scales to extract useful information for classification and has shown promising performance in absence epilepsy.21 We therefore extract MPE features in the first stage. When the dimension of the extracted feature vectors is much higher than the number of training examples, or exceeds a certain value, the curse of dimensionality22 arises and subsequent classification performance may be degraded. Considering this, we use dimensionality reduction23,24 to alleviate the problem and obtain a more compact representation for more accurate prediction of absence epilepsy. Here, we use linear discriminant analysis (LDA)24 to find a projection that reduces the higher dimensional feature space to a lower dimensional subspace. After dimensionality reduction, we use RF to embed human–computer interaction into the classification of different phases in human absence epilepsy. RF describes how we as humans interact with machines, where a machine is defined as any mechanical or electrical device that transmits or modifies energy to assist in the performance of human tasks. RF originated from document retrieval,25 but has been widely used in multimedia information retrieval because it can bridge the semantic gap between low-level visual features and high-level image concepts. Although RF was previously adopted in medical imaging,26 this study is a first attempt to apply the statistical analysis of RF to the distinction of different EEG activity states in absence epilepsy. Traditional RF methods in information retrieval include the following 2 steps27: (a) when retrieved results are returned to the user, some relevant and irrelevant examples are labeled as positive feedbacks and negative feedbacks, respectively; and (b) the retrieval system refines the retrieved results based on these labeled examples. These 2 steps are conducted iteratively until the user is satisfied with the presented results. Over the past few decades, RF techniques have been developed based on diverse machine learning techniques: feature selection, semisupervised learning, query modification, density estimation of positive samples, negative samples analysis, and distance metric learning.28-31 To accomplish the classification of different activity states in absence seizures, we adopt direct biased discriminant analysis (DBDA),32 treating RF as a (1 + x)-class biased learning problem. The organization of this article is as follows. We introduce the material and methods in the next section, which is followed by experimental results. The discussion and conclusions are presented in the final section.

Material and Methods

Data Acquisition

EEG recordings were collected from 10 patients (6 males and 4 females) with absence epilepsy, aged from 8 to 21 years. The study protocol had previously been approved by the ethics committee of Peking University People’s Hospital and the patients had signed informed consent that their clinical data might be used and published for research purposes. The EEG data were recorded by the Neurofile NT digital video EEG system from a standard international 10-20 electrode placement (Fp1, Fp2, F3, F4, C3, C4, P3, P4, O1, O2, F7, F8, T3, T4, T5, T6, Fz, Cz, and Pz). They were sampled at a frequency of 256 Hz using a 16-bit analog-to-digital converter and filtered within a frequency band from 0.5 to 35 Hz. Afterward, the EEG signals were selected and dissected from different seizure phases: seizure-free (data set I), preseizure (data set II), and seizure (data set III) phases. For each data set, a total of 112 two-second 19-channel EEG epochs from 10 patients were extracted. The timing of onset and offset in spike-wave discharges (SWDs) was identified by an epilepsy neurologist, and these SWDs were defined as large-amplitude rhythmic 2.5- to 4-Hz discharges with typical spike-wave morphology lasting longer than 1 second. The selection criteria were as follows: seizure-free data were separated from the beginning point of seizures by an interval greater than 15 seconds; preseizure data were taken from the interval between 0 and 2 seconds prior to seizure onset; and seizure data were taken from the first 2 seconds of the absence seizure. Figure 1 shows representative examples of 19-channel EEG recordings during seizure-free (I), preseizure (II), and seizure (III) phases, respectively. It is found that generalized SWDs with a repetition rate of 3 Hz are typically associated with clinical absence seizures.

Figure 1. Representative examples of 19-channel (from Fp1 to Pz) EEG recordings, where I, II, and III denote the EEG epochs during seizure-free, preseizure, and seizure intervals, respectively.

Feature Extraction

To investigate the dynamical characteristics of EEG data during different seizure phases, MPE21 was used to extract informative features from all EEG recordings. The MPE method is similar to the multiscale entropy (MSE) analysis,33 detailed information for which can be found in Ouyang et al.21 The code of MPE was downloaded from MATLAB Central File Exchange (MPerm.m). The MPE procedure contains the following 2 steps. First, a “coarse-graining” process is applied to a given time series $\{x_1, x_2, \ldots, x_N\}$ to construct a consecutive coarse-grained time series $y_j^{(s)}$ by averaging a successively increasing number of data points in nonoverlapping windows. Each element of $y_j^{(s)}$ is calculated according to

$$y_j^{(s)} = \frac{1}{s} \sum_{i=(j-1)s+1}^{js} x_i, \tag{1}$$

where $s$ is the scale factor and $1 \le j \le N/s$. The length of each coarse-grained time series is the integral part of $N/s$. Next, permutation entropy,34 which quantifies the local order structure of the time series, is calculated for each coarse-grained time series and then plotted as a function of $s$. Before computing the permutation entropy of a coarse-grained time series $y_j$, a series of vectors $V^m(n) = [y_n, y_{n+1}, \ldots, y_{n+m-1}]$ $(1 \le n \le N/s - m + 1)$ with length $m$ is derived from $y_j$. Afterward, $V^m(n)$ can be ranked in increasing order: $[y_{n+j_1-1} \le y_{n+j_2-1} \le \cdots \le y_{n+j_m-1}]$. For a given value of $m$, there are $m!$ possible order patterns $\pi$, which are also called permutations. Let $f(\pi)$ denote the frequency of permutation $\pi$ in the time series; the relative frequency is then $p(\pi) = f(\pi)/(N/s - m + 1)$. Consequently, the permutation entropy (PE) for the time series is defined as

$$PE = -\sum_{\pi=1}^{m!} p(\pi) \ln p(\pi). \tag{2}$$

The maximum value of PE is $\ln(m!)$, which is attained when all permutations have an equal probability. The minimum value of PE is zero, which indicates that the time series is very regular. In other words, the smaller the value of PE, the more regular the time series.
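The two steps of Equations 1 and 2 can be sketched in a few lines of Python. This is only an illustrative sketch, not the MPerm.m code used in the study: the function names are hypothetical, ties between equal samples are broken by order of appearance (an assumption), and the embedding dimension `m` is left as a parameter.

```python
import math
from collections import Counter


def coarse_grain(x, s):
    """Equation 1: average consecutive non-overlapping windows of length s."""
    n_windows = len(x) // s  # integral part of N/s
    return [sum(x[j * s:(j + 1) * s]) / s for j in range(n_windows)]


def permutation_entropy(y, m):
    """Equation 2: entropy (natural log) of the ordinal patterns of length m."""
    total = len(y) - m + 1  # number of vectors V^m(n)
    # The ordinal pattern of a window is the index order that sorts it;
    # sorted() is stable, so equal values keep their order of appearance
    # (an assumption about tie handling).
    patterns = Counter(
        tuple(sorted(range(m), key=lambda k: y[n + k])) for n in range(total)
    )
    return -sum((c / total) * math.log(c / total) for c in patterns.values())


def multiscale_permutation_entropy(x, m, scales):
    """PE of the coarse-grained series at each scale factor in `scales`."""
    return [permutation_entropy(coarse_grain(x, s), m) for s in scales]
```

For a 2-second epoch sampled at 256 Hz (512 samples), calling `multiscale_permutation_entropy(epoch, m, range(1, 6))` would return one PE value per scale, giving a 5-dimensional feature vector per channel.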

Dimensionality Reduction

After feature extraction, LDA was used to reduce the dimension of the feature vectors, lowering computational complexity while preserving sufficient discriminative information for the subsequent classification stage. Linear discriminant analysis23 is a supervised learning algorithm that takes the class label information into account. Given a set of labeled training examples, it aims at separating the examples from different classes while keeping the examples within the same class close to each other. LDA is one of the most classical linear subspace methods; it projects the original higher dimensional data points $x_i \in \Re^n$ to a lower dimensional space through a linear transformation. Each feature vector can be considered as a point in the feature space. Given that the original high-dimensional data points $X = \{x_1, x_2, \ldots, x_N\}$ in $\Re^n$ belong to $c$ classes, the between-class scatter matrix $S_b$ and the within-class scatter matrix $S_w$ are given by

$$S_b = \frac{1}{N} \sum_{i=1}^{c} N_i (m_i - m)(m_i - m)^T, \qquad S_w = \frac{1}{N} \sum_{i=1}^{c} \sum_{j=1}^{N_i} (x_{i;j} - m_i)(x_{i;j} - m_i)^T, \tag{3}$$

where, in the $i$th class, $N_i$ is the number of data points, $x_{i;j}$ represents the $j$th example, and $m_i = (1/N_i) \sum_{j=1}^{N_i} x_{i;j}$ is the mean value; $N = \sum_{i=1}^{c} N_i$ is the number of all training examples; and $m = (1/N) \sum_{i=1}^{c} \sum_{j=1}^{N_i} x_{i;j}$ is the mean vector of the whole input data. The formulation of LDA is to maximize the ratio between $S_b$ and $S_w$ in the projected low-dimensional subspace:

$$U_{opt} = \arg\max_{U} \frac{U^T S_b U}{U^T S_w U}. \tag{4}$$

The generalized eigenvalue problem is $S_b U = \lambda S_w U$, and the resulting lower dimensional subspace is spanned by $U = \{u_1, u_2, \ldots, u_L\}$ $(L \le c - 1)$. Herein, the covariance matrix of all training examples is $S_t = (1/N) \sum_{i=1}^{c} \sum_{j=1}^{N_i} (x_{i;j} - m)(x_{i;j} - m)^T = S_b + S_w$, which is also called the total-class scatter matrix.

Figure 2. The relevance feedback procedure.
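The scatter matrices of Equation 3 and the identity $S_t = S_b + S_w$ can be checked numerically with a small sketch. The helper names below are hypothetical and the code uses plain Python lists for clarity; it builds the three matrices only and does not solve the generalized eigenvalue problem of Equation 4.

```python
def mean_vector(points):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def add_outer(acc, v, weight):
    """In place: acc += weight * v v^T."""
    for r in range(len(v)):
        for c in range(len(v)):
            acc[r][c] += weight * v[r] * v[c]


def scatter_matrices(classes):
    """Return (Sb, Sw, St) for `classes`, a list with one list of vectors per class."""
    all_points = [x for cls in classes for x in cls]
    n_total, dim = len(all_points), len(all_points[0])
    m = mean_vector(all_points)
    Sb = [[0.0] * dim for _ in range(dim)]
    Sw = [[0.0] * dim for _ in range(dim)]
    St = [[0.0] * dim for _ in range(dim)]
    for cls in classes:
        mi = mean_vector(cls)
        # Between-class term: N_i (m_i - m)(m_i - m)^T / N
        add_outer(Sb, [a - b for a, b in zip(mi, m)], len(cls) / n_total)
        for x in cls:
            # Within-class term: (x - m_i)(x - m_i)^T / N
            add_outer(Sw, [a - b for a, b in zip(x, mi)], 1.0 / n_total)
    for x in all_points:
        # Total scatter: (x - m)(x - m)^T / N
        add_outer(St, [a - b for a, b in zip(x, m)], 1.0 / n_total)
    return Sb, Sw, St
```

With $c = 3$ classes, $S_b$ has rank at most 2, which is why at most 2 discriminant directions are available after projection.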

Relevance Feedback

Generally, RF is widely considered as a 2-class learning problem, treating positive examples and negative examples in a symmetric way. The learning flowchart of RF is given in Figure 2. When a query is input, its features are extracted and compared with those previously stored in the data set based on a similarity measure. Within the top returned results, the user labels some relevant examples as positive feedbacks and some irrelevant examples as negative feedbacks. Based on these labeled feedbacks, the RF model can be enhanced iteratively and return final results to the user. In this article, we use DBDA32 as the relevance feedback technique. DBDA is regarded as an improvement of biased discriminant analysis (BDA),35 which treats positive examples and negative examples asymmetrically. Both are introduced as follows.

Biased Discriminant Analysis. As users usually label both positive examples and negative examples, RF is considered as a 2-class pattern classification problem. However, just like “happy families are all alike, every unhappy family is unhappy in its own way” (Leo Tolstoy’s Anna Karenina), positive examples are all alike and each negative example is negative in its own way. That is, there is an asymmetry between positive examples and negative examples. Moreover, users are only interested in one class (the positive class), that is, the returned results should be similar to the query, and negative examples are too few to represent the true nonlinear distributions. Therefore, it is more reasonable to assume there is one positive class but the number of other classes is uncertain. Based on the aforementioned concepts, BDA35 treats RF as a (1 + x)-class biased learning problem (biased toward the positive class) and labels training examples as only positive or negative in order to explore whether they belong to the target class or not. In this way, positive examples are pulled closer to each other while negative examples are pushed away from the positive ones. BDA is easier to understand in light of the formulation of LDA introduced in a previous section. The objective of BDA is to maximize the ratio between the biased matrix $S_y$ and the positive covariance matrix $S_x$:

$$W = \arg\max_{W} \frac{W^T S_y W}{W^T S_x W}, \tag{5}$$

where

$$S_x = \sum_{i=1}^{N_x} (x_i - m_x)(x_i - m_x)^T, \qquad S_y = \sum_{i=1}^{N_y} (y_i - m_x)(y_i - m_x)^T, \tag{6}$$

given that $x_i$ belongs to the positive class and $y_i$ denotes the negative examples. Herein, $N_x$ is the number of positive examples, $N_y$ is the number of negative examples, and $m_x = (1/N_x) \sum_{i=1}^{N_x} x_i$ is the mean vector of the positive examples. To obtain $W$, we can compute the eigenvectors of $S_x^{-1} S_y$.

Direct Biased Discriminant Analysis. DBDA,32 which is regarded as an enhanced BDA, adopts the same idea as direct LDA.36 In DBDA, it is assumed that the null space of $S_y$ contains no important information for discriminating different classes and the discriminant vectors are restricted to the subspace spanned by class centers. Therefore, the formulation of DBDA is obtained by first diagonalizing $S_y$ and then removing its null space:

$$Y^T S_y Y = D_y > 0. \tag{7}$$

Here, $D_y$ comprises the corresponding nonzero eigenvalues of $S_y$ and $Y$ comprises the eigenvectors. Then, $S_x$ is transformed to

$$K_x = D_y^{-1/2} Y^T S_x Y D_y^{-1/2}, \tag{8}$$

where $K_x$ is diagonalized by eigenanalysis:

$$U^T K_x U = D_x. \tag{9}$$

The BDA transformation matrix is defined as

$$W = Y D_y^{-1/2} U D_x^{-1/2}. \tag{10}$$
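Equations 6 to 10 translate almost line by line into NumPy. The sketch below is an illustration under stated assumptions rather than the authors' implementation: the function name `dbda_transform` and the eigenvalue tolerance `tol` are hypothetical, and $K_x$ is assumed to be positive definite so that $D_x^{-1/2}$ exists.

```python
import numpy as np


def dbda_transform(pos, neg, tol=1e-10):
    """Return W of Equation 10.

    pos: (Nx, d) array of positive feedback examples.
    neg: (Ny, d) array of negative feedback examples.
    tol: relative threshold for treating an eigenvalue of Sy as zero
         (an assumption; the source does not specify one).
    """
    mx = pos.mean(axis=0)
    Sx = (pos - mx).T @ (pos - mx)  # Equation 6, positive scatter
    Sy = (neg - mx).T @ (neg - mx)  # Equation 6, biased scatter around mx

    # Equation 7: diagonalize Sy and discard its null space.
    evals, evecs = np.linalg.eigh(Sy)
    keep = evals > tol * evals.max()
    Y, Dy = evecs[:, keep], evals[keep]
    Dy_inv_sqrt = np.diag(Dy ** -0.5)

    # Equation 8: express Sx in the whitened range space of Sy.
    Kx = Dy_inv_sqrt @ Y.T @ Sx @ Y @ Dy_inv_sqrt

    # Equation 9: eigenanalysis of Kx (assumed positive definite here).
    Dx, U = np.linalg.eigh(Kx)

    # Equation 10.
    return Y @ Dy_inv_sqrt @ U @ np.diag(Dx ** -0.5)
```

By construction, $W^T S_x W = I$ and $W^T S_y W = D_x^{-1}$, so the positive examples are whitened while the directions along which negatives lie far from the positive mean are stretched. Ranking examples by distance to the positive mean in the projected space would be one way to use $W$, though that choice of similarity measure is an assumption here.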

Adaptive Neuro-Fuzzy Inference System

To compare the classification accuracy of the proposed relevance feedback scheme with that of a traditional method, the Adaptive Neuro-Fuzzy Inference System (ANFIS)37 is also adopted to evaluate the ability and effectiveness of the MPE measures in classifying different seizure phases. The ANFIS learns features in the data set and adjusts the system parameters according to a given error criterion. For more details, please refer to Jang.37 To improve generalization, 3 ANFIS classifiers are trained with the backpropagation gradient descent method in combination with the least squares method, with the calculated MPE measures used as input. Each ANFIS classifier is trained so that it is likely to be more accurate for one state of EEG signals than for the other states. The samples from the seizure-free (data set I), preseizure (data set II), and seizure (data set III) phases are given the binary target values of (0, 0, 1), (0, 1, 0), and (1, 0, 0), respectively. Each ANFIS classifier is implemented using the MATLAB software package (MATLAB version 7.0 with fuzzy logic toolbox).

Experimental Results

Multiscale Permutation Entropy Measure of EEG Data

The MPE measure was applied to analyze all 6384 two-second EEG epochs in this study (112 epochs × 19 channels from each of data sets I, II, and III). Scale 1 (ie, s = 1) is the only scale considered by traditional single-scale-based methods. For example, the permutation entropy values for EEG segments of channel F3 averaged 1.694 ± 0.092 in the seizure-free phase, 1.569 ± 0.116 in the preseizure phase, and 1.369 ± 0.105 in the seizure phase. That is, the entropy values in the seizure-free and preseizure phases are larger than those in the seizure phase. We computed MPE at 5 scales, and similar permutation entropy results were obtained for the other 4 scales. For all EEG epochs from 19 channels, the mean permutation entropy values of EEG were plotted by subgroups, as shown in Figure 3. Next, to investigate whether their distributions over the three states are significantly different, a 1-way analysis of variance test was applied to the entropy values on each scale and each channel. The critical value is Fcrit(2, 333) = 4.67 at α = 0.01; the test statistic must exceed this value to reject the null hypothesis. For example, the results of the MPE at scale 1 from channel F3 are shown in Table 1. It can be seen that the statistical test yields an F statistic (F = 274.1) that is much higher than the threshold Fcrit. This suggests that the null hypothesis, that is, no differences between these 3 different groups, should be rejected. The results of the F statistic from all 19 channels and 5 scales are shown in Figure 4. It can be seen that, on all 19 channels and 5 scales, the values of the F statistic (x) are much higher than the threshold Fcrit. Therefore, the differences between the 3 different groups of EEG epochs are significant at the 1% significance level for each channel and each scale. However, there is considerable overlap between the permutation entropy values in seizure-free, preseizure, and seizure phases, so it is difficult to use the permutation entropy itself directly for classification. To distinguish the changes in EEG during the preseizure phase, a further step (classification) is needed.
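The F statistic of the 1-way analysis of variance can be reproduced from sums of squares and degrees of freedom. The sketch below (hypothetical function name, plain Python) computes it from raw groups and then repeats the arithmetic for the channel F3, scale 1 values reported in Table 1. Dividing the unrounded sums gives approximately 275; the reported 274.1 corresponds to dividing the mean square 3.015 by the rounded within-sample mean square 0.011.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of numbers."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    # Between-group sum of squares: group size times squared mean offset.
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    # Within-group sum of squares: squared deviations from each group mean.
    ss_within = sum(
        (v - sum(g) / len(g)) ** 2 for g in groups for v in g
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Arithmetic check against the sums reported in Table 1 (3 groups of 112 epochs).
ss_between, ss_within = 6.03, 3.65
df_between, df_within = 3 - 1, 3 * 112 - 3  # 2 and 333
f_from_table = (ss_between / df_between) / (ss_within / df_within)
```

Either value is far above the critical value Fcrit(2, 333) = 4.67 quoted in the text, so the conclusion is unaffected by the rounding.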

Figure 3. Multiscale permutation entropy (MPE) analysis of EEG recordings during seizure-free, preseizure, and seizure phases for each channel. Bars represent the mean values and standard deviation of permutation entropy (PE) for each group.

Table 1. One-Way Analysis of Variance (ANOVA) Test.

Source of Variation   Sums of Squares   Degrees of Freedom   Mean Square   F Statistic
Between samples       6.03              2                    3.015         274.1 (P < 0.01)
Within samples        3.65              333                  0.011
Total                 9.68              335

Classification

In this section, we carried out a series of statistical experiments to explore the utility of using RF on EEG data to distinguish between different activity states in human absence epilepsy. For each data set, MPE features of the 112 19-channel EEG epochs from 10 patients were extracted at 5 different scales, so that each channel of the 112 samples was described by a 5-dimensional feature vector. Multiple scales may reveal information about neural connectivity that is diagnostically useful,21,38,39 and thus are more suitable for the phase-classification problem in epilepsy. Afterward, LDA was implemented for dimensionality reduction. As the rank of the between-class scatter matrix is at most c − 1 (where c is the number of classes), the maximum number of eigenvectors with nonzero eigenvalues is c − 1. As a result, we represented each EEG epoch by a 2-dimensional feature vector. Subsequently, RF was evaluated on data set I, data set II, and data set III with 3 concept groups, which are seizure-free, preseizure, and seizure phases, respectively.

Classification With Relevance Feedback Classifier. In this study, experiments with 68 different queries (ie, EEG epochs) from all data sets were performed over nine iterations. As an RF algorithm, DBDA is embedded in the proposed scheme, and the computer automatically conducts the relevance feedback iterations, without mislabeled epochs, using the 3 concept groups described previously. When a query was submitted, its MPE features were extracted and reduced by LDA. Then, all the EEG epochs in the data sets were sorted based on a similarity metric. For each iteration, the concept group of the query was serially compared with the concept groups of the top 50 sorted EEG epochs, where the first 5 relevant (correct) EEG epochs were labeled as positive feedbacks and the first 5 irrelevant (incorrect) EEG epochs were labeled as negative feedbacks. Using this feedback process, the system is trained with the embedded DBDA algorithm. Then, all EEG epochs were re-sorted based on the recalculated similarity metric. The RF process iterates in this way; if fewer than 5 such EEG epochs were found among the top 50 sorted epochs, the smaller number found was used as feedback. Precision and standard deviation (SD) were used to evaluate the performance. Precision is the percentage of correctly classified EEG epochs in the top N returned results, describing the effectiveness of the RF algorithm, while SD serves as an error bar to record its robustness. Both metrics were computed as the average values over the 68 queries. Figure 5 shows the average precision for the 68 experiments for the top 10, 20, 30, and 40 results, which demonstrates the effectiveness of RF in classifying different phases in epilepsy. The dendrograms in Figure 6 visually represent the correlation of data and intuitively express the difference before and after dimensionality reduction by LDA. In a dendrogram, individual examples are arranged along the x-axis and referred to as leaf nodes, each of which has a right and left subbranch of clustered examples. The y-axis denotes the rescaled intraclass or interclass distance, and the height of a node can be considered as the distance between the left and right subbranch clusters. In Figure 6, the dendrogram on the left is obtained by calculating raw MPE features for EEG epochs, while the one on the right is obtained by further processing the MPE features by LDA.
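The automatic feedback labeling and the precision measure described in this subsection can be sketched as follows. The function names and the representation of a ranking as a list of concept-group labels are illustrative assumptions; the defaults of 50 and 5 follow the values stated in the text.

```python
def precision_at_n(ranked_labels, query_label, n):
    """Fraction of the top n returned epochs whose concept group matches the query."""
    top = ranked_labels[:n]
    if not top:
        return 0.0
    return sum(1 for label in top if label == query_label) / len(top)


def select_feedback(ranked_labels, query_label, pool=50, per_class=5):
    """Pick feedback positions from the top `pool` ranked epochs.

    Returns (positive_positions, negative_positions): the first `per_class`
    relevant and the first `per_class` irrelevant positions in the ranking.
    If fewer are found, the shorter lists are returned as they are.
    """
    positives, negatives = [], []
    for position, label in enumerate(ranked_labels[:pool]):
        if label == query_label:
            if len(positives) < per_class:
                positives.append(position)
        elif len(negatives) < per_class:
            negatives.append(position)
    return positives, negatives
```

In each iteration, the epochs at the selected positions would be passed to the DBDA step as positive and negative examples, after which all epochs are re-sorted and the precision for the top 10, 20, 30, and 40 results is recomputed.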


Furthermore, to explore whether MPE features calculated from multichannel EEG recordings outperform those extracted from a single channel, we computed only the MPE features corresponding to the electrode F3 instead of the 19-channel MPE features. After conducting LDA and DBDA, the classification results are given in Figure 7. Roughly speaking, single-channel MPE features obtain comparable performance to multichannel MPE features for the top 10, 20, 30, and 40 retrieved results. However, a closer look at Figures 5 and 7 shows that, at the beginning of RF (the first iteration), the classification rates of multichannel MPE features for the top 10, 20, 30, and 40 results are all higher than those of single-channel MPE features: 89.5% versus 86.0%, 83.0% versus 74.0%, 80.0% versus 68.0%, and 78.0% versus 65.0%, respectively. This shows the robustness of multichannel features and demonstrates that they contain more discriminative information for the classification of different phases in epilepsy.

Figure 4. The values of F statistic from all 19 channels and 5 scales. The mean F statistic is 168.1 (from 95.1 to 286.3), 163.7 (from 89.2 to 268.8), 165.9 (from 83.9 to 263.5), 160.4 (from 74.3 to 263.0), and 159.2 (from 74.0 to 265.7) on scale 1 to 5, respectively.

Figure 5. Average relevance feedback (RF) performance up to 9 iterations for 68 randomly selected queries by extracting 19-channel multiscale permutation entropy (MPE) features. Precision and SD are reported in top 10, 20, 30, and 40 retrieved results, respectively.

Classification With ANFIS Classifier. Finally, the ability of the above measures to discriminate among groups is also evaluated by means of the ANFIS classifier, and 10-fold cross-validation is employed to estimate the accuracy of classification.40,41 The MPE features are used as input data in the ANFIS classifier. Classification results of the ANFIS model revealed that, of 336 EEG segments in 3 groups, 299 are classified correctly. The total classification accuracy of the ANFIS model is 89.0% using 10-fold cross-validation, which is 8.5 percentage points lower than the average precision in the top 10 results (97.5%) obtained by the proposed RF scheme; in relative terms, the RF precision is about 9.6% higher.

Discussion and Conclusions In this article, we propose a machine learning scheme to investigate the utility of using RF on EEG data to distinguish between different activity states (ie, seizure-free, preseizure, and seizure phases) in human absence epilepsy. To this end, MPE is first calculated to analyze the dynamical characteristics of EEG data during different activity states. For an MPE algorithm, the dimension m plays a key role in calculating the MPE. Since there are only very few distinct phases for EEG recordings,42 when m is too small (