An SVM-wrapped multiobjective evolutionary feature selection approach for identifying cancer-microRNA markers.

IEEE TRANSACTIONS ON NANOBIOSCIENCE, VOL. 12, NO. 4, DECEMBER 2013


An SVM-Wrapped Multiobjective Evolutionary Feature Selection Approach for Identifying Cancer-MicroRNA Markers

Anirban Mukhopadhyay*, Senior Member, IEEE, and Ujjwal Maulik, Senior Member, IEEE

Abstract—MicroRNAs (miRNAs) have been shown to play important roles in gene regulation and various biological processes. Recent studies have revealed that abnormal expression of some specific miRNAs often results in the development of cancer. Microarray datasets containing the expression profiles of several miRNAs are being used to identify miRNAs that are differentially expressed in normal and malignant tissue samples. In this article, a multiobjective feature selection approach is proposed for this purpose. The proposed method uses a genetic algorithm for multiobjective optimization and a support vector machine (SVM) classifier as a wrapper for evaluating the chromosomes that encode feature subsets. The performance has been demonstrated on real-life miRNA datasets, and the identified miRNA markers are reported. Moreover, biological significance tests have been carried out for the obtained markers.

Index Terms—MicroRNA marker, multiobjective feature selection, Pareto-optimality, support vector machine.

Manuscript received July 01, 2012; revised June 03, 2013; accepted August 07, 2013. Date of publication November 05, 2013; date of current version January 07, 2014. Asterisk indicates corresponding author. *A. Mukhopadhyay is with the Department of Computer Science & Engineering, University of Kalyani, Kalyani 741235, India (e-mail: [email protected]). U. Maulik is with the Department of Computer Science & Engineering, Jadavpur University, Kolkata 700032, India (e-mail: [email protected]). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TNB.2013.2279131

I. INTRODUCTION

MicroRNAs (miRNAs) are endogenous ribonucleic acids (RNAs) of approximately 22 nucleotides (nt) that play important regulatory roles in animals and plants by targeting messenger RNAs (mRNAs, the transcribed products of DNA that are translated into proteins) for cleavage or translational repression. It is now well established that miRNAs regulate different cellular processes such as cell differentiation, development, and genomic stability in eukaryotes [1]. In recent times, researchers have been developing computational methods for the analysis of miRNAs. Such computational techniques are expected to complement the experimental work that needs to be carried out in the wet laboratory for analyzing miRNAs.

It has been found in several studies that some miRNAs are differentially expressed in normal and cancerous tissues of all types. These findings suggest possible links between miRNAs and oncogenesis. Moreover, it has also been found that some miRNAs are differentially expressed in tissue-specific tumors, which indicates that it might be possible to diagnose the cancer type from these onco-miRNA signatures. Hence the development of computational methods for detecting onco-miRNAs that target onco-genes is a crucial task that could provide alternative ways of diagnosis and therapy of the disease.

Microarray data have been extensively studied for gene expression analysis, leading to many methodological works including gene clustering and gene marker selection [2]–[6]. The analysis of miRNA microarrays, however, has gained attention only more recently. One of the major barriers in the study of miRNA functions was the absence of methods for quantitative expression profiling [7]. It is possible to use the existing marker selection methods from gene expression studies for miRNA expression data as well. However, miRNA expression datasets have certain characteristics that might need to be considered when applying such methods to miRNA microarray datasets. In most cases, the expression profiles of miRNAs derived from microarray experiments are tissue-specific in nature. Moreover, miRNAs are sometimes obtained for expression profiling from common tissues (by locality) of various patients for the purpose of disease diagnosis. As the miRNAs are shorter in length, the purity, variance, and dimension of the miRNA microarray datasets are usually much smaller than those of gene expression datasets. Therefore, it is important to explore these datasets to discover the underlying biological activities of miRNAs [8].

A miRNA expression dataset consisting of $n$ samples and $m$ miRNAs can be thought of as a two-dimensional matrix $E = [e_{ij}]_{n \times m}$, where each element $e_{ij}$ represents the expression level of the $j$th miRNA for the $i$th sample. To identify relevant miRNAs, the problem has been modeled as a feature selection problem where the miRNAs are considered as features. So the problem is to obtain a subset of features that optimizes some performance measure. Here, we have applied a multiobjective Genetic Algorithm-based feature selection technique that encodes a possible feature subset in its chromosomes.
Non-dominated Sorting Genetic Algorithm-II (NSGA-II) [9], a popular multiobjective GA, has been utilized as the underlying optimization tool. The fitness of the chromosomes has been evaluated using a support vector machine (SVM) [10], [11] classifier, and three objective functions have been simultaneously optimized. The objective functions considered here are the number of features, specificity, and sensitivity. The first objective is minimized and the other two objectives are maximized. Subsequently, the most promising miRNAs responsible for distinguishing the normal and malignant classes are obtained by a third measure (called the F-measure). A radial basis function (RBF) kernel is used for the SVM classifier. The performance of the proposed method has been demonstrated on publicly available miRNA expression datasets of six different tissue types, viz., breast, colon, kidney, lung, prostate, and uterus. The experimental results establish the utility of the proposed technique. First, experiments have been conducted to discover miRNA markers that distinguish the normal and malignant samples globally for all types of tissue samples. Subsequently, biological significance tests have been conducted for the selected markers.

The rest of the article is organized as follows. The next section provides the basic concepts of multiobjective optimization. Section III gives a brief discussion of SVM classifiers. In Section IV, the proposed method is described in detail. Section V provides the data description and preprocessing. Section VI reports the experimental results. Finally, Section VII concludes the article.

1536-1241 © 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

II. MULTIOBJECTIVE OPTIMIZATION

Many real-life optimization problems are multiobjective in nature. Unlike single objective optimization, several objectives are simultaneously optimized in multiobjective optimization (MOO). The MOO problem can formally be stated as follows [12]: Find the vector $\bar{x}^* = [x_1^*, x_2^*, \ldots, x_n^*]^T$ of decision variables which satisfies a number of equality and inequality constraints and optimizes the vector function

$$\bar{f}(\bar{x}) = [f_1(\bar{x}), f_2(\bar{x}), \ldots, f_k(\bar{x})]^T.$$

The constraints define the feasible region $\mathcal{F}$ which contains all the admissible solutions. The vector $\bar{x}^*$ denotes an optimal solution in $\mathcal{F}$. The concept of Pareto-optimality is useful in the domain of multiobjective optimization. A formal definition of Pareto-optimality from the viewpoint of the minimization problem may be given as follows: A decision vector $\bar{x}^*$ is called Pareto-optimal if and only if there is no $\bar{x}$ that dominates $\bar{x}^*$, i.e., there is no $\bar{x} \in \mathcal{F}$ such that

$$\forall i \in \{1, 2, \ldots, k\}: \; f_i(\bar{x}) \le f_i(\bar{x}^*)$$

and

$$\exists i \in \{1, 2, \ldots, k\}: \; f_i(\bar{x}) < f_i(\bar{x}^*).$$

In other words, $\bar{x}^*$ is Pareto-optimal if there exists no feasible vector $\bar{x}$ which causes a reduction on some criterion without a simultaneous increase in at least one other. In general, a Pareto-optimum admits a set of solutions called non-dominated solutions.

Among the available MOO techniques, the genetic algorithm (GA) [13] based techniques such as Non-dominated Sorting GA-II (NSGA-II) [9], Strength Pareto Evolutionary Algorithm (SPEA) and SPEA2 [14], and Pareto Archived Evolutionary Strategy (PAES) [15] are very popular. NSGA-II is an improvement over its previous version NSGA in terms of computation time. In [16], it has been shown that NSGA-II performs better compared to several other MOO techniques. Hence the multiobjective feature selection scheme considered here uses NSGA-II as the underlying multiobjective framework.

III. SUPPORT VECTOR MACHINE CLASSIFIERS

Support vector machine (SVM) classifiers are inspired by statistical learning theory and they perform structural risk minimization on a nested set structure of separating hyperplanes [10], [11]. Viewing the input data as two sets of vectors in a $d$-dimensional space, an SVM constructs a separating hyperplane in that space, which maximizes the margin between the two classes of points. To compute the margin, two parallel hyperplanes are constructed, one on each side of the separating one, which are “pushed up against” the two classes of points. Intuitively, a good separation is achieved by the hyperplane that has the largest distance to the neighboring data points of both classes. A larger margin or distance between these parallel hyperplanes indicates better generalization of the classifier. Fundamentally the SVM classifier is designed for two-class problems. It can be extended to handle multi-class problems by designing a number of one-against-all or one-against-one two-class SVMs.

Suppose a dataset consists of $n$ feature vectors $\langle x_i, y_i \rangle$, where $y_i \in \{+1, -1\}$ denotes the class label for the data point $x_i$. The problem of finding the weight vector $w$ can be formulated as minimizing the following function:

$$L(w, \xi) = \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} \xi_i \qquad (1)$$

subject to

$$y_i \left[ w \cdot \phi(x_i) + b \right] \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, \ldots, n. \qquad (2)$$

Here, $b$ is the bias, the $\xi_i$ are slack variables, and the function $\phi(x)$ maps the input vector to the feature vector. The dual formulation is given by maximizing the following:

$$Q(\alpha) = \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j K(x_i, x_j) \qquad (3)$$

subject to

$$\sum_{i=1}^{n} y_i \alpha_i = 0 \quad \text{and} \quad 0 \le \alpha_i \le C, \quad i = 1, \ldots, n. \qquad (4)$$

The parameter $C$, called the regularization parameter, controls the tradeoff between the complexity of the SVM and the misclassification rate. Only a small fraction of the $\alpha_i$ coefficients are nonzero. The corresponding pairs of $\langle x_i, y_i \rangle$ entries are known as support vectors and they fully define the decision function. Geometrically, the support vectors are the points lying near the separating hyperplane. $K(x_i, x_j) = \phi(x_i) \cdot \phi(x_j)$ is called the kernel function. Kernel functions are used for mapping the input space to a higher dimensional feature space so that the classes become linearly separable. Among the various available kernel functions, the radial basis function (RBF) has been used because it is known to perform well for overlapping classes and noisy data. The RBF kernel is defined as

$$K(x_i, x_j) = e^{-\gamma \|x_i - x_j\|^2}. \qquad (5)$$
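To illustrate how the support vectors and the kernel together define the decision function, the following Python sketch evaluates an RBF kernel and the resulting kernel-expansion decision value. It is a minimal sketch, assuming the common parameterization $K(x, z) = \exp(-\gamma \|x - z\|^2)$; the function names, the parameter name `gamma`, and the hand-picked toy coefficients are illustrative assumptions and do not come from the authors' implementation.

```python
import math


def rbf_kernel(x, z, gamma):
    """RBF kernel K(x, z) = exp(-gamma * ||x - z||^2) (assumed parameterization)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def decision_value(x, support_vectors, labels, alphas, bias, gamma):
    """Kernel expansion: sum_i alpha_i * y_i * K(x_i, x) + b.

    Only support vectors (nonzero alpha_i) need to be passed in,
    because all other training points contribute nothing to the sum.
    """
    return bias + sum(
        a * y * rbf_kernel(sv, x, gamma)
        for sv, y, a in zip(support_vectors, labels, alphas)
    )


# Toy illustration with hand-picked (not trained) coefficients.
svs = [[0.0, 0.0], [2.0, 2.0]]
ys = [-1, +1]
alphas = [1.0, 1.0]

# A point close to the positive support vector gets a positive decision value.
value = decision_value([1.9, 2.1], svs, ys, alphas, bias=0.0, gamma=0.5)
print("predicted class:", +1 if value > 0 else -1)
```

The sign of the decision value gives the predicted class. In practice the coefficients and bias would come from solving the dual problem rather than being set by hand.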


Fig. 1. Chromosome encoding scheme for NSGA-II based feature selection.
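The encoding depicted in Fig. 1 is a binary chromosome whose first part marks the selected miRNAs and whose second part stores the SVM regularization parameter in binary. It can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the trailing bits are read most-significant-bit first and mapped linearly onto the stated range [0,100]. The function name `decode_chromosome`, its arguments, and the 4-bit parameter field in the example are hypothetical and chosen only for illustration.

```python
def decode_chromosome(bits, n_features, c_min=0.0, c_max=100.0):
    """Split a binary chromosome into (selected feature indices, C).

    bits       : list of 0/1 values; the first n_features bits form the feature mask,
                 the remaining bits encode the regularization parameter.
    c_min/c_max: target range for C (linear mapping is an assumption of this sketch).
    """
    mask, c_bits = bits[:n_features], bits[n_features:]
    if not c_bits:
        raise ValueError("chromosome has no bits left for the parameter part")

    # Bit "1" at position i means feature (miRNA) i is selected.
    selected = [i for i, b in enumerate(mask) if b == 1]

    # Read the parameter bits as an unsigned integer, most significant bit first.
    raw = int("".join(str(b) for b in c_bits), 2)
    max_raw = 2 ** len(c_bits) - 1
    c_value = c_min + (c_max - c_min) * raw / max_raw
    return selected, c_value


# 5 feature bits followed by 4 parameter bits (sizes are illustrative only).
chromosome = [1, 0, 1, 1, 0, 1, 0, 0, 0]
selected, c_value = decode_chromosome(chromosome, n_features=5)
print(selected, round(c_value, 2))
```

Keeping the parameter inside the chromosome means crossover and mutation act on the feature mask and on the parameter value at the same time, so no separate parameter search is required.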

IV. PROPOSED METHOD

The proposed technique consists of two stages. First, a multiobjective feature selection method wrapped with an SVM classifier is employed. Next, the miRNAs selected in the different solutions of the non-dominated set are used to obtain a single set of the most promising miRNAs that effectively distinguish the two classes of tissue samples, benign and malignant. The two stages are described in detail below.

A. Feature Selection Using NSGA-II

An NSGA-II [9] based feature selection algorithm wrapped with an SVM classifier has been developed. In this technique, each chromosome in the population is a binary string having two parts. The first part is of length equal to the number of miRNAs in the dataset. In this part, bit “1” indicates that the corresponding miRNA is selected, and bit “0” indicates that the corresponding miRNA is not selected. The second part of the chromosome has a fixed number of bits and encodes the value of the SVM regularization parameter $C$ in binary. The decimal value encoded in these bits is mapped to the range [0,100] to obtain the parameter $C$. Fig. 1 illustrates the chromosome encoding scheme.

Three objective functions are optimized simultaneously. To compute the objective values for a chromosome, first the subset of miRNAs encoded in the chromosome is extracted. Let this set be denoted by $S$. So $S$ contains those miRNAs for which the bit positions of the chromosome have value “1.” The samples in the training set are classified on the subspace $S$ using leave-one-out cross validation by SVM in order to find the objective function values corresponding to the chromosome. From the output of the cross validation, the number of true positives $TP$, false positives $FP$, true negatives $TN$, and false negatives $FN$ are computed. The first objective function is the sensitivity [17], which is defined as:

$$\text{Sensitivity} = \frac{TP}{TP + FN}. \qquad (6)$$

The second objective is the specificity [17], which is computed as:

$$\text{Specificity} = \frac{TN}{TN + FP}. \qquad (7)$$
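The two classification objectives can be illustrated with a short Python sketch that runs leave-one-out cross validation on the selected feature columns and derives sensitivity, TP/(TP+FN), and specificity, TN/(TN+FP), from the resulting confusion counts. It is a sketch under stated assumptions rather than the authors' code: a 1-nearest-neighbour rule stands in for the RBF-kernel SVM so that the example needs no external library, the tumor class is assumed to be the positive class, and all function names and the toy data are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FP, TN, FN) for binary labels."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive:
            if t == positive:
                tp += 1
            else:
                fp += 1
        else:
            if t == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def sensitivity_specificity(y_true, y_pred, positive=1):
    """Sensitivity = TP/(TP+FN), specificity = TN/(TN+FP)."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    sens = tp / (tp + fn) if (tp + fn) else 0.0
    spec = tn / (tn + fp) if (tn + fp) else 0.0
    return sens, spec


def leave_one_out_predictions(X, y, selected, fit_predict):
    """Predict every sample from a model trained on all other samples,
    using only the feature columns listed in `selected`."""
    Xs = [[row[j] for j in selected] for row in X]
    preds = []
    for i in range(len(Xs)):
        train_X = Xs[:i] + Xs[i + 1:]
        train_y = y[:i] + y[i + 1:]
        preds.append(fit_predict(train_X, train_y, Xs[i]))
    return preds


def nearest_neighbour(train_X, train_y, x):
    """Stand-in classifier (1-NN); the method in the text uses an SVM here."""
    dists = [sum((a - b) ** 2 for a, b in zip(row, x)) for row in train_X]
    return train_y[dists.index(min(dists))]


# Toy data: column 0 separates the classes, column 1 is noise.
X = [[0.1, 5.0], [0.2, 1.0], [0.0, 3.0], [0.9, 4.0], [1.0, 2.0], [0.8, 0.0]]
y = [0, 0, 0, 1, 1, 1]  # 1 = tumor (assumed positive class), 0 = normal

preds = leave_one_out_predictions(X, y, selected=[0], fit_predict=nearest_neighbour)
sens, spec = sensitivity_specificity(y, preds)
print(sens, spec, "features used:", 1)
```

Optimizing sensitivity and specificity as separate objectives, instead of a single accuracy value, keeps the search from favouring the larger class when the two classes are unbalanced.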

The third objective is the number of selected features, i.e., the cardinality of the selected miRNA set $S$, which is given by:

$$\text{Number of features} = |S|. \qquad (8)$$

The goal is to maximize the first two objectives while simultaneously minimizing the third. That is, we try to find the smallest set of miRNAs that correctly classifies the benign and malignant tissue samples. In the final generation, a set of non-dominated solutions, each encoding a promising feature (miRNA) subset, is obtained.

For selection, the crowded binary tournament selection method [9] is used. After selection, uniform crossover is applied to the chromosomes, and subsequently bit-flip mutation is applied to generate the next generation. Elitism has been incorporated to keep track of the good chromosomes found so far. Elitism is performed by combining the parent and child populations and transferring the non-dominated solutions from the combined population to the next generation. The process of fitness computation, selection, crossover, and mutation is executed for a given number of generations, and the final generation produces a set of non-dominated solutions.

B. Obtaining Final Set of miRNAs From Non-Dominated Set

Like any multiobjective optimization technique, the NSGA-II-based multiobjective feature selection algorithm gives a set of non-dominated solutions, each of which encodes a possible feature (miRNA) subset. To select the most promising feature subset that provides good values for both specificity and sensitivity, the solution that provides the best F-measure value is selected. The F-measure is defined as [17]

$$F = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}. \qquad (9)$$

Here Recall is equivalent to sensitivity described in (6). Precision (also known as positive predictive value) is defined as follows [17]:

$$\text{Precision} = \frac{TP}{TP + FP}. \qquad (10)$$

A higher value of $F$ indicates a better balance between sensitivity and specificity and thus indicates better classification. The feature subset encoded in the solution providing the best F-measure is considered as the final set of miRNA markers. After selecting the miRNA markers from the training set, we classify the test samples based on the selected miRNA markers. The classification performance is demonstrated both statistically and visually. Finally, we discuss biological validation of the miRNA markers using a cancer-miRNA association network and pathway analysis of the targeted genes. The overall procedure is shown in Fig. 2.

Fig. 2. Overview of the complete procedure adopted.

V. DATASETS AND PREPROCESSING

We obtained a publicly available miRNA expression dataset from the following website: http://www.broad.mit.edu/cancer/pub/miGCM. The complete dataset consists of 217 mammalian miRNAs from different cancer types. From this, we have extracted six datasets consisting of the samples from breast, colon, kidney, lung, prostate, and uterus. Each dataset is described by all the 217 miRNAs [18]. The number of normal and tumor samples of each of the tissue types is given in Table I. Each sample vector of the datasets is normalized to have mean 0 and variance 1. The final dataset contains two classes, one representing all the normal samples (32 samples) and another representing all the tumor samples (57 samples).

TABLE I NUMBER OF NORMAL AND TUMOR SAMPLES PRESENT IN EACH TISSUE TYPE

For initial filtering of miRNAs, signal-to-noise ratio (SNR) is used. First, for each miRNA, the SNR [19] is computed. SNR is defined as

$$SNR = \frac{\mu_1 - \mu_2}{\sigma_1 + \sigma_2}, \qquad (11)$$

where $\mu_i$ and $\sigma_i$, $i \in \{1, 2\}$, denote the mean and standard deviation of class $i$ for the corresponding miRNA. Note that a larger absolute value of SNR for a miRNA indicates that the miRNA's expression level is high in one class and low in the other. Hence this bias is very useful in distinguishing the miRNAs that are expressed differently in the two classes of samples. After computing the SNR value of each miRNA with respect to the normal and cancerous classes, the miRNAs are sorted in descending order of the absolute SNR values. Then we choose from the top the miRNAs having absolute SNR values greater than or equal to the mean of all absolute SNR scores. This gives us 100 miRNAs, which are used further.

The dataset is then randomly divided into training and test sets with roughly equal distribution. However, while dividing into training and test sets, it is ensured that both the training and test sets contain at least one normal and one cancerous sample of each of the tissue types. After ensuring this, we got 40 training samples and 49 test samples. The feature selection algorithms are applied on the training set only. Finally, the test set is classified by the SVM trained on the selected miRNAs, and the classifier performance is reported. Note that the test set is completely disjoint from the training set.

Fig. 3. ROC curves for MOGA-selected features and SNR-selected features.

VI. RESULTS AND DISCUSSION

In this section, first the performance of the proposed technique in discovering the miRNA markers is reported. Thereafter, the biological relevance of the discovered miRNA markers is discussed. The proposed NSGA-II-based feature selection technique is executed with the following parameters: , , , . We tested the algorithm several times with crossover probabilities ranging from 0.85 to 0.95 and mutation probabilities ranging from 0.05 to 0.15, because it is customary to have a high crossover rate and a low mutation rate for genetic algorithms. We found that the performance of our algorithm is generally robust with respect to the choice of probabilities in these ranges. Therefore we fixed the crossover and mutation probabilities to 0.9 and 0.1, respectively, i.e., the mean values of the corresponding ranges. For evaluating the performance of the classifiers, we have reported the values of Sensitivity, Specificity, Classification accuracy, Area Under ROC (AUC) [20], and F-measure. Moreover, the statistical significance of the selected features is also reported in terms of p-values. For the purpose of illustration, the Receiver Operating Characteristic (ROC) curves are also shown.

A. Classifier Performance

The proposed multiobjective feature selection technique is executed on the preprocessed training dataset multiple times, and for each run the output set of features is collected. We kept taking the union of the feature sets over subsequent runs and found that after 10 runs the union of the feature sets does not change any more. Therefore we stop after the first 10 runs. Then, for the union feature set, the frequency of each feature, i.e., the number of times it appears in an output feature set over the 10 runs, is computed. As we wanted to select the miRNAs that appeared in the output set in at least 50% of the runs of the algorithm, we choose the miRNAs with at least 50% frequency. We found 5 such miRNAs, and these 5 miRNAs are selected as the final set of miRNA markers. The final value of the SVM parameter $C$ is taken as the average over the 10 runs, and it comes out as 846.92.

The selected miRNAs that are differentially expressed in normal and tumor samples irrespective of the tissue types are discovered using the proposed method. Table II reports the classifier performance. Moreover, we have reported the performance of the SVM classifier using the top five features selected by SNR score and by two statistical significance tests for differential expression, viz., the t-test [21] and the non-parametric Wilcoxon rank-sum test [22]. Since our method gives five miRNA markers, we have chosen the top five miRNAs given by each of the above filtering methods. Additionally, two SVM wrapper-based feature selection methods, viz., LASSO (Least Absolute Shrinkage and Selection Operator) [23], [24] and SCAD (Smoothly Clipped Absolute Deviation) [25], are also used for comparison. For LASSO and SCAD, we have used the implementation in the R package penalizedSVM [26].

TABLE II CLASSIFIER PERFORMANCE METRIC VALUES FOR DIFFERENT FEATURE SELECTION METHODS

The p-values of statistical significance of the selected features are also reported in the table. The p-value of a selected feature set is computed by a randomization test. Suppose $A$ is the classification accuracy obtained using the selected feature set $S$. We generate $N$ random feature subsets, each having the same length as that of $S$. If $A_i$, $i = 1, \ldots, N$, denotes the classification accuracy of the $i$th random feature subset, then the p-value of the randomization test for $S$ is defined as

$$p = \frac{N_b}{N}, \qquad (12)$$

where $N_b$ denotes the number of times a randomly selected feature subset provides better classification accuracy than that provided by the feature set selected using the proposed method, i.e., the number of indices $i$ with $A_i > A$. Here the value of $N$ is taken as 10 000. A lower p-value indicates a more statistically significant feature subset, and the chance that these features have been selected randomly goes down.

It is evident from the table that the features selected by the multiobjective technique provide better classifier performance measure values compared to those of the other algorithms. The best values in each column are marked in bold font. With respect to classification accuracy, sensitivity, and specificity, the multiobjective technique provides the maximum scores. It is to be noted that LASSO and SCAD select 95 and 11 miRNAs out of the 100 initial miRNAs, respectively. Hence it appears that LASSO fails to minimize the number of selected miRNAs. SCAD does a reasonable job in this regard, but the proposed multiobjective algorithm still outperforms it with respect to the different performance metrics. Moreover, the p-value of statistical significance is also better (lower) for the multiobjective technique than that of the SNR-selected features. This establishes that the proposed technique selects a statistically significant set of features providing high values of the classifier performance measures. This is also established from Fig. 3, where the ROC curves for the feature sets selected by different methods are shown. As is evident from the figure, the ROC curve corresponding to the MOGA-selected features is closer to the upper left corner of the plot, i.e., closer to the point of 0 false positive rate and 1 true positive rate. Also, as reported in Table II, the AUC value provided by the MOGA-selected features (0.9583) is better than that provided by the other methods. Hence it is evident that the MOGA-selected features provide better performance compared to the other techniques considered here.

For the purpose of illustration, the top 5 markers selected by the multiobjective feature selection technique are reported in Table III along with their expression level (Up or Down) in malignant cells. Moreover, the frequencies of selection of these markers out of 10 runs of the algorithm are also reported in the table. It can be noted that all the miRNA markers have been selected in most of the runs (60% to 100%). The miRNA hsa_mir-338 is found to be selected in all the runs of the algorithm. All the selected miRNA markers are found to be up-regulated in the normal samples and down-regulated in the cancerous samples. Further, the average fold-change for each marker is reported for both training and test samples.

TABLE III miRNA MARKERS SELECTED BY MULTIOBJECTIVE TECHNIQUE

Fig. 4 shows the heatmaps of the training and test datasets for the 5 miRNAs selected by the multiobjective feature selection technique. The hot colors (yellowish and reddish) represent higher expression levels and the cool colors (bluish and greenish) represent lower expression levels. It appears from the figure that for both training and test datasets, these miRNAs are differentially expressed in the normal and cancerous classes.

B. Biological Relevance

To study the biological relevance of the obtained miRNA markers, we first study the known cancer associations for the selected markers as found from a recently published human cancer-miRNA network [27]. We have also found the target mRNAs for each miRNA. The target mRNAs are obtained using the recently proposed TargetMiner tool [28].
Although there are many available tools for miRNA target prediction with very little overlap among their predicted targets, TargetMiner is chosen because it is based on an intelligent choice of biologically validated non-targets, and it has been shown to outperform other miRNA target prediction algorithms. Table IV reports the selected miRNA markers and the number of target mRNAs for each of them as retrieved by TargetMiner. The cancer types associated with each miRNA as retrieved from the human cancer-miRNA network are also reported. Notably, all the selected markers are found to be associated with some type of cancer. The maximum cancer association has been found for hsa_mir-195, which is connected with 5 different types of cancer; this miRNA also targets the maximum number of mRNAs (3175). Hence these results indicate that the identified miRNA markers are indeed associated with different types of cancer.

Fig. 4. Heatmaps of training and test datasets for the MOGA-selected 5 miRNAs. (a) Training data; (b) test data.

TABLE IV THE NUMBER OF TARGET mRNAs FOR EACH miRNA MARKER AS RETRIEVED BY TARGETMINER AND CANCER TYPES ASSOCIATED WITH EACH miRNA AS RETRIEVED FROM CANCER-miRNA NETWORK

To study how the selected miRNA markers are involved in various biological activities, we have studied the KEGG pathway enrichment of the target genes of each of the selected miRNAs using the Database for Annotation, Visualization and Integrated Discovery (DAVID), available at http://david.abcc.ncifcrf.gov/. Table V reports the results of this study. We have reported the top five significant pathways for the target genes and the corresponding p-values as obtained from DAVID. Notably, the KEGG pathway term “pathways in cancer” comes within the top five significant pathways for each of the five selected miRNA markers. This signifies that the selected miRNA markers are indeed involved in different cancer pathways. Moreover, some more specific cancer pathways are noticed within the top five significant pathways of the miRNA markers. For example, hsa_mir-338 has target genes that are involved in the pathway of Renal cell carcinoma (p-value: 9.0e-04). There are three specific cancer pathways within the top five significant pathways for hsa_mir-30b. These are Small cell lung cancer (p-value: 9.9e-05), Chronic myeloid leukemia (p-value: 2.1e-04), and Colorectal cancer (p-value: 5.0e-04). The pathway Colorectal cancer (p-value: 1.1e-07) is also found for hsa_mir-340. Moreover, hsa_mir-195 is found to have target genes involved in the Prostate cancer pathway (p-value: 4.4e-08). These results indicate that the selected miRNA markers are highly involved in different cancer pathways, and thus can be treated as potential miRNA markers.

TABLE V TOP FIVE SIGNIFICANT KEGG PATHWAYS AS DISCOVERED USING DAVID FOR THE TARGET GENES OF EACH OF THE SELECTED miRNA MARKERS

VII. CONCLUSIONS

In this article, a multiobjective GA-based feature selection algorithm wrapped with an SVM classifier has been developed for identification of miRNA markers from miRNA expression datasets. The technique optimizes different performance criteria simultaneously and evolves the desired subset of features (miRNAs). The SVM regularization parameter is also evolved along with the relevant feature subset. Results on real-life miRNA expression datasets of different tissue types, viz., breast, colon, kidney, lung, prostate, and uterus, have been demonstrated. Moreover, many of the identified miRNA markers are also found to be associated with different types of cancer according to the recent literature. Finally, a pathway enrichment study has been conducted, which reveals that the genes targeted by the selected miRNA markers are involved in several cancer pathways. As future work, the performance of other popular classifiers besides SVM is to be studied. Moreover, the identified miRNA markers need to be further investigated biologically.

REFERENCES

[1] D. P. Bartel, “MicroRNAs—Genomics, biogenesis, mechanism, and function,” Cell, vol. 116, no. 2, pp. 281–297, Jan. 2004.
[2] P. Tamayo, D. Slonim, J. Mesirov, Q. Zhu, S. Kitareewan, E. Dmitrovsky, E. Lander, and T. Golub, “Interpreting patterns of gene expression with self-organizing maps: Methods and application to hematopoietic differentiation,” Proc. Natl. Acad. Sci. USA, vol. 96, pp. 2907–2912, 1999.
[3] A. P. Gasch and M. B. Eisen, “Exploring the conditional coregulation of yeast gene expression through fuzzy k-means clustering,” Genome Biol., vol. 3, no. 11, pp. 0059.1–0059.22, 2002.
[4] U. Maulik, A. Mukhopadhyay, and S. Bandyopadhyay, “Combining pareto-optimal clusters using supervised learning for identifying co-expressed genes,” BMC Bioinformatics, vol. 10, no. 27, 2009.
[5] A. Mukhopadhyay, S. Bandyopadhyay, and U. Maulik, “Multi-class clustering of cancer subtypes through SVM based ensemble of pareto-optimal solutions for gene marker identification,” PLoS One, vol. 5, no. 11, p. e13803, 2010.
[6] U. Maulik, S. Bandyopadhyay, and A. Mukhopadhyay, Multiobjective Genetic Algorithms for Clustering—Applications in Data Mining and Bioinformatics. New York: Springer, 2011.
[7] J. M. Thomson, J. Parker, C. M. Perou, and S. M. Hammond, “A custom microarray platform for analysis of microRNA gene expression,” Nature Methods, vol. 1, no. 1, pp. 47–53, Oct. 2004.
[8] S. Bandyopadhyay and M. Bhattacharyya, “Analyzing miRNA co-expression networks to explore TF-miRNA regulation,” BMC Bioinformatics, vol. 10, 2009.
[9] K. Deb, A. Pratap, S. Agrawal, and T. Meyarivan, “A fast and elitist multiobjective genetic algorithm: NSGA-II,” IEEE Trans. Evol. Comput., vol. 6, pp. 182–197, 2002.
[10] V. Vapnik, Statistical Learning Theory. New York: Wiley, 1998.
[11] K. Crammer and Y. Singer, “On the algorithmic implementation of multiclass kernel-based vector machines,” J. Mach. Learn. Res., vol. 2, pp. 265–292, 2001.
[12] C. Coello Coello, “Evolutionary multiobjective optimization: A historical view of the field,” IEEE Comput. Intell. Mag., vol. 1, no. 1, pp. 28–36, 2006.
[13] D. E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning. New York: Addison-Wesley, 1989.
[14] E. Zitzler, M. Laumanns, and L. Thiele, “SPEA2: Improving the strength pareto evolutionary algorithm,” Zurich, Switzerland, Tech. Rep. 103, 2001, Gloriastrasse 35, CH-8092.
[15] J. D. Knowles and D. W. Corne, “The pareto archived evolution strategy: A new baseline algorithm for pareto multiobjective optimisation,” in Proc. IEEE Cong. Evol. Comput., Piscataway, NJ, USA, 1999, pp. 98–105, IEEE Press.
[16] K. Deb, S. Agrawal, A. Pratap, and T. Meyarivan, “A fast elitist nondominated sorting genetic algorithm for multi-objective optimization: NSGA-II,” in Proceedings of the Parallel Problem Solving from Nature VI Conference, Paris, France, 2000, vol. 1917, Lecture Notes in Computer Science, pp. 849–858.
[17] D. L. Olson and D. Delen, Advanced Data Mining Techniques, 1st ed. New York: Springer, 2008.
[18] J. Lu, G. Getz, E. A. Miska, E. Alvarez-Saavedra, J. Lamb, D. Peck, A. Sweet-Cordero, B. L. Ebert, R. H. Mak, A. A. Ferrando, J. R. Downing, T. Jacks, H. R. Horvitz, and T. R. Golub, “MicroRNA expression profiles classify human cancers,” Nature, vol. 435, no. 7043, pp. 834–838, Jun. 2005.
[19] T. R. Golub, D. K. Slonim, P. Tamayo, C. Huard, M. Gassenbeek, J. P. Mesirov, H. Coller, M. L. Loh, J. R. Downing, M. A. Caligiuri, D. D. Bloomfield, and E. S. Lander, “Molecular classification of cancer: Class discovery and class prediction by gene expression monitoring,” Science, vol. 286, pp. 531–537, 1999.
[20] A. P. Bradley, “The use of the area under the ROC curve in the evaluation of machine learning algorithms,” Pattern Recognit., vol. 30, no. 7, pp. 1145–1159, Jul. 1997.
[21] P. J. Bickel and K. A. Doksum, Mathematical Statistics: Basic Ideas and Selected Topics. San Francisco, CA, USA: Holden-Day, 1977.
[22] M. Hollander and D. A. Wolfe, Nonparametric Statistical Methods, 2nd ed. New York: Wiley, 1999.
[23] R. Tibshirani, “Regression shrinkage and selection via the lasso,” J. Royal Stat. Soc. B, vol. 58, pp. 267–288, 1996.
[24] P. Bradley and O. L. Mangasarian, “Feature selection via concave minimization and support vector machines,” in Mach. Learn. Proc. 15th Int. Conf. (ICML ’98), 1998, pp. 82–90.
[25] J. Fan and R. Li, “Variable selection via nonconcave penalized likelihood and its oracle properties,” J. Amer. Stat. Assoc., vol. 96, no. 456, pp. 1348–1360, 2001.
[26] N. Becker, W. Werft, G. Toedt, P. Lichter, and A. Benner, “penalizedSVM: A R-package for feature selection SVM classification,” Bioinformatics, vol. 25, no. 13, pp. 1711–1712, 2009.
[27] S. Bandyopadhyay, R. Mitra, U. Maulik, and M. Q. Zhang, “Development of the human cancer microRNA network,” BMC Silence, vol. 1, no. 6, Feb. 2010.
[28] S. Bandyopadhyay and R. Mitra, “TargetMiner: microRNA target prediction with systematic identification of tissue-specific negative examples,” Bioinformatics, vol. 25, no. 20, pp. 2625–2631, Oct. 2009.
