Multiobjective Evolutionary Algorithms applied to Feature Selection in Microarrays Cancer Data
DOI:
https://doi.org/10.31908/19098367.2014
Keywords:
Cancer Microarrays, Feature Selection, Gene Expression, Multiobjective Evolutionary Algorithms
Abstract
Microarray analysis of gene expression is an active research topic in the diagnosis and classification of human cancer. A gene expression microarray data set contains thousands of features, most of which are irrelevant for classifying gene expression patterns. Choosing a minimal subset of features for classification is a difficult task. This work compares two multi-objective evolutionary algorithms on gene expression data sets widely used in the literature (lymphoma, leukemia and colon). A pre-processing stage first removes strongly correlated features. An extensive and detailed analysis of the results obtained with the selected multi-objective algorithms is presented.
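The two ideas summarized in the abstract, a correlation-based pre-filter and a multi-objective comparison of candidate feature subsets, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the greedy filtering order, the 0.9 correlation threshold, and the choice of objectives (subset size and classification error, both minimized) are all assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # constant feature: treated as uncorrelated (assumption)
    return cov / sqrt(vx * vy)


def drop_correlated(columns, threshold=0.9):
    """Greedy pre-filter (hypothetical): keep a feature only if its absolute
    correlation with every feature kept so far is below `threshold`.
    `columns` is a list of feature columns; returns the kept column indices."""
    kept = []
    for j, col in enumerate(columns):
        if all(abs(pearson(col, columns[k])) < threshold for k in kept):
            kept.append(j)
    return kept


def dominates(a, b):
    """True if objective vector `a` Pareto-dominates `b` (all minimized)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points):
    """Return the non-dominated points, preserving input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Toy data: feature 1 is almost a multiple of feature 0, feature 2 is not.
features = [[1, 2, 3, 4], [2, 4, 6, 8.1], [4, 1, 3, 2]]
kept = drop_correlated(features)

# Candidate subsets scored as (number of features, classification error).
candidates = [(5, 0.10), (3, 0.12), (8, 0.10), (3, 0.20), (10, 0.05)]
front = pareto_front(candidates)
```

In this toy run the pre-filter keeps features 0 and 2. The non-dominated candidates are (5, 0.10), (3, 0.12) and (10, 0.05), so no single subset wins on both objectives, which is the situation a multi-objective algorithm is designed to handle.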