A novel filter wrapper hybrid greedy ensemble approach optimized using the genetic algorithm to reduce the dimensionality of high-dimensional biomedical datasets

Gangavarapu, T.; Patil, N.

Please use this identifier to cite or link to this item: http://idr.nitk.ac.in/jspui/handle/123456789/9670

Title:	A novel filter wrapper hybrid greedy ensemble approach optimized using the genetic algorithm to reduce the dimensionality of high-dimensional biomedical datasets
Authors:	Gangavarapu, T.; Patil, N.
Issue Date:	2019
Citation:	Applied Soft Computing Journal, 2019, Vol. 81
Abstract:	The predictive accuracy on high-dimensional biomedical datasets is often diminished by many irrelevant and redundant molecular disease diagnosis features. Dimensionality reduction aims at finding a feature subspace that preserves the predictive accuracy while eliminating noise and curtailing the high computational cost of training. The applicability of a particular feature selection technique relies heavily on its ability to match the problem structure and to capture the inherent patterns in the data. In this paper, we propose a novel filter wrapper hybrid ensemble feature selection approach, based on the weighted occurrence frequency and a penalty scheme, to obtain the most discriminative and instructive feature subspace. The proposed approach produces an optimal feature subspace by greedily combining the feature subspaces obtained from various predetermined base feature selection techniques. Furthermore, the base feature subspaces are penalized based on specific performance-dependent penalty parameters. We leverage effective heuristic search strategies, including greedy parameter-wise optimization and the Genetic Algorithm (GA), to optimize the subspace ensembling process. The effectiveness, robustness, and flexibility of the proposed hybrid greedy ensemble approach, in comparison with the base feature selection techniques and with prolific filter and state-of-the-art wrapper methods, are justified by empirical analysis on three distinct high-dimensional biomedical datasets. Experimental validation revealed that the proposed greedy approach, when optimized using GA, outperformed the selected base feature selection techniques by 4.17% to 15.14% in terms of prediction accuracy. © 2019 Elsevier B.V.
URI:	10.1016/j.asoc.2019.105538; http://idr.nitk.ac.in/jspui/handle/123456789/9670
Appears in Collections:	1. Journal Articles
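
Illustrative sketch (not part of the repository record): the abstract describes combining base feature subspaces through a weighted occurrence frequency and a penalty scheme, tuned by greedy parameter-wise optimization, but it does not give the formulas. The Python below is only a minimal sketch of that general idea under stated assumptions, not the authors' implementation. The function names, the weight formula `accuracy * (1 - penalty)`, the penalty grid, and the `evaluate` callback are all hypothetical, and the GA-based optimization step is omitted.

```python
from collections import defaultdict


def ensemble_subspace(base_subsets, accuracies, penalties, k):
    """Pick k features by penalized, accuracy-weighted occurrence frequency.

    base_subsets: dict mapping a selector name to its selected feature indices
    accuracies:   dict mapping a selector name to its validation accuracy
    penalties:    dict mapping a selector name to a penalty in [0, 1]
                  (assumed form; the abstract does not specify it)
    """
    score = defaultdict(float)
    for name, subset in base_subsets.items():
        # Assumed weighting: accuracy scaled down by the selector's penalty.
        weight = accuracies[name] * (1.0 - penalties[name])
        for feature in set(subset):
            score[feature] += weight
    # Highest score first; ties are broken by the smaller feature index.
    ranked = sorted(score, key=lambda f: (-score[f], f))
    return ranked[:k]


def greedy_penalties(base_subsets, accuracies, k, evaluate,
                     grid=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Tune one penalty at a time, keeping a value only if it improves the score.

    evaluate: caller-supplied function mapping a feature list to a score,
              for example cross-validated accuracy of a classifier.
    """
    penalties = {name: 0.0 for name in base_subsets}
    best = evaluate(ensemble_subspace(base_subsets, accuracies, penalties, k))
    for name in base_subsets:
        for value in grid:
            trial = {**penalties, name: value}
            result = evaluate(ensemble_subspace(base_subsets, accuracies, trial, k))
            if result > best:
                best, penalties = result, trial
    return penalties, best
```

In practice, `evaluate` would wrap the wrapper-style step, for instance training a classifier on the candidate features and returning its held-out accuracy; here it is left to the caller so the sketch stays self-contained.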

