RGIFE: a ranked guided iterative feature elimination heuristic for biomarkers identification

by Nicola Lazzarini

16:00 (40 min) in Daysh G.07

Current -omics technologies can measure the state of a biological sample in a wide variety of ways. Given the high dimensionality that typically characterises these data, relevant knowledge is often hidden and hard to identify. Machine learning methods, and feature selection algorithms in particular, have proven very effective at identifying small but relevant subsets of variables in a variety of application domains, including -omics data. Many methods exist, with varying trade-offs between the size of the identified variable subsets and their predictive power. We present a heuristic for biomarker identification called RGIFE: rank-guided iterative feature elimination. RGIFE is guided in its biomarker identification process by the information extracted from the machine learning models, and it incorporates several mechanisms to ensure that it creates minimal and highly predictive feature sets. The presentation will analyse the performance of RGIFE from both a computational and a biological point of view. In addition, a case study based on knee-OA data will be presented.
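
To make the general idea concrete, the following is a minimal, generic sketch of a rank-guided iterative elimination loop. It is not the RGIFE implementation, and nothing in it is taken from the talk. The nearest-centroid classifier, the centroid-gap ranking, the leave-one-out evaluation, the block-halving rule, the synthetic two-class data and all names (`rank_guided_elimination`, `block_fraction`, and so on) are assumptions chosen only to keep the example self-contained.

```python
import random

def centroids(X, y, features):
    """Per-class mean of each selected feature (a toy nearest-centroid model)."""
    cents = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        cents[label] = {f: sum(r[f] for r in rows) / len(rows) for f in features}
    return cents

def predict(cents, x, features):
    """Assign x to the class whose centroid is closest (squared distance)."""
    return min(cents, key=lambda c: sum((x[f] - cents[c][f]) ** 2 for f in features))

def loo_accuracy(X, y, features):
    """Leave-one-out accuracy of the toy model on the selected features."""
    hits = 0
    for i in range(len(X)):
        cents = centroids(X[:i] + X[i + 1:], y[:i] + y[i + 1:], features)
        hits += predict(cents, X[i], features) == y[i]
    return hits / len(X)

def rank_features(X, y, features):
    """Rank features, most important first, by the gap between the two class centroids.

    Assumes exactly two classes (an assumption of this sketch).
    """
    cents = centroids(X, y, features)
    a, b = sorted(cents)
    return sorted(features, key=lambda f: abs(cents[a][f] - cents[b][f]), reverse=True)

def rank_guided_elimination(X, y, block_fraction=0.25):
    """Repeatedly drop a block of features as long as accuracy does not degrade."""
    features = list(range(len(X[0])))
    best_acc = loo_accuracy(X, y, features)
    while len(features) > 1:
        ranked = rank_features(X, y, features)
        block = max(1, int(len(features) * block_fraction))
        removed = False
        # Try the lowest-ranked block first, then move up the ranking.
        for end in range(len(ranked), block - 1, -block):
            candidate = ranked[:end - block] + ranked[end:]
            if not candidate:
                continue
            acc = loo_accuracy(X, y, candidate)
            if acc >= best_acc:  # keep the removal only if performance holds
                features, best_acc, removed = candidate, acc, True
                break
        if not removed:
            if block == 1:
                break  # no single feature can be removed: stop
            block_fraction /= 2  # otherwise retry with a smaller block
    return sorted(features), best_acc

# Synthetic data (hypothetical): feature 0 carries the class signal, the rest is noise.
rng = random.Random(0)
y = [0] * 10 + [1] * 10
X = [[label * 4 + rng.gauss(0, 1)] + [rng.gauss(0, 1) for _ in range(7)] for label in y]

selected, acc = rank_guided_elimination(X, y)
print("selected features:", selected, "LOO accuracy:", acc)
```

In this sketch a removal is accepted only when the leave-one-out accuracy does not drop, so the returned subset is never less accurate on that estimate than the full feature set. A practical method would also need to guard against overfitting the evaluation itself, for example with nested cross-validation.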