PhD defense of doctoral candidate Helen Costa Lima, on 03/09 (dd/mm) at 09:00.
Title: Hybrid feature selection approaches using metaheuristics for hierarchical classification
Link: meet.google.com/cdv-cojr-ehh
Examining Committee:
Members: Prof. Dr. Fernando Esteban Barril Otero (Univ. of Kent); Prof. Dr. Ricardo Cerri (UFSCAR); Prof. Dr. Túlio Ângelo Machado Toffolo (UFOP); Prof. Dr. Eduardo José da Silva Luz (UFOP); Prof. Dr. Luiz Henrique de Campos Merschmann (UFLA, co-advisor); Prof. Dr. Marcone Jamilson Freitas Souza (UFOP, advisor)
Alternates: Prof. Dr. Alexandre Plastino de Carvalho (UFF); Prof. Dr. Puca Huachi Vaz Penna (UFOP)
Abstract:
Feature selection is a widespread preprocessing step in the data mining field. One of its purposes is to reduce the number of features in the original dataset in order to improve a predictive model's performance. However, despite the benefits of feature selection for classification tasks, to the best of our knowledge few studies in the literature address feature selection in the hierarchical classification context.
This work proposes two main supervised hybrid feature selection approaches, each combining a filter step and a wrapper step, in which a global-model hierarchical classifier evaluates feature subsets. The first uses the General Variable Neighborhood Search metaheuristic and a feature ranking constructed with the Hierarchical Symmetrical Uncertainty measure. The second extends the Correlation-based Feature Selection measure to hierarchical classification and uses a Best First Search algorithm to explore the feature subset space. We ran computational experiments on twelve datasets from the protein and image domains to validate the effect of the proposed algorithms on classification performance when using two global hierarchical classifiers proposed in the literature. Statistical tests showed that using our methods for feature selection led to predictive performance that is consistently better than or equivalent to that obtained using all features, with the added benefit of reducing the number of features needed, which justifies their use in the hierarchical classification scenario.
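To make the filter measures named in the abstract more concrete, the following is a minimal sketch of the standard (flat) Symmetrical Uncertainty and the standard Correlation-based Feature Selection merit for discrete features. It is not the thesis's method: the hierarchical variants of both measures, the General Variable Neighborhood Search, the Best First Search, and the wrapper step are not specified in the abstract and are not reproduced here. All function names (`entropy`, `symmetrical_uncertainty`, `rank_features`, `cfs_merit`) and the toy data are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def entropy(values):
    """Shannon entropy, in bits, of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetrical_uncertainty(x, y):
    """Flat SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)); assumed stand-in for the hierarchical measure."""
    h_x, h_y = entropy(x), entropy(y)
    if h_x + h_y == 0:
        return 0.0  # both variables are constant, so there is nothing to share
    mutual_info = h_x + h_y - entropy(list(zip(x, y)))
    return 2.0 * mutual_info / (h_x + h_y)


def rank_features(columns, labels):
    """Return feature names sorted by decreasing SU with the class labels."""
    scores = {name: symmetrical_uncertainty(col, labels) for name, col in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)


def cfs_merit(subset, columns, labels):
    """Standard CFS merit: k * r_cf / sqrt(k + k * (k - 1) * r_ff), with SU as the correlation."""
    k = len(subset)
    if k == 0:
        return 0.0
    # Mean feature-class correlation.
    r_cf = sum(symmetrical_uncertainty(columns[f], labels) for f in subset) / k
    # Mean feature-feature correlation over all pairs (zero for a single feature).
    pairs = list(combinations(subset, 2))
    r_ff = (
        sum(symmetrical_uncertainty(columns[a], columns[b]) for a, b in pairs) / len(pairs)
        if pairs
        else 0.0
    )
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


# Toy data (hypothetical): f1 copies the class, f2 is independent of it.
labels = [0, 0, 1, 1]
columns = {"f1": [0, 0, 1, 1], "f2": [0, 1, 0, 1]}
ranking = rank_features(columns, labels)
```

On this toy data the ranking places `f1` ahead of `f2`, and the merit of `["f1"]` alone exceeds that of `["f1", "f2"]`, which illustrates why a subset search guided by such a measure can discard uninformative features. In a hybrid approach like the one described above, a ranking or merit of this kind would only narrow the candidates before a classifier evaluates subsets.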