Title: Unsupervised Feature Selection for Noisy Data
Publication Type: Conference Paper
Year of Publication: 2019
Authors: Mahdavi, K, Labarta, J, Giménez, J
Conference Name: Advanced Data Mining and Applications
Date Published: 11/2019
Publisher: Springer International Publishing
Conference Location: Cham
ISBN Number: 978-3-030-35231-8
Keywords: Feature selection, Independent Component Analysis, Noise separation, Oblique rotation
Abstract

Feature selection techniques are widely applied in a variety of data analysis tasks to reduce dimensionality. Depending on the type of learning, feature selection algorithms are categorized as supervised or unsupervised. In unsupervised learning scenarios, selecting features is a much harder problem because there are no class labels to guide the search for relevant features. The difficulty of selecting features is amplified when the data is corrupted by different kinds of noise. Almost all traditional unsupervised feature selection methods are not robust against noise in the samples. These approaches have no explicit mechanism for detaching and isolating the noise, so they cannot produce an optimal feature subset. In this article, we propose an unsupervised approach for feature selection on noisy data, called Robust Independent Feature Selection (RIFS). Specifically, we choose the feature subset that contains most of the underlying information, using the same criteria as Independent Component Analysis (ICA). Simultaneously, the noise is separated as an independent component. The isolation of representative noise samples is achieved using oblique factor rotation, whereas noise identification is performed using factor pattern loadings. Extensive experimental results over diverse real-life data sets show the efficiency and advantage of the proposed algorithm.

URL: https://doi.org/10.1007/978-3-030-35231-8_6
DOI: 10.1007/978-3-030-35231-8_6