Feature Selection for High-Dimensional Data with RapidMiner
Date
2012-02-28
Abstract
Feature selection is an important task in machine learning: it reduces the
dimensionality of a learning problem by selecting a few relevant features without
losing too much information. By focusing on smaller sets of features, we can learn
simpler models that are easier to understand and to apply. Simpler models are also
more robust to input noise and outliers, and they often achieve better prediction
performance than models trained in higher dimensions with all features. We implement
several feature selection algorithms in an extension of RapidMiner; these algorithms
scale well with the number of features compared to the existing feature selection
operators in RapidMiner.