Feature Selection for High-Dimensional Data with RapidMiner
Date
2011-01
Abstract
Feature selection is an important task in machine learning: it reduces the dimensionality of a learning problem by selecting a few relevant features without losing too much information. By focusing on smaller feature sets, we can learn simpler models that are easier to understand and to apply. Simpler models also tend to be more robust to input noise and outliers, and often achieve better prediction performance than models trained in higher dimensions with all features. We implement several feature selection algorithms in an extension of RapidMiner; these scale better with the number of features than the existing feature selection operators in RapidMiner.