Data Mining on Ice
Date
2012-02-21
Abstract
In an atmospheric neutrino analysis for IceCube’s 59-string configuration,
the impact of detailed feature selection on the performance of machine learning
algorithms has been investigated. Feature selection is guided by the principle of
maximum relevance and minimum redundancy. A Random Forest was studied as an
example of a more complex learner. Benchmarks were obtained using the simpler
learners k-NN and Naive Bayes. Furthermore, a Random Forest was trained and
tested in a 5-fold cross validation using 3.5 × 10⁴ simulated signal and 3.5 × 10⁴
simulated background events.
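
The maximum relevance, minimum redundancy principle mentioned above can be illustrated with a greedy selection loop. The abstract does not state how relevance and redundancy were measured, so the following is only a minimal sketch that assumes absolute Pearson correlation for both. The function names (pearson, mrmr_select, make_toy_data), the feature names, and the toy data are hypothetical and are not taken from the analysis.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def mrmr_select(columns, labels, k):
    """Greedily pick k features, each time maximising relevance minus redundancy.

    Assumption: relevance is |corr(feature, label)| and redundancy is the mean
    |corr(feature, already selected feature)|.
    """
    relevance = {name: abs(pearson(col, labels)) for name, col in columns.items()}
    selected = []
    remaining = set(columns)
    while remaining and len(selected) < k:
        def score(name):
            if not selected:
                return relevance[name]
            redundancy = sum(
                abs(pearson(columns[name], columns[s])) for s in selected
            ) / len(selected)
            return relevance[name] - redundancy

        # sorted() makes tie-breaking deterministic
        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


def make_toy_data(n=5000, seed=0):
    """Hypothetical toy sample: label 1 = signal, label 0 = background."""
    rng = random.Random(seed)
    labels = [rng.randint(0, 1) for _ in range(n)]
    a = [y + rng.gauss(0, 0.5) for y in labels]         # informative
    a_noisy_copy = [v + rng.gauss(0, 0.2) for v in a]   # informative but redundant with a
    b = [y + rng.gauss(0, 0.7) for y in labels]         # informative, less redundant
    noise = [rng.gauss(0, 1.0) for _ in range(n)]       # uninformative
    columns = {"a": a, "a_noisy_copy": a_noisy_copy, "b": b, "noise": noise}
    return columns, labels


columns, labels = make_toy_data()
print(mrmr_select(columns, labels, k=2))
```

In this toy setting the redundancy penalty is what keeps the near-duplicate feature from being picked right after its original, even though its relevance alone is higher than that of the remaining features.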