Classification Method Performance in High Dimensions

Weihs, Claus; Kassner, Tobias

Files

Technical Report 2 2018.pdf (783.33 KB)

Date

2018-04-13

Authors

Weihs, Claus

Kassner, Tobias

Abstract

We discuss standard classification methods for high-dimensional data with a small number of observations. Using designed simulations that illustrate the practical relevance of theoretical results, we show that in the 2-class case the following rules of thumb should be followed in such a situation to avoid the worst error rate, namely the probability π1 of the smaller class. Avoid "complicated" classifiers: the independence rule (ir) might be adequate, and the support vector machine (svm) should only be considered as an expensive alternative, which is additionally sensitive to noise factors. From the outset, look for stochastically independent dimensions and balanced classes. Only take into account features that influence class separation sufficiently. Variable selection might help, though filters might be too rough. Compare your result with the result of the data-independent rule "Always predict the larger class".
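The last rule of thumb can be illustrated with a small simulation. The following is a minimal sketch, not the simulation design of the report: it assumes independent standard normal features, reads the independence rule as a linear discriminant rule that ignores all covariances between features (pooled per-feature variances), and uses hypothetical settings (40 training observations, 200 features of which 10 are informative, π1 = 0.3, mean shift 1.0). All function names are illustrative.

```python
import math
import random


def simulate(n, p, pi1, delta, n_informative, rng):
    """Draw n observations with p independent N(0, 1) features.

    The smaller class (label 1, share pi1) has its mean shifted by
    delta in the first n_informative features; all others are noise.
    Class sizes are fixed rather than random (an assumption of this sketch).
    """
    n1 = round(pi1 * n)
    y = [1] * n1 + [0] * (n - n1)
    X = [[rng.gauss(0.0, 1.0) + (delta if label == 1 and j < n_informative else 0.0)
          for j in range(p)]
         for label in y]
    return X, y


def fit_independence_rule(X, y):
    """Estimate class means, pooled per-feature variances, and class priors.

    Covariances between features are ignored, i.e. treated as zero.
    """
    p, n = len(X[0]), len(y)
    means = {}
    for k in (0, 1):
        rows = [x for x, label in zip(X, y) if label == k]
        means[k] = [sum(r[j] for r in rows) / len(rows) for j in range(p)]
    var = [sum((x[j] - means[label][j]) ** 2 for x, label in zip(X, y)) / (n - 2)
           for j in range(p)]
    priors = {k: y.count(k) / n for k in (0, 1)}
    return means, var, priors


def predict_independence_rule(model, X):
    """Assign each row to the class with the larger discriminant score."""
    means, var, priors = model
    preds = []
    for x in X:
        score = {
            k: math.log(priors[k])
            - 0.5 * sum((xj - m) ** 2 / v for xj, m, v in zip(x, means[k], var))
            for k in (0, 1)
        }
        preds.append(max(score, key=score.get))
    return preds


def error_rate(y_true, y_pred):
    """Share of misclassified observations."""
    return sum(t != q for t, q in zip(y_true, y_pred)) / len(y_true)


def majority_class(y):
    """Label of the larger class in the training data."""
    return max((0, 1), key=y.count)


# Hypothetical settings: few observations, many mostly uninformative features.
rng = random.Random(0)
X_train, y_train = simulate(40, 200, 0.3, 1.0, 10, rng)
X_test, y_test = simulate(500, 200, 0.3, 1.0, 10, rng)

model = fit_independence_rule(X_train, y_train)
ir_error = error_rate(y_test, predict_independence_rule(model, X_test))
baseline_error = error_rate(y_test, [majority_class(y_train)] * len(y_test))
print(f"independence rule: {ir_error:.3f}, always larger class: {baseline_error:.3f}")
```

Because the class sizes are fixed, the baseline error equals the share π1 of the smaller class in the test data. A classifier whose test error is not clearly below that value has not gained anything from the features.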

Keywords

Classification, High Dimensions, Performance

URI

http://hdl.handle.net/2003/36834
http://dx.doi.org/10.17877/DE290R-18835

Collections

Research Reports
