Author: Joachims, Thorsten
Title: Estimating the generalization performance of a SVM efficiently
Dates: 2004-12-06; 2004-12-06; 1999; 2000-01-12
ISSN: 0943-4135
URI: http://hdl.handle.net/2003/2601
DOI: 10.17877/DE290R-5102
Abstract: This paper proposes and analyzes an approach to estimating the generalization performance of a support vector machine (SVM) for text classification. Because they require no computation-intensive resampling, the new estimators are computationally much more efficient than cross-validation or bootstrap: they can be computed immediately from the form of the hypothesis returned by the SVM. Moreover, the estimators developed here address the special performance measures needed for text classification. While they can be used to estimate the error rate, one can also estimate recall, precision, and F1. A theoretical analysis and experiments on three text classification collections show that the new method can effectively and efficiently estimate the performance of SVM text classifiers.
Note: The paper is written in English.
Language: en
Publisher: Universität Dortmund
Series: Forschungsberichte des Lehrstuhls VIII, Fachbereich Informatik der Universität Dortmund ; 25
Classification: 004
Type: report