Authors: Munteanu, Alexander
Wornowizki, Max
Title: Demixing empirical distribution functions
Language (ISO): en
Abstract: We consider the two-sample homogeneity problem where the information contained in two samples is used to test the equality of the underlying distributions. For instance, in cases where one sample stems from a simulation procedure modelling the data generating process of the other sample consisting of observed data, a mere rejection of the null hypothesis is unsatisfactory. Instead, the data analyst would like to know how the simulation can be improved while changing it as little as possible. Based on the popular Kolmogorov-Smirnov test and a general nonparametric mixture model, we propose an algorithm which determines an appropriate correction distribution function describing how the simulation procedure can be corrected. It is constructed in such a way that complementing the simulation sample by a given proportion of observations sampled from the correction distribution does not lead to a rejection of the null hypothesis of equal distributions when the modified and the observed sample are compared. We prove our algorithm to run in linear time and evaluate it on simulated and real spectrometry data showing that it leads to intuitive results. We illustrate its practical performance considering runtime as well as accuracy in a real world scenario.
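The acceptance criterion described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' algorithm: it does not compute a correction distribution, it only checks whether a given candidate correction sample, pooled with the simulation sample, passes a two-sample Kolmogorov-Smirnov test against the observed data. The function names, the toy data, the asymptotic critical value, and the significance level 0.05 are all assumptions made for illustration.

```python
import bisect
import math


def ecdf(sample):
    """Return the empirical distribution function of a sample."""
    xs = sorted(sample)
    n = len(xs)
    return lambda t: bisect.bisect_right(xs, t) / n


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two ECDFs.

    The supremum is attained at a sample point, so it suffices to
    evaluate both ECDFs at every observation of either sample.
    """
    fa, fb = ecdf(a), ecdf(b)
    return max(abs(fa(t) - fb(t)) for t in set(a) | set(b))


def ks_rejects(a, b, alpha=0.05):
    """Asymptotic two-sample KS test (assumed here for simplicity)."""
    n, m = len(a), len(b)
    critical = math.sqrt(-0.5 * math.log(alpha / 2)) * math.sqrt((n + m) / (n * m))
    return ks_statistic(a, b) > critical


# Hypothetical toy data: the observed sample has a second mode on [2, 3)
# that the simulation does not reproduce.
observed = [i / 100 for i in range(100)] + [2 + i / 100 for i in range(100)]
simulated = [i / 200 for i in range(200)]

# Hypothetical candidate correction sample covering the missing mode.
correction = [2 + i / 200 for i in range(200)]
modified = simulated + correction  # simulation complemented by the correction

print("simulation alone rejected:  ", ks_rejects(simulated, observed))
print("corrected simulation rejected:", ks_rejects(modified, observed))
```

In this toy setting the candidate correction is chosen by hand; the paper's contribution is an algorithm that determines such a correction distribution function automatically and in linear time.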
URI: http://hdl.handle.net/2003/37171
http://dx.doi.org/10.17877/DE290R-19167
Issue Date: 2014-02
Appears in Collections: Sonderforschungsbereich (SFB) 876

Files in This Item:
File: munteanu_wornowizki_2014a.pdf
Description: DNB
Size: 607.76 kB
Format: Adobe PDF


This item is protected by original copyright