|Title:||Demixing empirical distribution functions|
|Abstract:||We consider the two-sample homogeneity problem where the information contained in two samples is used to test the equality of the underlying distributions. For instance, in cases where one sample stems from a simulation procedure modelling the data generating process of the other sample consisting of observed data, a mere rejection of the null hypothesis is unsatisfactory. Instead, the data analyst would like to know how the simulation can be improved while changing it as little as possible. Based on the popular Kolmogorov-Smirnov test and a general nonparametric mixture model, we propose an algorithm which determines an appropriate correction distribution function describing how the simulation procedure can be corrected. It is constructed in such a way that complementing the simulation sample by a given proportion of observations sampled from the correction distribution does not lead to a rejection of the null hypothesis of equal distributions when the modified and the observed sample are compared. We prove our algorithm to run in linear time and evaluate it on simulated and real spectrometry data showing that it leads to intuitive results. We illustrate its practical performance considering runtime as well as accuracy in a real world scenario.|
|Appears in Collections:||Sonderforschungsbereich (SFB) 876|
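The acceptance criterion described in the abstract can be illustrated with a minimal sketch. This is not the algorithm from the paper: it only shows the two-sample Kolmogorov-Smirnov check applied before and after a simulated sample is complemented by correction observations. The function names (`ecdf`, `ks_statistic`, `ks_critical_value`, `ks_rejects`, `complement`) are hypothetical, and the critical value uses the standard large-sample approximation, which is an assumption about how the test would be calibrated.

```python
import bisect
import math


def ecdf(sample):
    """Return the empirical distribution function of `sample` as a callable."""
    xs = sorted(sample)
    n = len(xs)
    # Fraction of observations less than or equal to t.
    return lambda t: bisect.bisect_right(xs, t) / n


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: sup_t |F_a(t) - F_b(t)|."""
    fa, fb = ecdf(a), ecdf(b)
    # Both functions are right-continuous step functions, so the supremum
    # of their absolute difference is attained at one of the data points.
    return max(abs(fa(t) - fb(t)) for t in list(a) + list(b))


def ks_critical_value(n, m, alpha=0.05):
    """Large-sample critical value of the two-sample KS test at level alpha."""
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2))
    return c_alpha * math.sqrt((n + m) / (n * m))


def ks_rejects(a, b, alpha=0.05):
    """True if the null hypothesis of equal distributions is rejected."""
    return ks_statistic(a, b) > ks_critical_value(len(a), len(b), alpha)


def complement(simulated, correction):
    """Hypothetical helper: add correction observations to the simulation.

    The implied mixture proportion of the correction component is
    len(correction) / (len(simulated) + len(correction)).
    """
    return list(simulated) + list(correction)
```

Under these assumptions, a candidate set of correction observations would be considered adequate when `ks_rejects(complement(simulated, correction), observed)` is `False`. How the correction distribution is actually determined, and in linear time, is the subject of the paper and is not reproduced here.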
Files in This Item:
|munteanu_wornowizki_2014a.pdf||DNB||607.76 kB||Adobe PDF|
Items in Eldorado are protected by copyright, with all rights reserved, unless otherwise indicated.