Distribution-free analysis of homogeneity

Date

2015

Abstract

In this dissertation, three problems strongly connected to the topic of homogeneity are considered. For each of them a distribution-free approach is investigated using simulated as well as real data. The first procedure proposed is motivated by the fact that a mere rejection of homogeneity is unsatisfactory in many applications, because it is often not clear which discrepancies of the samples cause the rejection. To capture the dissimilarities, our method combines a fairly general mixture model with the classical nonparametric two-sample Kolmogorov-Smirnov test. In case of a rejection by this test, the proposed algorithm quantifies the discrepancies between the corresponding samples. These dissimilarities are represented by the so-called shrinkage factor and the correction distribution. The former measures the degree of discrepancy between the two samples. The latter contains information with regard to the over- and undersampled regions when comparing one sample to the other in the Kolmogorov-Smirnov sense. We prove the correctness of the algorithm as well as its linear running time when applied to sorted samples. As illustrated in various data settings, the fast method leads to adequate and intuitive results. The second topic investigated is a new class of two-sample homogeneity tests based on the concept of f-divergences. These distance-like measures for pairs of distributions are defined via the corresponding probability density functions. Thus, homogeneity tests relying on f-divergences are not limited to discrepancies in location or scale, but can detect arbitrary types of alternatives. We propose a distribution-free estimation procedure for this class of measures based on kernel density estimation and spline smoothing. As shown in extensive simulations, the new method performs stably and quite well in comparison to several existing non- and semiparametric divergence estimators.
Furthermore, we construct distribution-free two-sample homogeneity tests relying on various divergence estimators using the permutation principle. The tests are compared to an asymptotic divergence procedure as well as to several traditional parametric and nonparametric tests on data from different distributions under the null hypothesis and several alternatives. The results suggest that divergence-based methods have considerably higher power than traditional methods if the distributions do not predominantly differ in location. Therefore, it is advisable to use such tests if changes in scale, skewness, kurtosis or the distribution type are possible while the means of the samples are of comparable size. The methods are thus of great value in many applications, as illustrated on ion mobility spectrometry data. The last topic we deal with is the detection of structural breaks in time series. The method introduced is motivated by characteristic functions and Fourier-type transforms. It is highly flexible in several ways: firstly, it allows testing for the constancy of an arbitrary feature of a time series such as location, scale or skewness. It is thus applicable in various problems. Secondly, the method makes use of arbitrary estimators of the feature under investigation. Hence, a robustification of the approach or other modifications are straightforward. We demonstrate the testing procedure focussing on volatility as well as on kurtosis. In both cases, our approach leads to reasonable rejection rates for symmetric distributions in comparison to several tests derived from the literature. In particular, the test performs especially well in the presence of multiple structural breaks, because its test statistic is constructed in a blockwise manner. The position and number of the presumed change points located by the new procedure also correspond to the true ones quite well. The method is thus well suited for many applications, as illustrated on exchange rate data.
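
The idea of a divergence-based permutation test described in the abstract can be illustrated with a minimal sketch. It is not the estimator developed in the dissertation, which combines kernel density estimation with spline smoothing and whose details are not given here. The sketch assumes a plain Gaussian kernel density estimate with a normal-reference bandwidth and uses the Hellinger-type f-divergence with f(t) = (sqrt(t) - 1)^2, i.e. the integral of (sqrt(p) - sqrt(q))^2. All function names and parameter values are hypothetical choices made for illustration.

```python
import math
import random


def rule_of_thumb_bandwidth(sample):
    """Normal-reference bandwidth 1.06 * sd * n^(-1/5) (an assumed choice)."""
    n = len(sample)
    mean = sum(sample) / n
    sd = math.sqrt(sum((s - mean) ** 2 for s in sample) / (n - 1))
    return max(1.06 * sd * n ** (-0.2), 1e-12)


def kde(sample):
    """Return a Gaussian kernel density estimate and its bandwidth."""
    h = rule_of_thumb_bandwidth(sample)
    norm = 1.0 / (len(sample) * h * math.sqrt(2.0 * math.pi))

    def density(t):
        return norm * sum(math.exp(-0.5 * ((t - s) / h) ** 2) for s in sample)

    return density, h


def hellinger_divergence(x, y, grid_size=200):
    """Estimate the integral of (sqrt(p) - sqrt(q))^2 with the trapezoidal rule."""
    p, hx = kde(x)
    q, hy = kde(y)
    pad = 3.0 * max(hx, hy)
    lo = min(min(x), min(y)) - pad
    hi = max(max(x), max(y)) + pad
    step = (hi - lo) / (grid_size - 1)
    total = 0.0
    for i in range(grid_size):
        t = lo + i * step
        weight = 0.5 if i in (0, grid_size - 1) else 1.0
        total += weight * (math.sqrt(p(t)) - math.sqrt(q(t))) ** 2
    return total * step


def permutation_pvalue(x, y, statistic, n_perm=99, seed=0):
    """Permutation p-value: share of relabelled samples with statistic >= observed."""
    rng = random.Random(seed)
    observed = statistic(x, y)
    pooled = list(x) + list(y)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if statistic(pooled[:len(x)], pooled[len(x):]) >= observed:
            exceed += 1
    # The +1 terms count the observed labelling as one of the permutations.
    return (exceed + 1) / (n_perm + 1)


# Example: equal means, different scales (simulated data, for illustration only).
gen = random.Random(42)
x = [gen.gauss(0.0, 1.0) for _ in range(30)]
y = [gen.gauss(0.0, 3.0) for _ in range(30)]
print("estimated divergence:", hellinger_divergence(x, y))
print("permutation p-value:", permutation_pvalue(x, y, hellinger_divergence))
```

Because the p-value is obtained by relabelling the pooled observations, it does not rely on a parametric model for either sample. The same `permutation_pvalue` helper accepts any other two-sample statistic in place of `hellinger_divergence`.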

Keywords

Nonparametric, Correct simulation, Divergence measure, Testing constant variance
