Inference for multivariate and high-dimensional data in heterogeneous designs

Date

2021

Abstract

In this cumulative thesis, we develop statistical tests for a range of hypotheses about multivariate and high-dimensional data. Quadratic forms are a suitable way to obtain scalar test statistics for multivariate problems. The most common are Wald-type statistics (WTS) and ANOVA-type statistics (ATS), as well as centered and standardized versions of them. [Pauly et al., 2015] and [Chen and Qin, 2010] also used such quadratic forms to analyze hypotheses regarding the expectation vector of high-dimensional observations. Their assumptions differed, but both allowed only one or two groups, respectively. We extend the approach of [Pauly et al., 2015] to multiple groups, which leads to a multitude of possible asymptotic frameworks, including ones in which the number of groups grows. In the considered split-plot design with normally distributed data, we investigate the asymptotic distribution of the standardized centered quadratic form under different conditions. In most cases, we show that the respective limit distribution is obtained only under these specific conditions. For the frequently assumed case of equal covariance matrices, we also widen the considered asymptotic frameworks, since the sample sizes of the individual groups do not necessarily have to grow. Moreover, we add other cases in which the limit distribution can be calculated. These hold under homoscedasticity of the covariance matrices but also in the general case. This expansion of the asymptotic frameworks is one example of how the assumption of homoscedastic covariance matrices allows broader conclusions. Assuming equal covariance matrices also simplifies calculations and enables us to use a larger statistical toolbox.

For the more general issue of testing hypotheses regarding covariance matrices, existing procedures have strict assumptions (e.g. in [Muirhead, 1982], [Anderson, 1984] and [Gupta and Xu, 2006]), test only special hypotheses (e.g. in [Box, 1953]), or are known to have low power (e.g. in [Zhang and Boos, 1993]). We introduce an intuitive approach with fewer restrictions, a multitude of possible null hypotheses, and a convincing small-sample approximation. Nearly every quadratic form known from mean-based analysis can be used with it, and two bootstrap approaches are applied to improve their performance. Furthermore, the approach can be extended to many other situations, such as testing hypotheses about correlation matrices or checking whether the covariance matrix has a particular structure. In extensive simulation studies, we investigate the type I error of all developed tests and their power to detect deviations from the null hypothesis, for sample sizes ranging from small to large.
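As a rough illustration of how a quadratic form turns a multivariate hypothesis into a scalar statistic, the following Python sketch computes textbook versions of the WTS and ATS for a single group and a mean-based hypothesis H mu = 0. It is a minimal sketch under stated assumptions, not the thesis's procedure: the function names, the one-group setting, the simulated data, and the choice of the centering matrix as hypothesis matrix are all assumptions made here for illustration, and the centered and standardized high-dimensional versions discussed in the abstract are not shown.

```python
import numpy as np


def projection_matrix(H):
    """Projection T = H^T (H H^T)^+ H, which describes the same null hypothesis as H."""
    return H.T @ np.linalg.pinv(H @ H.T) @ H


def quadratic_form_statistics(X, H):
    """Return (WTS, ATS) for the null hypothesis H mu = 0.

    X : (n, d) array of n independent d-dimensional observations (one group).
    H : (q, d) hypothesis matrix (illustrative assumption).
    """
    n, _ = X.shape
    mean = X.mean(axis=0)             # estimated expectation vector
    cov = np.cov(X, rowvar=False)     # empirical covariance matrix
    T = projection_matrix(H)
    Tm = T @ mean
    # Wald-type statistic: weights the contrast by a generalized inverse of its covariance
    wts = n * Tm @ np.linalg.pinv(T @ cov @ T) @ Tm
    # ANOVA-type statistic: avoids inverting the covariance, scales by a trace instead
    ats = n * (mean @ T @ mean) / np.trace(T @ cov)
    return float(wts), float(ats)


# Simulated data, purely for illustration
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))

# Hypothesis "all components of the mean vector are equal" via the centering matrix
H = np.eye(4) - np.ones((4, 4)) / 4
wts, ats = quadratic_form_statistics(X, H)
```

Because the ATS does not invert the covariance estimate, it remains computable when the dimension exceeds the sample size, which is one reason ANOVA-type forms are attractive in high-dimensional settings. Critical values would still have to come from an asymptotic approximation or a resampling scheme, which this sketch does not cover.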
