Inference for multivariate and high-dimensional data in heterogeneous designs
Date
2021
Abstract
In this cumulative thesis, we develop statistical tests for different hypotheses
about multivariate and high-dimensional data. Quadratic forms are a suitable way to obtain
scalar test statistics for multivariate problems. The most common are Wald-type statistics
(WTS) and ANOVA-type statistics (ATS), as well as centered and standardized versions of them.
[Pauly et al., 2015] and [Chen and Qin, 2010] also used such quadratic forms to analyze
hypotheses regarding the expectation vector of high-dimensional observations. They
made different assumptions, but allowed only one and two groups, respectively.
We extend the approach of [Pauly et al., 2015] to multiple groups, which leads to a
multitude of possible asymptotic frameworks that even allow the number of groups to
grow. In the considered split-plot design with normally distributed data, we investigate
the asymptotic distribution of the standardized centered quadratic form under different
conditions. In most cases, we show that each limit distribution is obtained only
under its specific conditions. For the frequently assumed case of equal covariance
matrices, we also widen the considered asymptotic frameworks, since the sample sizes of
the individual groups do not necessarily have to grow. Moreover, we add other cases in which
the limit distribution can be calculated. These hold under homoscedastic covariance matrices
as well as in the general case. This extension of the asymptotic frameworks is one
example of how the assumption of homoscedastic covariance matrices allows broader
conclusions. Moreover, assuming equal covariance matrices simplifies calculations or
enables us to use a larger statistical toolbox. For the more general problem of testing hypotheses
regarding covariance matrices, existing procedures have strict assumptions (e.g., in
[Muirhead, 1982], [Anderson, 1984], and [Gupta and Xu, 2006]), test only special hypotheses
(e.g., in [Box, 1953]), or are known to have low power (e.g., in [Zhang and Boos, 1993]).
We introduce an intuitive approach with fewer restrictions, a multitude of possible null hypotheses,
and a convincing small-sample approximation. Nearly every quadratic
form known from mean-based analysis can be used, and two bootstrap approaches are
applied to improve their performance. Furthermore, the approach can be extended to many other
situations, such as testing hypotheses about correlation matrices or checking whether the covariance
matrix has a particular structure.
In extensive simulation studies, we investigated the type-I error rate of all developed tests
and their power to detect deviations from the null hypothesis, for sample sizes ranging
from small to large.
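
For readers unfamiliar with the quadratic forms named in the abstract, the following minimal Python sketch illustrates the textbook one-sample versions of the WTS and ATS for a hypothesis of the form T mu = 0, where T is a symmetric projection matrix. The abstract itself gives no formulas, so the function names (quadratic_form_statistics, centering_matrix), the one-sample setting, and the use of the plain sample covariance are assumptions made purely for illustration; the thesis treats more general multi-group and high-dimensional designs, as well as centered and standardized versions that are not shown here.

```python
import numpy as np


def centering_matrix(d):
    """Projection matrix for the hypothesis 'all d components of the mean are equal'."""
    return np.eye(d) - np.ones((d, d)) / d


def quadratic_form_statistics(X, T):
    """Return (WTS, ATS) for H0: T @ mu = 0 in a one-sample design.

    X : (n, d) array with n independent observations of dimension d.
    T : (d, d) symmetric, idempotent hypothesis matrix (assumed, not checked).
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    mean = X.mean(axis=0)
    S = np.cov(X, rowvar=False)  # unbiased sample covariance matrix
    Tm = T @ mean

    # Wald-type statistic: uses the Moore-Penrose inverse of T S T.
    # rcond is set explicitly so that numerically zero singular values are dropped.
    wts = n * Tm @ np.linalg.pinv(T @ S @ T, rcond=1e-10) @ Tm

    # ANOVA-type statistic: no matrix inversion, only scaling by a trace.
    ats = n * (Tm @ Tm) / np.trace(T @ S)
    return wts, ats


# Illustrative data only: 20 observations of dimension 4.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))
T = centering_matrix(4)
wts, ats = quadratic_form_statistics(X, T)
```

The ATS avoids inverting an estimated covariance matrix, which is one reason statistics of this type remain usable when the dimension is large relative to the sample size, whereas the WTS relies on a (generalized) inverse.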