Robust modelling of count data
Date
2012-03-30
Abstract
M-estimators, as modified versions of maximum likelihood estimators, and their asymptotic
properties have played an important role in the development of modern robust statistics
since the 1960s. In this thesis, we construct new M-estimators based on Tukey’s bisquare
function to fit count data robustly. The Poisson distribution provides a standard framework
for the analysis of this type of data.
For independent and identically distributed Poisson data, M-estimators based on the
Huber and Tukey bisquare functions are compared via simulations with existing estimators implemented
in R, both for clean data and for data with additive outliers. It turns out
that it is difficult to combine high robustness against outliers and high efficiency under
ideal conditions if the Poisson parameter is small, because such Poisson distributions are
highly skewed. We suggest an alternative estimator based on adaptively trimmed means
as a possible solution to this problem. Our simulation results indicate that a modified
version of the R-function glmrob with external weights gives the best robustness properties
among all estimation procedures based on the Huber function. A new modified
Tukey M-estimator improves on the other procedures based on
the Tukey function and also on those based on the Huber function, particularly in
the case of moderately large and very large outliers. The estimator based on adaptive trimming
provides even better results at small Poisson means.
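To make the idea of a Tukey-type M-estimator for a Poisson mean concrete, the following is a minimal sketch, not the modified estimator developed in the thesis. It assumes Pearson residuals (y - mu) / sqrt(mu), the conventional tuning constant c = 4.685, a median starting value, and a simple iteratively reweighted mean; the function names and the sample data are hypothetical. No bias correction is applied, so for small Poisson means, where the distribution is highly skewed, this plain version would be expected to be biased.

```python
import math
from statistics import median


def tukey_weight(r, c=4.685):
    """Bisquare weight w(r) = (1 - (r/c)^2)^2 for |r| <= c, and 0 otherwise."""
    if abs(r) > c:
        return 0.0
    u = r / c
    return (1.0 - u * u) ** 2


def tukey_psi(r, c=4.685):
    """Bisquare psi function psi(r) = r * w(r); it redescends to 0 beyond c."""
    return r * tukey_weight(r, c)


def poisson_tukey_mean(y, c=4.685, tol=1e-10, max_iter=200):
    """Iteratively reweighted mean solving sum(psi((y_i - mu) / sqrt(mu))) = 0.

    Assumption: no bias (Fisher consistency) correction is applied in this sketch.
    """
    mu = max(float(median(y)), 0.5)  # robust start, kept away from zero
    for _ in range(max_iter):
        scale = math.sqrt(mu)
        weights = [tukey_weight((yi - mu) / scale, c) for yi in y]
        total = sum(weights)
        if total == 0.0:
            break  # every point was rejected; keep the current value
        new_mu = max(sum(w * yi for w, yi in zip(weights, y)) / total, 1e-8)
        converged = abs(new_mu - mu) < tol
        mu = new_mu
        if converged:
            break
    return mu


# Hypothetical sample: eight moderate counts and one large additive outlier.
counts = [3, 4, 5, 4, 3, 5, 4, 4, 100]
sample_mean = sum(counts) / len(counts)
robust_mean = poisson_tukey_mean(counts)
```

In this toy sample the outlier receives weight zero because its Pearson residual exceeds c, so the robust estimate stays near the bulk of the data while the sample mean is pulled upward.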
Furthermore, our work constitutes a first treatment of robust M-estimation of INGARCH
models for count time series. These models assume the observation at each point in time
to follow a Poisson distribution conditionally on the past, with the conditional mean
being a linear function of previous observations and past conditional means. We focus
on the INGARCH(1,0) model as the simplest interesting variant. Our approach based
on Tukey’s bisquare function with bias correction and initialization from a robust AR(1)
fit provides good efficiencies in case of clean data. In the presence of outliers, the bias-corrected
Tukey M-estimators perform better than the uncorrected ones and the conditional
maximum likelihood estimator. The construction of adequate Tukey M-estimators
or the development of other robust estimators for INGARCH models of higher orders
remains an open problem, although some preliminary investigations for the INGARCH(1,1)
model are presented here.
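The verbal model description above can be written out as follows. This is a sketch in assumed notation: the symbols Y_t, \lambda_t, \beta_i, \alpha_j and \mathcal{F}_{t-1} are not taken from the thesis, and the order convention (first index for past observations, second for past conditional means) is inferred from the INGARCH(1,0) case being described as the simplest variant.

```latex
% INGARCH(p,q): Poisson observations with a linear conditional mean
Y_t \mid \mathcal{F}_{t-1} \sim \mathrm{Poisson}(\lambda_t), \qquad
\lambda_t = \beta_0 + \sum_{i=1}^{p} \beta_i \, Y_{t-i} + \sum_{j=1}^{q} \alpha_j \, \lambda_{t-j}

% INGARCH(1,0): the conditional mean depends only on the previous observation
\lambda_t = \beta_0 + \beta_1 \, Y_{t-1}, \qquad \beta_0 > 0, \quad 0 \le \beta_1 < 1

% Stationary mean in the INGARCH(1,0) case
\mathrm{E}(Y_t) = \frac{\beta_0}{1 - \beta_1}
```

In this form the INGARCH(1,0) model has the same mean structure as an AR(1) process, which is consistent with initializing the estimation from a robust AR(1) fit.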
Some applications to real data from the medical field and artificial data examples indicate
that the INGARCH(1,0) model is a promising candidate for such data, and that the issue
of robust estimation tackled here is important.
Keywords
Asymptotic properties, Count data, GLM models, INGARCH models, Medical applications, M-estimator, Poisson model, Robustness, Tukey M-estimator