Robust modelling of count data

Elsaied, Hanan Abdel kariem Abdel latif

Files

Dissertation.pdf (1.45 MB)

Date

2012-03-30

Authors

Elsaied, Hanan Abdel kariem Abdel latif

Alternative Title(s)

Applications in medicine

Abstract

M-estimators, which are modified versions of maximum likelihood estimators, and their asymptotic properties have played an important role in the development of modern robust statistics since the 1960s. In this thesis, we construct new M-estimators based on Tukey’s bisquare function to fit count data robustly. The Poisson distribution provides a standard framework for the analysis of this type of data. For independent, identically distributed Poisson data, M-estimators based on the Huber function and on Tukey’s bisquare function are compared by simulation with existing estimators implemented in R, both for clean data and in the presence of additive outliers. It turns out that it is difficult to combine high robustness against outliers with high efficiency under ideal conditions if the Poisson parameter is small, because such Poisson distributions are highly skewed. We suggest an alternative estimator based on adaptively trimmed means as a possible solution to this problem. Our simulation results indicate that a modified version of the R function glmrob with external weights has the best robustness properties among all estimation procedures based on the Huber function. A new modified Tukey M-estimator improves on the other procedures based on the Tukey function and also on those based on the Huber function, particularly in the case of moderately large and very large outliers. The estimator based on adaptive trimming provides even better results at small Poisson means. Furthermore, our work constitutes a first treatment of robust M-estimation of INGARCH models for count time series. These models assume that the observation at each point in time follows a Poisson distribution conditionally on the past, with the conditional mean being a linear function of previous observations and past conditional means. We focus on the INGARCH(1,0) model as the simplest interesting variant.
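To make the model description concrete, the following minimal Python sketch simulates an INGARCH(1,0) series. It assumes the common reading of INGARCH(1,0) in which the conditional mean depends only on the previous observation, lambda_t = beta0 + alpha1 * Y_{t-1}; the names `beta0`, `alpha1`, `sample_poisson` and `simulate_ingarch10` are illustrative and are not taken from the thesis.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate with Knuth's multiplication method.

    Adequate for the small means used here; it becomes slow for large lam.
    """
    threshold = math.exp(-lam)
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_ingarch10(n, beta0, alpha1, seed=0):
    """Simulate Y_t | past ~ Poisson(lambda_t), lambda_t = beta0 + alpha1 * Y_{t-1}.

    Assumed parametrisation (not taken from the thesis): beta0 > 0 and
    0 <= alpha1 < 1, which gives the stationary mean beta0 / (1 - alpha1).
    """
    if not (beta0 > 0 and 0 <= alpha1 < 1):
        raise ValueError("need beta0 > 0 and 0 <= alpha1 < 1")
    rng = random.Random(seed)
    y_prev = round(beta0 / (1 - alpha1))  # start near the stationary mean
    series = []
    for _ in range(n):
        lam = beta0 + alpha1 * y_prev  # conditional mean given the past
        y_prev = sample_poisson(lam, rng)
        series.append(y_prev)
    return series


series = simulate_ingarch10(n=5000, beta0=2.0, alpha1=0.5)
sample_mean = sum(series) / len(series)
print(f"sample mean: {sample_mean:.3f}, stationary mean: {2.0 / (1 - 0.5):.3f}")
```

Under these assumptions the sample mean should lie close to the stationary mean beta0 / (1 - alpha1). Adding a large constant to a single observation of such a series would mimic the additive outliers studied in the abstract.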
Our approach, based on Tukey’s bisquare function with bias correction and initialization from a robust AR(1) fit, provides good efficiencies for clean data. In the presence of outliers, the bias-corrected Tukey M-estimators perform better than both the uncorrected ones and the conditional maximum likelihood estimator. The construction of adequate Tukey M-estimators, or the development of other robust estimators, for INGARCH models of higher orders remains an open problem, although some preliminary investigations for the INGARCH(1,1) model are presented here. Applications to real data from the medical field and artificial data examples indicate that the INGARCH(1,0) model is a promising candidate for such data, and that the issue of robust estimation tackled here is important.
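As a rough illustration of why robust estimation matters for count data, the sketch below applies Tukey’s bisquare function to a small made-up sample containing one additive outlier. It is a plain iteratively reweighted mean on Pearson residuals, not the estimator constructed in the thesis: it has no bias correction, the data are invented, and the tuning constant `c = 4.685` is the conventional value for Gaussian data, used here only as a placeholder. All function names are hypothetical.

```python
import math
import statistics


def tukey_psi(r, c=4.685):
    """Tukey's bisquare psi: r * (1 - (r/c)^2)^2 for |r| <= c, else 0."""
    if abs(r) > c:
        return 0.0
    return r * (1.0 - (r / c) ** 2) ** 2


def tukey_weight(r, c=4.685):
    """Weight psi(r)/r: decreases smoothly from 1 at r = 0 to 0 for |r| >= c."""
    if abs(r) > c:
        return 0.0
    return (1.0 - (r / c) ** 2) ** 2


def tukey_poisson_mean(data, c=4.685, max_iter=100, tol=1e-8):
    """Iteratively reweighted mean using Pearson residuals (y - mu) / sqrt(mu).

    Illustrative only: no bias correction is applied, so the result may be
    biased for small, skewed Poisson means.
    """
    mu = max(statistics.median(data), 0.5)  # robust start, kept away from 0
    for _ in range(max_iter):
        scale = math.sqrt(mu)  # Poisson standard deviation at the current mean
        weights = [tukey_weight((y - mu) / scale, c) for y in data]
        total = sum(weights)
        if total == 0:
            break  # every point rejected; keep the current value
        new_mu = sum(w * y for w, y in zip(weights, data)) / total
        converged = abs(new_mu - mu) < tol
        mu = new_mu
        if converged:
            break
    return mu


# Made-up counts with a single additive outlier (50) appended.
clean = [3, 5, 4, 6, 2, 5, 4, 7, 3, 5, 4, 6, 5, 4, 3]
contaminated = clean + [50]

plain_mean = sum(contaminated) / len(contaminated)
robust_mean = tukey_poisson_mean(contaminated)
print(f"clean mean: {sum(clean) / len(clean):.2f}")
print(f"contaminated mean: {plain_mean:.2f}, Tukey estimate: {robust_mean:.2f}")
```

Because the bisquare psi function redescends to zero, the outlier receives weight zero and the estimate stays near the mean of the clean counts, whereas the ordinary mean is pulled upward.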

Keywords

Asymptotic properties, Count data, GLM models, INGARCH models, Medical applications, M-estimator, Poisson model, Robustness, Tukey M-estimator

Subjects based on RSWK

Medizin (medicine), Poisson-Verteilung (Poisson distribution), Robuste Statistik (robust statistics), Schätzfunktion (estimator)

URI

http://hdl.handle.net/2003/29404
http://dx.doi.org/10.17877/DE290R-6953

Collections

Fachgebiet Statistik in den Biowissenschaften
