Robust normality test and robust power transformation with application to state change detection in non normal processes

Date

2013-06-14

Abstract

The primary objective of this thesis is to construct a powerful state change detection procedure for monitoring time series, one that helps decision makers react faster to changes in the system and choose the proper course of action in each case. As a first step towards this goal, we derive a robust test of approximate normality based on the Shapiro-Wilk test (RSW), which detects whether the majority of the data follow a normal distribution. The RSW test trims the original sample, replaces the observations in the tails with artificially generated normally distributed data, and then performs the Shapiro-Wilk test on the modified sequence. We show that, under the null hypothesis of normality, the modified sequence is asymptotically normally distributed and the RSW test statistic has the same asymptotic null distribution as the Shapiro-Wilk test statistic. The RSW test proves to be resistant to outliers and outperforms the other robust tests for normality considered when outliers are present. Because we intend to use the RSW test to construct a robust estimator of the Box-Cox transformation parameter, we also investigate its behaviour with respect to the inverse Box-Cox transformation. Here too it proves resistant to outliers and outperforms its competitors in the presence of a few outliers. Secondly, we use the RSW test to derive a robust estimator of the Box-Cox transformation parameter (λ̂_RSW). This approach is consistent with the fact that the Box-Cox transformation achieves only approximate normality and that the Shapiro-Wilk test is one of the most powerful tests of normality. Gaudard & Karson (2000) had already derived a non-robust estimator of the Box-Cox transformation parameter based on the Shapiro-Wilk test statistic, which outperformed the other estimators in their comparison.
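The trim-and-replace idea and the test-based choice of the Box-Cox parameter can be illustrated with a minimal Python sketch. It is not the thesis's implementation (the thesis used R), and several details are assumptions made only for illustration: the 10% trimming fraction per tail, the median and rescaled MAD as the fitted centre and scale, inverse-CDF draws from the fitted normal tails, and a grid search that maximises the statistic. To stay dependency-free, the sketch also swaps in the Shapiro-Francia W' statistic as a stand-in for the Shapiro-Wilk W. All function names are hypothetical.

```python
import math
import random
from statistics import NormalDist, median

STD_NORMAL = NormalDist()


def normality_w(sample):
    """Shapiro-Francia W' (stand-in for Shapiro-Wilk W); close to 1 for normal data."""
    x = sorted(sample)
    n = len(x)
    # Blom's approximation to the expected standard normal order statistics.
    scores = [STD_NORMAL.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]
    mean = sum(x) / n
    num = sum(s * (v - mean) for s, v in zip(scores, x)) ** 2
    den = sum(s * s for s in scores) * sum((v - mean) ** 2 for v in x)
    return num / den


def replace_tails(sample, trim=0.1, rng=None):
    """Replace the lowest and highest `trim` fraction by simulated normal tail values.

    Assumes trim < 0.5 and a non-degenerate sample (MAD > 0).
    """
    if rng is None:
        rng = random.Random(0)
    x = sorted(sample)
    n = len(x)
    k = int(trim * n)  # number of observations replaced in each tail
    if k == 0:
        return x
    centre = median(x)
    # MAD rescaled to estimate the standard deviation under normality.
    scale = 1.4826 * median(abs(v - centre) for v in x)
    fitted = NormalDist(centre, scale)
    core = x[k:n - k]
    p_low = fitted.cdf(core[0])          # fitted mass below the lower cut
    p_high = 1.0 - fitted.cdf(core[-1])  # fitted mass above the upper cut

    def tail_draw(p):
        # Inverse-CDF draw from the fitted lower tail of mass p,
        # clamped away from 0 so that inv_cdf stays defined.
        u = max(p * (1.0 - rng.random()), 1e-300)
        return fitted.inv_cdf(u)

    low = [tail_draw(p_low) for _ in range(k)]
    # The normal is symmetric, so reflected lower-tail draws give upper-tail draws.
    high = [2.0 * centre - tail_draw(p_high) for _ in range(k)]
    return low + core + high


def box_cox(y, lam):
    """Box-Cox power transformation of positive data."""
    if abs(lam) < 1e-12:
        return [math.log(v) for v in y]
    return [(v ** lam - 1.0) / lam for v in y]


def estimate_lambda(y, grid, trim=0.1):
    """Grid value whose transformed, tail-replaced sample looks most normal."""
    def criterion(lam):
        # Fresh seeded generator so every candidate uses the same random draws.
        return normality_w(replace_tails(box_cox(y, lam), trim, random.Random(0)))
    return max(grid, key=criterion)


# Usage: simulated normal data with five gross outliers, then a skewed sample.
rng = random.Random(42)
clean = [rng.gauss(0.0, 1.0) for _ in range(200)]
contaminated = clean[:195] + [50.0] * 5
print("W' raw:", normality_w(contaminated))
print("W' after tail replacement:", normality_w(replace_tails(contaminated)))
skewed = [math.exp(v) for v in clean]
print("lambda estimate:", estimate_lambda(skewed, [i / 10 for i in range(-20, 21)]))
```

Using a fixed seed inside the criterion keeps the comparison across candidate values of the parameter deterministic; whether the thesis handles the simulated tails this way is not stated in the abstract.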
As expected, λ̂_RSW is preferable to the maximum-likelihood estimator and the M-estimators we considered, mainly because it yields a better transformation: the transformed samples are not only more symmetrical according to the medcouple (a robust measure of symmetry and tail weight), but also have a higher pass rate for the RSW test and the MC1 test at a significance level of 5%. Finally, returning to state change detection, we adopt the method of Harrison & Stevens (1971), which considers four states: the steady state (normal state), the step change (level shift), the slope change and the outlier. Because the assumption of normally distributed data restricts the use of the procedure, we transform the data with λ̂_RSW to achieve approximate normality. We extend the update equations to two observations in the past, that is, we compute the probability that a state change occurred at time t - 2 given all data available up to time t. This extension is used to derive classification rules for the incoming observations, since the procedure itself only computes a posteriori probabilities for the different states and does not classify the observations. We use linear discriminant analysis and intensive simulations to derive the classification rules. We derive an instantaneous classification, which separates the step change and the outlier from the slope change and the steady state on arrival of each observation, and a one-step-after classification, which separates three classes (outlier; step change; slope change or steady state) one step after each observation becomes available. The simulations show that the first rule has an out-of-sample classification error of 2.1% and the second rule one of 3.11%. By contrast, the naive classification rule, which classifies according to the estimated a posteriori probability, yields misclassification errors of 5.35% and 7.29%, respectively. Unfortunately, no classification rule for the slope change is derived.
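The naive baseline mentioned above can be sketched in a few lines of Python. This is one possible reading of "classify according to the estimated a posteriori probability", namely picking the most probable of the four states and then reporting the class it belongs to under the instantaneous rule. The function name, the dictionary layout and the probabilities are all made up for illustration and do not come from the thesis.

```python
STATES = ("steady state", "step change", "slope change", "outlier")

# Grouping of the four states into the two classes of the instantaneous rule.
INSTANT_CLASS = {
    "step change": "step change or outlier",
    "outlier": "step change or outlier",
    "slope change": "slope change or steady state",
    "steady state": "slope change or steady state",
}


def naive_state(posterior):
    """Return the state with the largest a posteriori probability."""
    return max(STATES, key=lambda state: posterior[state])


# Made-up posterior probabilities for a single observation (illustration only).
posterior = {
    "steady state": 0.05,
    "step change": 0.60,
    "slope change": 0.10,
    "outlier": 0.25,
}
state = naive_state(posterior)
print(state, "->", INSTANT_CLASS[state])
```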
One could exploit the fact that the extension to past observations can be carried back as many observations as one wishes, which increases the probability of detecting a slope change. In addition, we do not consider classification procedures other than linear discriminant analysis, although other procedures may yield better results than ours. All computations in this work were carried out in R (R Core Team, 2012).

Keywords

Normality test, Power transformations, Robust, State change detection
