Robust and non-parametric control charts for time series with a time-varying trend
Date
2019
Abstract
The detection of structural breaks is an important task in many online-monitoring applications. Dynamics in the underlying signal can lead to a time series with a time-varying trend. This complicates the distinction between natural fluctuations and sudden structural breaks that are relevant to the application. Moreover, outlier patches can be confused with structural breaks or mask them.
A frequently formulated goal is to achieve a high detection quality while keeping the number of false alarms low. An example is the monitoring of vital parameters in intensive care where false alarms can lead to alarm fatigue of the medical staff, but missed structural breaks may cause health-threatening situations for the patient. The number of false alarms is often controlled by the average run length (ARL) or median run length (MRL), which measure the duration between two consecutive alarms.
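As a small illustration of these two measures, the following sketch computes empirical run lengths from a list of alarm times and summarises them by their mean (ARL) and median (MRL). The function names and the example alarm times are hypothetical and not taken from the thesis.

```python
from statistics import mean, median

def run_lengths(alarm_times):
    """Durations between consecutive alarms, given increasing alarm time indices."""
    return [later - earlier for earlier, later in zip(alarm_times, alarm_times[1:])]

def arl_mrl(alarm_times):
    """Empirical average and median run length from observed alarm times."""
    lengths = run_lengths(alarm_times)
    if not lengths:
        raise ValueError("at least two alarms are needed to form a run length")
    return mean(lengths), median(lengths)

# Hypothetical alarm times (observation indices), for illustration only.
alarms = [10, 30, 35, 95]
arl, mrl = arl_mrl(alarms)
```

The median is less affected by a few very long or very short runs than the mean, which is one reason to report the MRL alongside the ARL.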
Typical procedures for online monitoring under these premises are control charts. They compute a control statistic from the most recent observations and compare it to control limits. In this way, one can decide whether the process is in control or out of control.
In this thesis, control charts for the mean function are developed and studied. They are based on the sequential application of two-sample location tests in a moving time window to the most recent observations. The window is split into two subwindows which are compared with each other by the test. Unlike popular control schemes like the Shewhart, CUSUM, or EWMA scheme, a large set of historical data to specify the control limits is not required. Moreover, the control charts only depend on local model assumptions, allowing for an adaptation to the local process behaviour. In addition, by choosing appropriate window widths, it is possible to tune the control charts to be robust against a specific number of consecutive outliers. Thus, they can automatically distinguish between outlier patches and location shifts.
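The moving-window scheme described above can be sketched as follows. This is a minimal illustration, not the implementation from the thesis: it assumes the Wilcoxon rank-sum statistic as the two-sample test, and the names `h` (reference subwindow width), `k` (test subwindow width), `lower`, and `upper` (control limits) are hypothetical.

```python
def rank_sum(reference, test):
    """Wilcoxon rank-sum statistic of the test subwindow (midranks for ties)."""
    pooled = sorted(reference + test)

    def midrank(value):
        first = pooled.index(value)              # first 0-based position of value
        last = first + pooled.count(value) - 1   # last 0-based position of value
        return (first + last) / 2 + 1            # average 1-based rank

    return sum(midrank(value) for value in test)

def moving_window_chart(series, h, k, lower, upper):
    """Yield (time index, statistic, alarm) once h + k observations are available."""
    n = h + k
    for t in range(n, len(series) + 1):
        window = series[t - n:t]                 # the n most recent observations
        stat = rank_sum(window[:h], window[h:])  # compare the two subwindows
        yield t - 1, stat, (stat <= lower or stat >= upper)

# Hypothetical data: a constant level followed by an upward shift.
series = [0.1, 0.3, 0.2, 0.4, 0.0, 0.5, 5.1, 5.3, 5.2]
results = list(moving_window_chart(series, h=3, k=3, lower=6, upper=15))
alarm_times = [t for t, _, alarm in results if alarm]
```

With `h = k = 3` the rank sum of the test subwindow ranges from 6 to 15, so the limits used here signal only when the two subwindows are completely separated. In this toy example a single shifted value in the test subwindow does not trigger an alarm, while two or more do.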
Via simulations and some theoretical considerations, control charts based on selected two-sample tests are compared. Assuming a locally constant signal, the ability to detect sudden location shifts is studied. It is shown that the in-control run-length distribution of charts based on rank tests does not depend on the data-generating distribution. Hence, such charts keep a desired in-control ARL or MRL under every continuous distribution. Moreover, control charts based on robust competitors of the well-known two-sample t-test are considered. The difference of the sample means and the pooled empirical standard deviation are replaced with robust estimators for the height of a location shift and the scale. In the case of tests based on the two-sample Hodges-Lehmann estimator for shift and the difference of the sample medians, the in-control ARL and MRL seem to be nearly distribution free when the control limits are computed with a randomisation principle. In general, the charts retain properties of the underlying tests. Out-of-control simulations indicate that a test which is efficient over a wide range of distributions leads to a control chart with a high detection quality. Moreover, the robustness of the tests is inherited by the charts. In the considered settings, the two-sample Hodges-Lehmann estimator leads to a control chart with promising overall results concerning the in- and out-of-control behaviour.
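To make the randomisation idea concrete, here is a rough sketch of the two-sample Hodges-Lehmann shift estimate (the median of all pairwise differences between the subwindows) together with an exact randomisation p-value. Several simplifications are assumptions of this sketch rather than details from the thesis: the statistic is not standardised by a scale estimate, the decision is expressed as a p-value instead of explicit control limits, and all splits of the window are enumerated, which is only feasible for small windows.

```python
from itertools import combinations
from statistics import median

def hl_shift(x, y):
    """Two-sample Hodges-Lehmann estimate: median of all differences y_j - x_i."""
    return median(b - a for a in x for b in y)

def randomisation_p_value(x, y):
    """Exact two-sided randomisation p-value for |hl_shift| over all splits."""
    pooled = list(x) + list(y)
    observed = abs(hl_shift(x, y))
    indices = range(len(pooled))
    extreme = total = 0
    for chosen in combinations(indices, len(x)):
        chosen_set = set(chosen)
        xs = [pooled[i] for i in chosen]                        # reassigned first sample
        ys = [pooled[i] for i in indices if i not in chosen_set]  # remaining values
        total += 1
        # Small tolerance guards against floating-point noise in the comparison.
        if abs(hl_shift(xs, ys)) >= observed - 1e-12:
            extreme += 1
    return extreme / total

# Hypothetical subwindows with a clear upward shift of about 5.
reference = [0.1, 0.3, 0.2]
test = [5.1, 5.3, 5.2]
shift = hl_shift(reference, test)
p_value = randomisation_p_value(reference, test)
```

Because the reference distribution is built from the observations in the window itself, no distributional assumption enters the limits. For wider windows one would presumably draw random splits instead of enumerating all of them.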
While being able to deal with very slow trends, the moving-window approach deteriorates for stronger trends of an in-control process. By confusing trends with location shifts, the number of false alarms becomes unacceptably large. The aforementioned approach is extended by constructing residual control charts based on local model fitting. The idea is to compute a sequence of one-step-ahead forecast errors from the most recent observations to remove the trend and apply the tests to them. This combination makes it possible to detect location shifts and sudden trend changes in the original time series. Robust regression estimators retain the information on change points in the sequence better than non-robust ones. Based on a literature summary on robust online filtering procedures, which is also part of this thesis, the one-step-ahead forecast errors are computed by repeated median regression. The conclusions are similar to those for the locally constant signal: efficient robust tests lead to control charts with a high detection quality, and robustness is preserved. However, due to correlated forecast errors, in-control ARL and MRL are not completely distribution free. Still, it is possible to construct charts for which these measures seem approximately distribution free. Again, a chart based on the two-sample Hodges-Lehmann estimator turns out to perform well.
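The forecast-error step can be sketched as follows, assuming equally spaced observations and a local linear trend fitted by repeated median regression (slope as the median over i of the median over j of the pairwise slopes, level as the median of the slope-adjusted observations). The window width `n`, the function names, and the example series are hypothetical choices for this illustration.

```python
from statistics import median

def repeated_median_line(y):
    """Repeated median fit of level and slope on equally spaced times 0..n-1."""
    n = len(y)
    slope = median(
        median((y[j] - y[i]) / (j - i) for j in range(n) if j != i)
        for i in range(n)
    )
    intercept = median(y[i] - slope * i for i in range(n))
    return intercept, slope

def forecast_errors(series, n):
    """One-step-ahead forecast errors from a moving window of width n."""
    errors = []
    for t in range(n, len(series)):
        intercept, slope = repeated_median_line(series[t - n:t])
        prediction = intercept + slope * n       # extrapolate one step past the window
        errors.append(series[t] - prediction)
    return errors

# Hypothetical series: linear trend 1 + 2t, one outlier (index 2), shift of +10 from index 8.
series = [1, 3, 100, 7, 9, 11, 13, 15, 27, 29]
errors = forecast_errors(series, n=5)
```

In this toy example the single outlier does not distort the fitted trend, so the forecast errors stay at zero until the level shift appears in them. A two-sample test applied to the error sequence could then pick up the shift without mistaking the trend for a change.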
A first investigation under the assumption of a local autoregressive model of order one is also provided. In this case, the in- and out-of-control performances of the charts depend not only on the underlying distribution but also on the strength of the autocorrelation. Under distributional assumptions, the results indicate that an acceptable detection quality for small to moderate autocorrelation can be achieved.
The application of the control charts to data from different real-world applications indicates that they can reliably detect large structural breaks even in trend periods. Additional rules can be helpful to further reduce the number of false alarms and facilitate the distinction between relevant and irrelevant changes. Furthermore, it is shown how the procedures can be modified to detect sudden variability changes in time series with a non-linear signal.
Keywords
Control charts, Non-linearity, Non-parametric, Online monitoring, Robustness, Shift detection, Signal extraction, Time-series analysis