SFB 823 Discussion Paper Nr. 48/2010
Goodness-of-fit tests in long-range dependent processes under
fixed alternatives
Holger Dette, Kemal Sen
Ruhr-Universität Bochum
Fakultät für Mathematik
44780 Bochum
Germany
email: holger.dette@ruhr-uni-bochum.de, kemal.sen@ruhr-uni-bochum.de
FAX: +49 234 32 14559
December 3, 2010
Abstract
In a recent paper Fay and Philippe (2002) proposed a goodness-of-fit test for long-range depen-
dent processes which uses the logarithmic contrast as information measure. These authors estab-
lished asymptotic normality under the null hypothesis and local alternatives. In the present note
we extend these results and show that the corresponding test statistic is also normally distributed
under fixed alternatives.
AMS Subject Classification: 60F05, 62F03
Keywords: long-range dependence, goodness-of-fit test, asymptotic power, periodogram
1 Introduction
Nowadays long-range dependent processes represent a well accepted class of stochastic processes for
modelling real phenomena in such diverse areas as hydrology, behaviour research, network traffic or
finance [see Koutsoyiannis et al. (2009), Stroe-Kunold et al. (2009), Park and Willinger (2000), Granger
(1980), Greene and Fielitz (1977) among many others]. Numerous parametric models have been proposed
for the analysis of long-range dependent processes. The most important among them are
fractional ARIMA processes which were independently introduced by Granger and Joyeux (1980) and
Hosking (1981) and fractional Gaussian noise processes [see Beran (1994)]. Many of the methods assume
that the specific form of the spectrum is known except for a finite dimensional parameter. The results
of the statistical analysis depend sensitively on pre-specified model assumptions, and the conclusions
from the data may be misleading if these assumptions are violated. For this reason several authors have
pointed out the importance of being able to check the goodness-of-fit of a specific model assumption
in long-range dependent processes. Beran (1992) proposed a method for testing how well a specified
model, such as a fractional Gaussian noise, fits the data. His results were extended by Deo and Chen
(2000) who investigated an integral of the squared periodogram. Chen and Deo (2004) suggested a
generalized Portmanteau test based on the discrete spectral average estimator and obtained the asymp-
totic null distribution for Gaussian long-memory time series. While most of the tests proposed by these
authors are based on the estimation of the L2 distance between the unknown spectral density and the
best approximation by the parametric class, Fay and Philippe (2002) used a logarithmic contrast for
the construction of a test for a specific parametric form of the spectral density [see also Mokkadem
(1997) or Dette and Spreckelsen (2003) for an application of this measure in the context of ARMA
processes]. These authors established the asymptotic normality of a corresponding test statistic under
the null hypothesis and local alternatives.
As pointed out by Chen and Deo (2004), most theoretical results in the context of goodness-of-fit testing
address the asymptotic behaviour of a test statistic when the null hypothesis is correctly specified,
and an additional question of interest is the power property of the corresponding test when the null
hypothesis is actually misspecified. This problem requires asymptotic inference under the alternative
and has found considerable interest in the context of classical regression analysis [see Dette (1999) or
Dette (2002) among others]. Dette and Spreckelsen (2003) investigated the asymptotic properties of
an L2-test proposed by Paparoditis (2000) for the parametric form of the spectral density in stationary
short-range dependent processes, but fewer results are available for goodness-of-fit tests in long-range
dependent processes.
The present paper is devoted to the asymptotic analysis of the test statistic proposed by Fay and Philippe
(2002) under fixed alternatives. In Section 2 we introduce the necessary notations and assumptions and
review the results of Fay and Philippe (2002). Section 3 presents our main results which show that the
test statistic proposed by Fay and Philippe (2002) is also asymptotically normally distributed under
fixed alternatives. We state a general result which contains the situation of a true null hypothesis as a
special case and also discuss potential applications of our results. Finally, for the sake of a transparent
presentation, some technical details are deferred to an appendix in Section 4.
2 Preliminaries
Let $X = (X_t)_{t\in\mathbb{Z}}$ denote a stochastic process which admits the linear representation
$$X_t = \sigma \sum_{j\in\mathbb{Z}} a_j Z_{t-j}, \tag{2.1}$$
where $\sum_{j\in\mathbb{Z}} a_j^2 < \infty$ and $(Z_t)_{t\in\mathbb{Z}}$ denotes a Gaussian white noise process. Following Fay and Philippe
(2002) we represent the spectral density $f$ of the process $(X_t)_{t\in\mathbb{Z}}$ as
$$f(\lambda) = \sigma^2 |1 - e^{i\lambda}|^{-2d_1} f^*(\lambda), \quad \lambda \in [-\pi, \pi], \tag{2.2}$$
where $d_1 \in [0, 1/2)$ and $f^*$ is a twice continuously differentiable function defined on the interval $[-\pi, \pi]$
and bounded away from zero. We are interested in the problem of testing for a specific parametric form
of the spectral density of the process $(X_t)_{t\in\mathbb{Z}}$, that is
$$H_0 : f \in \mathcal{F}_0 . \tag{2.3}$$
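Here $\mathcal{F}_0$ denotes a parametric class of spectral densities defined by
$$\mathcal{F}_0 = \Big\{ f(\lambda) = \sigma^2 |1 - e^{i\lambda}|^{-2d} g^*(\lambda; \theta) \;\Big|\; (d, \theta) \in D \times \Theta,\ \sigma^2 > 0,\ g^* \in \mathcal{G} \Big\}, \tag{2.4}$$
where $D$ is a compact subset of the interval $[0, 1/2)$, $\Theta \subset \mathbb{R}^l$ denotes a compact set ($l \in \mathbb{N}$) and $\mathcal{G}$ is
the set of positive and symmetric functions defined on the interval $[-\pi, \pi]$ satisfying
$$\int_{-\pi}^{\pi} \log g^*(x; \theta)\, dx = 0 .$$
For a given $g^* \in \mathcal{G}$ we define
$$g(\lambda; d, \theta) = |1 - e^{i\lambda}|^{-2d} g^*(\lambda; \theta).$$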
For the testing problem (2.3) Fay and Philippe (2002) proposed to measure deviations from the null
hypothesis by
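$$\inf_{d\in D,\, \theta\in\Theta} S(f, f(\cdot, d, \theta)) \tag{2.5}$$
where
$$S(f, f(\cdot, d, \theta)) = \log \int_{-\pi}^{\pi} \frac{f(\lambda)}{f(\lambda, d, \theta)} \frac{d\lambda}{2\pi} - \int_{-\pi}^{\pi} \log \frac{f(\lambda)}{f(\lambda, d, \theta)} \frac{d\lambda}{2\pi} \tag{2.6}$$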
denotes a logarithmic contrast between the spectral density f and an element of the class F0. Note that
the information measure in (2.6) is always nonnegative and that the null hypothesis is satisfied if and
only if the expression in (2.5) vanishes. The logarithmic contrast has been used before by Mokkadem
(1997) and Dette and Spreckelsen (2003) for testing hypotheses in ARMA processes. In order to estimate
the minimal distance Fay and Philippe (2002) proposed to consider a tapered Fourier transform of the
series $\{X_1, \ldots, X_n\}$, that is
$$d^{(p)}_{n,k} = \frac{1}{\sqrt{2\pi n}} \sum_{t=1}^{n} w^{(p)}_{n,t} X_t e^{i\lambda_k t}, \quad k = 1, \ldots, n,$$
where $\lambda_k = 2\pi k/n$ are the Fourier frequencies,
$$w^{(p)}_{n,t} = \binom{2p}{p}^{-1/2} \left(1 - e^{i 2\pi t/n}\right)^p, \quad t = 1, \ldots, n,$$
is the data taper and $p \in \mathbb{N}_0$ denotes the order of the taper (note that $p = 0$ yields $w^{(0)}_{n,t} = 1$, $t = 1, \ldots, n$).
These quantities are used to define a pooled periodogram by
$$I^X_{n,k} := \frac{1}{m} \sum_{j=(m+p)(k-1)+1}^{(m+p)k-p} |d^{(p)}_{n,j}|^2, \quad k = 1, \ldots, K_n.$$
Throughout this paper $I^Z_{n,k}$ denotes the pooled and tapered periodogram of the Gaussian white noise
$Z_1, \ldots, Z_n$. Note that the interval $[0, \pi]$ is decomposed into $K_n = \lfloor \frac{n-1}{2(m+p)} \rfloor$ intervals ($m \in \mathbb{N}$) of the form
$$[\lambda_{(k-1)(m+p)}, \lambda_{k(m+p)}]$$
and that the center of the $k$th interval is given by
$$x_k := (m+p)\, \frac{2\pi}{n} \left(k - \frac{1}{2}\right). \tag{2.7}$$
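The definitions above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library and a direct O(n^2) Fourier sum, not the implementation used by the authors; the function names `pooled_tapered_periodogram` and `block_centers` are hypothetical.

```python
import cmath
import math


def pooled_tapered_periodogram(x, m=5, p=1):
    """Pooled, tapered periodogram I_{n,k}, k = 1..K_n, of the series x."""
    n = len(x)
    # Taper w_t = C(2p, p)^(-1/2) * (1 - exp(2*pi*i*t/n))^p for t = 1..n (w_t = 1 if p = 0).
    norm = math.comb(2 * p, p) ** -0.5
    w = [norm * (1 - cmath.exp(2j * math.pi * t / n)) ** p for t in range(1, n + 1)]

    def squared_dft(j):
        # |d_{n,j}|^2 at the Fourier frequency lambda_j = 2*pi*j/n.
        lam = 2 * math.pi * j / n
        s = sum(w[t - 1] * x[t - 1] * cmath.exp(1j * lam * t) for t in range(1, n + 1))
        return abs(s) ** 2 / (2 * math.pi * n)

    K = (n - 1) // (2 * (m + p))
    pooled = []
    for k in range(1, K + 1):
        first = (m + p) * (k - 1) + 1
        last = (m + p) * k - p  # the last p frequencies of each block are skipped
        pooled.append(sum(squared_dft(j) for j in range(first, last + 1)) / m)
    return pooled


def block_centers(n, m=5, p=1):
    """Centers x_k of the K_n frequency blocks, as in (2.7)."""
    K = (n - 1) // (2 * (m + p))
    return [(m + p) * 2 * math.pi / n * (k - 0.5) for k in range(1, K + 1)]
```

Each block averages m squared Fourier coefficients, so a series of length n = 64 with m = 5 and p = 1 yields K_n = 5 pooled ordinates.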
Fay and Philippe (2002) introduced the discretized version of (2.6), i.e.
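$$S_n\big(I^X_n, g(\cdot; d, \theta)\big) = \log\left(\frac{1}{K_n} \sum_{k=1}^{K_n} \frac{I^X_{n,k}}{g(x_k; d, \theta)}\right) - \frac{1}{K_n} \sum_{k=1}^{K_n} \log\left(\frac{I^X_{n,k}}{g(x_k; d, \theta)}\right) + \gamma_{m,p}$$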
where the constant $\gamma_{m,p}$ is defined by
$$\gamma_{m,p} = E\big[\log 2\pi I^Z_{n,k}\big] \tag{2.8}$$
and serves as a centering constant, such that the expectation under the null hypothesis vanishes asymptotically. For the cases
1.) $d_0 = 0$, $m \ge 5$ and $p = 0$ or $p = 1$
2.) $d_0 > 0$, $m \ge 5$ and $p = 1$
Fay and Philippe (2002) proved that under the null hypothesis, i.e. $f(\lambda) = g(\lambda, d_0, \theta_0)$ for some $(d_0, \theta_0) \in D \times \Theta$, and certain assumptions of regularity [see Section 3 for details], the statistic $\sqrt{K_n}\, S_n(I^X_n, g(\cdot; \hat d_n, \hat\theta_n))$
converges weakly, that is
$$T_n = \sqrt{K_n}\, S_n\big(I^X_n, g(\cdot; \hat d_n, \hat\theta_n)\big) \xrightarrow{d} N\big(0, \tau_0^2\big), \tag{2.9}$$
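where $(\hat d_n, \hat\theta_n)$ is any estimator of the true parameter $(d_0, \theta_0)$ satisfying
$$\big\| (\hat d_n, \hat\theta_n) - (d_0, \theta_0) \big\| = O_p\Big(\frac{1}{\sqrt{n}}\Big)$$
and the asymptotic variance in (2.9) is given by
$$\tau_0^2 := \mathrm{Var}\Big(2\pi I^Z_{n,k} - \log\big(2\pi I^Z_{n,k}\big)\Big).$$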
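The discretized contrast $S_n$ defined above is straightforward to evaluate once the pooled periodogram and the values $g(x_k; d, \theta)$ are available. The following Python sketch is illustrative only; the function names are hypothetical, and the FARIMA(0, d, 0) shape with $g^* \equiv 1$ is used merely as an example of a member of the class (2.4).

```python
import math


def log_contrast_statistic(pooled, g_values, gamma_mp):
    """S_n = log(mean(I_k / g_k)) - mean(log(I_k / g_k)) + gamma_{m,p}."""
    ratios = [i_k / g_k for i_k, g_k in zip(pooled, g_values)]
    K = len(ratios)
    mean_ratio = sum(ratios) / K
    mean_log = sum(math.log(r) for r in ratios) / K
    return math.log(mean_ratio) - mean_log + gamma_mp


def farima_0d0_shape(x, d):
    """g(x; d) = |1 - exp(ix)|^(-2d) = (2 sin(x/2))^(-2d) for 0 < x <= pi."""
    return (2 * math.sin(x / 2)) ** (-2 * d)
```

By Jensen's inequality the first two terms are nonnegative, and the statistic does not change if the periodogram is multiplied by a constant, which is why the scale $\sigma^2$ need not be estimated.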
For a discussion of the quantities $\gamma_{m,p}$ and $\tau_0^2$ we refer to Hurvich et al. (2002). Note that these authors
did not assume a Gaussian white noise process, but considered a general white noise process (Zt)t∈Z
with several assumptions regarding the characteristic function E[exp(iZt)]. In this case there appears
an additional constant in the asymptotic variance depending on the fourth cumulant of the white noise
process. In the following we will study the asymptotic properties of the statistic Tn if the null hypothesis
is not satisfied. For the sake of simplicity, we restrict ourselves to the Gaussian case. The general case
is briefly discussed in Remark 3.2.
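To give a feeling for the size of $\gamma_{m,p}$ and $\tau_0^2$, the untapered case $p = 0$ can be worked out in closed form. The sketch below rests on an assumption that is not stated in the paper: for Gaussian white noise with unit variance and $p = 0$, the ordinates $2\pi |d^{(0)}_{n,j}|^2$ at distinct Fourier frequencies in $(0, \pi)$ are independent standard exponential variables, so that $2\pi I^Z_{n,k}$ is a Gamma(m) variable divided by m. Under this assumption $\gamma_{m,0} = \psi(m) - \log m$ and $\tau_0^2 = \psi'(m) - 1/m$, where $\psi$ is the digamma function. The function names are hypothetical, and the formulas do not apply to the tapered case $p = 1$.

```python
import math

EULER_GAMMA = 0.5772156649015329


def digamma_int(m):
    """psi(m) for a positive integer m: -gamma_E + sum_{k<m} 1/k."""
    return -EULER_GAMMA + sum(1.0 / k for k in range(1, m))


def trigamma_int(m):
    """psi'(m) for a positive integer m: pi^2/6 - sum_{k<m} 1/k^2."""
    return math.pi ** 2 / 6 - sum(1.0 / k ** 2 for k in range(1, m))


def gamma_untapered(m):
    """gamma_{m,0} = E[log(G/m)] for G ~ Gamma(m, 1)."""
    return digamma_int(m) - math.log(m)


def tau0_sq_untapered(m):
    """tau_0^2 = Var(Y - log Y) for Y = G/m: 1/m + psi'(m) - 2/m."""
    return trigamma_int(m) - 1.0 / m


print(gamma_untapered(5), tau0_sq_untapered(5))
```

The variance formula uses $\mathrm{Cov}(Y, \log Y) = \psi(m+1) - \psi(m) = 1/m$ for $Y = G/m$.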
3 Weak convergence under fixed alternatives
If the null hypothesis is not satisfied, then the minimum distance in (2.5) is positive. Throughout this
paper we assume that there exists a unique pair $(d_0, \theta_0) \in (D \times \Theta)^0$ such that
$$\inf_{(d,\theta)\in D\times\Theta} S(f, g(\cdot; d, \theta)) = S(f, g(\cdot; d_0, \theta_0)),$$
where $C^0$ denotes the interior of the set $C \subset \mathbb{R}^{l+1}$ and $D$ in (2.4) is defined by $D = [\delta, 1/2 - \delta]$ for
some $0 < \delta < 1/4$. We further assume that the set $\Theta$ is additionally convex [see Chen and Deo (2006)].
Note that (d0, θ0) is the parameter corresponding to the best approximation of the spectral density f
by densities of the class F0. Throughout this paper let (dˆn, θˆn) denote a Whittle type estimate [Whittle
(1953)] which is defined as the minimizer of the objective function
$$Q_n(d, \theta) = \frac{\pi}{K_n} \sum_{j=1}^{K_n} \frac{I^X_{n,j}}{g(x_j, d, \theta)} \tag{3.1}$$
where xj is defined in (2.7). In the case where the model is correctly specified, the asymptotic behaviour
of the maximum likelihood estimator was investigated by Dahlhaus (1989). The Whittle estimator
was investigated by Fox and Taqqu (1986) and Giraitis and Surgailis (1990) for Gaussian and linear
processes, respectively. Recently, Chen and Deo (2006) derived the asymptotic properties of an estimator
minimizing an approximation to the negative of the exact Gaussian likelihood [Whittle (1953)] in the
case of misspecified long-range dependent processes. Note that, in contrast to these results, the objective
function (3.1) is based on the tapered and pooled periodogram, while
Chen and Deo (2006) used the classical periodogram in their objective function. A careful
inspection of the proofs in this reference shows that the main results, in particular Theorem 2 and
Lemma 2 of Chen and Deo (2006), remain valid in this case. It is also notable that the asymptotic
properties, in particular the rate of convergence, depend sensitively on the difference $d_1 - d_0$. If
$d_1 - d_0 \le 1/4$, the estimator is asymptotically normal in the sense that $\sqrt{n}\,((\hat d_n, \hat\theta_n) - (d_0, \theta_0))$ converges to a normal distribution, while in the
case $d_1 - d_0 > 1/4$ the difference converges in distribution with a different rate to a non-Gaussian limit.
In particular, the rate of convergence can be arbitrarily slow in this case. In our main result we specify
the asymptotic behaviour of the test statistic proposed by Fay and Philippe (2002) in the case of a
misspecified model. For this purpose we define
$$D(d_0, \theta_0) := \log\left(\frac{1}{\pi} \int_0^{\pi} \frac{f(x)}{g(x; d_0, \theta_0)}\, dx\right) - \frac{1}{\pi} \int_0^{\pi} \log\left(\frac{f(x)}{g(x; d_0, \theta_0)}\right) dx \tag{3.2}$$
as the minimal distance between the true spectral density f and the parametric class defined in (2.4) with
respect to the logarithmic contrast introduced in (2.6). Note that the null hypothesis (2.3) is satisfied
if and only if D(d0, θ0) = 0. We assume that (Xt)t∈Z is a stationary process with linear representation
(2.1) where the innovations (Zt)t∈Z define a Gaussian white noise process and the spectral density of
(Xt)t∈Z is given by (2.2).
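Theorem 3.1. Let $(X_t)_{t\in\mathbb{Z}}$ be a stationary process with linear representation (2.1) and Gaussian white
noise $(Z_t)_{t\in\mathbb{Z}}$, $d_1 \in (0, 1/2)$, $p = 1$, $m \ge 5$, $d_1 - d_0 < 1/4$, and assume that the following conditions are
satisfied:
(A1) $g^*(\lambda; \theta)$ is three times continuously differentiable.
(A2) $\inf_\theta \inf_\lambda g^*(\lambda; \theta) > 0$, $\sup_\theta \sup_\lambda g^*(\lambda; \theta) < \infty$.
(A3) $\sup_\lambda \sup_\theta \left| \frac{\partial g^*(\lambda;\theta)}{\partial\theta_i} \right| < \infty$; $1 \le i \le l$.
(A4) $\sup_\lambda \sup_\theta \left| \frac{\partial^2 g^*(\lambda;\theta)}{\partial\theta_i \partial\theta_j} \right| < \infty$, $\sup_\lambda \sup_\theta \left| \frac{\partial^2 g^*(\lambda;\theta)}{\partial\theta_i \partial\lambda} \right| < \infty$; $1 \le i, j \le l$.
(A5) $\sup_\lambda \sup_\theta \left| \frac{\partial^3 g^*(\lambda;\theta)}{\partial\theta_i \partial\theta_j \partial\theta_k} \right| < \infty$; $1 \le i, j, k \le l$.
(A6) $\int_{-\pi}^{\pi} \log g^*(\lambda; \theta)\, d\lambda = 0$ for all $\theta \in \Theta$.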
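If $n \to \infty$, then
$$\sqrt{K_n} \Big\{ S_n\big(I^X_n, g(\cdot; \hat d_n, \hat\theta_n)\big) - D(d_0, \theta_0) \Big\} \xrightarrow{\mathcal{D}} N(0, \tau_\Delta^2),$$
where $D(d_0, \theta_0)$ denotes the minimal distance between the parametric class $\mathcal{F}_0$ and the unknown spectral
density $f$ defined in (2.2) and the asymptotic variance is given by
$$\tau_\Delta^2 := (\Delta - 1)\, \mathrm{Var}\big(2\pi I^Z_{n,k}\big) + \mathrm{Var}\big(2\pi I^Z_{n,k} - \log 2\pi I^Z_{n,k}\big)$$
with
$$\Delta = \pi \int_0^{\pi} \left(\frac{f(x)}{g(x; d_0, \theta_0)}\right)^2 dx \left(\int_0^{\pi} \frac{f(x)}{g(x; d_0, \theta_0)}\, dx\right)^{-2}. \tag{3.3}$$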
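For a given ratio $r(x) = f(x)/g(x; d_0, \theta_0)$ the constant $\Delta$ in (3.3) and the variance $\tau_\Delta^2$ can be approximated numerically. The following Python sketch uses a plain midpoint rule and hypothetical function names; it is meant for smooth ratios only, since the midpoint rule converges slowly when $d_1 > d_0$ and the ratio has a pole at the origin. The two variances of the white noise periodogram are taken as inputs.

```python
import math


def delta_from_ratio(ratio, num_points=20000):
    """Midpoint-rule approximation of Delta in (3.3) for r(x) = f(x) / g(x; d0, theta0)."""
    h = math.pi / num_points
    xs = [(i + 0.5) * h for i in range(num_points)]
    int_r = h * sum(ratio(x) for x in xs)
    int_r2 = h * sum(ratio(x) ** 2 for x in xs)
    return math.pi * int_r2 / int_r ** 2


def tau_delta_sq(delta, var_y, var_y_minus_log_y):
    """tau_Delta^2 = (Delta - 1) Var(Y) + Var(Y - log Y) with Y = 2 pi I^Z_{n,k}."""
    return (delta - 1.0) * var_y + var_y_minus_log_y
```

By the Cauchy-Schwarz inequality $\Delta \ge 1$, with equality if and only if the ratio is constant, so the variance under a fixed alternative is never smaller than the second term alone.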
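Proof. Recalling the definition of the statistic $T_n$ in (2.9) we introduce the decomposition
$$T_n - \sqrt{K_n}\, D(d_0, \theta_0) = \sqrt{K_n} \Big\{ S_n\big(I^X_n, g(\cdot; \hat d_n, \hat\theta_n)\big) - D(d_0, \theta_0) \Big\} = \sqrt{K_n} \Big\{ A_n + B_n + C_n - D(d_0, \theta_0) \Big\},$$
where the random variables $A_n$, $B_n$ and $C_n$ are defined by
$$A_n := S_n\big(I^X_n, f(\cdot)\big), \tag{3.4}$$
$$B_n := S_n\big(I^X_n, g(\cdot; d_0, \theta_0)\big) - S_n\big(I^X_n, f(\cdot)\big), \tag{3.5}$$
$$C_n := S_n\big(I^X_n, g(\cdot; \hat d_n, \hat\theta_n)\big) - S_n\big(I^X_n, g(\cdot; d_0, \theta_0)\big), \tag{3.6}$$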
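respectively. In the Appendix we will show that
$$A_n = \frac{1}{K_n} \sum_{k=1}^{K_n} \Big\{ 2\pi I^Z_{n,k} - 1 - \log 2\pi I^Z_{n,k} + \gamma_{m,p} \Big\} + o_p\Big(\frac{1}{\sqrt{K_n}}\Big), \tag{3.7}$$
$$B_n = \sum_{k=1}^{K_n} \Big(\beta_{n,k} - \frac{1}{K_n}\Big)\big(2\pi I^Z_{n,k} - 1\big) + \log\left(\frac{1}{K_n} \sum_{k=1}^{K_n} \frac{f(x_k)}{g(x_k; d_0, \theta_0)}\right) - \frac{1}{K_n} \sum_{k=1}^{K_n} \log\left(\frac{f(x_k)}{g(x_k; d_0, \theta_0)}\right) + o_p\Big(\frac{1}{\sqrt{K_n}}\Big), \tag{3.8}$$
$$C_n = o_p\Big(\frac{1}{\sqrt{K_n}}\Big), \tag{3.9}$$
where $I^Z_{n,k}$ denotes the pooled and tapered periodogram of the Gaussian white noise process $(Z_t)_{t\in\mathbb{Z}}$ and
the constants $\beta_{n,k}$ are defined by
$$\beta_{n,k} = \frac{\dfrac{f(x_k)}{g(x_k; d_0, \theta_0)}}{\sum_{j=1}^{K_n} \dfrac{f(x_j)}{g(x_j; d_0, \theta_0)}} = \frac{|1 - e^{ix_k}|^{-2(d_1 - d_0)}\, \dfrac{f^*(x_k)}{g^*(x_k; \theta_0)}}{\sum_{j=1}^{K_n} |1 - e^{ix_j}|^{-2(d_1 - d_0)}\, \dfrac{f^*(x_j)}{g^*(x_j; \theta_0)}} .$$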
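Observing the approximation
$$\begin{aligned}
&\log\left(\frac{1}{K_n} \sum_{k=1}^{K_n} \frac{f(x_k)}{g(x_k; d_0, \theta_0)}\right) - \frac{1}{K_n} \sum_{k=1}^{K_n} \log\left(\frac{f(x_k)}{g(x_k; d_0, \theta_0)}\right) \\
&\quad = \log\left(\frac{1}{K_n} \sum_{k=1}^{K_n} |1 - e^{ix_k}|^{-2(d_1 - d_0)} \frac{f^*(x_k)}{g^*(x_k; \theta_0)}\right) - \frac{1}{K_n} \sum_{k=1}^{K_n} \log\left(|1 - e^{ix_k}|^{-2(d_1 - d_0)} \frac{f^*(x_k)}{g^*(x_k; \theta_0)}\right) \\
&\quad = D(d_0, \theta_0) + O\big(n^{-1 + 2(d_1 - d_0)^+}\big),
\end{aligned}$$
it follows that the weak convergence of $\sqrt{K_n}\{S_n(I^X_n, g(\cdot; \hat d_n, \hat\theta_n)) - D(d_0, \theta_0)\}$ can be obtained from the asymptotic properties
of the random variable
$$\tilde T_n = \sqrt{K_n} \sum_{k=1}^{K_n} \left\{ \Big(\beta_{n,k}\, 2\pi I^Z_{n,k} - \frac{1}{K_n} \log 2\pi I^Z_{n,k}\Big) - \Big(\beta_{n,k} - \frac{1}{K_n} \gamma_{m,p}\Big) \right\}.$$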
For this purpose we use the central limit theorem of Lyapunov. To be precise, we note that the random
variables $2\pi I^Z_{n,k}$ are independent and identically distributed with existing fourth moment and satisfy
$$E\big[2\pi I^Z_{n,k}\big] = 1; \quad k = 1, \ldots, K_n. \tag{3.10}$$
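Therefore we obtain for the variance of $\tilde T_n$ by a straightforward calculation
$$\mathrm{Var}[\tilde T_n] = \mathrm{Var}\big(2\pi I^Z_{n,k}\big)\, K_n \sum_{k=1}^{K_n} \beta_{n,k}^2 + \mathrm{Var}\big(\log 2\pi I^Z_{n,k}\big) - 2\Big\{ E\big[2\pi I^Z_{n,k} \log 2\pi I^Z_{n,k}\big] - \gamma_{m,p} \Big\}.$$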
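Observing the approximation
$$\begin{aligned}
\frac{1}{K_n} \sum_{k=1}^{K_n} \left(\frac{f(x_k)}{g(x_k; d_0, \theta_0)}\right)^j
&= \frac{1}{K_n} \sum_{k=1}^{K_n} \left(|1 - e^{ix_k}|^{-2(d_1 - d_0)} \frac{f^*(x_k)}{g^*(x_k, \theta_0)}\right)^j \\
&= \frac{1}{\pi} \int_0^{\pi} \left(\frac{f(x)}{g(x; d_0, \theta_0)}\right)^j dx + O\big(n^{-1 + 2j(d_1 - d_0)^+}\big); \quad j = 1, 2,
\end{aligned}$$
we obtain by a tedious calculation
$$\lim_{n\to\infty} K_n \sum_{k=1}^{K_n} \beta_{n,k}^2 = \Delta,$$
where $\Delta$ is defined in (3.3) and
$$\sum_{k=1}^{K_n} \beta_{n,k}^2 \le \frac{C}{n} \tag{3.11}$$
(note that $d_1 - d_0 < 1/4$ by assumption). Combining these results gives for the asymptotic variance of $\tilde T_n$
$$\lim_{n\to\infty} \mathrm{Var}[\tilde T_n] = \tau_\Delta^2,$$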
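where $\tau_\Delta^2$ is defined in Theorem 3.1. Indeed, writing $Y = 2\pi I^Z_{n,k}$ and using $\mathrm{Cov}(Y, \log Y) = E[Y \log Y] - \gamma_{m,p}$, the limit equals $\Delta\, \mathrm{Var}(Y) + \mathrm{Var}(\log Y) - 2\, \mathrm{Cov}(Y, \log Y) = (\Delta - 1)\, \mathrm{Var}(Y) + \mathrm{Var}(Y - \log Y)$. Noting that $E[(\log 2\pi I^Z_{n,k})^4]$ does not depend on $n$ and $k$, a similar calculation
yields for the numerator in the Lyapunov condition
$$\begin{aligned}
&K_n^2 \sum_{k=1}^{K_n} E\Big[\Big(\beta_{n,k}\big(2\pi I^Z_{n,k} - 1\big) - \frac{1}{K_n}\big(\log 2\pi I^Z_{n,k} - \gamma_{m,p}\big)\Big)^4\Big] \\
&\quad \le K_n^2 \sum_{k=1}^{K_n} \Big| \beta_{n,k}^4\, E\big[(2\pi I^Z_{n,k} - 1)^4\big] - 4 \beta_{n,k}^3 \frac{1}{K_n} E\big[(2\pi I^Z_{n,k} - 1)^3 (\log 2\pi I^Z_{n,k} - \gamma_{m,p})\big] \\
&\qquad + 6 \beta_{n,k}^2 \frac{1}{K_n^2} E\big[(2\pi I^Z_{n,k} - 1)^2 (\log 2\pi I^Z_{n,k} - \gamma_{m,p})^2\big] - 4 \beta_{n,k} \frac{1}{K_n^3} E\big[(2\pi I^Z_{n,k} - 1)(\log 2\pi I^Z_{n,k} - \gamma_{m,p})^3\big] \\
&\qquad + \frac{1}{K_n^4} E\big[(\log 2\pi I^Z_{n,k} - \gamma_{m,p})^4\big] \Big| \\
&\quad = O(1) \Big\{ K_n^2 \sum_{k=1}^{K_n} \beta_{n,k}^4 + K_n \sum_{k=1}^{K_n} \beta_{n,k}^3 + \sum_{k=1}^{K_n} \beta_{n,k}^2 + \frac{1}{K_n} + \frac{1}{K_n} \Big\} = O\Big(\frac{1}{n}\Big),
\end{aligned}$$
where we have used (3.11) for the last estimate. This establishes the Lyapunov condition, and the
asymptotic normality of $\sqrt{K_n}\{S_n(I^X_n, g(\cdot; \hat d_n, \hat\theta_n)) - D(d_0, \theta_0)\}$ follows observing that this statistic and $\tilde T_n$ have the same asymptotic behavior.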
Remark 3.2. Note that Theorem 3.1 holds under the null hypothesis and under the alternative.
In particular, under the null hypothesis the ratio $f/g(\cdot; d_0, \theta_0)$ is constant, so that $D(d_0, \theta_0) = 0$, $\Delta = 1$ and $\tau_\Delta^2 = \tau_0^2$, and Theorem 3.1 reduces to Theorem 3.1 in Fay and Philippe (2002). These authors did not assume a
Gaussian white noise in the linear representation (2.1). This assumption was made here for the sake
of a transparent presentation, and Theorem 3.1 remains valid in the general case, where the asymptotic
variance has to be replaced by
$$\tau_\Delta^2 + \frac{\kappa_4\, \alpha_{m,p}}{8(m+p)} .$$
Here the constant $\alpha_{m,p}$ is defined by
$$\alpha_{m,p} = E^2\big[\|\zeta\|^2\, \Phi_{m,p}(\zeta)\big]$$
with
$$\Phi_{m,p}(x) = \frac{\psi_{m,p}(x)}{2m} - 1 - \ln\Big(\frac{\psi_{m,p}(x)}{2m}\Big) + \gamma_{m,p},$$
$$\psi_{m,p}(x) = \binom{2p}{p}^{-1} \sum_{j=1}^{m} \left| \sum_{l=0}^{p} \binom{p}{l} (-1)^l \big(x_{2(j+l)-1} + i\, x_{2(j+l)}\big) \right|^2$$
and $\zeta$ is a $2(m+p)$-dimensional standard Gaussian vector. Note that $\alpha_{m,p}$ is the same as in the
asymptotic variance under the null hypothesis in Fay and Philippe (2002).
Remark 3.3. In this remark we indicate two important applications of Theorem 3.1. For a more
detailed discussion we refer to Dette and Munk (2003).
(1) If $D(d_0, \theta_0)$ is used as a measure for the deviation of the true spectral density from the parametric
class $\mathcal{F}_0$, we obtain from Theorem 3.1 a consistent estimate of $D(d_0, \theta_0)$, and it follows that the
interval
$$\Big[ 0,\ S_n\big(I^X_n, g(\cdot; \hat d_n, \hat\theta_n)\big) + \frac{\hat\tau_\Delta}{\sqrt{K_n}}\, u_{1-\alpha} \Big]$$
is an asymptotic $(1-\alpha)$ confidence interval for the logarithmic contrast $D(d_0, \theta_0)$, which measures
the deviation from the parametric class $\mathcal{F}_0$. Here $u_{1-\alpha}$ denotes the $(1-\alpha)$ quantile of the standard
normal distribution and $\hat\tau_\Delta^2$ is a consistent estimate of the asymptotic variance $\tau_\Delta^2$.
(2) As pointed out by Fay and Philippe (2002), an application of the asymptotic normality of the
statistic $S_n(I^X_n, g(\cdot; \hat d_n, \hat\theta_n))$ under the null hypothesis consists in the construction of an asymptotic
level $\alpha$ test for the hypothesis of a parametric form of the spectral density of the long-range
dependent process. A consistent test is obtained by rejecting the null hypothesis whenever
$$S_n\big(I^X_n, g(\cdot; \hat d_n, \hat\theta_n)\big) \ge \frac{\tau_0}{\sqrt{K_n}}\, u_{1-\alpha},$$
where $\tau_0^2$ denotes the asymptotic variance under the null hypothesis (which has to be estimated
in the case of a non-Gaussian white noise). The asymptotic power of this test can now be
approximated by Theorem 3.1, that is
$$P_{H_1}(H_0 \text{ is rejected}) \approx \Phi\Big( \sqrt{K_n}\, \frac{D(d_0, \theta_0)}{\tau_\Delta} - \frac{\tau_0}{\tau_\Delta}\, u_{1-\alpha} \Big),$$
where $\tau_0$ and $\tau_\Delta$ denote the (asymptotic) standard deviation of $\sqrt{K_n}\, S_n(I^X_n, g(\cdot; \hat d_n, \hat\theta_n))$ under the
null hypothesis and alternative, respectively, and $\Phi$ is the distribution function of the standard
normal distribution.
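The two applications of Remark 3.3 translate directly into code. The following Python sketch uses `statistics.NormalDist` from the standard library for the normal quantile and distribution function; the function names are hypothetical, and all variances are assumed to be known or already estimated.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def upper_confidence_bound(s_n, tau_delta_hat, K_n, alpha):
    """Upper end of the asymptotic (1 - alpha) interval [0, S_n + tau_hat / sqrt(K_n) * u]."""
    u = _STD_NORMAL.inv_cdf(1.0 - alpha)
    return s_n + tau_delta_hat / math.sqrt(K_n) * u


def rejects_null(s_n, tau_0, K_n, alpha):
    """Asymptotic level-alpha test: reject if S_n >= tau_0 / sqrt(K_n) * u_{1-alpha}."""
    return s_n >= tau_0 / math.sqrt(K_n) * _STD_NORMAL.inv_cdf(1.0 - alpha)


def approximate_power(D, tau_0, tau_delta, K_n, alpha):
    """Normal approximation of the rejection probability under a fixed alternative."""
    u = _STD_NORMAL.inv_cdf(1.0 - alpha)
    return _STD_NORMAL.cdf(math.sqrt(K_n) * D / tau_delta - tau_0 / tau_delta * u)
```

For $D = 0$ and $\tau_\Delta = \tau_0$ the approximation returns the nominal level $\alpha$, and for $D > 0$ it increases with $K_n$, in line with the consistency of the test.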
Example 3.4. In this example we illustrate the accuracy of the confidence interval for the distance
$D(d_0, \theta_0)$ in Remark 3.3(1) by means of a small simulation study. We assume that the process $X = (X_t)_{t\in\mathbb{Z}}$ is a Gaussian FARIMA(0, d, 0)-process with spectral density
$$g(\lambda; d, \theta) = \frac{1}{2\pi} \big|1 - e^{i\lambda}\big|^{-2d},$$
but the data are generated from a Gaussian FARIMA(0, 0.4, 1)-process with spectral density given by
$$f(\lambda) = \frac{1}{2\pi} |1 - 0.1 e^{i\lambda}|^2\, |1 - e^{i\lambda}|^{-2 \cdot 0.4}.$$
Using the formula 3.631(8) in Gradshteyn and Ryzhik (1980), we approximately calculate $d_0$ and $D(d_0)$
as 0.3400325 and 0.003725739, respectively. We generated 5000 replications of the process for sample
sizes n = 100, 200, 500 and 1000 using the farimaSim function in the fArma package in R. The parameter
$d_0$ in the variance $\tau_\Delta^2$ was estimated by the Whittle estimator in (3.1). The other quantities in the
asymptotic variance have been determined explicitly by numerical integration and are given by
$$\gamma_{m,p} = -0.1400195, \quad \mathrm{Var}(2\pi I^Z_{n,k}) = 0.2795195, \quad \mathrm{Var}(2\pi I^Z_{n,k} - \log 2\pi I^Z_{n,k}) = 0.03776237.$$
For each series the 80%, 90% and 95% confidence intervals (p = 1, m = 5) were calculated, and the
proportions of the intervals containing the true value $D(d_0)$ with $d_0 \approx 0.34$ are listed in Table 1. We observe reasonable
coverage probabilities in most cases. While the 90% confidence interval is already approximated
accurately for the sample size n = 100, larger sample sizes are required for the 80% and 95% confidence
intervals.

nominal level   n = 100   n = 200   n = 500   n = 1000
0.80            0.6822    0.7244    0.7846    0.7966
0.90            0.9076    0.8960    0.9164    0.9122
0.95            0.9876    0.9750    0.9682    0.9698

Table 1: Simulated coverage probabilities of the asymptotic confidence intervals defined in Remark 3.3(1)
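The values of $d_0$ and $D(d_0)$ in this example can be checked with a short computation. The sketch below does not reproduce the authors' calculation; it relies on two standard facts that are assumptions relative to the paper: the integrals of $\log|1 - e^{i\lambda}|$ and $\log|1 - 0.1 e^{i\lambda}|$ over $[0, \pi]$ vanish, so the second term of the contrast is zero, and a FARIMA(0, a, 0) process with unit innovation variance has variance $\Gamma(1-2a)/\Gamma(1-a)^2$ and lag-one autocorrelation $a/(1-a)$. With $a = d_1 - d$ this gives $D(d) = \log\{\Gamma(1-2a)/\Gamma(1-a)^2\} + \log\{1 + 0.1^2 - 2 \cdot 0.1 \cdot a/(1-a)\}$, which is minimized by a golden-section search. The function names are hypothetical.

```python
import math


def contrast_to_farima_0d0(d, d1=0.4, ma=0.1):
    """D(d) = S(f, g(.; d)) for f ~ FARIMA(0, d1, 1) with MA polynomial 1 - ma*z."""
    a = d1 - d  # memory parameter of the ratio f / g(.; d); requires a < 1/2
    # log of (1/pi) * integral over [0, pi] of |1 - e^{ix}|^(-2a)
    log_var = math.lgamma(1 - 2 * a) - 2 * math.lgamma(1 - a)
    rho1 = a / (1 - a)  # lag-one autocorrelation of FARIMA(0, a, 0)
    return log_var + math.log(1 + ma ** 2 - 2 * ma * rho1)


def golden_section_min(func, lo, hi, tol=1e-10):
    """Minimizer of a unimodal function on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5) - 1) / 2
    c = hi - inv_phi * (hi - lo)
    e = lo + inv_phi * (hi - lo)
    while hi - lo > tol:
        if func(c) < func(e):
            hi = e
        else:
            lo = c
        c = hi - inv_phi * (hi - lo)
        e = lo + inv_phi * (hi - lo)
    return (lo + hi) / 2


d0 = golden_section_min(contrast_to_farima_0d0, 0.0, 0.49)
D0 = contrast_to_farima_0d0(d0)
print(d0, D0)
```

The search interval [0, 0.49] is an illustrative choice that keeps $a = 0.4 - d$ below 1/2, and the result should be close to the values 0.3400325 and 0.003725739 reported above.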
4 Appendix: Technical details
In this appendix we provide the technical details for the stochastic expansions (3.7) - (3.9).
4.1 Proof of (3.7)
We use a Bartlett decomposition technique, i.e. we relate the periodogram of X to the periodogram of
Z and then apply Lemma 4.2 in Fay and Philippe (2002) to show that the difference is stochastically
small, i.e.
$$\begin{aligned}
A_n = S_n\big(I^X_n, f(\cdot)\big) &= S_n\big(2\pi I^Z_n, 1\big) + R_n \\
&= \log\left(\frac{1}{K_n} \sum_{k=1}^{K_n} 2\pi I^Z_{n,k}\right) - \frac{1}{K_n} \sum_{k=1}^{K_n} \log\big(2\pi I^Z_{n,k}\big) + \gamma_{m,p} + o_p\Big(\frac{1}{\sqrt{K_n}}\Big).
\end{aligned}$$
Using (3.10) and the independence of the $I^Z_{n,k}$ we can expand the first term into a Taylor series and
obtain the stochastic expansion in (3.7).
4.2 Proof of (3.8)
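Recall the definition of $B_n$ in (3.5). For a proof of (3.8) we use the Bartlett decomposition twice, which
yields
$$\begin{aligned}
B_n &= \log\left(\frac{\sum_{k=1}^{K_n} I^X_{n,k} / g(x_k; d_0, \theta_0)}{\sum_{k=1}^{K_n} 2\pi I^Z_{n,k}}\right) - \log\left(\frac{\sum_{k=1}^{K_n} I^X_{n,k} / f(x_k)}{\sum_{k=1}^{K_n} 2\pi I^Z_{n,k}}\right) - \frac{1}{K_n} \sum_{k=1}^{K_n} \log\left(\frac{f(x_k)}{g(x_k; d_0, \theta_0)}\right) \\
&= \log\left(\frac{1}{K_n} \sum_{k=1}^{K_n} \frac{I^X_{n,k}}{g(x_k; d_0, \theta_0)}\right) - \log\left(\frac{1}{K_n} \sum_{k=1}^{K_n} 2\pi I^Z_{n,k}\right) - \frac{1}{K_n} \sum_{k=1}^{K_n} \log\left(\frac{f(x_k)}{g(x_k; d_0, \theta_0)}\right) + o_p\Big(\frac{1}{\sqrt{K_n}}\Big) \\
&= \log\left(\sum_{k=1}^{K_n} \beta_{n,k} \frac{I^X_{n,k}}{f(x_k)}\right) + \log\left(\frac{1}{K_n} \sum_{k=1}^{K_n} \frac{f(x_k)}{g(x_k; d_0, \theta_0)}\right) - \log\left(\frac{1}{K_n} \sum_{k=1}^{K_n} 2\pi I^Z_{n,k}\right) \\
&\quad - \frac{1}{K_n} \sum_{k=1}^{K_n} \log\left(\frac{f(x_k)}{g(x_k; d_0, \theta_0)}\right) + o_p\Big(\frac{1}{\sqrt{K_n}}\Big),
\end{aligned}$$
where the second estimate follows from Lemma 2 in Hurvich et al. (2002). We note that by the central
limit theorem
$$\frac{1}{K_n} \sum_{k=1}^{K_n} \big(2\pi I^Z_{n,k} - 1\big) = \frac{1}{K_n} \sum_{k=1}^{K_n} 2\pi I^Z_{n,k} - 1 = O_p\Big(\frac{1}{\sqrt{n}}\Big). \tag{4.1}$$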
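We will show at the end of this section that
$$\sum_{k=1}^{K_n} \beta_{n,k} \left(\frac{I^X_{n,k}}{f(x_k)} - 1\right) = O_p\Big(\frac{1}{\sqrt{n}}\Big). \tag{4.2}$$
The expansion $\log(1 + z) = z + O(z^2)$ together with the estimates (4.1) and (4.2) then yields (note
that $\sum_{k=1}^{K_n} \beta_{n,k} = 1$)
$$\begin{aligned}
B_n &= \sum_{k=1}^{K_n} \beta_{n,k} \left(\frac{I^X_{n,k}}{f(x_k)} - 1\right) - \frac{1}{K_n} \sum_{k=1}^{K_n} \big(2\pi I^Z_{n,k} - 1\big) + o_p\Big(\frac{1}{\sqrt{K_n}}\Big) \\
&\quad + \log\left(\frac{1}{K_n} \sum_{k=1}^{K_n} \frac{f(x_k)}{g(x_k; d_0, \theta_0)}\right) - \frac{1}{K_n} \sum_{k=1}^{K_n} \log\left(\frac{f(x_k)}{g(x_k; d_0, \theta_0)}\right).
\end{aligned}$$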
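Observing Lemma 11 in Hurvich et al. (2002) we have
$$E\left| \sum_{k=1}^{K_n} \beta_{n,k} \left(\frac{I^X_{n,k}}{f(x_k)} - 2\pi I^Z_{n,k}\right) \right| \le \sum_{k=1}^{K_n} \beta_{n,k}\, E\left| \frac{I^X_{n,k}}{f(x_k)} - 2\pi I^Z_{n,k} \right| \le \begin{cases} O\big(n^{-1 + 2(d_1 - d_0)^+}\big) & \text{if } d_1 - d_0 \ne 0, \\ O\big(\frac{\log n}{n}\big) & \text{if } d_1 - d_0 = 0, \end{cases}$$
which yields
$$\sum_{k=1}^{K_n} \beta_{n,k} \left(\frac{I^X_{n,k}}{f(x_k)} - 2\pi I^Z_{n,k}\right) = o_p\Big(\frac{1}{\sqrt{n}}\Big) \tag{4.3}$$
(note that $d_1 - d_0 < 1/4$ by assumption). Therefore the assertion in (3.8) follows from (4.2) and (4.3).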
We conclude this section with a proof of the statement (4.2), which is obtained observing the decomposition
$$\sum_{k=1}^{K_n} \beta_{n,k} \left(\frac{I^X_{n,k}}{f(x_k)} - 1\right) = \sum_{k=1}^{K_n} \beta_{n,k} \left(\frac{I^X_{n,k}}{f(x_k)} - 2\pi I^Z_{n,k}\right) + \sum_{k=1}^{K_n} \beta_{n,k} \big(2\pi I^Z_{n,k} - 1\big) = O_p\Big(\frac{1}{\sqrt{K_n}}\Big),$$
where the last estimate follows again from (4.3) and a straightforward application of Chebyshev's
inequality.
4.3 Proof of (3.9)
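Observing the definition (3.6) we decompose $C_n$ as follows
$$C_n = C^{(1)}_n + C^{(2)}_n,$$
where
$$C^{(1)}_n = \log\left(\frac{1}{K_n} \sum_{k=1}^{K_n} \frac{I^X_{n,k}}{g(x_k; \hat\Gamma_n)}\right) - \log\left(\frac{1}{K_n} \sum_{k=1}^{K_n} \frac{I^X_{n,k}}{g(x_k; \Gamma_0)}\right), \tag{4.4}$$
$$C^{(2)}_n = \frac{1}{K_n} \sum_{k=1}^{K_n} \log \frac{g(x_k; \hat\Gamma_n)}{g(x_k; \Gamma_0)}, \tag{4.5}$$
and we have used the notation $\hat\Gamma_n = (\hat d_n, \hat\theta_n)$ and $\Gamma_0 = (d_0, \theta_0)$. The assertion in (3.9) is now obtained
by treating these terms separately, that is
$$C^{(j)}_n = o_p\Big(\frac{1}{\sqrt{n}}\Big), \quad j = 1, 2. \tag{4.6}$$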
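For a proof of (4.6) in the case j = 1 we note that the estimate $\hat\Gamma_n = (\hat d_n, \hat\theta_n)$ is defined as a solution of
the equation
$$\frac{\partial Q_n(\hat\Gamma_n)}{\partial \Gamma} = 0,$$
where the function $Q_n$ is defined in (3.1). Since $C^{(1)}_n = \log Q_n(\hat\Gamma_n) - \log Q_n(\Gamma_0)$ by (4.4), a Taylor expansion of $\log Q_n$ around $\hat\Gamma_n$ yields
$$\begin{aligned}
-C^{(1)}_n &= \log Q_n(\Gamma_0) - \log Q_n(\hat\Gamma_n) \\
&= \frac{1}{2} \big(\Gamma_0 - \hat\Gamma_n\big)^T \frac{1}{Q_n(\hat\Gamma_n)} \frac{\partial^2 Q_n(\hat\Gamma_n)}{\partial\Gamma\, \partial\Gamma^T} \big(\Gamma_0 - \hat\Gamma_n\big) + o\big(\|\Gamma_0 - \hat\Gamma_n\|^2\big).
\end{aligned}$$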
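An extension of Theorem 2 and Lemmas 2 and 3 in Chen and Deo (2006) to the objective function (3.1)
yields
$$\hat\Gamma_n - \Gamma_0 = O_p\Big(\frac{1}{\sqrt{n}}\Big), \qquad \frac{1}{Q_n(\hat\Gamma_n)} \xrightarrow{P} \frac{1}{Q(\Gamma_0)} = \left(\int_0^{\pi} \frac{f(\lambda)}{g(\lambda; \Gamma_0)}\, d\lambda\right)^{-1}, \qquad \frac{\partial^2 Q_n(\hat\Gamma_n)}{\partial\Gamma\, \partial\Gamma^T} \xrightarrow{P} \frac{\partial^2 Q(\Gamma_0)}{\partial\Gamma\, \partial\Gamma^T},$$
and assertion (4.6) follows in the case j = 1.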
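In order to prove the statement in the case j = 2 we recall the definition in (4.5) and obtain by a Taylor
expansion
$$\begin{aligned}
C^{(2)}_n &= \frac{1}{K_n} \sum_{k=1}^{K_n} \left\{ \frac{1}{g(x_k; \Gamma_0)} \frac{\partial g(x_k; \Gamma_0)}{\partial\Gamma} \big(\hat\Gamma_n - \Gamma_0\big) \right\} + O_p\Big(\frac{1}{n}\Big) \\
&= O_p\Big(\frac{1}{\sqrt{n}}\Big)\, \frac{1}{K_n} \sum_{k=1}^{K_n} \left\{ \frac{1}{g(x_k; \Gamma_0)} \frac{\partial g(x_k; \Gamma_0)}{\partial\Gamma} \right\} + o_p\Big(\frac{1}{\sqrt{n}}\Big),
\end{aligned} \tag{4.7}$$
where we have again used an extension of Theorem 2 in Chen and Deo (2006) to the loss function (3.1).
From the assumption $g(\lambda; \Gamma) \in \mathcal{F}_0$ we have
$$\int_{-\pi}^{\pi} \log g(\lambda; \Gamma)\, d\lambda = \int_{-\pi}^{\pi} \log g^*(\lambda, \theta)\, d\lambda = 0$$
for all $\Gamma \in D \times \Theta$, which implies (observing the symmetry of the function $g$) that the sum in (4.7)
converges to 0 (a.s.). This proves the statement (4.6) in the case j = 2.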
Acknowledgements The authors would like to thank Martina Stein, who typed parts of this manuscript
with considerable technical expertise. We are also grateful to G. Fay for helpful discussion during the
preparation of this manuscript. The work of the authors has been supported in part by the Collaborative
Research Center "Statistical modeling of nonlinear dynamic processes" (SFB 823) of the German
Research Foundation (DFG).
References
Beran, J. (1992). A goodness-of-fit test for time series with long-range dependence. Journal of the
Royal Statistical Society, Ser. B, 54(3):749-760.
Beran, J. (1994). Statistics for Long-Memory Processes. Chapman and Hall, New York.
Chen, W. W. and Deo, R. S. (2004). A generalized Portmanteau goodness-of-fit test for time series
models. Econometric Theory, 20:382-416.
Chen, W. W. and Deo, R. S. (2006). Estimation of misspecified long-memory models. Journal of
Econometrics, 134(1):257-281.
Dahlhaus, R. (1989). Efficient parameter estimation for self-similar processes. The Annals of Statistics,
17(4):1749-1766.
Deo, R. S. and Chen, W. W. (2000). On the integral of the squared periodogram. Stochastic Processes
and their Applications, 85:159-176.
Dette, H. (1999). A consistent test for the functional form of a regression based on a difference of
variance estimators. Annals of Statistics, 27:1012-1040.
Dette, H. (2002). A consistent test for heteroscedasticity in nonparametric regression based on the
kernel method. Journal of Statistical Planning and Inference, 103:311-329.
Dette, H. and Munk, A. (2003). Some methodological aspects of validation of models in nonparametric
regression. Statistica Neerlandica, 57:207-244.
Dette, H. and Spreckelsen, I. (2003). A note on a specification test for time series models based on
spectral density estimation. Scandinavian Journal of Statistics, 30:481-491.
Fay, G. and Philippe, A. (2002). Goodness-of-fit test for long range dependent processes. ESAIM:
Probability and Statistics, 6:239-258.
Fox, R. and Taqqu, M. S. (1986). Large-sample properties of parameter estimates for strongly dependent
stationary Gaussian time series. Annals of Statistics, 14:517-532.
Giraitis, L. and Surgailis, D. (1990). A central limit theorem for quadratic forms in strongly dependent
linear variables and its application to asymptotical normality of Whittle's estimate. Probability Theory
and Related Fields, 86(1):87-104.
Gradshteyn, I. and Ryzhik, I. (1980). Table of Integrals, Series, and Products. Academic Press, New
York.
Granger, C. (1980). Long memory relationships and the aggregation of dynamic models. Journal of
Econometrics, 14(2):227-238.
Granger, C. and Joyeux, R. (1980). An introduction to long-memory time series models and fractional
differencing. Journal of Time Series Analysis, 1(1):15-29.
Greene, M. and Fielitz, B. (1977). Long-term dependence in common stock returns. Journal of Financial
Economics, 4(3):339-349.
Hosking, J. (1981). Fractional differencing. Biometrika, 68(1):165-176.
Hurvich, C., Moulines, E., and Soulier, P. (2002). The FEXP estimator for potentially non-stationary
linear time series. Stochastic Processes and their Applications, 97(2):307-340.
Koutsoyiannis, D., Makropoulos, C., Langousis, A., Baki, S., Efstratiadis, A., Christofides, A.,
Karavokiros, G., and Mamassis, N. (2009). HESS opinions: Climate, hydrology, energy, water: recognizing
uncertainty and seeking sustainability. Hydrology and Earth System Sciences, 13:247-257.
Mokkadem, A. (1997). A measure of information and its applications to test for randomness against
ARMA alternatives and to goodness-of-fit test. Stochastic Processes and their Applications,
72(2):145-159.
Paparoditis, E. (2000). Spectral density based goodness-of-fit tests for time series models. Scandinavian
Journal of Statistics, 27:143-176.
Park, K. and Willinger, W. (2000). Self-similar network traffic: An overview. In Park, K. and Willinger,
W., editors, Self-Similar Network Traffic and Performance Evaluation, pages 1-39. Wiley Interscience,
New York.
Stroe-Kunold, E., Stadnytska, T., Werner, J., and Braun, S. (2009). Estimating long-range dependence
in time series: An evaluation of estimators implemented in R. Behavior Research Methods,
41:909-923.
Whittle, P. (1953). Estimation and information in stationary time series. Arkiv for Matematik,
1:423-434.