SFB 823 Discussion Paper Nr. 48/2010

Goodness-of-fit tests in long-range dependent processes under fixed alternatives

Holger Dette, Kemal Sen
Ruhr-Universität Bochum
Fakultät für Mathematik
44780 Bochum, Germany
email: holger.dette@ruhr-uni-bochum.de, kemal.sen@ruhr-uni-bochum.de
FAX: +49 234 32 14559

December 3, 2010

Abstract

In a recent paper Fay and Philippe (2002) proposed a goodness-of-fit test for long-range dependent processes which uses the logarithmic contrast as information measure. These authors established asymptotic normality under the null hypothesis and local alternatives. In the present note we extend these results and show that the corresponding test statistic is also asymptotically normally distributed under fixed alternatives.

AMS Subject Classification: 60F05, 62F03
Keywords: long-range dependence, goodness-of-fit test, asymptotic power, periodogram

1 Introduction

Long-range dependent processes are a well accepted class of stochastic processes for modelling real phenomena in such diverse areas as hydrology, behaviour research, network traffic or finance [see Koutsoyiannis et al. (2009), Stroe-Kunold et al. (2009), Park and Willinger (2000), Granger (1980), Greene and Fielitz (1977) among many others]. Numerous parametric models have been proposed for the analysis of long-range dependent processes. The most important among them are fractional ARIMA processes, which were independently introduced by Granger and Joyeux (1980) and Hosking (1981), and fractional Gaussian noise processes [see Beran (1994)]. Many of the methods assume that the specific form of the spectrum is known except for a finite dimensional parameter. The results of the statistical analysis depend sensitively on pre-specified model assumptions, and the conclusions from the data may be misleading if these assumptions are violated.
For this reason several authors have pointed out the importance of being able to check the goodness-of-fit of a specific model assumption in long-range dependent processes. Beran (1992) proposed a method for testing how well a specified model, such as a fractional Gaussian noise, fits the data. His results were extended by Deo and Chen (2000) who investigated an integral of the squared periodogram. Chen and Deo (2004) suggested a generalized Portmanteau test based on the discrete spectral average estimator and obtained the asymptotic null distribution for Gaussian long-memory time series. While most of the tests proposed by these authors are based on the estimation of the $L^2$ distance between the unknown spectral density and the best approximation by the parametric class, Fay and Philippe (2002) used a logarithmic contrast for the construction of a test for a specific parametric form of the spectral density [see also Mokkadem (1997) or Dette and Spreckelsen (2003) for an application of this measure in the context of ARMA processes]. These authors established the asymptotic normality of a corresponding test statistic under the null hypothesis and local alternatives.

As pointed out by Chen and Deo (2004), most theoretical results in the context of goodness-of-fit testing address the asymptotic behaviour of a test statistic when the null hypothesis is correctly specified, and an additional question of interest is the power property of the corresponding test when the null hypothesis is actually misspecified. This problem requires asymptotic inference under the alternative and has found considerable interest in the context of classical regression analysis [see Dette (1999) or Dette (2002) among others].
Dette and Spreckelsen (2003) investigated the asymptotic properties of an $L^2$-test proposed by Paparoditis (2000) for the parametric form of the spectral density in stationary short-range dependent processes, but fewer results are available for goodness-of-fit tests in long-range dependent processes.

The present paper is devoted to the asymptotic analysis of the test statistic proposed by Fay and Philippe (2002) under fixed alternatives. In Section 2 we introduce the necessary notation and assumptions and review the results of Fay and Philippe (2002). Section 3 presents our main results, which show that the test statistic proposed by Fay and Philippe (2002) is also asymptotically normally distributed under fixed alternatives. We state a general result which contains the situation of a true null hypothesis as a special case and also discuss potential applications of our results. Finally, for the sake of a transparent presentation, some technical details are deferred to an appendix in Section 4.

2 Preliminaries

Let $X = (X_t)_{t \in \mathbb{Z}}$ denote a stochastic process which admits the linear representation
\[
X_t = \sigma \sum_{j \in \mathbb{Z}} a_j Z_{t-j}, \tag{2.1}
\]
where $\sum_{j \in \mathbb{Z}} a_j^2 < \infty$ and $(Z_t)_{t \in \mathbb{Z}}$ denotes a Gaussian white noise process. Following Fay and Philippe (2002) we represent the spectral density $f$ of the process $(X_t)_{t \in \mathbb{Z}}$ as
\[
f(\lambda) = \sigma^2 |1 - e^{i\lambda}|^{-2d_1} f^*(\lambda), \qquad \lambda \in [-\pi, \pi], \tag{2.2}
\]
where $d_1 \in [0, 1/2)$ and $f^*$ is a twice continuously differentiable function defined on the interval $[-\pi, \pi]$ and bounded away from zero. We are interested in the problem of testing for a specific parametric form of the spectral density of the process $(X_t)_{t \in \mathbb{Z}}$, that is
\[
H_0 : f \in \mathcal{F}_0 . \tag{2.3}
\]
Here $\mathcal{F}_0$ denotes a parametric class of spectral densities defined by
\[
\mathcal{F}_0 = \Big\{ f(\lambda) = \sigma^2 |1 - e^{i\lambda}|^{-2d} g^*(\lambda; \theta) \;\Big|\; (d, \theta) \in \mathcal{D} \times \Theta,\ \sigma^2 > 0,\ g^* \in \mathcal{G} \Big\}, \tag{2.4}
\]
where $\mathcal{D}$ is a compact subset of the interval $[0, 1/2)$, $\Theta \subset \mathbb{R}^l$ denotes a compact set ($l \in \mathbb{N}$) and $\mathcal{G}$ is the set of positive and symmetric functions defined on the interval $[-\pi, \pi]$ satisfying
\[
\int_{-\pi}^{\pi} \log g^*(x; \theta)\, dx = 0 .
\]
For a given $g^* \in \mathcal{G}$ we define $g(\lambda; d, \theta) = |1 - e^{i\lambda}|^{-2d} g^*(\lambda; \theta)$. For the testing problem (2.3) Fay and Philippe (2002) proposed to measure deviations from the null hypothesis by
\[
\inf_{d \in \mathcal{D},\, \theta \in \Theta} S(f, f(\cdot, d, \theta)), \tag{2.5}
\]
where
\[
S(f, f(\cdot, d, \theta)) = \log \int_{-\pi}^{\pi} \frac{f(\lambda)}{f(\lambda, d, \theta)} \frac{d\lambda}{2\pi} - \int_{-\pi}^{\pi} \log \frac{f(\lambda)}{f(\lambda, d, \theta)} \frac{d\lambda}{2\pi} \tag{2.6}
\]
denotes a logarithmic contrast between the spectral density $f$ and an element of the class $\mathcal{F}_0$. Note that the information measure in (2.6) is always nonnegative and that the null hypothesis is satisfied if and only if the expression in (2.5) vanishes. The logarithmic contrast has been used before by Mokkadem (1997) and Dette and Spreckelsen (2003) for testing hypotheses in ARMA processes.

In order to estimate the minimal distance Fay and Philippe (2002) proposed to consider a tapered Fourier transform of the series $\{X_1, \ldots, X_n\}$, that is
\[
d^{(p)}_{n,k} = \frac{1}{\sqrt{2\pi n}} \sum_{t=1}^{n} w^{(p)}_{n,t} X_t e^{i\lambda_k t}, \qquad k = 1, \ldots, n,
\]
where $\lambda_k = 2\pi k/n$ are the Fourier frequencies,
\[
w^{(p)}_{n,t} = \binom{2p}{p}^{-1/2} \big( 1 - e^{i 2\pi t/n} \big)^p, \qquad t = 1, \ldots, n,
\]
is the data taper and $p \in \mathbb{N}_0$ denotes the order of the taper (note that $p = 0$ yields $w^{(0)}_{n,t} = 1$, $t = 1, \ldots, n$). These quantities are used to define a pooled periodogram by
\[
I^X_{n,k} := \frac{1}{m} \sum_{j=(m+p)(k-1)+1}^{(m+p)k-p} |d^{(p)}_{n,j}|^2, \qquad k = 1, \ldots, K_n .
\]
Throughout this paper $I^Z_{n,k}$ denotes the pooled and tapered periodogram of the Gaussian white noise $Z_1, \ldots, Z_n$. Note that the interval $[0, \pi]$ is decomposed into $K_n = \lfloor \frac{n-1}{2(m+p)} \rfloor$ intervals ($m \in \mathbb{N}$) of the form $[\lambda_{(k-1)(m+p)}, \lambda_{k(m+p)}]$ and that the center of the $k$th interval is given by
\[
x_k := (m+p) \frac{2\pi}{n} \Big( k - \frac{1}{2} \Big) . \tag{2.7}
\]
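The definitions of the taper, the pooled periodogram and the interval centers (2.7) can be sketched in a few lines of Python. This is only an illustrative sketch that uses the standard library and a naive $O(n^2)$ Fourier sum; the function names `taper` and `pooled_periodogram` are hypothetical and not taken from Fay and Philippe (2002).

```python
import cmath
import math
import random


def taper(n, p):
    """Data taper w_{n,t}^{(p)} = C(2p,p)^{-1/2} (1 - e^{i 2 pi t / n})^p for t = 1..n."""
    norm = math.comb(2 * p, p) ** -0.5
    return [norm * (1 - cmath.exp(2j * math.pi * t / n)) ** p for t in range(1, n + 1)]


def pooled_periodogram(x, m=5, p=1):
    """Return the centers x_k from (2.7) and the pooled tapered periodogram I_{n,k}."""
    n = len(x)
    w = taper(n, p)
    k_n = (n - 1) // (2 * (m + p))  # number of pooled frequency blocks K_n

    def dft_sq(j):
        # |d_{n,j}^{(p)}|^2 at the Fourier frequency lambda_j = 2 pi j / n (naive sum)
        lam = 2 * math.pi * j / n
        s = sum(w[t - 1] * x[t - 1] * cmath.exp(1j * lam * t) for t in range(1, n + 1))
        return abs(s) ** 2 / (2 * math.pi * n)

    centers, pooled = [], []
    for k in range(1, k_n + 1):
        lo = (m + p) * (k - 1) + 1
        hi = (m + p) * k - p  # the block contains exactly m frequencies
        pooled.append(sum(dft_sq(j) for j in range(lo, hi + 1)) / m)
        centers.append((m + p) * 2 * math.pi / n * (k - 0.5))
    return centers, pooled


if __name__ == "__main__":
    rng = random.Random(0)
    z = [rng.gauss(0.0, 1.0) for _ in range(101)]  # simulated Gaussian white noise
    xs, iz = pooled_periodogram(z, m=5, p=1)
    print(len(xs), [round(2 * math.pi * v, 3) for v in iz])
```

For a constant series and $p = 1$ the tapered transform vanishes at the pooled frequencies, which gives a simple sanity check of the taper.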
Fay and Philippe (2002) introduced the discretized version of (2.6), i.e.
\[
S_n\big( I^X_n, g(\cdot; d, \theta) \big) = \log \Big( \frac{1}{K_n} \sum_{k=1}^{K_n} \frac{I^X_{n,k}}{g(x_k; d, \theta)} \Big) - \frac{1}{K_n} \sum_{k=1}^{K_n} \log \Big( \frac{I^X_{n,k}}{g(x_k; d, \theta)} \Big) + \gamma_{m,p},
\]
where the constant $\gamma_{m,p}$ is defined by
\[
\gamma_{m,p} = E\big[ \log 2\pi I^Z_{n,k} \big] \tag{2.8}
\]
and serves as a centering constant, such that the expectation under the null hypothesis vanishes asymptotically. For the cases

1.) $d_0 = 0$, $m \ge 5$ and $p = 0$ or $p = 1$
2.) $d_0 > 0$, $m \ge 5$ and $p = 1$

Fay and Philippe (2002) proved that under the null hypothesis, i.e. $f(\lambda) = g(\lambda, d_0, \theta_0)$ for some $(d_0, \theta_0) \in \mathcal{D} \times \Theta$, and certain assumptions of regularity [see Section 3 for details], the statistic $\sqrt{K_n}\, S_n(I^X_n, g(\cdot; \hat d_n, \hat\theta_n))$ converges weakly, that is
\[
T_n = \sqrt{K_n}\, S_n\big( I^X_n, g(\cdot; \hat d_n, \hat\theta_n) \big) \xrightarrow{d} N\big( 0, \tau_0^2 \big), \tag{2.9}
\]
where $(\hat d_n, \hat\theta_n)$ is any estimator of the true parameter $(d_0, \theta_0)$ satisfying
\[
\big\| (\hat d_n, \hat\theta_n) - (d_0, \theta_0) \big\| = O_p\Big( \frac{1}{\sqrt{n}} \Big)
\]
and the asymptotic variance in (2.9) is given by
\[
\tau_0^2 := \mathrm{Var}\big( 2\pi I^Z_{n,k} - \log( 2\pi I^Z_{n,k} ) \big) .
\]
For a discussion of the quantities $\gamma_{m,p}$ and $\tau_0^2$ we refer to Hurvich et al. (2002). Note that Fay and Philippe (2002) did not assume a Gaussian white noise process, but considered a general white noise process $(Z_t)_{t \in \mathbb{Z}}$ with several assumptions regarding the characteristic function $E[\exp(iZ_t)]$. In this case an additional constant appears in the asymptotic variance, which depends on the fourth cumulant of the white noise process. In the following we will study the asymptotic properties of the statistic $T_n$ if the null hypothesis is not satisfied. For the sake of simplicity, we restrict ourselves to the Gaussian case. The general case is briefly discussed in Remark 3.2.

3 Weak convergence under fixed alternatives

If the null hypothesis is not satisfied, then the minimum distance in (2.5) is positive.
Throughout this paper we assume that there exists a unique pair $(d_0, \theta_0) \in (\mathcal{D} \times \Theta)^0$ such that
\[
\inf_{(d,\theta) \in \mathcal{D} \times \Theta} S(f, g(\cdot; d, \theta)) = S(f, g(\cdot; d_0, \theta_0)),
\]
where $C^0$ denotes the interior of the set $C \subset \mathbb{R}^{l+1}$ and $\mathcal{D}$ in (2.4) is defined by $\mathcal{D} = [\delta, 1/2 - \delta]$ for some $0 < \delta < 1/4$. We further assume that the set $\Theta$ is convex [see Chen and Deo (2006)]. Note that $(d_0, \theta_0)$ is the parameter corresponding to the best approximation of the spectral density $f$ by densities of the class $\mathcal{F}_0$. Throughout this paper let $(\hat d_n, \hat\theta_n)$ denote a Whittle type estimate [Whittle (1953)], which is defined as the minimizer of the objective function
\[
Q_n(d, \theta) = \frac{\pi}{K_n} \sum_{j=1}^{K_n} \frac{I^X_{n,j}}{g(x_j, d, \theta)}, \tag{3.1}
\]
where $x_j$ is defined in (2.7). In the case where the model is correctly specified, the asymptotic behaviour of the maximum likelihood estimator was investigated by Dahlhaus (1989). The Whittle estimator was investigated by Fox and Taqqu (1986) and Giraitis and Surgailis (1990) for Gaussian and linear processes, respectively. Recently, Chen and Deo (2006) derived the asymptotic properties of an estimator minimizing an approximation to the negative of the exact Gaussian likelihood [Whittle (1953)] in the case of misspecified long-range dependent processes. Note that, in contrast to these results, the objective function (3.1) is based on the tapered and pooled periodogram, while Chen and Deo (2006) used the classical periodogram in the corresponding objective function. A careful inspection of the proofs in this reference shows that the main results, in particular Theorem 2 and Lemma 2 of Chen and Deo (2006), remain valid in this case. It is also notable that the asymptotic properties, in particular the rate of convergence, depend sensitively on the distance $d_1 - d_0$.
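As a concrete illustration of the objective function (3.1), the following Python sketch minimizes $Q_n$ over a grid for the one-parameter FARIMA(0, d, 0) shape $g(x; d) = |1 - e^{ix}|^{-2d}$, that is $g^* \equiv 1$. The grid search, the function names and the noise-free input in the demo are assumptions made for illustration only; they are not the optimization routine used in the paper.

```python
import math


def g_farima0(x, d):
    """|1 - e^{ix}|^{-2d} = (2 sin(x/2))^{-2d} for x in (0, pi), i.e. g* = 1."""
    return (2.0 * math.sin(x / 2.0)) ** (-2.0 * d)


def whittle_objective(centers, pooled, d):
    """Q_n(d) = (pi / K_n) * sum_j I_{n,j} / g(x_j; d), see (3.1)."""
    k_n = len(centers)
    return math.pi / k_n * sum(i / g_farima0(x, d) for x, i in zip(centers, pooled))


def whittle_estimate(centers, pooled, delta=0.01, steps=481):
    """Grid minimizer of Q_n over D = [delta, 1/2 - delta] (illustrative choice)."""
    grid = [delta + (0.5 - 2.0 * delta) * s / (steps - 1) for s in range(steps)]
    return min(grid, key=lambda d: whittle_objective(centers, pooled, d))


if __name__ == "__main__":
    n, m, p = 1201, 5, 1
    k_n = (n - 1) // (2 * (m + p))
    centers = [(m + p) * 2 * math.pi / n * (k - 0.5) for k in range(1, k_n + 1)]
    # Noise-free stand-in for the pooled periodogram: the model density with d = 0.3.
    pooled = [g_farima0(x, 0.3) for x in centers]
    print(whittle_estimate(centers, pooled))
```

Because $\int_{-\pi}^{\pi} \log g^*(x; \theta)\, dx = 0$, the objective needs no separate log-determinant term; with real data the grid search would typically be replaced by a numerical optimizer over $(d, \theta)$.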
If $d_1 - d_0 \le 1/4$ the estimator $\sqrt{n}\,((\hat d_n, \hat\theta_n) - (d_0, \theta_0))$ is asymptotically normally distributed, while in the case $d_1 - d_0 > 1/4$ the difference converges in distribution with a different rate to a non-Gaussian limit. In particular, the rate of convergence can be arbitrarily slow in this case.

In our main result we specify the asymptotic behaviour of the test statistic proposed by Fay and Philippe (2002) in the case of a misspecified model. For this purpose we define
\[
D(d_0, \theta_0) := \log \Big( \frac{1}{\pi} \int_0^{\pi} \frac{f(x)}{g(x; d_0, \theta_0)}\, dx \Big) - \frac{1}{\pi} \int_0^{\pi} \log \Big( \frac{f(x)}{g(x; d_0, \theta_0)} \Big)\, dx \tag{3.2}
\]
as the minimal distance between the true spectral density $f$ and the parametric class defined in (2.4) with respect to the logarithmic contrast introduced in (2.6). Note that the null hypothesis (2.3) is satisfied if and only if $D(d_0, \theta_0) = 0$. We assume that $(X_t)_{t \in \mathbb{Z}}$ is a stationary process with linear representation (2.1), where the innovations $(Z_t)_{t \in \mathbb{Z}}$ define a Gaussian white noise process and the spectral density of $(X_t)_{t \in \mathbb{Z}}$ is given by (2.2).

Theorem 3.1. Let $(X_t)_{t \in \mathbb{Z}}$ be a stationary process with linear representation (2.1) and Gaussian white noise $(Z_t)_{t \in \mathbb{Z}}$, $d_1 \in (0, 1/2)$, $p = 1$, $m \ge 5$, $d_1 - d_0 < 1/4$, and assume that the following conditions are satisfied:

(A1) $g^*(\lambda; \theta)$ is three times continuously differentiable.
(A2) $\inf_\theta \inf_\lambda g^*(\lambda; \theta) > 0$, $\sup_\theta \sup_\lambda g^*(\lambda; \theta) < \infty$.
(A3) $\sup_\lambda \sup_\theta \big| \frac{\partial g^*(\lambda;\theta)}{\partial \theta_i} \big| < \infty$; $1 \le i \le l$.
(A4) $\sup_\lambda \sup_\theta \big| \frac{\partial^2 g^*(\lambda;\theta)}{\partial \theta_i \partial \theta_j} \big| < \infty$, $\sup_\lambda \sup_\theta \big| \frac{\partial^2 g^*(\lambda;\theta)}{\partial \theta_i \partial \lambda} \big| < \infty$; $1 \le i, j \le l$.
(A5) $\sup_\lambda \sup_\theta \big| \frac{\partial^3 g^*(\lambda;\theta)}{\partial \theta_i \partial \theta_j \partial \theta_k} \big| < \infty$; $1 \le i, j, k \le l$.
(A6) $\int_{-\pi}^{\pi} \log g^*(\lambda; \theta)\, d\lambda = 0$ for all $\theta \in \Theta$.
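If $n \to \infty$, then
\[
\sqrt{K_n} \Big\{ S_n\big( I^X_n, g(\cdot; \hat d_n, \hat\theta_n) \big) - D(d_0, \theta_0) \Big\} \xrightarrow{\mathcal{D}} N(0, \tau_\Delta^2),
\]
where $D(d_0, \theta_0)$ denotes the minimal distance between the parametric class $\mathcal{F}_0$ and the unknown spectral density $f$ defined in (2.2), and the asymptotic variance is given by
\[
\tau_\Delta^2 := (\Delta - 1)\, \mathrm{Var}\big( 2\pi I^Z_{n,k} \big) + \mathrm{Var}\big( 2\pi I^Z_{n,k} - \log 2\pi I^Z_{n,k} \big)
\]
with
\[
\Delta = \pi \int_0^{\pi} \Big( \frac{f(x)}{g(x; d_0, \theta_0)} \Big)^2 dx\, \Big( \int_0^{\pi} \frac{f(x)}{g(x; d_0, \theta_0)}\, dx \Big)^{-2} . \tag{3.3}
\]

Proof. Recalling the definition of the statistic $T_n$ in (2.9), now centered at $D(d_0, \theta_0)$, we introduce the decomposition
\[
T_n = \sqrt{K_n} \Big\{ S_n\big( I^X_n, g(\cdot; \hat d_n, \hat\theta_n) \big) - D(d_0, \theta_0) \Big\} = \sqrt{K_n} \big\{ A_n + B_n + C_n - D(d_0, \theta_0) \big\},
\]
where the random variables $A_n$, $B_n$ and $C_n$ are defined by
\[
A_n := S_n\big( I^X_n, f(\cdot) \big), \tag{3.4}
\]
\[
B_n := S_n\big( I^X_n, g(\cdot; d_0, \theta_0) \big) - S_n\big( I^X_n, f(\cdot) \big), \tag{3.5}
\]
\[
C_n := S_n\big( I^X_n, g(\cdot; \hat d_n, \hat\theta_n) \big) - S_n\big( I^X_n, g(\cdot; d_0, \theta_0) \big), \tag{3.6}
\]
respectively. In the Appendix we will show that
\[
A_n = \frac{1}{K_n} \sum_{k=1}^{K_n} \big\{ 2\pi I^Z_{n,k} - 1 - \log 2\pi I^Z_{n,k} + \gamma_{m,p} \big\} + o_p\Big( \frac{1}{\sqrt{K_n}} \Big), \tag{3.7}
\]
\[
\begin{aligned}
B_n = {} & \sum_{k=1}^{K_n} \Big( \beta_{n,k} - \frac{1}{K_n} \Big) \big( 2\pi I^Z_{n,k} - 1 \big) \\
& + \log \Big( \frac{1}{K_n} \sum_{k=1}^{K_n} \frac{f(x_k)}{g(x_k; d_0, \theta_0)} \Big) - \frac{1}{K_n} \sum_{k=1}^{K_n} \log \Big( \frac{f(x_k)}{g(x_k; d_0, \theta_0)} \Big) + o_p\Big( \frac{1}{\sqrt{K_n}} \Big),
\end{aligned} \tag{3.8}
\]
\[
C_n = o_p\Big( \frac{1}{\sqrt{K_n}} \Big), \tag{3.9}
\]
where $I^Z_{n,k}$ denotes the pooled and tapered periodogram of the Gaussian white noise process $(Z_t)_{t \in \mathbb{Z}}$ and the constants $\beta_{n,k}$ are defined by
\[
\beta_{n,k} = \frac{ f(x_k) / g(x_k; d_0, \theta_0) }{ \sum_{j=1}^{K_n} f(x_j) / g(x_j; d_0, \theta_0) }
= \frac{ |1 - e^{ix_k}|^{-2(d_1 - d_0)} f^*(x_k) / g^*(x_k; \theta_0) }{ \sum_{j=1}^{K_n} |1 - e^{ix_j}|^{-2(d_1 - d_0)} f^*(x_j) / g^*(x_j; \theta_0) } .
\]
Observing the approximation
\[
\begin{aligned}
& \log \Big( \frac{1}{K_n} \sum_{k=1}^{K_n} \frac{f(x_k)}{g(x_k; d_0, \theta_0)} \Big) - \frac{1}{K_n} \sum_{k=1}^{K_n} \log \Big( \frac{f(x_k)}{g(x_k; d_0, \theta_0)} \Big) \\
& \quad = \log \Big( \frac{1}{K_n} \sum_{k=1}^{K_n} |1 - e^{ix_k}|^{-2(d_1 - d_0)} \frac{f^*(x_k)}{g^*(x_k; \theta_0)} \Big) - \frac{1}{K_n} \sum_{k=1}^{K_n} \log \Big( |1 - e^{ix_k}|^{-2(d_1 - d_0)} \frac{f^*(x_k)}{g^*(x_k; \theta_0)} \Big) \\
& \quad = D(d_0, \theta_0) + O\big( n^{-1 + 2(d_1 - d_0)^{+}} \big),
\end{aligned}
\]
it follows that the weak convergence of the statistic $T_n$ can be obtained from the asymptotic properties of the random variable
\[
\tilde T_n = \sqrt{K_n} \sum_{k=1}^{K_n} \Big\{ \Big( \beta_{n,k}\, 2\pi I^Z_{n,k} - \frac{1}{K_n} \log 2\pi I^Z_{n,k} \Big) - \Big( \beta_{n,k} - \frac{1}{K_n} \gamma_{m,p} \Big) \Big\} .
\]
For this purpose we use the central limit theorem of Ljapunov.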
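To be precise, we note that the random variables $2\pi I^Z_{n,k}$ are independent and identically distributed with existing fourth moment, satisfying
\[
E\big[ 2\pi I^Z_{n,k} \big] = 1, \qquad k = 1, \ldots, K_n . \tag{3.10}
\]
Therefore a straightforward calculation yields for the variance of $\tilde T_n$
\[
\mathrm{Var}[\tilde T_n] = \mathrm{Var}\big( 2\pi I^Z_{n,k} \big)\, K_n \sum_{k=1}^{K_n} \beta_{n,k}^2 + \mathrm{Var}\big( \log 2\pi I^Z_{n,k} \big) - 2 \big\{ E\big[ 2\pi I^Z_{n,k} \log 2\pi I^Z_{n,k} \big] - \gamma_{m,p} \big\} .
\]
Observing the approximation
\[
\frac{1}{K_n} \sum_{k=1}^{K_n} \Big( \frac{f(x_k)}{g(x_k; d_0, \theta_0)} \Big)^j
= \frac{1}{K_n} \sum_{k=1}^{K_n} \Big( |1 - e^{ix_k}|^{-2(d_1 - d_0)} \frac{f^*(x_k)}{g^*(x_k, \theta_0)} \Big)^j
= \frac{1}{\pi} \int_0^{\pi} \Big( \frac{f(x)}{g(x; d_0, \theta_0)} \Big)^j dx + O\big( n^{-1 + 2j(d_1 - d_0)^{+}} \big), \qquad j = 1, 2,
\]
we obtain by a tedious calculation
\[
\lim_{n \to \infty} K_n \sum_{k=1}^{K_n} \beta_{n,k}^2 = \Delta,
\]
where $\Delta$ is defined in (3.3), and
\[
\sum_{k=1}^{K_n} \beta_{n,k}^2 \le \frac{C}{n} \tag{3.11}
\]
(note that $d_1 - d_0 < 1/4$ by assumption). Combining these results gives for the asymptotic variance of $\tilde T_n$
\[
\lim_{n \to \infty} \mathrm{Var}[\tilde T_n] = \tau_\Delta^2,
\]
where $\tau_\Delta^2$ is defined in Theorem 3.1. Noting that $E\big[ (\log 2\pi I^Z_{n,k})^4 \big]$ is constant, a similar calculation yields for the numerator in the Ljapunov condition
\[
\begin{aligned}
& K_n^2 \sum_{k=1}^{K_n} E \Big[ \beta_{n,k} \big( 2\pi I^Z_{n,k} - 1 \big) - \frac{1}{K_n} \big( \log 2\pi I^Z_{n,k} - \gamma_{m,p} \big) \Big]^4 \\
& \quad \le K_n^2 \sum_{k=1}^{K_n} \Big| \beta_{n,k}^4 E\big[ 2\pi I^Z_{n,k} - 1 \big]^4
- 4 \beta_{n,k}^3 \frac{1}{K_n} E\big[ \big( 2\pi I^Z_{n,k} - 1 \big)^3 \big( \log 2\pi I^Z_{n,k} - \gamma_{m,p} \big) \big] \\
& \qquad + 6 \beta_{n,k}^2 \frac{1}{K_n^2} E\big[ \big( 2\pi I^Z_{n,k} - 1 \big)^2 \big( \log 2\pi I^Z_{n,k} - \gamma_{m,p} \big)^2 \big]
- 4 \beta_{n,k} \frac{1}{K_n^3} E\big[ \big( 2\pi I^Z_{n,k} - 1 \big) \big( \log 2\pi I^Z_{n,k} - \gamma_{m,p} \big)^3 \big] \\
& \qquad + \frac{1}{K_n^4} E\big[ \log 2\pi I^Z_{n,k} - \gamma_{m,p} \big]^4 \Big| \\
& \quad = O(1) \Big\{ K_n^2 \sum_{k=1}^{K_n} \beta_{n,k}^4 + K_n \sum_{k=1}^{K_n} \beta_{n,k}^3 + \sum_{k=1}^{K_n} \beta_{n,k}^2 + \frac{1}{K_n} + \frac{1}{K_n} \Big\} = O\Big( \frac{1}{n} \Big),
\end{aligned}
\]
where we have used (3.11) for the last estimate. This establishes the Ljapunov condition, and the asymptotic normality of $T_n$ follows observing that $T_n$ and $\tilde T_n$ have the same asymptotic behaviour.

Remark 3.2. Note that Theorem 3.1 holds under the null hypothesis and under the alternative; in particular, under the null hypothesis it reduces to Theorem 3.1 in Fay and Philippe (2002). These authors did not assume a Gaussian white noise in the linear representation (2.1).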
This assumption was made here for the sake of a transparent presentation, and Theorem 3.1 remains valid in the general case, where the asymptotic variance has to be replaced by
\[
\tau_\Delta^2 + \frac{\kappa_4\, \alpha_{m,p}}{8(m+p)},
\]
where $\kappa_4$ denotes the fourth cumulant of the white noise process. Here the constant $\alpha_{m,p}$ is defined by $\alpha_{m,p} = E^2\big[ \|\zeta\|^2 \Phi_{m,p}(\zeta) \big]$ with
\[
\Phi_{m,p}(x) = \frac{\psi_{m,p}(x)}{2m} - 1 - \ln \Big( \frac{\psi_{m,p}(x)}{2m} \Big) + \gamma_{m,p},
\qquad
\psi_{m,p}(x) = \binom{2p}{p}^{-1} \sum_{j=1}^{m} \Big| \sum_{l=0}^{p} \binom{p}{l} (-1)^l \big( x_{2(j+l)-1} + i x_{2(j+l)} \big) \Big|^2,
\]
and $\zeta$ is a $2(m+p)$-dimensional standard Gaussian vector. Note that $\alpha_{m,p}$ is the same as in the asymptotic variance under the null hypothesis in Fay and Philippe (2002).

Remark 3.3. In this remark we indicate two important applications of Theorem 3.1. For a more detailed discussion we refer to Dette and Munk (2003).

(1) If $D(d_0, \theta_0)$ is used as a measure for the deviation of the true spectral density from the parametric class $\mathcal{F}_0$, we obtain from Theorem 3.1 a consistent estimate of $D(d_0, \theta_0)$, and it follows that the interval
\[
\Big[ 0,\ S_n\big( I^X_n, g(\cdot; \hat d_n, \hat\theta_n) \big) + \frac{\hat\tau_\Delta}{\sqrt{K_n}}\, u_{1-\alpha} \Big]
\]
is an asymptotic $(1-\alpha)$ confidence interval for the logarithmic contrast $D(d_0, \theta_0)$, which measures the deviation from the parametric class $\mathcal{F}_0$. Here $u_{1-\alpha}$ denotes the $(1-\alpha)$ quantile of the standard normal distribution and $\hat\tau_\Delta^2$ is a consistent estimate of the asymptotic variance $\tau_\Delta^2$.

(2) As pointed out by Fay and Philippe (2002), an application of the asymptotic normality of the statistic $S_n(I^X_n, g(\cdot; \hat d_n, \hat\theta_n))$ under the null hypothesis consists in the construction of an asymptotic level $\alpha$ test for the hypothesis of a parametric form of the spectral density of the long-range dependent process. A consistent test is obtained by rejecting the null hypothesis whenever
\[
S_n\big( I^X_n, g(\cdot; \hat d_n, \hat\theta_n) \big) \ge \frac{\tau_0}{\sqrt{K_n}}\, u_{1-\alpha},
\]
where $\tau_0^2$ denotes the asymptotic variance under the null hypothesis (which has to be estimated in the case of a non-Gaussian white noise).
The asymptotic power of this test can now be approximated by Theorem 3.1, that is
\[
P_{H_1}( H_0 \text{ is rejected} ) \approx \Phi \Big( \sqrt{K_n}\, \frac{D(d_0, \theta_0)}{\tau_\Delta} - \frac{\tau_0}{\tau_\Delta}\, u_{1-\alpha} \Big),
\]
where $\tau_0$ and $\tau_\Delta$ denote the (asymptotic) standard deviation of $\sqrt{K_n}\, S_n\big( I^X_n, g(\cdot; \hat d_n, \hat\theta_n) \big)$ under the null hypothesis and alternative, respectively, and $\Phi$ is the distribution function of the standard normal distribution.

Example 3.4. In this example we illustrate the accuracy of the confidence interval for the distance $D(d_0, \theta_0)$ in Remark 3.3(1) by means of a small simulation study. We test the hypothesis that the process $X = (X_t)_{t \in \mathbb{Z}}$ is a Gaussian FARIMA(0, d, 0) process with spectral density
\[
g(\lambda; d, \theta) = \frac{1}{2\pi} \big| 1 - e^{i\lambda} \big|^{-2d},
\]
but generate the data from a Gaussian FARIMA(0, 0.4, 1) process with spectral density given by
\[
f(\lambda) = \frac{1}{2\pi} |1 - 0.1 e^{i\lambda}|^2\, |1 - e^{i\lambda}|^{-2 \cdot 0.4} .
\]
Using formula 3.631(8) in Gradshteyn and Ryzhik (1980), we approximately calculate $d_0$ and $D(d_0)$ as 0.3400325 and 0.003725739, respectively. We generated 5000 replications of the process for sample sizes n = 100, 200, 500 and 1000 using the farimaSim function in the fArma package in R. The parameter $d_0$ in the variance $\tau_\Delta^2$ was estimated by the Whittle estimator in (3.1). The other quantities in the asymptotic variance have been determined explicitly by numerical integration and are given by
\[
\gamma_{m,p} = -0.1400195, \qquad \mathrm{Var}(2\pi I^Z_{n,k}) = 0.2795195, \qquad \mathrm{Var}(2\pi I^Z_{n,k} - \log 2\pi I^Z_{n,k}) = 0.03776237 .
\]
For each series the 80%, 90% and 95% confidence intervals (p = 1, m = 5) were calculated, and the proportion of the intervals containing the true value D(0.34) is listed in Table 1.

Table 1: Simulated coverage probabilities of the asymptotic confidence intervals defined in Remark 3.3(1)

1 - α    n = 100    n = 200    n = 500    n = 1000
0.8      0.6822     0.7244     0.7846     0.7966
0.9      0.9076     0.896      0.9164     0.9122
0.95     0.9876     0.975      0.9682     0.9698

We observe reasonable coverage probabilities in most cases.
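The values of $d_0$ and $D(d_0)$ quoted in this example can be cross-checked numerically. The sketch below does not use the Gradshteyn and Ryzhik formula; instead it relies on two standard facts that are assumptions of this illustration: the integral of $\log(f/g)$ vanishes for this pair of densities, and $\frac{1}{2\pi} \int_{-\pi}^{\pi} |1 - \vartheta e^{i\lambda}|^2 |1 - e^{i\lambda}|^{-2a}\, d\lambda$ equals the variance $\gamma_a(0)\,(1 + \vartheta^2 - 2\vartheta a/(1-a))$ of a FARIMA(0, a, 1) process with unit innovation variance, where $\gamma_a(0) = \Gamma(1-2a)/\Gamma(1-a)^2$. The function names and the golden-section search are hypothetical choices.

```python
import math

D1 = 0.4      # memory parameter of the data-generating FARIMA(0, 0.4, 1) process
THETA = 0.1   # moving average coefficient in |1 - 0.1 e^{i lambda}|^2


def contrast(d):
    """D(d) = log((1/pi) * int_0^pi f/g_d) for the example; the log-integral term is zero."""
    a = D1 - d  # residual memory parameter of the ratio f / g(.; d)
    log_gamma0 = math.lgamma(1.0 - 2.0 * a) - 2.0 * math.lgamma(1.0 - a)
    ma_factor = 1.0 + THETA ** 2 - 2.0 * THETA * a / (1.0 - a)
    return log_gamma0 + math.log(ma_factor)


def golden_section_min(fun, lo, hi, tol=1e-10):
    """Minimize a unimodal function on [lo, hi] by golden-section search."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    x1 = b - ratio * (b - a)
    x2 = a + ratio * (b - a)
    f1, f2 = fun(x1), fun(x2)
    while b - a > tol:
        if f1 < f2:
            b, x2, f2 = x2, x1, f1
            x1 = b - ratio * (b - a)
            f1 = fun(x1)
        else:
            a, x1, f1 = x1, x2, f2
            x2 = a + ratio * (b - a)
            f2 = fun(x2)
    return 0.5 * (a + b)


if __name__ == "__main__":
    d0 = golden_section_min(contrast, 0.01, 0.49)
    print(d0, contrast(d0))
```

The search interval $[0.01, 0.49]$ plays the role of $\mathcal{D} = [\delta, 1/2 - \delta]$ with an illustrative $\delta = 0.01$.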
While the 90% confidence interval is already approximated accurately for the sample size n = 100, larger sample sizes are required for the 80% and 95% confidence intervals.

4 Appendix: Technical details

In this appendix we provide the technical details for the stochastic expansions (3.7) - (3.9).

4.1 Proof of (3.7)

We use a Bartlett decomposition technique, i.e. we relate the periodogram of $X$ to the periodogram of $Z$ and then apply Lemma 4.2 in Fay and Philippe (2002) to show that the difference is stochastically small, i.e.
\[
\begin{aligned}
A_n = S_n\big( I^X_n, f(\cdot) \big) & = S_n\big( 2\pi I^Z_n, 1 \big) + R_n \\
& = \log \Big( \frac{1}{K_n} \sum_{k=1}^{K_n} 2\pi I^Z_{n,k} \Big) - \frac{1}{K_n} \sum_{k=1}^{K_n} \log \big( 2\pi I^Z_{n,k} \big) + \gamma_{m,p} + o_p\Big( \frac{1}{\sqrt{K_n}} \Big) .
\end{aligned}
\]
Using (3.10) and the independence of the $I^Z_{n,k}$ we can expand the first term into a Taylor series and obtain the stochastic expansion in (3.7).

4.2 Proof of (3.8)

Recall the definition of $B_n$ in (3.5). For a proof of (3.8) we use the Bartlett decomposition twice, which yields
\[
\begin{aligned}
B_n & = \log \Big( \frac{ \sum_{k=1}^{K_n} I^X_{n,k} / g(x_k; d_0, \theta_0) }{ \sum_{k=1}^{K_n} 2\pi I^Z_{n,k} } \Big) - \log \Big( \frac{ \sum_{k=1}^{K_n} I^X_{n,k} / f(x_k) }{ \sum_{k=1}^{K_n} 2\pi I^Z_{n,k} } \Big) - \frac{1}{K_n} \sum_{k=1}^{K_n} \log \Big( \frac{f(x_k)}{g(x_k; d_0, \theta_0)} \Big) \\
& = \log \Big( \frac{1}{K_n} \sum_{k=1}^{K_n} \frac{I^X_{n,k}}{g(x_k; d_0, \theta_0)} \Big) - \log \Big( \frac{1}{K_n} \sum_{k=1}^{K_n} 2\pi I^Z_{n,k} \Big) - \frac{1}{K_n} \sum_{k=1}^{K_n} \log \Big( \frac{f(x_k)}{g(x_k; d_0, \theta_0)} \Big) + o_p\Big( \frac{1}{\sqrt{K_n}} \Big) \\
& = \log \Big( \sum_{k=1}^{K_n} \beta_{n,k} \frac{I^X_{n,k}}{f(x_k)} \Big) + \log \Big( \frac{1}{K_n} \sum_{k=1}^{K_n} \frac{f(x_k)}{g(x_k; d_0, \theta_0)} \Big) - \log \Big( \frac{1}{K_n} \sum_{k=1}^{K_n} 2\pi I^Z_{n,k} \Big) \\
& \quad - \frac{1}{K_n} \sum_{k=1}^{K_n} \log \Big( \frac{f(x_k)}{g(x_k; d_0, \theta_0)} \Big) + o_p\Big( \frac{1}{\sqrt{K_n}} \Big),
\end{aligned}
\]
where the second equality follows from Lemma 2 in Hurvich et al. (2002). We note that by the central limit theorem
\[
\frac{1}{K_n} \sum_{k=1}^{K_n} \big( 2\pi I^Z_{n,k} - 1 \big) = \frac{1}{K_n} \sum_{k=1}^{K_n} 2\pi I^Z_{n,k} - 1 = O_p\Big( \frac{1}{\sqrt{n}} \Big) . \tag{4.1}
\]
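We will show at the end of this section that
\[
\sum_{k=1}^{K_n} \beta_{n,k} \Big( \frac{I^X_{n,k}}{f(x_k)} - 1 \Big) = O_p\Big( \frac{1}{\sqrt{n}} \Big) . \tag{4.2}
\]
Then the expansion $\log(1 + z) = z + O(z^2)$ yields, with the estimates (4.2) and (4.1) (note that $\sum_{k=1}^{K_n} \beta_{n,k} = 1$),
\[
\begin{aligned}
B_n = {} & \sum_{k=1}^{K_n} \beta_{n,k} \Big( \frac{I^X_{n,k}}{f(x_k)} - 1 \Big) - \frac{1}{K_n} \sum_{k=1}^{K_n} \big( 2\pi I^Z_{n,k} - 1 \big) + o_p\Big( \frac{1}{\sqrt{K_n}} \Big) \\
& + \log \Big( \frac{1}{K_n} \sum_{k=1}^{K_n} \frac{f(x_k)}{g(x_k; d_0, \theta_0)} \Big) - \frac{1}{K_n} \sum_{k=1}^{K_n} \log \Big( \frac{f(x_k)}{g(x_k; d_0, \theta_0)} \Big) .
\end{aligned}
\]
Observing Lemma 11 in Hurvich et al. (2002) we have
\[
E \Big| \sum_{k=1}^{K_n} \beta_{n,k} \Big( \frac{I^X_{n,k}}{f(x_k)} - 2\pi I^Z_{n,k} \Big) \Big|
\le \sum_{k=1}^{K_n} \beta_{n,k}\, E \Big| \frac{I^X_{n,k}}{f(x_k)} - 2\pi I^Z_{n,k} \Big|
\le \begin{cases} O\big( n^{-1 + 2(d_1 - d_0)^{+}} \big) & \text{if } d_1 - d_0 \ne 0, \\ O\big( \frac{\log n}{n} \big) & \text{if } d_1 - d_0 = 0, \end{cases}
\]
which yields
\[
\sum_{k=1}^{K_n} \beta_{n,k} \Big( \frac{I^X_{n,k}}{f(x_k)} - 2\pi I^Z_{n,k} \Big) = o_p\Big( \frac{1}{\sqrt{n}} \Big) \tag{4.3}
\]
(note that $d_1 - d_0 < 1/4$ by assumption). Therefore the assertion in (3.8) follows from (4.2) and (4.3). We conclude this section with a proof of the statement (4.2), which is obtained observing the decomposition
\[
\sum_{k=1}^{K_n} \beta_{n,k} \Big( \frac{I^X_{n,k}}{f(x_k)} - 1 \Big) = \sum_{k=1}^{K_n} \beta_{n,k} \Big( \frac{I^X_{n,k}}{f(x_k)} - 2\pi I^Z_{n,k} \Big) + \sum_{k=1}^{K_n} \beta_{n,k} \big( 2\pi I^Z_{n,k} - 1 \big) = O_p\Big( \frac{1}{\sqrt{K_n}} \Big),
\]
where the last estimate follows again from (4.3) and a straightforward application of Chebyshev's inequality.

4.3 Proof of (3.9)

Observing the definition (3.6) we decompose $C_n$ as $C_n = C^{(1)}_n + C^{(2)}_n$, where
\[
C^{(1)}_n = \log \Big( \frac{1}{K_n} \sum_{k=1}^{K_n} \frac{I^X_{n,k}}{g(x_k; \hat\Gamma_n)} \Big) - \log \Big( \frac{1}{K_n} \sum_{k=1}^{K_n} \frac{I^X_{n,k}}{g(x_k; \Gamma_0)} \Big), \tag{4.4}
\]
\[
C^{(2)}_n = \frac{1}{K_n} \sum_{k=1}^{K_n} \log \frac{g(x_k; \hat\Gamma_n)}{g(x_k; \Gamma_0)}, \tag{4.5}
\]
and we have used the notation $\hat\Gamma_n = (\hat d_n, \hat\theta_n)$ and $\Gamma_0 = (d_0, \theta_0)$. The assertion in (3.9) is now obtained by treating these terms separately, that is
\[
C^{(j)}_n = o_p\Big( \frac{1}{\sqrt{n}} \Big), \qquad j = 1, 2 . \tag{4.6}
\]
For a proof of (4.6) in the case $j = 1$ we note that the estimate $\hat\Gamma_n = (\hat d_n, \hat\theta_n)$ is defined as a solution of the equation
\[
\frac{\partial Q_n(\hat\Gamma_n)}{\partial \Gamma} = 0,
\]
where the function $Q_n$ is defined in (3.1). Therefore a Taylor expansion yields
\[
- C^{(1)}_n = \log Q_n(\Gamma_0) - \log Q_n(\hat\Gamma_n) = \frac{1}{2} \big( \Gamma_0 - \hat\Gamma_n \big)^T \frac{1}{Q_n(\hat\Gamma_n)} \frac{\partial^2 Q_n(\hat\Gamma_n)}{\partial \Gamma \partial \Gamma^T} \big( \Gamma_0 - \hat\Gamma_n \big) + o\big( \|\Gamma_0 - \hat\Gamma_n\|^2 \big) .
\]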
An extension of Theorem 2 and Lemmas 2 and 3 in Chen and Deo (2006) to the objective function (3.1) yields
\[
\hat\Gamma_n - \Gamma_0 = O_p\Big( \frac{1}{\sqrt{n}} \Big), \qquad
\frac{1}{Q_n(\hat\Gamma_n)} \xrightarrow{P} \frac{1}{Q(\Gamma_0)} = \Big( \int_0^{\pi} \frac{f(\lambda)}{g(\lambda; \Gamma_0)}\, d\lambda \Big)^{-1}, \qquad
\frac{\partial^2 Q_n(\hat\Gamma_n)}{\partial \Gamma \partial \Gamma^T} \xrightarrow{P} \frac{\partial^2 Q(\Gamma_0)}{\partial \Gamma \partial \Gamma^T},
\]
and assertion (4.6) follows in the case $j = 1$. In order to prove the statement in the case $j = 2$ we recall the definition in (4.5) and obtain by a Taylor expansion
\[
\begin{aligned}
C^{(2)}_n & = \frac{1}{K_n} \sum_{k=1}^{K_n} \Big\{ \frac{1}{g(x_k; \Gamma_0)} \frac{\partial g(x_k; \Gamma_0)}{\partial \Gamma} \big( \hat\Gamma_n - \Gamma_0 \big) \Big\} + O_p\Big( \frac{1}{n} \Big) \\
& = O_p\Big( \frac{1}{\sqrt{n}} \Big)\, \frac{1}{K_n} \sum_{k=1}^{K_n} \Big\{ \frac{1}{g(x_k; \Gamma_0)} \frac{\partial g(x_k; \Gamma_0)}{\partial \Gamma} \Big\} + o_p\Big( \frac{1}{\sqrt{n}} \Big),
\end{aligned} \tag{4.7}
\]
where we have again used an extension of Theorem 2 in Chen and Deo (2006) to the loss function (3.1). From the assumption $g(\lambda; \Gamma) \in \mathcal{F}_0$ we have
\[
\int_{-\pi}^{\pi} \log g(\lambda; \Gamma)\, d\lambda = \int_{-\pi}^{\pi} \log g^*(\lambda, \theta)\, d\lambda = 0
\]
for all $\Gamma \in \mathcal{D} \times \Theta$, which implies (observing the symmetry of the function $g$) that the sum in (4.7) converges to 0 (a.s.). This proves the statement (4.6) in the case $j = 2$.

Acknowledgements

The authors would like to thank Martina Stein, who typed parts of this manuscript with considerable technical expertise. We are also grateful to G. Fay for helpful discussion during the preparation of this manuscript. The work of the authors has been supported in part by the Collaborative Research Center "Statistical modeling of nonlinear dynamic processes" (SFB 823) of the German Research Foundation (DFG).

References

Beran, J. (1992). A goodness-of-fit test for time series with long-range dependence. Journal of the Royal Statistical Society, Ser. B, 54(3):749–760.

Beran, J. (1994). Statistics for Long-Memory Processes. Chapman and Hall, New York.

Chen, W. W. and Deo, R. S. (2004). A generalized Portmanteau goodness-of-fit test for time series models. Econometric Theory, 20:382–416.

Chen, W. W. and Deo, R. S. (2006). Estimation of misspecified long-memory models. Journal of Econometrics, 134(1):257–281.

Dahlhaus, R. (1989). Efficient parameter estimation for self-similar processes. The Annals of Statistics, 17(4):1749–1766.

Deo, R. S. and Chen, W. W. (2000). On the integral of the squared periodogram. Stochastic Processes and their Applications, 85:159–176.

Dette, H. (1999). A consistent test for the functional form of a regression based on a difference of variance estimators. Annals of Statistics, 27:1012–1040.

Dette, H. (2002). A consistent test for heteroscedasticity in nonparametric regression based on the kernel method. Journal of Statistical Planning and Inference, 103:311–329.

Dette, H. and Munk, A. (2003). Some methodological aspects of validation of models in nonparametric regression. Statistica Neerlandica, 57:207–244.

Dette, H. and Spreckelsen, I. (2003). A note on a specification test for time series models based on spectral density estimation. Scandinavian Journal of Statistics, 30:481–491.

Fay, G. and Philippe, A. (2002). Goodness-of-fit test for long range dependent processes. ESAIM: Probability and Statistics, 6:239–258.

Fox, R. and Taqqu, M. S. (1986). Large-sample properties of parameter estimates for strongly dependent stationary Gaussian time series. Annals of Statistics, 14:517–532.

Giraitis, L. and Surgailis, D. (1990). A central limit theorem for quadratic forms in strongly dependent linear variables and its application to asymptotical normality of Whittle's estimate. Probability Theory and Related Fields, 86(1):87–104.

Gradshteyn, I. and Ryzhik, I. (1980). Table of Integrals, Series, and Products. Academic Press, New York.

Granger, C. (1980). Long memory relationships and the aggregation of dynamic models. Journal of Econometrics, 14(2):227–238.

Granger, C. W. J. and Joyeux, R. (1980). An introduction to long-memory time series models and fractional differencing. Journal of Time Series Analysis, 1(1):15–29.

Greene, M. and Fielitz, B. (1977). Long-term dependence in common stock returns. Journal of Financial Economics, 4(3):339–349.

Hosking, J. (1981). Fractional differencing. Biometrika, 68(1):165–176.

Hurvich, C., Moulines, E., and Soulier, P. (2002). The FEXP estimator for potentially non-stationary linear time series. Stochastic Processes and their Applications, 97(2):307–340.

Koutsoyiannis, D., Makropoulos, C., Langousis, A., Baki, S., Efstratiadis, A., Christofides, A., Karavokiros, G., and Mamassis, N. (2009). HESS opinions: Climate, hydrology, energy, water: recognizing uncertainty and seeking sustainability. Hydrology and Earth System Sciences, 13:247–257.

Mokkadem, A. (1997). A measure of information and its applications to test for randomness against ARMA alternatives and to goodness-of-fit test. Stochastic Processes and their Applications, 72(2):145–159.

Paparoditis, E. (2000). Spectral density based goodness-of-fit tests for time series models. Scandinavian Journal of Statistics, 27:143–176.

Park, K. and Willinger, W. (2000). Self-similar network traffic: An overview. In Park, K. and Willinger, W., editors, Self-Similar Network Traffic and Performance Evaluation, pages 1–39. Wiley Interscience, New York.

Stroe-Kunold, E., Stadnytska, T., Werner, J., and Braun, S. (2009). Estimating long-range dependence in time series: An evaluation of estimators implemented in R. Behavior Research Methods, 41:909–923.

Whittle, P. (1953). Estimation and information in stationary time series. Arkiv for Matematik, 1:423–434.