SFB 823 Discussion Paper Nr. 27/2009

Testing model assumptions in functional regression models

Axel Bücher, Holger Dette, Gabi Wieczorek
Ruhr-Universität Bochum
Fakultät für Mathematik
44780 Bochum, Germany
e-mail: axel.buecher@ruhr-uni-bochum.de
e-mail: holger.dette@ruhr-uni-bochum.de

September 28, 2009

Abstract

In the functional regression model where the responses are curves, new tests for the functional form of the regression and the variance function are proposed, which are based on a stochastic process estimating $L^2$-distances. Our approach avoids the explicit estimation of the functional regression, and it is shown that normalized versions of the proposed test statistics converge weakly. The finite sample properties of the tests are illustrated by means of a small simulation study. It is also demonstrated that for small samples bootstrap versions of the tests improve the quality of the approximation of the nominal level.

Keywords and Phrases: goodness-of-fit tests, functional data, parametric bootstrap, tests for heteroscedasticity

AMS Subject Classification: 62G10

1 Introduction

Since the pioneering work by Ramsay and Dalzell (1991) on regression analysis for functional data this topic has received considerable attention in the recent literature. The interest in statistical techniques that take the functional nature of the data into account stems from the fact that nowadays, in many applications (for instance in climatology, remote sensing, linguistics, ...), the data come from the observation of a continuous phenomenon over time. For a review of the statistical problems and techniques for functional data we refer to the monographs of Ramsay and Silverman (2005) and Ferraty and Vieu (2006). In these models either predictors or responses can be viewed as random functions. Such data typically appear when the value of a variable is repeatedly recorded on a dense grid of time points for a sample of subjects. While many authors consider the problem of estimating the regression, or of generalizing classical concepts of multivariate statistics such as principal component or discrimination analysis to the situation where the data are curves [see for example Besse and Ramsay (1986), Faraway (1997), Kneip and Utikal (2001), Cuevas et al. (2002), Ferraty and Vieu (2003), Escabias et al. (2004) or Müller and Stadtmüller (2005) among many others], much less attention has been paid to the problem of testing model assumptions when analyzing functional data. Several authors have discussed the problem of testing hypotheses in a linear functional data model. For example, Cardot et al. (2003), Müller and Stadtmüller (2005) and Cardot et al. (2004) considered the problem of testing a simple hypothesis in the case where the response is real and the predictor is a random function, while Mas (2007) investigated a test for the mean of random curves. Recently Shen and Faraway (2004) and Yang et al. (2007) discussed an $F$-test in a linear longitudinal data model, while Kokoszka et al. (2008) tested for lack of dependence in a functional linear model where both response and predictor are curves.

The present work considers the problem of testing model assumptions in the nonparametric functional regression model
$$ Y_i(u) = m(u, t_i) + \varepsilon(u, t_i), \qquad t_i \in [0,1],\ i = 1, \dots, n, \qquad (1.1) $$
where $u$ varies (without loss of generality) in the interval $[0,1]$.
Our main concern is the problem of validating a parametric assumption of the form
$$ Y_i(u) = g(u, t_i, \beta) + \varepsilon(u, t_i), \qquad t_i \in [0,1],\ i = 1, \dots, n, \qquad (1.2) $$
where $g$ is a given parametric functional regression model and $\beta : [0,1] \to \mathbb{R}^k$ denotes a function which depends either on the variable $u$ or on $t$ (note that the two cases correspond to different parametric models for the functional data). The latter model has been considered in the linear context by numerous authors. In particular, Shen and Faraway (2004) and Yang et al. (2007) proposed generalizations of the $F$-test for the model $Y(u) = x^T \beta(u) + \varepsilon(u)$ and used these methods to analyze data from ergonomics. While in that work the predictor $x$ is discrete (as in the classical ANOVA model), we concentrate in the present paper on the case where the variable $t$ in (1.2) varies in a continuous way. Our work is inspired by the recent paper of Hlubinka and Prchal (2007), who proposed a functional regression model of the form (1.2) to study the time variation of vertical atmospheric radiation profiles. These authors assumed that the parameter (function) $\beta$ depends on the value $t$.

In Section 2 we introduce some notation and propose a test for the hypothesis that the regression function in the nonparametric functional regression model (1.1) is of a specific parametric form as given in (1.2) with a function $\beta$ depending on the variable $u$, that is
$$ H_0 : m(u,t) = g(u, t, \beta(u)) \qquad (1.3) $$
for some parametric function $g$ and a parameter $\beta : [0,1] \to \mathbb{R}^k$. The case where the parameter depends on $t$ is investigated in Section 3, where we consider the hypothesis
$$ H_0 : m(u,t) = h(u, t, \gamma(t)) \qquad (1.4) $$
for a parametric function $h$ and some function $\gamma : [0,1] \to \mathbb{R}^k$. Finally, in Section 4 we discuss the problem of testing parametric assumptions regarding the second order properties of the process $Y(u)$. More precisely, if $r(t, u, v) = \mathrm{Cov}(\varepsilon(u,t), \varepsilon(v,t))$ denotes the covariance of the observations $Y(u)$ and $Y(v)$, we are interested in the hypothesis
$$ H_0 : r(t, u, v) = r(u, v), \qquad (1.5) $$
which corresponds to the case of homoscedasticity. Note that this assumption is necessary for the application of the $F$-tests proposed by Shen and Faraway (2004) and Yang et al. (2007). Moreover, this assumption was also made by Hlubinka and Prchal (2007), who proposed a nonlinear functional regression model for the analysis of changes in atmospheric radiation.

The proposed tests for the hypotheses (1.3), (1.4) and (1.5) are very simple and are based on stochastic processes of empirical $L^2$-distances between the nonparametric and the parametric functional regression model. We prove weak convergence of these processes under the null hypothesis and fixed alternatives and, as a consequence, asymptotic normality of functionals of these processes. In Section 5 we demonstrate by means of a simulation study that for moderate sample sizes the quantiles of the asymptotic distribution provide a rather accurate approximation of the nominal level. For small sample sizes a wild bootstrap version of the test is proposed and its accuracy is also investigated. Finally, some technical details are given in the Appendix.

2 A process of empirical $L^2$-distances for testing (1.3)

Consider the nonparametric functional regression model defined by (1.1) and assume that $n$ independent observations are available at distinct points $0 \le t_1 < \cdots < t_n \le 1$.
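To fix ideas, the following sketch (not part of the original paper) illustrates the data structure behind model (1.1): each observation is a curve recorded on a grid of $u$-values at a design point $t_i$. It assumes a uniform design ($h \equiv 1$) and i.i.d. Brownian-motion errors, the error model later used in the simulations of Section 5; all function and variable names are our own.

```python
# Minimal sketch: simulate curves from model (1.1) on a discretized u-grid.
import numpy as np

def simulate_curves(m, n=100, n_u=101, rng=None):
    """Return u-grid, design points and an (n, n_u) array with rows
    Y_i(u) = m(u, t_i) + eps(u, t_i)."""
    rng = np.random.default_rng(rng)
    u = np.linspace(0.0, 1.0, n_u)          # grid on which each curve is recorded
    t = (np.arange(1, n + 1) - 0.5) / n     # uniform design, i.e. h = 1 in (2.1)
    # Brownian motion over u: cumulative sums of independent Gaussian increments,
    # so that Cov(eps(u, t_i), eps(v, t_i)) = u ∧ v, independently over i.
    du = np.diff(u, prepend=0.0)
    eps = np.cumsum(rng.normal(size=(n, n_u)) * np.sqrt(du), axis=1)
    return u, t, m(u[None, :], t[:, None]) + eps

# Example: the linear model (5.1), m(u, t) = (-1 + 2u) + 2(1 - u)t.
u, t, Y = simulate_curves(lambda u, t: (-1 + 2 * u) + 2 * (1 - u) * t)
```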
For the discussion of the asymptotic properties of the tests proposed in this paper we will assume that the design points $t_1, \dots, t_n$ satisfy
$$ \max_{i=2,\dots,n} \Big| \int_{t_{i-1}}^{t_i} h(t)\,dt - \frac{1}{n} \Big| = o(n^{-(1+\gamma)}), \qquad (2.1) $$
where $h \in \mathrm{Lip}_\gamma[0,1]$ is a strictly positive (unknown) density on the interval $[0,1]$ which is Lipschitz continuous of order $\gamma > 1/2$ [see Sacks and Ylvisaker (1970)].

For the construction of a test for the hypothesis (1.3) of a parametric functional regression model we consider the class of parametric models
$$ \mathcal{M} = \{ g(\cdot, \cdot, \beta(\cdot)) : [0,1] \times [0,1] \to \mathbb{R} \mid \beta : [0,1] \to \Theta \}, $$
where $\Theta$ is some subset of $\mathbb{R}^k$. For the sake of transparency we first discuss the case of testing the hypothesis of a linear functional regression model, that is
$$ H_0 : m(u,t) = g(u, t, \beta(u)) = \beta(u)^T f(u,t), \qquad (2.2) $$
where $f(u,t) = (f_1(u,t), \dots, f_k(u,t))^T$ is a vector of given regression functions. For fixed $u \in [0,1]$ we define the inner product
$$ \langle p, q \rangle_u = \int p(u,t)\, q(u,t)\, h(t)\,dt $$
on the space of functions defined on the unit square $[0,1]^2$, with corresponding norm $\|\cdot\|_u$, and consider
$$ M_u^2 = \inf_{\beta(u)} \| m(u, \cdot) - \beta(u)^T f(u, \cdot) \|_u^2 $$
as the minimal distance from $m$ to functions of the form (2.2). A standard result from Hilbert space theory [see Achieser (1956)] yields that $M_u^2$ can be expressed as a ratio of two Gramian determinants, i.e.
$$ M_u^2 = \frac{\Gamma_u(m, f_1, \dots, f_k)}{\Gamma_u(f_1, \dots, f_k)}, $$
where $\Gamma_u(p_1, \dots, p_k) = \det\big( \langle p_i, p_j \rangle_u \big)_{i,j=1,\dots,k}$ is the Gramian determinant of the functions $p_1, \dots, p_k$. In order to obtain an estimator for $M_u^2$ we replace the inner products $A_{u,0} = \langle m, m \rangle_u$, $A_{u,p} = \langle m, f_p \rangle_u$ and $B_{u,p,q} = \langle f_p, f_q \rangle_u$ by their empirical counterparts
$$ \hat A_{u,0} = \frac{1}{n-1} \sum_{i=2}^n Y_i(u) Y_{i-1}(u), \qquad \hat A_{u,p} = \frac{1}{n} \sum_{i=1}^n Y_i(u) f_p(u, t_i), \qquad \hat B_{u,p,q} = \frac{1}{n} \sum_{i=1}^n f_p(u, t_i) f_q(u, t_i), $$
where $p, q = 1, \dots, k$. (Note that $\hat A_{u,0}$ uses the lag-one product $Y_i(u) Y_{i-1}(u)$ rather than the squared observations $Y_i^2(u)$, which would be biased by the error variance.) This yields the canonical estimate
$$ \hat M_u^2 = \det \begin{pmatrix} \hat A_{u,0} & \hat A_{u,1} & \cdots & \hat A_{u,k} \\ \hat A_{u,1} & \hat B_{u,1,1} & \cdots & \hat B_{u,1,k} \\ \vdots & \vdots & \ddots & \vdots \\ \hat A_{u,k} & \hat B_{u,k,1} & \cdots & \hat B_{u,k,k} \end{pmatrix} \Big/ \det \begin{pmatrix} \hat B_{u,1,1} & \cdots & \hat B_{u,1,k} \\ \vdots & \ddots & \vdots \\ \hat B_{u,k,1} & \cdots & \hat B_{u,k,k} \end{pmatrix} \qquad (2.3) $$
of the $L^2$-distance $M_u^2$. In the following discussion we will study the asymptotic properties of the process $\{\hat M_u^2\}_{u \in [0,1]}$.
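A direct numpy implementation of the estimator (2.3) might look as follows; `Y`, `u` and `t` are as in the simulation sketch above, `f_list` holds the regression functions $f_1, \dots, f_k$ as callables, and all names are illustrative rather than part of the paper.

```python
# Sketch of the estimator (2.3), evaluated on the whole u-grid.
import numpy as np

def M2_hat(Y, u, t, f_list):
    """Return \hat M_u^2 of (2.3) at every grid point u."""
    n, k = len(t), len(f_list)
    F = np.stack([f(u[None, :], t[:, None]) for f in f_list])  # (k, n, n_u)
    A0 = np.sum(Y[1:] * Y[:-1], axis=0) / (n - 1)              # \hat A_{u,0}
    A = np.einsum('pij,ij->pj', F, Y) / n                      # \hat A_{u,p}, (k, n_u)
    B = np.einsum('pij,qij->pqj', F, F) / n                    # \hat B_{u,p,q}, (k, k, n_u)
    out = np.empty_like(u)
    for j in range(len(u)):                                    # ratio of Gramian determinants
        G = np.empty((k + 1, k + 1))
        G[0, 0], G[0, 1:], G[1:, 0], G[1:, 1:] = A0[j], A[:, j], A[:, j], B[:, :, j]
        out[j] = np.linalg.det(G) / np.linalg.det(B[:, :, j])
    return out
```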
Denote by
$$ \mathrm{Lip}^{\mathrm{unif}}_\gamma[0,1] = \{ f = f(x, \cdot) : |f(x,t) - f(x,s)| \le C |s - t|^\gamma;\ s, t \in [0,1] \} \qquad (2.4) $$
the set of all functions $f : [0,1] \times [0,1] \to \mathbb{R}$ satisfying a uniform Lipschitz condition (in other words, the constant $C$ in (2.4) does not depend on $x$), and assume that for some $\gamma > 1/2$ and for all $(t, u, v) \in [0,1]^3$
$$ f_j(u, \cdot),\, f_j(\cdot, t) \in \mathrm{Lip}^{\mathrm{unif}}_\gamma[0,1], \quad j = 1, \dots, k, $$
$$ m(u, \cdot),\, m(\cdot, t) \in \mathrm{Lip}^{\mathrm{unif}}_\gamma[0,1], $$
$$ r(\cdot, u, v),\, r(t, \cdot, v),\, r(t, u, \cdot) \in \mathrm{Lip}^{\mathrm{unif}}_\gamma[0,1], $$
where
$$ r(t, u, v) = E[\varepsilon(u,t)\, \varepsilon(v,t)] $$
denotes the covariance of the (centered) errors at the point $t$. The following result specifies the asymptotic properties of the stochastic process $\{\hat M_u^2 - M_u^2\}_{u \in [0,1]}$. Throughout this paper the symbol $\Rightarrow$ denotes weak convergence.

Theorem 2.1. If the assumptions stated in this section are satisfied and the linear hypothesis (2.2) is tested, we have as $n \to \infty$
$$ \sqrt{n}\, (\hat M_u^2 - M_u^2) \Rightarrow G $$
in $C[0,1]$, where $G$ is a centered Gaussian process with covariance $k = k(u,v)$ given by
$$ k(u,v) = \int r^2(t, u, v) h(t)\,dt + 4 \int r(t, u, v) \big(m(u,t) - g(u, t, \beta_0(u))\big) \big(m(v,t) - g(v, t, \beta_0(v))\big) h(t)\,dt \qquad (2.5) $$
and
$$ \beta_0(u) = \mathop{\mathrm{argmin}}_\beta \| m(u, \cdot) - g(u, \cdot, \beta) \|_u^2 \qquad (2.6) $$
corresponds to the parameter of the best approximation of the function $m(u, \cdot)$ by the parametric regression model.

Proof of Theorem 2.1. We assume without loss of generality that the functions $f_1(u, \cdot), \dots, f_k(u, \cdot)$ are orthonormal with respect to the inner product $\langle p, q \rangle_u$. Then the minimal $L^2$-distance obtained by the best approximation simplifies to
$$ M_u^2 = A_{u,0} - \sum_{p=1}^k A_{u,p}^2. $$
It is easy to see that the statistics $\hat A_{u,p}$ and $\hat B_{u,p,q}$ are $\sqrt{n}$-consistent estimates of the quantities $\langle m, f_p \rangle_u$ and $\langle f_p, f_q \rangle_u = \delta_{p,q}$, respectively, and consequently we obtain for the statistic in (2.3)
$$ T_n(u) := \sqrt{n}\, (\hat M_u^2 - M_u^2) = \sqrt{n} \Big\{ \hat A_{u,0} - \sum_{p=1}^k \hat A_{u,p}^2 - M_u^2 \Big\} + o_P(1) = \bar T_n(u) + o_P(1) $$
uniformly with respect to $u \in [0,1]$, where the last equality defines the process $\bar T_n(u)$ in an obvious manner. For the proof of weak convergence we have to show

(i) convergence of the finite dimensional distributions, i.e. $(\bar T_n(u_1), \dots, \bar T_n(u_m)) \xrightarrow{D} (G(u_1), \dots, G(u_m))$ for all $u_1, \dots, u_m \in [0,1]$, $m \in \mathbb{N}$;

(ii) tightness of the sequence $(\bar T_n)_{n \in \mathbb{N}}$.

The convergence of the finite dimensional distributions follows from Theorem 2.1 and its proof in Dette et al. (1999). For a proof of tightness we use the decomposition $\bar T_n(u) = U_n(u) + V_n(u)$ with
$$ U_n(u) = \sqrt{n} \Big( \hat A_{u,0} - E\hat A_{u,0} - \sum_{p=1}^k \big( \hat A_{u,p}^2 - E\hat A_{u,p}^2 \big) \Big), $$
$$ V_n(u) = \sqrt{n} \Big( E\hat A_{u,0} - \langle m, m \rangle_u - \sum_{p=1}^k \big( E\hat A_{u,p}^2 - \langle m, f_p \rangle_u^2 \big) \Big). $$
Consequently, it is sufficient to show that the (deterministic) sequence $V_n(u)$ converges uniformly to 0, i.e.
$$ \sup_{u \in [0,1]} |V_n(u)| = o(1), \qquad (2.7) $$
and that the process $\{U_n(u)\}_{u \in [0,1]}$ is tight. For this purpose we use Theorem 12.3 from Billingsley (1968) and show that there are constants $\alpha > 0$, $\gamma \ge 0$ and a nondecreasing, continuous function $F$ on $[0,1]$ such that
$$ E\big[ |U_n(u) - U_n(v)|^\gamma \big] \le |F(u) - F(v)|^\alpha. \qquad (2.8) $$

We first prove (2.7) and introduce the decomposition
$$ V_n(u) = \sqrt{n} \Big( V_{n0}(u) - \sum_{p=1}^k V_{np}(u) \Big) $$
with $V_{n0}(u) = E\hat A_{u,0} - \langle m, m \rangle_u$ and $V_{np}(u) = E\hat A_{u,p}^2 - \langle m, f_p \rangle_u^2$. Assertion (2.7) follows from
$$ \sup_{u \in [0,1]} |V_{np}(u)| = o(n^{-1/2}), \qquad p = 0, \dots, k. \qquad (2.9) $$
We exemplarily consider the first summand $V_{n0}(u)$, which can be represented as $V_{n0}(u) = A_1(u) + A_2(u) + A_3(u)$ with
$$ A_1(u) = \frac{1}{n-1} \sum_{i=1}^n m^2(u, t_i) - \langle m, m \rangle_u, $$
$$ A_2(u) = - \frac{1}{n-1} \sum_{i=2}^n m(u, t_i) \big( m(u, t_i) - m(u, t_{i-1}) \big), $$
$$ A_3(u) = - \frac{1}{n-1}\, m^2(u, t_1). $$
Using the fact that $m^2(u, \cdot) \in \mathrm{Lip}^{\mathrm{unif}}_\gamma[0,1]$ and taking into account that
$$ \max_{i=2,\dots,n} |t_i - t_{i-1}|^\gamma = O(n^{-\gamma}) = o(n^{-1/2}) $$
by (2.1), we obtain that all terms are of order $o(n^{-1/2})$, uniformly with respect to $u \in [0,1]$. This proves (2.9) for $p = 0$, and similar arguments for the remaining terms show that (2.7) holds.

In order to show that condition (2.8) is valid we calculate
$$ E\big[ (U_n(u) - U_n(v))^2 \big] = n\, (B_1 + B_2 + B_3 + B_4), $$
where
$$ B_1 = \mathrm{Var}(\hat A_{u,0}) + \mathrm{Var}(\hat A_{v,0}) - 2\, \mathrm{Cov}(\hat A_{u,0}, \hat A_{v,0}), $$
$$ B_2 = \mathrm{Var}\Big( \sum_{p=1}^k \hat A_{u,p}^2 \Big) + \mathrm{Var}\Big( \sum_{p=1}^k \hat A_{v,p}^2 \Big) - 2\, \mathrm{Cov}\Big( \sum_{p=1}^k \hat A_{u,p}^2, \sum_{p=1}^k \hat A_{v,p}^2 \Big), $$
$$ B_3 = 2\, \mathrm{Cov}\Big( \hat A_{u,0}, \sum_{p=1}^k \hat A_{v,p}^2 \Big) - 2\, \mathrm{Cov}\Big( \hat A_{u,0}, \sum_{p=1}^k \hat A_{u,p}^2 \Big), $$
$$ B_4 = 2\, \mathrm{Cov}\Big( \sum_{p=1}^k \hat A_{u,p}^2, \hat A_{v,0} \Big) - 2\, \mathrm{Cov}\Big( \hat A_{v,0}, \sum_{p=1}^k \hat A_{v,p}^2 \Big). $$
We now show that for each term $B_i = B_i(u,v)$ ($i = 1, \dots, 4$) it is possible to find a constant $C$ such that
$$ n\, B_i(u,v) \le C\, |u - v|^\gamma, $$
which proves condition (2.8). For this purpose we exemplarily consider the expression $B_1$; the corresponding statements for the other terms follow along similar lines. A straightforward but tedious calculation yields
$$ \mathrm{Cov}(\hat A_{u,0}, \hat A_{v,0}) = \frac{1}{(n-1)^2} \Big\{ \sum_{i=2}^n \big[ m(u, t_{i-1}) m(v, t_{i-1}) r(t_i, u, v) + m(u, t_i) m(v, t_i) r(t_{i-1}, u, v) + r(t_i, u, v) r(t_{i-1}, u, v) \big] + \sum_{i=3}^n \big[ m(u, t_i) m(v, t_{i-2}) r(t_{i-1}, u, v) + m(v, t_i) m(u, t_{i-2}) r(t_{i-1}, u, v) \big] \Big\}, $$
and we therefore obtain $B_1 = \tilde B_1(u,v) + \tilde B_1(v,u)$ with
$$ \tilde B_1(u,v) = \frac{1}{(n-1)^2} \Big\{ \sum_{i=2}^n \big[ m(u, t_{i-1})^2 r(t_i, u, u) - m(u, t_{i-1}) m(v, t_{i-1}) r(t_i, u, v) + m(u, t_i)^2 r(t_{i-1}, u, u) - m(u, t_i) m(v, t_i) r(t_{i-1}, u, v) + r(t_i, u, u) r(t_{i-1}, u, u) - r(t_i, u, v) r(t_{i-1}, u, v) \big] + 2 \sum_{i=3}^n \big[ m(u, t_i) m(u, t_{i-2}) r(t_{i-1}, u, u) - m(u, t_i) m(v, t_{i-2}) r(t_{i-1}, u, v) \big] \Big\}.
$$
A typical summand in $\tilde B_1$ can be estimated by
$$ \big| m(u, t_{i-1})^2 r(t_i, u, u) - m(u, t_{i-1}) m(v, t_{i-1}) r(t_i, u, v) \big| \le C |u - v|^\gamma, $$
using the Lipschitz properties of the functions $r$ and $m$. All other summands are treated similarly, and we obtain
$$ B_1 = B_1(u,v) \le \frac{1}{n-1}\, C |u - v|^\gamma, $$
which proves assertion (2.8) and completes the proof of Theorem 2.1. □

Remark 2.2. The assertion of Theorem 2.1 remains valid if the general hypothesis (1.3) of a nonlinear functional regression model is tested, and we briefly indicate the arguments for proving this assertion. First note that the estimate $\hat A_{u,0}$ can be rewritten as
$$ \hat A_{u,0} = \frac{1}{n} \sum_{i=1}^n Y_i^2(u) - \hat\sigma_u^2 + o_P\Big(\frac{1}{\sqrt n}\Big), $$
where
$$ \hat\sigma_u^2 = \frac{1}{2(n-1)} \sum_{i=2}^n \big( Y_i(u) - Y_{i-1}(u) \big)^2 $$
denotes an estimate of the integrated variance
$$ \int_0^1 \mathrm{Var}(\varepsilon(u,t))\, h(t)\,dt = \int_0^1 r(t, u, u)\, h(t)\,dt $$
at the point $u \in [0,1]$ [see for example Rice (1984)]. Now a straightforward calculation shows that the estimate $\hat M_u^2$ is essentially the minimal sum of squared residuals, i.e.
$$ \hat M_u^2 = \min_\beta \frac{1}{n} \sum_{i=1}^n \big( Y_i(u) - \beta^T f(u, t_i) \big)^2 - \hat\sigma_u^2 + o_P\Big(\frac{1}{\sqrt n}\Big) \qquad (2.10) $$
uniformly with respect to $u \in [0,1]$. Obviously, this concept can easily be generalized to the problem of testing the hypothesis of a nonlinear functional regression model. To be precise, we assume that for each $u \in [0,1]$ the function $g_u : t \mapsto g(u, t, \beta_u)$ satisfies the standard regularity conditions of a nonlinear regression model [see for example Gallant (1987) or Seber and Wild (1989)]. In particular we assume that the set $\Theta \subset \mathbb{R}^k$ is compact with non-empty interior and that for all $u, t \in [0,1]$ the function $g(u, t, \beta)$ is twice continuously differentiable with respect to $\beta$ and satisfies $g(u, \cdot, \beta), g(\cdot, t, \beta) \in \mathrm{Lip}^{\mathrm{unif}}_\gamma[0,1]$. We recall the definition (2.6) of the parameter corresponding to the best $L^2$-approximation of the function $m(u, \cdot) : [0,1] \to \mathbb{R}$ by parametric functions of the form $\{ g(u, \cdot, \beta_u) \mid \beta_u \in \Theta \}$, where we assume for each $u \in [0,1]$ that the minimum $\beta_0(u)$ is attained at a unique interior point of the compact space $\Theta$. The $L^2$-distance between the function $m(u, \cdot)$ and its best approximation $g(u, \cdot, \beta_0(u))$ in the parametric class is now defined by
$$ M_u^2 = \int_0^1 \big( m(u,t) - g(u, t, \beta_0(u)) \big)^2 h(t)\,dt. $$
In order to investigate whether the hypothesis (1.3) is satisfied, let for each $u \in [0,1]$
$$ \hat\beta_0(u) = \mathop{\mathrm{arginf}}_\beta \sum_{i=1}^n \big( Y_i(u) - g(u, t_i, \beta) \big)^2 \qquad (2.11) $$
denote the nonlinear least squares estimate (here and throughout this paper it is assumed that the infimum in (2.11) is attained at a unique interior point of $\Theta \subset \mathbb{R}^k$), and observing (2.11) we obtain as the analogue of (2.3) the statistic
$$ \hat T_u^2 = \frac{1}{n} \sum_{i=1}^n \big( Y_i(u) - g(u, t_i, \hat\beta_0(u)) \big)^2 - \hat\sigma_u^2. \qquad (2.12) $$
It follows by similar arguments as in Brodeau (1993) that
$$ \hat T_u^2 = \frac{1}{n} \sum_{i=2}^n \varepsilon(u, t_i) \varepsilon(u, t_{i-1}) - \frac{2}{n} \sum_{i=1}^n \big( m(u, t_i) - g(u, t_i, \beta_0(u)) \big) \varepsilon(u, t_i) + M_u^2 + o_P\Big(\frac{1}{\sqrt n}\Big) $$
uniformly with respect to $u \in [0,1]$, and a similar reasoning as presented in the proof of Theorem 2.1 shows that
$$ \sqrt{n}\, (\hat T_u^2 - M_u^2) \Rightarrow G, $$
where the covariance structure of the Gaussian process $G$ is specified in (2.5). The details are omitted for the sake of brevity.

Note that the null hypothesis (1.3) is satisfied if and only if $M_u^2 = 0$ for all $u \in [0,1]$. Consequently, a consistent test can be obtained by rejecting the null hypothesis for large values of a Cramér–von Mises or a Kolmogorov–Smirnov functional of the process $\{\hat T_u^2\}_{u \in [0,1]}$.

Under the null hypothesis the covariance kernel of the limiting process $G$ in Theorem 2.1 reduces to
$$ k(u,v) \stackrel{H_0}{=} \int_0^1 r^2(t, u, v)\, h(t)\,dt, $$
and by the continuous mapping theorem it follows that the statistic $\sqrt{n} \int_0^1 \hat T_u^2\,du$ converges weakly to a centered normal distribution with variance $\int_0^1 \int_0^1 k(u,v)\,du\,dv$. Therefore it remains to estimate the asymptotic variance, and we propose to use
$$ \hat s_n^2 = \int_0^1 \int_0^1 \hat k(u,v)\,du\,dv, $$
where the estimate of the covariance kernel $k(u,v)$ is defined by
$$ \hat k(u,v) = \frac{1}{4(n-3)} \sum_{i=2}^{n-2} S_i(u) S_i(v) S_{i+2}(u) S_{i+2}(v) \qquad (2.13) $$
with $S_i(u) = Y_i(u) - Y_{i-1}(u)$. The following result shows that under the null hypothesis the statistic $\hat s_n^2$ is a consistent estimate of the asymptotic variance. The technical details of the proof are given in the Appendix.

Proposition 2.3. Under the assumptions of Theorem 2.1 we have
$$ \hat k(u,v) = k(u,v) + O_P(n^{-1/2}) $$
uniformly with respect to $u, v \in [0,1]$.

Theorem 2.1 and Proposition 2.3 provide an asymptotic level $\alpha$ test by rejecting the null hypothesis (1.3) if
$$ T_n = \frac{\sqrt{n}}{\sqrt{\int_0^1 \int_0^1 \hat k(u,v)\,du\,dv}} \int_0^1 \hat T_u^2\,du > u_{1-\alpha}, \qquad (2.14) $$
where $u_{1-\alpha}$ denotes the $(1-\alpha)$ quantile of the standard normal distribution. The finite sample properties of this test and a corresponding bootstrap version will be illustrated in Section 5.
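A numpy sketch of the resulting test (2.14) could be organized as follows. It reuses `M2_hat` from the sketch above as the process $\hat T_u^2$, which is justified for the linear hypothesis by the representation (2.10); integrals are discretized with the trapezoidal rule, and all names are illustrative.

```python
# Sketch of the asymptotic level-alpha test (2.14).
import numpy as np
from statistics import NormalDist

def test_hypothesis_13(Y, u, t, f_list, alpha=0.05):
    """Return True if H0 in (1.3) is rejected at level alpha."""
    n = len(t)
    T2 = M2_hat(Y, u, t, f_list)                   # \hat T_u^2 (here \hat M_u^2, cf. (2.10))
    S = Y[1:] - Y[:-1]                             # differences S_i(u) = Y_i(u) - Y_{i-1}(u)
    P = S[:-2] * S[2:]                             # rows S_i(u) S_{i+2}(u), i = 2, ..., n-2
    k_hat = P.T @ P / (4 * (n - 3))                # \hat k(u, v) of (2.13) on the grid
    s2n = np.trapz(np.trapz(k_hat, u, axis=1), u)  # \hat s_n^2 = double integral of \hat k
    Tn = np.sqrt(n) * np.trapz(T2, u) / np.sqrt(s2n)
    return Tn > NormalDist().inv_cdf(1 - alpha)    # compare with u_{1-alpha}
```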
3 A test for the hypothesis (1.4)

We now consider the problem of testing the hypothesis (1.4) in the functional regression model defined by (1.1) and assume that $n$ independent observations according to model (1.1) are available. For this purpose we define for fixed $t \in [0,1]$ the $L^2$-distance
$$ M_t^2 = \inf_{\gamma_t} \int_0^1 \big( m(u,t) - h(u, t, \gamma_t) \big)^2\,du. \qquad (3.1) $$
We only deal with the linear case, that is
$$ h(u, t, \gamma(t)) = \gamma(t)^T f(u,t) $$
for some given regression functions $f(u,t) = (f_1(u,t), \dots, f_k(u,t))^T$, and denote by $\gamma_0(t)$ the function which yields the minimal value in (3.1). As a global measure of deviation from the null hypothesis we consider the functional
$$ M^2 = \int_0^1 M_t^2\, h(t)\,dt, \qquad (3.2) $$
and obviously the hypothesis $H_0 : M^2 = 0$ is equivalent to (1.4).

Similarly as in Section 2, standard Hilbert space theory shows that the distance $M_t^2$ can be expressed as a ratio of two Gramian determinants,
$$ M_t^2 = \frac{\Gamma_t(m, f_1, \dots, f_k)}{\Gamma_t(f_1, \dots, f_k)}, \qquad (3.3) $$
where $\Gamma_t(p_1, \dots, p_l) = \det\big( (\langle p_i, p_j \rangle_t)_{i,j=1}^l \big)$ and the inner products are now calculated with respect to the variable $u$, that is
$$ \langle f, g \rangle_t = \int_0^1 f(u,t)\, g(u,t)\,du. $$
For the time $t_i$ we can "estimate" the entries of the matrix in the numerator of (3.3) by
$$ \hat B_{i,0} = \int Y_i(u) Y_{i-1}(u)\,du, \qquad \hat B_{i,p} = \int Y_i(u) f_p(u, t_i)\,du, \qquad \hat C_{i,p} = \int Y_{i-1}(u) f_p(u, t_{i-1})\,du = \hat B_{i-1,p}, $$
and define
$$ \hat M_{t_i}^2 = \det \begin{pmatrix} \hat B_{i,0} & \hat B_{i,1} & \cdots & \hat B_{i,k} \\ \hat C_{i,1} & \langle f_1, f_1 \rangle_{t_i} & \cdots & \langle f_1, f_k \rangle_{t_i} \\ \vdots & \vdots & \ddots & \vdots \\ \hat C_{i,k} & \langle f_k, f_1 \rangle_{t_i} & \cdots & \langle f_k, f_k \rangle_{t_i} \end{pmatrix} \Big/ \det \begin{pmatrix} \langle f_1, f_1 \rangle_{t_i} & \cdots & \langle f_1, f_k \rangle_{t_i} \\ \vdots & \ddots & \vdots \\ \langle f_k, f_1 \rangle_{t_i} & \cdots & \langle f_k, f_k \rangle_{t_i} \end{pmatrix} \qquad (3.4) $$
as an estimator for $M_{t_i}^2$. Note that we estimate the entries of the first column by $\hat C_{i,p}$ rather than $\hat B_{i,p}$ in order to ensure that the statistic $\hat M_{t_i}^2$ is asymptotically unbiased. However, because only one observation is made at time $t_i$, the variance of $\hat M_{t_i}^2$ does not converge to 0 with increasing sample size. As a consequence, the statistic $\hat M_{t_i}^2$ is not a consistent estimate of $M_{t_i}^2$. Nevertheless, a consistent estimate of the measure defined in (3.2) can be obtained by averaging the quantities $\hat M_{t_i}^2$, that is
$$ \hat M^2 = \frac{1}{n-1} \sum_{i=2}^n \hat M_{t_i}^2. \qquad (3.5) $$
Similarly, consistent estimates of $M_t^2$ at a particular point $t$ can be obtained by local averages.
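For illustration, the estimator (3.4)/(3.5) can be sketched as follows, again with trapezoidal integration over the $u$-grid; `Y`, `u`, `t`, `f_list` are as in the earlier sketches and all names are illustrative.

```python
# Sketch of the global estimator \hat M^2 in (3.5).
import numpy as np

def M2_hat_t(Y, u, t, f_list):
    n, k = len(t), len(f_list)
    F = np.stack([f(u[None, :], t[:, None]) for f in f_list])    # (k, n, n_u)
    def inner(a, b):                                             # <a, b>_t = integral over u
        return np.trapz(a * b, u, axis=-1)
    M2_ti = np.empty(n - 1)
    for i in range(1, n):                                        # i = 2, ..., n in the paper
        B0 = inner(Y[i], Y[i - 1])                               # \hat B_{i,0}
        Bp = inner(Y[i][None, :], F[:, i])                       # \hat B_{i,p}
        Cp = inner(Y[i - 1][None, :], F[:, i - 1])               # \hat C_{i,p} = \hat B_{i-1,p}
        Gram = inner(F[:, None, i], F[None, :, i])               # <f_p, f_q>_{t_i}
        num = np.block([[np.atleast_2d(B0), Bp[None, :]],
                        [Cp[:, None], Gram]])                    # matrix in (3.4)
        M2_ti[i - 1] = np.linalg.det(num) / np.linalg.det(Gram)
    return M2_ti.mean()                                          # \hat M^2 of (3.5)
```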
Theorem 3.1. Under the assumptions of Section 2 the estimate $\hat M^2$ defined in (3.5) is consistent for the measure $M^2$ defined in (3.2). More precisely, we have as $n \to \infty$
$$ \sqrt{n-1}\, (\hat M^2 - M^2) \xrightarrow{D} \mathcal{N}(0, \sigma^2), $$
where the asymptotic variance is given by
$$ \sigma^2 = \int_0^1 \Big( \int_{[0,1]^2} \big( r(u, v, t) - (P_{u,t} r)(v) \big) \big( r(u, v, t) - (P_{v,t} r)(u) \big)\,du\,dv + 4 \int_{[0,1]^2} r(u, v, t) \big( m(u,t) - (P_t m)(u) \big) \big( m(v,t) - (P_t m)(v) \big)\,du\,dv \Big) h(t)\,dt, $$
and $(P_t m)(u) = \gamma_{t,0}^T f(u,t)$ and $(P_{u,t} r)(v) = \gamma_{u,t,0}^T f(v,t)$ denote the orthogonal projections of the functions $m(\cdot, t)$ and $r(u, \cdot, t)$ onto the set $\mathrm{span}\{f_1(\cdot, t), \dots, f_k(\cdot, t)\}$, respectively, that is
$$ \int_0^1 \big( m(u,t) - \gamma_{t,0}^T f(u,t) \big)^2\,du = \inf_{\gamma_t} \int_0^1 \big( m(u,t) - \gamma_t^T f(u,t) \big)^2\,du = M_t^2, $$
$$ \int_0^1 \big( r(u, v, t) - \gamma_{u,t,0}^T f(v,t) \big)^2\,dv = \inf_{\gamma_{u,t}} \int_0^1 \big( r(u, v, t) - \gamma_{u,t}^T f(v,t) \big)^2\,dv. $$

Proof of Theorem 3.1. Without loss of generality we may assume that the functions $f_1, \dots, f_k$ are orthonormal, so that the minimal distance in (3.3) and its estimator defined in (3.4) simplify to
$$ M_{t_i}^2 = \langle m, m \rangle_{t_i} - \sum_{p=1}^k \langle m, f_p \rangle_{t_i}^2, \qquad \hat M_{t_i}^2 = \hat B_{i,0} - \sum_{p=1}^k \hat B_{i,p} \hat C_{i,p}, $$
respectively. A careful calculation of the moments of the random variables in the latter expression yields
$$ E[\hat B_{i,0}] = \langle m, m \rangle_{t_i} + O(n^{-\gamma}), $$
$$ E[\hat B_{i,p} \hat C_{i,p}] = \langle m, f_p \rangle_{t_i}^2 + O(n^{-\gamma}), $$
$$ \mathrm{Var}(\hat B_{i,0}) = \int r(u, v, t_i)^2\,du\,dv + 2 \int r(u, v, t_i)\, m(u, t_i)\, m(v, t_i)\,du\,dv + O(n^{-\gamma}), $$
$$ \mathrm{Cov}(\hat B_{i,p} \hat C_{i,p}, \hat B_{i,q} \hat C_{i,q}) = \int r(u, v, t_i) f_p(u, t_i) f_q(v, t_i)\,du\,dv \Big( 2 \langle m, f_p \rangle_{t_i} \langle m, f_q \rangle_{t_i} + \int r(u, v, t_i) f_p(u, t_i) f_q(v, t_i)\,du\,dv \Big) + O(n^{-\gamma}), $$
$$ \mathrm{Cov}(\hat B_{i,0}, \hat B_{i,p} \hat C_{i,p}) = 2 \int r(u, v, t_i)\, m(u, t_i) f_p(v, t_i)\,du\,dv\, \langle m, f_p \rangle_{t_i} + \int r(u, v, t_i) r(u, w, t_i) f_p(v, t_i) f_p(w, t_i)\,du\,dv\,dw + O(n^{-\gamma}), $$
$$ \mathrm{Cov}(\hat B_{i,0}, \hat B_{i-1,0}) = 2 \int r(u, v, t_i)\, m(u, t_i)\, m(v, t_i)\,du\,dv + O(n^{-\gamma}), $$
$$ \mathrm{Cov}(\hat B_{i,0}, \hat B_{i-1,p} \hat C_{i-1,p}) = \int r(u, v, t_i)\, m(u, t_i) f_p(v, t_i)\,du\,dv\, \langle m, f_p \rangle_{t_i} + O(n^{-\gamma}) = \mathrm{Cov}(\hat B_{i-1,0}, \hat B_{i,p} \hat C_{i,p}), $$
$$ \mathrm{Cov}(\hat B_{i,p} \hat C_{i,p}, \hat B_{i-1,q} \hat C_{i-1,q}) = \int r(u, v, t_i) f_p(u, t_i) f_q(v, t_i)\,du\,dv\, \langle m, f_p \rangle_{t_i} \langle m, f_q \rangle_{t_i} + O(n^{-\gamma}). $$
The sequence $\hat M_{t_2}^2, \dots, \hat M_{t_n}^2$ forms a triangular array of one-dependent random variables, and as a consequence all covariances corresponding to a lag larger than one vanish. Therefore the variance of the standardized mean,
$$ \sigma_n^2 = \mathrm{Var}\Big( \frac{1}{\sqrt{n-1}} \sum_{i=2}^n \hat M_{t_i}^2 \Big), $$
is given by
$$ \sigma_n^2 = \frac{1}{n-1} \sum_{i=2}^n \Big\{ \mathrm{Var}(\hat B_{i,0}) + \sum_{p,q=1}^k \mathrm{Cov}(\hat B_{i,p} \hat C_{i,p}, \hat B_{i,q} \hat C_{i,q}) - 2 \sum_{p=1}^k \mathrm{Cov}(\hat B_{i,0}, \hat B_{i,p} \hat C_{i,p}) + 2\, \mathrm{Cov}(\hat B_{i,0}, \hat B_{i-1,0}) - 2 \sum_{p=1}^k \mathrm{Cov}(\hat B_{i,0}, \hat B_{i-1,p} \hat C_{i-1,p}) - 2 \sum_{p=1}^k \mathrm{Cov}(\hat B_{i-1,0}, \hat B_{i,p} \hat C_{i,p}) + 2 \sum_{p,q=1}^k \mathrm{Cov}(\hat B_{i,p} \hat C_{i,p}, \hat B_{i-1,q} \hat C_{i-1,q}) \Big\} + O(n^{-\gamma}) $$
$$ = \int_0^1 \Big\{ \int r(u, v, t)^2\,du\,dv - 2 \sum_{p=1}^k \int r(u, v, t) r(u, w, t) f_p(v, t) f_p(w, t)\,du\,dv\,dw + \sum_{p,q=1}^k \Big( \int r(u, v, t) f_p(u, t) f_q(v, t)\,du\,dv \Big)^2 + 4 \int r(u, v, t)\, m(u, t)\, m(v, t)\,du\,dv - 8 \sum_{p=1}^k \int r(u, v, t)\, m(u, t) f_p(v, t)\,du\,dv\, \langle m, f_p \rangle_t + 4 \sum_{p,q=1}^k \int r(u, v, t) f_p(u, t) f_q(v, t)\,du\,dv\, \langle m, f_p \rangle_t \langle m, f_q \rangle_t \Big\} h(t)\,dt + O(n^{-\gamma}) = \sigma^2 + O(n^{-\gamma}). $$
Here the last equality uses the fact that under the assumption of orthonormality the orthogonal projections $P_t m$ and $P_{u,t} r$ are given by
$$ (P_t m)(u) = \sum_{p=1}^k \langle m, f_p \rangle_t f_p(u, t), \qquad (P_{u,t} r)(v) = \sum_{p=1}^k \langle r(u, \cdot, t), f_p \rangle_t f_p(v, t). $$
The assertion of the theorem now follows from the classical central limit theorem for $m$-dependent random variables [see Orey (1958)]. □

Under the null hypothesis the variance of the limiting normal distribution simplifies to
$$ \sigma^2 \stackrel{H_0}{=} \int_0^1 \int_{[0,1]^2} \big( r(u, v, t) - (P_{u,t} r)(v) \big) \big( r(u, v, t) - (P_{v,t} r)(u) \big)\,d(u,v)\, h(t)\,dt. $$
We propose to estimate this variance by
$$ \hat\sigma^2 = \frac{1}{4(n-3)} \sum_{i=2}^{n-2} \int_{[0,1]^2} S_i(u) \Big( S_i(v) - \int_0^1 S_i(x) f(x, t_i)^T\,dx\; A_i^{-1} f(v, t_i) \Big) S_{i+2}(v) \Big( S_{i+2}(u) - \int_0^1 S_{i+2}(x) f(x, t_{i+2})^T\,dx\; A_{i+2}^{-1} f(u, t_{i+2}) \Big)\,d(u,v), $$
where $S_i(u) = Y_i(u) - Y_{i-1}(u)$ and $A_i = \int_0^1 f(u, t_i) f(u, t_i)^T\,du \in \mathbb{R}^{k \times k}$. Observing that the orthogonal projection $(P_{u,t} r)(v)$ is given by
$$ (P_{u,t} r)(v) = \gamma_{u,t,0}^T f(v,t) = \int_0^1 r(u, x, t) f(x, t)^T\,dx \Big( \int_0^1 f(x, t) f(x, t)^T\,dx \Big)^{-1} f(v,t), $$
it follows by a similar calculation as in the proof of Theorem 3.1 that $\hat\sigma^2$ is a $\sqrt{n}$-consistent estimator of $\sigma^2$. Therefore we obtain an asymptotic level $\alpha$ test for the hypothesis (1.4) by rejecting $H_0$ if
$$ \frac{\sqrt{n-1}\, \hat M^2}{\hat\sigma} > u_{1-\alpha}, \qquad (3.6) $$
where $u_{1-\alpha}$ denotes the $(1-\alpha)$ quantile of the standard normal distribution.
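A sketch of this test, reusing `M2_hat_t` from the previous sketch, might look as follows. The double integral in $\hat\sigma^2$ factorizes into a product of two single integrals, which the code exploits; the standardization $\sqrt{n-1}\,\hat M^2 / \hat\sigma$ follows our reading of (3.6), and all names are illustrative.

```python
# Sketch of the level-alpha test (3.6) for the hypothesis (1.4).
import numpy as np
from statistics import NormalDist

def test_hypothesis_14(Y, u, t, f_list, alpha=0.05):
    n = len(t)
    F = np.stack([f(u[None, :], t[:, None]) for f in f_list])     # (k, n, n_u)
    S = Y[1:] - Y[:-1]                                            # rows are S_2, ..., S_n
    E = np.empty_like(S)                                          # S_i minus its projection
    for j in range(n - 1):
        Fj = F[:, j + 1].T                                        # f(., t_i) for S_i, (n_u, k)
        Aj = np.trapz(Fj[:, :, None] * Fj[:, None, :], u, axis=0) # A_i = integral of f f^T
        coef = np.linalg.solve(Aj, np.trapz(S[j][:, None] * Fj, u, axis=0))
        E[j] = S[j] - Fj @ coef
    # each summand of \hat sigma^2 factorizes into two one-dimensional integrals
    terms = (np.trapz(S[:-2] * E[2:], u, axis=1)
             * np.trapz(E[:-2] * S[2:], u, axis=1))
    sigma2_hat = terms.sum() / (4 * (n - 3))
    Tn = np.sqrt(n - 1) * M2_hat_t(Y, u, t, f_list) / np.sqrt(sigma2_hat)
    return Tn > NormalDist().inv_cdf(1 - alpha)                   # reject H0 in (1.4)?
```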
4 Testing homoscedasticity

In this section we address the problem of testing the hypothesis (1.5) of homoscedastic errors in the functional regression model (1.1). Motivated by the discussion in Sections 2 and 3 we propose the following measure of heteroscedasticity at a point $(u,v) \in [0,1]^2$:
$$ \tau^2(u,v) = \min_{a \in \mathbb{R}} \| r(\cdot, u, v) - a \|^2 = \int_0^1 r^2(t, u, v)\, h(t)\,dt - \Big( \int_0^1 r(t, u, v)\, h(t)\,dt \Big)^2. \qquad (4.1) $$
Note that $\tau^2(u,v) = 0$ a.e. if and only if the covariance function does not depend on $t$, that is, if the hypothesis (1.5) of homoscedasticity is valid. An estimator of the quantity $\int_0^1 r^2(t, u, v)\, h(t)\,dt$ in (4.1) has been proposed in (2.13), and for the second term we will use a similar estimate based on the statistic
$$ \tilde k(u,v) = \frac{1}{2(n-1)} \sum_{i=2}^n S_i(u) S_i(v), $$
where $S_i(u) = Y_i(u) - Y_{i-1}(u)$. We therefore obtain as an estimator of the process $\{\tau^2(u,v)\}_{u,v \in [0,1]}$
$$ \hat\tau_n^2(u,v) = \frac{1}{4(n-3)} \sum_{i=2}^{n-2} S_i(u) S_i(v) S_{i+2}(u) S_{i+2}(v) - \Big( \frac{1}{2(n-1)} \sum_{i=2}^n S_i(u) S_i(v) \Big)^2. $$
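Both building blocks are simple difference-based statistics, so the whole process can be evaluated on the $u$-grid with two matrix products; a minimal sketch (names illustrative):

```python
# Sketch of the heteroscedasticity process \hat tau_n^2(u, v) on the grid.
import numpy as np

def tau2_hat(Y):
    """Rows of Y are the curves Y_i; returns an (n_u, n_u) array."""
    n = len(Y)
    S = Y[1:] - Y[:-1]                        # S_i(u) = Y_i(u) - Y_{i-1}(u)
    P = S[:-2] * S[2:]                        # rows S_i(u) S_{i+2}(u)
    k_hat = P.T @ P / (4 * (n - 3))           # estimates the integral of r^2(t,u,v) h(t)
    k_tilde = S.T @ S / (2 * (n - 1))         # estimates the integral of r(t,u,v) h(t)
    return k_hat - k_tilde ** 2               # \hat tau_n^2(u, v)
```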
The asymptotic properties of this estimator are specified in the following result.

Theorem 4.1. Assume that the third and fourth moments
$$ d_1(t, u, v, w) = E[\varepsilon(u,t) \varepsilon(v,t) \varepsilon(w,t)], \qquad d_2(t, u, v, w, x) = E[\varepsilon(u,t) \varepsilon(v,t) \varepsilon(w,t) \varepsilon(x,t)] $$
of the error process $\varepsilon(u,t)$ exist and are elements of $\mathrm{Lip}^{\mathrm{unif}}_\gamma[0,1]$ in each argument. If the assumptions of Section 2 are satisfied we have as $n \to \infty$
$$ 4\sqrt{n}\, \big( \hat\tau_n^2(u,v) - \tau^2(u,v) \big) \Rightarrow G $$
in $C[0,1]^2$. Here $G$ is a centered Gaussian field on $[0,1]^2$ whose covariance structure under the null hypothesis of homoscedasticity is given by
$$ k((u_1, v_1), (u_2, v_2)) := \mathrm{Cov}(G(u_1, v_1), G(u_2, v_2)) = 6 D_2^{(2)}(u_1, v_1, u_2, v_2) - 12 D_2^{(r,1,1)}(u_1, v_1, u_2, v_2) + 8 D_2^{(r,1,1)}(u_1, u_2, v_1, v_2) + 8 D_2^{(r,1,1)}(u_1, v_2, v_1, u_2) $$
$$ + 6 J(u_1, v_1, u_2, v_2, u_1, v_1, u_2, v_2) + 4 J(u_1, u_2, v_1, v_2, u_1, u_2, v_1, v_2) + 4 J(u_1, v_2, v_1, u_2, u_1, v_2, v_1, u_2) - 8 J(u_1, v_1, u_2, v_2, u_1, u_2, v_1, v_2) $$
$$ - 8 J(u_1, v_2, u_2, v_2, u_1, v_2, v_1, u_2) + 8 J(u_1, v_1, u_2, v_2, u_1, v_2, v_1, u_2) + 2 D_1^{(r)}(u_1, u_2, v_1, v_2) + 2 D_1^{(r)}(u_1, v_2, v_1, u_2) + 2 D_1^{(r)}(v_1, u_2, u_1, v_2) + 2 D_1^{(r)}(v_1, v_2, u_1, u_2), $$
where the following notation has been used:
$$ D_2^{(2)}(u_1, v_1, u_2, v_2) = \int_0^1 d_2(t, u_1, v_1, u_2, v_2)^2\, h(t)\,dt, $$
$$ D_2^{(r,i,j)}(u_1, v_1, u_2, v_2) = r(u_1, v_1)^i\, r(u_2, v_2)^j \int_0^1 d_2(t, u_1, v_1, u_2, v_2)\, h(t)\,dt, $$
$$ J(u_1, v_1, u_2, v_2, u_3, v_3, u_4, v_4) = \int_0^1 \prod_{i=1}^4 r(t, u_i, v_i)\, h(t)\,dt, $$
$$ D_1^{(r)}(u_1, v_1, u_2, v_2) = r(u_1, v_1) \int_0^1 d_1(t, v_1, u_2, v_2)\, d_1(t, u_1, u_2, v_2)\, h(t)\,dt. $$

Proof of Theorem 4.1. The proof follows along similar lines as the proof of Theorem 2.1, establishing weak convergence of the finite dimensional distributions and tightness of the sequence $\{4\sqrt{n}(\hat\tau_n^2(u,v) - \tau^2(u,v))\}_{u,v \in [0,1]}$. For this reason only the main steps are indicated in the subsequent discussion. A careful inspection of the results in the proofs of Lemmas 6.2 and 6.3 in Dette et al. (1999) yields the following decomposition into a sum of 4-dependent random variables and a stochastic remainder of order $n^{-1/2}$:
$$ \hat\tau_n^2(u,v) - \tau^2(u,v) = \frac{1}{4(n-3)} \sum_{j=2}^{n-2} W_j(u,v) + o_P(n^{-1/2}) $$
(uniformly with respect to $(u,v)$), where
$$ W_j(u,v) = Z_j(u,v) \{ Z_{j+2}(u,v) + 4 \delta_j(u,v) \}, $$
$$ Z_j(u,v) = \Delta\varepsilon_{u,j-1,j}\, \Delta\varepsilon_{v,j-1,j} - E_j(u,v), $$
$$ E_j(u,v) = E[\Delta\varepsilon_{u,j-1,j}\, \Delta\varepsilon_{v,j-1,j}] = 2 r(t_j, u, v) + O(n^{-\gamma}), $$
$$ \delta_j(u,v) = r(t_j, u, v) - \frac{1}{n} \sum_{i=1}^n r(t_i, u, v). $$
A straightforward but tedious calculation shows that the covariance structure of the random variables $W_j(u,v)$ is given by
$$ \mathrm{Cov}(W_j(u_1, v_1), W_j(u_2, v_2)) = 4 \big( d_2(t_j, u_1, v_1, u_2, v_2) + r(t_j, u_1, v_1) r(t_j, u_2, v_2) + r(t_j, u_1, u_2) r(t_j, v_1, v_2) + r(t_j, u_1, v_2) r(t_j, v_1, u_2) \big)^2 $$
$$ + 16 \big( d_2(t_j, u_1, v_1, u_2, v_2) + r(t_j, u_1, v_1) r(t_j, u_2, v_2) + r(t_j, u_1, u_2) r(t_j, v_1, v_2) + r(t_j, u_1, v_2) r(t_j, v_1, u_2) \big) \big( 2 \delta_j(u_1, v_1) \delta_j(u_2, v_2) - r(t_j, u_1, v_1) r(t_j, u_2, v_2) \big) $$
$$ + 16 r^2(t_j, u_1, v_1) r^2(t_j, u_2, v_2) - 64\, r(t_j, u_1, v_1) r(t_j, u_2, v_2) \delta_j(u_1, v_1) \delta_j(u_2, v_2) + O(n^{-\gamma}), $$
$$ \mathrm{Cov}(W_j(u_1, v_1), W_{j+1}(u_2, v_2)) = d_2(t_j, u_1, v_1, u_2, v_2)^2 - 2 d_2(t_j, u_1, v_1, u_2, v_2) r(t_j, u_1, v_1) r(t_j, u_2, v_2) + r^2(t_j, u_1, v_1) r^2(t_j, u_2, v_2) $$
$$ + d_1(t_j, v_1, u_2, v_2) d_1(t_j, u_1, v_1, u_2) r(t_j, u_1, u_2) + d_1(t_j, v_1, u_2, v_2) d_1(t_j, u_1, v_1, u_2) r(t_j, u_1, v_2) + d_1(t_j, u_1, u_2, v_2) d_1(t_j, u_1, v_1, v_2) r(t_j, v_1, u_2) + d_1(t_j, u_1, u_2, v_2) d_1(t_j, u_1, v_1, u_2) r(t_j, v_1, v_2) $$
$$ - 8 \delta_j(u_2, v_2) d_1(t_j, u_1, v_1, v_2) d_1(t_j, u_1, v_1, u_2) + 16 \delta_j(u_1, v_1) \delta_j(u_2, v_2) \big( d_2(t_j, u_1, v_1, u_2, v_2) - r(t_j, u_1, v_1) r(t_j, u_2, v_2) \big) + O(n^{-\gamma}), $$
and $\mathrm{Cov}(W_j(u_1, v_1), W_i(u_2, v_2)) = 0$ for $|i - j| \ge 2$. The dominating sum
$$ A_n(u,v) = \frac{1}{4(n-3)} \sum_{j=2}^{n-2} W_j(u,v) $$
therefore has asymptotic covariance
$$ 16 n\, \mathrm{Cov}(A_n(u_1, v_1), A_n(u_2, v_2)) = \frac{1}{n} \sum_j \big[ \mathrm{Cov}(W_j(u_1, v_1), W_j(u_2, v_2)) + \mathrm{Cov}(W_j(u_1, v_1), W_{j+1}(u_2, v_2)) + \mathrm{Cov}(W_j(u_2, v_2), W_{j+1}(u_1, v_1)) \big] + o(1) = k((u_1, v_1), (u_2, v_2)) + o(1). $$
The last equality is obtained using the Lipschitz continuity of the regression functions. Finally, the validation of tightness follows along similar lines as in the proof of Theorem 2.1 by a tedious calculation of a corresponding moment condition for Gaussian fields [see e.g. Bickel and Wichura (1971)] and is therefore omitted. □

5 Finite sample properties

In this section we study the finite sample properties of the tests proposed in the previous sections. Our first example considers the linear hypothesis
$$ H_0 : m(u,t) = g(u, t, \beta(u)) = \beta(u)\, f(u,t), $$
where $f : [0,1]^2 \to \mathbb{R}$ is some given function and $\beta : [0,1] \to \mathbb{R}$ (i.e. $k = 1$). As discussed at the end of Section 2, under the null hypothesis $H_0$ the statistic $T_n$ defined in (2.14) converges weakly to a standard normal distribution, and we reject the hypothesis $H_0$ if the inequality (2.14) is satisfied. In order to study the approximation of the nominal level and the power of this asymptotic level $\alpha$ test, 5000 replications with different functions $f$ have been performed. The error terms $\varepsilon(u, t_i)$ are assumed to be i.i.d. Brownian motions, i.e. $r(t, u, v) = u \wedge v$, which implies that the model is homoscedastic. The results under the null hypothesis are presented in Table 1 for the functions
$$ f_1(u,t) = (-1 + 2u) + 2(1-u)t, \qquad (5.1) $$
$$ f_2(u,t) = (1 + u) \cos(2\pi t). \qquad (5.2) $$
It can be seen that the nominal level of the test is well approximated in most cases. For the function $f_1(u,t) = (-1 + 2u) + 2(1-u)t$ the approximation is very accurate for sample sizes larger than $n = 100$; for smaller sample sizes the level is either overestimated (if the nominal level is smaller than $\alpha = 0.1$) or underestimated (if the nominal level is larger than $\alpha = 0.1$).
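Tying the earlier sketches together, the empirical level of the test can be checked with a small Monte Carlo loop; this is a rough illustration using the hypothetical helpers defined above, not the paper's exact study (which uses 5000 replications).

```python
# Rough Monte Carlo estimate of the empirical level of the test (2.14).
import numpy as np

f1 = lambda u, t: (-1 + 2 * u) + 2 * (1 - u) * t     # regression function (5.1)
rejections, n_sim = 0, 500
for s in range(n_sim):
    u, t, Y = simulate_curves(f1, n=100, rng=s)      # H0 holds: m = f1, beta(u) = 1
    rejections += test_hypothesis_13(Y, u, t, [f1], alpha=0.05)
print("empirical level:", rejections / n_sim)        # should be close to 0.05
```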
In the case where we use the function $f_2(u,t) = (1+u)\cos(2\pi t)$ the level is underestimated, with a tendency towards better approximations for larger sample sizes. For the investigation of the power of the test we consider the functions $f_i$ defined in (5.1) and (5.2) with two additive alternatives, that is
$$ m(u,t) = f_i(u,t) + \tfrac{1}{2} \exp(t), \qquad (5.3) $$
$$ m(u,t) = f_i(u,t) + \sin(2\pi t), \qquad (5.4) $$
with $i = 1, 2$. The corresponding results are presented in Tables 2 and 3. We observe reasonable rejection probabilities for all sample sizes and both choices of $f_i$.

Note that for sample sizes $n = 25$ and $n = 50$ the approximation of the nominal level is less accurate. In these cases we propose a wild bootstrap procedure to obtain a more accurate test [see Wu (1986)]. For this purpose we denote by $\hat\beta(u)$ the (point-wise) ordinary least squares estimator of the function $\beta(u)$ and calculate the parametric residuals
$$ \hat\varepsilon(u, t_i) = Y_i(u) - \hat\beta(u)\, f_1(u, t_i) \qquad (5.5) $$
for $i = 1, \dots, n$ and $u \in [0,1]$. For $b = 1, \dots, B$ with $B \in \mathbb{N}$ let $v_i^{b*}$ be independent samples of a random variable $V$ with a two-point distribution on the set $\{-1, 1\}$, and define the bootstrap sample as
$$ Y_i^{b*}(u) = \hat\beta(u)\, f_1(u, t_i) + \varepsilon_i^{b*}(u), \qquad i = 1, \dots, n, \qquad (5.6) $$
where
$$ \varepsilon_i^{b*}(u) = v_i^{b*}\, \hat\varepsilon(u, t_i). \qquad (5.7) $$
For each $b \in \{1, \dots, B\}$ we calculate the statistic $T_n^{b*} = T_n(Y_1^{b*}(\cdot), \dots, Y_n^{b*}(\cdot))$, with $T_n$ as given in (2.14), and denote by
$$ H_{n,B}^*(x) = \frac{1}{B} \sum_{b=1}^B I\{ T_n^{b*} \le x \} $$
the empirical distribution function of $T_n^{1*}, \dots, T_n^{B*}$. We then use the $(1-\alpha)$-quantile of this distribution as the critical value for the test statistic $T_n = T_n(Y_1(\cdot), \dots, Y_n(\cdot))$.
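A sketch of this wild bootstrap scheme (5.5)–(5.7) for $k = 1$ follows, reading the two-point weights as random signs in $\{-1, +1\}$. Here `test_statistic` is a hypothetical helper returning the value of $T_n$ (a small variant of `test_hypothesis_13` above that returns $T_n$ instead of the decision); all names are illustrative.

```python
# Sketch of the wild bootstrap test of Section 5 for the linear hypothesis, k = 1.
import numpy as np

def wild_bootstrap_test(Y, u, t, f, B=200, alpha=0.05, rng=None):
    rng = np.random.default_rng(rng)
    F = f(u[None, :], t[:, None])                              # f(u, t_i), shape (n, n_u)
    beta_hat = np.sum(Y * F, axis=0) / np.sum(F * F, axis=0)   # point-wise OLS fit of beta(u)
    fit = beta_hat[None, :] * F
    resid = Y - fit                                            # residuals (5.5)
    Tn = test_statistic(Y, u, t, f)                            # hypothetical helper: value of T_n
    Tb = np.empty(B)
    for b in range(B):
        v = rng.choice([-1.0, 1.0], size=(len(t), 1))          # random signs v_i^{b*}
        Tb[b] = test_statistic(fit + v * resid, u, t, f)       # bootstrap sample (5.6), (5.7)
    return Tn > np.quantile(Tb, 1 - alpha)                     # compare with bootstrap quantile
```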
In our simulation study we performed 1000 replications of this procedure with $B = 200$ bootstrap samples; the corresponding results under the null hypothesis are presented in Table 4 for the sample sizes $n = 25$ and $n = 50$ and the regression functions (5.1) and (5.2). Compared to the test based on the normal approximation we observe a substantial improvement with respect to the approximation of the nominal level. In Tables 5 and 6 we show the simulated rejection probabilities of the wild bootstrap test for the alternatives (5.3) and (5.4), respectively. In all cases we obtain similar rejection probabilities as for the test defined in (2.14). Compared to the test based on the asymptotic distribution, a slight loss in power is observed in the case of the alternative $f_i(u,t) + \sin(2\pi t)$, while in the case of the exponential alternative we observe a negligible improvement for the majority of scenarios.

As a second example we study the finite sample properties of the test for the hypothesis
$$ H_0 : m(u,t) = \gamma(t)\, f(u,t), \qquad (5.8) $$
defined in Section 3, where again $f : [0,1]^2 \to \mathbb{R}$ is some given function and $\gamma : [0,1] \to \mathbb{R}$ (i.e. $k = 1$). The discussion at the end of Section 3 suggests rejecting the hypothesis $H_0$ if the inequality (3.6) is satisfied. We have investigated the finite sample properties of this test under the assumptions of the previous study for $f = f_1$ as given in (5.1). The normal approximation did not yield a sufficiently accurate approximation of the level for sample sizes up to $n = 500$, and for this reason these results are not depicted. As an alternative we propose a wild bootstrap approximation similar to the one given in the previous paragraph. More precisely, we calculate residuals analogously to (5.5) by
$$ \hat\varepsilon(u, t_i) = Y_i(u) - \hat\gamma(t_i)\, f_1(u, t_i) $$
for $i = 1, \dots, n$ and $u \in [0,1]$, where $\hat\gamma(t_i)$ denotes the least squares estimator of $\gamma(t_i)$. As in equations (5.6) and (5.7) we define $\varepsilon_i^{b*}(u) = v_i^{b*}\, \hat\varepsilon(u, t_i)$ and $Y_i^{b*}(u) = \hat\gamma(t_i)\, f_1(u, t_i) + \varepsilon_i^{b*}(u)$ ($i = 1, \dots, n$) to obtain a wild bootstrap sample. The results of the corresponding bootstrap test are shown in Table 7. We observe that the resampling procedure yields a test with a very accurate approximation of the nominal level and excellent power under the alternative $H_1 : m(u,t) = f_1(u,t) + \frac{1}{2}\exp(t)$.

Acknowledgements. The authors would like to thank Martina Stein, who typed parts of this manuscript with considerable technical expertise. This work has been supported in part by the Collaborative Research Center "Statistical modelling of nonlinear dynamic processes" (SFB 823) of the German Research Foundation.

6 Appendix: Proof of Proposition 2.3

We introduce the representation $S_i(u) = \Delta m_{u,i} + \Delta\varepsilon_{u,i}$ with
$$ \Delta m_{u,i} := m(u, t_i) - m(u, t_{i-1}), \qquad \Delta\varepsilon_{u,i} := \varepsilon(u, t_i) - \varepsilon(u, t_{i-1}), $$
and consider the following decomposition of the estimate $\hat k$:
$$ \hat k(u,v) = 2 T_{1n}(u,v) + T_{2n}(u,v) + 2 T_{3n}(u,v) + \tilde T_n(u,v), \qquad (6.1) $$
where
$$ T_{1n}(u,v) = \frac{1}{4(n-3)} \sum_{i=2}^{n-2} \Delta m_{u,i} \Delta m_{u,i+2} \{ \Delta m_{v,i+2} \Delta\varepsilon_{v,i} + \Delta m_{v,i} \Delta\varepsilon_{v,i+2} \}, $$
$$ T_{2n}(u,v) = \frac{1}{4(n-3)} \sum_{i=2}^{n-2} \big[ \Delta m_{u,i} \Delta m_{u,i+2} \Delta\varepsilon_{v,i} \Delta\varepsilon_{v,i+2} + \Delta m_{u,i} \Delta m_{v,i} \Delta\varepsilon_{u,i+2} \Delta\varepsilon_{v,i+2} + \Delta m_{u,i+2} \Delta m_{v,i+2} \Delta\varepsilon_{u,i} \Delta\varepsilon_{v,i} + \Delta m_{u,i+2} \Delta m_{v,i} \Delta\varepsilon_{u,i} \Delta\varepsilon_{v,i+2} \big], $$
$$ T_{3n}(u,v) = \frac{1}{4(n-3)} \sum_{i=2}^{n-2} \Delta\varepsilon_{v,i} \Delta\varepsilon_{v,i+2} \big[ \Delta m_{u,i} \Delta\varepsilon_{u,i+2} + \Delta m_{u,i+2} \Delta\varepsilon_{u,i} \big], $$
$$ \tilde T_n(u,v) = \frac{1}{4(n-3)} \sum_{i=2}^{n-2} \Delta\varepsilon_{u,i} \Delta\varepsilon_{u,i+2} \Delta\varepsilon_{v,i} \Delta\varepsilon_{v,i+2}. $$
We show that the first three terms of the decomposition (6.1) are asymptotically negligible; we exemplarily analyze the term $T_{1n}(u,v)$. We have
$$ T_{1n}(u,v) = T_{1n}^{(a)}(u,v) + T_{1n}^{(b)}(u,v) \qquad (6.2) $$
with
$$ T_{1n}^{(a)}(u,v) = \frac{1}{4(n-3)} \sum_{i=2}^{n-2} \Delta m_{u,i} \Delta m_{u,i+2} \Delta m_{v,i+2} \Delta\varepsilon_{v,i}, \qquad T_{1n}^{(b)}(u,v) = \frac{1}{4(n-3)} \sum_{i=2}^{n-2} \Delta m_{u,i} \Delta m_{u,i+2} \Delta m_{v,i} \Delta\varepsilon_{v,i+2}. $$
Both sums are centered, and for the variance of $T_{1n}^{(a)}(u,v)$ it follows that
$$ \mathrm{Var}(T_{1n}^{(a)}) = \frac{1}{16(n-3)^2} \sum_{i=2}^{n-2} \sum_{j=2}^{n-2} E\big[ \Delta m_{u,i} \Delta m_{u,i+2} \Delta m_{u,j} \Delta m_{u,j+2} \Delta m_{v,j+2} \Delta m_{v,i+2} \Delta\varepsilon_{v,i} \Delta\varepsilon_{v,j} \big]. $$
Note that this sum is dominated by those expectations corresponding to the indices with $i = j$, $i = j+1$ or $j = i+1$. We exemplarily treat the case $i = j$. Using the Lipschitz continuity of the function $m$ it follows that
$$ |\Delta m_{u,i}| \le C \max_{2 \le i \le n} |t_i - t_{i-1}|^\gamma = O(n^{-\gamma}) \qquad (6.3) $$
uniformly with respect to $i = 2, \dots, n$, and this estimate yields
$$ E\big[ \Delta m_{u,i}^2 \Delta m_{u,i+2}^2 \Delta m_{v,i+2}^2 \Delta\varepsilon_{v,i}^2 \big] = O(n^{-6\gamma}). $$
Consequently, by Markov's inequality we obtain (uniformly with respect to $u$ and $v$) $T_{1n}^{(a)}(u,v) = O_P(n^{-3\gamma})$. The term $T_{1n}^{(b)}(u,v)$ in (6.2) is treated similarly, which implies $T_{1n}(u,v) = o_P(n^{-1/2})$. Similar arguments for the statistics $T_{2n}(u,v)$ and $T_{3n}(u,v)$ in (6.1) give
$$ \hat k(u,v) = \tilde T_n(u,v) + o_P(n^{-1/2}). $$
For the investigation of the remaining (dominating) term $\tilde T_n(u,v)$ we note that $\Delta\varepsilon_{\cdot,i}$ and $\Delta\varepsilon_{\cdot,j}$ are independent whenever $|i - j| \ge 2$, which yields
$$ E[\Delta\varepsilon_{u,i} \Delta\varepsilon_{u,i+2} \Delta\varepsilon_{v,i} \Delta\varepsilon_{v,i+2}] = E[\varepsilon(u,t_i)\varepsilon(v,t_i) + \varepsilon(u,t_{i-1})\varepsilon(v,t_{i-1})]\; E[\varepsilon(u,t_{i+2})\varepsilon(v,t_{i+2}) + \varepsilon(u,t_{i+1})\varepsilon(v,t_{i+1})] $$
$$ = [r(t_i, u, v) + r(t_{i-1}, u, v)][r(t_{i+2}, u, v) + r(t_{i+1}, u, v)] = 4\, r(t_i, u, v)\, r(t_{i+2}, u, v) + O(n^{-\gamma}) $$
by the Lipschitz continuity of the covariance function $r$. Observing the definition of $\tilde T_n(u,v)$, this gives
$$ E[\tilde T_n(u,v)] = \frac{1}{n-3} \sum_{i=2}^{n-2} r(t_i, u, v)\, r(t_{i+2}, u, v) + O(n^{-\gamma}) = k(u,v) + o(n^{-1/2}). $$
A similar calculation shows that the variance of $\tilde T_n$ is of order $O(n^{-1})$, which yields the assertion of Proposition 2.3. □
              n     Mean      Var     0.15     0.1      0.05     0.01
  f1(u,t)    25   -0.2484   1.4008   0.1192   0.0920   0.0664   0.0334
             50   -0.1359   1.2099   0.1336   0.1022   0.0632   0.0262
            100   -0.0975   1.0773   0.1328   0.0954   0.0544   0.0202
            200   -0.0290   1.0516   0.1464   0.1062   0.0674   0.0208
            500   -0.0373   1.0537   0.1514   0.1064   0.0578   0.0170
  f2(u,t)    25   -1.28     1.4253   0.0382   0.0264   0.0152   0.0056
             50   -0.6477   1.2543   0.0726   0.0538   0.0372   0.0146
            100   -0.3862   1.1260   0.0886   0.0676   0.0434   0.0164
            200   -0.2797   1.0379   0.1014   0.0718   0.0402   0.0102
            500   -0.1455   1.0267   0.1226   0.0802   0.0444   0.0134

Table 1: Simulated rejection probabilities of the test (2.14) under the null hypothesis H0 : m(u,t) = fi(u,t), i = 1, 2, where the regression functions f1 and f2 are given in (5.1) and (5.2), respectively.

              n     Mean      Var     0.15     0.1      0.05     0.01
  f1(u,t)    25    2.52      4.32    0.764    0.705    0.615    0.480
             50    3.72      3.92    0.936    0.914    0.874    0.740
            100    5.13      3.68    0.997    0.995    0.990    0.955
            200    7.18      3.47    1        1        1        1
            500   11.39      3.60    1        1        1        1
  f2(u,t)    25    7.99     23.49    0.981    0.977    0.970    0.939
             50   13.31     25.02    1        1        1        1
            100   19.49     26.67    1        1        1        1
            200   27.52     27.33    1        1        1        1
            500   44.45     29.01    1        1        1        1

Table 2: Simulated rejection probabilities of the test (2.14) under the alternative H1 : m(u,t) = fi(u,t) + 1/2 exp(t), i = 1, 2, where the regression functions f1 and f2 are given in (5.1) and (5.2), respectively.

              n     Mean      Var      0.15     0.1      0.05     0.01
  f1(u,t)    25   11.48     33.036    1        1        0.9996   0.9984
             50   15.47     27.2397   1        1        1        1
            100   21.09     24.1177   1        1        1        1
            200   29.52     23.459    1        1        1        1
            500   45.92     22.6557   1        1        1        1
  f2(u,t)    25    5.047    11.9046   0.9398   0.9174   0.0876   0.7932
             50    8.645    14.3088   0.9988   0.9982   0.9966   0.9866
            100   12.51     14.4536   1        1        1        1
            200   17.69     13.7865   1        1        1        1
            500   27.69     13.3727   1        1        1        1

Table 3: Simulated rejection probabilities of the test (2.14) under the alternative H1 : m(u,t) = fi(u,t) + sin(2πt), i = 1, 2, where the regression functions f1 and f2 are given in (5.1) and (5.2), respectively.

              n    0.15     0.1      0.05     0.01
  f1(u,t)    25   0.15     0.108    0.055    0.020
             50   0.15     0.101    0.058    0.016
  f2(u,t)    25   0.158    0.108    0.057    0.020
             50   0.154    0.095    0.051    0.013

Table 4: Simulated rejection probabilities of the bootstrap test under the null hypothesis H0 : m(u,t) = fi(u,t), i = 1, 2, where the regression functions f1 and f2 are given in (5.1) and (5.2), respectively.

              n    0.15     0.1      0.05     0.01
  f1(u,t)    25   0.856    0.792    0.640    0.440
             50   0.956    0.942    0.908    0.756
  f2(u,t)    25   0.996    0.990    0.980    0.926
             50   1        1        1        1

Table 5: Simulated rejection probabilities of the bootstrap test under the alternative H1 : m(u,t) = fi(u,t) + 1/2 exp(t), i = 1, 2, where the regression functions f1 and f2 are given in (5.1) and (5.2), respectively.

              n    0.15     0.1      0.05     0.01
  f1(u,t)    25   0.957    0.922    0.857    0.673
             50   0.998    0.996    0.986    0.918
  f2(u,t)    25   0.988    0.977    0.952    0.797
             50   0.999    0.998    0.997    0.985

Table 6: Simulated rejection probabilities of the bootstrap test under the alternative H1 : m(u,t) = fi(u,t) + sin(2πt), i = 1, 2, where the regression functions f1 and f2 are given in (5.1) and (5.2), respectively.

              n    0.15     0.1      0.05     0.01
  H0         25   0.148    0.106    0.062    0.022
             50   0.152    0.104    0.048    0.010
            100   0.160    0.104    0.052    0.018
            200   0.148    0.116    0.062    0.016
            500   0.150    0.104    0.06     0.012
  H1         25   0.978    0.954    0.910    0.792
             50   0.996    0.992    0.980    0.940
            100   1        1        1        0.998
            200   1        1        1        1
            500   1        1        1        1

Table 7: Simulated rejection probabilities of the bootstrap test for the hypothesis (5.8). Under H0 : m(u,t) = f1(u,t); under H1 : m(u,t) = f1(u,t) + 1/2 exp(t).

References

Achieser, N. I. (1956). Theory of Approximation. Frederick Ungar Publishing Co., New York.

Besse, P. and Ramsay, J. O. (1986). Principal components of sampled functions. Psychometrika, 51:285–311.
Bickel, P. and Wichura, M. J. (1971). Convergence criteria for multiparameter stochastic processes and some applications. Annals of Mathematical Statistics, 42:1656–1670.

Billingsley, P. (1968). Convergence of Probability Measures. Wiley, New York.

Brodeau, F. (1993). Tests for the choice of approximative models in nonlinear regression when the variance is unknown. Statistics, 24(2):95–106.

Cardot, H., Ferraty, F., Mas, A., and Sarda, P. (2003). Testing hypotheses in the functional linear model. Scandinavian Journal of Statistics, 30:241–255.

Cardot, H., Goia, A., and Sarda, P. (2004). Testing for no effect in functional linear regression models, some computational approaches. Communications in Statistics - Simulation and Computation, 33:179–199.

Cuevas, A., Febrero, M., and Fraiman, R. (2002). Linear functional regression: the case of fixed design and functional response. Canadian Journal of Statistics, 30:285–300.

Dette, H., Munk, A., and Wagner, T. (1999). Testing model assumptions in multivariate linear regression models. Journal of Nonparametric Statistics, 12:309–342.

Escabias, M., Aguilera, A. M., and Valderrama, M. J. (2004). Principal component estimation of functional logistic regression: discussion of two different approaches. Journal of Nonparametric Statistics, 16:365–384.

Faraway, J. J. (1997). Regression analysis for a functional response. Technometrics, 39:254–261.

Ferraty, F. and Vieu, P. (2003). Curves discrimination: a nonparametric functional approach. Computational Statistics and Data Analysis, 44:161–173.

Ferraty, F. and Vieu, P. (2006). Nonparametric Functional Data Analysis: Theory and Practice. Springer, New York.

Gallant, A. R. (1987). Nonlinear Statistical Models. Wiley, New York.

Hlubinka, D. and Prchal, L. (2007). Changes in atmospheric radiation from the statistical point of view. Computational Statistics and Data Analysis, 51:4926–4941.

Kneip, A. and Utikal, K. (2001). Time trends in the joint distribution of income and age. In Economic Essays, A Festschrift for Werner Hildenbrand. Springer Verlag.

Kokoszka, P., Maslova, I., Sojka, J., and Zhu, L. (2008). Testing for lack of dependence in the functional linear model. The Canadian Journal of Statistics, 36:207–222.

Mas, A. (2007). Testing for the mean of random curves: a penalization approach. Statistical Inference for Stochastic Processes, 10:147–163.

Müller, H. G. and Stadtmüller, U. (2005). Generalized functional linear models. Annals of Statistics, 33:774–805.

Orey, S. (1958). A central limit theorem for m-dependent random variables. Duke Mathematical Journal, 25:543–546.

Ramsay, J. O. and Dalzell, C. J. (1991). Some tools for functional data analysis (with discussion). Journal of the Royal Statistical Society, Series B, 53:539–572.

Ramsay, J. O. and Silverman, B. W. (2005). Functional Data Analysis, 2nd ed. Springer, New York.

Rice, J. (1984). Bandwidth choice for nonparametric regression. Annals of Statistics, 12:1215–1230.

Sacks, J. and Ylvisaker, D. (1970). Designs for regression problems with correlated errors III. Annals of Mathematical Statistics, 41:2057–2074.

Seber, G. A. F. and Wild, C. J. (1989). Nonlinear Regression. John Wiley and Sons Inc., New York.

Shen, Q. and Faraway, J. (2004). An F test for linear models with functional responses. Statistica Sinica, 14:1239–1257.

Wu, C. F. J. (1986). Jackknife, bootstrap and other resampling methods in regression analysis. Annals of Statistics, 14:1261–1295.

Yang, X., Shen, Q., Xu, H., and Shoptaw, S. (2007). Functional regression analysis using an F test for longitudinal data with large numbers of repeated measures.
Statistics in Medicine, 26:1552–1566.