SFB 823 Discussion Paper Nr. 27/2009

Testing model assumptions in functional regression models

Axel Bücher, Holger Dette, Gabi Wieczorek
Ruhr-Universität Bochum
Fakultät für Mathematik
44780 Bochum, Germany
e-mail: axel.buecher@ruhr-uni-bochum.de
e-mail: holger.dette@ruhr-uni-bochum.de

September 28, 2009

Abstract

In the functional regression model where the responses are curves, new tests for the functional form of the regression and the variance function are proposed, which are based on a stochastic process estimating $L^2$-distances. Our approach avoids the explicit estimation of the functional regression, and it is shown that normalized versions of the proposed test statistics converge weakly. The finite sample properties of the tests are illustrated by means of a small simulation study. It is also demonstrated that for small samples bootstrap versions of the tests improve the quality of the approximation of the nominal level.

Keywords and Phrases: goodness-of-fit tests, functional data, parametric bootstrap, tests for heteroscedasticity

AMS Subject Classification: 62G10

1 Introduction

Since the pioneering work by Ramsay and Dalzell (1991) on regression analysis for functional data this topic has received considerable attention in the recent literature. The interest in statistical techniques that take the functional nature of the data into account stems from the fact that nowadays, in many applications (for instance in climatology, remote sensing, linguistics, ...), the data come from the observation of a continuous phenomenon over time. For a review of the statistical problems and techniques for functional data we refer to the monographs of Ramsay and Silverman (2005) and Ferraty and Vieu (2006). In these models either predictors or responses can be viewed as random functions. Such data typically appear when the value of a variable is repeatedly recorded on a dense grid of time points for a sample of subjects. While many authors consider the problem of estimating the regression, or of generalizing classical concepts of multivariate statistics such as principal component or discrimination analysis to the situation where the data are curves [see for example Besse and Ramsay (1986), Faraway (1997), Kneip and Utikal (2001), Cuevas et al. (2002), Ferraty and Vieu (2003), Escabias et al. (2004) or Müller and Stadtmüller (2005) among many others], much less attention has been paid to the problem of testing model assumptions when analyzing functional data. Several authors have discussed the problem of testing hypotheses in a linear functional data model. For example, Cardot et al. (2003), Müller and Stadtmüller (2005) and Cardot et al. (2004) considered the problem of testing a simple hypothesis in the case where the response is real and the predictor is a random function, while Mas (2007) investigated a test for the mean of random curves. Recently Shen and Faraway (2004) and Yang et al. (2007) discussed an $F$-test in a linear longitudinal data model, while Kokoszka et al. (2008) tested for lack of dependence in a functional linear model where both response and predictor are curves.

The present work considers the problem of testing model assumptions in the nonparametric functional regression model
$$ Y_i(u) = m(u, t_i) + \varepsilon(u, t_i), \qquad t_i \in [0,1],\ i = 1, \dots, n, \qquad (1.1) $$
where $u$ varies (without loss of generality) in the interval $[0,1]$.
Our main concern is the problem of validating a parametric assumption of the form
$$ Y_i(u) = g(u, t_i, \beta) + \varepsilon(u, t_i), \qquad t_i \in [0,1],\ i = 1, \dots, n, \qquad (1.2) $$
where $g$ is a given parametric functional regression model and $\beta : [0,1] \to \mathbb{R}^k$ denotes a function which depends either on the variable $u$ or on $t$ (note that the two cases correspond to different parametric models for the functional data). The latter model has been considered in the linear context by numerous authors. In particular, Shen and Faraway (2004) and Yang et al. (2007) proposed generalizations of the $F$-test for the model $Y(u) = x^T \beta(u) + \varepsilon(u)$ and used these methods to analyze data from ergonomics. While in that work the predictor $x$ is discrete (as in the classical ANOVA model), we concentrate in the present paper on the case where the variable $t$ in (1.2) varies in a continuous way. Our work is inspired by the recent paper of Hlubinka and Prchal (2007), who proposed a functional regression model of the form (1.2) to study the time variation of vertical atmospheric radiation profiles. These authors assumed that the parameter (function) $\beta$ depends on the value $t$.

In Section 2 we introduce some notation and propose a test for the hypothesis that the regression function in the nonparametric functional regression model (1.1) is of a specific parametric form as given in (1.2) with a function $\beta$ depending on the variable $u$, that is
$$ H_0 : m(u,t) = g(u, t, \beta(u)) \qquad (1.3) $$
for some parametric function $g$ and a parameter $\beta : [0,1] \to \mathbb{R}^k$. The case where the parameter depends on $t$ is investigated in Section 3, where we consider the hypothesis
$$ H_0 : m(u,t) = h(u, t, \gamma(t)) \qquad (1.4) $$
for a parametric function $h$ and some function $\gamma : [0,1] \to \mathbb{R}^k$. Finally, in Section 4 we discuss the problem of testing parametric assumptions regarding the second order properties of the process $Y(u)$. More precisely, if $r(t, u, v) = \mathrm{Cov}(\varepsilon(u,t), \varepsilon(v,t))$ denotes the covariance of the observations $Y(u)$ and $Y(v)$, we are interested in the hypothesis
$$ H_0 : r(t, u, v) = r(u, v), \qquad (1.5) $$
which corresponds to the case of homoscedasticity. Note that this assumption is necessary for the application of the $F$-tests proposed by Shen and Faraway (2004) and Yang et al. (2007). Moreover, this assumption was also made by Hlubinka and Prchal (2007), who proposed a nonlinear functional regression model for the analysis of changes in atmospheric radiation.

The proposed tests for the hypotheses (1.3), (1.4) and (1.5) are very simple and are based on stochastic processes of empirical $L^2$-distances between the nonparametric and the parametric functional regression model. We prove weak convergence of these processes under the null hypothesis and fixed alternatives and, as a consequence, asymptotic normality of functionals of these processes. In Section 5 we demonstrate by means of a simulation study that for moderate sample sizes the quantiles of the asymptotic distribution provide a rather accurate approximation of the nominal level. For small sample sizes a wild bootstrap version of the test is proposed and its accuracy is also investigated. Finally, some technical details are given in the Appendix.

2 A process of empirical $L^2$-distances for testing (1.3)

Consider the nonparametric functional regression model defined by (1.1) and assume that $n$ independent observations are available at distinct points $0 \le t_1 < \cdots < t_n \le 1$.
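To fix ideas, the following sketch (not part of the original paper) illustrates the data structure behind model (1.1): each observation is a curve recorded on a grid of $u$-values at a design point $t_i$. It assumes a uniform design ($h \equiv 1$) and i.i.d. Brownian-motion errors, the error model later used in the simulations of Section 5; all function and variable names are our own.

```python
# Minimal sketch: simulate curves from model (1.1) on a discretized u-grid.
import numpy as np

def simulate_curves(m, n=100, n_u=101, rng=None):
    """Return u-grid, design points and an (n, n_u) array with rows
    Y_i(u) = m(u, t_i) + eps(u, t_i)."""
    rng = np.random.default_rng(rng)
    u = np.linspace(0.0, 1.0, n_u)          # grid on which each curve is recorded
    t = (np.arange(1, n + 1) - 0.5) / n     # uniform design, i.e. h = 1 in (2.1)
    # Brownian motion over u: cumulative sums of independent Gaussian increments,
    # so that Cov(eps(u, t_i), eps(v, t_i)) = u ∧ v, independently over i.
    du = np.diff(u, prepend=0.0)
    eps = np.cumsum(rng.normal(size=(n, n_u)) * np.sqrt(du), axis=1)
    return u, t, m(u[None, :], t[:, None]) + eps

# Example: the linear model (5.1), m(u, t) = (-1 + 2u) + 2(1 - u)t.
u, t, Y = simulate_curves(lambda u, t: (-1 + 2 * u) + 2 * (1 - u) * t)
```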
For the discussion of the asymptotic properties of the tests proposed in this paper we will assume that the design points $t_1, \dots, t_n$ satisfy
$$ \max_{i=2,\dots,n} \Big| \int_{t_{i-1}}^{t_i} h(t)\,dt - \frac{1}{n} \Big| = o(n^{-(1+\gamma)}), \qquad (2.1) $$
where $h \in \mathrm{Lip}_\gamma[0,1]$ is a strictly positive (unknown) density on the interval $[0,1]$ which is Lipschitz continuous of order $\gamma > 1/2$ [see Sacks and Ylvisaker (1970)].

For the construction of a test for the hypothesis (1.3) of a parametric functional regression model we consider the class of parametric models
$$ \mathcal{M} = \{ g(\cdot, \cdot, \beta(\cdot)) : [0,1] \times [0,1] \to \mathbb{R} \mid \beta : [0,1] \to \Theta \}, $$
where $\Theta$ is some subset of $\mathbb{R}^k$. For the sake of transparency we first discuss the case of testing the hypothesis of a linear functional regression model, that is
$$ H_0 : m(u,t) = g(u, t, \beta(u)) = \beta(u)^T f(u,t), \qquad (2.2) $$
where $f(u,t) = (f_1(u,t), \dots, f_k(u,t))^T$ is a vector of given regression functions. For fixed $u \in [0,1]$ we define the inner product
$$ \langle p, q \rangle_u = \int p(u,t)\, q(u,t)\, h(t)\,dt $$
on the space of functions defined on the unit square $[0,1]^2$, with corresponding norm $\|\cdot\|_u$, and consider
$$ M_u^2 = \inf_{\beta(u)} \| m(u, \cdot) - \beta(u)^T f(u, \cdot) \|_u^2 $$
as the minimal distance from $m$ to functions of the form (2.2). A standard result from Hilbert space theory [see Achieser (1956)] yields that $M_u^2$ can be expressed as a ratio of two Gramian determinants, i.e.
$$ M_u^2 = \frac{\Gamma_u(m, f_1, \dots, f_k)}{\Gamma_u(f_1, \dots, f_k)}, $$
where $\Gamma_u(p_1, \dots, p_k) = \det\big( \langle p_i, p_j \rangle_u \big)_{i,j=1,\dots,k}$ is the Gramian determinant of the functions $p_1, \dots, p_k$. In order to obtain an estimator for $M_u^2$ we replace the inner products $A_{u,0} = \langle m, m \rangle_u$, $A_{u,p} = \langle m, f_p \rangle_u$ and $B_{u,p,q} = \langle f_p, f_q \rangle_u$ by their empirical counterparts
$$ \hat A_{u,0} = \frac{1}{n-1} \sum_{i=2}^n Y_i(u) Y_{i-1}(u), \qquad \hat A_{u,p} = \frac{1}{n} \sum_{i=1}^n Y_i(u) f_p(u, t_i), \qquad \hat B_{u,p,q} = \frac{1}{n} \sum_{i=1}^n f_p(u, t_i) f_q(u, t_i), $$
where $p, q = 1, \dots, k$. (Note that $\hat A_{u,0}$ uses the lag-one product $Y_i(u) Y_{i-1}(u)$ rather than the squared observations $Y_i^2(u)$, which would be biased by the error variance.) This yields the canonical estimate
$$ \hat M_u^2 = \det \begin{pmatrix} \hat A_{u,0} & \hat A_{u,1} & \cdots & \hat A_{u,k} \\ \hat A_{u,1} & \hat B_{u,1,1} & \cdots & \hat B_{u,1,k} \\ \vdots & \vdots & \ddots & \vdots \\ \hat A_{u,k} & \hat B_{u,k,1} & \cdots & \hat B_{u,k,k} \end{pmatrix} \Big/ \det \begin{pmatrix} \hat B_{u,1,1} & \cdots & \hat B_{u,1,k} \\ \vdots & \ddots & \vdots \\ \hat B_{u,k,1} & \cdots & \hat B_{u,k,k} \end{pmatrix} \qquad (2.3) $$
of the $L^2$-distance $M_u^2$. In the following discussion we will study the asymptotic properties of the process $\{\hat M_u^2\}_{u \in [0,1]}$.
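A direct numpy implementation of the estimator (2.3) might look as follows; `Y`, `u` and `t` are as in the simulation sketch above, `f_list` holds the regression functions $f_1, \dots, f_k$ as callables, and all names are illustrative rather than part of the paper.

```python
# Sketch of the estimator (2.3), evaluated on the whole u-grid.
import numpy as np

def M2_hat(Y, u, t, f_list):
    """Return \hat M_u^2 of (2.3) at every grid point u."""
    n, k = len(t), len(f_list)
    F = np.stack([f(u[None, :], t[:, None]) for f in f_list])  # (k, n, n_u)
    A0 = np.sum(Y[1:] * Y[:-1], axis=0) / (n - 1)              # \hat A_{u,0}
    A = np.einsum('pij,ij->pj', F, Y) / n                      # \hat A_{u,p}, (k, n_u)
    B = np.einsum('pij,qij->pqj', F, F) / n                    # \hat B_{u,p,q}, (k, k, n_u)
    out = np.empty_like(u)
    for j in range(len(u)):                                    # ratio of Gramian determinants
        G = np.empty((k + 1, k + 1))
        G[0, 0], G[0, 1:], G[1:, 0], G[1:, 1:] = A0[j], A[:, j], A[:, j], B[:, :, j]
        out[j] = np.linalg.det(G) / np.linalg.det(B[:, :, j])
    return out
```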
Denote by
$$ \mathrm{Lip}^{\mathrm{unif}}_\gamma[0,1] = \{ f = f(x, \cdot) : |f(x,t) - f(x,s)| \le C |s - t|^\gamma;\ s, t \in [0,1] \} \qquad (2.4) $$
the set of all functions $f : [0,1] \times [0,1] \to \mathbb{R}$ satisfying a uniform Lipschitz condition (in other words, the constant $C$ in (2.4) does not depend on $x$), and assume that for some $\gamma > 1/2$ and for all $(t, u, v) \in [0,1]^3$
$$ f_j(u, \cdot),\, f_j(\cdot, t) \in \mathrm{Lip}^{\mathrm{unif}}_\gamma[0,1], \quad j = 1, \dots, k, $$
$$ m(u, \cdot),\, m(\cdot, t) \in \mathrm{Lip}^{\mathrm{unif}}_\gamma[0,1], $$
$$ r(\cdot, u, v),\, r(t, \cdot, v),\, r(t, u, \cdot) \in \mathrm{Lip}^{\mathrm{unif}}_\gamma[0,1], $$
where
$$ r(t, u, v) = E[\varepsilon(u,t)\, \varepsilon(v,t)] $$
denotes the covariance of the (centered) errors at the point $t$. The following result specifies the asymptotic properties of the stochastic process $\{\hat M_u^2 - M_u^2\}_{u \in [0,1]}$. Throughout this paper the symbol $\Rightarrow$ denotes weak convergence.

Theorem 2.1. If the assumptions stated in this section are satisfied and the linear hypothesis (2.2) is tested, we have as $n \to \infty$
$$ \sqrt{n}\, (\hat M_u^2 - M_u^2) \Rightarrow G $$
in $C[0,1]$, where $G$ is a centered Gaussian process with covariance $k = k(u,v)$ given by
$$ k(u,v) = \int r^2(t, u, v) h(t)\,dt + 4 \int r(t, u, v) \big(m(u,t) - g(u, t, \beta_0(u))\big) \big(m(v,t) - g(v, t, \beta_0(v))\big) h(t)\,dt \qquad (2.5) $$
and
$$ \beta_0(u) = \mathop{\mathrm{argmin}}_\beta \| m(u, \cdot) - g(u, \cdot, \beta) \|_u^2 \qquad (2.6) $$
corresponds to the parameter of the best approximation of the function $m(u, \cdot)$ by the parametric regression model.

Proof of Theorem 2.1. We assume without loss of generality that the functions $f_1(u, \cdot), \dots, f_k(u, \cdot)$ are orthonormal with respect to the inner product $\langle p, q \rangle_u$. Then the minimal $L^2$-distance obtained by the best approximation simplifies to
$$ M_u^2 = A_{u,0} - \sum_{p=1}^k A_{u,p}^2. $$
It is easy to see that the statistics $\hat A_{u,p}$ and $\hat B_{u,p,q}$ are $\sqrt{n}$-consistent estimates of the quantities $\langle m, f_p \rangle_u$ and $\langle f_p, f_q \rangle_u = \delta_{p,q}$, respectively, and consequently we obtain for the statistic in (2.3)
$$ T_n(u) := \sqrt{n}\, (\hat M_u^2 - M_u^2) = \sqrt{n} \Big\{ \hat A_{u,0} - \sum_{p=1}^k \hat A_{u,p}^2 - M_u^2 \Big\} + o_P(1) = \bar T_n(u) + o_P(1) $$
uniformly with respect to $u \in [0,1]$, where the last equality defines the process $\bar T_n(u)$ in an obvious manner. For the proof of weak convergence we have to show

(i) convergence of the finite dimensional distributions, i.e. $(\bar T_n(u_1), \dots, \bar T_n(u_m)) \xrightarrow{D} (G(u_1), \dots, G(u_m))$ for all $u_1, \dots, u_m \in [0,1]$, $m \in \mathbb{N}$;

(ii) tightness of the sequence $(\bar T_n)_{n \in \mathbb{N}}$.

The convergence of the finite dimensional distributions follows from Theorem 2.1 and its proof in Dette et al. (1999). For a proof of tightness we use the decomposition $\bar T_n(u) = U_n(u) + V_n(u)$ with
$$ U_n(u) = \sqrt{n} \Big( \hat A_{u,0} - E\hat A_{u,0} - \sum_{p=1}^k \big( \hat A_{u,p}^2 - E\hat A_{u,p}^2 \big) \Big), $$
$$ V_n(u) = \sqrt{n} \Big( E\hat A_{u,0} - \langle m, m \rangle_u - \sum_{p=1}^k \big( E\hat A_{u,p}^2 - \langle m, f_p \rangle_u^2 \big) \Big). $$
Consequently, it is sufficient to show that the (deterministic) sequence $V_n(u)$ converges uniformly to 0, i.e.
$$ \sup_{u \in [0,1]} |V_n(u)| = o(1), \qquad (2.7) $$
and that the process $\{U_n(u)\}_{u \in [0,1]}$ is tight. For this purpose we use Theorem 12.3 from Billingsley (1968) and show that there are constants $\alpha > 0$, $\gamma \ge 0$ and a nondecreasing, continuous function $F$ on $[0,1]$ such that
$$ E\big[ |U_n(u) - U_n(v)|^\gamma \big] \le |F(u) - F(v)|^\alpha. \qquad (2.8) $$

We first prove (2.7) and introduce the decomposition
$$ V_n(u) = \sqrt{n} \Big( V_{n0}(u) - \sum_{p=1}^k V_{np}(u) \Big) $$
with $V_{n0}(u) = E\hat A_{u,0} - \langle m, m \rangle_u$ and $V_{np}(u) = E\hat A_{u,p}^2 - \langle m, f_p \rangle_u^2$. Assertion (2.7) follows from
$$ \sup_{u \in [0,1]} |V_{np}(u)| = o(n^{-1/2}), \qquad p = 0, \dots, k. \qquad (2.9) $$
We exemplarily consider the first summand $V_{n0}(u)$, which can be represented as $V_{n0}(u) = A_1(u) + A_2(u) + A_3(u)$ with
$$ A_1(u) = \frac{1}{n-1} \sum_{i=1}^n m^2(u, t_i) - \langle m, m \rangle_u, $$
$$ A_2(u) = - \frac{1}{n-1} \sum_{i=2}^n m(u, t_i) \big( m(u, t_i) - m(u, t_{i-1}) \big), $$
$$ A_3(u) = - \frac{1}{n-1}\, m^2(u, t_1). $$
Using the fact that $m^2(u, \cdot) \in \mathrm{Lip}^{\mathrm{unif}}_\gamma[0,1]$ and taking into account that
$$ \max_{i=2,\dots,n} |t_i - t_{i-1}|^\gamma = O(n^{-\gamma}) = o(n^{-1/2}) $$
by (2.1), we obtain that all terms are of order $o(n^{-1/2})$, uniformly with respect to $u \in [0,1]$. This proves (2.9) for $p = 0$, and similar arguments for the remaining terms show that (2.7) holds.

In order to show that condition (2.8) is valid we calculate
$$ E\big[ (U_n(u) - U_n(v))^2 \big] = n\, (B_1 + B_2 + B_3 + B_4), $$
where
$$ B_1 = \mathrm{Var}(\hat A_{u,0}) + \mathrm{Var}(\hat A_{v,0}) - 2\, \mathrm{Cov}(\hat A_{u,0}, \hat A_{v,0}), $$
$$ B_2 = \mathrm{Var}\Big( \sum_{p=1}^k \hat A_{u,p}^2 \Big) + \mathrm{Var}\Big( \sum_{p=1}^k \hat A_{v,p}^2 \Big) - 2\, \mathrm{Cov}\Big( \sum_{p=1}^k \hat A_{u,p}^2, \sum_{p=1}^k \hat A_{v,p}^2 \Big), $$
$$ B_3 = 2\, \mathrm{Cov}\Big( \hat A_{u,0}, \sum_{p=1}^k \hat A_{v,p}^2 \Big) - 2\, \mathrm{Cov}\Big( \hat A_{u,0}, \sum_{p=1}^k \hat A_{u,p}^2 \Big), $$
$$ B_4 = 2\, \mathrm{Cov}\Big( \sum_{p=1}^k \hat A_{u,p}^2, \hat A_{v,0} \Big) - 2\, \mathrm{Cov}\Big( \hat A_{v,0}, \sum_{p=1}^k \hat A_{v,p}^2 \Big). $$
We now show that for each term $B_i = B_i(u,v)$ ($i = 1, \dots, 4$) it is possible to find a constant $C$ such that
$$ n\, B_i(u,v) \le C\, |u - v|^\gamma, $$
which proves condition (2.8). For this purpose we exemplarily consider the expression $B_1$; the corresponding statements for the other terms follow along similar lines. A straightforward but tedious calculation yields
$$ \mathrm{Cov}(\hat A_{u,0}, \hat A_{v,0}) = \frac{1}{(n-1)^2} \Big\{ \sum_{i=2}^n \big[ m(u, t_{i-1}) m(v, t_{i-1}) r(t_i, u, v) + m(u, t_i) m(v, t_i) r(t_{i-1}, u, v) + r(t_i, u, v) r(t_{i-1}, u, v) \big] + \sum_{i=3}^n \big[ m(u, t_i) m(v, t_{i-2}) r(t_{i-1}, u, v) + m(v, t_i) m(u, t_{i-2}) r(t_{i-1}, u, v) \big] \Big\}, $$
and we therefore obtain $B_1 = \tilde B_1(u,v) + \tilde B_1(v,u)$ with
$$ \tilde B_1(u,v) = \frac{1}{(n-1)^2} \Big\{ \sum_{i=2}^n \big[ m(u, t_{i-1})^2 r(t_i, u, u) - m(u, t_{i-1}) m(v, t_{i-1}) r(t_i, u, v) + m(u, t_i)^2 r(t_{i-1}, u, u) - m(u, t_i) m(v, t_i) r(t_{i-1}, u, v) + r(t_i, u, u) r(t_{i-1}, u, u) - r(t_i, u, v) r(t_{i-1}, u, v) \big] + 2 \sum_{i=3}^n \big[ m(u, t_i) m(u, t_{i-2}) r(t_{i-1}, u, u) - m(u, t_i) m(v, t_{i-2}) r(t_{i-1}, u, v) \big] \Big\}.
$$
A typical summand in $\tilde B_1$ can be estimated by
$$ \big| m(u, t_{i-1})^2 r(t_i, u, u) - m(u, t_{i-1}) m(v, t_{i-1}) r(t_i, u, v) \big| \le C |u - v|^\gamma, $$
using the Lipschitz properties of the functions $r$ and $m$. All other summands are treated similarly, and we obtain
$$ B_1 = B_1(u,v) \le \frac{1}{n-1}\, C |u - v|^\gamma, $$
which proves assertion (2.8) and completes the proof of Theorem 2.1. □

Remark 2.2. The assertion of Theorem 2.1 remains valid if the general hypothesis (1.3) of a nonlinear functional regression model is tested, and we briefly indicate the arguments for proving this assertion. First note that the estimate $\hat A_{u,0}$ can be rewritten as
$$ \hat A_{u,0} = \frac{1}{n} \sum_{i=1}^n Y_i^2(u) - \hat\sigma_u^2 + o_P\Big(\frac{1}{\sqrt n}\Big), $$
where
$$ \hat\sigma_u^2 = \frac{1}{2(n-1)} \sum_{i=2}^n \big( Y_i(u) - Y_{i-1}(u) \big)^2 $$
denotes an estimate of the integrated variance
$$ \int_0^1 \mathrm{Var}(\varepsilon(u,t))\, h(t)\,dt = \int_0^1 r(t, u, u)\, h(t)\,dt $$
at the point $u \in [0,1]$ [see for example Rice (1984)]. Now a straightforward calculation shows that the estimate $\hat M_u^2$ is essentially the minimal sum of squared residuals, i.e.
$$ \hat M_u^2 = \min_\beta \frac{1}{n} \sum_{i=1}^n \big( Y_i(u) - \beta^T f(u, t_i) \big)^2 - \hat\sigma_u^2 + o_P\Big(\frac{1}{\sqrt n}\Big) \qquad (2.10) $$
uniformly with respect to $u \in [0,1]$. Obviously, this concept can easily be generalized to the problem of testing the hypothesis of a nonlinear functional regression model. To be precise, we assume that for each $u \in [0,1]$ the function $g_u : t \mapsto g(u, t, \beta_u)$ satisfies the standard regularity conditions of a nonlinear regression model [see for example Gallant (1987) or Seber and Wild (1989)]. In particular we assume that the set $\Theta \subset \mathbb{R}^k$ is compact with non-empty interior and that for all $u, t \in [0,1]$ the function $g(u, t, \beta)$ is twice continuously differentiable with respect to $\beta$ and satisfies $g(u, \cdot, \beta), g(\cdot, t, \beta) \in \mathrm{Lip}^{\mathrm{unif}}_\gamma[0,1]$. We recall the definition (2.6) of the parameter corresponding to the best $L^2$-approximation of the function $m(u, \cdot) : [0,1] \to \mathbb{R}$ by parametric functions of the form $\{ g(u, \cdot, \beta_u) \mid \beta_u \in \Theta \}$, where we assume for each $u \in [0,1]$ that the minimum $\beta_0(u)$ is attained at a unique interior point of the compact space $\Theta$. The $L^2$-distance between the function $m(u, \cdot)$ and its best approximation $g(u, \cdot, \beta_0(u))$ in the parametric class is now defined by
$$ M_u^2 = \int_0^1 \big( m(u,t) - g(u, t, \beta_0(u)) \big)^2 h(t)\,dt. $$
In order to investigate whether the hypothesis (1.3) is satisfied, let for each $u \in [0,1]$
$$ \hat\beta_0(u) = \mathop{\mathrm{arginf}}_\beta \sum_{i=1}^n \big( Y_i(u) - g(u, t_i, \beta) \big)^2 \qquad (2.11) $$
denote the nonlinear least squares estimate (here and throughout this paper it is assumed that the infimum in (2.11) is attained at a unique interior point of $\Theta \subset \mathbb{R}^k$), and observing (2.11) we obtain as the analogue of (2.3) the statistic
$$ \hat T_u^2 = \frac{1}{n} \sum_{i=1}^n \big( Y_i(u) - g(u, t_i, \hat\beta_0(u)) \big)^2 - \hat\sigma_u^2. \qquad (2.12) $$
It follows by similar arguments as in Brodeau (1993) that
$$ \hat T_u^2 = \frac{1}{n} \sum_{i=2}^n \varepsilon(u, t_i) \varepsilon(u, t_{i-1}) - \frac{2}{n} \sum_{i=1}^n \big( m(u, t_i) - g(u, t_i, \beta_0(u)) \big) \varepsilon(u, t_i) + M_u^2 + o_P\Big(\frac{1}{\sqrt n}\Big) $$
uniformly with respect to $u \in [0,1]$, and a similar reasoning as presented in the proof of Theorem 2.1 shows that
$$ \sqrt{n}\, (\hat T_u^2 - M_u^2) \Rightarrow G, $$
where the covariance structure of the Gaussian process $G$ is specified in (2.5). The details are omitted for the sake of brevity.

Note that the null hypothesis (1.3) is satisfied if and only if $M_u^2 = 0$ for all $u \in [0,1]$. Consequently, a consistent test can be obtained by rejecting the null hypothesis for large values of a Cramér–von Mises or a Kolmogorov–Smirnov functional of the process $\{\hat T_u^2\}_{u \in [0,1]}$.

Under the null hypothesis the covariance kernel of the limiting process $G$ in Theorem 2.1 reduces to
$$ k(u,v) \stackrel{H_0}{=} \int_0^1 r^2(t, u, v)\, h(t)\,dt, $$
and by the continuous mapping theorem it follows that the statistic $\sqrt{n} \int_0^1 \hat T_u^2\,du$ converges weakly to a centered normal distribution with variance $\int_0^1 \int_0^1 k(u,v)\,du\,dv$. Therefore it remains to estimate the asymptotic variance, and we propose to use
$$ \hat s_n^2 = \int_0^1 \int_0^1 \hat k(u,v)\,du\,dv, $$
where the estimate of the covariance kernel $k(u,v)$ is defined by
$$ \hat k(u,v) = \frac{1}{4(n-3)} \sum_{i=2}^{n-2} S_i(u) S_i(v) S_{i+2}(u) S_{i+2}(v) \qquad (2.13) $$
with $S_i(u) = Y_i(u) - Y_{i-1}(u)$. The following result shows that under the null hypothesis the statistic $\hat s_n^2$ is a consistent estimate of the asymptotic variance. The technical details of the proof are given in the Appendix.

Proposition 2.3. Under the assumptions of Theorem 2.1 we have
$$ \hat k(u,v) = k(u,v) + O_P(n^{-1/2}) $$
uniformly with respect to $u, v \in [0,1]$.

Theorem 2.1 and Proposition 2.3 provide an asymptotic level $\alpha$ test by rejecting the null hypothesis (1.3) if
$$ T_n = \frac{\sqrt{n}}{\sqrt{\int_0^1 \int_0^1 \hat k(u,v)\,du\,dv}} \int_0^1 \hat T_u^2\,du > u_{1-\alpha}, \qquad (2.14) $$
where $u_{1-\alpha}$ denotes the $(1-\alpha)$ quantile of the standard normal distribution. The finite sample properties of this test and a corresponding bootstrap version will be illustrated in Section 5.
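A numpy sketch of the resulting test (2.14) could be organized as follows. It reuses `M2_hat` from the sketch above as the process $\hat T_u^2$, which is justified for the linear hypothesis by the representation (2.10); integrals are discretized with the trapezoidal rule, and all names are illustrative.

```python
# Sketch of the asymptotic level-alpha test (2.14).
import numpy as np
from statistics import NormalDist

def test_hypothesis_13(Y, u, t, f_list, alpha=0.05):
    """Return True if H0 in (1.3) is rejected at level alpha."""
    n = len(t)
    T2 = M2_hat(Y, u, t, f_list)                   # \hat T_u^2 (here \hat M_u^2, cf. (2.10))
    S = Y[1:] - Y[:-1]                             # differences S_i(u) = Y_i(u) - Y_{i-1}(u)
    P = S[:-2] * S[2:]                             # rows S_i(u) S_{i+2}(u), i = 2, ..., n-2
    k_hat = P.T @ P / (4 * (n - 3))                # \hat k(u, v) of (2.13) on the grid
    s2n = np.trapz(np.trapz(k_hat, u, axis=1), u)  # \hat s_n^2 = double integral of \hat k
    Tn = np.sqrt(n) * np.trapz(T2, u) / np.sqrt(s2n)
    return Tn > NormalDist().inv_cdf(1 - alpha)    # compare with u_{1-alpha}
```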
3 A test for the hypothesis (1.4)

We now consider the problem of testing the hypothesis (1.4) in the functional regression model defined by (1.1) and assume that $n$ independent observations according to model (1.1) are available. For this purpose we define for fixed $t \in [0,1]$ the $L^2$-distance
$$ M_t^2 = \inf_{\gamma_t} \int_0^1 \big( m(u,t) - h(u, t, \gamma_t) \big)^2\,du. \qquad (3.1) $$
We only deal with the linear case, that is
$$ h(u, t, \gamma(t)) = \gamma(t)^T f(u,t) $$
for some given regression functions $f(u,t) = (f_1(u,t), \dots, f_k(u,t))^T$, and denote by $\gamma_0(t)$ the function which yields the minimal value in (3.1). As a global measure of deviation from the null hypothesis we consider the functional
$$ M^2 = \int_0^1 M_t^2\, h(t)\,dt, \qquad (3.2) $$
and obviously the hypothesis $H_0 : M^2 = 0$ is equivalent to (1.4).

Similarly as in Section 2, standard Hilbert space theory shows that the distance $M_t^2$ can be expressed as a ratio of two Gramian determinants,
$$ M_t^2 = \frac{\Gamma_t(m, f_1, \dots, f_k)}{\Gamma_t(f_1, \dots, f_k)}, \qquad (3.3) $$
where $\Gamma_t(p_1, \dots, p_l) = \det\big( (\langle p_i, p_j \rangle_t)_{i,j=1}^l \big)$ and the inner products are now calculated with respect to the variable $u$, that is
$$ \langle f, g \rangle_t = \int_0^1 f(u,t)\, g(u,t)\,du. $$
For the time $t_i$ we can "estimate" the entries of the matrix in the numerator of (3.3) by
$$ \hat B_{i,0} = \int Y_i(u) Y_{i-1}(u)\,du, \qquad \hat B_{i,p} = \int Y_i(u) f_p(u, t_i)\,du, \qquad \hat C_{i,p} = \int Y_{i-1}(u) f_p(u, t_{i-1})\,du = \hat B_{i-1,p}, $$
and define
$$ \hat M_{t_i}^2 = \det \begin{pmatrix} \hat B_{i,0} & \hat B_{i,1} & \cdots & \hat B_{i,k} \\ \hat C_{i,1} & \langle f_1, f_1 \rangle_{t_i} & \cdots & \langle f_1, f_k \rangle_{t_i} \\ \vdots & \vdots & \ddots & \vdots \\ \hat C_{i,k} & \langle f_k, f_1 \rangle_{t_i} & \cdots & \langle f_k, f_k \rangle_{t_i} \end{pmatrix} \Big/ \det \begin{pmatrix} \langle f_1, f_1 \rangle_{t_i} & \cdots & \langle f_1, f_k \rangle_{t_i} \\ \vdots & \ddots & \vdots \\ \langle f_k, f_1 \rangle_{t_i} & \cdots & \langle f_k, f_k \rangle_{t_i} \end{pmatrix} \qquad (3.4) $$
as an estimator for $M_{t_i}^2$. Note that we estimate the entries of the first column by $\hat C_{i,p}$ rather than $\hat B_{i,p}$ in order to ensure that the statistic $\hat M_{t_i}^2$ is asymptotically unbiased. However, because only one observation is made at time $t_i$, the variance of $\hat M_{t_i}^2$ does not converge to 0 with increasing sample size. As a consequence, the statistic $\hat M_{t_i}^2$ is not a consistent estimate of $M_{t_i}^2$. Nevertheless, a consistent estimate of the measure defined in (3.2) can be obtained by averaging the quantities $\hat M_{t_i}^2$, that is
$$ \hat M^2 = \frac{1}{n-1} \sum_{i=2}^n \hat M_{t_i}^2. \qquad (3.5) $$
Similarly, consistent estimates of $M_t^2$ at a particular point $t$ can be obtained by local averages.
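For illustration, the estimator (3.4)/(3.5) can be sketched as follows, again with trapezoidal integration over the $u$-grid; `Y`, `u`, `t`, `f_list` are as in the earlier sketches and all names are illustrative.

```python
# Sketch of the global estimator \hat M^2 in (3.5).
import numpy as np

def M2_hat_t(Y, u, t, f_list):
    n, k = len(t), len(f_list)
    F = np.stack([f(u[None, :], t[:, None]) for f in f_list])    # (k, n, n_u)
    def inner(a, b):                                             # <a, b>_t = integral over u
        return np.trapz(a * b, u, axis=-1)
    M2_ti = np.empty(n - 1)
    for i in range(1, n):                                        # i = 2, ..., n in the paper
        B0 = inner(Y[i], Y[i - 1])                               # \hat B_{i,0}
        Bp = inner(Y[i][None, :], F[:, i])                       # \hat B_{i,p}
        Cp = inner(Y[i - 1][None, :], F[:, i - 1])               # \hat C_{i,p} = \hat B_{i-1,p}
        Gram = inner(F[:, None, i], F[None, :, i])               # <f_p, f_q>_{t_i}
        num = np.block([[np.atleast_2d(B0), Bp[None, :]],
                        [Cp[:, None], Gram]])                    # matrix in (3.4)
        M2_ti[i - 1] = np.linalg.det(num) / np.linalg.det(Gram)
    return M2_ti.mean()                                          # \hat M^2 of (3.5)
```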
Theorem 3.1. Under the assumptions of Section 2 the estimate $\hat M^2$ defined in (3.5) is consistent for the measure $M^2$ defined in (3.2). More precisely, we have as $n \to \infty$
$$ \sqrt{n-1}\, (\hat M^2 - M^2) \xrightarrow{D} \mathcal{N}(0, \sigma^2), $$
where the asymptotic variance is given by
$$ \sigma^2 = \int_0^1 \Big( \int_{[0,1]^2} \big( r(u, v, t) - (P_{u,t} r)(v) \big) \big( r(u, v, t) - (P_{v,t} r)(u) \big)\,du\,dv + 4 \int_{[0,1]^2} r(u, v, t) \big( m(u,t) - (P_t m)(u) \big) \big( m(v,t) - (P_t m)(v) \big)\,du\,dv \Big) h(t)\,dt, $$
and $(P_t m)(u) = \gamma_{t,0}^T f(u,t)$ and $(P_{u,t} r)(v) = \gamma_{u,t,0}^T f(v,t)$ denote the orthogonal projections of the functions $m(\cdot, t)$ and $r(u, \cdot, t)$ onto the set $\mathrm{span}\{f_1(\cdot, t), \dots, f_k(\cdot, t)\}$, respectively, that is
$$ \int_0^1 \big( m(u,t) - \gamma_{t,0}^T f(u,t) \big)^2\,du = \inf_{\gamma_t} \int_0^1 \big( m(u,t) - \gamma_t^T f(u,t) \big)^2\,du = M_t^2, $$
$$ \int_0^1 \big( r(u, v, t) - \gamma_{u,t,0}^T f(v,t) \big)^2\,dv = \inf_{\gamma_{u,t}} \int_0^1 \big( r(u, v, t) - \gamma_{u,t}^T f(v,t) \big)^2\,dv. $$

Proof of Theorem 3.1. Without loss of generality we may assume that the functions $f_1, \dots, f_k$ are orthonormal, so that the minimal distance in (3.3) and its estimator defined in (3.4) simplify to
$$ M_{t_i}^2 = \langle m, m \rangle_{t_i} - \sum_{p=1}^k \langle m, f_p \rangle_{t_i}^2, \qquad \hat M_{t_i}^2 = \hat B_{i,0} - \sum_{p=1}^k \hat B_{i,p} \hat C_{i,p}, $$
respectively. A careful calculation of the moments of the random variables in the latter expression yields
$$ E[\hat B_{i,0}] = \langle m, m \rangle_{t_i} + O(n^{-\gamma}), $$
$$ E[\hat B_{i,p} \hat C_{i,p}] = \langle m, f_p \rangle_{t_i}^2 + O(n^{-\gamma}), $$
$$ \mathrm{Var}(\hat B_{i,0}) = \int r(u, v, t_i)^2\,du\,dv + 2 \int r(u, v, t_i)\, m(u, t_i)\, m(v, t_i)\,du\,dv + O(n^{-\gamma}), $$
$$ \mathrm{Cov}(\hat B_{i,p} \hat C_{i,p}, \hat B_{i,q} \hat C_{i,q}) = \int r(u, v, t_i) f_p(u, t_i) f_q(v, t_i)\,du\,dv \Big( 2 \langle m, f_p \rangle_{t_i} \langle m, f_q \rangle_{t_i} + \int r(u, v, t_i) f_p(u, t_i) f_q(v, t_i)\,du\,dv \Big) + O(n^{-\gamma}), $$
$$ \mathrm{Cov}(\hat B_{i,0}, \hat B_{i,p} \hat C_{i,p}) = 2 \int r(u, v, t_i)\, m(u, t_i) f_p(v, t_i)\,du\,dv\, \langle m, f_p \rangle_{t_i} + \int r(u, v, t_i) r(u, w, t_i) f_p(v, t_i) f_p(w, t_i)\,du\,dv\,dw + O(n^{-\gamma}), $$
$$ \mathrm{Cov}(\hat B_{i,0}, \hat B_{i-1,0}) = 2 \int r(u, v, t_i)\, m(u, t_i)\, m(v, t_i)\,du\,dv + O(n^{-\gamma}), $$
$$ \mathrm{Cov}(\hat B_{i,0}, \hat B_{i-1,p} \hat C_{i-1,p}) = \int r(u, v, t_i)\, m(u, t_i) f_p(v, t_i)\,du\,dv\, \langle m, f_p \rangle_{t_i} + O(n^{-\gamma}) = \mathrm{Cov}(\hat B_{i-1,0}, \hat B_{i,p} \hat C_{i,p}), $$
$$ \mathrm{Cov}(\hat B_{i,p} \hat C_{i,p}, \hat B_{i-1,q} \hat C_{i-1,q}) = \int r(u, v, t_i) f_p(u, t_i) f_q(v, t_i)\,du\,dv\, \langle m, f_p \rangle_{t_i} \langle m, f_q \rangle_{t_i} + O(n^{-\gamma}). $$
The sequence $\hat M_{t_2}^2, \dots, \hat M_{t_n}^2$ forms a triangular array of one-dependent random variables, and as a consequence all covariances corresponding to a lag larger than one vanish. Therefore the variance of the standardized mean,
$$ \sigma_n^2 = \mathrm{Var}\Big( \frac{1}{\sqrt{n-1}} \sum_{i=2}^n \hat M_{t_i}^2 \Big), $$
is given by
$$ \sigma_n^2 = \frac{1}{n-1} \sum_{i=2}^n \Big\{ \mathrm{Var}(\hat B_{i,0}) + \sum_{p,q=1}^k \mathrm{Cov}(\hat B_{i,p} \hat C_{i,p}, \hat B_{i,q} \hat C_{i,q}) - 2 \sum_{p=1}^k \mathrm{Cov}(\hat B_{i,0}, \hat B_{i,p} \hat C_{i,p}) + 2\, \mathrm{Cov}(\hat B_{i,0}, \hat B_{i-1,0}) - 2 \sum_{p=1}^k \mathrm{Cov}(\hat B_{i,0}, \hat B_{i-1,p} \hat C_{i-1,p}) - 2 \sum_{p=1}^k \mathrm{Cov}(\hat B_{i-1,0}, \hat B_{i,p} \hat C_{i,p}) + 2 \sum_{p,q=1}^k \mathrm{Cov}(\hat B_{i,p} \hat C_{i,p}, \hat B_{i-1,q} \hat C_{i-1,q}) \Big\} + O(n^{-\gamma}) $$
$$ = \int_0^1 \Big\{ \int r(u, v, t)^2\,du\,dv - 2 \sum_{p=1}^k \int r(u, v, t) r(u, w, t) f_p(v, t) f_p(w, t)\,du\,dv\,dw + \sum_{p,q=1}^k \Big( \int r(u, v, t) f_p(u, t) f_q(v, t)\,du\,dv \Big)^2 + 4 \int r(u, v, t)\, m(u, t)\, m(v, t)\,du\,dv - 8 \sum_{p=1}^k \int r(u, v, t)\, m(u, t) f_p(v, t)\,du\,dv\, \langle m, f_p \rangle_t + 4 \sum_{p,q=1}^k \int r(u, v, t) f_p(u, t) f_q(v, t)\,du\,dv\, \langle m, f_p \rangle_t \langle m, f_q \rangle_t \Big\} h(t)\,dt + O(n^{-\gamma}) = \sigma^2 + O(n^{-\gamma}). $$
Here the last equality uses the fact that under the assumption of orthonormality the orthogonal projections $P_t m$ and $P_{u,t} r$ are given by
$$ (P_t m)(u) = \sum_{p=1}^k \langle m, f_p \rangle_t f_p(u, t), \qquad (P_{u,t} r)(v) = \sum_{p=1}^k \langle r(u, \cdot, t), f_p \rangle_t f_p(v, t). $$
The assertion of the theorem now follows from the classical central limit theorem for $m$-dependent random variables [see Orey (1958)]. □

Under the null hypothesis the variance of the limiting normal distribution simplifies to
$$ \sigma^2 \stackrel{H_0}{=} \int_0^1 \int_{[0,1]^2} \big( r(u, v, t) - (P_{u,t} r)(v) \big) \big( r(u, v, t) - (P_{v,t} r)(u) \big)\,d(u,v)\, h(t)\,dt. $$
We propose to estimate this variance by
$$ \hat\sigma^2 = \frac{1}{4(n-3)} \sum_{i=2}^{n-2} \int_{[0,1]^2} S_i(u) \Big( S_i(v) - \int_0^1 S_i(x) f(x, t_i)^T\,dx\; A_i^{-1} f(v, t_i) \Big) S_{i+2}(v) \Big( S_{i+2}(u) - \int_0^1 S_{i+2}(x) f(x, t_{i+2})^T\,dx\; A_{i+2}^{-1} f(u, t_{i+2}) \Big)\,d(u,v), $$
where $S_i(u) = Y_i(u) - Y_{i-1}(u)$ and $A_i = \int_0^1 f(u, t_i) f(u, t_i)^T\,du \in \mathbb{R}^{k \times k}$. Observing that the orthogonal projection $(P_{u,t} r)(v)$ is given by
$$ (P_{u,t} r)(v) = \gamma_{u,t,0}^T f(v,t) = \int_0^1 r(u, x, t) f(x, t)^T\,dx \Big( \int_0^1 f(x, t) f(x, t)^T\,dx \Big)^{-1} f(v,t), $$
it follows by a similar calculation as in the proof of Theorem 3.1 that $\hat\sigma^2$ is a $\sqrt{n}$-consistent estimator of $\sigma^2$. Therefore we obtain an asymptotic level $\alpha$ test for the hypothesis (1.4) by rejecting $H_0$ if
$$ \frac{\sqrt{n-1}\, \hat M^2}{\hat\sigma} > u_{1-\alpha}, \qquad (3.6) $$
where $u_{1-\alpha}$ denotes the $(1-\alpha)$ quantile of the standard normal distribution.
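A sketch of this test, reusing `M2_hat_t` from the previous sketch, might look as follows. The double integral in $\hat\sigma^2$ factorizes into a product of two single integrals, which the code exploits; the standardization $\sqrt{n-1}\,\hat M^2 / \hat\sigma$ follows our reading of (3.6), and all names are illustrative.

```python
# Sketch of the level-alpha test (3.6) for the hypothesis (1.4).
import numpy as np
from statistics import NormalDist

def test_hypothesis_14(Y, u, t, f_list, alpha=0.05):
    n = len(t)
    F = np.stack([f(u[None, :], t[:, None]) for f in f_list])     # (k, n, n_u)
    S = Y[1:] - Y[:-1]                                            # rows are S_2, ..., S_n
    E = np.empty_like(S)                                          # S_i minus its projection
    for j in range(n - 1):
        Fj = F[:, j + 1].T                                        # f(., t_i) for S_i, (n_u, k)
        Aj = np.trapz(Fj[:, :, None] * Fj[:, None, :], u, axis=0) # A_i = integral of f f^T
        coef = np.linalg.solve(Aj, np.trapz(S[j][:, None] * Fj, u, axis=0))
        E[j] = S[j] - Fj @ coef
    # each summand of \hat sigma^2 factorizes into two one-dimensional integrals
    terms = (np.trapz(S[:-2] * E[2:], u, axis=1)
             * np.trapz(E[:-2] * S[2:], u, axis=1))
    sigma2_hat = terms.sum() / (4 * (n - 3))
    Tn = np.sqrt(n - 1) * M2_hat_t(Y, u, t, f_list) / np.sqrt(sigma2_hat)
    return Tn > NormalDist().inv_cdf(1 - alpha)                   # reject H0 in (1.4)?
```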
4 Testing homoscedasticity

In this section we address the problem of testing the hypothesis (1.5) of homoscedastic errors in the functional regression model (1.1). Motivated by the discussion in Sections 2 and 3 we propose the following measure of heteroscedasticity at a point $(u,v) \in [0,1]^2$:
$$ \tau^2(u,v) = \min_{a \in \mathbb{R}} \| r(\cdot, u, v) - a \|^2 = \int_0^1 r^2(t, u, v)\, h(t)\,dt - \Big( \int_0^1 r(t, u, v)\, h(t)\,dt \Big)^2. \qquad (4.1) $$
Note that $\tau^2(u,v) = 0$ a.e. if and only if the covariance function does not depend on $t$, that is, if the hypothesis (1.5) of homoscedasticity is valid. An estimator of the quantity $\int_0^1 r^2(t, u, v)\, h(t)\,dt$ in (4.1) has been proposed in (2.13), and for the second term we will use a similar estimate based on the statistic
$$ \tilde k(u,v) = \frac{1}{2(n-1)} \sum_{i=2}^n S_i(u) S_i(v), $$
where $S_i(u) = Y_i(u) - Y_{i-1}(u)$. We therefore obtain as an estimator of the process $\{\tau^2(u,v)\}_{u,v \in [0,1]}$
$$ \hat\tau_n^2(u,v) = \frac{1}{4(n-3)} \sum_{i=2}^{n-2} S_i(u) S_i(v) S_{i+2}(u) S_{i+2}(v) - \Big( \frac{1}{2(n-1)} \sum_{i=2}^n S_i(u) S_i(v) \Big)^2. $$
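Both building blocks are simple difference-based statistics, so the whole process can be evaluated on the $u$-grid with two matrix products; a minimal sketch (names illustrative):

```python
# Sketch of the heteroscedasticity process \hat tau_n^2(u, v) on the grid.
import numpy as np

def tau2_hat(Y):
    """Rows of Y are the curves Y_i; returns an (n_u, n_u) array."""
    n = len(Y)
    S = Y[1:] - Y[:-1]                        # S_i(u) = Y_i(u) - Y_{i-1}(u)
    P = S[:-2] * S[2:]                        # rows S_i(u) S_{i+2}(u)
    k_hat = P.T @ P / (4 * (n - 3))           # estimates the integral of r^2(t,u,v) h(t)
    k_tilde = S.T @ S / (2 * (n - 1))         # estimates the integral of r(t,u,v) h(t)
    return k_hat - k_tilde ** 2               # \hat tau_n^2(u, v)
```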
The asymptotic properties of this estimator are specified in the following result.

Theorem 4.1. Assume that the third and fourth moments
$$ d_1(t, u, v, w) = E[\varepsilon(u,t) \varepsilon(v,t) \varepsilon(w,t)], \qquad d_2(t, u, v, w, x) = E[\varepsilon(u,t) \varepsilon(v,t) \varepsilon(w,t) \varepsilon(x,t)] $$
of the error process $\varepsilon(u,t)$ exist and are elements of $\mathrm{Lip}^{\mathrm{unif}}_\gamma[0,1]$ in each argument. If the assumptions of Section 2 are satisfied we have as $n \to \infty$
$$ 4\sqrt{n}\, \big( \hat\tau_n^2(u,v) - \tau^2(u,v) \big) \Rightarrow G $$
in $C[0,1]^2$. Here $G$ is a centered Gaussian field on $[0,1]^2$ whose covariance structure under the null hypothesis of homoscedasticity is given by
$$ k((u_1, v_1), (u_2, v_2)) := \mathrm{Cov}(G(u_1, v_1), G(u_2, v_2)) = 6 D_2^{(2)}(u_1, v_1, u_2, v_2) - 12 D_2^{(r,1,1)}(u_1, v_1, u_2, v_2) + 8 D_2^{(r,1,1)}(u_1, u_2, v_1, v_2) + 8 D_2^{(r,1,1)}(u_1, v_2, v_1, u_2) $$
$$ + 6 J(u_1, v_1, u_2, v_2, u_1, v_1, u_2, v_2) + 4 J(u_1, u_2, v_1, v_2, u_1, u_2, v_1, v_2) + 4 J(u_1, v_2, v_1, u_2, u_1, v_2, v_1, u_2) - 8 J(u_1, v_1, u_2, v_2, u_1, u_2, v_1, v_2) $$
$$ - 8 J(u_1, v_2, u_2, v_2, u_1, v_2, v_1, u_2) + 8 J(u_1, v_1, u_2, v_2, u_1, v_2, v_1, u_2) + 2 D_1^{(r)}(u_1, u_2, v_1, v_2) + 2 D_1^{(r)}(u_1, v_2, v_1, u_2) + 2 D_1^{(r)}(v_1, u_2, u_1, v_2) + 2 D_1^{(r)}(v_1, v_2, u_1, u_2), $$
where the following notation has been used:
$$ D_2^{(2)}(u_1, v_1, u_2, v_2) = \int_0^1 d_2(t, u_1, v_1, u_2, v_2)^2\, h(t)\,dt, $$
$$ D_2^{(r,i,j)}(u_1, v_1, u_2, v_2) = r(u_1, v_1)^i\, r(u_2, v_2)^j \int_0^1 d_2(t, u_1, v_1, u_2, v_2)\, h(t)\,dt, $$
$$ J(u_1, v_1, u_2, v_2, u_3, v_3, u_4, v_4) = \int_0^1 \prod_{i=1}^4 r(t, u_i, v_i)\, h(t)\,dt, $$
$$ D_1^{(r)}(u_1, v_1, u_2, v_2) = r(u_1, v_1) \int_0^1 d_1(t, v_1, u_2, v_2)\, d_1(t, u_1, u_2, v_2)\, h(t)\,dt. $$

Proof of Theorem 4.1. The proof follows along similar lines as the proof of Theorem 2.1, establishing weak convergence of the finite dimensional distributions and tightness of the sequence $\{4\sqrt{n}(\hat\tau_n^2(u,v) - \tau^2(u,v))\}_{u,v \in [0,1]}$. For this reason only the main steps are indicated in the subsequent discussion. A careful inspection of the results in the proofs of Lemmas 6.2 and 6.3 in Dette et al. (1999) yields the following decomposition into a sum of 4-dependent random variables and a stochastic remainder of order $n^{-1/2}$:
$$ \hat\tau_n^2(u,v) - \tau^2(u,v) = \frac{1}{4(n-3)} \sum_{j=2}^{n-2} W_j(u,v) + o_P(n^{-1/2}) $$
(uniformly with respect to $(u,v)$), where
$$ W_j(u,v) = Z_j(u,v) \{ Z_{j+2}(u,v) + 4 \delta_j(u,v) \}, $$
$$ Z_j(u,v) = \Delta\varepsilon_{u,j-1,j}\, \Delta\varepsilon_{v,j-1,j} - E_j(u,v), $$
$$ E_j(u,v) = E[\Delta\varepsilon_{u,j-1,j}\, \Delta\varepsilon_{v,j-1,j}] = 2 r(t_j, u, v) + O(n^{-\gamma}), $$
$$ \delta_j(u,v) = r(t_j, u, v) - \frac{1}{n} \sum_{i=1}^n r(t_i, u, v). $$
A straightforward but tedious calculation shows that the covariance structure of the random variables $W_j(u,v)$ is given by
$$ \mathrm{Cov}(W_j(u_1, v_1), W_j(u_2, v_2)) = 4 \big( d_2(t_j, u_1, v_1, u_2, v_2) + r(t_j, u_1, v_1) r(t_j, u_2, v_2) + r(t_j, u_1, u_2) r(t_j, v_1, v_2) + r(t_j, u_1, v_2) r(t_j, v_1, u_2) \big)^2 $$
$$ + 16 \big( d_2(t_j, u_1, v_1, u_2, v_2) + r(t_j, u_1, v_1) r(t_j, u_2, v_2) + r(t_j, u_1, u_2) r(t_j, v_1, v_2) + r(t_j, u_1, v_2) r(t_j, v_1, u_2) \big) \big( 2 \delta_j(u_1, v_1) \delta_j(u_2, v_2) - r(t_j, u_1, v_1) r(t_j, u_2, v_2) \big) $$
$$ + 16 r^2(t_j, u_1, v_1) r^2(t_j, u_2, v_2) - 64\, r(t_j, u_1, v_1) r(t_j, u_2, v_2) \delta_j(u_1, v_1) \delta_j(u_2, v_2) + O(n^{-\gamma}), $$
$$ \mathrm{Cov}(W_j(u_1, v_1), W_{j+1}(u_2, v_2)) = d_2(t_j, u_1, v_1, u_2, v_2)^2 - 2 d_2(t_j, u_1, v_1, u_2, v_2) r(t_j, u_1, v_1) r(t_j, u_2, v_2) + r^2(t_j, u_1, v_1) r^2(t_j, u_2, v_2) $$
$$ + d_1(t_j, v_1, u_2, v_2) d_1(t_j, u_1, v_1, u_2) r(t_j, u_1, u_2) + d_1(t_j, v_1, u_2, v_2) d_1(t_j, u_1, v_1, u_2) r(t_j, u_1, v_2) + d_1(t_j, u_1, u_2, v_2) d_1(t_j, u_1, v_1, v_2) r(t_j, v_1, u_2) + d_1(t_j, u_1, u_2, v_2) d_1(t_j, u_1, v_1, u_2) r(t_j, v_1, v_2) $$
$$ - 8 \delta_j(u_2, v_2) d_1(t_j, u_1, v_1, v_2) d_1(t_j, u_1, v_1, u_2) + 16 \delta_j(u_1, v_1) \delta_j(u_2, v_2) \big( d_2(t_j, u_1, v_1, u_2, v_2) - r(t_j, u_1, v_1) r(t_j, u_2, v_2) \big) + O(n^{-\gamma}), $$
and $\mathrm{Cov}(W_j(u_1, v_1), W_i(u_2, v_2)) = 0$ for $|i - j| \ge 2$. The dominating sum
$$ A_n(u,v) = \frac{1}{4(n-3)} \sum_{j=2}^{n-2} W_j(u,v) $$
therefore has asymptotic covariance
$$ 16 n\, \mathrm{Cov}(A_n(u_1, v_1), A_n(u_2, v_2)) = \frac{1}{n} \sum_j \big[ \mathrm{Cov}(W_j(u_1, v_1), W_j(u_2, v_2)) + \mathrm{Cov}(W_j(u_1, v_1), W_{j+1}(u_2, v_2)) + \mathrm{Cov}(W_j(u_2, v_2), W_{j+1}(u_1, v_1)) \big] + o(1) = k((u_1, v_1), (u_2, v_2)) + o(1). $$
The last equality is obtained using the Lipschitz continuity of the regression functions. Finally, the validation of tightness follows along similar lines as in the proof of Theorem 2.1 by a tedious calculation of a corresponding moment condition for Gaussian fields [see e.g. Bickel and Wichura (1971)] and is therefore omitted. □

5 Finite sample properties

In this section we study the finite sample properties of the tests proposed in the previous sections. Our first example considers the linear hypothesis
$$ H_0 : m(u,t) = g(u, t, \beta(u)) = \beta(u)\, f(u,t), $$
where $f : [0,1]^2 \to \mathbb{R}$ is some given function and $\beta : [0,1] \to \mathbb{R}$ (i.e. $k = 1$). As discussed at the end of Section 2, under the null hypothesis $H_0$ the statistic $T_n$ defined in (2.14) converges weakly to a standard normal distribution, and we reject the hypothesis $H_0$ if the inequality (2.14) is satisfied. In order to study the approximation of the nominal level and the power of this asymptotic level $\alpha$ test, 5000 replications with different functions $f$ have been performed. The error terms $\varepsilon(u, t_i)$ are assumed to be i.i.d. Brownian motions, i.e. $r(t, u, v) = u \wedge v$, which implies that the model is homoscedastic. The results under the null hypothesis are presented in Table 1 for the functions
$$ f_1(u,t) = (-1 + 2u) + 2(1-u)t, \qquad (5.1) $$
$$ f_2(u,t) = (1 + u) \cos(2\pi t). \qquad (5.2) $$
It can be seen that the nominal level of the test is well approximated in most cases. For the function $f_1(u,t) = (-1 + 2u) + 2(1-u)t$ the approximation is very accurate for sample sizes larger than $n = 100$; for smaller sample sizes the level is either overestimated (if the nominal level is smaller than $\alpha = 0.1$) or underestimated (if the nominal level is larger than $\alpha = 0.1$).
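Tying the earlier sketches together, the empirical level of the test can be checked with a small Monte Carlo loop; this is a rough illustration using the hypothetical helpers defined above, not the paper's exact study (which uses 5000 replications).

```python
# Rough Monte Carlo estimate of the empirical level of the test (2.14).
import numpy as np

f1 = lambda u, t: (-1 + 2 * u) + 2 * (1 - u) * t     # regression function (5.1)
rejections, n_sim = 0, 500
for s in range(n_sim):
    u, t, Y = simulate_curves(f1, n=100, rng=s)      # H0 holds: m = f1, beta(u) = 1
    rejections += test_hypothesis_13(Y, u, t, [f1], alpha=0.05)
print("empirical level:", rejections / n_sim)        # should be close to 0.05
```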
In the case where we use the function $f_2(u,t) = (1+u)\cos(2\pi t)$ the level is underestimated, with a tendency towards better approximations for larger sample sizes. For the investigation of the power of the test we consider the functions $f_i$ defined in (5.1) and (5.2) with two additive alternatives, that is
$$ m(u,t) = f_i(u,t) + \tfrac{1}{2} \exp(t), \qquad (5.3) $$
$$ m(u,t) = f_i(u,t) + \sin(2\pi t), \qquad (5.4) $$
with $i = 1, 2$. The corresponding results are presented in Tables 2 and 3. We observe reasonable rejection probabilities for all sample sizes and both choices of $f_i$.

Note that for sample sizes $n = 25$ and $n = 50$ the approximation of the nominal level is less accurate. In these cases we propose a wild bootstrap procedure to obtain a more accurate test [see Wu (1986)]. For this purpose we denote by $\hat\beta(u)$ the (point-wise) ordinary least squares estimator of the function $\beta(u)$ and calculate the parametric residuals
$$ \hat\varepsilon(u, t_i) = Y_i(u) - \hat\beta(u)\, f_1(u, t_i) \qquad (5.5) $$
for $i = 1, \dots, n$ and $u \in [0,1]$. For $b = 1, \dots, B$ with $B \in \mathbb{N}$ let $v_i^{b*}$ be independent samples of a random variable $V$ with a two-point distribution on the set $\{-1, 1\}$, and define the bootstrap sample as
$$ Y_i^{b*}(u) = \hat\beta(u)\, f_1(u, t_i) + \varepsilon_i^{b*}(u), \qquad i = 1, \dots, n, \qquad (5.6) $$
where
$$ \varepsilon_i^{b*}(u) = v_i^{b*}\, \hat\varepsilon(u, t_i). \qquad (5.7) $$
For each $b \in \{1, \dots, B\}$ we calculate the statistic $T_n^{b*} = T_n(Y_1^{b*}(\cdot), \dots, Y_n^{b*}(\cdot))$, with $T_n$ as given in (2.14), and denote by
$$ H_{n,B}^*(x) = \frac{1}{B} \sum_{b=1}^B I\{ T_n^{b*} \le x \} $$
the empirical distribution function of $T_n^{1*}, \dots, T_n^{B*}$. We then use the $(1-\alpha)$-quantile of this distribution as the critical value for the test statistic $T_n = T_n(Y_1(\cdot), \dots, Y_n(\cdot))$.
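A sketch of this wild bootstrap scheme (5.5)–(5.7) for $k = 1$ follows, reading the two-point weights as random signs in $\{-1, +1\}$. Here `test_statistic` is a hypothetical helper returning the value of $T_n$ (a small variant of `test_hypothesis_13` above that returns $T_n$ instead of the decision); all names are illustrative.

```python
# Sketch of the wild bootstrap test of Section 5 for the linear hypothesis, k = 1.
import numpy as np

def wild_bootstrap_test(Y, u, t, f, B=200, alpha=0.05, rng=None):
    rng = np.random.default_rng(rng)
    F = f(u[None, :], t[:, None])                              # f(u, t_i), shape (n, n_u)
    beta_hat = np.sum(Y * F, axis=0) / np.sum(F * F, axis=0)   # point-wise OLS fit of beta(u)
    fit = beta_hat[None, :] * F
    resid = Y - fit                                            # residuals (5.5)
    Tn = test_statistic(Y, u, t, f)                            # hypothetical helper: value of T_n
    Tb = np.empty(B)
    for b in range(B):
        v = rng.choice([-1.0, 1.0], size=(len(t), 1))          # random signs v_i^{b*}
        Tb[b] = test_statistic(fit + v * resid, u, t, f)       # bootstrap sample (5.6), (5.7)
    return Tn > np.quantile(Tb, 1 - alpha)                     # compare with bootstrap quantile
```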
In our simulation study we performed 1000 replications of this procedure with $B = 200$ bootstrap samples; the corresponding results under the null hypothesis are presented in Table 4 for the sample sizes $n = 25$ and $n = 50$ and the regression functions (5.1) and (5.2). Compared to the test based on the normal approximation we observe a substantial improvement with respect to the approximation of the nominal level. In Tables 5 and 6 we show the simulated rejection probabilities of the wild bootstrap test for the alternatives (5.3) and (5.4), respectively. In all cases we obtain similar rejection probabilities as for the test defined in (2.14). Compared to the test based on the asymptotic distribution, a slight loss in power is observed in the case of the alternative $f_i(u,t) + \sin(2\pi t)$, while in the case of the exponential alternative we observe a negligible improvement for the majority of scenarios.

As a second example we study the finite sample properties of the test for the hypothesis
$$ H_0 : m(u,t) = \gamma(t)\, f(u,t), \qquad (5.8) $$
defined in Section 3, where again $f : [0,1]^2 \to \mathbb{R}$ is some given function and $\gamma : [0,1] \to \mathbb{R}$ (i.e. $k = 1$). The discussion at the end of Section 3 suggests rejecting the hypothesis $H_0$ if the inequality (3.6) is satisfied. We have investigated the finite sample properties of this test under the assumptions of the previous study for $f = f_1$ as given in (5.1). The normal approximation did not yield a sufficiently accurate approximation of the level for sample sizes up to $n = 500$, and for this reason these results are not depicted. As an alternative we propose a wild bootstrap approximation similar to the one given in the previous paragraph. More precisely, we calculate residuals analogously to (5.5) by
$$ \hat\varepsilon(u, t_i) = Y_i(u) - \hat\gamma(t_i)\, f_1(u, t_i) $$
for $i = 1, \dots, n$ and $u \in [0,1]$, where $\hat\gamma(t_i)$ denotes the least squares estimator of $\gamma(t_i)$. As in equations (5.6) and (5.7) we define $\varepsilon_i^{b*}(u) = v_i^{b*}\, \hat\varepsilon(u, t_i)$ and $Y_i^{b*}(u) = \hat\gamma(t_i)\, f_1(u, t_i) + \varepsilon_i^{b*}(u)$ ($i = 1, \dots, n$) to obtain a wild bootstrap sample. The results of the corresponding bootstrap test are shown in Table 7. We observe that the resampling procedure yields a test with a very accurate approximation of the nominal level and excellent power under the alternative $H_1 : m(u,t) = f_1(u,t) + \frac{1}{2}\exp(t)$.

Acknowledgements. The authors would like to thank Martina Stein, who typed parts of this manuscript with considerable technical expertise. This work has been supported in part by the Collaborative Research Center "Statistical modelling of nonlinear dynamic processes" (SFB 823) of the German Research Foundation.

6 Appendix: Proof of Proposition 2.3

We introduce the representation $S_i(u) = \Delta m_{u,i} + \Delta\varepsilon_{u,i}$ with
$$ \Delta m_{u,i} := m(u, t_i) - m(u, t_{i-1}), \qquad \Delta\varepsilon_{u,i} := \varepsilon(u, t_i) - \varepsilon(u, t_{i-1}), $$
and consider the following decomposition of the estimate $\hat k$:
$$ \hat k(u,v) = 2 T_{1n}(u,v) + T_{2n}(u,v) + 2 T_{3n}(u,v) + \tilde T_n(u,v), \qquad (6.1) $$
where
$$ T_{1n}(u,v) = \frac{1}{4(n-3)} \sum_{i=2}^{n-2} \Delta m_{u,i} \Delta m_{u,i+2} \{ \Delta m_{v,i+2} \Delta\varepsilon_{v,i} + \Delta m_{v,i} \Delta\varepsilon_{v,i+2} \}, $$
$$ T_{2n}(u,v) = \frac{1}{4(n-3)} \sum_{i=2}^{n-2} \big[ \Delta m_{u,i} \Delta m_{u,i+2} \Delta\varepsilon_{v,i} \Delta\varepsilon_{v,i+2} + \Delta m_{u,i} \Delta m_{v,i} \Delta\varepsilon_{u,i+2} \Delta\varepsilon_{v,i+2} + \Delta m_{u,i+2} \Delta m_{v,i+2} \Delta\varepsilon_{u,i} \Delta\varepsilon_{v,i} + \Delta m_{u,i+2} \Delta m_{v,i} \Delta\varepsilon_{u,i} \Delta\varepsilon_{v,i+2} \big], $$
$$ T_{3n}(u,v) = \frac{1}{4(n-3)} \sum_{i=2}^{n-2} \Delta\varepsilon_{v,i} \Delta\varepsilon_{v,i+2} \big[ \Delta m_{u,i} \Delta\varepsilon_{u,i+2} + \Delta m_{u,i+2} \Delta\varepsilon_{u,i} \big], $$
$$ \tilde T_n(u,v) = \frac{1}{4(n-3)} \sum_{i=2}^{n-2} \Delta\varepsilon_{u,i} \Delta\varepsilon_{u,i+2} \Delta\varepsilon_{v,i} \Delta\varepsilon_{v,i+2}. $$
We show that the first three terms of the decomposition (6.1) are asymptotically negligible; we exemplarily analyze the term $T_{1n}(u,v)$. We have
$$ T_{1n}(u,v) = T_{1n}^{(a)}(u,v) + T_{1n}^{(b)}(u,v) \qquad (6.2) $$
with
$$ T_{1n}^{(a)}(u,v) = \frac{1}{4(n-3)} \sum_{i=2}^{n-2} \Delta m_{u,i} \Delta m_{u,i+2} \Delta m_{v,i+2} \Delta\varepsilon_{v,i}, \qquad T_{1n}^{(b)}(u,v) = \frac{1}{4(n-3)} \sum_{i=2}^{n-2} \Delta m_{u,i} \Delta m_{u,i+2} \Delta m_{v,i} \Delta\varepsilon_{v,i+2}. $$
Both sums are centered, and for the variance of $T_{1n}^{(a)}(u,v)$ it follows that
$$ \mathrm{Var}(T_{1n}^{(a)}) = \frac{1}{16(n-3)^2} \sum_{i=2}^{n-2} \sum_{j=2}^{n-2} E\big[ \Delta m_{u,i} \Delta m_{u,i+2} \Delta m_{u,j} \Delta m_{u,j+2} \Delta m_{v,j+2} \Delta m_{v,i+2} \Delta\varepsilon_{v,i} \Delta\varepsilon_{v,j} \big]. $$
Note that this sum is dominated by those expectations corresponding to the indices with $i = j$, $i = j+1$ or $j = i+1$. We exemplarily treat the case $i = j$. Using the Lipschitz continuity of the function $m$ it follows that
$$ |\Delta m_{u,i}| \le C \max_{2 \le i \le n} |t_i - t_{i-1}|^\gamma = O(n^{-\gamma}) \qquad (6.3) $$
uniformly with respect to $i = 2, \dots, n$, and this estimate yields
$$ E\big[ \Delta m_{u,i}^2 \Delta m_{u,i+2}^2 \Delta m_{v,i+2}^2 \Delta\varepsilon_{v,i}^2 \big] = O(n^{-6\gamma}). $$
Consequently, by Markov's inequality we obtain (uniformly with respect to $u$ and $v$) $T_{1n}^{(a)}(u,v) = O_P(n^{-3\gamma})$. The term $T_{1n}^{(b)}(u,v)$ in (6.2) is treated similarly, which implies $T_{1n}(u,v) = o_P(n^{-1/2})$. Similar arguments for the statistics $T_{2n}(u,v)$ and $T_{3n}(u,v)$ in (6.1) give
$$ \hat k(u,v) = \tilde T_n(u,v) + o_P(n^{-1/2}). $$
For the investigation of the remaining (dominating) term $\tilde T_n(u,v)$ we note that $\Delta\varepsilon_{\cdot,i}$ and $\Delta\varepsilon_{\cdot,j}$ are independent whenever $|i - j| \ge 2$, which yields
$$ E[\Delta\varepsilon_{u,i} \Delta\varepsilon_{u,i+2} \Delta\varepsilon_{v,i} \Delta\varepsilon_{v,i+2}] = E[\varepsilon(u,t_i)\varepsilon(v,t_i) + \varepsilon(u,t_{i-1})\varepsilon(v,t_{i-1})]\; E[\varepsilon(u,t_{i+2})\varepsilon(v,t_{i+2}) + \varepsilon(u,t_{i+1})\varepsilon(v,t_{i+1})] $$
$$ = [r(t_i, u, v) + r(t_{i-1}, u, v)][r(t_{i+2}, u, v) + r(t_{i+1}, u, v)] = 4\, r(t_i, u, v)\, r(t_{i+2}, u, v) + O(n^{-\gamma}) $$
by the Lipschitz continuity of the covariance function $r$. Observing the definition of $\tilde T_n(u,v)$, this gives
$$ E[\tilde T_n(u,v)] = \frac{1}{n-3} \sum_{i=2}^{n-2} r(t_i, u, v)\, r(t_{i+2}, u, v) + O(n^{-\gamma}) = k(u,v) + o(n^{-1/2}). $$
A similar calculation shows that the variance of $\tilde T_n$ is of order $O(n^{-1})$, which yields the assertion of Proposition 2.3. □
              n     Mean      Var     0.15     0.1      0.05     0.01
  f1(u,t)    25   -0.2484   1.4008   0.1192   0.0920   0.0664   0.0334
             50   -0.1359   1.2099   0.1336   0.1022   0.0632   0.0262
            100   -0.0975   1.0773   0.1328   0.0954   0.0544   0.0202
            200   -0.0290   1.0516   0.1464   0.1062   0.0674   0.0208
            500   -0.0373   1.0537   0.1514   0.1064   0.0578   0.0170
  f2(u,t)    25   -1.28     1.4253   0.0382   0.0264   0.0152   0.0056
             50   -0.6477   1.2543   0.0726   0.0538   0.0372   0.0146
            100   -0.3862   1.1260   0.0886   0.0676   0.0434   0.0164
            200   -0.2797   1.0379   0.1014   0.0718   0.0402   0.0102
            500   -0.1455   1.0267   0.1226   0.0802   0.0444   0.0134

Table 1: Simulated rejection probabilities of the test (2.14) under the null hypothesis H0 : m(u,t) = fi(u,t), i = 1, 2, where the regression functions f1 and f2 are given in (5.1) and (5.2), respectively.

              n     Mean      Var     0.15     0.1      0.05     0.01
  f1(u,t)    25    2.52      4.32    0.764    0.705    0.615    0.480
             50    3.72      3.92    0.936    0.914    0.874    0.740
            100    5.13      3.68    0.997    0.995    0.990    0.955
            200    7.18      3.47    1        1        1        1
            500   11.39      3.60    1        1        1        1
  f2(u,t)    25    7.99     23.49    0.981    0.977    0.970    0.939
             50   13.31     25.02    1        1        1        1
            100   19.49     26.67    1        1        1        1
            200   27.52     27.33    1        1        1        1
            500   44.45     29.01    1        1        1        1

Table 2: Simulated rejection probabilities of the test (2.14) under the alternative H1 : m(u,t) = fi(u,t) + 1/2 exp(t), i = 1, 2, where the regression functions f1 and f2 are given in (5.1) and (5.2), respectively.

              n     Mean      Var      0.15     0.1      0.05     0.01
  f1(u,t)    25   11.48     33.036    1        1        0.9996   0.9984
             50   15.47     27.2397   1        1        1        1
            100   21.09     24.1177   1        1        1        1
            200   29.52     23.459    1        1        1        1
            500   45.92     22.6557   1        1        1        1
  f2(u,t)    25    5.047    11.9046   0.9398   0.9174   0.0876   0.7932
             50    8.645    14.3088   0.9988   0.9982   0.9966   0.9866
            100   12.51     14.4536   1        1        1        1
            200   17.69     13.7865   1        1        1        1
            500   27.69     13.3727   1        1        1        1

Table 3: Simulated rejection probabilities of the test (2.14) under the alternative H1 : m(u,t) = fi(u,t) + sin(2πt), i = 1, 2, where the regression functions f1 and f2 are given in (5.1) and (5.2), respectively.

              n    0.15     0.1      0.05     0.01
  f1(u,t)    25   0.15     0.108    0.055    0.020
             50   0.15     0.101    0.058    0.016
  f2(u,t)    25   0.158    0.108    0.057    0.020
             50   0.154    0.095    0.051    0.013

Table 4: Simulated rejection probabilities of the bootstrap test under the null hypothesis H0 : m(u,t) = fi(u,t), i = 1, 2, where the regression functions f1 and f2 are given in (5.1) and (5.2), respectively.

              n    0.15     0.1      0.05     0.01
  f1(u,t)    25   0.856    0.792    0.640    0.440
             50   0.956    0.942    0.908    0.756
  f2(u,t)    25   0.996    0.990    0.980    0.926
             50   1        1        1        1

Table 5: Simulated rejection probabilities of the bootstrap test under the alternative H1 : m(u,t) = fi(u,t) + 1/2 exp(t), i = 1, 2, where the regression functions f1 and f2 are given in (5.1) and (5.2), respectively.

              n    0.15     0.1      0.05     0.01
  f1(u,t)    25   0.957    0.922    0.857    0.673
             50   0.998    0.996    0.986    0.918
  f2(u,t)    25   0.988    0.977    0.952    0.797
             50   0.999    0.998    0.997    0.985

Table 6: Simulated rejection probabilities of the bootstrap test under the alternative H1 : m(u,t) = fi(u,t) + sin(2πt), i = 1, 2, where the regression functions f1 and f2 are given in (5.1) and (5.2), respectively.

              n    0.15     0.1      0.05     0.01
  H0         25   0.148    0.106    0.062    0.022
             50   0.152    0.104    0.048    0.010
            100   0.160    0.104    0.052    0.018
            200   0.148    0.116    0.062    0.016
            500   0.150    0.104    0.06     0.012
  H1         25   0.978    0.954    0.910    0.792
             50   0.996    0.992    0.980    0.940
            100   1        1        1        0.998
            200   1        1        1        1
            500   1        1        1        1

Table 7: Simulated rejection probabilities of the bootstrap test for the hypothesis (5.8). Under H0 : m(u,t) = f1(u,t); under H1 : m(u,t) = f1(u,t) + 1/2 exp(t).

References

Achieser, N. I. (1956). Theory of Approximation. Frederick Ungar Publishing Co., New York.

Besse, P. and Ramsay, J. O. (1986). Principal components of sampled functions. Psychometrika, 51:285–311.
Bickel, P. and Wichura, M. J. (1971). Convergence criteria for multiparameter stochastic processes and some applications. Annals of Mathematical Statistics, 42:1656–1670.

Billingsley, P. (1968). Convergence of Probability Measures. Wiley, New York.

Brodeau, F. (1993). Tests for the choice of approximative models in nonlinear regression when the variance is unknown. Statistics, 24(2):95–106.

Cardot, H., Ferraty, F., Mas, A., and Sarda, P. (2003). Testing hypotheses in the functional linear model. Scandinavian Journal of Statistics, 30:241–255.

Cardot, H., Goia, A., and Sarda, P. (2004). Testing for no effect in functional linear regression models, some computational approaches. Communications in Statistics - Simulation and Computation, 33:179–199.

Cuevas, A., Febrero, M., and Fraiman, R. (2002). Linear functional regression: the case of fixed design and functional response. Canadian Journal of Statistics, 30:285–300.

Dette, H., Munk, A., and Wagner, T. (1999). Testing model assumptions in multivariate linear regression models. Journal of Nonparametric Statistics, 12:309–342.

Escabias, M., Aguilera, A. M., and Valderrama, M. J. (2004). Principal component estimation of functional logistic regression: discussion of two different approaches. Journal of Nonparametric Statistics, 16:365–384.

Faraway, J. J. (1997). Regression analysis for a functional response. Technometrics, 39:254–261.

Ferraty, F. and Vieu, P. (2003). Curves discrimination: a nonparametric functional approach. Computational Statistics and Data Analysis, 44:161–173.

Ferraty, F. and Vieu, P. (2006). Nonparametric Functional Data Analysis: Theory and Practice. Springer, New York.

Gallant, A. R. (1987). Nonlinear Statistical Models. Wiley, New York.

Hlubinka, D. and Prchal, L. (2007). Changes in atmospheric radiation from the statistical point of view. Computational Statistics and Data Analysis, 51:4926–4941.

Kneip, A. and Utikal, K. (2001). Time trends in the joint distribution of income and age. In Economic Essays, A Festschrift for Werner Hildenbrand. Springer Verlag.

Kokoszka, P., Maslova, I., Sojka, J., and Zhu, L. (2008). Testing for lack of dependence in the functional linear model. The Canadian Journal of Statistics, 36:207–222.

Mas, A. (2007). Testing for the mean of random curves: a penalization approach. Statistical Inference for Stochastic Processes, 10:147–163.

Müller, H. G. and Stadtmüller, U. (2005). Generalized functional linear models. Annals of Statistics, 33:774–805.

Orey, S. (1958). A central limit theorem for m-dependent random variables. Duke Mathematical Journal, 25:543–546.

Ramsay, J. O. and Dalzell, C. J. (1991). Some tools for functional data analysis (with discussion). Journal of the Royal Statistical Society, Series B, 53:539–572.

Ramsay, J. O. and Silverman, B. W. (2005). Functional Data Analysis, 2nd ed. Springer, New York.

Rice, J. (1984). Bandwidth choice for nonparametric regression. Annals of Statistics, 12:1215–1230.

Sacks, J. and Ylvisaker, D. (1970). Designs for regression problems with correlated errors III. Annals of Mathematical Statistics, 41:2057–2074.

Seber, G. A. F. and Wild, C. J. (1989). Nonlinear Regression. John Wiley and Sons Inc., New York.

Shen, Q. and Faraway, J. (2004). An F test for linear models with functional responses. Statistica Sinica, 14:1239–1257.

Wu, C. F. J. (1986). Jackknife, bootstrap and other resampling methods in regression analysis. Annals of Statistics, 14:1261–1295.

Yang, X., Shen, Q., Xu, H., and Shoptaw, S. (2007). Functional regression analysis using an F test for longitudinal data with large numbers of repeated measures.
Statistics in Medicine, 26:1552–1566.