Estimating a bivariate density when there are extra data on one or both components
Date
2005-10-12T06:58:23Z
Abstract
Assume we have a dataset, 𝒵 say, from the joint distribution of random variables X and Y, and two further, independent datasets, 𝒳 and 𝒴, from the marginal distributions of X and Y, respectively. We wish to combine 𝒳, 𝒴 and 𝒵 so as to construct an estimator of the joint density. This problem is readily solved in some parametric circumstances. For example, if the joint distribution were normal then we would combine data from 𝒳 and 𝒵 to estimate the mean and variance of X; proceed analogously to estimate the mean and variance of Y; but use data from 𝒵 alone to estimate E(XY). However, the problem is more difficult in a nonparametric setting. There we suggest a copula-based solution, which has potential benefits even when the marginal datasets 𝒳 and 𝒴 are empty. For example, if the copula density is sufficiently smooth in the region where we wish to estimate it, then the effective dimension of the structure that links the marginal distributions is relatively low, and the joint density of X and Y can be estimated with a high degree of accuracy. Similar improvements in performance are available if the marginals are close to being independent. We suggest using wavelet estimators to approximate the copula density, which in cases of statistical interest can be unbounded along boundaries. Our techniques are also useful for solving recently considered related problems, for example where the marginal distributions are determined by parametric models. Therefore the methodology has application beyond the context that motivated it. The methodology is also readily extended to more general multivariate settings.
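The parametric example in the abstract (the bivariate normal case) can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' code: the function name `pooled_normal_estimates` and its arguments `z`, `x_extra` and `y_extra` are hypothetical, and the covariance is formed from the plug-in identity Cov(X, Y) = E(XY) − E(X)E(Y). The paired data and the extra marginal data are pooled for the means and variances, while E(XY) uses the paired data alone.

```python
import numpy as np


def pooled_normal_estimates(z, x_extra, y_extra):
    """Estimate the mean vector and covariance matrix of (X, Y).

    z        : array of shape (n, 2), paired observations of (X, Y)
    x_extra  : array of shape (m,), extra observations of X only (may be empty)
    y_extra  : array of shape (k,), extra observations of Y only (may be empty)

    All names here are hypothetical and chosen for illustration.
    """
    z = np.asarray(z, dtype=float)
    # Pool the paired data with the extra marginal data for each component.
    x_all = np.concatenate([z[:, 0], np.asarray(x_extra, dtype=float)])
    y_all = np.concatenate([z[:, 1], np.asarray(y_extra, dtype=float)])

    mean = np.array([x_all.mean(), y_all.mean()])
    var_x = x_all.var(ddof=1)
    var_y = y_all.var(ddof=1)

    # E(XY) can only be estimated from the paired observations.
    exy = np.mean(z[:, 0] * z[:, 1])
    cov_xy = exy - mean[0] * mean[1]

    cov = np.array([[var_x, cov_xy], [cov_xy, var_y]])
    return mean, cov


# Usage on simulated data (the sample sizes and parameters are arbitrary).
rng = np.random.default_rng(0)
true_mean = [1.0, -2.0]
true_cov = [[1.0, 0.6], [0.6, 1.0]]
z = rng.multivariate_normal(true_mean, true_cov, size=2000)
x_extra = rng.normal(true_mean[0], 1.0, size=20000)
y_extra = rng.normal(true_mean[1], 1.0, size=20000)

mean_hat, cov_hat = pooled_normal_estimates(z, x_extra, y_extra)
print(mean_hat)
print(cov_hat)
```

Because the variances and the cross moment come from different samples, the estimated covariance matrix is not guaranteed to be positive definite in small samples; a practical implementation would need to check for this.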
Keywords
Copula, Dimension reduction, Independence, Kernel method, Prediction, Threshold, Wavelet