Numerical algebraic fan of a design for statistical model building
Date
2013-01-29
Abstract
In this article we develop methods for the analysis of non-standard experimental
designs by using techniques from algebraic statistics. Our work is motivated by a
thermal spraying process used to produce a particle coating on a surface, e.g. for
wear protection or durable medical instruments. In this application non-standard
designs occur as intermediate results from initial standard designs in a two-stage
production process. We investigate algebraic methods to derive better identifiable
models with particular emphasis on the second stage of two-stage processes.
Ideas from algebraic statistics are explored in which the design, as a finite set of distinct
experimental settings, is expressed as the solution set of a system of polynomial equations. Thereby
the design is identified with a polynomial ideal, and features and properties of the
ideal provide insight into the structure of the models identifiable by
the design [Pistone et al., 2001, Riccomagno, 2009]. Holliday et al. [1999] apply
these ideas to a problem from the automotive industry with an incomplete standard
factorial design, and Bates et al. [2003] to the question of finding good polynomial metamodels
for computer experiments.

In our thermal spraying application, designs for the controllable process parameters
are run and properties of particles in flight measured as intermediate responses.
The final output describes the coating properties, which are very time-consuming
and expensive to measure as the specimen has to be destroyed. It is desirable to
predict coating properties from process parameters, from particle properties,
or from both. Rudak et al. [2012] provide a first comparison of different modeling
approaches. There are still open questions: which models are identifiable with
the different choices of input (process parameters, particle properties, or both)? Is
it better to base the second model between particle and coating properties on estimated
expected values or the observations themselves? The present article is a
contribution in this direction. In the second stage in particular, the particle properties
used as input variables are observed values resulting from the originally chosen design for the controllable
factors. The resulting design on the particle property level can be tackled
with algebraic statistics to determine identifiable models. However, it turns out that
the resulting models contain terms that are identifiable only because of small deviations
of the design from more regular points, which leads to unwanted, unstable model results.
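
The instability described above can be illustrated with a small numerical sketch. It is not taken from the paper: the design points, the monomial lists and the helper names (`model_matrix`, `identifiability`) are hypothetical, and identifiability is checked simply through the rank and condition number of the model matrix, assuming NumPy is available. A regular 2^2 factorial design identifies the terms 1, x1, x2, x1*x2 with a perfectly conditioned model matrix. Adding a fifth point close to one of the corners makes x1^2 formally identifiable, but only through the small deviation, so the model matrix becomes badly conditioned.

```python
import numpy as np

def model_matrix(points, exponents):
    """Evaluate each monomial (given as an exponent tuple) at each design point."""
    points = np.asarray(points, dtype=float)
    cols = [np.prod(points ** np.asarray(e), axis=1) for e in exponents]
    return np.column_stack(cols)

def identifiability(points, exponents):
    """Return (full column rank?, condition number) of the model matrix."""
    X = model_matrix(points, exponents)
    full_rank = np.linalg.matrix_rank(X) == len(exponents)
    return bool(full_rank), float(np.linalg.cond(X))

# Hypothetical designs: a regular 2^2 factorial and a slightly perturbed extension.
regular = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
perturbed = regular + [(1.01, 0.99)]

# Monomials as exponent tuples: 1, x1, x2, x1*x2 and additionally x1^2.
terms_4 = [(0, 0), (1, 0), (0, 1), (1, 1)]
terms_5 = terms_4 + [(2, 0)]

print(identifiability(regular, terms_4))    # full rank, well conditioned
print(identifiability(regular, terms_5))    # x1^2 is aliased with the constant
print(identifiability(perturbed, terms_5))  # full rank, but ill conditioned
```

In this toy setting x1^2 - 1 vanishes on the four regular points and is tiny at the fifth, which is why the estimate of the x1^2 coefficient rests almost entirely on the deviation of a single point.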
We tackle this problem with tools from algebraic statistics. Because the
data in the second stage are very noisy, we extend existing theory by switching
from symbolic, exact computations to numerical computations in the calculation of
the design ideal and of its fan. Specifically, instead of polynomials whose solutions
are the design points, we identify a design with a set of polynomials which "almost
vanish" at the design points, using results and algorithms from Fassino [2010].
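
To give a rough idea of what "almost vanishing" means, the following sketch looks for polynomials with unit coefficient norm whose values at all design points are small. It is only an illustration under stated assumptions and is not the algorithm of Fassino [2010]: it uses a plain singular value decomposition of the model matrix, and the design, the monomial list, the tolerance `tol` and the function name `almost_vanishing` are all hypothetical.

```python
import numpy as np

def model_matrix(points, exponents):
    """Evaluate each monomial (given as an exponent tuple) at each design point."""
    points = np.asarray(points, dtype=float)
    cols = [np.prod(points ** np.asarray(e), axis=1) for e in exponents]
    return np.column_stack(cols)

def almost_vanishing(points, exponents, tol):
    """Coefficient vectors (unit norm) of polynomials in the span of the given
    monomials whose vector of values at the design points has norm below tol."""
    X = model_matrix(points, exponents)
    _, s, vt = np.linalg.svd(X, full_matrices=True)
    # Pad with zeros when there are more monomials than design points:
    # the remaining right singular vectors vanish exactly on the design.
    s_full = np.concatenate([s, np.zeros(vt.shape[0] - s.size)])
    return [vt[i] for i in range(vt.shape[0]) if s_full[i] < tol]

# Hypothetical noisy design: a 2^2 factorial plus one point near a corner.
design = [(-1, -1), (-1, 1), (1, -1), (1, 1), (1.01, 0.99)]
terms = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 0)]  # 1, x1, x2, x1*x2, x1^2

polys = almost_vanishing(design, terms, tol=0.05)
for coeffs in polys:
    values = model_matrix(design, terms) @ coeffs
    print(np.round(coeffs, 3), float(np.max(np.abs(values))))
```

No nonzero polynomial in these five monomials vanishes exactly on the five distinct points, yet one of them is numerically indistinguishable from zero on the design at the chosen tolerance. Treating it as a member of the design ideal removes the unstable x1^2 term from the set of identifiable models in this toy example.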
The paper is organized as follows. In Section 2 three different approaches towards
the modeling of a final output in a two-stage process are introduced and compared.
The algebraic treatment and reasoning are the same regardless of the approach. Section
3 contains the theoretical background of algebraic statistics for experimental design,
illustrated throughout with the thermal spraying application. Section 4 is the case study itself.