Survival models with preclustered gene groups as covariates
Date
2012-02-21
Abstract
An important application of high-dimensional gene expression measurements is risk prediction
and the interpretation of the variables in the resulting survival models. A major problem in this context is the
typically large number of genes compared to the number of observations (individuals). Feature selection
procedures can generate predictive models with high prediction accuracy and at the same time low model
complexity. However, the interpretability of the resulting models is still limited because little is known about many of
the selected genes. Thus, we summarize genes as gene groups defined by the hierarchically structured
Gene Ontology (GO) and include these gene groups as covariates in the hazard regression models. Since
expression profiles within GO groups are often heterogeneous, we present a new method to obtain subgroups
with coherent patterns. We apply preclustering to genes within GO groups according to the correlation of their
gene expression measurements.
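
The preclustering step can be illustrated with a minimal sketch. The abstract does not specify the clustering algorithm, the correlation measure, or how a subgroup is condensed into a single covariate, so everything below is an assumption chosen for illustration: Pearson correlation, single-linkage grouping at a fixed threshold (the hypothetical parameter `threshold`), and the per-sample mean as the subgroup summary. The gene names and expression values are made up.

```python
from itertools import combinations
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (non-constant profiles assumed)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def precluster(expr, threshold=0.8):
    """Split the genes of one GO group into subgroups of correlated genes.

    expr maps gene name -> expression values across samples.
    Assumption: two genes are linked when their correlation is >= threshold,
    and subgroups are the connected components of these links (single linkage).
    """
    genes = list(expr)
    parent = {g: g for g in genes}

    def find(g):
        # Union-find lookup with path halving.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for g, h in combinations(genes, 2):
        if pearson(expr[g], expr[h]) >= threshold:
            parent[find(g)] = find(h)

    clusters = {}
    for g in genes:
        clusters.setdefault(find(g), []).append(g)
    return list(clusters.values())


def summarize(expr, cluster):
    """Assumed summary: per-sample mean expression over the genes in a subgroup."""
    return [mean(col) for col in zip(*(expr[g] for g in cluster))]


# Hypothetical GO group with four genes measured in five samples.
go_group = {
    "geneA": [1, 2, 3, 4, 5],
    "geneB": [2, 4, 6, 8, 11],
    "geneC": [5, 4, 3, 2, 1],
    "geneD": [6, 5, 4, 2, 1],
}
clusters = precluster(go_group)
covariates = [summarize(go_group, c) for c in clusters]
```

Each entry of `covariates` holds one value per sample and could then enter a hazard regression model as a single covariate in place of the individual genes. In this toy example the heterogeneous GO group splits into two coherent subgroups whose profiles run in opposite directions.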