Finite Bayesian mixture models with applications in spatial cluster analysis and bioinformatics

Schäfer, Martin

Full metadata record

DC Field	Value	Language
dc.contributor.advisor	Ickstadt, Katja	-
dc.contributor.author	Schäfer, Martin	-
dc.date.accessioned	2015-12-07T12:44:01Z	-
dc.date.available	2015-12-07T12:44:01Z	-
dc.date.issued	2015	-
dc.identifier.uri	http://hdl.handle.net/2003/34392	-
dc.identifier.uri	http://dx.doi.org/10.17877/DE290R-16464	-
dc.description.abstract	In many statistical applications, one encounters populations that form homogeneous subgroups regarding one or several characteristics. Across the subgroups, however, heterogeneity may often be found. Mixture distributions are a natural means to model data from such applications. This PhD thesis is based on two projects that focus on such applications. In the first project, spatial nanoscale clusters formed by Ras proteins in the cell membrane are investigated. Such clusters play a crucial role in intracellular communication and are thus of interest in cancer research. In this case, the subgroups are clustered and non-clustered proteins. In the second project, epigenomic data obtained from sequencing experiments are integrated with another genomic or epigenomic input, aiming, e.g., to detect genes that contribute to the development of cancer. Here, the subgroups are defined by a) genes presenting congruent (epi)genomic aberrations in both considered variables, b) genes presenting incongruent aberrations, and c) genes lacking aberrations in at least one of the variables. Employing a Bayesian framework, objects are classified in both projects by fitting finite univariate mixture distributions with a small fixed number of components to values of a score that summarizes the information relevant to the research question. Such mixture distributions have favorable characteristics in terms of interpretation and show little sensitivity to label switching in Markov chain Monte Carlo analyses. Mixtures of gamma distributions are considered for Ras proteins, while mixtures of normal and exponential or gamma distributions are a focus for the bioinformatic analysis. In the latter, classification is the primary goal, while in the Ras protein application, estimating key parameters of the spatial clustering is of more interest. The results of both projects are presented in this thesis.
For both applications, the methods have been implemented in software and their performance is compared with competing approaches on experimental as well as simulated data. To ensure an appropriate simulation of Ras protein patterns, a new cluster point process model called the double Matérn cluster process is developed and described in this thesis.	en
dc.language.iso	en	de
dc.subject	Bayesian statistics	en
dc.subject	Finite mixture model	en
dc.subject	Spatial cluster analysis	en
dc.subject	Matérn cluster process	en
dc.subject	Nearest neighbor distances	en
dc.subject	Gene transcription	en
dc.subject	ChIP-seq data	en
dc.subject	Integrative analysis	en
dc.subject.ddc	310	-
dc.title	Finite Bayesian mixture models with applications in spatial cluster analysis and bioinformatics	en
dc.type	Text	de
dc.contributor.referee	Rahnenführer, Jörg	-
dc.date.accepted	2015-11-06	-
dc.type.publicationtype	doctoralThesis	en
dcterms.accessRights	open access	-
Appears in Collections:	Lehrstuhl Mathematische Statistik und biometrische Anwendungen

Files in This Item:

File	Description	Size	Format
Dissertation.pdf	DNB	12.07 MB	Adobe PDF

This item is protected by original copyright

