Statistical analysis of genotype and gene expression data

Schwender, Holger

dc.contributor.advisor	Ickstadt, Katja
dc.contributor.author	Schwender, Holger
dc.contributor.referee	Weihs, Claus
dc.date.accepted	2007-02-20
dc.date.accessioned	2007-02-26T14:15:15Z
dc.date.available	2007-02-26T14:15:15Z
dc.date.issued	2007-02-26T14:15:15Z
dc.description.abstract	A common and important goal in cancer research is the identification of genetic markers, such as genes or genetic variations, that make it possible to determine whether a person has a particular type of cancer or that lead to a higher risk of developing cancer. In recent years, many biotechnologies for measuring these markers have been developed. The most prominent examples are microarrays, which can be used, e.g., to measure the expression levels of tens of thousands of genes simultaneously. The most widely used type of microarray is the Affymetrix GeneChip, on which each gene is represented by eleven pairs of probes. The corresponding probe intensities have to be preprocessed, i.e. summarized to one expression value per gene, before variable selection and classification methods can be applied to the gene expression data. This thesis is based on two projects. The goals of the first project are to identify the preprocessing method for Affymetrix microarrays that leads to the most efficient data reduction, and to provide software that makes it possible to apply this procedure to data from studies comprising hundreds of Affymetrix GeneChips. The results of this project are presented in this thesis. The second project is concerned with SNPs (single nucleotide polymorphisms), i.e. variations at a single base-pair position in the genome. While a vast number of papers on the analysis of gene expression data have been published, only a few variable selection and classification methods that address the specific needs of SNP data analysis have been proposed. One of the exceptions is logic regression. In this thesis, it is shown how approaches for the analysis of gene expression data can be adapted to SNP data, and a procedure based on a bagging version of logic regression is proposed that enables the detection of SNP interactions that explain a higher cancer risk.
Furthermore, two measures for quantifying the importance of each of these interactions for prediction are presented and compared with existing measures.	en
dc.identifier.uri	http://hdl.handle.net/2003/23306
dc.identifier.uri	http://dx.doi.org/10.17877/DE290R-8430
dc.identifier.urn	urn:nbn:de:hbz:290-2003/23306-7
dc.language.iso	en	en
dc.subject	Microarray	en
dc.subject	Single nucleotide polymorphism	en
dc.subject	SNP	en
dc.subject	Variable selection	en
dc.subject	Classification	en
dc.subject	Preprocessing	en
dc.subject	Cancer risk	en
dc.subject.ddc	310
dc.title	Statistical analysis of genotype and gene expression data	en
dc.type	Text	de
dc.type.publicationtype	doctoralThesis	en
dcterms.accessRights	open access

Files

Original bundle

Name: diss_schwender.pdf
Size: 1.89 MB
Format: Adobe Portable Document Format
Description: DNB

License bundle

Name: license.txt
Size: 1.93 KB
Format: Item-specific license agreed upon to submission
Description:

Collections

Lehrstuhl Mathematische Statistik und biometrische Anwendungen