"Integrative statistical methods for analyzing biomedical data: applications in health and disease”
Date
2025
Abstract
In a series of four complementary studies, we apply innovative integrative statistical methods to diverse biomedical datasets to address both fundamental research questions and practical challenges in health and disease. Two of these investigations focus on the in vivo alkaline comet assay, a pivotal tool for assessing DNA damage as a marker of genotoxicity. In the first comet assay study (Article 1), we examine how the choice of centrality measure affects the evaluation of tail intensity data. Using both original experimental data and simulations, we demonstrate that even subtle differences in the summary statistic (median, arithmetic mean, or geometric mean) can lead to markedly different statistical conclusions and dose–response interpretations. These findings emphasize the critical need for careful methodological selection in genotoxicity assessments. In the second comet assay study (Article 2), we compile and analyze extensive historical control data from multiple laboratories. This investigation addresses key statistical issues, including inter-laboratory variability and the handling of zero-valued measurements, and discusses whether the findings of the first paper regarding centrality measures and their regulatory interpretation also hold for these data.
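As a minimal sketch of why the choice of centrality measure matters, the following Python snippet computes the median, arithmetic mean, and geometric mean for one right-skewed set of values. The data, the function name `summarize`, and the `offset` parameter for zero-valued measurements are hypothetical illustrations and are not taken from the articles.

```python
import math
import statistics


def summarize(tail_intensities, offset=0.0):
    """Return three centrality measures for a list of tail intensity values.

    `offset` is a hypothetical constant added before taking logarithms so
    that zero-valued measurements do not make the geometric mean undefined.
    It is subtracted again after back-transformation.
    """
    shifted = [x + offset for x in tail_intensities]
    if any(v <= 0 for v in shifted):
        # The logarithm is undefined for non-positive values.
        geometric_mean = float("nan")
    else:
        mean_log = sum(math.log(v) for v in shifted) / len(shifted)
        geometric_mean = math.exp(mean_log) - offset
    return {
        "median": statistics.median(tail_intensities),
        "mean": statistics.fmean(tail_intensities),
        "geometric_mean": geometric_mean,
    }


# Hypothetical right-skewed values for a single sample.
cells = [0.5, 1.0, 1.0, 2.0, 4.0, 8.0, 32.0]
result = summarize(cells)
print(result)
```

For skewed data like this, the three summaries differ noticeably (the median is the smallest and the arithmetic mean the largest), so any downstream test on per-sample summaries can depend on which one is chosen. How zeros are handled, here through an assumed additive offset, is a further choice that changes the geometric mean.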
In the third study (Article 3), we introduce a novel multi-omics approach to better understand Alzheimer’s disease (AD). By integrating genome-wide DNA methylation profiles with high-resolution metabolomics data from prefrontal cortex tissue samples, we develop innovative single-, joint- and multi-omics profile scores using machine learning and advanced regression techniques. Using these profile scores significantly improves the prediction of AD neuropathology. The analysis also uncovers pivotal biological pathways, such as lipid metabolism and signal transduction, that are potentially involved in driving disease progression. These findings underscore the potential of combining multiple omics layers to elucidate complex molecular interactions underlying neurodegenerative disorders.
Complementing these human-focused studies, our fourth investigation (Article 4) applies hierarchical modeling to veterinary epidemiology, specifically targeting respiratory diseases in piglet production. We compare frequentist and Bayesian hierarchical regression models to assess the influence of various environmental and management factors (including floor condition, water flow rates, stocking density, and indoor climate conditions) on respiratory health outcomes in pigs. By accounting for the multi-level structure inherent in farm data (spanning individual animals, pens, compartments, and farms), we demonstrate that Bayesian approaches with informative priors can effectively overcome challenges posed by small sample sizes and high inter-cluster variability. This ultimately provides more robust estimates and practical insights for disease management in livestock production.
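The stabilizing effect of an informative prior in small samples can be illustrated with a much simpler model than the hierarchical regressions used in the study. The sketch below assumes a single normal mean with known noise standard deviation and a conjugate normal prior. The function name `normal_posterior`, the observations, and all prior settings are hypothetical.

```python
def normal_posterior(prior_mean, prior_sd, observations, noise_sd):
    """Posterior mean and SD of a normal mean under a conjugate normal prior.

    Assumes the observation noise SD is known. The posterior precision is
    the sum of the prior precision and the data precision, and the posterior
    mean is the precision-weighted average of prior mean and sample mean.
    """
    n = len(observations)
    sample_mean = sum(observations) / n
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = n / noise_sd ** 2
    post_precision = prior_precision + data_precision
    post_mean = (
        prior_precision * prior_mean + data_precision * sample_mean
    ) / post_precision
    post_sd = post_precision ** -0.5
    return post_mean, post_sd


# Hypothetical small sample of three effect estimates.
obs = [0.9, 0.1, 0.8]

# Informative prior centered at 0 versus a nearly flat prior.
informative = normal_posterior(0.0, 0.5, obs, noise_sd=1.0)
vague = normal_posterior(0.0, 10.0, obs, noise_sd=1.0)
print(informative, vague)
```

With only three observations, the informative prior pulls the estimate toward the prior mean and yields a narrower posterior, while the vague prior leaves the estimate close to the sample mean with more uncertainty. A full hierarchical model applies the same precision-weighting idea across animals, pens, compartments, and farms.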
Collectively, the four works of my cumulative thesis illustrate how tailored, integrative statistical methodologies can enhance our understanding of complex biological systems. These methodologies improve decision-making across a spectrum of applications, ranging from the regulatory evaluation of chemical safety and the elucidation of neurodegenerative disease mechanisms to the optimization of animal health in agricultural settings. The work emphasizes that the choice of statistical methods is not merely a technical detail but a pivotal factor that can substantially alter study outcomes and subsequent interpretations in both clinical and applied research environments. While the first two manuscripts have been published, the third and fourth have been submitted and are attached in their current versions.
Keywords
Integrative statistics, Multi-omics analysis, Genotoxicity assessment, Hierarchical modeling, Biomedical data
Subjects based on RSWK
Datenanalyse (data analysis), Medizinische Statistik (medical statistics), Mutagenität (mutagenicity), Nervendegeneration (neurodegeneration)