Eldorado - Repository of the TU Dortmund
Resources for and from Research, Teaching and Studying
This is the institutional repository of the TU Dortmund. Resources for Research, Study and Teaching are archived and made publicly available.
Recent Submissions
Discovering nucleotide-level and structural variants in cancer genome data from second- and third-generation sequencing technologies
(2024) Hartmann, Till; Köster, Johannes; Teubner, Jens
Bioinformatics is a diverse field that, as the name suggests, bridges biology and computer science. For example, in (human) cancer research, one commonly
1. obtains a blood and/or tissue sample of a patient in a study,
2. sequences or otherwise analyses the sample on a specialized device to obtain relevant information (such as its genome, transcriptome or methylome),
3. determines variation between the sample and some reference sample(s) or determines other aspects that may be of interest,
4. annotates, filters and analyses these for further examination,
5. and subsequently makes use of them to further one’s research or study goal.
In this thesis, we mainly concern ourselves with the last three steps, though we also explain the sequencing process for two different technologies. We focus on the genome part of the second step, i.e. the DNA contained within the cells of most organisms. In this context, determining variation usually involves comparing many short DNA sequences to a larger reference DNA sequence (such as "the human genome").
Homopolymer-aware PairHMM:
We introduce a homopolymer-aware PairHMM, which addresses one major issue of Oxford Nanopore Technologies (ONT) sequencing: due to the design of the technology, the lengths of so-called homopolymers (runs of identical nucleotides) are often called inaccurately, which negatively affects downstream results. This specialized model allows for more accurate alignments and probability estimates, and can be applied to any ONT sequencing data.
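To make the homopolymer problem concrete, the following minimal Python sketch locates runs of identical nucleotides and shows how two reads that differ only in a homopolymer length become identical once runs are collapsed. It is an illustration only and not the PairHMM described in the thesis; the function names `homopolymer_runs` and `collapse_homopolymers` and the example reads are hypothetical.

```python
from itertools import groupby


def homopolymer_runs(seq, min_len=2):
    """Return (base, start, length) for every run of identical bases
    that is at least min_len long (hypothetical helper)."""
    runs = []
    pos = 0
    for base, group in groupby(seq):
        length = sum(1 for _ in group)
        if length >= min_len:
            runs.append((base, pos, length))
        pos += length
    return runs


def collapse_homopolymers(seq):
    """Replace every run of identical bases by a single base."""
    return "".join(base for base, _ in groupby(seq))


# Two made-up reads that differ only in the length of the T run.
read_a = "ACGTTTTAAC"
read_b = "ACGTTTAAC"

runs_a = homopolymer_runs(read_a)
same_after_collapse = collapse_homopolymers(read_a) == collapse_homopolymers(read_b)
```

Collapsing runs discards the length information entirely, whereas a homopolymer-aware model, as described above, accounts for the uncertainty in run length rather than ignoring it.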
Detecting extrachromosomal circular DNA:
The homopolymer-aware PairHMM finds practical application in the calling of extrachromosomal circular DNA (eccDNA), i.e. DNA that is both circular and located outside the chromosomal structure of the genome. eccDNA plays an important part in cancer research, as a biomarker for certain cancers or as a way to track tumour progression. We developed a graph-based method to detect eccDNA in ONT sequencing samples, and also provide an easy-to-set-up workflow that guarantees reproducibility and provides an interactive report for exploring the results and documenting the pipeline.
Detecting copy number variation:
A different kind of variation is copy number variation, where (larger) regions of a sample's genome have been repeated or deleted. Over the years, many approaches making use of different kinds of information have been explored, for example read-depth information. Determining variation is often done by comparing, mapping or aligning short DNA sequences to a reference sequence. For each position in the reference sequence, one can then count the short sequences that overlap that position; this count is the read depth. Read-depth information may then be used to locate copy number variation and estimate its magnitude. However, the mapping process needed to acquire the read-depth information can be both computationally expensive and demanding in terms of disk space. Therefore, to improve resource usage, we use a kind of pseudo-mapping based on k-mers instead, and show that using k-mer counts is at least as good as relying on classic read-depth information, while at the same time saving disk space.
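The two kinds of information contrasted above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the pseudo-mapping method of the thesis: `read_depth` assumes that alignments are already given as hypothetical (start, length) pairs on the reference, while `kmer_counts` works on the raw reads without any alignment step.

```python
from collections import Counter


def read_depth(reference_length, alignments):
    """Per-position read depth from (start, length) alignments
    (0-based start positions; hypothetical input format)."""
    depth = [0] * reference_length
    for start, length in alignments:
        for pos in range(start, min(start + length, reference_length)):
            depth[pos] += 1
    return depth


def kmer_counts(reads, k):
    """Count every k-mer occurring in the reads; no mapping required."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


# Two made-up reads of length 3 aligned at positions 0 and 2 of a 6 bp reference.
depth = read_depth(6, [(0, 3), (2, 3)])

# k-mer counts taken directly from two made-up reads.
counts = kmer_counts(["ACGT", "CGTA"], k=3)
```

Regions with a gain in copy number would show elevated depth in the first representation and elevated counts for the k-mers they contain in the second.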
Filtering of variant calls:
Because the field of bioinformatics wouldn't be the same without its custom file formats, we address one specific format: the Variant Call Format (VCF). VCF is a text-based format describing genomic variation, for example a duplication (copy number variation), a gene fusion or a circle. As the format is widely used and there initially was no binary counterpart, it can be tempting to resort to established text-processing tools such as awk, sed and grep. However, the intricacies of the file format essentially prohibit this, as the results will likely not be what one expects. Even with existing specialized tools such as bcftools, certain combinations of syntax and semantics are quite unintuitive and can lead to incorrect results. We therefore provide vembrane, a VCF tool which does not introduce its own domain-specific language but uses Python instead. It also has built-in support for certain custom annotations (such as those provided by SnpEff or VEP) and is the only tool which handles breakend records correctly. For example, extrachromosomal circular DNA variants have to be encoded using such breakend records.
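A small Python sketch illustrates why plain text matching on VCF records is fragile. It does not use or reproduce vembrane; `parse_info` and `parse_record` are hypothetical helpers, and the record is made up (the `ADP` key is an assumed annotation chosen only because its name contains `DP`).

```python
def parse_info(info):
    """Split a VCF INFO column into a dict; flags without '=' map to True."""
    fields = {}
    if info == ".":
        return fields
    for entry in info.split(";"):
        key, sep, value = entry.partition("=")
        fields[key] = value if sep else True
    return fields


def parse_record(line):
    """Parse the eight fixed, tab-separated columns of a VCF data line."""
    cols = line.rstrip("\n").split("\t")
    return {
        "CHROM": cols[0],
        "POS": int(cols[1]),
        "ID": cols[2],
        "REF": cols[3],
        "ALT": cols[4].split(","),
        "QUAL": cols[5],
        "FILTER": cols[6],
        "INFO": parse_info(cols[7]),
    }


# Made-up record: DP is 10, but the text "DP=100" occurs inside "ADP=100".
line = "chr1\t100\t.\tA\tT\t50\tPASS\tDP=10;ADP=100"

# A grep-like substring search reports a match although DP is not 100.
naive_hit = "DP=100" in line

# Parsing the INFO column first gives the intended answer.
record = parse_record(line)
parsed_hit = int(record["INFO"]["DP"]) == 100
```

Here the substring search matches while the parsed comparison does not, which is the kind of unexpected result the paragraph above warns about. Real VCF files add further complications (multiple values per key, per-sample columns, breakend notation) that this sketch does not handle.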
Amtliche Mitteilungen der Technischen Universität Dortmund Nr. 3/2025
(Technische Universität Dortmund, 2025-01-27)
AI in process industries
(2023-04-13) Bortz, Michael; Dadhe, Kai; Engell, Sebastian; Gepert, Vanessa; Kockmann, Norbert; Müller-Pfefferkorn, Ralph; Schindler, Thorsten; Urbas, Leon
The chemical industry is one of the key industrial sectors in Germany and at the same time one of the largest consumers of energy and raw materials. A successful energy transition and the development of a circular economy can only succeed if they are actively supported and shaped by the chemical industry, through the redesign of existing production processes and the exploration and implementation of new process routes. The challenge is to realize this transformation within a very short time and for many production processes, whereby a much larger number of process routes must be explored. Digital technologies are key to mastering this transformation towards more sustainability and better climate and environmental protection. The KEEN project aims to explore and leverage artificial intelligence (AI) opportunities in the process industry. The newly developed AI methods are tested wherever possible in real working environments and production plants to prove the economic benefit, applicability, and reliability of the methods and technologies.
Comprehensive study of the enhanced reactivity of turbo-Grignard reagents
(2023-03-27) Hermann, Andreas; Seymen, Rana; Brieger, Lukas; Kleinheider, Johannes; Grabe, Bastian; Hiller, Wolf; Strohmann, Carsten
Since their introduction in 2004, Knochel's so-called Turbo-Grignard reagents have revolutionized the usage of Grignard reagents. Through the simple addition of LiCl to a magnesium alkyl, an outstanding increase in reactivity can be achieved. Though the exact composition of the reactive species has remained mysterious, the reactive mixture itself is readily used not only in synthesis but has also found its way into more distant fields like material science. To unravel this mystery, we combined single-crystal X-ray diffraction with in-solution NMR spectroscopy and concluded our investigations with quantum chemical calculations. Using such a variety of methods, we have gained insight into, and an explanation for, the extraordinary reactivity of this extremely convenient reagent by determining the structure of the first bimetallic reactive species [t-Bu2Mg ⋅ LiCl ⋅ 4 thf] with two tert-butyl anions at the magnesium center and incorporated lithium chloride.
Design of module type package services for modular downstream units and process analytic technology
(2023-05-10) Bittorf, Lukas; Oeing, Jonas; Kock, Tobias; Garreis, Robert; Kockmann, Norbert
Modularization of process plants, with its standardization activities, is one of the current responses to dynamic markets, shorter product life cycles, and uncertain supply chains. Standardized solutions for intelligent process equipment assemblies with their own automation promise high potential for the chemical and pharmaceutical industries. Despite the standardized description of the module type package (MTP) and the corresponding service concept, the implementation of the service logic is left to the manufacturer, which often results in a variety of granular services for different process functions or assemblies. In this contribution, different service design approaches for a generic 'separate' service are investigated using the examples of a solvent extraction and a distillation column. Additionally, a Raman spectroscopy device for process analysis is implemented via MTP with an 'analyze' service. Pros and cons of the different service design approaches are discussed in the context of fast and flexible process development in the laboratory.