Authors: Kopczynski, Dominik
Title: Resource-constrained analysis of ion mobility spectrometry data
Language (ISO): en
Abstract: Over the past decades, numerous analytical devices, e.g. mass spectrometers or liquid and gas chromatographs, have been engineered to measure different properties of molecules, compounds or even complex structures. One such device, the ion mobility spectrometer (IMS), exploits the different mobilities of ionized analytes. Its advantages are low production and maintenance costs (e.g. no high vacuum is required, unlike in mass spectrometry), fast capture (a few milliseconds suffice) and a high resolution down to the parts per billion (ppb) by volume range. An ion mobility spectrometer coupled with a multi-capillary column (MCC) for pre-separation achieves a resolution that is higher by several orders of magnitude. Substantial research has investigated its feasibility for clinical and biotechnological applications, especially clinical diagnosis and live monitoring. Ongoing miniaturization yields devices as small as a mobile phone, enabling mobile applications. In critical places such as main stations or sports stadiums, mobile devices are conceivable for detecting drugs or explosives. Another application scenario is a mobile device that monitors a patient's breath at home. For such scenarios, it is essential that the data is analyzed directly on the device right after capture. The amount of data, the complexity of the two-dimensional spectra, and the time and device restrictions require analysis software specifically designed for this application. The basis of MCC/IMS analysis is a representation of all high-intensity regions (peaks) in a measurement by a few descriptive parameters per peak instead of the full measurement data, a process that we refer to as peak extraction. The position of a peak indicates the corresponding analyte, and its signal intensity provides information about the analyte's concentration. These peaks can hint at several features, e.g. diseases in clinical diagnostics.
Previous work mainly concentrated on extracting the position of a peak's highest signal intensity (the mode). Using statistical distributions, we introduce a function that requires only seven descriptive parameters to approximate the complete shape of a peak. The straightforward nature of this function and its intuitive descriptors simplify and accelerate the methods that estimate the descriptor set for every detected peak. Additional post-processing steps, such as comparison with a reference or aligning or clustering a set of measurements, further simplify the provided peak model and add precision to it. Given a measurement and the proposed peak model, the peaks have to be detected and the model descriptors estimated automatically. We introduce two methods for this task. The first, offline peak model estimation, automatically reduces one measurement to a set of peak models without any restrictions, i.e. the data is completely available during the whole analysis and can be accessed as often as required; space and time restrictions are not taken into account. The purpose of this method is to serve as a basis whose approaches are then redesigned for online analysis. The second method is referred to as online peak model estimation. It may store only one or a small number of consecutive ion mobility spectra and discards the raw data directly after the analysis. Additionally, this analysis is subject to a strict time restriction imposed by the device itself (a new ion mobility spectrum is captured every 100 ms) and should run in time even on current embedded systems such as the Raspberry Pi. This method, too, should provide a list of peak descriptors. For that purpose, we redesigned particular methods to satisfy these restrictions. This makes the method suitable for mobile detection devices.
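The memory restriction of the online setting can be illustrated with a minimal sketch. It is not the dissertation's method: the function name `online_peak_candidates`, the fixed three-spectrum window and the plain intensity threshold are assumptions made for illustration only. The sketch consumes spectra one at a time, keeps just three consecutive spectra in memory, and reports positions in the middle spectrum that exceed the threshold and are strictly larger than their eight neighbours.

```python
from collections import deque


def online_peak_candidates(spectra, threshold):
    """Yield (spectrum_index, signal_index, intensity) for strict local maxima.

    Hypothetical illustration: `spectra` is any iterable of equally long
    sequences of intensities, delivered one spectrum at a time.
    """
    window = deque(maxlen=3)  # older spectra are discarded automatically
    for index, spectrum in enumerate(spectra):
        window.append(spectrum)
        if len(window) < 3:
            continue  # need a predecessor and a successor spectrum
        previous, middle, following = window
        for j in range(1, len(middle) - 1):
            value = middle[j]
            if value < threshold:
                continue
            neighbours = (
                middle[j - 1], middle[j + 1],
                previous[j - 1], previous[j], previous[j + 1],
                following[j - 1], following[j], following[j + 1],
            )
            if all(value > n for n in neighbours):
                # The candidate lies in the middle spectrum, one step back.
                yield (index - 1, j, value)
```

Because the function is a generator over an arbitrary iterable, the raw data never has to be held as a whole; each spectrum leaves memory once it has passed through the window. A real implementation would additionally have to estimate the model descriptors for each candidate within the time budget between two spectra.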
To find commonalities and differences among a set of measurements for further classification or timeline analysis, it is necessary to find and connect peaks that originate from the same analyte. We refer to these clusters, which cover several peaks from different measurements, as consensus peaks. Several clustering methods have already been introduced in the literature, but many have the disadvantage of requiring the number of clusters a priori. We introduce an enhanced variant of the classic EM (expectation-maximization) algorithm that determines the number of clusters dynamically. Additionally, we present the main ideas of an efficient implementation that makes the clustering method feasible on embedded systems as well. As the EM algorithm works with statistical models, the information obtained in the peak extraction step can be applied efficiently, providing a more precise clustering. In addition, we introduce a method to align either a peak list containing peak descriptors or consensus peak descriptors against a reference list of potentially previously discovered analytes and their parameters. This method also employs statistical models and statistical optimization methods. Since all utilized statistical methods and models are rather expensive in terms of computation time and almost always involve the costly exponential function, we introduce an approximate exponential function as a substitute. This function computes an exponential value up to 4 to 6 times faster than the exact functions of standard libraries, with only a minimal loss of precision. The speedup is made possible by exploiting the binary representation of floating-point values within a processor. These features are desirable for applications on embedded systems. All methods and implementations are evaluated in detail in terms of computation time, accuracy and plausibility.
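How the binary representation of floating-point values can yield a cheap exponential is sketched below. This is a well-known trick of this kind (writing a scaled copy of the argument directly into the bit pattern of an IEEE 754 double) and not necessarily the function developed in the dissertation; the name `approx_exp` and the correction constant are assumptions chosen for illustration. The sketch only shows the bit-level idea: the speed advantage appears in compiled code, not in Python.

```python
import math
import struct

# Assumed constants for IEEE 754 doubles (52 mantissa bits, exponent bias 1023).
_SCALE = (1 << 52) / math.log(2.0)  # turns x into x / ln(2), shifted into the exponent field
_BIAS = 1023 << 52                  # bit pattern of the exponent bias
_CORRECTION = 45799 << 32           # illustrative offset that balances the relative error


def approx_exp(x):
    """Approximate e**x for moderate |x| (roughly |x| < 700).

    The integer x / ln(2) * 2**52 placed into the bit pattern makes the
    exponent field hold floor(x / ln 2), while the mantissa interpolates
    linearly between neighbouring powers of two.
    """
    bits = int(_SCALE * x) + _BIAS - _CORRECTION
    # Reinterpret the 64-bit integer as a double without numeric conversion.
    return struct.unpack("<d", struct.pack("<q", bits))[0]
```

With this choice of constant the relative error stays within a few percent; whether such a loss is acceptable depends on the statistical method that consumes the values.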
Subject Headings: Ion mobility spectrometry
Analysis of spectrometry data
Fully automated analysis workflow
Subject Headings (RSWK): Ionenbeweglichkeitsspektroskopie
Issue Date: 2017
Appears in Collections: LS 11

Files in This Item:
File                         Description  Size     Format
Dissertation_Kopczynski.pdf  DNB          1.76 MB  Adobe PDF

This item is protected by original copyright
