Methods for efficient resource utilization in statistical machine learning algorithms

dc.contributor.advisor: Marwedel, Peter
dc.contributor.author: Kotthaus, Helena
dc.contributor.referee: Rahnenführer, Jörg
dc.date.accepted: 2018-06-07
dc.date.accessioned: 2018-06-15T12:39:18Z
dc.date.available: 2018-06-15T12:39:18Z
dc.date.issued: 2018
dc.description.abstract: In recent years, statistical machine learning has emerged as a key technique for tackling problems that elude a classic algorithmic approach. One such problem, with a major impact on human life, is the analysis of complex biomedical data. Solving this problem in a fast and efficient manner is of major importance, as it enables, e.g., the prediction of the efficacy of different drugs for therapy selection. While achieving the highest possible prediction quality appears desirable, doing so is often simply infeasible due to resource constraints. Statistical learning algorithms for predicting the health status of a patient, or for finding the best algorithm configuration for the prediction, require an excessively high amount of resources. Furthermore, these algorithms are often implemented with no awareness of the underlying system architecture, which leads to sub-optimal resource utilization. This thesis presents methods for efficient resource utilization in statistical learning applications. The goal is to reduce the resource demands of these algorithms so that they meet a given time budget while simultaneously preserving the prediction quality. As a first step, the resource-consumption characteristics of learning algorithms are analyzed, as well as their scheduling on underlying parallel architectures, in order to develop optimizations that enable these algorithms to scale to larger problem sizes. For this purpose, new profiling mechanisms are incorporated into a holistic profiling framework. The results show that one major contributor to the resource issues is memory consumption. To overcome this obstacle, a new optimization based on dynamic sharing of memory is developed; it speeds up computation by several orders of magnitude in situations where available main memory is the bottleneck and the system is forced to swap memory out to disk. One important technique for automated parameter tuning of learning algorithms is model-based optimization.
Within a huge search space, algorithm configurations are evaluated to find the configuration with the best prediction quality. An important step towards better managing this search space is to parallelize the search process itself. However, high runtime variance within the configuration space can cause inefficient resource utilization. To address this, new resource-aware scheduling strategies are developed that efficiently map evaluations of configurations to the parallel architecture, depending on their resource demands. In contrast to classical scheduling problems, the new scheduler interacts with the configuration-proposal mechanism to select configurations with suitable resource demands. With these strategies, it becomes possible to exploit the full potential of parallel architectures. Compared to established parallel execution models, the results show that the new approach enables model-based optimization to converge faster to the optimum within a given time budget.
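The dynamic memory sharing mentioned in the abstract operates inside the language runtime; the thesis realizes it for R. The toy sketch below (Python; `SharedPool` and all other names are invented for illustration, not the thesis implementation) demonstrates only the underlying principle: identical immutable buffers are detected and handed out as references to one shared copy instead of being duplicated per consumer.

```python
import sys

class SharedPool:
    """Toy illustration of dynamic memory sharing: identical immutable
    buffers are stored once and handed out by reference, so n logical
    copies cost roughly the memory of one."""

    def __init__(self):
        self._pool = {}

    def intern(self, data: bytes) -> bytes:
        # setdefault stores the buffer on first sight and returns the
        # already-stored object for every later identical buffer.
        return self._pool.setdefault(data, data)

pool = SharedPool()
chunk = bytes(range(256)) * 4096  # ~1 MiB buffer
# 100 logical copies, but only one physical buffer survives interning:
copies = [pool.intern(bytes(chunk)) for _ in range(100)]
assert all(c is copies[0] for c in copies)
one_copy = sys.getsizeof(copies[0])
print(f"shared: ~{one_copy} B instead of ~{100 * one_copy} B")
```

Real page-level sharing (e.g. copy-on-write) happens below the object level, but the effect is the same: redundant data stops multiplying the resident memory footprint.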
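The resource-aware scheduling can be pictured, in highly simplified form, as load balancing over predicted runtimes. The following Python sketch is an illustrative assumption, not the thesis's scheduler (which additionally interacts with the configuration-proposal mechanism): it uses the classic longest-processing-time (LPT) greedy heuristic to map candidate configurations onto parallel workers so that no worker idles while one long-running evaluation blocks the rest.

```python
import heapq

def schedule_resource_aware(candidates, n_workers):
    """Greedily map candidate configurations to parallel workers.

    candidates: list of (config, predicted_quality, predicted_runtime).
    Returns one (total_load, worker_id, assigned_configs) triple per
    worker, sorted by worker id.
    """
    # Place longest-running candidates first (LPT heuristic) so that
    # per-worker loads stay balanced.
    ordered = sorted(candidates, key=lambda c: -c[2])
    # Min-heap keyed by current load: always fill the least-loaded worker.
    workers = [(0.0, i, []) for i in range(n_workers)]
    heapq.heapify(workers)
    for config, quality, runtime in ordered:
        load, wid, assigned = heapq.heappop(workers)
        assigned.append(config)
        heapq.heappush(workers, (load + runtime, wid, assigned))
    return sorted(workers, key=lambda w: w[1])

# Hypothetical configurations with predicted quality and runtime (seconds):
candidates = [("A", 0.91, 8.0), ("B", 0.88, 1.0), ("C", 0.90, 5.0),
              ("D", 0.87, 2.0), ("E", 0.89, 4.0)]
for load, wid, assigned in schedule_resource_aware(candidates, n_workers=2):
    print(f"worker {wid}: {assigned} (est. {load:.1f}s)")
```

With two workers, both batches finish in an estimated 10 s; a runtime-oblivious round-robin assignment could instead leave one worker idle for several seconds per round, which is exactly the inefficiency the resource-aware strategies target.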
dc.identifier.uri: http://hdl.handle.net/2003/36929
dc.identifier.uri: http://dx.doi.org/10.17877/DE290R-18928
dc.language.iso: en
dc.subject: R language
dc.subject: Resource optimization
dc.subject: Performance analysis
dc.subject: Machine learning
dc.subject: Memory optimization
dc.subject: Profiling
dc.subject: Resource-constrained systems
dc.subject: Embedded systems
dc.subject: Black-box optimization
dc.subject: Hyperparameter tuning
dc.subject: Model selection
dc.subject: Model-based optimization
dc.subject: Resource-aware scheduling
dc.subject: Parallelization
dc.subject.ddc: 004
dc.subject.rswk: Algorithmen
dc.title: Methods for efficient resource utilization in statistical machine learning algorithms
dc.type: Text
dc.type.publicationtype: doctoralThesis
dcterms.accessRights: open access
eldorado.secondarypublication: false

Files

Original bundle
Name: Dissertation_Kotthaus.pdf
Size: 4.13 MB
Format: Adobe Portable Document Format
Description: DNB
License bundle
Name: license.txt
Size: 4.85 KB
Format: Item-specific license agreed upon to submission