Performance Analysis for Parallel R Programs: Towards Efficient Resource Utilization

dc.contributor.author: Kotthaus, Helena
dc.contributor.author: Korb, Ingo
dc.contributor.author: Marwedel, Peter
dc.date.accessioned: 2018-10-11T13:48:34Z
dc.date.available: 2018-10-11T13:48:34Z
dc.date.issued: 2015-01
dc.description.abstract: Parallel computing is becoming increasingly popular as R is used more and more to process large data sets. We have therefore extended traceR to also profile parallel applications. The tool supports common cases such as parallelization across multiple cores or across multiple machines. For parallel performance analysis, we added measurements such as the CPU utilization of parallel tasks and the memory usage of parallel programs during execution. Our analysis concentrates on embarrassingly parallel applications, which consist of independent tasks. One example that is embarrassingly parallel and also has high resource utilization is model selection. Here the goal is to find the best machine learning algorithm configuration for building a model for the given data, which requires searching a huge model space. Since the gain from parallel execution can be negated if the memory requirements of all parallel processes exceed the capacity of the system, our profiling data can serve as a constraint for determining the degree of parallelism and can guide the distribution of parallel R applications. Our goal is to provide a resource-aware parallelization strategy. To develop such a strategy, we first need to analyze the performance of parallel applications. In the following, we therefore describe different parallel example applications and show how traceR is applied to analyze them. [en]
dc.identifier.uri: http://hdl.handle.net/2003/37163
dc.identifier.uri: http://dx.doi.org/10.17877/DE290R-19159
dc.language.iso: en [de]
dc.relation.ispartofseries: Technical report / Sonderforschungsbereich Verfügbarkeit von Information durch Analyse unter Ressourcenbeschränkung; 1/2015
dc.subject: parallel computing [en]
dc.subject: traceR [en]
dc.subject.ddc: 004
dc.subject.rswk: Parallelverarbeitung [de]
dc.title: Performance Analysis for Parallel R Programs: Towards Efficient Resource Utilization [en]
dc.type: Text [de]
dc.type.publicationtype: report [de]
dcterms.accessRights: open access
eldorado.secondarypublication: false [de]

Files

Original bundle
Name: kotthaus_2015a.pdf
Size: 894.25 KB
Format: Adobe Portable Document Format
Description: DNB
License bundle
Name: license.txt
Size: 4.85 KB
Format: Item-specific license agreed upon to submission