Very fast finite element Poisson solvers on lower precision accelerator hardware: A proof of concept study for Nvidia Tesla V100

Ruda, Dustin; Turek, Stefan; Ribbrock, Dirk; Zajac, Peter

dc.contributor.author	Ruda, Dustin
dc.contributor.author	Turek, Stefan
dc.contributor.author	Ribbrock, Dirk
dc.contributor.author	Zajac, Peter
dc.date.accessioned	2023-03-27T11:38:30Z
dc.date.available	2023-03-27T11:38:30Z
dc.date.issued	2022-05-06
dc.description.abstract	Recently, accelerator hardware in the form of graphics cards including Tensor Cores, specialized for AI, has significantly gained importance in the domain of high-performance computing. For example, NVIDIA’s Tesla V100 promises a computing power of up to 125 TFLOP/s achieved by Tensor Cores, but only if half precision floating point format is used. We describe the difficulties and discrepancy between theoretical and actual computing power if one seeks to use such hardware for numerical simulations, that is, solving partial differential equations with a matrix-based finite element method, with numerical examples. If certain requirements, namely low condition numbers and many dense matrix operations, are met, the indicated high performance can be reached without an excessive loss of accuracy. A new method to solve linear systems arising from Poisson’s equation in 2D that meets these requirements, based on “prehandling” by means of hierarchical finite elements and an additional Schur complement approach, is presented and analyzed. We provide numerical results illustrating the computational performance of this method and compare it to a commonly used (geometric) multigrid solver on standard hardware. It turns out that we can exploit nearly the full computational power of Tensor Cores and achieve a significant speed-up compared to the standard methodology without losing accuracy.	en
dc.identifier.uri	http://hdl.handle.net/2003/41313
dc.identifier.uri	http://dx.doi.org/10.17877/DE290R-23156
dc.language.iso	en	de
dc.relation.ispartofseries	The international journal of high performance computing applications;36(4)
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/
dc.subject	Accelerator hardware	en
dc.subject	Tensor core GPUs	en
dc.subject	NVIDIA V100	en
dc.subject	Prehandling	en
dc.subject	Hierarchical finite elements	en
dc.subject	Poisson's equation	en
dc.subject.ddc	510
dc.title	Very fast finite element Poisson solvers on lower precision accelerator hardware: A proof of concept study for Nvidia Tesla V100	en
dc.type	Text	de
dc.type.publicationtype	article	de
dcterms.accessRights	open access
eldorado.dnb.deposit	false	de
eldorado.secondarypublication	true	de
eldorado.secondarypublication.primarycitation	Ruda D, Turek S, Ribbrock D, Zajac P. Very fast finite element Poisson solvers on lower precision accelerator hardware: A proof of concept study for Nvidia Tesla V100. The International Journal of High Performance Computing Applications. 2022;36(4):459-474. doi:10.1177/10943420221084657	de
eldorado.secondarypublication.primaryidentifier	https://doi.org/10.1177/10943420221084657	de

Files

Original bundle

Name: 10943420221084657.pdf
Size: 1.94 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 4.85 KB
Format: Item-specific license agreed to upon submission

Collections

Lehrstuhl III Angewandte Mathematik und Numerik