An extension of a very fast direct finite element Poisson solver on lower precision accelerator hardware towards semi-structured grids
Abstract
Graphics cards equipped with Tensor Core units designed for AI applications, for example the
NVIDIA Ampere A100, promise very high peak computing rates (156 TFLOP/s in single and
312 TFLOP/s in half precision in the case of the A100). These rates are only achieved when
performing arithmetically intensive operations such as
dense matrix multiplications in the aforementioned lower precision, which is an obstacle when
trying to use this hardware for solving linear systems arising from PDEs discretized with the
finite element method. In previous works, we delivered a proof of concept that the predecessor
of the A100, the V100 and its Tensor Cores, can be exploited to a great extent when solving
Poisson’s equation on the unit square if a hardware-oriented direct solver based on prehandling
via hierarchical finite elements and a Schur complement approach is used. In this work,
numerical results on an A100 graphics card show that the method also achieves very high
performance when Poisson’s equation, discretized by linear finite elements, is solved on a
more complex domain corresponding to a flow-around-a-square configuration.
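
The Schur complement idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' solver: it uses a 1D model problem instead of the 2D domains, double precision NumPy on the CPU instead of Tensor Cores, and no hierarchical finite element prehandling. The function names, the choice of interface nodes and the problem size are assumptions made for illustration only. The point is that, once the interior block is inverted, the remaining work consists of dense matrix products, which is the kind of operation the abstract identifies as suitable for this hardware.

```python
import numpy as np

def poisson_1d_stiffness(n):
    """Stiffness matrix for -u'' = f on (0, 1) with linear finite elements,
    n interior nodes and homogeneous Dirichlet boundary conditions."""
    h = 1.0 / (n + 1)
    A = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h
    return A, h

def schur_solve(A, b, interface):
    """Solve A x = b by eliminating the interior unknowns first
    (hypothetical helper, dense illustration only)."""
    n = A.shape[0]
    gamma = np.asarray(interface)
    inner = np.setdiff1d(np.arange(n), gamma)

    A_ii = A[np.ix_(inner, inner)]
    A_ig = A[np.ix_(inner, gamma)]
    A_gi = A[np.ix_(gamma, inner)]
    A_gg = A[np.ix_(gamma, gamma)]

    # Explicit dense inverse of the interior block: applying it afterwards
    # is a plain matrix product rather than a triangular solve.
    A_ii_inv = np.linalg.inv(A_ii)

    # Schur complement and reduced right-hand side on the interface.
    S = A_gg - A_gi @ A_ii_inv @ A_ig
    g = b[gamma] - A_gi @ (A_ii_inv @ b[inner])

    x = np.empty(n)
    x[gamma] = np.linalg.solve(S, g)                      # small interface system
    x[inner] = A_ii_inv @ (b[inner] - A_ig @ x[gamma])    # back-substitution
    return x

n = 63
A, h = poisson_1d_stiffness(n)
b = h * np.ones(n)                         # load vector for f = 1
x = schur_solve(A, b, interface=[15, 31, 47])
x_ref = np.linalg.solve(A, b)              # reference: direct solve of the full system
max_diff = np.max(np.abs(x - x_ref))
```

In this sketch the three interface nodes split the interval into four independent subintervals, so the interior block is block diagonal. Whether and how the elimination is ordered in the actual method is not stated in the abstract.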
Keywords
accelerator hardware, lower precision, hierarchical finite elements, prehandling, NVIDIA A100, tensor core GPUs
