Authors: Ruda, Dustin; Turek, Stefan; Ribbrock, Dirk
Dates: 2024-02-11; 2024-02-11; 2024-01
ISSN: 2190-1767
URI: http://hdl.handle.net/2003/42317
DOI: 10.17877/DE290R-24154
Abstract: The impetus for the research presented in this work is provided by recent developments in the field of GPU computing. Nvidia GPUs that are equipped with Tensor Cores, such as the A100 or the latest H100, promise an immense computing power of 156 and 495 TFLOPS, respectively, but only for dense matrix operations carried out in single precision (with even higher rates in half precision), since this serves their actual purpose of accelerating AI training. It is shown that this performance can also be exploited to a large extent in the domain of matrix-based finite element methods for solving PDEs, if specially tailored, hardware-oriented methods are used. Such methods need to preserve sufficient accuracy, even if single precision is used, and mostly consist of dense matrix operations. A semi-iterative method for solving Poisson’s equation in 2D and 3D based on prehandling, i.e., explicit preconditioning, by means of hierarchical finite elements or generating systems, that satisfies these requirements, is derived and analyzed. Actual benchmark results on an H100 allow the determination of optimal solver configurations in terms of performance, which ultimately exceeds that of a standard geometric multigrid solver on CPU.
Language: en
Series: Ergebnisberichte des Instituts für Angewandte Mathematik; 671
Classification: 610
Title: Fast Semi-Iterative Finite Element Poisson Solvers for Tensor Core GPUs Based on Prehandling
Type: Preprint
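
The idea of prehandling mentioned in the abstract, i.e., explicitly transforming the stiffness matrix into a hierarchical basis so that the remaining work consists of dense matrix products that tolerate single precision, can be illustrated with a small 1D toy problem. The sketch below is not the authors' 2D/3D solver: the 1D setting, the use of NumPy on the CPU, and the helper names `stiffness_1d` and `hierarchical_basis_1d` are assumptions made purely for illustration.

```python
import numpy as np

def stiffness_1d(levels):
    """P1 stiffness matrix of -u'' on (0, 1) with n = 2**levels - 1 interior nodes."""
    n = 2**levels - 1
    h = 1.0 / (n + 1)
    return (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h

def hierarchical_basis_1d(levels):
    """Columns hold the fine-grid nodal values of the hierarchical hat functions."""
    n = 2**levels - 1
    x = np.arange(1, n + 1) / (n + 1)  # interior fine-grid nodes
    cols = []
    for level in range(1, levels + 1):
        half_width = 2.0**-level
        for k in range(1, 2**(level - 1) + 1):
            center = (2 * k - 1) * half_width
            cols.append(np.maximum(0.0, 1.0 - np.abs(x - center) / half_width))
    return np.column_stack(cols)

levels = 6
A = stiffness_1d(levels)
S = hierarchical_basis_1d(levels)

# Explicitly transformed matrix, followed by symmetric diagonal scaling.
B = S.T @ A @ S
d = 1.0 / np.sqrt(np.diag(B))
B_scaled = d[:, None] * B * d[None, :]

cond_A = np.linalg.cond(A)
cond_B = np.linalg.cond(B_scaled)

# Solve -u'' = 1 with dense products in single precision only.
# In 1D, B is diagonal, so its inverse is just the scaling d * d.
n = A.shape[0]
h = 1.0 / (n + 1)
b = np.full(n, h, dtype=np.float32)  # load vector for f = 1
S32 = S.astype(np.float32)
d32 = d.astype(np.float32)
u32 = S32 @ (d32 * d32 * (S32.T @ b))

# Linear finite elements are nodally exact for this 1D problem.
x = np.arange(1, n + 1) / (n + 1)
u_exact = 0.5 * x * (1.0 - x)
max_err = float(np.max(np.abs(u32 - u_exact)))

print(f"cond(A) = {cond_A:.1f}, cond(B_scaled) = {cond_B:.6f}, max error = {max_err:.2e}")
```

In this toy setting the condition number of the nodal-basis matrix grows like h^-2, while the transformed and scaled matrix is the identity up to rounding, so the single-precision solve reduces to two dense matrix-vector products. The 1D case is special in this respect: in 2D and 3D the transformed matrix is no longer diagonal, which is where the semi-iterative method described in the abstract comes in.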