Efficient GPU Offloading with OpenMP for a Hyperbolic Finite Volume Solver on Dynamically Adaptive Meshes

Mario Wille; Tobias Weinzierl; Gonzalo Brito Gadeschi; Michael Bader

doi:https://doi.org/10.1007/978-3-031-32041-5_4


Title:: Efficient GPU Offloading with OpenMP for a Hyperbolic Finite Volume Solver on Dynamically Adaptive Meshes
Document type:: Book chapter
Author(s):: Mario Wille; Tobias Weinzierl; Gonzalo Brito Gadeschi; Michael Bader
Pages contribution:: 65-85
Chapter contribution:: HPC Algorithms and Applications
Abstract:: We identify and show how to overcome an OpenMP bottleneck in the administration of GPU memory. It arises for a wave equation solver on dynamically adaptive block-structured Cartesian meshes, which keeps all CPU threads busy and allows all of them to offload sets of patches to the GPU. Our studies show that multithreaded, concurrent, non-deterministic access to the GPU leads to performance breakdowns, since the GPU memory bookkeeping as offered through OpenMP’s map clause, i.e., the allocation and freeing, becomes another runtime challenge besides expensive data transfer and actual computation. We, therefore, propose to retain the memory management responsibility on the host: A caching mechanism acquires memory on the accelerator for all CPU threads, keeps hold of this memory and hands it out to the offloading threads upon demand. We show that this user-managed, CPU-based memory administration helps us to overcome the GPU memory bookkeeping bottleneck and speeds up the time-to-solution of Finite Volume kernels by more than an order of magnitude.
Book title:: High Performance Computing
Book subtitle:: 38th International Conference, ISC High Performance 2023, Hamburg, Germany, May 21–25, 2023, Proceedings
Series:: Lecture Notes in Computer Science
Volume:: 13948
Publisher:: Springer
Date of publication:: 10.05.2023
Year:: 2023
Year / month:: 2023-05
Month:: May
Pages:: 65-85
Print-ISBN:: 978-3-031-32040-8
E-ISBN:: 978-3-031-32041-5
Reviewed:: yes
Language:: en
DOI:: doi:https://doi.org/10.1007/978-3-031-32041-5_4
Notes:: https://doi.org/10.5281/zenodo.7741217
TUM Institution:: Department of Informatics
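
The caching idea summarized in the abstract (the host acquires accelerator memory once, keeps hold of it, and hands it out to offloading threads on demand) can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `DeviceBufferCache`, its methods, and the exact-size reuse policy are hypothetical choices made for illustration. The allocator and deallocator are passed in as callbacks so the sketch runs on a plain host; in an OpenMP setting they might wrap `omp_target_alloc` and `omp_target_free`, which is an assumption about how one could wire it up.

```cpp
#include <cstddef>
#include <functional>
#include <map>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical host-side cache for accelerator buffers (illustrative only).
// Buffers are keyed by their exact size in bytes and reused instead of freed.
class DeviceBufferCache {
public:
  using Alloc = std::function<void*(std::size_t)>;
  using Free  = std::function<void(void*)>;

  DeviceBufferCache(Alloc alloc, Free release)
      : alloc_(std::move(alloc)), free_(std::move(release)) {}

  DeviceBufferCache(const DeviceBufferCache&) = delete;
  DeviceBufferCache& operator=(const DeviceBufferCache&) = delete;

  // Frees only the idle buffers; buffers still handed out are the
  // caller's responsibility in this sketch.
  ~DeviceBufferCache() {
    for (auto& entry : idle_)
      for (void* p : entry.second) free_(p);
  }

  // Hand out an idle buffer of the requested size; allocate only on a miss.
  void* acquire(std::size_t bytes) {
    std::lock_guard<std::mutex> lock(mutex_);
    auto it = idle_.find(bytes);
    if (it != idle_.end() && !it->second.empty()) {
      void* p = it->second.back();
      it->second.pop_back();
      return p;
    }
    ++allocations_;
    return alloc_(bytes);
  }

  // Return a buffer to the cache rather than freeing it on the device.
  void release(void* p, std::size_t bytes) {
    std::lock_guard<std::mutex> lock(mutex_);
    idle_[bytes].push_back(p);
  }

  // Number of real allocations performed so far (cache misses).
  std::size_t allocations() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return allocations_;
  }

private:
  Alloc alloc_;
  Free free_;
  mutable std::mutex mutex_;
  std::map<std::size_t, std::vector<void*>> idle_;
  std::size_t allocations_ = 0;
};
```

In this sketch a thread would call `acquire` before offloading a set of patches and `release` afterwards, so that repeated offloads of equally sized data reuse the same device memory and the expensive allocation happens only on the first request for each size.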