Parallel computing relies on a variety of system architectures and hardware
configurations, since the application context typically determines which machines
are suitable. Optimizing memory management and exploiting hardware characteristics
is challenging for an application programmer, especially when the target platform
may change.
This bachelor’s thesis investigates the computational speed-up and the performance
portability achieved with the framework "Kokkos" in the context of the shallow water
equations. The framework allows performance-portable code to be written for
heterogeneous architectures, with the aim of optimizing computation time
independently of the underlying hardware. This is achieved by abstracting the
interfaces of the computational devices and by exploiting hardware-specific
characteristics such as data layout and memory performance. The legacy
implementation is compared with the Kokkos implementation on the LRZ-Cluster using
several Intel KNL processors. To assess performance portability, the implementation
is also evaluated on different GPU generations. The results show that Kokkos does
produce performance-portable code, which in most cases is even faster than the
legacy approaches.