ls1-mardyn is a molecular dynamics code specialized in very large-scale
simulations of up to 4.125 × 10^12 particles [EHB+13]. Although the
established approach for its force calculation, which uses coloring and
OpenMP loop parallelization, proves very efficient, it still leaves potential
for optimization because of its use of barriers. A promising way to replace
these barriers is task-based scheduling, which offers higher flexibility
and finer granularity. Here, the highly dynamic task-scheduling library
Quicksched [GCS16] has been used to implement task-based shared-memory
parallelization approaches for the force calculations in ls1-mardyn. Their
performance was tested on an Intel Xeon Phi Knights Corner accelerator
and compared to the established approaches that use domain coloring and
OpenMP loop parallelization. In a direct comparison, the efficiency of the
highly optimized established approaches proves hard to beat.
Yet an in-depth analysis of the scheduling identifies potential performance
gains and shows that the task-based approaches exploit them.
However, task-based approaches introduce new challenges of their
own. These are analyzed and discussed, and possible solutions are proposed.