Modern distributed-memory HPC systems consist of thousands of compute nodes, comprising millions of cores, interconnected via a high-performance network. These systems are shared among thousands of users who submit and execute their applications on them. The software component responsible for scheduling jobs and allocating resources to them is a middleware called the Resource and Job Management System (RJMS); its central element for resource management and scheduling is the batch scheduler. The submitted applications are typically large-scale simulations from domains such as astronomy, climate modeling, biology, and fluid and molecular dynamics. Moreover, the resource requirements of these applications can change dynamically during their runtime. Changing the resources of an application at runtime requires both an adaptive parallel runtime system and a dynamic resource management system, yet no current RJMS supports the dynamic reconfiguration of running applications.

To address this, in this thesis the scalable workload manager SLURM is extended to support the dynamic reconfiguration of resource-elastic applications written using the Invasive MPI adaptive library. Several SLURM binaries are extended so that users can submit resource-elastic jobs in batch mode (see the sketch below). The batch scheduler in SLURM is extended through a scheduling plugin to support the efficient combined scheduling of rigid and malleable applications. Moreover, multiple scheduling strategies for elastic applications are implemented and evaluated. Finally, the overhead of the dynamic adaptation operations, i.e., expansion and reduction, is analyzed.
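As an illustration, the following is a minimal sketch of how such a resource-elastic job might be submitted in batch mode. It uses only standard SLURM batch-script syntax: the --nodes=<min>-<max> range is an existing SLURM option for requesting a node-count range, and the idea that the elastic extension may grow or shrink the running job within this range is an assumption made here for illustration. Any additional options introduced by the extended SLURM binaries, and the application name elastic_app, are placeholders, not part of stock SLURM.

    #!/bin/bash
    # Minimal, hypothetical batch script for a resource-elastic job.
    # --nodes=<min>-<max> is standard SLURM syntax for a node-count range;
    # the assumption here is that the elastic extension may expand or
    # reduce the running job within this range. Options specific to the
    # extended binaries are intentionally not shown.
    #SBATCH --job-name=elastic_sim
    #SBATCH --nodes=2-8            # at least 2 nodes, at most 8 nodes
    #SBATCH --time=01:00:00

    # elastic_app stands in for an application written with the
    # Invasive MPI adaptive library.
    srun ./elastic_app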