Document type:
Masterarbeit
Author(s):
Saroliya, Urvij
Title:
Reinforcement Learning based Resource Management for HPC Systems
Translated title:
Reinforcement Learning basierte Ressourcenmanagement für HPC-Systeme
Abstract:
In recent years there has been an exponential rise in the capabilities of modern High Performance Computing (HPC) systems. This trend poses new challenges for managing node-level resources such as compute cores, memory bandwidth, and shared cache. This has led to an increasing demand for effective resource management methodologies in HPC systems. As modern HPC systems are typically composed of fat and rich compute nodes, it is usually difficult to fully utilize all the in-node resources by...
Translated abstract:
In den letzten Jahren ist die Leistungsfähigkeit moderner High Performance Computing (HPC)-Systeme exponentiell gestiegen. Dieser Trend stellt neue Herausforderungen für die Verwaltung von Ressourcen auf Knotenebene wie Rechenkernen, Speicherbandbreite und gemeinsam genutztem Cache dar. Dies hat zu einer steigenden Nachfrage nach effektiven Ressourcenmanagementmethoden in HPC-Systemen geführt. Da moderne HPC-Systeme typischerweise aus Fat- und Rich-Rechenknoten bestehen, ist es in der Regel schw...
Keywords:
NUMA Systems, GPUs, Co-Scheduling, Resource Management, Reinforcement Learning
Subject:
DAT Datenverarbeitung, Informatik
DDC:
000 Informatik, Wissen, Systeme
Advisor:
Arima, Eishi; Liu, Dai
Referee:
Schulz, Martin (Prof. Dr.)
Year:
2023
Pages:
89
Language:
en
Language from translation:
de
University:
Technische Universität München
Faculty:
TUM School of Computation, Information and Technology