Original title:

Efficient Algorithms for On-Line Scheduling and Load Distribution in Parallel Systems

Author:

Bischof, Stefan

Year:

1999

Document type:

Dissertation

Institution:

Fakultät für Informatik

Advisor:

Mayr, Ernst W. (Prof.Dr.)

Referee:

Zenger, Christoph (Prof. Dr.); Woeginger, Gerhard J. (Prof.Dr.)

Format:

Text

Language:

en

Subject group:

DAT Datenverarbeitung, Informatik

Keywords:

on-line; algorithm; scheduling; parallel job; task; makespan; UET; hypercube; array; mesh; load-balancing; bisector; bisection; partitioning; upper bound; lower bound; weighted tree; distributed finite element simulation; recursive substructuring; domain decomposition; average-case; martingale; concentration

Controlled terms:

Parallelverarbeitung; Scheduling

TUM classification:

DAT 216d

Abstract:

The efficient operation of parallel computing systems requires the best possible use of the resources that a system provides. In order to achieve an effective utilization of a parallel machine a smart coordination of the resource demands of all currently operating applications is necessary. Dynamic resource management is particularly essential for the parallel solution of irregular problems that arise frequently, for example, during numerical simulations. On-line scheduling and load distribution are widely used techniques for the assignment of resources in a dynamic fashion. This thesis provides a theoretical treatment of both approaches and presents efficient on-line scheduling and load distribution algorithms for a wide range of problems. On-line scheduling of parallel job systems with the goal to minimize the total execution time is studied as a promising framework for the efficient execution of large-scale parallel applications. Under the assumption that the execution times of the jobs meet certain requirements, several efficient on-line scheduling algorithms for various network topologies are described. Thorough competitive analysis shows that almost all of the proposed on-line scheduling algorithms are optimal or near-optimal from a worst-case point of view. It is demonstrated that runtime restrictions improve the competitive performance achievable by on-line scheduling algorithms for parallel job systems, and therefore a satisfactory utilization of a parallel system can be guaranteed in this case. Dynamic load distribution is used for partitioning problems for parallel execution. The only assumption is that a given class of problems has a certain bisection property. Such classes of problems appear, for example, in the context of distributed hierarchical finite element simulations. The sequential and parallel algorithms presented to tackle this load distribution problem yield provably good load balancing even in the worst case. Furthermore, under reasonable stochastic assumptions, it is shown that the average-case performance of these algorithms is substantially better than the worst-case bounds and surprisingly close to the optimum solution in some cases. Extensive simulation studies complement the mathematical analysis, and the results demonstrate that a satisfactory balancing quality can be achieved in this model by efficient algorithms. An integration of the sequential algorithm into existing finite element software already yielded a significant decrease of the total execution time. «

The efficient operation of parallel computing systems requires the best possible use of the resources that a system provides. In order to achieve an effective utilization of a parallel machine a smart coordination of the resource demands of all currently operating applications is necessary. Dynamic resource management is particularly essential for the parallel solution of irregular problems that arise frequently, for example, during numerical simulations. On-line scheduling and load distribution... »

Translated abstract:

[Abstract nur auf Englisch verfügbar.] The efficient operation of parallel computing systems requires the best possible use of the resources that a system provides. In order to achieve an effective utilization of a parallel machine a smart coordination of the resource demands of all currently operating applications is necessary. Dynamic resource management is particularly essential for the parallel solution of irregular problems that arise frequently, for example, during numerical simulations. On-line scheduling and load distribution are widely used techniques for the assignment of resources in a dynamic fashion. This thesis provides a theoretical treatment of both approaches and presents efficient on-line scheduling and load distribution algorithms for a wide range of problems. On-line scheduling of parallel job systems with the goal to minimize the total execution time is studied as a promising framework for the efficient execution of large-scale parallel applications. Under the assumption that the execution times of the jobs meet certain requirements, several efficient on-line scheduling algorithms for various network topologies are described. Thorough competitive analysis shows that almost all of the proposed on-line scheduling algorithms are optimal or near-optimal from a worst-case point of view. It is demonstrated that runtime restrictions improve the competitive performance achievable by on-line scheduling algorithms for parallel job systems, and therefore a satisfactory utilization of a parallel system can be guaranteed in this case. Dynamic load distribution is used for partitioning problems for parallel execution. The only assumption is that a given class of problems has a certain bisection property. Such classes of problems appear, for example, in the context of distributed hierarchical finite element simulations. The sequential and parallel algorithms presented to tackle this load distribution problem yield provably good load balancing even in the worst case. Furthermore, under reasonable stochastic assumptions, it is shown that the average-case performance of these algorithms is substantially better than the worst-case bounds and surprisingly close to the optimum solution in some cases. Extensive simulation studies complement the mathematical analysis, and the results demonstrate that a satisfactory balancing quality can be achieved in this model by efficient algorithms. An integration of the sequential algorithm into existing finite element software already yielded a significant decrease of the total execution time. «

[Abstract nur auf Englisch verfügbar.] The efficient operation of parallel computing systems requires the best possible use of the resources that a system provides. In order to achieve an effective utilization of a parallel machine a smart coordination of the resource demands of all currently operating applications is necessary. Dynamic resource management is particularly essential for the parallel solution of irregular problems that arise frequently, for example, during numerical simulations. O... »

Publication :

Universitätsbibliothek der TU München

Date of submission:

21.01.1999

Oral examination:

09.09.1999

File size:

3195915 bytes

Pages:

194

Urn (citeable URL):

Last change:

26.06.2007