Nowadays, deep neural network models are at the peak of their popularity and find
applications in a variety of fields, e.g. in translation engines built on Natural Language
Processing. Training such networks requires enormous computing resources, can take up to
2 weeks, and most often relies on rather naive first-order optimization algorithms. Given
that modern deep neural networks have many millions of parameters, second-order methods
have long been considered infeasible because their complexity is quadratic in the network size.
Previous studies have shown that combining Newton's method with the conjugate gradients
method and fast exact Hessian-vector multiplication (short: Newton-CG) yields speed and accuracy
benefits in areas such as image classification and neural machine translation. In our work, we
investigated whether different learning-rate schedulers can amplify these benefits while
eliminating some of the drawbacks of standard Newton-CG.
Newton-CG with a learning-rate scheduler allows for larger initial learning rates, while
remaining stable close to a minimum, and thus enables faster training.
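The key idea behind Newton-CG can be illustrated in a few lines: the Newton step is obtained by solving the linear system H p = -g with conjugate gradients, which needs only Hessian-vector products, never the full Hessian. The sketch below, a simplified illustration rather than the paper's implementation, uses an analytic Hessian-vector product for a toy least-squares loss; in deep learning the product would come from Pearlmutter's R-operator via automatic differentiation. The fixed `lr` stands in for the value a learning-rate scheduler would supply at each step.

```python
import numpy as np

def loss_grad_hvp(w, A, b):
    # toy least-squares loss 0.5 * ||A w - b||^2
    r = A @ w - b
    grad = A.T @ r
    hvp = lambda v: A.T @ (A @ v)  # exact Hessian-vector product, no full Hessian
    return 0.5 * r @ r, grad, hvp

def cg(hvp, g, iters=50, tol=1e-10):
    # approximately solve H p = -g using only Hessian-vector products
    p = np.zeros_like(g)
    r = -g.copy()          # residual of H p = -g at p = 0
    d = r.copy()
    rs = r @ r
    for _ in range(iters):
        Hd = hvp(d)
        alpha = rs / (d @ Hd)
        p += alpha * d
        r -= alpha * Hd
        rs_new = r @ r
        if rs_new < tol:
            break
        d = r + (rs_new / rs) * d
        rs = rs_new
    return p

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)
w = np.zeros(5)
lr = 1.0  # a scheduler would shrink this as training approaches a minimum
for step in range(5):
    f, g, hvp = loss_grad_hvp(w, A, b)
    w += lr * cg(hvp, g)
```

On this quadratic toy problem a single full Newton-CG step already reaches the minimizer; for non-convex network losses the learning rate and CG iteration budget become the tuning knobs the abstract refers to.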