This thesis presents a second-order optimizer, Newton-CG, applied to a Portuguese-to-English Neural Machine Translation (NMT) task on the Transformer, the most widely used NMT architecture. We focus on comparing the performance of Newton-CG against two popular first-order optimizers, Adam and stochastic gradient descent (SGD). In our previous research, Newton-CG achieved both speed-ups and accuracy gains in image classification, and it reached higher accuracy than first-order optimizers in sentiment analysis on an attention model. In this NMT task, Newton-CG with pre-training outperforms the other optimizers in BLEU score and mitigates overfitting.
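For context, the core idea of a generic Newton-CG step (a textbook formulation, not necessarily the exact variant used in this thesis) is to approximately solve the Newton system H p = -g with the conjugate-gradient method, which needs only Hessian-vector products rather than the full Hessian. A minimal NumPy sketch, where `grad_fn` and `hvp_fn` are assumed callables supplied by the user:

```python
import numpy as np

def newton_cg_step(grad_fn, hvp_fn, w, cg_iters=10, tol=1e-10):
    """One Newton-CG update: approximately solve H p = -g by CG,
    using only Hessian-vector products hvp_fn(w, v) = H v."""
    g = grad_fn(w)
    p = np.zeros_like(w)   # CG iterate for the search direction
    r = -g.copy()          # residual of H p = -g at p = 0
    d = r.copy()           # CG search direction
    rs = r @ r
    for _ in range(cg_iters):
        Hd = hvp_fn(w, d)
        alpha = rs / (d @ Hd)
        p += alpha * d
        r -= alpha * Hd
        rs_new = r @ r
        if rs_new < tol:
            break
        d = r + (rs_new / rs) * d
        rs = rs_new
    return w + p           # full Newton step (no line search in this sketch)
```

On a convex quadratic the step lands on the exact minimizer once CG has converged, which illustrates why second-order methods can need far fewer iterations than SGD or Adam; in deep learning practice, damping or trust-region safeguards are added to handle non-convexity.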