In this work we aim to provide an overview of gradient-based temporal difference (TD) learning methods in reinforcement learning. We will consider three different cost functions: the mean squared Bellman error, the mean squared projected Bellman error, and the norm of the expected TD update. Finally, we will derive two new on-line gradient algorithms for TD learning that are based on the idea of bootstrapping.
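For reference, and assuming the standard notation of linear value-function approximation (feature vector $\phi$, parameter vector $\theta$, value estimate $V_\theta = \Phi\theta$, Bellman operator $T$, TD error $\delta$, state-weighting matrix $D$, and projection $\Pi$ onto the span of the features; this notation is introduced here only for orientation), these three objectives are commonly written as
\begin{align*}
\mathrm{MSBE}(\theta) &= \lVert V_\theta - T V_\theta \rVert_D^2, \\
\mathrm{MSPBE}(\theta) &= \lVert V_\theta - \Pi T V_\theta \rVert_D^2 = \mathbb{E}[\delta\phi]^{\top}\, \mathbb{E}[\phi\phi^{\top}]^{-1}\, \mathbb{E}[\delta\phi], \\
\mathrm{NEU}(\theta) &= \mathbb{E}[\delta\phi]^{\top}\, \mathbb{E}[\delta\phi].
\end{align*}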