Reinforcement Learning, specifically the field of Deep Reinforcement Learning, has gained increasing interest over the past ten years due to its ability to find optimal controls in stochastic environments that can achieve `super-human' performance. In this thesis, we analyse the application of model-free Q-Learning to derive expected utility-maximising investment and consumption strategies for an investor with a finite planning horizon under two different financial market models. First, we consider a discrete-time Cox-Ross-Rubinstein financial market model and formalise the portfolio optimisation problem as an infinite-horizon discounted Markov decision process. In this setting, we are able to provide a convergence guarantee of the Q-Learning algorithm to the optimal solution. Even though, in theory, the Cox-Ross-Rubinstein model converges to the Black-Scholes model, we find that the practicability of this approach is limited, as the algorithm suffers from the curse of dimensionality. Secondly, we turn our attention to the well-known finite-horizon Merton portfolio problem with an underlying continuous-time Black-Scholes financial market model and address the modifications needed to apply the Q-Learning algorithm. We prove characteristic properties of the so-called action-value function for an investor with a logarithmic utility function and explain how to exploit this known structure. We find that our modified Q-Learning algorithm is able to derive near-optimal investment strategies and that, if parameterised appropriately, faster convergence is observable, though not provable.