In this thesis, we investigate the applicability of reinforcement learning to dynamic investment strategies in continuous time. Firstly, we derive optimal trading and consumption strategies for a utility-maximizing investor in a Black-Scholes market; in particular, we consider logarithmic and power utility. To bridge the theoretical gap between continuous-time portfolio optimization and discrete-time RL, we also derive the discretized optimal trading and consumption strategies as well as the optimal Q-value functions. Secondly, we establish asymptotic and non-asymptotic convergence guarantees for an RL algorithm with continuous state and action spaces, under assumptions that generalize the assumption that the Q-value function is linear in the learnable parameters. Lastly, we empirically verify our results.