In this work, we propose a reinforcement learning approach to optimizing the upstream and downstream operations of primary biopharmaceutical production. The reinforcement learning agent must decide when to harvest the product, when to continue accumulating product
through fermentation, and when to exchange the resin in the chromatography column, under uncertain product formation and uncertain capacity decay of the chromatography consumables. We converted the Markov decision process (MDP) formulation of Mirko-Schoemig Beissner and Grunow into a reinforcement learning environment
and trained the agent using the Proximal Policy Optimization algorithm as implemented in Stable Baselines.