Producers of aged cheese serve multiple demand streams for products with different maturation ages. The distinct taste of these age-differentiated cheeses prevents product substitution, which poses unique challenges for production decisions. Producers must balance immediate revenue from selling younger cheeses against potentially higher future earnings from cheeses aged longer. Managers must also deal with uncertainty, as raw milk costs and final product sales prices follow correlated stochastic processes.
Our Markov Decision Process formulation uses Ornstein-Uhlenbeck processes to capture the price dynamics. The action space includes purchasing decisions, i.e., the amount of raw milk transformed into young cheese, and production decisions, i.e., the volumes of different products placed in the sales market. Further, as product labels define age ranges such as “matured for 3-6 months”, issuance decisions determine the allocation of stock volumes from specific age classes to individual products.
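To make the price model concrete, the following is a minimal sketch of two correlated Ornstein-Uhlenbeck processes, one for the raw milk cost and one for a sales price. It is an illustration under stated assumptions rather than the formulation used in this work: the Euler-Maruyama discretization, the function name `simulate_correlated_ou`, and all parameter values are hypothetical.

```python
import numpy as np


def simulate_correlated_ou(x0, mu, theta, sigma, rho, n_steps, dt=1.0, seed=0):
    """Simulate two correlated Ornstein-Uhlenbeck processes (Euler-Maruyama).

    Each component follows dX_i = theta_i * (mu_i - X_i) dt + sigma_i dW_i,
    where the Brownian increments dW_1 and dW_2 have correlation rho.
    Requires -1 < rho < 1 so that the Cholesky factor exists.
    """
    rng = np.random.default_rng(seed)
    x0, mu, theta, sigma = (
        np.asarray(a, dtype=float) for a in (x0, mu, theta, sigma)
    )
    # Cholesky factor of the 2x2 correlation matrix turns independent
    # standard normals into correlated ones.
    chol = np.linalg.cholesky(np.array([[1.0, rho], [rho, 1.0]]))

    path = np.empty((n_steps + 1, 2))
    path[0] = x0
    for t in range(n_steps):
        z = chol @ rng.standard_normal(2)
        drift = theta * (mu - path[t]) * dt      # pull toward long-run mean
        shock = sigma * np.sqrt(dt) * z          # correlated random shock
        path[t + 1] = path[t] + drift + shock
    return path


# Illustrative values only: column 0 = raw milk cost, column 1 = sales price.
prices = simulate_correlated_ou(
    x0=[0.35, 8.0], mu=[0.40, 9.0], theta=[0.2, 0.1],
    sigma=[0.02, 0.3], rho=0.6, n_steps=52,
)
```

A larger `theta` pulls each price back to its long-run mean `mu` faster, which is the mean reversion rate referred to in the numerical experiments.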
We propose a novel Deep Reinforcement Learning algorithm that combines Average Policy Optimization with a rolling horizon lookahead heuristic. In numerical experiments, we investigate the effect of price process parameters on near-optimal policies. Our findings suggest that the value of using age ranges on product labels increases with the mean reversion rate of these processes.