Price-based demand response (DR) enables households to provide the flexibility required in power grids with a high share of volatile renewable energy sources. Multi-agent reinforcement learning (MARL) offers a powerful, decentralized decision-making tool for autonomous agents participating in DR programs. Unfortunately, MARL algorithms do not naturally incorporate safety guarantees, which prevents their real-world deployment. To meet safety constraints, we propose a safety layer that minimally adjusts each agent’s actions. We further investigate the influence of a reward function that reflects these safety adjustments. Results show that accounting for the safety adjustments in the reward during training improves both the convergence speed and the performance of the MARL agents in our numerical experiments.
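To make the two ingredients concrete, the following is a minimal illustrative sketch in Python, not the paper’s exact formulation: a projection-style safety layer that minimally adjusts an action onto a feasible set (here assumed, for simplicity, to be box constraints such as device power limits), and a reward shaped by the magnitude of that adjustment. The names `safety_layer` and `shaped_reward`, the box-constraint setting, and the penalty form are assumptions made for illustration only.

```python
import numpy as np

def safety_layer(action, lower, upper):
    """Minimally adjust an agent's action to satisfy box constraints.

    For box constraints, the minimal Euclidean adjustment is the
    projection onto [lower, upper], i.e., element-wise clipping.
    Returns the safe action and the size of the adjustment.
    """
    safe_action = np.clip(action, lower, upper)
    adjustment = np.linalg.norm(safe_action - action)
    return safe_action, adjustment

def shaped_reward(base_reward, adjustment, penalty_weight=1.0):
    """Reward that reflects the safety adjustment: the larger the
    correction the safety layer had to make, the lower the reward."""
    return base_reward - penalty_weight * adjustment

# Example: an agent proposes to draw 5.0 kW, but the device limit is 3.5 kW.
proposed = np.array([5.0])
safe, delta = safety_layer(proposed, lower=0.0, upper=3.5)
print(safe, shaped_reward(base_reward=-2.0, adjustment=delta))
```

Under this sketch, the environment always executes the safe action, so constraints hold by construction; feeding the adjustment penalty back into the reward is what lets agents learn to propose actions that need little or no correction in the first place.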