Training reinforcement learning (RL) agents for motion planning in heavily constrained solution spaces may require extensive exploration, leading to long training times. In automated driving, RL agents have to learn multiple skills at once, such as collision avoidance, traffic rule adherence, and goal reaching. In this work, we decompose this complicated learning task by applying curriculum learning for the first time to an RL agent based on graph neural networks. The curriculum's sequence of sub-tasks gradually increases the difficulty of the longitudinal and lateral motion planning problem for the agent. Each of our sub-tasks contains a set of rewards, including novel rewards for temporal-logic-based traffic rules on speed, safety distance, and braking. Unlike prior work, the agent's state is extended with map and traffic rule information. Its performance is evaluated on prerecorded, real-world traffic data instead of simulations. Our numerical results show that the multi-stage curricula let the agent learn goal-seeking highway driving faster than baseline agents trained from scratch. Including traffic rule information in both the RL state and the rewards stabilizes training and improves the agent's final goal-reaching performance.