Outstanding Paper Award on Scientific Understanding in RL
We are honoured to announce that members of the ELLIS Delft Unit have won the Outstanding Paper Award on Scientific Understanding in RL at the first Reinforcement Learning Conference (RLC). The award recognises specific aspects of the paper's contribution.
Bad Habits: Policy Confounding and Out-of-Trajectory Generalization in RL.
By Miguel Suau, Matthijs T. J. Spaan, and Frans A. Oliehoek
This paper furthers our scientific understanding of why RL agents struggle to generalize to new scenarios at test time, paving the way to developing more robust RL algorithms. The paper characterizes the phenomenon of policy confounding through the lens of causality, whereby when following specific trajectories, RL agents can learn behaviours based on spurious correlations (between observations and rewards) because the policy is confounded with the data. The paper shows that on-policy algorithms can learn representations that are sufficient for the trajectory induced by the optimal policy but do not necessarily generalize well to new states, making agents non-robust to changes in the environment’s dynamics.