Abstract
We study the design of learning architectures for behavioural planning in a dense traffic setting. Such architectures should handle a varying number of nearby vehicles, be invariant to the ordering chosen to describe them, and remain accurate and compact. We observe that the two most popular representations in the literature do not meet these criteria and perform poorly on a complex negotiation task. We propose an attention-based architecture that satisfies all these properties and explicitly accounts for the existing interactions between the traffic participants. We show that this architecture leads to significant performance gains, and that it captures interaction patterns that can be visualized and qualitatively interpreted.
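For intuition, the core of the proposed architecture can be sketched as a single ego-attention layer, in which the ego-vehicle emits a query that is matched against keys and values computed from every nearby vehicle. The snippet below is a minimal PyTorch sketch with illustrative names and sizes; the actual model used in the experiments is the one implemented in rl-agents (the ego_attention_2h configuration uses two heads).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EgoAttention(nn.Module):
    """Minimal single-head ego-attention sketch.

    The output is a weighted sum over vehicle encodings, so it accepts a
    varying number of vehicles and is invariant to their ordering.
    """
    def __init__(self, feature_size=64):
        super().__init__()
        self.query = nn.Linear(feature_size, feature_size, bias=False)
        self.key = nn.Linear(feature_size, feature_size, bias=False)
        self.value = nn.Linear(feature_size, feature_size, bias=False)

    def forward(self, ego, others, mask=None):
        # ego: (batch, d) encoding of the ego-vehicle
        # others: (batch, n_vehicles, d) encodings of nearby vehicles
        # mask: (batch, n_vehicles) booleans marking which slots hold real vehicles
        q = self.query(ego).unsqueeze(1)                      # (batch, 1, d)
        k, v = self.key(others), self.value(others)           # (batch, n, d)
        scores = q @ k.transpose(1, 2) / k.shape[-1] ** 0.5   # (batch, 1, n)
        if mask is not None:
            scores = scores.masked_fill(~mask.unsqueeze(1), float("-inf"))
        attention = F.softmax(scores, dim=-1)                 # one weight per vehicle
        return (attention @ v).squeeze(1), attention
```

The returned attention weights are the interaction patterns mentioned above and visualized in the videos below.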
Videos
- Full episode
- Attention and distance to vehicles
- Sensitivity to vehicle states
- Effect of road priorities
- Supplementary videos
Reproduce the experiments
- Install the highway-env environment (a quick usage check is sketched after the steps below)
pip install --user git+https://github.com/eleurent/highway-env
- Install the rl-agents implementations (a programmatic training sketch also follows the steps below)
pip install --user git+https://github.com/eleurent/rl-agents
- Train the agents (repeat for several seeds; sketches of the baseline input representations follow the steps below)
- MLP/List
python experiments.py evaluate configs/IntersectionEnv/env.json \
configs/IntersectionEnv/agents/DQNAgent/baseline.json \
--train --episodes=4000 --name-from-config
- CNN/Grid
python experiments.py evaluate configs/IntersectionEnv/env_grid_dense.json \
configs/IntersectionEnv/agents/DQNAgent/grid_convnet.json \
--train --episodes=4000 --name-from-config
- Ego-Attention
python experiments.py evaluate configs/IntersectionEnv/env.json \
configs/IntersectionEnv/agents/DQNAgent/ego_attention_2h.json \
--train --episodes=4000 --name-from-config
- Visualize the results
python analyze.py run out/IntersectionEnv/DQNAgent/
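As mentioned in the install step, a quick way to check that highway-env is available is to run a random policy in the intersection scene. This assumes the classic gym API used around the time of release; recent highway-env versions are built on gymnasium, where reset() returns (obs, info) and step() returns five values.

```python
import gym
import highway_env  # noqa: F401 -- registers intersection-v0 and the other scenes

env = gym.make("intersection-v0")
obs = env.reset()
done = False
while not done:
    # Act randomly just to verify that the environment steps correctly.
    obs, reward, done, info = env.step(env.action_space.sample())
env.close()
```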
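The training commands above go through rl-agents' experiments.py script; programmatically, an equivalent run looks roughly like the sketch below. The module paths and the Evaluation constructor follow the rl-agents quick-start and should be treated as assumptions that may differ between versions.

```python
# Rough equivalent of `experiments.py evaluate <env.json> <agent.json> --train`;
# module paths and signatures are assumed from the rl-agents quick-start and
# may differ between versions.
from rl_agents.agents.common.factory import load_agent, load_environment
from rl_agents.trainer.evaluation import Evaluation

env = load_environment("configs/IntersectionEnv/env.json")
agent = load_agent("configs/IntersectionEnv/agents/DQNAgent/ego_attention_2h.json", env)
evaluation = Evaluation(env, agent, num_episodes=4000, display_env=False)
evaluation.train()
```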
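Finally, to make the comparison between the trained agents concrete, here is a rough sketch of how the two baseline inputs are typically built from a scene: the MLP/List baseline flattens a zero-padded list of per-vehicle features, so its output depends on the ordering of the vehicles and on a fixed size budget, while the CNN/Grid baseline rasterises the same features onto a coarse ego-centred occupancy grid. Feature choices and sizes are illustrative assumptions, not the exact configuration of the experiments.

```python
import numpy as np

def list_input(vehicles, max_vehicles=15, n_features=7):
    """MLP/List baseline: zero-padded, flattened list of per-vehicle features.

    Drawback: the result depends on the (arbitrary) ordering of `vehicles`
    and on the fixed max_vehicles budget.
    """
    x = np.zeros((max_vehicles, n_features), dtype=np.float32)
    for i, features in enumerate(vehicles[:max_vehicles]):
        x[i] = features
    return x.flatten()


def grid_input(vehicles, size=32, cell=1.0, n_features=7):
    """CNN/Grid baseline: rasterise vehicle features onto an ego-centred grid.

    Drawback: accuracy is limited by the cell resolution, and most cells are
    empty in any given scene.
    """
    grid = np.zeros((n_features, size, size), dtype=np.float32)
    for features in vehicles:
        x, y = features[0], features[1]  # position relative to the ego-vehicle
        i, j = int(size / 2 + x / cell), int(size / 2 + y / cell)
        if 0 <= i < size and 0 <= j < size:
            grid[:, i, j] = features
    return grid
```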