Rainbow: Combining Improvements in Deep Reinforcement Learning

Matteo Hessel, Joseph Modayil, Hado van Hasselt, Tom Schaul, Georg Ostrovski, Will Dabney, Dan Horgan, Bilal Piot, Mohammad Azar, David Silver
2017

Abstract

The deep reinforcement learning community has made several independent improvements to the DQN algorithm. However, it is unclear which of these extensions are complementary and can be fruitfully combined. This paper examines six extensions to the DQN algorithm and empirically studies their combination. Our experiments show that the combination provides state-of-the-art performance on the Atari 2600 benchmark, both in terms of data efficiency and final performance. We also provide results from a detailed ablation study that shows the contribution of each component to overall performance.


Code References

ray-project/ray

doc/source/rllib/rllib-algorithms.rst
All of the DQN improvements evaluated in `Rainbow <https://arxiv.org/abs/1710.02298>`__ are available, though not all are enabled by default.
For a complete `rainbow <https://arxiv.org/pdf/1710.02298.pdf>`__ setup,
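
The excerpt above is cut off, but its point is that Rainbow in RLlib is assembled by overriding the default DQN settings rather than by using a separate algorithm. As a rough sketch (not an authoritative configuration), such an override might look like the following; the key names follow the older dict-style RLlib DQN config and may differ between RLlib versions:

```python
# Hypothetical Rainbow-style overrides for RLlib's DQN. Key names are assumed
# from the older dict-style config and should be checked against your RLlib version.
rainbow_config = {
    "n_step": 3,         # multi-step returns
    "noisy": True,       # NoisyNet exploration instead of epsilon-greedy
    "num_atoms": 51,     # distributional (C51) value estimates
    "v_min": -10.0,      # support range of the return distribution
    "v_max": 10.0,
    "double_q": True,    # double Q-learning targets
    "dueling": True,     # dueling network architecture
}
```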
rllib/algorithms/dqn/README.md
[Rainbow](https://arxiv.org/pdf/1710.02298.pdf) - Rainbow DQN, as the name suggests, combines many of the improvements to DQN discovered in research. These include a multi-step distributional loss (extended from Distributional DQN), prioritized replay (inspired by APEX-DQN), double Q-networks (inspired by Double DQN), and dueling networks (inspired by Dueling DQN).
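
To make the combination described above concrete, here is a minimal sketch, in plain NumPy, of how a multi-step distributional Bellman target with double-Q action selection can be computed. This is an illustration under stated assumptions, not the paper's or RLlib's actual implementation, and all function and argument names are hypothetical:

```python
import numpy as np

def rainbow_target_distribution(rewards, dones, p_online_next, p_target_next,
                                atoms, gamma=0.99, n_step=3):
    """Sketch of an n-step distributional (C51-style) target with double-Q selection.

    rewards:       (B,) discounted n-step return r_t + ... + gamma^{n-1} r_{t+n-1}
    dones:         (B,) 1.0 if the episode terminated inside the n-step window
    p_online_next: (B, A, N) online-network atom probabilities at s_{t+n}
    p_target_next: (B, A, N) target-network atom probabilities at s_{t+n}
    atoms:         (N,) fixed support z_1..z_N, evenly spaced in [v_min, v_max]
    """
    batch = np.arange(rewards.shape[0])
    num_atoms = atoms.shape[0]
    v_min, v_max = atoms[0], atoms[-1]
    delta_z = (v_max - v_min) / (num_atoms - 1)

    # Double Q-learning: pick the greedy action with the online network,
    # but evaluate its return distribution with the target network.
    q_online = (p_online_next * atoms).sum(axis=-1)       # (B, A) expected values
    best_a = q_online.argmax(axis=-1)                     # (B,)
    next_dist = p_target_next[batch, best_a]              # (B, N)

    # n-step Bellman backup of every atom, clipped to the fixed support.
    tz = rewards[:, None] + (1.0 - dones[:, None]) * (gamma ** n_step) * atoms[None, :]
    tz = np.clip(tz, v_min, v_max)
    b = (tz - v_min) / delta_z                            # fractional atom index
    lo, hi = np.floor(b).astype(int), np.ceil(b).astype(int)

    # Distribute each backed-up atom's probability mass onto its two neighbours.
    target = np.zeros_like(next_dist)
    for j in range(num_atoms):
        same = (lo[:, j] == hi[:, j])                     # b landed exactly on an atom
        target[batch, lo[:, j]] += next_dist[:, j] * np.where(same, 1.0, hi[:, j] - b[:, j])
        target[batch, hi[:, j]] += next_dist[:, j] * np.where(same, 0.0, b[:, j] - lo[:, j])

    # The loss would then be the cross-entropy between `target` and the online
    # network's distribution for the action taken; per-sample losses can also
    # serve as priorities for prioritized replay.
    return target
```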