Bayesian reinforcement learning with exploration

Authors

Lattimore, Tor
Hutter, Marcus

Publisher

Springer Verlag

Abstract

We consider a general reinforcement learning problem and show that carefully combining the Bayesian optimal policy and an exploring policy leads to minimax sample-complexity bounds in a very general class of (history-based) environments. We also prove lower bounds and show that the new algorithm displays adaptive behaviour when the environment is easier than worst-case.

Book Title

Algorithmic Learning Theory: 25th International Conference, ALT 2014, Bled, Slovenia, October 8-10, 2014. Proceedings

Access Statement

Open Access
