Bayesian reinforcement learning with exploration

Date

2014

Authors

Lattimore, Tor
Hutter, Marcus

Publisher

Springer

Abstract

We consider a general reinforcement learning problem and show that carefully combining the Bayesian optimal policy and an exploring policy leads to minimax sample-complexity bounds in a very general class of (history-based) environments. We also prove lower bounds and show that the new algorithm displays adaptive behaviour when the environment is easier than worst-case.
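The idea of combining a Bayes-optimal policy with an exploring policy can be illustrated with a toy sketch. This is an illustrative information-gain heuristic under invented assumptions, not the paper's algorithm: the two candidate Bernoulli environments in `MODELS`, the threshold `EPSILON`, and the two-action setup are all made up for the example. The agent acts Bayes-optimally unless some action promises enough expected reduction in posterior uncertainty, in which case it explores.

```python
import math
import random

random.seed(0)

# Hypothetical setup: two candidate environments; the agent keeps a
# posterior weight p on model A. Action 0 is informative (success
# probabilities differ sharply across models); action 1 reveals little.
MODELS = [
    {0: 0.9, 1: 0.6},   # model A: reward probability per action
    {0: 0.1, 1: 0.5},   # model B
]
TRUE_MODEL = 0
EPSILON = 0.05          # invented threshold on expected information gain

def entropy(p):
    """Binary entropy of the posterior weight p on model A."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log(p) - (1 - p) * math.log(1 - p)

def expected_info_gain(p, action):
    """Expected drop in posterior entropy after observing one outcome."""
    h = entropy(p)
    gain = 0.0
    for outcome in (0, 1):
        probs = [m[action] if outcome else 1 - m[action] for m in MODELS]
        marg = p * probs[0] + (1 - p) * probs[1]  # marginal outcome prob
        if marg == 0:
            continue
        post = p * probs[0] / marg                # Bayes-updated weight
        gain += marg * (h - entropy(post))
    return gain

def bayes_optimal_action(p):
    """Action maximising expected immediate reward under the posterior."""
    return max((0, 1), key=lambda a: p * MODELS[0][a] + (1 - p) * MODELS[1][a])

p = 0.5  # uniform prior over the two models
for t in range(50):
    # Explore when some action promises enough information, else exploit.
    gains = {a: expected_info_gain(p, a) for a in (0, 1)}
    if max(gains.values()) > EPSILON:
        a = max(gains, key=gains.get)
    else:
        a = bayes_optimal_action(p)
    r = 1 if random.random() < MODELS[TRUE_MODEL][a] else 0
    # Posterior update from the observed reward.
    la = MODELS[0][a] if r else 1 - MODELS[0][a]
    lb = MODELS[1][a] if r else 1 - MODELS[1][a]
    p = p * la / (p * la + (1 - p) * lb)

print(round(p, 3))  # posterior weight on the true model after 50 steps
```

In this sketch the posterior concentrates on the true environment, after which the information-gain test fails and the agent simply acts Bayes-optimally; the paper's contribution is proving that a careful version of this interleaving achieves minimax sample-complexity bounds in general history-based environments.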

Source

Lecture Notes in Computer Science, Volume 8776

Type

Conference paper

Access Statement

Open Access
