Bayesian reinforcement learning with exploration
Date
2014
Authors
Lattimore, Tor
Hutter, Marcus
Publisher
Springer
Abstract
We consider a general reinforcement learning problem and show that carefully combining the Bayesian optimal policy and an exploring policy leads to minimax sample-complexity bounds in a very general class of (history-based) environments. We also prove lower bounds and show that the new algorithm displays adaptive behaviour when the environment is easier than worst-case.
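The combination described in the abstract — following a Bayes-optimal policy except when targeted exploration is needed — can be illustrated with a toy sketch. The code below is an illustrative assumption for a simple Bernoulli bandit, not the paper's actual history-based algorithm or its sample-complexity machinery: the agent acts greedily with respect to its posterior mean, but explicitly explores any arm whose posterior is still too uncertain (the class name and threshold are hypothetical).

```python
import random


class BayesExploreAgent:
    """Toy sketch: mix a Bayes-greedy policy with explicit exploration.

    Illustrative only -- the names and the uncertainty threshold are
    assumptions, not the algorithm analysed in the paper.
    """

    def __init__(self, n_arms, explore_threshold=0.1):
        # Beta(1, 1) prior over each arm's Bernoulli success probability.
        self.alpha = [1.0] * n_arms  # posterior success counts
        self.beta = [1.0] * n_arms   # posterior failure counts
        self.explore_threshold = explore_threshold

    def posterior_mean(self, arm):
        return self.alpha[arm] / (self.alpha[arm] + self.beta[arm])

    def posterior_std(self, arm):
        n = self.alpha[arm] + self.beta[arm]
        m = self.posterior_mean(arm)
        return (m * (1.0 - m) / (n + 1.0)) ** 0.5

    def act(self):
        # Explore any arm whose posterior is still too uncertain;
        # otherwise act greedily w.r.t. the posterior mean.
        uncertain = [a for a in range(len(self.alpha))
                     if self.posterior_std(a) > self.explore_threshold]
        if uncertain:
            return random.choice(uncertain)
        return max(range(len(self.alpha)), key=self.posterior_mean)

    def update(self, arm, reward):
        # Conjugate Beta-Bernoulli posterior update.
        if reward:
            self.alpha[arm] += 1.0
        else:
            self.beta[arm] += 1.0
```

Once every arm's posterior is concentrated enough, the agent stops exploring and commits to the Bayes-greedy choice; this adaptivity mirrors, very loosely, the adaptive behaviour the abstract attributes to the algorithm on easier-than-worst-case environments.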
Source
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Volume 8776
Type
Conference paper
Access Statement
Open Access