Bayesian reinforcement learning with exploration

Lattimore, Tor; Hutter, Marcus

Date

2014

Authors

Lattimore, Tor

Hutter, Marcus

Publisher

Springer

Abstract

We consider a general reinforcement learning problem and show that carefully combining the Bayesian optimal policy and an exploring policy leads to minimax sample-complexity bounds in a very general class of (history-based) environments. We also prove lower bounds and show that the new algorithm displays adaptive behaviour when the environment is easier than worst-case.

URI

http://hdl.handle.net/1885/58180

Collections

ANU Research Publications

Source

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) Volume 8776

Type

Conference paper

Access Statement

Open Access

DOI

10.1007/978-3-319-11662-4_13

Downloads

File

01_Lattimore_Bayesian_reinforcement_2014.pdf (317.25 KB)
