A Monte-Carlo AIXI Approximation
Date
2011-01
Authors
Veness, Joel
Ng, Kee Siong
Hutter, Marcus
Uther, William
Silver, David
Publisher
AAAI Press
Abstract
This paper introduces a principled approach for the design of a
scalable general reinforcement learning agent. Our approach is
based on a direct approximation of AIXI, a Bayesian optimality
notion for general reinforcement learning agents. Previously,
it was unclear whether the theory of AIXI could motivate
the design of practical algorithms. We answer this hitherto
open question in the affirmative by providing the first
computationally feasible approximation to the AIXI agent. To
develop our approximation, we introduce a new Monte-Carlo Tree
Search algorithm along with an agent-specific extension to the
Context Tree Weighting algorithm. Empirically, we present a set
of encouraging results on a variety of stochastic and partially
observable domains. We conclude by proposing a number of
directions for future research.
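The abstract's new Monte-Carlo Tree Search algorithm (ρUCT) selects actions by balancing estimated reward against exploration, in the style of the UCB rule used by UCT. As a rough, illustrative sketch only (the function and variable names below are not from the paper, and the paper applies this rule to rollouts through a learned Context Tree Weighting model rather than to fixed statistics):

```python
import math

def ucb_value(mean_reward, visits, parent_visits, c=math.sqrt(2)):
    """UCB1-style score: exploit the empirical mean reward, but boost
    rarely tried actions so every branch keeps getting explored."""
    if visits == 0:
        return float("inf")  # untried actions are always tried first
    return mean_reward + c * math.sqrt(math.log(parent_visits) / visits)

def select_action(stats):
    """Pick the action maximizing the UCB score.
    stats maps action -> (mean_reward, visit_count)."""
    parent_visits = sum(visits for _, visits in stats.values())
    return max(stats, key=lambda a: ucb_value(*stats[a], parent_visits))
```

For example, an action with zero visits is selected ahead of a well-explored one, which is how the search tree keeps widening during planning.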
Keywords
Reinforcement Learning (RL), Context Tree Weighting (CTW), Monte Carlo Tree Search (MCTS), Upper Confidence bounds applied to Trees (UCT), Partially Observable Markov Decision Process (POMDP), Prediction Suffix Trees (PST)
Source
Journal of Artificial Intelligence Research
Type
Journal article
Access Statement
Open Access