Context tree maximizing reinforcement learning
Loading...
Date
Authors
Nguyen, Phuong
Sunehag, Peter
Hutter, Marcus
Journal Title
Journal ISSN
Volume Title
Publisher
Association for the Advancement of Artificial Intelligence
Abstract
Recent developments in reinforcement learning for nonMarkovian
problems witness a surge in history-based methods,
among which we are particularly interested in two frameworks,
ΦMDP and MC-AIXI-CTW. ΦMDP attempts to reduce
the general RL problem, where the environment’s states
and dynamics are both unknown, to an MDP, while MCAIXI-CTW
incrementally learns a mixture of context trees
as its environment model. The main idea of ΦMDP is to connect
generic reinforcement learning with classical reinforcement
learning. The first implementation of ΦMDP relies on a
stochastic search procedure for finding a tree that minimizes a
certain cost function. This does not guarantee finding the minimizing
tree, or even a good one, given limited search time.
As a consequence it appears that the approach has difficulties
with large domains. MC-AIXI-CTW is attractive in that it can
incrementally and analytically compute the internal model
through interactions with the environment. Unfortunately, it
is computationally demanding due to requiring heavy planning
simulations at every single time step. We devise a novel
approach called CTMRL, which analytically and efficiently
finds the cost-minimizing tree. Instead of the context-tree
weighting method that MC-AIXI-CTW is based on, we use
the closely related context-tree maximizing algorithm that selects
just one single tree. This approach falls under the ΦMDP
framework, which allows the replacement of the costly planning
component of MC-AIXI-CTW with simple Q-Learning.
Our empirical investigation shows that CTMRL finds policies
of quality as good as MC-AIXI-CTW’s on six domains
including a challenging Pacman domain, but in an order of
magnitude less time.
Description
Citation
Collections
Source
Type
Book Title
Proceedings of the 26th AAAI Conference on Artificial Intelligence
Entity type
Access Statement
License Rights
DOI
Restricted until
Downloads
File
Description