
Reinforcement learning via AIXI Approximation

Veness, Joel; Ng, Kee Siong; Hutter, Marcus; Silver, David

Description

This paper introduces a principled approach for the design of a scalable general reinforcement learning agent. This approach is based on a direct approximation of AIXI, a Bayesian optimality notion for general reinforcement learning agents. Previously, it has been unclear whether the theory of AIXI could motivate the design of practical algorithms. We answer this hitherto open question in the affirmative, by providing the first computationally feasible approximation to the AIXI agent. To develop our approximation, we introduce a Monte Carlo Tree Search algorithm along with an agent-specific extension of the Context Tree Weighting algorithm. Empirically, we present a set of encouraging results on a number of stochastic, unknown, and partially observable domains.
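For readers wanting a concrete picture of the two components named in the abstract, here is a brief sketch; neither piece is taken from the paper itself. Standard binary Context Tree Weighting mixes, at each context-tree node s, a local Krichevsky-Trofimov (KT) estimate with the product of the weighted probabilities of its children (the paper develops an agent-specific extension of this recursion):

\[
P_w^s =
\begin{cases}
P_{KT}(a_s, b_s) & \text{if } s \text{ is a leaf,}\\[4pt]
\tfrac{1}{2}\, P_{KT}(a_s, b_s) + \tfrac{1}{2}\, P_w^{0s}\, P_w^{1s} & \text{otherwise,}
\end{cases}
\]

where a_s and b_s count the zeros and ones observed in context s.

The planning side pairs such a model with Monte Carlo Tree Search. The Python sketch below shows a generic UCT-style planner driven by a generative model; ToyModel and all other names and constants are illustrative assumptions, not the authors' implementation, which couples the search with the learned CTW model and also branches on sampled observations.

import math
import random

class ToyModel:
    """Stand-in for a learned environment model: sampling an action
    yields a stochastic reward; action 1 is better in expectation."""
    def sample(self, action):
        p = 0.7 if action == 1 else 0.3
        return 1.0 if random.random() < p else 0.0

class Node:
    def __init__(self):
        self.visits = 0
        self.value = 0.0    # running mean of sampled discounted returns
        self.children = {}  # action -> Node

def uct_action(node, actions, c=1.4):
    """UCB1 action selection; untried actions are explored first."""
    def score(a):
        child = node.children.get(a)
        if child is None or child.visits == 0:
            return float("inf")
        return child.value + c * math.sqrt(math.log(node.visits) / child.visits)
    return max(actions, key=score)

def simulate(node, model, actions, depth, gamma=0.99):
    """One simulation: descend by UCB1, sample rewards from the model,
    then back up the discounted return along the visited path."""
    if depth == 0:
        return 0.0
    node.visits += 1
    a = uct_action(node, actions)
    child = node.children.setdefault(a, Node())
    ret = model.sample(a) + gamma * simulate(child, model, actions, depth - 1, gamma)
    child.visits += 1
    child.value += (ret - child.value) / child.visits
    return ret

if __name__ == "__main__":
    root, model, actions = Node(), ToyModel(), [0, 1]
    for _ in range(2000):
        simulate(root, model, actions, depth=4)
    best = max(actions, key=lambda a: root.children[a].value)
    print("chosen action:", best)  # usually 1, the higher-reward action

The UCB1 exploration term trades off sampling under-visited actions against exploiting the current best value estimate; as simulations accumulate, visit counts concentrate on the action with the higher expected discounted return.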

dc.contributor.author: Veness, Joel
dc.contributor.author: Ng, Kee Siong
dc.contributor.author: Hutter, Marcus
dc.contributor.author: Silver, David
dc.date.accessioned: 2015-08-24T05:26:26Z
dc.date.available: 2015-08-24T05:26:26Z
dc.identifier.uri: http://hdl.handle.net/1885/14903
dc.description.abstract: This paper introduces a principled approach for the design of a scalable general reinforcement learning agent. This approach is based on a direct approximation of AIXI, a Bayesian optimality notion for general reinforcement learning agents. Previously, it has been unclear whether the theory of AIXI could motivate the design of practical algorithms. We answer this hitherto open question in the affirmative, by providing the first computationally feasible approximation to the AIXI agent. To develop our approximation, we introduce a Monte Carlo Tree Search algorithm along with an agent-specific extension of the Context Tree Weighting algorithm. Empirically, we present a set of encouraging results on a number of stochastic, unknown, and partially observable domains.
dc.publisher: AAAI Press
dc.relation.ispartof: Proceedings of the 24th AAAI Conference on Artificial Intelligence
dc.rights: Copyright © 2010, Association for the Advancement of Artificial Intelligence (www.aaai.org). Authors can archive papers http://www.aaai.org/ocs/index.php/AAAI/index/about/submissions#copyrightNotice as at 24/08/15
dc.subject: Reinforcement Learning (RL)
dc.subject: Context Tree Weighting (CTW)
dc.subject: Monte Carlo Tree Search (MCTS)
dc.subject: Upper Confidence bounds applied to Trees (UCT)
dc.subject: Partially Observable Markov Decision Process (POMDP)
dc.subject: Prediction Suffix Trees (PST)
dc.title: Reinforcement learning via AIXI Approximation
dc.type: Conference paper
dc.date.issued: 2010-07
local.type.status: Accepted Version
local.contributor.affiliation: Ng, K. S., Research School of Computer Science, The Australian National University
local.contributor.affiliation: Hutter, M., Research School of Computer Science, The Australian National University
dc.relation: http://purl.org/au-research/grants/arc/DP0988049
local.bibliographicCitation.startpage: 605
local.bibliographicCitation.lastpage: 611
Collections: ANU Research Publications

Download

File: Veness et al Reinforcement Learning via AIXI Approximation 2010.pdf
Size: 154.38 kB
Format: Adobe PDF


