A Monte-Carlo AIXI Approximation

Veness, Joel; Ng, Kee Siong; Hutter, Marcus; Uther, William; Silver, David

dc.contributor.author	Veness, Joel
dc.contributor.author	Ng, Kee Siong
dc.contributor.author	Hutter, Marcus
dc.contributor.author	Uther, William
dc.contributor.author	Silver, David
dc.date.accessioned	2015-08-24T06:29:36Z
dc.date.available	2015-08-24T06:29:36Z
dc.date.issued	2011-01
dc.description.abstract	This paper introduces a principled approach for the design of a scalable general reinforcement learning agent. Our approach is based on a direct approximation of AIXI, a Bayesian optimality notion for general reinforcement learning agents. Previously, it has been unclear whether the theory of AIXI could motivate the design of practical algorithms. We answer this hitherto open question in the affirmative, by providing the first computationally feasible approximation to the AIXI agent. To develop our approximation, we introduce a new Monte-Carlo Tree Search algorithm along with an agent-specific extension to the Context Tree Weighting algorithm. Empirically, we present a set of encouraging results on a variety of stochastic and partially observable domains. We conclude by proposing a number of directions for future research.	en_AU
dc.identifier.issn	1076-9757	en_AU
dc.identifier.uri	http://hdl.handle.net/1885/14906
dc.provenance	https://v2.sherpa.ac.uk/id/publication/32771/..."published version can be archived in institutional repository" from SHERPA/RoMEO site as at 14/05/2024
dc.publisher	AAAI Press	en_AU
dc.relation	http://purl.org/au-research/grants/arc/DP0988049	en_AU
dc.source	Journal of Artificial Intelligence Research	en_AU
dc.subject	Reinforcement Learning (RL)	en_AU
dc.subject	Context Tree Weighting (CTW)	en_AU
dc.subject	Monte Carlo Tree Search (MCTS)	en_AU
dc.subject	Upper Confidence bounds applied to Trees (UCT)	en_AU
dc.subject	Partially Observable Markov Decision Process (POMDP)	en_AU
dc.subject	Prediction Suffix Trees (PST)	en_AU
dc.title	A Monte-Carlo AIXI Approximation	en_AU
dc.type	Journal article	en_AU
dcterms.accessRights	Open Access
local.bibliographicCitation.lastpage	142	en_AU
local.bibliographicCitation.startpage	95	en_AU
local.contributor.affiliation	Ng, K. S., Research School of Computer Science, The Australian National University	en_AU
local.contributor.affiliation	Hutter, M., Research School of Computer Science, The Australian National University	en_AU
local.contributor.authoremail	keesiong.ng@gmail.com	en_AU
local.contributor.authoremail	marcus.hutter@anu.edu.au	en_AU
local.contributor.authoruid	u4350841	en_AU
local.identifier.citationvolume	40	en_AU
local.identifier.doi	10.1613/jair.3125	en_AU
local.identifier.uidSubmittedBy	u1005913	en_AU
local.publisher.url	http://jair.org/	en_AU
local.type.status	Published Version	en_AU

Collections

ANU Research Publications
