Constructing States for Reinforcement Learning

Mahmud, Hassan

Description

POMDPs are the models of choice for reinforcement learning (RL) tasks where the environment cannot be observed directly. In many applications we need to learn the POMDP structure and parameters from experience and this is considered to be a difficult problem. In this paper we address this issue by modeling the hidden environment with a novel class of models that are less expressive, but easier to learn and plan with than POMDPs. We call these models deterministic Markov models (DMMs), which are...

dc.contributor.author: Mahmud, Hassan
dc.coverage.spatial: Haifa Israel
dc.date.accessioned: 2015-12-08T22:30:18Z
dc.date.created: June 21 2010
dc.identifier.isbn: 9781605589077
dc.identifier.uri: http://hdl.handle.net/1885/34380
dc.description.abstract: POMDPs are the models of choice for reinforcement learning (RL) tasks where the environment cannot be observed directly. In many applications we need to learn the POMDP structure and parameters from experience and this is considered to be a difficult problem. In this paper we address this issue by modeling the hidden environment with a novel class of models that are less expressive, but easier to learn and plan with than POMDPs. We call these models deterministic Markov models (DMMs), which are deterministic-probabilistic finite automata from learning theory, extended with actions to the sequential (rather than i.i.d.) setting. Conceptually, we extend the Utile Suffix Memory method of McCallum to handle long term memory. We describe DMMs, give Bayesian algorithms for learning and planning with them and also present experimental results for some standard POMDP tasks and tasks to illustrate its efficacy.
dc.publisher: OmniPress
dc.relation.ispartofseries: International Conference on Machine Learning (ICML 2010)
dc.source: Proceedings of International Conference on Machine Learning (ICML 2010)
dc.subject: Keywords: Bayesian algorithms; Learning Theory; Long term memory; Markov model; Probabilistic finite automata; Markov processes; Reinforcement learning; Automata theory
dc.title: Constructing States for Reinforcement Learning
dc.type: Conference paper
local.description.notes: Imported from ARIES
local.description.refereed: Yes
dc.date.issued: 2010
local.identifier.absfor: 089999 - Information and Computing Sciences not elsewhere classified
local.identifier.ariespublication: u4963866xPUB112
local.type.status: Published Version
local.contributor.affiliation: Mahmud, Hassan, College of Engineering and Computer Science, ANU
local.description.embargo: 2037-12-31
local.bibliographicCitation.startpage: 8
local.identifier.absseo: 970108 - Expanding Knowledge in the Information and Computing Sciences
dc.date.updated: 2016-02-24T11:29:47Z
local.identifier.scopusID: 2-s2.0-77956529192
Collections: ANU Research Publications

Download

File: 01_Mahmud_Constructing_States_for_2010.pdf
Size: 385.68 kB
Format: Adobe PDF


Items in Open Research are protected by copyright, with all rights reserved, unless otherwise indicated.
