Avoiding wireheading with value reinforcement learning

dc.contributor.author: Everitt, Tom
dc.contributor.author: Hutter, Marcus
dc.date.accessioned: 2016-12-21T01:37:16Z
dc.date.available: 2016-12-21T01:37:16Z
dc.date.issued: 2016-06-25
dc.description.abstract: How can we design good goals for arbitrarily intelligent agents? Reinforcement learning (RL) may seem like a natural approach. Unfortunately, RL does not work well for generally intelligent agents, as RL agents are incentivised to shortcut the reward sensor for maximum reward – the so-called wireheading problem. In this paper we suggest an alternative to RL called value reinforcement learning (VRL). In VRL, agents use the reward signal to learn a utility function. The VRL setup allows us to remove the incentive to wirehead by placing a constraint on the agent’s actions. The constraint is defined in terms of the agent’s belief distributions, and does not require an explicit specification of which actions constitute wireheading.
dc.format: 12 pages
dc.format.mimetype: application/pdf
dc.identifier.issn: 0302-9743
dc.identifier.uri: http://hdl.handle.net/1885/111445
dc.publisher: Springer Verlag (Germany)
dc.rights: © Springer International Publishing Switzerland 2016
dc.source: Lecture Notes in Computer Science
dc.subject: intelligent
dc.subject: agent
dc.subject: reinforcement learning (RL)
dc.subject: wireheading
dc.subject: problem
dc.subject: value reinforcement learning (VRL)
dc.subject: reward signal
dc.subject: learn
dc.subject: utility function
dc.title: Avoiding wireheading with value reinforcement learning
dc.type: Journal article
dcterms.accessRights: Open Access
local.bibliographicCitation.lastpage: 22
local.bibliographicCitation.startpage: 12
local.contributor.affiliation: Hutter, Marcus, Research School of Computer Science, College of Engineering and Computer Science, The Australian National University
local.contributor.affiliation: Everitt, Tom, Research School of Computer Science, College of Engineering and Computer Science, The Australian National University
local.contributor.authoruid: u4350841
local.description.notes: The article appears in the monographic series Lecture Notes in Computer Science: Everitt T., Hutter M. (2016) Avoiding Wireheading with Value Reinforcement Learning. In: Steunebrink B., Wang P., Goertzel B. (eds) Artificial General Intelligence. AGI 2016. Lecture Notes in Computer Science, vol 9782. Springer, Cham.
local.identifier.citationvolume: 9782
local.identifier.doi: 10.1007/978-3-319-41649-6_2
local.publisher.url: http://link.springer.com/
local.type.status: Published Version
