Avoiding wireheading with value reinforcement learning
| dc.contributor.author | Everitt, Tom | |
| dc.contributor.author | Hutter, Marcus | |
| dc.date.accessioned | 2016-12-21T01:37:16Z | |
| dc.date.available | 2016-12-21T01:37:16Z | |
| dc.date.issued | 2016-06-25 | |
| dc.description.abstract | How can we design good goals for arbitrarily intelligent agents? Reinforcement learning (RL) may seem like a natural approach. Unfortunately, RL does not work well for generally intelligent agents, as RL agents are incentivised to shortcut the reward sensor for maximum reward – the so-called wireheading problem. In this paper we suggest an alternative to RL called value reinforcement learning (VRL). In VRL, agents use the reward signal to learn a utility function. The VRL setup allows us to remove the incentive to wirehead by placing a constraint on the agent’s actions. The constraint is defined in terms of the agent’s belief distributions, and does not require an explicit specification of which actions constitute wireheading. | en_AU |
| dc.format | 12 pages | en_AU |
| dc.format.mimetype | application/pdf | en_AU |
| dc.identifier.issn | 0302-9743 | en_AU |
| dc.identifier.uri | http://hdl.handle.net/1885/111445 | |
| dc.publisher | Springer Verlag (Germany) | en_AU |
| dc.rights | © Springer International Publishing Switzerland 2016 | en_AU |
| dc.source | Lecture Notes in Computer Science | en_AU |
| dc.subject | intelligent | en_AU |
| dc.subject | agent | en_AU |
| dc.subject | reinforcement learning (RL) | en_AU |
| dc.subject | wireheading | en_AU |
| dc.subject | problem | en_AU |
| dc.subject | value reinforcement learning (VRL) | en_AU |
| dc.subject | reward signal | en_AU |
| dc.subject | learn | en_AU |
| dc.subject | utility function | en_AU |
| dc.title | Avoiding wireheading with value reinforcement learning | en_AU |
| dc.type | Journal article | en_AU |
| dcterms.accessRights | Open Access | en_AU |
| local.bibliographicCitation.lastpage | 22 | en_AU |
| local.bibliographicCitation.startpage | 12 | en_AU |
| local.contributor.affiliation | Hutter, Marcus, Research School of Computer Science, College of Engineering and Computer Science, The Australian National University | en_AU |
| local.contributor.affiliation | Everitt, Tom, Research School of Computer Science, College of Engineering and Computer Science, The Australian National University | en_AU |
| local.contributor.authoruid | u4350841 | en_AU |
| local.description.notes | The article appears in the monograph series: Everitt T., Hutter M. (2016) Avoiding Wireheading with Value Reinforcement Learning. In: Steunebrink B., Wang P., Goertzel B. (eds) Artificial General Intelligence. AGI 2016. Lecture Notes in Computer Science, vol 9782. Springer, Cham. | en_AU |
| local.identifier.citationvolume | 9782 | en_AU |
| local.identifier.doi | 10.1007/978-3-319-41649-6_2 | en_AU |
| local.publisher.url | http://link.springer.com/ | en_AU |
| local.type.status | Published Version | en_AU |