Avoiding wireheading with value reinforcement learning
How can we design good goals for arbitrarily intelligent agents? Reinforcement learning (RL) may seem like a natural approach. Unfortunately, RL does not work well for generally intelligent agents, as RL agents are incentivised to shortcut the reward sensor for maximum reward – the so-called wireheading problem. In this paper we suggest an alternative to RL called value reinforcement learning (VRL). In VRL, agents use the reward signal to learn a utility function. The VRL setup allows us to...
|Collections||ANU Research Publications|
|Source:||Lecture Notes in Computer Science|