Death and suicide in universal artificial intelligence
Date
2016-06-25
Authors
Martin, Jarryd
Everitt, Tom
Hutter, Marcus
Publisher
Springer Verlag (Germany)
Abstract
Reinforcement learning (RL) is a general paradigm for studying
intelligent behaviour, with applications ranging from artificial intelligence
to psychology and economics. AIXI is a universal solution to the
RL problem; it can learn any computable environment. A technical subtlety
of AIXI is that it is defined using a mixture over semimeasures
that need not sum to 1, rather than over proper probability measures.
In this work we argue that the shortfall of a semimeasure can naturally
be interpreted as the agent’s estimate of the probability of its death. We
formally define death for generally intelligent agents like AIXI, and prove
a number of related theorems about their behaviour. Notable discoveries
include that agent behaviour can change radically under positive linear
transformations of the reward signal (from suicidal to dogmatically
self-preserving), and that the agent’s posterior belief that it will survive
increases over time.
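The two central ideas of the abstract can be illustrated with a minimal one-step sketch. It is not the paper's formal construction: the toy environment `env`, the action names, and the reward values are all hypothetical. It assumes that the probability mass missing from a semimeasure (the death probability) contributes zero reward, so shifting every reward by a constant can change which action looks best.

```python
def death_probability(semimeasure):
    """Shortfall of a semimeasure: 1 minus the total mass on next percepts."""
    return 1.0 - sum(semimeasure.values())


def expected_reward(semimeasure, reward):
    """One-step expected reward; the missing mass (death) contributes 0."""
    return sum(p * reward[percept] for percept, p in semimeasure.items())


# Hypothetical environment: each action induces a semimeasure over percepts.
env = {
    "safe": {"good": 0.5, "bad": 0.5},   # mass sums to 1, no chance of death
    "risky": {"good": 0.1, "bad": 0.1},  # mass sums to 0.2, the rest is death
}


def best_action(reward):
    """Action that maximises one-step expected reward under `reward`."""
    return max(env, key=lambda action: expected_reward(env[action], reward))


# Rewards in [0, 1], and the same rewards shifted by -1 into [-1, 0].
positive = {"good": 1.0, "bad": 0.5}
shifted = {percept: r - 1.0 for percept, r in positive.items()}

print(death_probability(env["risky"]))
print(best_action(positive))  # prefers the action with no death probability
print(best_action(shifted))   # prefers the action with high death probability
```

In this toy setting the shift `r -> r - 1` leaves the ordering of rewards among surviving outcomes unchanged, yet the preferred action flips, because death keeps its implicit value of 0 while every surviving outcome becomes non-positive.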
Keywords
reinforcement learning (RL), semimeasure, intelligent behaviour, artificial intelligence, AIXI, death, probability
Source
Lecture Notes in Computer Science
Type
Journal article
Access Statement
Open Access