Aslanides, John; Leike, Jan; Hutter, Marcus
Many state-of-the-art reinforcement learning (RL) algorithms assume that the environment is an ergodic Markov Decision Process (MDP). In contrast, the field of universal reinforcement learning (URL) is concerned with algorithms that make as few assumptions as possible about the environment. The universal Bayesian agent AIXI and a family of related URL algorithms have been developed in this setting. While numerous theoretical optimality results have been proven for these agents, there...