Methods for Efficient Reinforcement Learning in Novel Environments

Date

2023

Authors

Nikonova, Ekaterina

Abstract

Recent advances in artificial intelligence have produced remarkable progress in systems that perform tasks at human and even superhuman levels. In deep reinforcement learning, this progress has led to agents that outperform humans in games such as Atari, Dota 2, and Go. However, these systems often require massive amounts of data and computation to learn a single task, so when they are deployed in the real world it is essential that they can adapt to changes quickly without retraining from scratch. Creating artificially intelligent systems that adjust rapidly to environmental change remains a significant and challenging task. While many approaches seek to make learning more efficient and adaptable, recent research in lifelong and open-world learning emphasizes the importance of reusing previous knowledge when dealing with novelty. In this thesis, we focus on developing approaches that use prior knowledge and experience to make deep reinforcement learning agents more efficient and adaptable to change. We present several methods that explore this idea, ranging from using human knowledge and agent observations to measuring the similarity between tasks and reusing learned policies. We begin by presenting a method that combines deep learning with Newtonian physics for accurate trajectory prediction in physical games, and demonstrate that it can transfer from one physical domain to another. We then propose a method to measure the similarity between scenarios and a method to construct a policy from the policies of highly similar tasks; our results show that these methods allow the agent to solve new scenarios more quickly, or even instantly. Next, we address safety concerns in reinforcement learning with human-defined safety rules that override the agent's actions whenever they are judged unsafe, resulting in markedly more efficient and safer learning. We then eliminate the need for human-defined rules by proposing a general framework in which a deep reinforcement learning agent discovers task-specific rules and self-supervises its learning; this approach leads to significantly more efficient learning in domains of varying complexity. Finally, we propose a universal method to measure the difficulty of novelties and test it in a real-world evaluation. We present experimental results for all of these methods and show how each makes learning more efficient and more adaptable to sudden changes and novelties.
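To make the trajectory-prediction idea concrete, here is a minimal sketch (not the thesis implementation; the residual-learning framing and all names are assumptions) of one common way to pair Newtonian projectile equations with a learned correction, so that the physics prior carries most of the prediction and can transfer across domains that share the same dynamics:

```python
import numpy as np

GRAVITY = 9.81  # m/s^2

def newtonian_trajectory(p0, v0, timesteps, dt=0.05):
    """Ideal projectile positions under gravity alone (no drag, no spin)."""
    t = np.arange(1, timesteps + 1) * dt
    x = p0[0] + v0[0] * t
    y = p0[1] + v0[1] * t - 0.5 * GRAVITY * t ** 2
    return np.stack([x, y], axis=1)  # (timesteps, 2) positions

class ResidualTrajectoryModel:
    """Physics prior plus a learned per-step correction.

    `net` is any regressor mapping [p0, v0] features to timesteps * 2
    position offsets; it learns only the gap between ideal physics and
    observed motion (drag, collisions, game-specific effects).
    """
    def __init__(self, net, timesteps):
        self.net = net
        self.timesteps = timesteps

    def predict(self, p0, v0):
        base = newtonian_trajectory(p0, v0, self.timesteps)
        correction = self.net(np.concatenate([p0, v0]))
        return base + correction.reshape(self.timesteps, 2)

# With a zero correction this reduces to pure Newtonian prediction:
model = ResidualTrajectoryModel(net=lambda feats: np.zeros(20), timesteps=10)
path = model.predict(np.array([0.0, 0.0]), np.array([3.0, 4.0]))
```

Because the network predicts only a residual over the shared physics, a model trained in one physical domain plausibly needs far less data to adapt in another, which is the transfer property the abstract describes.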
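The policy-reuse step (constructing a policy for a new scenario from the policies of highly similar tasks) could take many forms; the sketch below assumes a similarity-weighted blend of source-policy action distributions gated by a reuse threshold. The threshold, the weighting scheme, and all names are hypothetical, not taken from the thesis:

```python
import numpy as np

def policy_from_similar_tasks(obs, source_policies, similarities, threshold=0.8):
    """Pick an action by blending source-task policies, weighted by how
    similar each source task is to the new scenario (scores in [0, 1]).

    source_policies: callables mapping an observation to a vector of
    action probabilities; similarities: one score per source policy.
    Returns None when no source task clears the reuse threshold, i.e.
    the agent should fall back to learning from scratch.
    """
    weights = np.array([s if s >= threshold else 0.0 for s in similarities])
    if weights.sum() == 0.0:
        return None
    weights = weights / weights.sum()
    blended = sum(w * policy(obs) for w, policy in zip(weights, source_policies))
    return int(np.argmax(blended))

# Two hypothetical source policies over 3 actions; the second task is
# judged more similar, so its preferences dominate the blend.
policy_a = lambda obs: np.array([0.7, 0.2, 0.1])
policy_b = lambda obs: np.array([0.1, 0.1, 0.8])
action = policy_from_similar_tasks(obs=None,
                                   source_policies=[policy_a, policy_b],
                                   similarities=[0.85, 0.95])
```

A gated blend of this kind is one way a sufficiently similar past task can solve a new scenario "instantly": when similarity is high enough, the reused policy acts immediately with no further training.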
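The human-defined safety rules described in the abstract act as an override between the agent and the environment. The following is a minimal sketch of such a rule-based shield, with a hypothetical rule and action encoding; the thesis's actual rule format is not shown here:

```python
from typing import Callable, List, Optional

# A rule inspects (observation, proposed_action) and returns a safe
# replacement action, or None to let the proposed action through.
SafetyRule = Callable[[dict, int], Optional[int]]

class SafetyShield:
    """Sits between the agent and the environment: every action the
    agent proposes is filtered through the human-defined rules before
    it executes, so unsafe actions never reach the environment."""
    def __init__(self, rules: List[SafetyRule]):
        self.rules = rules

    def filter(self, observation: dict, proposed_action: int) -> int:
        for rule in self.rules:
            override = rule(observation, proposed_action)
            if override is not None:
                return override  # the rule judged the action unsafe
        return proposed_action

# Hypothetical rule: never step left (action 0) into a known hazard.
def avoid_left_hazard(obs: dict, action: int) -> Optional[int]:
    if obs.get("hazard_left") and action == 0:
        return 1  # substitute a "stay" action
    return None

shield = SafetyShield([avoid_left_hazard])
safe_action = shield.filter({"hazard_left": True}, proposed_action=0)  # -> 1
```

Blocking unsafe actions before execution also explains the efficiency gain the abstract claims: the agent never wastes episodes on catastrophic states, and the same interface allows the later framework to supply discovered rules in place of human-written ones.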

Type

Thesis (PhD)
