Representation Learning for Agents in Non-Markovian Environments

Authors

Wang, Tianyu

Abstract

This thesis focuses on representation learning for agents that seek to acquire an environment model by interacting with the environment and learning from observation data. Simplified models such as Markov Decision Processes (MDPs), which assume a fully observable state and independence of the future from the past given the current observations, are widely employed. However, this assumption is commonly violated in practical applications: distant history can still be crucial for predicting the future, and the entire historical context may need to be retained. Our research, which falls under representation learning, aims to develop feature functions that map history to compact representations. Such representations should not only retain relevant historical information but also support efficient downstream inference. We first devise a representation learning scheme for AIXI, an artificial general intelligence framework. The generality of AIXI partially stems from treating each environment model as a probability distribution conditioned on the entire history. Approximations of AIXI typically assume the environment to be Markovian, which compromises their generality. In our approach, we identify a set of feature functions in a rich expressive logic that can map the entire history to an element of a state space, and in the process show how feature reinforcement learning ideas can be used in a direct approximation of AIXI. We then shift our focus to acquiring representations of 3D environments using deep learning techniques. We extend the object-centric learning literature to parse 3D point clouds into a set of object-level representations in an unsupervised fashion. These representations encompass object instances and semantic information that can be used for downstream decision-making. Finally, we address the scalability of object-centric learning. Earlier methods primarily concentrate on scenes of constrained size and do not address generalization to larger scenes. To bridge this gap, we introduce an online, scalable 3D object-centric pipeline capable of detecting and modelling objects from sequential observations collected from scenes of varying sizes. This inference process can be regarded as a feature function that maps historical data to a state efficiently, in constant time at each step.

Restricted until

2024-11-26
