Physical reasoning for intelligent agents in simulated environments

Ge, Xiaoyu

Date

2017

Authors

Ge, Xiaoyu

Abstract

Developing Artificial Intelligence (AI) that is capable of understanding and interacting with the real world in a sophisticated way has long been a grand vision of AI. A growing number of AI agents are entering our daily lives and assisting us with tasks ranging from house cleaning to serving food in restaurants. While different tasks have different goals, the task domains all obey the physical rules (classical Newtonian physics) of the real world. To successfully interact with the physical world, an agent needs to be able to "understand" its surrounding environment, to predict the consequences of its actions, and to devise plans that achieve a goal without causing unintended outcomes. Much AI research over the past decades has been dedicated to specific subfields such as machine learning and computer vision. Simply plugging in techniques from these subfields falls far short of creating a comprehensive AI agent that can work well in a physical environment. Instead, it requires an integration of methods from different AI areas that considers the specific conditions and requirements of the physical environment.

In this thesis, we identified several capabilities that are essential for AI to interact with the physical world, namely visual perception, object detection, object tracking, action selection, and structure planning. As the real world is a highly complex environment, we started by developing these capabilities in virtual environments with realistic physics simulations. The central part of our methods is the combination of qualitative reasoning and standard techniques from different AI areas. For the visual perception capability, we developed a method that can infer spatial properties of rectangular objects from their minimum bounding rectangles. For the object detection capability, we developed a method that can detect unknown objects in a structure by reasoning about the stability of the structure.
For the object tracking capability, we developed a method that can match perceptually indistinguishable objects in visual observations made before and after a physical impact. This method can identify spatial changes of objects in the physical event, and the result of matching can be used for learning the consequence of the impact. For the action selection capability, we developed a method that solves a hole-in-one problem that requires selecting an action out of an infinite number of actions with unknown consequences. For the structure planning capability, we developed a method that can arrange objects to form a stable and robust structure by reasoning about structural stability and robustness.