Understanding the World: High Level Perception Based on Spatial and Intuitive Physical Reasoning

Zhang, Peng

Date

2018

Authors

Zhang, Peng

Abstract

Artificial intelligence (AI) systems that can interact with real world environments have been widely investigated over the past decades. A variety of AI systems have been shown to solve problems robustly in fully observed and controlled environments. However, AI systems still struggle to analyse and interact effectively with semi-observed, unknown or constantly changing environments. One main difficulty is their inability to deal with raw sensory data at a high perceptual level. Low level perception is concerned with receiving sensory data and processing it systematically so that it can be used by other modules of the system. Low level perception may be sufficient for AI systems working in a fixed environment, as additional information about the environment is usually available. However, the absence of prior knowledge in less observed environments produces a gap between raw sensory data and the high level information required by many AI systems. To fill this gap, a perception system is required that can interpret raw sensory input as high level knowledge that AI systems can understand. The problems that limit the quality and capability of perception in AI systems are numerous. Although low level perception, which concerns data reception and pre-processing, is a significant component of a perception system, in this work we focus on the subsequent high level perception tasks, which relate to abstraction, representation and reasoning over the processed sensory data. High level perception requires several essential capabilities so that an AI system can obtain the critical information it needs before a decision is made. First, the ability to represent spatial properties of the sensory data helps the perception system to resolve conflicts caused by sensory noise and to recover incomplete information missed by the sensors.
We develop an algorithm that combines raw sensory measurements from different viewpoints of the same scene by resolving contradictory information and reconciling spatial features from the different measurements. With spatial knowledge representation and reasoning (KRR), the ability to infer and predict changes in the environment from current and previous states provides further guidance to the AI system for decision making. For this ability, we develop a general spatial reasoning system that predicts the evolution of constantly changing regions. However, in many situations where the AI system needs to physically interact with entities, spatial knowledge is necessary but not sufficient. The ability to analyse the physical properties of entities and their relations in the environment is also required. For this task, we first develop a 2-dimensional reasoning system that analyses the support relations of rectangular blocks and predicts the weak part of the block structure. We then extend this idea to develop a method for reasoning about the support relations of real-world objects in a stack, using RGB-D image data as input. With the above-mentioned capabilities, the perception system is able to represent the spatial properties of entities and their relations, and to predict their evolutionary trend by discovering hidden information in the raw sensory data. Meanwhile, for manipulation tasks, support relations between objects can also be inferred by combining spatial and physical reasoning in the perception system.