Human Motion Prediction: From Deterministic to Stochastic

Date

2022

Authors

Mao, Wei

Abstract

Humans are central subjects of study in computer vision. In particular, the ability to forecast future human motion is key to many applications, such as autonomous driving and human-robot interaction. In this thesis, we tackle the problem of 3D human motion prediction, which aims to predict a person's future movements given their past motion. Human motion is highly dynamic and uncertain, and is often reflected by a sequence of action labels. Accordingly, we address three main questions: 1) how can human motion data be modeled effectively? 2) how can diverse and plausible future motions be produced from a single motion history? 3) how can realistic future motions with clear semantic meaning, i.e., actions, be produced? These questions correspond to the three tasks we focus on: deterministic human motion prediction, stochastic human motion prediction, and action-driven stochastic human motion prediction. First, we propose a novel spatial-temporal encoding strategy that accounts for both the temporal smoothness of joint trajectories and the spatial dependencies among human joints. This is done by representing the motion sequence in trajectory space and using a fully-connected graph structure to model the spatial relationships among different trajectories. Moreover, we introduce an attention-based approach that exploits past sub-motions from a long motion history, which can better reflect the current context. Second, for stochastic human motion prediction, we propose an end-to-end trainable approach that produces diverse future motions by predicting a sequence of valid human poses with smooth trajectories. Our approach relaxes the requirement for a large amount of diverse training motion data and can also be extended to new applications such as controllable human motion prediction.
Finally, we introduce the new task of action-driven stochastic human motion prediction, which aims to predict a set of future motions given a sequence of action labels and past motion observations. We develop a weakly-supervised training strategy to learn various action transitions from data with few or even no transitions. We also propose an effective way to produce future motions of varying lengths. In summary, the goal of this thesis is to enable vision systems to effectively model and predict 3D human motions. In particular, we propose several novel ideas that push the boundaries of current state-of-the-art solutions for 3D human motion prediction.
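
The spatial-temporal encoding mentioned in the abstract (trajectory space plus a fully-connected graph over trajectories) can be illustrated with a minimal sketch. The abstract does not say which trajectory representation is used; the sketch assumes a discrete cosine transform (DCT) along time, which is one common choice. All function names are hypothetical, and the adjacency and weight matrices are fixed placeholders for parameters that a real model would presumably learn.

```python
import math


def dct_basis(n_frames, n_coeffs):
    """Orthonormal DCT-II basis with shape (n_coeffs, n_frames)."""
    basis = []
    for k in range(n_coeffs):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n_frames)
        basis.append([scale * math.cos(math.pi * (t + 0.5) * k / n_frames)
                      for t in range(n_frames)])
    return basis


def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(m):
    return [list(row) for row in zip(*m)]


def to_trajectory_space(motion, n_coeffs):
    """Encode each joint-coordinate trajectory by its DCT coefficients.

    motion: T frames, each a list of K joint coordinates.
    Returns K rows (one per trajectory) of n_coeffs coefficients.
    Assumption: DCT is used here only as an example representation.
    """
    basis = dct_basis(len(motion), n_coeffs)   # (C, T)
    return transpose(matmul(basis, motion))    # (K, C)


def graph_layer(coeffs, adjacency, weights):
    """One fully-connected graph layer over trajectories.

    adjacency (K, K) mixes information across trajectories;
    weights (C, C_out) mixes the coefficients of each trajectory.
    Both are placeholders for learned parameters.
    """
    return matmul(matmul(adjacency, coeffs), weights)


# Toy input: 4 frames, 2 coordinates (one constant, one moving linearly).
motion = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
coeffs = to_trajectory_space(motion, n_coeffs=4)

# Every trajectory is connected to every other one (dense adjacency).
adjacency = [[0.5, 0.5], [0.5, 0.5]]
weights = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
features = graph_layer(coeffs, adjacency, weights)
```

Keeping only the first few DCT coefficients (a smaller `n_coeffs`) would discard high-frequency components, which is one way such a representation can favor temporally smooth trajectories.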