Tensor Representations for Action Recognition
Authors
Koniusz, Piotr
Wang, Lei
Cherian, Anoop
Publisher
Institute of Electrical and Electronics Engineers (IEEE Inc)
Abstract
Human actions in videos are characterized by the complex interplay between various spatial features and their temporal dynamics. In this paper, we propose novel tensor representations for compactly capturing such higher-order relationships. We propose two tensor-based feature representations, viz. (i) the sequence compatibility kernel (SCK) and (ii) the dynamics compatibility kernel (DCK); the former capitalizes on the spatio-temporal correlations between features, while the latter captures spatio-temporal variations of pairs of features. We also propose a generalization of SCK, coined SCK+, that operates on subsequences to capture the local-global interplay of correlations for skeleton 3D body-joints and/or per-frame CNN classifier scores. As naive tensor formulations may be intractable, we introduce a linearization of these kernels that leads to fast descriptors. We provide experiments on (i) 3D skeleton action sequences, (ii) fine-grained videos, and (iii) standard non-fine-grained videos using CNN and articulated human 3D body-joint sequences. Our results show state-of-the-art performance. We use higher-order tensors, related to bilinear models for fine-grained data, and so-called Eigenvalue Power Normalization (EPN), which have long been speculated to form a higher-order occurrence detector, thus capturing the fine-grained relationships of features rather than merely counting features in scenes. We prove that a tensor coupled with EPN indeed acts as such a detector.
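To illustrate the idea of eigenvalue power normalization mentioned in the abstract, the following is a minimal sketch only. It assumes a second-order autocorrelation matrix in place of the higher-order tensors the paper works with, and it is not the authors' implementation. The function name `power_normalize_eigenvalues`, the exponent `gamma`, and the synthetic data are all hypothetical choices made for this example.

```python
import numpy as np


def power_normalize_eigenvalues(features, gamma=0.5):
    """Build a second-order autocorrelation matrix and raise its eigenvalues to `gamma`.

    Assumption: this is a simplified second-order analogue, not the
    higher-order tensor formulation described in the abstract.
    """
    features = np.asarray(features, dtype=float)  # shape (N, d)
    # Average outer product of the feature vectors.
    M = features.T @ features / features.shape[0]
    # M is symmetric positive semi-definite, so a symmetric eigensolver applies.
    eigvals, eigvecs = np.linalg.eigh(M)
    # Guard against tiny negative eigenvalues caused by round-off.
    eigvals = np.clip(eigvals, 0.0, None)
    # Rebuild the matrix with power-normalized eigenvalues.
    return (eigvecs * eigvals**gamma) @ eigvecs.T


# Synthetic data (hypothetical): one feature dimension dominates the others.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
X[:, 0] *= 10.0

M_raw = X.T @ X / X.shape[0]
M_epn = power_normalize_eigenvalues(X, gamma=0.5)

# The eigenvalue spread shrinks after normalization.
print(np.linalg.cond(M_raw), np.linalg.cond(M_epn))
```

With `gamma` below 1, large eigenvalues are compressed more than small ones, so a frequently occurring feature direction no longer dominates the descriptor. Under the assumptions above, this is one way to read the abstract's contrast between detecting co-occurrences and merely counting features.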
Source
IEEE Transactions on Pattern Analysis and Machine Intelligence
Restricted until
2099-12-31