Tensor Representations for Action Recognition

Authors

Koniusz, Piotr
Wang, Lei
Cherian, Anoop

Publisher

Institute of Electrical and Electronics Engineers (IEEE Inc)

Abstract

Human actions in videos are characterized by the complex interplay between various spatial features and their temporal dynamics. In this paper, we propose novel tensor representations for compactly capturing such higher-order relationships. We propose two tensor-based feature representations, viz. (i) the sequence compatibility kernel (SCK) and (ii) the dynamics compatibility kernel (DCK); the former capitalizes on the spatio-temporal correlations between features, while the latter captures spatio-temporal variations of pairs of features. We also propose a generalization of SCK, coined SCK+, that operates on subsequences to capture the local-global interplay of correlations for skeleton 3D body-joints and/or per-frame CNN classifier scores. As naive tensor formulations may be intractable, we introduce a linearization of these kernels that leads to fast descriptors. We provide experiments on (i) 3D skeleton action sequences, (ii) fine-grained videos, and (iii) standard non-fine-grained videos using CNN and articulated human 3D body-joint sequences. Our results show state-of-the-art performance. We use higher-order tensors, related to bilinear models for fine-grained data, and so-called Eigenvalue Power Normalization (EPN), which have long been speculated to form a higher-order occurrence detector, thus capturing the fine-grained relationships of features rather than merely counting features in scenes. We prove that a tensor coupled with EPN indeed acts as such a detector.
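
As an illustration of the EPN idea mentioned in the abstract, the following is a minimal sketch for the second-order (matrix) case only. It is not the authors' implementation: the paper works with higher-order tensors and linearized kernels, whereas this example simply averages outer products of per-frame feature vectors and raises the eigenvalues of the result to a power. The function names, the exponent `gamma`, and the random input data are assumptions made for this example.

```python
import numpy as np


def eigenvalue_power_normalization(M, gamma=0.5):
    """Raise the eigenvalues of a symmetric PSD matrix to the power gamma."""
    # Symmetrize to guard against round-off before the eigendecomposition.
    M = 0.5 * (M + M.T)
    eigvals, eigvecs = np.linalg.eigh(M)
    # Clip tiny negative eigenvalues caused by floating-point error.
    eigvals = np.clip(eigvals, 0.0, None)
    # Reassemble U diag(lambda^gamma) U^T.
    return (eigvecs * eigvals**gamma) @ eigvecs.T


def second_order_descriptor(features, gamma=0.5):
    """Average outer product of per-frame feature vectors, followed by EPN."""
    features = np.asarray(features, dtype=float)
    M = features.T @ features / features.shape[0]
    return eigenvalue_power_normalization(M, gamma)


# Hypothetical input: 20 frames, each with a 5-dimensional feature vector.
rng = np.random.default_rng(0)
frames = rng.standard_normal((20, 5))
descriptor = second_order_descriptor(frames, gamma=0.5)
```

With `gamma=0.5` the result is the matrix square root of the averaged outer-product matrix, which flattens the eigenvalue spectrum so that a few dominant correlations do not swamp the rest; `gamma=1.0` leaves the matrix unchanged.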

Source

IEEE Transactions on Pattern Analysis and Machine Intelligence

Restricted until

2099-12-31