Second-order Temporal Pooling for Action Recognition

Cherian, Anoop; Gould, Stephen


Deep learning models for video-based action recognition usually generate features for short clips (consisting of a few frames); these clip-level features are then aggregated into video-level representations by computing statistics over them. Typically, zeroth-order (max) or first-order (average) statistics are used. In this paper, we explore the benefits of using second-order statistics. Specifically, we propose a novel end-to-end learnable feature aggregation scheme, dubbed temporal correlation...
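
To make the three kinds of statistics mentioned in the abstract concrete, the following is a minimal NumPy sketch of pooling clip-level features into a video-level representation. It is an illustration only and not the authors' learnable scheme: the function name `pool_clip_features`, the feature shapes, and the random input are assumptions made for this example, and the second-order statistic shown here is a plain (uncentered) feature correlation matrix.

```python
import numpy as np


def pool_clip_features(clip_features):
    """Aggregate clip-level features into video-level statistics.

    clip_features: array of shape (num_clips, dim), one row per clip.
    Hypothetical helper for illustration; not taken from the paper.
    """
    x = np.asarray(clip_features, dtype=float)
    max_pooled = x.max(axis=0)           # "zeroth-order" pooling: per-dimension max
    avg_pooled = x.mean(axis=0)          # first-order pooling: per-dimension average
    second_order = x.T @ x / x.shape[0]  # second-order: (dim, dim) co-activation matrix
    return max_pooled, avg_pooled, second_order


# Assumed toy input: 8 clips, each described by a 4-dimensional feature.
rng = np.random.default_rng(0)
features = rng.standard_normal((8, 4))
max_pooled, avg_pooled, second_order = pool_clip_features(features)
```

Max and average pooling each yield one value per feature dimension, whereas the second-order statistic yields a symmetric dim-by-dim matrix that records how pairs of feature dimensions co-activate across clips.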

Collections: ANU Research Publications
Date published: 2018-08-19
Type: Journal article
Source: International Journal of Computer Vision
DOI: 10.1007/s11263-018-1111-5
Access Rights: Open Access


File: 1704.06925.pdf
Description: Author Accepted Manuscript
Size: 1.91 MB
Format: Adobe PDF

Items in Open Research are protected by copyright, with all rights reserved, unless otherwise indicated.