3DInAction: Understanding Human Actions in 3D Point Clouds

Ben-Shabat, Yizhak; Shrout, Oren; Gould, Stephen

Date

2024

Authors

Ben-Shabat, Yizhak

Shrout, Oren

Gould, Stephen

Abstract

We propose a novel method for 3D point cloud action recognition. Understanding human actions in RGB videos has been widely studied in recent years; however, its 3D point cloud counterpart remains under-explored despite the clear value that 3D information may bring. This is mostly due to the inherent limitations of the point cloud data modality (lack of structure, permutation invariance, and a varying number of points), which make it difficult to learn a spatio-temporal representation. To address these limitations, we propose the 3DinAction pipeline that first estimates patches moving in time (t-patches) as a key building block, alongside a hierarchical architecture that learns an informative spatio-temporal representation. We show that our method achieves improved performance on existing datasets, including DFAUST and IKEA ASM. Code is publicly available at https://github.com/sitzikbs/3dincaction.
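
To make the idea of a patch moving in time more concrete, the following is a minimal sketch of one plausible reading of a t-patch: a local neighborhood is re-anchored in each frame on the point nearest to the previous anchor, and its k nearest neighbors are collected. This is an illustrative assumption, not the authors' implementation; the function names `knn_indices` and `extract_t_patch`, the nearest-neighbor tracking rule, and the fixed patch size `k` are all hypothetical choices made for this example.

```python
import numpy as np


def knn_indices(points, query, k):
    """Return indices of the k points closest to `query` (Euclidean distance)."""
    dists = np.linalg.norm(points - query, axis=1)
    return np.argsort(dists)[:k]


def extract_t_patch(frames, seed_index, k):
    """Follow one local patch through a sequence of point clouds.

    Hypothetical helper for illustration only.
    frames: list of (N_t, 3) arrays; N_t may differ between frames,
            but every frame must contain at least k points.
    seed_index: index of the anchor point in the first frame.
    Returns an array of shape (T, k, 3).
    """
    anchor = frames[0][seed_index]
    patches = []
    for points in frames:
        # Assumed tracking rule: re-anchor on the point nearest to the
        # previous anchor (in the first frame this is the seed itself).
        nearest = knn_indices(points, anchor, 1)[0]
        anchor = points[nearest]
        # The patch is the anchor plus its nearest neighbors in this frame.
        patches.append(points[knn_indices(points, anchor, k)])
    return np.stack(patches)
```

Because every frame contributes exactly k points, the result is a regular (T, k, 3) array even when the frames hold different numbers of points, which is the kind of structure a learned spatio-temporal model can consume directly.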

Keywords

3D action recognition, point clouds, spatio-temporal representation, temporal patches

URI

http://www.scopus.com/inward/record.url?scp=85218340761&partnerID=8YFLogxK
https://hdl.handle.net/1885/733752528

Collections

ANU Research Publications

Source

Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition

Type

Conference paper

Entity type

Publication

DOI

10.1109/CVPR52733.2024.01888
