3DInAction: Understanding Human Actions in 3D Point Clouds

dc.contributor.authorBen-Shabat, Yizhaken
dc.contributor.authorShrout, Orenen
dc.contributor.authorGould, Stephenen
dc.date.accessioned2025-05-23T15:21:44Z
dc.date.available2025-05-23T15:21:44Z
dc.date.issued2024en
dc.description.abstractWe propose a novel method for 3D point cloud action recognition. Understanding human actions in RGB videos has been widely studied in recent years; however, its 3D point cloud counterpart remains under-explored despite the clear value that 3D information may bring. This is mostly due to the inherent limitation of the point cloud data modality (lack of structure, permutation invariance, and a varying number of points), which makes it difficult to learn a spatio-temporal representation. To address this limitation, we propose the 3DinAction pipeline that first estimates patches moving in time (t-patches) as a key building block, alongside a hierarchical architecture that learns an informative spatio-temporal representation. We show that our method achieves improved performance on existing datasets, including DFAUST and IKEA ASM. Code is publicly available at https://github.com/sitzikbs/3dincaction.en
dc.description.sponsorshipThis project has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 893465. We also thank Microsoft for Azure credits and the NVIDIA Academic Hardware Grant Program for providing a high-speed A5000 GPU.en
dc.description.statusPeer-revieweden
dc.format.extent10en
dc.identifier.issn1063-6919en
dc.identifier.scopus85218340761en
dc.identifier.urihttp://www.scopus.com/inward/record.url?scp=85218340761&partnerID=8YFLogxKen
dc.identifier.urihttps://hdl.handle.net/1885/733752528
dc.language.isoenen
dc.relation.ispartofseries2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024en
dc.rightsPublisher Copyright: © 2024 IEEE.en
dc.sourceProceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognitionen
dc.subject3D action recognitionen
dc.subjectpoint cloudsen
dc.subjectspatio-temporal representationen
dc.subjecttemporal patchesen
dc.title3DInAction: Understanding Human Actions in 3D Point Cloudsen
dc.typeConference paperen
dspace.entity.typePublicationen
local.bibliographicCitation.lastpage19987en
local.bibliographicCitation.startpage19978en
local.contributor.affiliationBen-Shabat, Yizhak; School of Computing, ANU College of Systems and Society, The Australian National Universityen
local.contributor.affiliationShrout, Oren; Technion-Israel Institute of Technologyen
local.contributor.affiliationGould, Stephen; School of Computing, ANU College of Systems and Society, The Australian National Universityen
local.identifier.doi10.1109/CVPR52733.2024.01888en
local.identifier.pure472b73a6-760c-435c-a00e-bd7c523235bben
local.identifier.urlhttps://www.scopus.com/pages/publications/85218340761en
local.type.statusPublisheden
