Meet JEANIE

dc.contributor.author: Wang, Lei [en]
dc.contributor.author: Liu, Jun [en]
dc.contributor.author: Zheng, Liang [en]
dc.contributor.author: Gedeon, Tom [en]
dc.contributor.author: Koniusz, Piotr [en]
dc.date.accessioned: 2025-05-30T03:27:25Z
dc.date.available: 2025-05-30T03:27:25Z
dc.date.issued: 2024 [en]
dc.description.abstract: Video sequences exhibit significant nuisance variations (undesired effects) of speed of actions, temporal locations, and subjects’ poses, leading to temporal-viewpoint misalignment when comparing two sets of frames or evaluating the similarity of two sequences. Thus, we propose Joint tEmporal and cAmera viewpoiNt alIgnmEnt (JEANIE) for sequence pairs. In particular, we focus on 3D skeleton sequences whose camera and subjects’ poses can be easily manipulated in 3D. We evaluate JEANIE on skeletal Few-shot Action Recognition (FSAR), where matching well temporal blocks (temporal chunks that make up a sequence) of support-query sequence pairs (by factoring out nuisance variations) is essential due to limited samples of novel classes. Given a query sequence, we create its several views by simulating several camera locations. For a support sequence, we match it with view-simulated query sequences, as in the popular Dynamic Time Warping (DTW). Specifically, each support temporal block can be matched to the query temporal block with the same or adjacent (next) temporal index, and adjacent camera views to achieve joint local temporal-viewpoint warping. JEANIE selects the smallest distance among matching paths with different temporal-viewpoint warping patterns, an advantage over DTW which only performs temporal alignment. We also propose an unsupervised FSAR akin to clustering of sequences with JEANIE as a distance measure. JEANIE achieves state-of-the-art results on NTU-60, NTU-120, Kinetics-skeleton and UWA3D Multiview Activity II on supervised and unsupervised FSAR, and their meta-learning inspired fusion. [en]
dc.description.status: Peer-reviewed [en]
dc.format.extent: 32 [en]
dc.identifier.issn: 0920-5691 [en]
dc.identifier.other: WOS:001216114400002 [en]
dc.identifier.other: ORCID:/0000-0002-8600-7099/work/162053178 [en]
dc.identifier.other: ORCID:/0000-0002-6340-5289/work/168471068 [en]
dc.identifier.scopus: 85192186088 [en]
dc.identifier.uri: http://www.scopus.com/inward/record.url?scp=85192186088&partnerID=8YFLogxK [en]
dc.identifier.uri: https://hdl.handle.net/1885/733754556
dc.language.iso: en [en]
dc.rights: Publisher Copyright: © The Author(s) 2024. [en]
dc.source: International Journal of Computer Vision [en]
dc.subject: Dictionary learning [en]
dc.subject: Dynamic time warping [en]
dc.subject: Few-shot action recognition [en]
dc.subject: Fusion [en]
dc.subject: MAML [en]
dc.subject: Skeletons [en]
dc.subject: Soft assignment [en]
dc.subject: Sparse coding [en]
dc.subject: Supervised [en]
dc.subject: Unsupervised [en]
dc.title: Meet JEANIE [en]
dc.type: Journal article [en]
dspace.entity.type: Publication [en]
local.contributor.affiliation: Wang, Lei; School of Computing, ANU College of Systems and Society, The Australian National University [en]
local.contributor.affiliation: Liu, Jun; Singapore University of Technology and Design [en]
local.contributor.affiliation: Zheng, Liang; School of Computing, ANU College of Systems and Society, The Australian National University [en]
local.contributor.affiliation: Gedeon, Tom; Curtin University [en]
local.contributor.affiliation: Koniusz, Piotr; School of Computing, ANU College of Systems and Society, The Australian National University [en]
local.identifier.doi: 10.1007/s11263-024-02070-2 [en]
local.identifier.pure: 52c8af5b-bddf-42a5-8f17-23c8791651e3 [en]
local.identifier.url: https://www.scopus.com/pages/publications/85192186088 [en]
local.type.status: Published [en]
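
The joint temporal-viewpoint matching summarised in the abstract can be illustrated with a small dynamic program. The code below is a minimal sketch under stated assumptions, not the authors' implementation: it assumes a precomputed distance tensor `dist[t, u, v]` between support block `t` and query block `u` rendered under simulated view `v`, a single view axis, standard DTW temporal steps, and a hard minimum over paths. The names `jeanie_like_distance` and `max_view_step` are hypothetical.

```python
import numpy as np


def jeanie_like_distance(dist, max_view_step=1):
    """Smallest accumulated distance over joint temporal-viewpoint paths.

    dist[t, u, v] is the (assumed, precomputed) distance between support
    temporal block t and query temporal block u under simulated view v.
    """
    n_s, n_q, n_v = dist.shape
    acc = np.full((n_s, n_q, n_v), np.inf)
    acc[0, 0, :] = dist[0, 0, :]  # a path may start at any view
    for t in range(n_s):
        for u in range(n_q):
            if t == 0 and u == 0:
                continue
            for v in range(n_v):
                # Only views within max_view_step of v count as adjacent.
                lo = max(0, v - max_view_step)
                hi = min(n_v, v + max_view_step + 1)
                best = np.inf
                # Standard DTW temporal predecessors.
                for pt, pu in ((t - 1, u), (t, u - 1), (t - 1, u - 1)):
                    if pt >= 0 and pu >= 0:
                        best = min(best, acc[pt, pu, lo:hi].min())
                acc[t, u, v] = dist[t, u, v] + best
    # A path may end at any view.
    return float(acc[-1, -1, :].min())
```

With `max_view_step=0` every path stays in one view, so the function reduces to the best per-view DTW distance. Allowing a view step of 1 lets a path switch to a neighbouring view between blocks, which is the extra freedom the abstract contrasts with temporal-only alignment.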
