Canonical Shape Projection Is All You Need for 3D Few-Shot Class Incremental Learning

Cheraghian, Ali; Hayder, Zeeshan; Ramasinghe, Sameera; Rahman, Shafin; Jafaryahya, Javad; Petersson, Lars; Harandi, Mehrtash

Date

2025

Authors

Cheraghian, Ali

Hayder, Zeeshan

Ramasinghe, Sameera

Rahman, Shafin

Jafaryahya, Javad

Petersson, Lars

Harandi, Mehrtash

Publisher

Springer Science+Business Media B.V.

Abstract

Robust pre-trained foundation models have recently been applied successfully to many downstream tasks. Here, we use such models to address few-shot class incremental learning (FSCIL) on 3D point cloud objects. Our approach reprograms the well-known CLIP foundation model (trained on pairs of 2D images and text) for this purpose. Because CLIP ingests 2D images, we project each 3D object point cloud onto 2D image space to create depth maps. Prior works use a fixed, non-trainable set of camera poses for this projection. In contrast, we train the network to find a projection that best describes the object and is suited to extracting 2D image features with the CLIP vision encoder. The generated depth map is not directly suitable for CLIP, so we apply the model reprogramming paradigm to it, augmenting the foreground and background to adapt it. This removes the need to modify or fine-tune the foundation model. In the setting we investigate, data from novel classes is limited, which leads to overfitting. We address this with a prompt engineering approach that uses multiple GPT-generated text descriptions. Our method, C3PR, outperforms existing FSCIL methods on the ModelNet, ShapeNet, ScanObjectNN, and CO3D datasets. The code is available at https://github.com/alichr/C3PR.
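The projection step mentioned in the abstract can be illustrated with a minimal sketch. This is not the C3PR implementation: it assumes a simple orthographic camera, and the helper names `rotation_matrix` and `project_to_depth_map` are hypothetical. The camera pose is passed in as fixed angles here, whereas the abstract describes a pose that the network learns.

```python
import numpy as np


def rotation_matrix(yaw, pitch):
    """Rotation about the z axis (yaw), then about the x axis (pitch), in radians."""
    cy, sy = np.cos(yaw), np.sin(yaw)
    cp, sp = np.cos(pitch), np.sin(pitch)
    rz = np.array([[cy, -sy, 0.0], [sy, cy, 0.0], [0.0, 0.0, 1.0]])
    rx = np.array([[1.0, 0.0, 0.0], [0.0, cp, -sp], [0.0, sp, cp]])
    return rx @ rz


def project_to_depth_map(points, rotation, size=32):
    """Orthographically project an (N, 3) point cloud to a (size, size) depth map.

    Assumption: the camera looks along +z, so smaller z means closer.
    Pixel value 0 marks background; foreground values lie in [0.1, 1.0],
    with larger values closer to the camera.
    """
    rotated = points @ rotation.T

    # Scale the rotated cloud into the unit cube, preserving aspect ratio.
    mins = rotated.min(axis=0)
    span = (rotated.max(axis=0) - mins).max()
    span = span if span > 0 else 1.0
    normed = (rotated - mins) / span

    # Map x and y to pixel columns and rows.
    cols = np.clip(np.round(normed[:, 0] * (size - 1)).astype(int), 0, size - 1)
    rows = np.clip(np.round(normed[:, 1] * (size - 1)).astype(int), 0, size - 1)

    # Convert depth to closeness so that the nearest point wins per pixel.
    closeness = 1.0 - 0.9 * normed[:, 2]
    depth = np.zeros((size, size))
    np.maximum.at(depth, (rows, cols), closeness)
    return depth


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cloud = rng.normal(size=(1024, 3))  # stand-in for a real object point cloud
    depth_map = project_to_depth_map(cloud, rotation_matrix(0.5, 0.3))
    print(depth_map.shape, float(depth_map.max()))
```

In a trainable variant, the angles would be predicted by a network and the rasterization would need to be differentiable; the hard rounding used above is not.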

Keywords

3D shape projection, Few-shot class incremental learning, Model reprogramming

URI

http://www.scopus.com/inward/record.url?scp=85210160949&partnerID=8YFLogxK
https://hdl.handle.net/1885/733752739

Collections

ANU Research Publications

Type

Conference paper

Book Title

Computer Vision – ECCV 2024 - 18th European Conference, Proceedings

Entity type

Publication

DOI

10.1007/978-3-031-72940-9_3
