Latent-based Diffusion Model for Long-tailed Recognition
| dc.contributor.author | Han, Pengxiao | en |
| dc.contributor.author | Ye, Changkun | en |
| dc.contributor.author | Zhou, Jieming | en |
| dc.contributor.author | Zhang, Jing | en |
| dc.contributor.author | Hong, Jie | en |
| dc.contributor.author | Li, Xuesong | en |
| dc.date.accessioned | 2025-05-23T17:27:08Z | |
| dc.date.available | 2025-05-23T17:27:08Z | |
| dc.date.issued | 2024 | en |
| dc.description.abstract | Long-tailed imbalanced distributions are a common issue in practical computer vision applications. Previous methods for addressing this problem fall into several categories: re-sampling, re-weighting, transfer learning, and feature augmentation. In recent years, diffusion models have shown impressive generation ability in many sub-problems of deep computer vision. However, their powerful generative ability has not been explored for long-tailed problems. We propose a new approach, the Latent-based Diffusion Model for Long-tailed Recognition (LDMLR), as a feature augmentation method to tackle the issue. First, we encode the imbalanced dataset into features using the baseline model. Then, we train a Denoising Diffusion Implicit Model (DDIM) on these encoded features to generate pseudo-features. Finally, we train the classifier using both the encoded features and the pseudo-features from the previous two steps. The proposed method improves the model's accuracy on the CIFAR-LT and ImageNet-LT datasets. | en |
| dc.description.status | Peer-reviewed | en |
| dc.format.extent | 10 | en |
| dc.identifier.isbn | 9798350365474 | en |
| dc.identifier.issn | 2160-7508 | en |
| dc.identifier.scopus | 85206447690 | en |
| dc.identifier.uri | http://www.scopus.com/inward/record.url?scp=85206447690&partnerID=8YFLogxK | en |
| dc.identifier.uri | https://hdl.handle.net/1885/733752819 | |
| dc.language.iso | en | en |
| dc.publisher | IEEE Computer Society | en |
| dc.relation.ispartof | Proceedings - 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2024 | en |
| dc.relation.ispartofseries | 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2024 | en |
| dc.relation.ispartofseries | IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops | en |
| dc.rights | Publisher Copyright: © 2024 IEEE. | en |
| dc.subject | diffusion model | en |
| dc.subject | imbalance distribution | en |
| dc.subject | long-tailed recognition | en |
| dc.title | Latent-based Diffusion Model for Long-tailed Recognition | en |
| dc.type | Conference paper | en |
| dspace.entity.type | Publication | en |
| local.bibliographicCitation.lastpage | 2648 | en |
| local.bibliographicCitation.startpage | 2639 | en |
| local.contributor.affiliation | Han, Pengxiao; Australian National University | en |
| local.contributor.affiliation | Ye, Changkun; Australian National University | en |
| local.contributor.affiliation | Zhou, Jieming; ANU College of Systems and Society, The Australian National University | en |
| local.contributor.affiliation | Zhang, Jing; School of Computing, ANU College of Systems and Society, The Australian National University | en |
| local.contributor.affiliation | Hong, Jie; Australian National University | en |
| local.contributor.affiliation | Li, Xuesong; Biological Data Science Institute, ANU College of Science and Medicine, The Australian National University | en |
| local.identifier.doi | 10.1109/CVPRW63382.2024.00270 | en |
| local.identifier.essn | 2160-7516 | en |
| local.identifier.pure | 79554173-98e2-4f70-bf7f-09d64138c763 | en |
| local.identifier.url | https://www.scopus.com/pages/publications/85206447690 | en |
| local.type.status | Published | en |
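
The abstract describes training a DDIM on encoded features and then sampling pseudo-features from it. This record gives no implementation details, so the following is only a minimal sketch of the standard deterministic DDIM update applied to feature vectors, not the authors' code. The function names `add_noise` and `ddim_step`, the feature shape, and the noise level are all hypothetical, and the trained noise-prediction network is replaced by the true noise so that the example runs on its own.

```python
import numpy as np


def add_noise(x0, eps, alpha_bar_t):
    """Forward diffusion: mix clean features x0 with Gaussian noise eps."""
    return np.sqrt(alpha_bar_t) * x0 + np.sqrt(1.0 - alpha_bar_t) * eps


def ddim_step(x_t, eps_pred, alpha_bar_t, alpha_bar_prev):
    """One deterministic DDIM update (eta = 0) from level t to the previous level."""
    # Estimate the clean features implied by the predicted noise.
    x0_pred = (x_t - np.sqrt(1.0 - alpha_bar_t) * eps_pred) / np.sqrt(alpha_bar_t)
    # Re-noise that estimate to the previous (less noisy) level.
    return np.sqrt(alpha_bar_prev) * x0_pred + np.sqrt(1.0 - alpha_bar_prev) * eps_pred


# Assumption: random vectors stand in for features encoded by the baseline model.
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 8))
noise = rng.normal(size=features.shape)

# Assumption: an arbitrary cumulative noise level of 0.5.
alpha_bar_t = 0.5
x_t = add_noise(features, noise, alpha_bar_t)

# Assumption: the true noise plays the role of a perfect noise predictor.
# Stepping to alpha_bar_prev = 1.0 (no noise) then returns the clean features.
recovered = ddim_step(x_t, noise, alpha_bar_t, 1.0)
```

In an actual pipeline, `eps_pred` would come from a learned network, and the update would be repeated over a decreasing schedule of noise levels, starting from pure Gaussian noise, to produce new pseudo-features for the classifier training stage.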