Latent-based Diffusion Model for Long-tailed Recognition

dc.contributor.author: Han, Pengxiao (en)
dc.contributor.author: Ye, Changkun (en)
dc.contributor.author: Zhou, Jieming (en)
dc.contributor.author: Zhang, Jing (en)
dc.contributor.author: Hong, Jie (en)
dc.contributor.author: Li, Xuesong (en)
dc.date.accessioned: 2025-05-23T17:27:08Z
dc.date.available: 2025-05-23T17:27:08Z
dc.date.issued: 2024 (en)
dc.description.abstract: Long-tailed, imbalanced class distributions are a common issue in practical computer vision applications. Previous methods for this problem fall into several categories: re-sampling, re-weighting, transfer learning, and feature augmentation. In recent years, diffusion models have shown impressive generation ability in many sub-problems of deep computer vision. However, their generative power has not been explored for long-tailed problems. We propose a new approach, the Latent-based Diffusion Model for Long-tailed Recognition (LDMLR), as a feature augmentation method to tackle the issue. First, we encode the imbalanced dataset into features using the baseline model. Then, we train a Denoising Diffusion Implicit Model (DDIM) on these encoded features to generate pseudo-features. Finally, we train the classifier on both the encoded features and the pseudo-features from the previous two steps. The proposed method improves the model's accuracy on the CIFAR-LT and ImageNet-LT datasets. (en)
dc.description.status: Peer-reviewed (en)
dc.format.extent: 10 (en)
dc.identifier.isbn: 9798350365474 (en)
dc.identifier.issn: 2160-7508 (en)
dc.identifier.scopus: 85206447690 (en)
dc.identifier.uri: http://www.scopus.com/inward/record.url?scp=85206447690&partnerID=8YFLogxK (en)
dc.identifier.uri: https://hdl.handle.net/1885/733752819
dc.language.iso: en (en)
dc.publisher: IEEE Computer Society (en)
dc.relation.ispartof: Proceedings - 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2024 (en)
dc.relation.ispartofseries: 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, CVPRW 2024 (en)
dc.relation.ispartofseries: IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (en)
dc.rights: Publisher Copyright: © 2024 IEEE. (en)
dc.subject: diffusion model (en)
dc.subject: imbalance distribution (en)
dc.subject: long-tailed recognition (en)
dc.title: Latent-based Diffusion Model for Long-tailed Recognition (en)
dc.type: Conference paper (en)
dspace.entity.type: Publication (en)
local.bibliographicCitation.lastpage: 2648 (en)
local.bibliographicCitation.startpage: 2639 (en)
local.contributor.affiliation: Han, Pengxiao; Australian National University (en)
local.contributor.affiliation: Ye, Changkun; Australian National University (en)
local.contributor.affiliation: Zhou, Jieming; ANU College of Systems and Society, The Australian National University (en)
local.contributor.affiliation: Zhang, Jing; School of Computing, ANU College of Systems and Society, The Australian National University (en)
local.contributor.affiliation: Hong, Jie; Australian National University (en)
local.contributor.affiliation: Li, Xuesong; Biological Data Science Institute, ANU College of Science and Medicine, The Australian National University (en)
local.identifier.doi: 10.1109/CVPRW63382.2024.00270 (en)
local.identifier.essn: 2160-7516 (en)
local.identifier.pure: 79554173-98e2-4f70-bf7f-09d64138c763 (en)
local.identifier.url: https://www.scopus.com/pages/publications/85206447690 (en)
local.type.status: Published (en)
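
The second step in the abstract (training a DDIM on encoded features to generate pseudo-features) relies at sampling time on the standard deterministic DDIM update. The following is a minimal plain-Python sketch of that update applied to a feature vector; it is not the authors' implementation. The names `forward_noise`, `ddim_step`, `sample_feature`, the `eps_model` callable, and the `alpha_bars` schedule are all hypothetical and chosen only for illustration, and the class-conditioning via `label` is an assumption about how pseudo-features for a given class might be requested.

```python
import math
import random


def forward_noise(x0, alpha_bar, eps):
    """Forward diffusion: x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps."""
    a, b = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [a * x + b * e for x, e in zip(x0, eps)]


def ddim_step(x_t, eps_pred, alpha_bar_t, alpha_bar_prev):
    """One deterministic DDIM update (eta = 0) from step t to the previous step."""
    # Estimate the clean feature implied by the predicted noise.
    x0_pred = [
        (x - math.sqrt(1.0 - alpha_bar_t) * e) / math.sqrt(alpha_bar_t)
        for x, e in zip(x_t, eps_pred)
    ]
    # Re-noise the estimate to the previous (less noisy) level.
    return [
        math.sqrt(alpha_bar_prev) * x0 + math.sqrt(1.0 - alpha_bar_prev) * e
        for x0, e in zip(x0_pred, eps_pred)
    ]


def sample_feature(eps_model, label, dim, alpha_bars, rng):
    """Generate one pseudo-feature by running DDIM steps from pure noise.

    eps_model(x, t, label) is a hypothetical trained noise predictor.
    alpha_bars is ordered from least noisy (index 0) to most noisy (last).
    """
    x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    for t in reversed(range(len(alpha_bars))):
        a_prev = alpha_bars[t - 1] if t > 0 else 1.0
        x = ddim_step(x, eps_model(x, t, label), alpha_bars[t], a_prev)
    return x
```

As a sanity check on the update rule: if `ddim_step` is given the exact noise that `forward_noise` used, a single step to `alpha_bar_prev = 1.0` recovers the original feature vector up to floating-point error. In the pipeline described in the abstract, features produced this way would be combined with the encoded features to train the classifier.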
