ArGue: Attribute-Guided Prompt Tuning for Vision-Language Models

Tian, Xinyu; Zou, Shu; Yang, Zhaoyuan; Zhang, Jing

dc.contributor.author	Tian, Xinyu	en
dc.contributor.author	Zou, Shu	en
dc.contributor.author	Yang, Zhaoyuan	en
dc.contributor.author	Zhang, Jing	en
dc.date.accessioned	2025-05-23T04:21:09Z
dc.date.available	2025-05-23T04:21:09Z
dc.date.issued	2024	en
dc.description.abstract	Although soft prompt tuning is effective in efficiently adapting Vision-Language (V&L) models for downstream tasks, it shows limitations in dealing with distribution shifts. We address this issue with Attribute-Guided Prompt Tuning (ArGue), making three key contributions. 1) In contrast to the conventional approach of directly appending soft prompts preceding class names, we align the model with primitive visual attributes generated by Large Language Models (LLMs). We posit that a model's ability to express high confidence in these attributes signifies its capacity to discern the correct class rationales. 2) We introduce attribute sampling to eliminate disadvantageous attributes, so that only semantically meaningful attributes are preserved. 3) We propose negative prompting, explicitly enumerating class-agnostic attributes to activate spurious correlations and encourage the model to generate highly orthogonal probability distributions in relation to these negative features. In experiments, our method significantly outperforms current state-of-the-art prompt tuning methods on both novel class prediction and out-of-distribution generalization tasks. The code is available at https://github.com/Liam-Tian/ArGue.	en
dc.description.status	Peer-reviewed	en
dc.format.extent	10	en
dc.identifier.issn	1063-6919	en
dc.identifier.scopus	85188970177	en
dc.identifier.uri	http://www.scopus.com/inward/record.url?scp=85188970177&partnerID=8YFLogxK	en
dc.identifier.uri	https://hdl.handle.net/1885/733751288
dc.language.iso	en	en
dc.relation.ispartofseries	2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024	en
dc.rights	Publisher Copyright: © 2024 IEEE.	en
dc.source	Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition	en
dc.subject	few-shot adaptation	en
dc.subject	prompt tuning	en
dc.subject	vision-language model	en
dc.title	ArGue: Attribute-Guided Prompt Tuning for Vision-Language Models	en
dc.type	Conference paper	en
dspace.entity.type	Publication	en
local.bibliographicCitation.lastpage	28587	en
local.bibliographicCitation.startpage	28578	en
local.contributor.affiliation	Tian, Xinyu; Australian National University	en
local.contributor.affiliation	Zou, Shu; Australian National University	en
local.contributor.affiliation	Yang, Zhaoyuan; GE Research	en
local.contributor.affiliation	Zhang, Jing; School of Computing, ANU College of Systems and Society, The Australian National University	en
local.identifier.doi	10.1109/CVPR52733.2024.02700	en
local.identifier.pure	fcb5d592-6fa6-47d3-b686-bc705668c9d5	en
local.identifier.url	https://www.scopus.com/pages/publications/85188970177	en
local.type.status	Published	en

Collections

ANU Research Publications
