The First to Know: How Token Distributions Reveal Hidden Knowledge in Large Vision-Language Models?
| Field | Value | Language |
| --- | --- | --- |
| dc.contributor.author | Zhao, Qinyu | en |
| dc.contributor.author | Xu, Ming | en |
| dc.contributor.author | Gupta, Kartik | en |
| dc.contributor.author | Asthana, Akshay | en |
| dc.contributor.author | Zheng, Liang | en |
| dc.contributor.author | Gould, Stephen | en |
| dc.date.accessioned | 2025-05-23T04:21:41Z | |
| dc.date.available | 2025-05-23T04:21:41Z | |
| dc.date.issued | 2025 | en |
| dc.description.abstract | Large vision-language models (LVLMs), designed to interpret and respond to human instructions, occasionally generate hallucinated or harmful content due to inappropriate instructions. This study uses linear probing to shed light on the hidden knowledge at the output layers of LVLMs. We demonstrate that the logit distributions of the first tokens contain sufficient information to determine whether to respond to the instructions, including recognizing unanswerable visual questions, defending against jailbreaking attacks, and identifying deceptive questions. Such hidden knowledge is gradually lost in logits of subsequent tokens during response generation. Then, we illustrate a simple decoding strategy at the generation of the first token, effectively improving the generated content. In experiments, we find a few interesting insights: First, the CLIP model already contains a strong signal for solving these tasks, which indicates potential bias in the existing datasets. Second, we observe performance improvement by utilizing the first logit distributions on three additional tasks, including indicating uncertainty in math solving, mitigating hallucination, and image classification. Last, with the same training data, simply finetuning LVLMs improves models’ performance but is still inferior to linear probing on these tasks. Our code is available at https://github.com/Qinyu-Allen-Zhao/LVLM-LP. | en |
| dc.description.sponsorship | We would like to extend our deepest appreciation to Jaskirat Singh, Yicong Hong, Taojun Lin, Weijian Deng, Dylan Campbell, Shu Zou, Yunzhong Hou, Yuchi Liu, Xiaoxiao Sun, Jiahao Zhang, Zeyu Zhang, Xingjian Leng, Yang Yang, and all our other lab colleagues for their invaluable support throughout this project. Their collaborative efforts, insightful discussions, and constructive feedback have been crucial in shaping and improving our paper. This work was supported by an Australian Research Council (ARC) Linkage grant (project number LP210200931). | en |
| dc.description.status | Peer-reviewed | en |
| dc.format.extent | 16 | en |
| dc.identifier.isbn | 9783031731945 | en |
| dc.identifier.issn | 0302-9743 | en |
| dc.identifier.scopus | 85210863064 | en |
| dc.identifier.uri | http://www.scopus.com/inward/record.url?scp=85210863064&partnerID=8YFLogxK | en |
| dc.identifier.uri | https://hdl.handle.net/1885/733751303 | |
| dc.language.iso | en | en |
| dc.provenance | https://www.springernature.com/gp/open-science/policies/book-policies..."The Accepted Version can be archived in a Non-Commercial Institutional Repository. 12 months embargo" from Open Policy Finder site (as at 04/09/2025) | en |
| dc.publisher | Springer Science+Business Media B.V. | en |
| dc.relation.ispartof | Computer Vision – ECCV 2024 - 18th European Conference, Proceedings | en |
| dc.relation.ispartofseries | 18th European Conference on Computer Vision, ECCV 2024 | en |
| dc.relation.ispartofseries | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) | en |
| dc.rights | © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025. | en |
| dc.subject | First Token | en |
| dc.subject | Hidden Knowledge | en |
| dc.subject | Large Vision-Language Models | en |
| dc.subject | Linear Probing | en |
| dc.subject | Logit Distribution | en |
| dc.title | The First to Know: How Token Distributions Reveal Hidden Knowledge in Large Vision-Language Models? | en |
| dc.type | Conference paper | en |
| dspace.entity.type | Publication | en |
| local.bibliographicCitation.lastpage | 142 | en |
| local.bibliographicCitation.startpage | 127 | en |
| local.contributor.affiliation | Zhao, Qinyu; Australian National University | en |
| local.contributor.affiliation | Xu, Ming; Australian National University | en |
| local.contributor.affiliation | Gupta, Kartik; Seeing Machines Group | en |
| local.contributor.affiliation | Asthana, Akshay; Seeing Machines Group | en |
| local.contributor.affiliation | Zheng, Liang; School of Computing, ANU College of Systems and Society, The Australian National University | en |
| local.contributor.affiliation | Gould, Stephen; School of Computing, ANU College of Systems and Society, The Australian National University | en |
| local.identifier.doi | 10.1007/978-3-031-73195-2_8 | en |
| local.identifier.essn | 1611-3349 | en |
| local.identifier.pure | 5e90b4a3-566a-448c-9c15-d0fd97597c24 | en |
| local.identifier.url | https://www.scopus.com/pages/publications/85210863064 | en |
| local.type.status | Published | en |
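
The abstract describes linear probing on the logit distribution of the first generated token. The following is a minimal sketch of that general idea, not the authors' implementation (their code is at the repository linked in the abstract). It assumes synthetic stand-in "logits" drawn from a Gaussian and shifted along a fixed direction by class, and it trains a plain logistic-regression probe by gradient descent. All names here (`make_synthetic_logits`, `train_linear_probe`, `predict`, `accuracy`) and all hyperparameters are hypothetical and chosen for illustration only.

```python
import math
import random


def make_synthetic_logits(n_samples, direction, rng, shift=2.0):
    """Return (logits, label) pairs. ASSUMPTION: synthetic stand-in data,
    not real LVLM outputs. Label 1 shifts the vector along `direction`,
    label 0 shifts it the opposite way."""
    samples = []
    for _ in range(n_samples):
        label = rng.randint(0, 1)
        sign = 1.0 if label == 1 else -1.0
        logits = [rng.gauss(0.0, 1.0) + shift * sign * d for d in direction]
        samples.append((logits, label))
    return samples


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_linear_probe(samples, lr=0.1, epochs=200):
    """Fit weights w and bias b of a logistic-regression probe
    with full-batch gradient descent on the cross-entropy loss."""
    dim = len(samples[0][0])
    n = len(samples)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in samples:
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(x, w, b):
    """Return 1 if the probe score is positive, else 0."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0


def accuracy(samples, w, b):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(1 for x, y in samples if predict(x, w, b) == y)
    return correct / len(samples)


rng = random.Random(0)
vocab_size = 16  # tiny stand-in for a real vocabulary size

# Fixed unit-norm direction that separates the two synthetic classes.
raw = [rng.gauss(0.0, 1.0) for _ in range(vocab_size)]
norm = math.sqrt(sum(v * v for v in raw))
direction = [v / norm for v in raw]

train_set = make_synthetic_logits(400, direction, rng)
test_set = make_synthetic_logits(200, direction, rng)

w, b = train_linear_probe(train_set)
test_accuracy = accuracy(test_set, w, b)
print(f"held-out accuracy on synthetic data: {test_accuracy:.2f}")
```

In a real setting, each feature vector would be the first-token logit distribution taken from an LVLM for one instruction, and the label would mark, for example, whether the question is answerable. The probe itself stays a single linear layer, which is what keeps the method cheap compared with finetuning the whole model.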