The First to Know: How Token Distributions Reveal Hidden Knowledge in Large Vision-Language Models?

Zhao, Qinyu; Xu, Ming; Gupta, Kartik; Asthana, Akshay; Zheng, Liang; Gould, Stephen

Date

2025

Authors

Zhao, Qinyu

Xu, Ming

Gupta, Kartik

Asthana, Akshay

Zheng, Liang

Gould, Stephen

Publisher

Springer Science+Business Media B.V.

Abstract

Large vision-language models (LVLMs), designed to interpret and respond to human instructions, occasionally generate hallucinated or harmful content due to inappropriate instructions. This study uses linear probing to shed light on the hidden knowledge at the output layers of LVLMs. We demonstrate that the logit distributions of the first tokens contain sufficient information to determine whether to respond to the instructions, including recognizing unanswerable visual questions, defending against jailbreaking attacks, and identifying deceptive questions. Such hidden knowledge is gradually lost in logits of subsequent tokens during response generation. Then, we illustrate a simple decoding strategy at the generation of the first token, effectively improving the generated content. In experiments, we find a few interesting insights: First, the CLIP model already contains a strong signal for solving these tasks, which indicates potential bias in the existing datasets. Second, we observe performance improvement by utilizing the first logit distributions on three additional tasks, including indicating uncertainty in math solving, mitigating hallucination, and image classification. Last, with the same training data, simply finetuning LVLMs improves models’ performance but is still inferior to linear probing on these tasks (Our code is available at https://github.com/Qinyu-Allen-Zhao/LVLM-LP).
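As a rough illustration of the linear-probing idea in the abstract, the sketch below trains a logistic-regression probe on fixed feature vectors and checks it on held-out data. It is not the authors' implementation (their code is in the repository linked above). The synthetic vectors merely stand in for first-token logit distributions, and every name here (`make_synthetic_logits`, `train_linear_probe`, `DIM`) is hypothetical.

```python
import math
import random


def make_synthetic_logits(n, dim, rng):
    """Hypothetical stand-in for first-token logits.

    Label 1 (e.g. "should answer") and label 0 (e.g. "should refuse")
    differ by a shift along the first three coordinates.
    """
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        shift = 1.0 if label == 1 else -1.0
        x = [rng.gauss(0.0, 1.0) + (shift if j < 3 else 0.0) for j in range(dim)]
        data.append((x, label))
    return data


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_linear_probe(data, dim, lr=0.1, epochs=50):
    """Fit a linear classifier with plain SGD; the inputs are never modified."""
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        for x, y in data:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            g = p - y  # gradient of the log loss w.r.t. the pre-activation
            for j in range(dim):
                w[j] -= lr * g * x[j]
            b -= lr * g
    return w, b


def accuracy(w, b, data):
    correct = 0
    for x, y in data:
        pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
        correct += int(pred == y)
    return correct / len(data)


rng = random.Random(0)
DIM = 16  # a real vocabulary-sized logit vector would be far larger
train = make_synthetic_logits(400, DIM, rng)
test = make_synthetic_logits(200, DIM, rng)
w, b = train_linear_probe(train, DIM)
test_acc = accuracy(w, b, test)
print(f"probe accuracy on held-out synthetic data: {test_acc:.2f}")
```

In an actual experiment, the feature vectors would presumably be the logits an LVLM produces for the first generated token, with labels such as answerable versus unanswerable. The probe stays linear so that its accuracy reflects what is already encoded in those logits.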

Keywords

First Token, Hidden Knowledge, Large Vision-Language Models, Linear Probing, Logit Distribution

URI

http://www.scopus.com/inward/record.url?scp=85210863064&partnerID=8YFLogxK
https://hdl.handle.net/1885/733751303

Collections

ANU Research Publications

Type

Conference paper

Book Title

Computer Vision – ECCV 2024 - 18th European Conference, Proceedings

Entity type

Publication

DOI

10.1007/978-3-031-73195-2_8
