LDP: Language-driven Dual-Pixel Image Defocus Deblurring Network

dc.contributor.author | Yang, Hao | en
dc.contributor.author | Pan, Liyuan | en
dc.contributor.author | Yang, Yan | en
dc.contributor.author | Hartley, Richard | en
dc.contributor.author | Liu, Miaomiao | en
dc.date.accessioned | 2025-05-23T12:24:22Z
dc.date.available | 2025-05-23T12:24:22Z
dc.date.issued | 2024 | en
dc.description.abstract | Recovering sharp images from dual-pixel (DP) pairs with disparity-dependent blur is a challenging task. Existing blur map-based deblurring methods have demonstrated promising results. In this paper, we propose, to the best of our knowledge, the first framework that introduces the contrastive language-image pre-training framework (CLIP) to accurately estimate the blur map from a DP pair unsupervisedly. To achieve this, we first carefully design text prompts to enable CLIP to understand blur-related geometric prior knowledge from the DP pair. Then, we propose a format to input a stereo DP pair to CLIP without any fine-tuning, despite the fact that CLIP is pre-trained on monocular images. Given the estimated blur map, we introduce a blur-prior attention block, a blur-weighting loss, and a blur-aware loss to recover the all-in-focus image. Our method achieves state-of-the-art performance in extensive experiments (see Fig. 1). | en
dc.description.sponsorship | This work was supported in part by the Beijing Institute of Technology Research Fund Program for Young Scholars, BIT Special-Zone, and National Natural Science Foundation of China 62302045, and Miaomiao Liu was supported by the ARC fellowship and Discovery Project grant (DE180100628, DP200102274). | en
dc.description.status | Peer-reviewed | en
dc.format.extent | 10 | en
dc.identifier.issn | 1063-6919 | en
dc.identifier.scopus | 85204210019 | en
dc.identifier.uri | http://www.scopus.com/inward/record.url?scp=85204210019&partnerID=8YFLogxK | en
dc.identifier.uri | https://hdl.handle.net/1885/733752258
dc.language.iso | en | en
dc.relation.ispartofseries | 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024 | en
dc.rights | Publisher Copyright: © 2024 IEEE. | en
dc.source | Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition | en
dc.title | LDP: Language-driven Dual-Pixel Image Defocus Deblurring Network | en
dc.type | Conference paper | en
dspace.entity.type | Publication | en
local.bibliographicCitation.lastpage | 24087 | en
local.bibliographicCitation.startpage | 24078 | en
local.contributor.affiliation | Yang, Hao; Beijing Institute of Technology | en
local.contributor.affiliation | Pan, Liyuan; Beijing Institute of Technology | en
local.contributor.affiliation | Yang, Yan; ANU College of Science and Medicine, The Australian National University | en
local.contributor.affiliation | Hartley, Richard; School of Computing, ANU College of Systems and Society, The Australian National University | en
local.contributor.affiliation | Liu, Miaomiao; School of Computing, ANU College of Systems and Society, The Australian National University | en
local.identifier.doi | 10.1109/CVPR52733.2024.02273 | en
local.identifier.pure | a0ab43e1-81c4-45b0-b4a2-632a1b9f8911 | en
local.identifier.url | https://www.scopus.com/pages/publications/85204210019 | en
local.type.status | Published | en
