LDP: Language-driven Dual-Pixel Image Defocus Deblurring Network
| dc.contributor.author | Yang, Hao | en |
| dc.contributor.author | Pan, Liyuan | en |
| dc.contributor.author | Yang, Yan | en |
| dc.contributor.author | Hartley, Richard | en |
| dc.contributor.author | Liu, Miaomiao | en |
| dc.date.accessioned | 2025-05-23T12:24:22Z | |
| dc.date.available | 2025-05-23T12:24:22Z | |
| dc.date.issued | 2024 | en |
| dc.description.abstract | Recovering sharp images from dual-pixel (DP) pairs with disparity-dependent blur is a challenging task. Existing blur map-based deblurring methods have demonstrated promising results. In this paper, we propose, to the best of our knowledge, the first framework that introduces the contrastive language-image pre-training framework (CLIP) to accurately estimate the blur map from a DP pair unsupervisedly. To achieve this, we first carefully design text prompts to enable CLIP to understand blur-related geometric prior knowledge from the DP pair. Then, we propose a format to input a stereo DP pair to CLIP without any fine-tuning, despite the fact that CLIP is pre-trained on monocular images. Given the estimated blur map, we introduce a blur-prior attention block, a blur-weighting loss, and a blur-aware loss to recover the all-in-focus image. Our method achieves state-of-the-art performance in extensive experiments (see Fig. 1). | en |
| dc.description.sponsorship | This work was supported in part by the Beijing Institute of Technology Research Fund Program for Young Scholars, BIT Special-Zone, and National Natural Science Foundation of China 62302045, and Miaomiao Liu was supported by the ARC fellowship and Discovery Project grant (DE180100628, DP200102274). | en |
| dc.description.status | Peer-reviewed | en |
| dc.format.extent | 10 | en |
| dc.identifier.issn | 1063-6919 | en |
| dc.identifier.scopus | 85204210019 | en |
| dc.identifier.uri | http://www.scopus.com/inward/record.url?scp=85204210019&partnerID=8YFLogxK | en |
| dc.identifier.uri | https://hdl.handle.net/1885/733752258 | |
| dc.language.iso | en | en |
| dc.relation.ispartofseries | 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024 | en |
| dc.rights | Publisher Copyright: © 2024 IEEE. | en |
| dc.source | Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition | en |
| dc.title | LDP: Language-driven Dual-Pixel Image Defocus Deblurring Network | en |
| dc.type | Conference paper | en |
| dspace.entity.type | Publication | en |
| local.bibliographicCitation.lastpage | 24087 | en |
| local.bibliographicCitation.startpage | 24078 | en |
| local.contributor.affiliation | Yang, Hao; Beijing Institute of Technology | en |
| local.contributor.affiliation | Pan, Liyuan; Beijing Institute of Technology | en |
| local.contributor.affiliation | Yang, Yan; ANU College of Science and Medicine, The Australian National University | en |
| local.contributor.affiliation | Hartley, Richard; School of Computing, ANU College of Systems and Society, The Australian National University | en |
| local.contributor.affiliation | Liu, Miaomiao; School of Computing, ANU College of Systems and Society, The Australian National University | en |
| local.identifier.doi | 10.1109/CVPR52733.2024.02273 | en |
| local.identifier.pure | a0ab43e1-81c4-45b0-b4a2-632a1b9f8911 | en |
| local.identifier.url | https://www.scopus.com/pages/publications/85204210019 | en |
| local.type.status | Published | en |