Partially-supervised image captioning

Anderson, Peter; Gould, Stephen; Johnson, Mark

dc.contributor.author	Anderson, Peter
dc.contributor.author	Gould, Stephen
dc.contributor.author	Johnson, Mark
dc.contributor.editor	Grauman, K
dc.contributor.editor	Cesa-Bianchi, N
dc.contributor.editor	Garnett, R
dc.contributor.editor	Bengio, S
dc.contributor.editor	Larochelle, H
dc.contributor.editor	Wallach, H
dc.coverage.spatial	Montreal, Canada
dc.date.accessioned	2024-02-12T22:37:02Z
dc.date.created	December 2-8 2018
dc.date.issued	2018
dc.date.updated	2022-10-02T07:19:29Z
dc.description.abstract	Image captioning models are becoming increasingly successful at describing the content of images in restricted domains. However, if these models are to function in the wild - for example, as assistants for people with impaired vision - a much larger number and variety of visual concepts must be understood. To address this problem, we teach image captioning models new visual concepts from labeled images and object detection datasets. Since image labels and object classes can be interpreted as partial captions, we formulate this problem as learning from partially-specified sequence data. We then propose a novel algorithm for training sequence models, such as recurrent neural networks, on partially-specified sequences which we represent using finite state automata. In the context of image captioning, our method lifts the restriction that previously required image captioning models to be trained on paired image-sentence corpora only, or otherwise required specialized model architectures to take advantage of alternative data modalities. Applying our approach to an existing neural captioning model, we achieve state-of-the-art results on the novel object captioning task using the COCO dataset. We further show that we can train a captioning model to describe new visual concepts from the Open Images dataset while maintaining competitive COCO evaluation scores.	en_AU
dc.description.sponsorship	This research was supported by a Google award through the Natural Language Understanding Focused Program, CRP 8201800363 from Data61/CSIRO, and under the Australian Research Council’s Discovery Projects funding scheme (project number DP160102156).	en_AU
dc.format.mimetype	application/pdf	en_AU
dc.identifier.uri	http://hdl.handle.net/1885/313418
dc.language.iso	en_AU	en_AU
dc.publisher	Neural Information Processing Systems Foundation	en_AU
dc.relation	http://purl.org/au-research/grants/arc/DP160102156	en_AU
dc.relation.ispartofseries	32nd Conference on Neural Information Processing Systems, NeurIPS 2018	en_AU
dc.rights	© 2018 Neural Information Processing Systems Foundation	en_AU
dc.source	Advances in Neural Information Processing Systems	en_AU
dc.source.uri	https://proceedings.neurips.cc/paper_files/paper/2018	en_AU
dc.title	Partially-supervised image captioning	en_AU
dc.type	Conference paper	en_AU
dcterms.accessRights	Free Access via publisher website	en_AU
local.bibliographicCitation.lastpage	1886	en_AU
local.bibliographicCitation.startpage	1875	en_AU
local.contributor.affiliation	Anderson, Peter, Georgia Tech	en_AU
local.contributor.affiliation	Gould, Stephen, College of Engineering and Computer Science, ANU	en_AU
local.contributor.affiliation	Johnson, Mark, Macquarie University	en_AU
local.contributor.authoruid	Gould, Stephen, u4971180	en_AU
local.description.embargo	2099-12-31
local.description.notes	Imported from ARIES	en_AU
local.description.refereed	Yes
local.identifier.absfor	460200 - Artificial intelligence	en_AU
local.identifier.ariespublication	u3102795xPUB1743	en_AU
local.identifier.citationvolume	31	en_AU
local.identifier.scopusID	2-s2.0-85064841019
local.publisher.url	https://proceedings.neurips.cc/	en_AU
local.type.status	Published Version	en_AU

Downloads

Original bundle

Name: NeurIPS-2018-partially-supervised-image-captioning-Paper.pdf
Size: 3.71 MB
Format: Adobe Portable Document Format

Collections

ANU Research Publications
