Chinese Academy of Sciences Institutional Repositories Grid System: Learning visual relationship and context-aware attention for image captioning

Chinese Academy of Sciences Institutional Repositories Grid

Learning visual relationship and context-aware attention for image captioning

Document type: Journal article


Authors	Wang, Junbo; Wang, Wei; Wang, Liang; Wang, Zhiyong; Feng, Dagan; Tan, Tieniu
Journal	Pattern Recognition
Publication date	2020
Issue	98 Pages: 107075
Keywords	Image captioning; Relational reasoning; Context-aware attention
Abstract	Image captioning, which automatically generates natural language descriptions for images, has attracted considerable research attention, and attention-based captioning methods have made substantial progress. However, most attention-based image captioning methods focus on extracting visual information from regions of interest for sentence generation and usually ignore relational reasoning among those regions. Moreover, these methods do not take into account previously attended regions, which can be used to guide subsequent attention selection. In this paper, we propose a novel method that implicitly models the relationships among regions of interest in an image with a graph neural network, together with a novel context-aware attention mechanism that guides attention selection by fully memorizing previously attended visual content. Compared with existing attention-based image captioning methods, ours not only learns relation-aware visual representations for image captioning but also considers historical context information from previous attention steps. We perform extensive experiments on two public benchmark datasets, MS COCO and Flickr30K, and the experimental results indicate that our proposed method outperforms various state-of-the-art methods in terms of widely used evaluation metrics.
Source URL	[http://ir.ia.ac.cn/handle/173211/28361]
Collection	Institute of Automation_Center for Research on Intelligent Perception and Computing
Author affiliations	1. School of Information Technologies, The University of Sydney; 2. University of Chinese Academy of Sciences; 3. Center for Excellence in Brain Science and Intelligence Technology; 4. Center for Research on Intelligent Perception and Computing, National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences
Recommended citation (GB/T 7714)	Wang, Junbo, Wang, Wei, Wang, Liang, et al. Learning visual relationship and context-aware attention for image captioning[J]. Pattern Recognition, 2020(98): 107075.
APA	Wang, Junbo, Wang, Wei, Wang, Liang, Wang, Zhiyong, Feng, Dagan, & Tan, Tieniu. (2020). Learning visual relationship and context-aware attention for image captioning. Pattern Recognition(98), 107075.
MLA	Wang, Junbo, et al. "Learning visual relationship and context-aware attention for image captioning". Pattern Recognition 98 (2020): 107075.

Ingestion method: OAI harvesting

Source: Institute of Automation
