Chinese Academy of Sciences Institutional Repositories Grid: Integrating Scene Semantic Knowledge into Image Captioning

Integrating Scene Semantic Knowledge into Image Captioning

Document type: Journal article


Authors	Wei, Haiyang 3; Li, Zhixin 3; Huang, Feicheng 3; Zhang, Canlong 3; Ma, Huifang 2; Shi, Zhongzhi 1
Journal	ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS
Publication date	2021-06-01
Volume	17; Issue: 2; Pages: 22
Keywords	Image captioning; attention mechanism; scene semantics; encoder-decoder framework
ISSN	1551-6857
DOI	10.1145/3439734
Abstract	Most existing image captioning methods use only the visual information of the image to guide the generation of captions, lack the guidance of effective scene semantic information, and the current visual attention mechanism cannot adjust the focus intensity on the image. In this article, we first propose an improved visual attention model. At each timestep, we calculated the focus intensity coefficient of the attention mechanism through the context information of the model, then automatically adjusted the focus intensity of the attention mechanism through the coefficient to extract more accurate visual information. In addition, we represented the scene semantic knowledge of the image through topic words related to the image scene, then added them to the language model. We used the attention mechanism to determine the visual information and scene semantic information that the model pays attention to at each timestep and combined them to enable the model to generate more accurate and scene-specific captions. Finally, we evaluated our model on Microsoft COCO (MSCOCO) and Flickr30k standard datasets. The experimental results show that our approach generates more accurate captions and outperforms many recent advanced models in various evaluation metrics.
Funding	National Natural Science Foundation of China[61966004] ; National Natural Science Foundation of China[61663004] ; National Natural Science Foundation of China[61866004] ; National Natural Science Foundation of China[61762078] ; Guangxi Natural Science Foundation[2019GXNSFDA245018] ; Guangxi Natural Science Foundation[2018GXNSFDA281009] ; Guangxi Bagui Scholar Teams for Innovation and Research Project ; Guangxi Talent Highland Project of Big Data Intelligence and Application ; Guangxi Collaborative Innovation Center of Multi-Source Information Integration and Intelligent Processing
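WOS research area	Computer Science
Language	English
WOS accession number	WOS:000661037000017
Publisher	ASSOC COMPUTING MACHINERY
Source URL	[http://119.78.100.204/handle/2XEOYT63/17624]
Collection	Institute of Computing Technology, Chinese Academy of Sciences: Journal Articles (English)
Corresponding author	Li, Zhixin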
Author affiliations	1. Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, 6 Kexueyuan South Rd, Beijing 100190, Peoples R China; 2. Northwest Normal Univ, Coll Comp Sci & Engn, 967 Anning East Rd, Lanzhou 730070, Gansu, Peoples R China; 3. Guangxi Normal Univ, Guangxi Key Lab Multisource Informat Min & Secur, 15 Yucai Rd, Guilin 541004, Guangxi, Peoples R China
Recommended citation (GB/T 7714)	Wei, Haiyang,Li, Zhixin,Huang, Feicheng,et al. Integrating Scene Semantic Knowledge into Image Captioning[J]. ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS,2021,17(2):22.
APA	Wei, Haiyang,Li, Zhixin,Huang, Feicheng,Zhang, Canlong,Ma, Huifang,&Shi, Zhongzhi.(2021).Integrating Scene Semantic Knowledge into Image Captioning.ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS,17(2),22.
MLA	Wei, Haiyang,et al."Integrating Scene Semantic Knowledge into Image Captioning".ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS 17.2(2021):22.

Ingest method: OAI harvesting

Source: Institute of Computing Technology
