Chinese Academy of Sciences Institutional Repositories Grid System: A Multi-task Learning Approach for Image Captioning

Chinese Academy of Sciences Institutional Repositories Grid

A Multi-task Learning Approach for Image Captioning

Document type: Conference paper


Authors	Wei Zhao; Benyou Wang; Jianbo Ye; Min Yang; Zhou Zhao; Ruotian Luo; Yu Qiao
Publication date	2018
Conference date	2018
Conference location	Stockholm, Sweden
Abstract	In this paper, we propose a Multi-task Learning Approach for Image Captioning (MLAIC), motivated by the fact that humans have no difficulty performing such a task because they have capabilities across multiple domains. Specifically, MLAIC consists of three key components: (i) a multi-object classification model that learns rich category-aware image representations using a CNN image encoder; (ii) a syntax generation model that learns a better syntax-aware LSTM-based decoder; (iii) an image captioning model that generates image descriptions in text, sharing its CNN encoder with the object classification task and its LSTM decoder with the syntax generation task. In particular, the image captioning model can benefit from the additional object categorization and syntax knowledge. The experimental results on the MS-COCO dataset demonstrate that our model achieves impressive results compared to other strong competitors. We will release the source code of this work after publication.
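Source URL	[http://ir.siat.ac.cn:8080/handle/172644/13688]
Collection	Shenzhen Institutes of Advanced Technology_Institute of Advanced Integration Technology
Recommended citation (GB/T 7714)	Wei Zhao, Benyou Wang, Jianbo Ye, et al. A Multi-task Learning Approach for Image Captioning[C]. In:. Stockholm, Sweden. 2018.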
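
The parameter sharing described in the abstract can be sketched roughly as follows. This is a structural illustration only, not the authors' implementation: the `Module`, `Task`, `train_step`, and `joint_loss` names, the task-specific `classification_head` and `syntax_input` parts, and the weighted-sum loss are all assumptions made for the example. The record does not say how the three tasks are actually combined during training.

```python
from dataclasses import dataclass


@dataclass
class Module:
    """Stand-in for a trainable component (CNN, LSTM, output head)."""
    name: str
    update_count: int = 0  # number of task updates that touched this module


@dataclass
class Task:
    """A task is a named pipeline over modules, some of which may be shared."""
    name: str
    modules: list


def build_tasks():
    # Shared components named in the abstract.
    cnn_encoder = Module("cnn_encoder")
    lstm_decoder = Module("lstm_decoder")
    # Task-specific parts (hypothetical, not taken from the paper).
    class_head = Module("classification_head")
    syntax_input = Module("syntax_input")

    classification = Task("object_classification", [cnn_encoder, class_head])
    syntax = Task("syntax_generation", [syntax_input, lstm_decoder])
    # Captioning reuses the same encoder and decoder objects.
    captioning = Task("image_captioning", [cnn_encoder, lstm_decoder])
    return classification, syntax, captioning


def train_step(task):
    """Placeholder update: every module used by the task counts one step."""
    for module in task.modules:
        module.update_count += 1


def joint_loss(losses, weights):
    """Weighted sum of per-task losses (the weighting is an assumption)."""
    return sum(weights[name] * value for name, value in losses.items())


classification, syntax, captioning = build_tasks()
for task in (classification, syntax, captioning):
    train_step(task)

encoder, decoder = captioning.modules
print(encoder.name, encoder.update_count)  # cnn_encoder 2
print(decoder.name, decoder.update_count)  # lstm_decoder 2
```

Because the captioning task holds the same encoder and decoder objects as the two auxiliary tasks, an update driven by any one task also changes the components the captioning model uses. That is the sense in which captioning can benefit from the object categorization and syntax knowledge.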

Ingest method: OAI harvesting

Source: Shenzhen Institutes of Advanced Technology
