Chinese Academy of Sciences Institutional Repositories Grid
Neighborhood Contrastive Transformer for Change Captioning

Document type: Journal article

Authors: Tu, Yunbin [5]; Li, Liang [3,4]; Su, Li [5]; Lu, Ke [1,2]; Huang, Qingming [5]
Journal: IEEE TRANSACTIONS ON MULTIMEDIA
Publication date: 2023
Volume: 25; Pages: 9518-9529
Keywords: Change captioning; neighborhood contrastive transformer; syntax dependencies
ISSN: 1520-9210
DOI: 10.1109/TMM.2023.3254162
Abstract: Change captioning is to describe the semantic change between a pair of similar images in natural language. It is more challenging than general image captioning, because it requires capturing fine-grained change information while being immune to irrelevant viewpoint changes, and solving syntax ambiguity in change descriptions. In this paper, we propose a neighborhood contrastive transformer to improve the model's perceiving ability for various changes under different scenes and cognition ability for complex syntax structure. Concretely, we first design a neighboring feature aggregating to integrate neighboring context into each feature, which helps quickly locate the inconspicuous changes under the guidance of conspicuous referents. Then, we devise a common feature distilling to compare two images at neighborhood level and extract common properties from each image, so as to learn effective contrastive information between them. Finally, we introduce the explicit dependencies between words to calibrate the transformer decoder, which helps better understand complex syntax structure during training. Extensive experimental results demonstrate that the proposed method achieves the state-of-the-art performance on three public datasets with different change scenarios.
Funding: National Key R&D Program of China
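WOS research areas: Computer Science; Telecommunications
Language: English
WOS accession number: WOS:001133324200036
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Source URL: [http://119.78.100.204/handle/2XEOYT63/38415]
Collection: Institute of Computing Technology, Chinese Academy of Sciences, Journal Articles (English)
Corresponding authors: Li, Liang; Su, Li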
Author affiliations:
1. Peng Cheng Lab, Shenzhen 518055, Guangdong, Peoples R China
2. Univ Chinese Acad Sci, Sch Engn Sci, Beijing 101408, Peoples R China
3. Hangzhou Dianzi Univ, Lishui Inst, Lishui 323000, Zhejiang, Peoples R China
4. Chinese Acad Sci, Inst Comp Technol, Beijing 100190, Peoples R China
5. Univ Chinese Acad Sci, Sch Comp Sci & Technol, Beijing 101408, Peoples R China
Recommended citation
GB/T 7714
Tu, Yunbin, Li, Liang, Su, Li, et al. Neighborhood Contrastive Transformer for Change Captioning[J]. IEEE TRANSACTIONS ON MULTIMEDIA, 2023, 25: 9518-9529.
APA: Tu, Yunbin, Li, Liang, Su, Li, Lu, Ke, & Huang, Qingming. (2023). Neighborhood Contrastive Transformer for Change Captioning. IEEE TRANSACTIONS ON MULTIMEDIA, 25, 9518-9529.
MLA: Tu, Yunbin, et al. "Neighborhood Contrastive Transformer for Change Captioning". IEEE TRANSACTIONS ON MULTIMEDIA 25 (2023): 9518-9529.

Ingest method: OAI harvesting

Source: Institute of Computing Technology

