Scene captioning with deep fusion of images and point clouds
Document type: Journal article
Author | Yu, Qiang (2,4) |
Journal | PATTERN RECOGNITION LETTERS |
Publication date | 2022-06-01 |
Volume | 158, Pages: 9-15 |
Keywords | Scene captioning; Point cloud; Deep fusion |
ISSN | 0167-8655 |
DOI | 10.1016/j.patrec.2022.04.017 |
Corresponding author | Yu, Qiang (qiang.yu@ia.ac.cn) |
English abstract | Recently, the fusion of images and point clouds has received considerable attention in various fields, such as autonomous driving, where its advantage over single-modal vision has been verified. However, it has not been extensively exploited in the scene captioning task. In this paper, a novel scene captioning framework with deep fusion of images and point clouds, based on region correlation and attention, is proposed to improve the performance of captioning models. In our model, a symmetrical processing pipeline is designed for point clouds and images. First, 3D and 2D region features are generated, respectively, through region proposal generation, proposal fusion, and region pooling modules. Then, a feature fusion module integrates the features according to the region correlation rule and the attention mechanism, which increases the interpretability of the fusion process and yields a sequence of fused visual features. Finally, the fused features are transformed into captions by an attention-based caption generation module. Comprehensive experiments indicate that the performance of our model reaches the state of the art. (c) 2022 Elsevier B.V. All rights reserved. |
Funding projects | National Key Research and Development Program of China [2020AAA0104903]; National Natural Science Foundation of China [62072039]; National Natural Science Foundation of China [62076242]; National Natural Science Foundation of China [61976208] |
WOS research area | Computer Science |
Language | English |
WOS accession number | WOS:000797731300002 |
Publisher | ELSEVIER |
Funding organizations | National Key Research and Development Program of China; National Natural Science Foundation of China |
Source URL | [http://ir.ia.ac.cn/handle/173211/49500] |
Collection | Institute of Automation_National Laboratory of Pattern Recognition_Remote Sensing Image Processing Team |
Corresponding author | Yu, Qiang |
Author affiliations | 1. Beijing Inst Technol, Sch Comp Sci & Technol, Beijing 100081, Peoples R China; 2. Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China; 3. Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing 100190, Peoples R China; 4. Chinese Acad Sci, Inst Automat, Res Ctr Aerosp Informat, Beijing 100190, Peoples R China |
Recommended citation, GB/T 7714 | Yu, Qiang, Zhang, Chunxia, Weng, Lubin, et al. Scene captioning with deep fusion of images and point clouds[J]. PATTERN RECOGNITION LETTERS, 2022, 158: 9-15. |
APA | Yu, Qiang, Zhang, Chunxia, Weng, Lubin, Xiang, Shiming, & Pan, Chunhong. (2022). Scene captioning with deep fusion of images and point clouds. PATTERN RECOGNITION LETTERS, 158, 9-15. |
MLA | Yu, Qiang, et al. "Scene captioning with deep fusion of images and point clouds." PATTERN RECOGNITION LETTERS 158 (2022): 9-15. |
Ingestion method: OAI harvesting
Source: Institute of Automation
Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.