MSL3D: 3D object detection from monocular, stereo and point cloud for autonomous driving
Document type: Journal article
Authors | Chen WY (陈文玉)1,2,3,4,5; Li PX (李培玄)1,2,3,4,5; Zhao HC (赵怀慈)1,2,4,5 |
Journal | Neurocomputing |
Publication date | 2022 |
Volume | 494; pages: 23-32 |
Keywords | 3D object detection; Automatic driving; Multi-sensor fusion |
ISSN | 0925-2312 |
Property rights ranking | 1 |
Abstract | In this paper, we propose a novel deep architecture, named MSL3D, that combines multiple sensors for 3D object detection. Although recent LiDAR-Camera methods introduce additional semantic cues and produce fewer false detections, there is still a performance gap compared with LiDAR-only methods. We argue that this gap has two causes: 1) the 3D spherical receptive fields of the set abstraction of the point clouds are not aligned with the 2D pixel-level receptive fields of the image; 2) the premature introduction of image information makes it difficult to apply data augmentation to both LiDAR and image data synchronously. For the first problem, we extend 3D set abstraction to a 2D set abstraction that can transform the 2D image features to the 3D sphere to unify the receptive field of multi-modal data. For the second problem, we design a novel two-stage 3D detection framework that employs the LiDAR-only backbone in the first stage to estimate high-recall and high-quality proposals and then integrates the image and point cloud information for box refinement and confidence prediction. In addition, we add two auxiliary networks to effectively learn image features and point cloud features when using different multi-modal data augmentation strategies synchronously. Moreover, we design a consistency-structure generator that uses stereo images to determine whether any given point in 3D space belongs to the contour of an object, thereby supplementing the sparse point cloud information. Extensive experiments on the popular KITTI 3D object detection dataset show that our proposed MSL3D achieves better performance than other LiDAR-only or LiDAR-Camera fusion approaches. |
Language | English |
Source URL | [http://ir.sia.cn/handle/173321/30990] |
Collection | Shenyang Institute of Automation_Optoelectronic Information Technology Research Laboratory |
Corresponding author | Zhao HC (赵怀慈) |
Author affiliations | 1. Key Lab of Image Understanding and Computer Vision, Liaoning Province, Shenyang, China 2. Key Laboratory of Opto-Electronic Information Processing, Chinese Academy of Sciences, Shenyang, China 3. University of Chinese Academy of Sciences, Beijing, China 4. Institutes for Robotics and Intelligent Manufacturing, Chinese Academy of Sciences, Shenyang, China 5. Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang, China |
Recommended citation, GB/T 7714 | Chen WY, Li PX, Zhao HC. MSL3D: 3D object detection from monocular, stereo and point cloud for autonomous driving[J]. Neurocomputing, 2022, 494: 23-32. |
APA | Chen WY, Li PX, & Zhao HC. (2022). MSL3D: 3D object detection from monocular, stereo and point cloud for autonomous driving. Neurocomputing, 494, 23-32. |
MLA | Chen WY, et al. "MSL3D: 3D object detection from monocular, stereo and point cloud for autonomous driving". Neurocomputing 494 (2022): 23-32. |
Ingestion method: OAI harvesting
Source: Shenyang Institute of Automation
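Illustrative note on the abstract: the abstract describes lifting 2D image features into the same spherical neighbourhoods that point-cloud set abstraction uses. The sketch below is not the authors' implementation; it only illustrates one plausible reading of that idea, assuming points already expressed in the camera frame, a simple pinhole model, nearest-pixel sampling, and channel-wise max pooling. All function names and parameters (`project_point`, `ball_query`, `image_features_in_ball`, `intrinsics`) are hypothetical.

```python
from math import sqrt


def project_point(point, fx, fy, cx, cy):
    """Project a camera-frame 3D point (x, y, z) to pixel coordinates (u, v)."""
    x, y, z = point
    return fx * x / z + cx, fy * y / z + cy


def ball_query(center, points, radius):
    """Return the points lying within `radius` of `center` (a 3D sphere)."""
    return [
        p for p in points
        if sqrt(sum((a - b) ** 2 for a, b in zip(p, center))) <= radius
    ]


def image_features_in_ball(center, points, radius, feature_map, intrinsics):
    """Max-pool image features over the pixels hit by points inside a 3D ball.

    feature_map is a nested list indexed as feature_map[row][col][channel].
    intrinsics is an assumed (fx, fy, cx, cy) pinhole tuple.
    """
    fx, fy, cx, cy = intrinsics
    height, width = len(feature_map), len(feature_map[0])
    channels = len(feature_map[0][0])
    gathered = []
    for p in ball_query(center, points, radius):
        if p[2] <= 0:  # skip points behind the camera
            continue
        u, v = project_point(p, fx, fy, cx, cy)
        col, row = int(round(u)), int(round(v))
        if 0 <= row < height and 0 <= col < width:
            gathered.append(feature_map[row][col])
    if not gathered:
        # No point in the ball lands on the image: return a zero feature.
        return [0.0] * channels
    # Channel-wise max pooling, as in point-cloud set abstraction.
    return [max(f[c] for f in gathered) for c in range(channels)]
```

Because the image features are grouped by the same ball query as the point features, both modalities share one spherical receptive field in this sketch, which is the alignment the abstract argues for.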