Chinese Academy of Sciences Institutional Repositories Grid: Monocular 3D Detection with Geometric Constraint Embedding and Semi-supervised Training

Monocular 3D Detection with Geometric Constraint Embedding and Semi-supervised Training

Document type: Journal article


Authors	Li PX (李培玄)1,2,3,4,5; Zhao HC (赵怀慈)1,2,3,5
Journal	IEEE Robotics and Automation Letters
Publication date	2021
Volume	6 Issue: 3 Pages: 5565-5572
Keywords	Object Detection, Segmentation and Categorization; Autonomous Vehicle Navigation; Computer Vision for Automation
ISSN	2377-3766
Ownership ranking	1
Abstract	In this work, we propose a novel one-stage, keypoint-based framework for monocular 3D object detection using only RGB images, called KM3D-Net. 2D detection only requires a deep neural network to predict 2D properties of objects, as it is a semanticity-aware task. For image-based 3D detection, we argue that a combination of a deep neural network and geometric constraints is needed to estimate appearance-related and spatial-related information synergistically. Here, we design a fully convolutional model to predict object keypoints, dimensions, and orientation, and then combine these estimates with perspective geometry constraints to compute position attributes. Further, we reformulate the geometric constraints in a differentiable form and embed them into the network to reduce running time while maintaining the consistency of model outputs in an end-to-end fashion. Benefiting from this simple structure, we then propose an effective semi-supervised training strategy for the setting where labeled training data is scarce. In this strategy, we enforce a consensus prediction between two weight-sharing KM3D-Nets for the same unlabeled image under different input augmentation conditions and network regularization. In particular, we unify the coordinate-dependent augmentations as an affine transformation for the differentiable recovery of object positions and propose a keypoints-dropout module for network regularization. Our model requires only RGB images, without synthetic data, instance segmentation, CAD models, or a depth generator. Nevertheless, extensive experiments on the popular KITTI 3D detection dataset indicate that KM3D-Net surpasses all previous state-of-the-art methods in both efficiency and accuracy by a large margin. And also, to IEEE
Language	English
WOS accession number	WOS:000655792200002
Source URL	[http://ir.sia.cn/handle/173321/28417]
Collection	Shenyang Institute of Automation_Opto-Electronic Information Technology Research Laboratory
Corresponding author	Zhao HC (赵怀慈)
Author affiliations	1. Shenyang Institute of Automation, Chinese Academy of Sciences 2. Key Lab of Image Understanding and Computer Vision, Liaoning Province 3. Key Laboratory of Opto-Electronic Information Processing, Chinese Academy of Sciences 4. University of Chinese Academy of Sciences 5. Institutes for Robotics and Intelligent Manufacturing, Chinese Academy of Sciences
Recommended citation (GB/T 7714)	Li PX, Zhao HC. Monocular 3D Detection with Geometric Constraint Embedding and Semi-supervised Training[J]. IEEE Robotics and Automation Letters, 2021, 6(3): 5565-5572.
APA	Li PX, & Zhao HC. (2021). Monocular 3D Detection with Geometric Constraint Embedding and Semi-supervised Training. IEEE Robotics and Automation Letters, 6(3), 5565-5572.
MLA	Li PX, et al. "Monocular 3D Detection with Geometric Constraint Embedding and Semi-supervised Training". IEEE Robotics and Automation Letters 6.3 (2021): 5565-5572.
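
The abstract describes combining predicted keypoints, dimensions, and orientation with perspective geometry constraints to compute an object's position. The sketch below illustrates one plausible reading of that idea. It is not the authors' formulation. It assumes nine keypoints (the eight box corners plus the box center), a pinhole camera given as `(fx, fy, cx, cy)`, a yaw rotation about the camera Y axis, and a box origin at the geometric center. The function names `box_points`, `project`, and `recover_position` are hypothetical. Because each projection equation is linear in the unknown translation, the position follows from a closed-form least-squares solve, which is the kind of operation that can be made differentiable.

```python
import math


def box_points(dims, yaw):
    """Eight box corners plus the center, rotated by yaw about the Y axis.

    Assumption: dims = (height, width, length), origin at the box center.
    """
    h, w, l = dims
    pts = [(sx * l / 2, sy * h / 2, sz * w / 2)
           for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)]
    pts.append((0.0, 0.0, 0.0))
    c, s = math.cos(yaw), math.sin(yaw)
    return [(c * x + s * z, y, -s * x + c * z) for x, y, z in pts]


def project(point, K):
    """Pinhole projection of a 3D camera-frame point to pixel coordinates."""
    fx, fy, cx, cy = K
    x, y, z = point
    return (fx * x / z + cx, fy * y / z + cy)


def _det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def _solve3(M, v):
    """Solve the 3x3 system M t = v with Cramer's rule."""
    d = _det3(M)
    return [
        _det3([[v[i] if j == k else M[i][j] for j in range(3)]
               for i in range(3)]) / d
        for k in range(3)
    ]


def recover_position(keypoints, dims, yaw, K):
    """Least-squares translation (Tx, Ty, Tz) from 2D keypoints.

    From u = fx * (rx + Tx) / (rz + Tz) + cx, each keypoint gives
    fx * Tx - (u - cx) * Tz = (u - cx) * rz - fx * rx, and likewise for v.
    """
    fx, fy, cx, cy = K
    rows, rhs = [], []
    for (u, v), (rx, ry, rz) in zip(keypoints, box_points(dims, yaw)):
        rows.append((fx, 0.0, cx - u))
        rhs.append((u - cx) * rz - fx * rx)
        rows.append((0.0, fy, cy - v))
        rhs.append((v - cy) * rz - fy * ry)
    # Normal equations: (A^T A) t = A^T b
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(3)]
           for i in range(3)]
    Atb = [sum(r[i] * b for r, b in zip(rows, rhs)) for i in range(3)]
    return _solve3(AtA, Atb)


if __name__ == "__main__":
    # Synthetic check with made-up intrinsics and a made-up object pose.
    K = (721.5, 721.5, 609.6, 172.9)
    dims, yaw, t = (1.5, 1.6, 3.9), 0.3, (2.0, 1.0, 20.0)
    kps = [project((x + t[0], y + t[1], z + t[2]), K)
           for x, y, z in box_points(dims, yaw)]
    print(recover_position(kps, dims, yaw, K))
```

With noise-free synthetic keypoints, the recovered translation should match the one used to generate them up to floating-point error. With noisy network predictions, the same solve returns the least-squares compromise across all 18 equations.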

Ingest method: OAI harvesting

Source: Shenyang Institute of Automation

