Chinese Academy of Sciences Institutional Repositories Grid System: Active learning based 3d semantic labeling from images and videos

Chinese Academy of Sciences Institutional Repositories Grid

Active learning based 3d semantic labeling from images and videos

Document type: Journal article


Authors	Mengqi Rong 2,3,4; Hainan Cui 2,3,4; Zhanyi Hu 2,3,4; Hanqing Jiang 3; Hongmin Liu 1; Shuhan Shen 2,3,4
Journal	IEEE Transactions on Circuits and Systems for Video Technology
Publication date	2021-05-13
Volume	32; Issue: 12; Pages: 8101-8115
DOI	10.1109/TCSVT.2021.3079991
Document subtype	Journal article
Abstract	3D semantic segmentation is one of the most fundamental problems in 3D scene understanding and has attracted much attention in the field of computer vision. In this paper, we propose an active learning based 3D semantic labeling method for large-scale 3D mesh models generated from images or videos. Taking as input a 3D mesh model reconstructed by an image-based 3D modeling system, together with the calibrated images, our method outputs a fine 3D semantic mesh model in which each facet is assigned a semantic label. There are three major steps in our framework: 2D semantic segmentation, 2D-3D semantic fusion, and batch image selection. A limited set of annotated images is first used to fine-tune a pre-trained semantic segmentation network to obtain pixel-wise semantic probability maps. All these maps are then back-projected into 3D space and fused on the 3D mesh model using Markov Random Field optimization, yielding a preliminary 3D semantic mesh model and a heat model showing each facet's confidence. This 3D semantic model is used as a reliable supervisor to select the parts that are not well segmented for manual annotation, which boosts the performance of the 2D semantic segmentation network, as well as the 3D mesh labeling, in the next iteration. This Training-Fusion-Selection process continues until the label assignment of the 3D mesh model becomes steady. In this way, we significantly reduce the amount of annotation required without reducing the labeling quality of the 3D semantic models. Extensive experiments demonstrate the effectiveness and generalization ability of our method on a wide variety of datasets.
Language	English
Source URL	[http://ir.ia.ac.cn/handle/173211/52437]
Collection	Research Center of Precision Sensing and Control_Precision Sensing and Control
Corresponding authors	Hongmin Liu; Shuhan Shen
Author affiliations	1. School of Automation and Electrical Engineering, University of Science and Technology Beijing, Beijing, China; 2. School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China; 3. CASIA-SenseTime Research Group, Hangzhou, China; 4. National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing, China
Recommended citation (GB/T 7714)	Mengqi Rong, Hainan Cui, Zhanyi Hu, et al. Active learning based 3d semantic labeling from images and videos[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2021, 32(12): 8101-8115.
APA	Mengqi Rong, Hainan Cui, Zhanyi Hu, Hanqing Jiang, Hongmin Liu, & Shuhan Shen. (2021). Active learning based 3d semantic labeling from images and videos. IEEE Transactions on Circuits and Systems for Video Technology, 32(12), 8101-8115.
MLA	Mengqi Rong, et al. "Active learning based 3d semantic labeling from images and videos". IEEE Transactions on Circuits and Systems for Video Technology 32.12 (2021): 8101-8115.
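
The fusion and batch selection steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the Markov Random Field optimization is replaced here by a naive per-facet fusion that sums log-probabilities across views with no smoothness term, and the image selection score (mean of one minus facet confidence) is an assumed criterion. All function names and data layouts are hypothetical.

```python
import math


def fuse_facet_labels(observations, num_classes):
    """Fuse per-view class probabilities for each mesh facet.

    observations: dict facet_id -> list of per-view probability lists.
    Returns dict facet_id -> (label, confidence).
    Assumption: views are treated as independent; no MRF pairwise term.
    """
    fused = {}
    for facet, views in observations.items():
        # Sum log-probabilities over all views that see this facet.
        log_scores = [
            sum(math.log(max(view[c], 1e-9)) for view in views)
            for c in range(num_classes)
        ]
        # Softmax-normalize so the winning score reads as a confidence.
        top = max(log_scores)
        exps = [math.exp(s - top) for s in log_scores]
        total = sum(exps)
        probs = [e / total for e in exps]
        label = max(range(num_classes), key=probs.__getitem__)
        fused[facet] = (label, probs[label])
    return fused


def select_batch(visibility, fused, batch_size):
    """Pick the images whose visible facets are least confident.

    visibility: dict image_id -> list of facet ids visible in that image.
    Assumed score: mean of (1 - confidence) over the visible facets.
    """
    scores = {}
    for image, facets in visibility.items():
        seen = [f for f in facets if f in fused]
        if seen:
            scores[image] = sum(1.0 - fused[f][1] for f in seen) / len(seen)
    # Highest uncertainty first; ties broken by image id for determinism.
    ranked = sorted(scores, key=lambda image: (-scores[image], image))
    return ranked[:batch_size]


def labeling_is_steady(previous, current, tolerance=0.01):
    """Stop criterion sketch: fraction of facets whose label changed."""
    changed = sum(
        1 for f in current
        if f not in previous or previous[f][0] != current[f][0]
    )
    return changed / max(len(current), 1) <= tolerance


# Toy example: two facets, two classes, three images.
observations = {0: [[0.9, 0.1], [0.8, 0.2]], 1: [[0.5, 0.5], [0.6, 0.4]]}
visibility = {"a": [0], "b": [1], "c": [0, 1]}
fused = fuse_facet_labels(observations, num_classes=2)
batch = select_batch(visibility, fused, batch_size=2)
```

In this toy example facet 1 is the less confident one, so the images that see it are ranked ahead of the image that sees only facet 0. In a full Training-Fusion-Selection loop, the selected images would be annotated, the 2D network fine-tuned again, and the cycle repeated until a check such as `labeling_is_steady` passes.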

Ingest method: OAI harvesting

Source: Institute of Automation
