Chinese Academy of Sciences Institutional Repositories Grid (CAS IR Grid): Semantic Scene Completion via Semantic-Aware Guidance and Interactive Refinement Transformer

Semantic Scene Completion via Semantic-Aware Guidance and Interactive Refinement Transformer

Document type: Journal article


Authors	Xiao, Haihong 1; Kang, Wenxiong 1,2; Liu, Hao 3; Li, Yuqiong 4; He, Ying 3; Li YQ (李玉琼)
Journal	IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY
Publication date	2025-05-01
Volume	35; Issue: 5; Pages: 4212-4225
Keywords	Semantics; Proposals; Feature extraction; Three-dimensional displays; Transformers; Point cloud compression; Image reconstruction; Laser radar; Circuits and systems; Autonomous vehicles; 3D vision; semantic scene completion; interactive refinement transformer
ISSN	1051-8215
DOI	10.1109/TCSVT.2024.3518493
Corresponding authors	Kang, Wenxiong (auwxkang@scut.edu.cn); He, Ying (yhe@ntu.edu.sg)
Abstract	Predicting per-voxel occupancy status and corresponding semantic labels in 3D scenes is pivotal to 3D intelligent perception in autonomous driving. In this paper, we propose a novel semantic scene completion framework that can generate complete 3D volumetric semantics from a single image at a low cost. To the best of our knowledge, this is the first endeavor specifically aimed at mitigating the negative impacts of incorrect voxel query proposals caused by erroneous depth estimates and enhancing interactions for positive ones in camera-based semantic scene completion tasks. Specifically, we present a straightforward yet effective Semantic-aware Guided (SAG) module, which seamlessly integrates with task-related semantic priors to facilitate effective interactions between image features and voxel query proposals in a plug-and-play manner. Furthermore, we introduce a set of learnable object queries to better perceive objects within the scene. Building on this, we propose an Interactive Refinement Transformer (IRT) block, which iteratively updates voxel query proposals to enhance the perception of semantics and objects within the scene by leveraging the interaction between object queries and voxel queries through query-to-query cross-attention. Extensive experiments demonstrate that our method outperforms existing state-of-the-art approaches, achieving overall improvements of 0.30 and 2.74 in mIoU metric on the SemanticKITTI and SSCBench-KITTI-360 validation datasets, respectively, while also showing superior performance in the aspect of small object generation.
WOS keywords	NETWORK
Funding projects	National Natural Science Foundation of China[62376100] ; Ministry of Education, Singapore[MOE-T2EP20220-0005] ; Ministry of Education, Singapore[RT19/22] ; International Science and Technology Cooperation Project of Guangzhou Economic and Technological Development District[2023GH16] ; Fundamental Research Funds for the Central Universities[2024ZYGXZR104]
WOS research area	Engineering
Language	English
WOS accession number	WOS:001483893200002
Funding organizations	National Natural Science Foundation of China ; Ministry of Education, Singapore ; International Science and Technology Cooperation Project of Guangzhou Economic and Technological Development District ; Fundamental Research Funds for the Central Universities
Source URL	[http://dspace.imech.ac.cn/handle/311007/101540]
Collection	Institute of Mechanics_Key Laboratory for Mechanics in Fluid Solid Coupling Systems (2012-)
Corresponding authors	Kang, Wenxiong; He, Ying
Author affiliations	1. South China Univ Technol, Sch Automat Sci & Engn, Guangzhou 510641, Peoples R China; 2. Pazhou Lab, Guangzhou 510335, Peoples R China; 3. Nanyang Technol Univ, Coll Comp & Data Science, Singapore 639798, Singapore; 4. Chinese Acad Sci, Inst Mech, Key Lab Mech Fluid Solid Coupling Syst, Beijing 100190, Peoples R China
Recommended citation (GB/T 7714)	Xiao, Haihong,Kang, Wenxiong,Liu, Hao,et al. Semantic Scene Completion via Semantic-Aware Guidance and Interactive Refinement Transformer[J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY,2025,35(5):4212-4225.
APA	Xiao, Haihong,Kang, Wenxiong,Liu, Hao,Li, Yuqiong,He, Ying,&李玉琼.(2025).Semantic Scene Completion via Semantic-Aware Guidance and Interactive Refinement Transformer.IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY,35(5),4212-4225.
MLA	Xiao, Haihong,et al."Semantic Scene Completion via Semantic-Aware Guidance and Interactive Refinement Transformer".IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY 35.5(2025):4212-4225.

Ingest method: OAI harvesting

Source: Institute of Mechanics
