Semantic Scene Completion via Semantic-Aware Guidance and Interactive Refinement Transformer
Document type: Journal article
Authors | Xiao, Haihong1; Kang, Wenxiong1,2; Liu, Hao3; Li, Yuqiong4 |
Journal | IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY |
Publication date | 2025-05-01 |
Volume | 35; Issue: 5; Pages: 4212-4225 |
Keywords | Semantics; Proposals; Feature extraction; Three-dimensional displays; Transformers; Point cloud compression; Image reconstruction; Laser radar; Circuits and systems; Autonomous vehicles; 3D vision; semantic scene completion; interactive refinement transformer |
ISSN | 1051-8215 |
DOI | 10.1109/TCSVT.2024.3518493 |
Corresponding authors | Kang, Wenxiong (auwxkang@scut.edu.cn); He, Ying (yhe@ntu.edu.sg) |
Abstract | Predicting per-voxel occupancy status and corresponding semantic labels in 3D scenes is pivotal to 3D intelligent perception in autonomous driving. In this paper, we propose a novel semantic scene completion framework that can generate complete 3D volumetric semantics from a single image at a low cost. To the best of our knowledge, this is the first endeavor specifically aimed at mitigating the negative impacts of incorrect voxel query proposals caused by erroneous depth estimates and enhancing interactions for positive ones in camera-based semantic scene completion tasks. Specifically, we present a straightforward yet effective Semantic-aware Guided (SAG) module, which seamlessly integrates with task-related semantic priors to facilitate effective interactions between image features and voxel query proposals in a plug-and-play manner. Furthermore, we introduce a set of learnable object queries to better perceive objects within the scene. Building on this, we propose an Interactive Refinement Transformer (IRT) block, which iteratively updates voxel query proposals to enhance the perception of semantics and objects within the scene by leveraging the interaction between object queries and voxel queries through query-to-query cross-attention. Extensive experiments demonstrate that our method outperforms existing state-of-the-art approaches, achieving overall improvements of 0.30 and 2.74 in mIoU metric on the SemanticKITTI and SSCBench-KITTI-360 validation datasets, respectively, while also showing superior performance in the aspect of small object generation. |
WOS keywords | NETWORK |
Funding projects | National Natural Science Foundation of China[62376100] ; Ministry of Education, Singapore[MOE-T2EP20220-0005] ; Ministry of Education, Singapore[RT19/22] ; International Science and Technology Cooperation Project of Guangzhou Economic and Technological Development District[2023GH16] ; Fundamental Research Funds for the Central Universities[2024ZYGXZR104] |
WOS research area | Engineering |
Language | English |
WOS accession number | WOS:001483893200002 |
Funding organizations | National Natural Science Foundation of China ; Ministry of Education, Singapore ; International Science and Technology Cooperation Project of Guangzhou Economic and Technological Development District ; Fundamental Research Funds for the Central Universities |
Source URL | [http://dspace.imech.ac.cn/handle/311007/101540] |
Collection | Institute of Mechanics_Key Laboratory for Mechanics in Fluid Solid Coupling Systems (2012-) |
Corresponding authors | Kang, Wenxiong; He, Ying |
Author affiliations | 1. South China Univ Technol, Sch Automat Sci & Engn, Guangzhou 510641, Peoples R China; 2. Pazhou Lab, Guangzhou 510335, Peoples R China; 3. Nanyang Technol Univ, Coll Comp & Data Science, Singapore 639798, Singapore; 4. Chinese Acad Sci, Inst Mech, Key Lab Mech Fluid Solid Coupling Syst, Beijing 100190, Peoples R China |
Recommended citation (GB/T 7714) | Xiao, Haihong, Kang, Wenxiong, Liu, Hao, et al. Semantic Scene Completion via Semantic-Aware Guidance and Interactive Refinement Transformer[J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2025, 35(5): 4212-4225. |
APA | Xiao, Haihong, Kang, Wenxiong, Liu, Hao, Li, Yuqiong, & He, Ying. (2025). Semantic Scene Completion via Semantic-Aware Guidance and Interactive Refinement Transformer. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 35(5), 4212-4225. |
MLA | Xiao, Haihong, et al. "Semantic Scene Completion via Semantic-Aware Guidance and Interactive Refinement Transformer." IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY 35.5 (2025): 4212-4225. |
Ingestion method: OAI harvesting
Source: Institute of Mechanics