Bi-Directional and Early Interaction Transformers for Bird’s Eye View Semantic Segmentation
Document type: Conference paper
Authors | Pan Cong (3,4) |
Publication date | 2023-06 |
Conference dates | June 18-22, 2023 |
Conference venue | Vancouver Convention Center |
Abstract | Bird's Eye View (BEV) semantic segmentation is a critical task in autonomous driving. However, existing Transformer-based methods confront difficulties in transforming Perspective View (PV) to BEV due to their unidirectional and posterior interaction mechanisms. To address this issue, we propose a novel Bi-directional and Early Interaction Transformers framework named BAEFormer, consisting of (i) an early-interaction PV-BEV pipeline and (ii) a bi-directional cross-attention mechanism. Moreover, we find that the image feature maps' resolution in the cross-attention module has a limited effect on the final performance. Under this critical observation, we propose to enlarge the size of input images and downsample the multi-view image features for cross-interaction, further improving the accuracy while keeping the amount of computation controllable. Our proposed method for BEV semantic segmentation achieves state-of-the-art performance in real-time inference speed on the nuScenes dataset, i.e., 38.9 mIoU at 45 FPS on a single A100 GPU. |
Proceedings publisher | IEEE/CVF |
Source URL | [http://ir.ia.ac.cn/handle/173211/57377] |
Collection | Institute of Automation, Center for Research on Intelligent Perception and Computing |
Corresponding author | Zhang Zhaoxiang |
Author affiliations | 1. Horizon Robotics; 2. Center for Artificial Intelligence and Robotics, HKISI CAS; 3. School of Future Technology, University of Chinese Academy of Sciences; 4. National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences; 5. Huawei Inc. |
Recommended citation (GB/T 7714) | Pan Cong, He Yonghao, Peng Junran, et al. Bi-Directional and Early Interaction Transformers for Bird’s Eye View Semantic Segmentation[C]. In: . Vancouver Convention Center. June 18-22, 2023. |
Ingest method: OAI harvesting
Source: Institute of Automation