Chinese Academy of Sciences Institutional Repositories Grid (中国科学院机构知识库网格)
DepthFormer: Exploiting Long-range Correlation and Local Information for Accurate Monocular Depth Estimation

Document type: Journal article

Authors: Zhenyu Li (1); Zehui Chen (2); Xianming Liu (1); Junjun Jiang (1)
Journal: Machine Intelligence Research
Publication date: 2023
Volume: 20; Issue: 6; Pages: 837-854
Keywords: Autonomous driving, 3D reconstruction, monocular depth estimation, Transformer, convolution
ISSN: 2731-538X
DOI: 10.1007/s11633-023-1458-0
Abstract

This paper aims to address the problem of supervised monocular depth estimation. We start with a meticulous pilot study to demonstrate that the long-range correlation is essential for accurate depth estimation. Moreover, the Transformer and convolution are good at long-range and close-range depth estimation, respectively. Therefore, we propose to adopt a parallel encoder architecture consisting of a Transformer branch and a convolution branch. The former can model global context with the effective attention mechanism and the latter aims to preserve the local information as the Transformer lacks the spatial inductive bias in modeling such contents. However, independent branches lead to a shortage of connections between features. To bridge this gap, we design a hierarchical aggregation and heterogeneous interaction module to enhance the Transformer features and model the affinity between the heterogeneous features in a set-to-set translation manner. Due to the unbearable memory cost introduced by the global attention on high-resolution feature maps, we adopt the deformable scheme to reduce the complexity. Extensive experiments on the KITTI, NYU, and SUN RGB-D datasets demonstrate that our proposed model, termed DepthFormer, surpasses state-of-the-art monocular depth estimation methods with prominent margins. The effectiveness of each proposed module is elaborately evaluated through meticulous and intensive ablation studies.

Source URL: http://ir.ia.ac.cn/handle/173211/56013
Collection: Institute of Automation / Academic Journals / International Journal of Automation and Computing
Author affiliations:
1. Faculty of Computing, Harbin Institute of Technology, Harbin 150001, China
2. Department of Automation, University of Science and Technology of China, Hefei 230026, China
Recommended citation
GB/T 7714: Zhenyu Li, Zehui Chen, Xianming Liu, et al. DepthFormer: Exploiting Long-range Correlation and Local Information for Accurate Monocular Depth Estimation[J]. Machine Intelligence Research, 2023, 20(6): 837-854.
APA: Zhenyu Li, Zehui Chen, Xianming Liu, & Junjun Jiang. (2023). DepthFormer: Exploiting Long-range Correlation and Local Information for Accurate Monocular Depth Estimation. Machine Intelligence Research, 20(6), 837-854.
MLA: Zhenyu Li, et al. "DepthFormer: Exploiting Long-range Correlation and Local Information for Accurate Monocular Depth Estimation." Machine Intelligence Research 20.6 (2023): 837-854.

Ingest method: OAI harvesting

Source: Institute of Automation


Unless otherwise stated, all content in this system is protected by copyright, and all rights are reserved.