Chinese Academy of Sciences Institutional Repositories Grid
Dadu-SV: Accelerate Stereo Vision Processing on NPU

Document type: Journal article

Authors: Min, Feng (3); Wang, Ying (2); Xu, Haobo (3); Huang, Junpei (1); Wang, Yujie (3); Zou, Xingqi (3); Lu, Meixuan (3); Han, Yinhe
Journal: IEEE EMBEDDED SYSTEMS LETTERS
Publication date: 2022-12-01
Volume: 14; Issue: 4; Pages: 191-194
ISSN: 1943-0663
Keywords: Hardware acceleration; neural computing; neural processing unit (NPU); semiglobal matching (SGM); stereo vision
DOI: 10.1109/LES.2022.3162859
Abstract: Binocular vision and convolutional neural networks (CNNs) are widely used in modern intelligent vision processing systems, such as robotics, autonomous vehicles, and AR gadgets. However, both classic semiglobal matching (SGM) and deep CNNs require substantial computing resources to reach their performance goals. Traditional embedded CPUs and graphics processing units (GPUs) cannot simultaneously meet the processing speed and energy requirements, while specialized circuits dedicated separately to SGM and to CNN processing incur considerable hardware and development costs. With the growing popularity of deep learning, neural processing units (NPUs) have become prevalent in many embedded and edge devices, offering high-throughput computing power for the matrix operations involved in neural networks. In this work, we attempt to take advantage of the neural processing architectures integrated in SoC chips to accelerate the SGM process, so that existing hardware resources are better utilized instead of investing more resources in specialized SGM components. To this end, this letter first deploys SGM on an NPU by converting the incompatible operations into the neural-computing flow, and proposes a configurable neural processing element to flexibly support various vector operation sequences. Then, a hybrid dataflow scheduler and the corresponding hardware modification are introduced to accelerate the cost processing, improving hardware utilization and on-chip memory footprint and access. Our solution runs at 45 fps for an image size of $640\times 480$ with 128 disparity levels. The speed-energy efficiency is $52\times$ better than the GPU (Jetson TX1) solution, with negligible additional hardware overhead and accuracy loss.
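As background for the abstract, the following is a minimal sketch of the textbook SGM cost-aggregation recurrence along a single left-to-right path, followed by winner-take-all disparity selection. It is not the letter's NPU mapping or dataflow. The function names and the penalty parameters `p1` and `p2` are illustrative assumptions. The sketch only shows that the aggregation step consists of element-wise additions and minimum operations over a disparity vector, which is the kind of vector operation sequence the abstract refers to. The last helper does the simple arithmetic implied by the reported operating point, assuming one pass over the full cost volume per frame.

```python
def aggregate_scanline(cost, p1, p2):
    """Aggregate matching costs along one left-to-right path (textbook SGM).

    cost[x][d] is the matching cost of pixel x at disparity d on one scanline.
    p1 penalizes a disparity change of 1, p2 penalizes larger jumps
    (both are assumed, illustrative parameters).
    """
    levels = len(cost[0])
    agg = [list(cost[0])]  # the first pixel has no predecessor on the path
    for x in range(1, len(cost)):
        prev = agg[-1]
        prev_min = min(prev)
        row = []
        for d in range(levels):
            best = prev[d]                            # same disparity
            if d > 0:
                best = min(best, prev[d - 1] + p1)    # disparity - 1
            if d + 1 < levels:
                best = min(best, prev[d + 1] + p1)    # disparity + 1
            best = min(best, prev_min + p2)           # any larger jump
            # Subtracting prev_min keeps aggregated values bounded.
            row.append(cost[x][d] + best - prev_min)
        agg.append(row)
    return agg


def winner_take_all(agg):
    """Pick, for each pixel, the disparity with the lowest aggregated cost."""
    return [min(range(len(row)), key=row.__getitem__) for row in agg]


def cost_entries_per_second(width, height, levels, fps):
    """Cost-volume entries touched per second, assuming one pass per frame."""
    return width * height * levels * fps


if __name__ == "__main__":
    toy_cost = [[0, 5, 5], [5, 0, 5], [5, 5, 0]]
    aggregated = aggregate_scanline(toy_cost, p1=1, p2=3)
    print(winner_take_all(aggregated))
    print(cost_entries_per_second(640, 480, 128, 45))
```

A full SGM pipeline sums such path costs over several directions before the winner-take-all step. The single-path version is kept here only to make the per-pixel vector operations visible.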
Funding: National Natural Science Foundation of China (NSFC)[62025404]; National Natural Science Foundation of China (NSFC)[61834006]; National Natural Science Foundation of China (NSFC)[61874124]; Strategic Priority Research Program of Chinese Academy of Sciences[XDC05030100]; Strategic Priority Research Program of Chinese Academy of Sciences[XDB4400000]; Strategic Priority Research Program of Chinese Academy of Sciences[XDPB12]
WOS research areas: Computer Science; Engineering
Language: English
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
WOS accession number: WOS:000890850400008
Source URL: [http://119.78.100.204/handle/2XEOYT63/20273]
Collection: Journal articles of the Institute of Computing Technology, Chinese Academy of Sciences
Corresponding authors: Wang, Ying; Xu, Haobo
Author affiliations:
1. Univ Sci & Technol China, Sch Microelect, Hefei 230026, Anhui, Peoples R China
2. Chinese Acad Sci, Inst Comp Technol, State Key Lab Comp Architecture, Beijing 100190, Peoples R China
3. Chinese Acad Sci, Inst Comp Technol, Res Ctr Intelligent Comp Syst, Beijing 100045, Peoples R China
Recommended citation
GB/T 7714
Min, Feng,Wang, Ying,Xu, Haobo,et al. Dadu-SV: Accelerate Stereo Vision Processing on NPU[J]. IEEE EMBEDDED SYSTEMS LETTERS,2022,14(4):191-194.
APA Min, Feng.,Wang, Ying.,Xu, Haobo.,Huang, Junpei.,Wang, Yujie.,...&Han, Yinhe.(2022).Dadu-SV: Accelerate Stereo Vision Processing on NPU.IEEE EMBEDDED SYSTEMS LETTERS,14(4),191-194.
MLA Min, Feng,et al."Dadu-SV: Accelerate Stereo Vision Processing on NPU".IEEE EMBEDDED SYSTEMS LETTERS 14.4(2022):191-194.

Ingest method: OAI harvesting

Source: Institute of Computing Technology


Unless otherwise stated, all content in this system is protected by copyright, and all rights are reserved.