Chinese Academy of Sciences Institutional Repositories Grid
Delving into the Effectiveness of Receptive Fields: Learning Scale-Transferrable Architectures for Practical Object Detection

Document type: Journal article

Authors: Zhang, Zhaoxiang2,3,4,5; Pan, Cong3,4,5; Peng, Junran1
Journal: INTERNATIONAL JOURNAL OF COMPUTER VISION
Publication date: 2022-04-01
Volume: 130; Issue: 4; Pages: 970-989
Keywords: Computer vision; Object detection; Effective receptive fields; Hardware acceleration
ISSN: 0920-5691
DOI: 10.1007/s11263-021-01573-6
Corresponding author: Peng, Junran (pengjunran@huawei.com)
Abstract: Scale-sensitive object detection remains a challenging task, where most of the existing methods could not learn it explicitly and are not robust. Besides, they are less efficient during training or slow during inference, which is not friendly to real-time applications. In this paper, we propose a scale-transferrable architecture for practical object detection based on the analysis of the connection between dilation rate and effective receptive field. Our method firstly predicts a global continuous scale, which is shared by all positions, for each convolution filter of each network stage. Secondly, we average the spatial features and distill the scale from channels to effectively learn the scale. Thirdly, for fast-deployment, we propose a scale decomposition method that transfers the robust fractional scale into the combination of fixed integral scales for each convolution filter, which exploits the dilated convolution. Moreover, to overcome the shortcomings of our method for large-scale object detection, we modify the Feature Pyramid Network structure. Finally, we illustrate the orthogonality role of our method for sampling strategy. We demonstrate the effectiveness of our method on one-stage and two-stage algorithms under different configurations and compare them with different dilated convolution blocks. For practical applications, the training strategy of our method is simple and efficient, avoiding complex data sampling or optimization strategy. During inference, we reduce the latency of the proposed method by using the hardware accelerator TensorRT without extra operation. On the COCO test-dev, our model achieves 41.7% mAP on one-stage detector and 42.5% mAP on two-stage detector based on ResNet-101, and outperforms baselines by 3.2% and 3.1% mAP, respectively.
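Illustrative note: the abstract mentions decomposing a fractional scale into a combination of fixed integral scales for dilated convolution, but does not give the rule. The sketch below shows one plausible reading only. It assumes the fractional dilation rate is split between its two neighbouring integer rates with linear-interpolation weights; that weighting and the function names `decompose_scale` and `dilated_kernel_extent` are hypothetical and are not taken from the paper. The kernel-extent formula k + (k - 1)(d - 1) is the standard one for dilated convolution.

```python
import math
from typing import List, Tuple


def dilated_kernel_extent(kernel_size: int, dilation: int) -> int:
    """Spatial extent of a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def decompose_scale(scale: float) -> List[Tuple[int, float]]:
    """Split a fractional dilation rate into integer rates with weights.

    Assumption (not from the paper): linear interpolation between the two
    neighbouring integers, so the weighted mean of the rates equals `scale`.
    """
    if scale < 1:
        raise ValueError("dilation rate must be >= 1")
    lower = math.floor(scale)
    frac = scale - lower
    if frac == 0:
        # Already an integer rate: a single dilated convolution suffices.
        return [(lower, 1.0)]
    return [(lower, 1.0 - frac), (lower + 1, frac)]


if __name__ == "__main__":
    # A learned scale of 1.5 on a 3x3 filter maps to integer rates 1 and 2.
    for rate, weight in decompose_scale(1.5):
        extent = dilated_kernel_extent(3, rate)
        print(f"dilation={rate} weight={weight} extent={extent}x{extent}")
```

Under this assumption, each filter would be evaluated with at most two fixed integer dilation rates, which is the kind of operation standard inference engines already support.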
Funding: Major Project for New Generation of AI [2018AAA0100400]; National Natural Science Foundation of China [61836014]; National Natural Science Foundation of China [U21B 2042]
WOS research area: Computer Science
Language: English
WOS accession number: WOS:000759289300002
Publisher: SPRINGER
Funding organizations: Major Project for New Generation of AI; National Natural Science Foundation of China
Source URL: [http://ir.ia.ac.cn/handle/173211/47954]
Collection: Institute of Automation_Center for Research on Intelligent Perception and Computing
Corresponding author: Peng, Junran
Author affiliations:
1. Huawei Cloud & AI, Beijing, Peoples R China
2. Chinese Acad Sci, Hong Kong Inst Sci & Innovat, Ctr Artificial Intelligence & Robot, Hong Kong, Peoples R China
3. Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing, Peoples R China
4. Ctr Res Intelligent Percept & Comp, Beijing, Peoples R China
5. Univ Chinese Acad Sci, Sch Future Technol, Beijing, Peoples R China
Recommended citation
GB/T 7714
Zhang, Zhaoxiang,Pan, Cong,Peng, Junran. Delving into the Effectiveness of Receptive Fields: Learning Scale-Transferrable Architectures for Practical Object Detection[J]. INTERNATIONAL JOURNAL OF COMPUTER VISION,2022,130(4):970-989.
APA Zhang, Zhaoxiang,Pan, Cong,&Peng, Junran.(2022).Delving into the Effectiveness of Receptive Fields: Learning Scale-Transferrable Architectures for Practical Object Detection.INTERNATIONAL JOURNAL OF COMPUTER VISION,130(4),970-989.
MLA Zhang, Zhaoxiang,et al."Delving into the Effectiveness of Receptive Fields: Learning Scale-Transferrable Architectures for Practical Object Detection".INTERNATIONAL JOURNAL OF COMPUTER VISION 130.4(2022):970-989.

Ingest method: OAI harvesting

Source: Institute of Automation


Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.