Chinese Academy of Sciences Institutional Repositories Grid
Powerful-IoU: More straightforward and faster bounding box regression loss with a nonmonotonic focusing mechanism

Document type: Journal article

Authors: Liu, Can [1,2]; Wang, Kaige [3]; Li, Qing [1]; Zhao, Fazhan [1]; Zhao, Kun [4]; Ma, Hongtu [4]
Journal: NEURAL NETWORKS
Publication date: 2024-02-01
Volume: 170; Pages: 276-284
ISSN: 0893-6080
Keywords: Object detection; Bounding box regression; Loss function design; Focusing mechanism
DOI: 10.1016/j.neunet.2023.11.041
Corresponding author: Liu, Can (canliu@whu.edu.cn)
Abstract: Bounding box regression (BBR) is one of the core tasks in object detection, and the BBR loss function significantly impacts its performance. However, we have observed that existing IoU-based loss functions suffer from unreasonable penalty factors, leading to anchor boxes expanding during regression and significantly slowing down convergence. To address this issue, we intensively analyzed the reasons for anchor box enlargement. In response, we propose a Powerful-IoU (PIoU) loss function, which combines a target size-adaptive penalty factor and a gradient-adjusting function based on anchor box quality. The PIoU loss guides anchor boxes to regress along efficient paths, resulting in faster convergence than existing IoU-based losses. Additionally, we investigate the focusing mechanism and introduce a non-monotonic attention layer that was combined with PIoU to obtain a new loss function PIoU v2. PIoU v2 loss enhances the capability to focus on anchor boxes of medium quality. By incorporating PIoU v2 into popular object detectors such as YOLOv8 and DINO, we achieved an increase in average precision (AP) and improved performance compared to their original loss functions on the MS COCO and PASCAL VOC datasets, thus validating the effectiveness of our proposed improvement strategies.
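Note: the abstract refers to "IoU-based loss functions" without defining them, and this record does not give the PIoU formulas. As background only, the following is a minimal sketch of the plain IoU loss (1 - IoU) that such losses typically build on. It is not the authors' PIoU or PIoU v2: the size-adaptive penalty factor and the non-monotonic focusing term are deliberately left out. The `(x1, y1, x2, y2)` box layout and the names `iou` and `iou_loss` are assumptions made for illustration.

```python
from typing import Tuple

# Assumed box layout: (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Overlap along each axis, clamped to zero when the boxes are disjoint.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter

    # Guard against degenerate (zero-area) inputs.
    return inter / union if union > 0 else 0.0


def iou_loss(pred: Box, target: Box) -> float:
    """Baseline IoU loss: 0 for a perfect match, 1 for no overlap."""
    return 1.0 - iou(pred, target)


if __name__ == "__main__":
    print(iou_loss((0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0)))
```

For non-overlapping boxes this baseline is constant at 1 and carries no information about how far apart the boxes are, which is one reason IoU-based variants add penalty terms; the specific penalty used by PIoU should be taken from the paper itself.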
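WOS keywords: OBJECT DETECTION
Funding project: National Key Research and Development Program of China [2021YFB3100904]
WOS research areas: Computer Science; Neurosciences & Neurology
Language: English
Publisher: PERGAMON-ELSEVIER SCIENCE LTD
WOS accession number: WOS:001125595300001
Funding agency: National Key Research and Development Program of China
Source URL: http://ir.ia.ac.cn/handle/173211/54937
Collection: Laboratory of Brain Atlas and Brain-Inspired Intelligence (脑图谱与类脑智能实验室)
Corresponding author: Liu, Can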
Author affiliations:
1. Chinese Acad Sci, Inst Microelect, Beijing 100029, Peoples R China
2. Univ Chinese Acad Sci, Sch Integrated Circuits, Beijing 100020, Peoples R China
3. China Acad Aerosp Sci & Innovat, Beijing 100048, Peoples R China
4. Chinese Acad Sci, Inst Automat, Beijing 100190, Peoples R China
Recommended citation:
GB/T 7714: Liu, Can, Wang, Kaige, Li, Qing, et al. Powerful-IoU: More straightforward and faster bounding box regression loss with a nonmonotonic focusing mechanism[J]. NEURAL NETWORKS, 2024, 170: 276-284.
APA: Liu, Can, Wang, Kaige, Li, Qing, Zhao, Fazhan, Zhao, Kun, & Ma, Hongtu. (2024). Powerful-IoU: More straightforward and faster bounding box regression loss with a nonmonotonic focusing mechanism. NEURAL NETWORKS, 170, 276-284.
MLA: Liu, Can, et al. "Powerful-IoU: More straightforward and faster bounding box regression loss with a nonmonotonic focusing mechanism". NEURAL NETWORKS 170 (2024): 276-284.

Ingestion method: OAI harvesting

Source: Institute of Automation

