Chinese Academy of Sciences Institutional Repositories Grid (CAS IR Grid): Powerful-IoU: More straightforward and faster bounding box regression loss with a nonmonotonic focusing mechanism

Powerful-IoU: More straightforward and faster bounding box regression loss with a nonmonotonic focusing mechanism

Document type: Journal article


Authors	Liu, Can 1,2; Wang, Kaige 3; Li, Qing 1; Zhao, Fazhan 1; Zhao, Kun 4; Ma, Hongtu 4
Journal	NEURAL NETWORKS
Publication date	2024-02-01
Volume	170	Pages: 276-284
Keywords	Object detection; Bounding box regression; Loss function design; Focusing mechanism
ISSN	0893-6080
DOI	10.1016/j.neunet.2023.11.041
Corresponding author	Liu, Can (canliu@whu.edu.cn)
Abstract	Bounding box regression (BBR) is one of the core tasks in object detection, and the BBR loss function significantly impacts its performance. However, we have observed that existing IoU-based loss functions suffer from unreasonable penalty factors, leading to anchor boxes expanding during regression and significantly slowing down convergence. To address this issue, we intensively analyzed the reasons for anchor box enlargement. In response, we propose a Powerful-IoU (PIoU) loss function, which combines a target size-adaptive penalty factor and a gradient-adjusting function based on anchor box quality. The PIoU loss guides anchor boxes to regress along efficient paths, resulting in faster convergence than existing IoU-based losses. Additionally, we investigate the focusing mechanism and introduce a non-monotonic attention layer that was combined with PIoU to obtain a new loss function PIoU v2. PIoU v2 loss enhances the capability to focus on anchor boxes of medium quality. By incorporating PIoU v2 into popular object detectors such as YOLOv8 and DINO, we achieved an increase in average precision (AP) and improved performance compared to their original loss functions on the MS COCO and PASCAL VOC datasets, thus validating the effectiveness of our proposed improvement strategies.
WOS keywords	OBJECT DETECTION
Funding project	National Key Research and Development Program of China [2021YFB3100904]
WOS research areas	Computer Science; Neurosciences & Neurology
Language	English
WOS accession number	WOS:001125595300001
Publisher	PERGAMON-ELSEVIER SCIENCE LTD
Funding organization	National Key Research and Development Program of China
Source URL	[http://ir.ia.ac.cn/handle/173211/54937]
Collection	Laboratory of Brain Atlas and Brain-Inspired Intelligence
Corresponding author	Liu, Can
Author affiliations	1. Chinese Acad Sci, Inst Microelect, Beijing 100029, Peoples R China; 2. Univ Chinese Acad Sci, Sch Integrated Circuits, Beijing 100020, Peoples R China; 3. China Acad Aerosp Sci & Innovat, Beijing 100048, Peoples R China; 4. Chinese Acad Sci, Inst Automat, Beijing 100190, Peoples R China
Recommended citation (GB/T 7714)	Liu, Can, Wang, Kaige, Li, Qing, et al. Powerful-IoU: More straightforward and faster bounding box regression loss with a nonmonotonic focusing mechanism[J]. NEURAL NETWORKS, 2024, 170: 276-284.
APA	Liu, Can, Wang, Kaige, Li, Qing, Zhao, Fazhan, Zhao, Kun, & Ma, Hongtu. (2024). Powerful-IoU: More straightforward and faster bounding box regression loss with a nonmonotonic focusing mechanism. NEURAL NETWORKS, 170, 276-284.
MLA	Liu, Can, et al. "Powerful-IoU: More straightforward and faster bounding box regression loss with a nonmonotonic focusing mechanism". NEURAL NETWORKS 170 (2024): 276-284.

Ingest method: OAI harvesting

Source: Institute of Automation
