Chinese Academy of Sciences Institutional Repositories Grid
Open-Vocabulary One-Stage Detection with Hierarchical Visual-Language Knowledge Distillation

Document type: Conference paper

Authors: Zongyang Ma; Guan Luo; Jin Gao; Liang L; Yuxin Chen; Shaoru Wang; Congxuan Zhang; Weiming Hu
Publication date: 2022
Conference date: 2022-06
Conference location: New Orleans, Louisiana
Abstract

Open-vocabulary object detection aims to detect unseen objects beyond the training set, and recent two-stage detectors that distill knowledge from zero-shot image classification models, i.e., CLIP, have achieved state-of-the-art performance in this area. However, for more efficient one-stage detectors, the absence of class-agnostic object proposals in distillation leads to an obvious performance gap between the two kinds of detectors. In this paper, a hierarchical knowledge distillation mechanism, namely HierKD, is explored to devise a fast and top-performing open-vocabulary one-stage detector. Besides the commonly used instance-level visual-to-visual knowledge distillation, it is also equipped with a weakly supervised global-level visual-to-language knowledge distillation that uses captions to learn novel category knowledge beyond the training labels. Evaluation results on the COCO dataset show that the proposed method significantly surpasses the previous best one-stage method, with 70% and 33% AP50 gains under the Zero-shot Detection and Generalized Zero-shot Detection settings, respectively. The code will be made publicly available upon publication.

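Source URL: [http://ir.ia.ac.cn/handle/173211/47438]
Collection: Institute of Automation_National Laboratory of Pattern Recognition_Video Content Security Team
Corresponding author: Jin Gao
Author affiliation: National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences
Recommended citation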
GB/T 7714
Zongyang Ma, Guan Luo, Jin Gao, et al. Open-Vocabulary One-Stage Detection with Hierarchical Visual-Language Knowledge Distillation[C]. In:. New Orleans, Louisiana. 2022-06.

Ingestion method: OAI harvesting

Source: Institute of Automation

