Chinese Academy of Sciences Institutional Repositories Grid
A2Pt: Anti-Associative Prompt Tuning for Open Set Visual Recognition

Document type: Journal article

Authors: Ren, Hairui [1]; Tang, Fan [2]; Pan, Xingjia [3]; Cao, Juan [2]; Dong, Weiming [4]; Lin, Zhiwen [3]; Yan, Ke [3]; Xu, Changsheng
Journal: IEEE TRANSACTIONS ON MULTIMEDIA
Publication date: 2024
Volume: 26; Pages: 8419-8431
Keywords: Tuning; Neck; Task analysis; Image recognition; Calibration; Visualization; Training; Multi-modality Pre-trained models (PTMs); open set recognition (OSR); class-aware representation; anti-associative prompt tuning (A(2)Pt)
ISSN: 1520-9210
DOI: 10.1109/TMM.2023.3339387
Corresponding author: Tang, Fan (tangfan@ict.ac.cn)
Abstract: Multi-modality pre-trained models (PTMs) have considerably boosted performance on a broad range of computer vision topics. Still, they have not been purposefully explored in open set recognition (OSR) scenarios, where PTMs are applied to downstream recognition tasks. Directly fine-tuning or prompt-tuning PTMs on closed-set classification tasks inevitably suffers from data bias and tends to learn target class-irrelevant co-occurring contextual information, which leads to over-confident predictions on unknown samples. In this paper, we propose a simple yet effective approach, termed Anti-Associative Prompt Tuning (A(2)Pt), toward learning compact and accurate class-related representations with few class-irrelevant associations from context using multi-modal priors. Specifically, a cross-modal guided activation module is adopted to refine the class-aware representation and suppress the associations from co-occurring contexts by involving text-modal information. We further design an anti-association calibration module to obtain compact class-aware and class-irrelevant representations, respectively, by introducing two additional objective functions. Extensive experiments on publicly available benchmarks, including the CIFAR series, TinyImageNet, and ImageNet-21K-P, show that the proposed A(2)Pt achieves substantial and consistent performance gains compared with both SOTA OSR and PTM prompt tuning approaches.
Funding projects: National Natural Science Foundation of China[62102162] ; National Natural Science Foundation of China[U20B2070] ; National Natural Science Foundation of China[61832016] ; National Natural Science Foundation of China[61832002] ; Beijing Natural Science Foundation[L221013]
WOS research areas: Computer Science ; Telecommunications
Language: English
WOS accession number: WOS:001283692500027
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Funding organizations: National Natural Science Foundation of China ; Beijing Natural Science Foundation
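Source URL: [http://ir.ia.ac.cn/handle/173211/59309]
Collection: Institute of Automation_National Laboratory of Pattern Recognition_Multimedia Computing and Graphics Team
Corresponding author: Tang, Fan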
Author affiliations:
1. Jilin Univ, Sch Artificial Intelligence, Jilin 130012, Peoples R China
2. Chinese Acad Sci, Inst Comp Technol, Beijing 100045, Peoples R China
3. Youtu Lab, Tencent, Shanghai 201103, Peoples R China
4. Chinese Acad Sci, Inst Automat, Beijing 100045, Peoples R China
Recommended citation:
GB/T 7714: Ren, Hairui,Tang, Fan,Pan, Xingjia,et al. A2Pt : Anti-Associative Prompt Tuning for Open Set Visual Recognition[J]. IEEE TRANSACTIONS ON MULTIMEDIA,2024,26:8419-8431.
APA: Ren, Hairui.,Tang, Fan.,Pan, Xingjia.,Cao, Juan.,Dong, Weiming.,...&Xu, Changsheng.(2024).A2Pt : Anti-Associative Prompt Tuning for Open Set Visual Recognition.IEEE TRANSACTIONS ON MULTIMEDIA,26,8419-8431.
MLA: Ren, Hairui,et al."A2Pt : Anti-Associative Prompt Tuning for Open Set Visual Recognition".IEEE TRANSACTIONS ON MULTIMEDIA 26(2024):8419-8431.

Ingest method: OAI harvesting

Source: Institute of Automation


Unless otherwise stated, all content in this system is protected by copyright, and all rights are reserved.