Chinese Academy of Sciences Institutional Repositories Grid
Self-Supervised Modality-Aware Multiple Granularity Pre-Training for RGB-Infrared Person Re-Identification

Document Type: Journal Article

Authors: Wan, Lin (1); Jing, Qianyan (1); Sun, Zongyuan (1); Zhang, Chuang (2); Li, Zhihang (3); Chen, Yehansen (1)
Journal: IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY
Publication Date: 2023
Volume: 18; Pages: 3044-3057
ISSN: 1556-6013
Keywords: Task analysis; Training; Feature extraction; Lighting; Cameras; Visualization; Self-supervised learning; Cross-modality person re-identification; self-supervised learning; multi-modality pre-training
DOI: 10.1109/TIFS.2023.3273911
Corresponding Author: Li, Zhihang (lizhihang.cas@gmail.com)
Abstract: RGB-Infrared person re-identification (RGB-IR ReID) aims to associate people across disjoint RGB and IR camera views. Currently, the state-of-the-art performance of RGB-IR ReID is not as impressive as that of conventional ReID, largely because of the notorious modality-bias training issue brought by single-modality ImageNet pre-training, which can yield RGB-biased representations that severely hinder cross-modality image retrieval. This paper makes a first attempt to tackle the task from a pre-training perspective. We propose a self-supervised pre-training solution, named Modality-Aware Multiple Granularity Learning (MMGL), which trains models from scratch directly on multi-modal ReID datasets yet achieves results competitive with ImageNet pre-training, without any external data or sophisticated tuning tricks. First, we develop a simple but effective 'permutation recovery' pretext task that globally maps shuffled RGB-IR images into a shared latent permutation space, providing modality-invariant global representations for downstream ReID tasks. Second, we present a part-aware cycle-contrastive (PCC) learning strategy that exploits cross-modality cycle-consistency to maximize agreement between semantically similar RGB-IR image patches. This enables contrastive learning in unpaired multi-modal scenarios, further improving the discriminability of local features without laborious instance augmentation. With these designs, MMGL effectively alleviates the modality-bias training problem. Extensive experiments demonstrate that it learns better representations (+8.03% Rank-1 accuracy) with faster training (converging in only a few hours) and higher data efficiency (<5% of the data) than ImageNet pre-training. The results also suggest that it generalizes well to various existing models and losses, and transfers promisingly across datasets. The code will be released at https://github.com/hansonchen1996/MMGL.
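To make the abstract's two pretext tasks concrete, here is a minimal Python/PyTorch sketch of the 'permutation recovery' idea, read as a patch-shuffle classification problem: an RGB or IR image is cut into a grid of patches, reordered by one of a fixed set of permutations, and a shared backbone predicts which permutation was applied. Everything here (the 2x2 grid, the names `shuffle_patches` and `PermutationRecovery`, the linear classifier head) is an illustrative assumption, not the authors' actual implementation; the released code at https://github.com/hansonchen1996/MMGL is authoritative.

```python
import itertools
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_SIDE = 2                                                # assumed 2x2 patch grid
PERMS = list(itertools.permutations(range(NUM_SIDE ** 2)))  # 4! = 24 permutation classes

def shuffle_patches(img, perm_idx):
    """Cut img (C, H, W) into a 2x2 grid and reorder the patches by PERMS[perm_idx]."""
    c, h, w = img.shape
    ph, pw = h // NUM_SIDE, w // NUM_SIDE
    patches = [img[:, i * ph:(i + 1) * ph, j * pw:(j + 1) * pw]
               for i in range(NUM_SIDE) for j in range(NUM_SIDE)]
    shuffled = [patches[p] for p in PERMS[perm_idx]]
    rows = [torch.cat(shuffled[r * NUM_SIDE:(r + 1) * NUM_SIDE], dim=2)
            for r in range(NUM_SIDE)]
    return torch.cat(rows, dim=1)

class PermutationRecovery(nn.Module):
    """A backbone shared by RGB and IR inputs, plus a permutation classifier."""
    def __init__(self, backbone, feat_dim):
        super().__init__()
        self.backbone = backbone                 # any feature extractor -> (B, feat_dim)
        self.head = nn.Linear(feat_dim, len(PERMS))

    def forward(self, x):
        return self.head(self.backbone(x))

def permutation_recovery_loss(model, imgs):
    """Cross-entropy over permutation indices; the task never uses modality labels."""
    perm_idx = torch.randint(len(PERMS), (imgs.size(0),))
    shuffled = torch.stack([shuffle_patches(im, int(i)) for im, i in zip(imgs, perm_idx)])
    return F.cross_entropy(model(shuffled), perm_idx)
```

The part-aware cycle-contrastive (PCC) strategy can be sketched in the same spirit: patch features from the two modalities are matched by nearest neighbour in both directions, and only patches whose match cycles back to themselves are treated as positives in an InfoNCE-style loss. Again, `pcc_loss`, the temperature `tau`, and the hard argmax matching are assumptions for illustration, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def pcc_loss(rgb_feats, ir_feats, tau=0.1):
    """Hypothetical cycle-contrastive loss over patch features of shape (N, P, D)."""
    n, p, d = rgb_feats.shape
    rgb = F.normalize(rgb_feats.reshape(n * p, d), dim=1)
    ir = F.normalize(ir_feats.reshape(n * p, d), dim=1)
    sim = rgb @ ir.t()                           # cross-modality patch similarity
    fwd = sim.argmax(dim=1)                      # each RGB patch -> nearest IR patch
    bwd = sim.argmax(dim=0)                      # each IR patch -> nearest RGB patch
    cycle_ok = bwd[fwd] == torch.arange(n * p, device=sim.device)
    # Contrast each cycle-consistent RGB patch against all IR patches,
    # with its round-trip partner as the positive.
    return F.cross_entropy(sim[cycle_ok] / tau, fwd[cycle_ok])
```

The appeal of pairing by cycle-consistency, as the abstract notes, is that it needs no paired RGB-IR identities and no instance augmentation: positives emerge from mutual nearest neighbours alone.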
WOS Research Areas: Computer Science; Engineering
Language: English
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
WOS Record Number: WOS:001010105200002
Source URL: http://ir.ia.ac.cn/handle/173211/53471
Collection: Institute of Automation, Center for Research on Intelligent Perception and Computing
作者单位1.China Univ Geosci, Sch Comp Sci, Wuhan 430078, Peoples R China
2.Nanjing Univ Sci & Technol, Sch Comp Sci & Engn, Nanjing 210094, Peoples R China
3.Chinese Acad Sci, Inst Automat, Beijing 100190, Peoples R China
Recommended Citation:
GB/T 7714: Wan, Lin, Jing, Qianyan, Sun, Zongyuan, et al. Self-Supervised Modality-Aware Multiple Granularity Pre-Training for RGB-Infrared Person Re-Identification[J]. IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, 2023, 18: 3044-3057.
APA: Wan, Lin, Jing, Qianyan, Sun, Zongyuan, Zhang, Chuang, Li, Zhihang, & Chen, Yehansen. (2023). Self-Supervised Modality-Aware Multiple Granularity Pre-Training for RGB-Infrared Person Re-Identification. IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, 18, 3044-3057.
MLA: Wan, Lin, et al. "Self-Supervised Modality-Aware Multiple Granularity Pre-Training for RGB-Infrared Person Re-Identification". IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY 18 (2023): 3044-3057.

Ingestion Method: OAI Harvesting

Source: Institute of Automation

