Chinese Academy of Sciences Institutional Repositories Grid
Improving Inconspicuous Attributes Modeling for Person Search by Language

Document type: Journal article

Authors: Niu, Kai (1,4); Huang, Tao (1,4); Huang, Linjiang (3); Wang, Liang (2); Zhang, Yanning (1)
Journal: IEEE TRANSACTIONS ON IMAGE PROCESSING
Publication date: 2023
Volume: 32; Pages: 3429-3441
Keywords: Person search by language; cross-modal retrieval; smart video surveillance
ISSN: 1057-7149
DOI: 10.1109/TIP.2023.3285426
Corresponding authors: Niu, Kai (kai.niu@nwpu.edu.cn); Huang, Linjiang (ljhuang524@gmail.com)
Abstract: Person search by language aims to retrieve the interested pedestrian images based on natural language sentences. Although great efforts have been made to address the cross-modal heterogeneity, most of the current solutions suffer from only capturing salient attributes while ignoring inconspicuous ones, being weak in distinguishing very similar pedestrians. In this work, we propose the Adaptive Salient Attribute Mask Network (ASAMN) to adaptively mask the salient attributes for cross-modal alignments, and therefore induce the model to simultaneously focus on inconspicuous attributes. Specifically, we consider the uni-modal and cross-modal relations for masking salient attributes in the Uni-modal Salient Attribute Mask (USAM) and Cross-modal Salient Attribute Mask (CSAM) modules, respectively. Then the Attribute Modeling Balance (AMB) module is presented to randomly select a proportion of masked features for cross-modal alignments, ensuring the balance of modeling capacity of both salient attributes and inconspicuous ones. Extensive experiments and analyses have been carried out to validate the effectiveness and generalization capacity of our proposed ASAMN method, and we have obtained the state-of-the-art retrieval performance on the widely-used CUHK-PEDES and ICFG-PEDES benchmarks.
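The abstract describes two ideas: masking the most salient attributes, and using the masked features for only a random proportion of samples so that salient and inconspicuous attributes both stay modeled. The following is a minimal illustrative sketch of that general idea, not the authors' ASAMN implementation. The function names, the `mask_ratio` and `select_prob` parameters, and the precomputed `saliency` scores are all assumptions made for illustration; in the paper, saliency is derived from uni-modal and cross-modal relations (USAM and CSAM), which this sketch does not reproduce.

```python
import random


def mask_salient(features, saliency, mask_ratio=0.3):
    """Zero out the local features with the highest saliency scores.

    `features` is a list of local feature vectors and `saliency` holds one
    score per vector. Both the scoring and `mask_ratio` are hypothetical.
    """
    if len(features) != len(saliency):
        raise ValueError("features and saliency must have the same length")
    k = int(round(mask_ratio * len(features)))
    # Indices of the k most salient local features.
    order = sorted(range(len(features)), key=lambda i: saliency[i], reverse=True)
    top = set(order[:k])
    return [[0.0] * len(f) if i in top else list(f) for i, f in enumerate(features)]


def balance_batch(batch, saliencies, select_prob=0.5, mask_ratio=0.3, rng=None):
    """Use masked features for a random proportion of samples.

    Each sample is replaced by its masked version with probability
    `select_prob` (an assumed parameter); otherwise it is passed unchanged.
    """
    rng = rng or random.Random()
    out = []
    for feats, sal in zip(batch, saliencies):
        if rng.random() < select_prob:
            out.append(mask_salient(feats, sal, mask_ratio))
        else:
            out.append([list(f) for f in feats])
    return out
```

With `select_prob=1.0` every sample is masked and with `select_prob=0.0` none is, so intermediate values trade off attention between salient and inconspicuous attributes in this simplified setting.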
WOS keywords: VIDEO; IMAGE
Funding projects: National Natural Science Foundation of China [62101451]; National Natural Science Foundation of China [U19B2037]; Guangdong Basic and Applied Basic Research Foundation [2023A1515011427]; Fundamental Research Funds for the Central Universities [D5000210733]
WOS research areas: Computer Science; Engineering
Language: English
WOS accession number: WOS:001017283200006
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Funding organizations: National Natural Science Foundation of China; Guangdong Basic and Applied Basic Research Foundation; Fundamental Research Funds for the Central Universities
Source URL: [http://ir.ia.ac.cn/handle/173211/53678]
Collection: State Key Laboratory of Multimodal Artificial Intelligence Systems
Corresponding authors: Niu, Kai; Huang, Linjiang
Author affiliations:
1. Northwestern Polytech Univ, Sch Comp Sci, Natl Engn Lab Integrated Aerosp Ground Ocean Big D, Xian 710129, Peoples R China
2. Chinese Acad Sci, Inst Automat, Beijing 100190, Peoples R China
3. Chinese Univ Hong Kong, Multimedia Lab, Hong Kong, Peoples R China
4. Northwestern Polytech Univ Shenzhen, Res & Dev Inst, Shenzhen 518063, Peoples R China
Recommended citation:
GB/T 7714
Niu, Kai, Huang, Tao, Huang, Linjiang, et al. Improving Inconspicuous Attributes Modeling for Person Search by Language[J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2023, 32: 3429-3441.
APA Niu, Kai, Huang, Tao, Huang, Linjiang, Wang, Liang, & Zhang, Yanning. (2023). Improving Inconspicuous Attributes Modeling for Person Search by Language. IEEE TRANSACTIONS ON IMAGE PROCESSING, 32, 3429-3441.
MLA Niu, Kai, et al. "Improving Inconspicuous Attributes Modeling for Person Search by Language". IEEE TRANSACTIONS ON IMAGE PROCESSING 32 (2023): 3429-3441.

Ingest method: OAI harvesting

Source: Institute of Automation

