Improving Inconspicuous Attributes Modeling for Person Search by Language
Document type: Journal article
Author | Niu, Kai1,4 |
Journal | IEEE TRANSACTIONS ON IMAGE PROCESSING |
Publication date | 2023 |
Volume | 32, pages 3429-3441 |
Keywords | Person search by language ; cross-modal retrieval ; smart video surveillance |
ISSN | 1057-7149 |
DOI | 10.1109/TIP.2023.3285426 |
Corresponding authors | Niu, Kai (kai.niu@nwpu.edu.cn) ; Huang, Linjiang (ljhuang524@gmail.com) |
Abstract | Person search by language aims to retrieve the interested pedestrian images based on natural language sentences. Although great efforts have been made to address the cross-modal heterogeneity, most of the current solutions suffer from only capturing salient attributes while ignoring inconspicuous ones, being weak in distinguishing very similar pedestrians. In this work, we propose the Adaptive Salient Attribute Mask Network (ASAMN) to adaptively mask the salient attributes for cross-modal alignments, and therefore induce the model to simultaneously focus on inconspicuous attributes. Specifically, we consider the uni-modal and cross-modal relations for masking salient attributes in the Uni-modal Salient Attribute Mask (USAM) and Cross-modal Salient Attribute Mask (CSAM) modules, respectively. Then the Attribute Modeling Balance (AMB) module is presented to randomly select a proportion of masked features for cross-modal alignments, ensuring the balance of modeling capacity of both salient attributes and inconspicuous ones. Extensive experiments and analyses have been carried out to validate the effectiveness and generalization capacity of our proposed ASAMN method, and we have obtained the state-of-the-art retrieval performance on the widely-used CUHK-PEDES and ICFG-PEDES benchmarks. |
WOS keywords | VIDEO ; IMAGE |
Funding projects | National Natural Science Foundation of China[62101451] ; National Natural Science Foundation of China[U19B2037] ; Guangdong Basic and Applied Basic Research Foundation[2023A1515011427] ; Fundamental Research Funds for the Central Universities[D5000210733] |
WOS research areas | Computer Science ; Engineering |
Language | English |
WOS accession number | WOS:001017283200006 |
Publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC |
Funding organizations | National Natural Science Foundation of China ; Guangdong Basic and Applied Basic Research Foundation ; Fundamental Research Funds for the Central Universities |
Source URL | [http://ir.ia.ac.cn/handle/173211/53678] |
Collection | State Key Laboratory of Multimodal Artificial Intelligence Systems |
Corresponding authors | Niu, Kai; Huang, Linjiang |
Author affiliations | 1.Northwestern Polytech Univ, Sch Comp Sci, Natl Engn Lab Integrated Aerosp Ground Ocean Big D, Xian 710129, Peoples R China 2.Chinese Acad Sci, Inst Automat, Beijing 100190, Peoples R China 3.Chinese Univ Hong Kong, Multimedia Lab, Hong Kong, Peoples R China 4.Northwestern Polytech Univ Shenzhen, Res & Dev Inst, Shenzhen 518063, Peoples R China |
Recommended citation GB/T 7714 | Niu, Kai,Huang, Tao,Huang, Linjiang,et al. Improving Inconspicuous Attributes Modeling for Person Search by Language[J]. IEEE TRANSACTIONS ON IMAGE PROCESSING,2023,32:3429-3441. |
APA | Niu, Kai,Huang, Tao,Huang, Linjiang,Wang, Liang,&Zhang, Yanning.(2023).Improving Inconspicuous Attributes Modeling for Person Search by Language.IEEE TRANSACTIONS ON IMAGE PROCESSING,32,3429-3441. |
MLA | Niu, Kai,et al."Improving Inconspicuous Attributes Modeling for Person Search by Language".IEEE TRANSACTIONS ON IMAGE PROCESSING 32(2023):3429-3441. |
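
The masking idea summarized in the abstract (hide the most salient attributes, and do so for only a random proportion of samples so that salient attributes are still learned) can be illustrated with a toy sketch. This is not the authors' ASAMN implementation: the abstract does not say how salience is measured or how masks are applied, so the per-position salience scores, the top-k masking rule, and the function names `salient_mask`, `apply_mask`, and `balanced_batch` are all assumptions made for illustration.

```python
import random

def salient_mask(scores, num_masked):
    """Return a 0/1 mask that zeroes the `num_masked` highest-scoring positions.

    `scores` are hypothetical per-position salience values (assumption).
    """
    # Rank positions by score, highest first; ties go to the lower index.
    ranked = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    masked = set(ranked[:num_masked])
    return [0 if i in masked else 1 for i in range(len(scores))]

def apply_mask(features, mask):
    """Zero out every feature vector whose mask entry is 0."""
    return [[x * m for x in vec] for vec, m in zip(features, mask)]

def balanced_batch(batch_features, batch_scores, proportion, num_masked, rng):
    """Mask salient positions for a random `proportion` of the samples.

    The remaining samples keep their original features, so both salient
    and inconspicuous positions contribute across the batch (assumed
    reading of the balance step, not the paper's exact procedure).
    """
    n = len(batch_features)
    chosen = set(rng.sample(range(n), round(proportion * n)))
    out = []
    for idx, (feats, scores) in enumerate(zip(batch_features, batch_scores)):
        if idx in chosen:
            out.append(apply_mask(feats, salient_mask(scores, num_masked)))
        else:
            out.append([list(vec) for vec in feats])
    return out, chosen

# The position with the highest score (index 1) is the one masked out.
print(salient_mask([0.1, 0.7, 0.2], 1))  # prints [1, 0, 1]
```

In this sketch, setting `proportion` to 0 trains on unmasked features only, and setting it to 1 masks every sample; intermediate values trade off the two.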
Ingest method: OAI harvesting
Source: Institute of Automation