Hierarchical Attention Network for Open-Set Fine-Grained Image Recognition
Document type: Journal article
Author | Sun, Jiayin1,2,3 |
Journal | IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY |
Publication date | 2024-05-01 |
Volume | 34; Issue: 5; Pages: 3891-3904 |
Keywords | Transformers; Feature extraction; Task analysis; Image recognition; Training; Visualization; Computer vision; Open-set fine-grained image recognition; hierarchical attention; long-short term memory |
ISSN | 1051-8215 |
DOI | 10.1109/TCSVT.2023.3325001 |
Corresponding author | Dong, Qiulei (qldong@nlpr.ia.ac.cn) |
Abstract | Triggered by the success of transformers in various visual tasks, the spatial self-attention mechanism has recently attracted more and more attention in the computer vision community. However, we empirically found that a typical vision transformer with the spatial self-attention mechanism could not learn accurate attention maps for distinguishing different categories of fine-grained images. To address this problem, motivated by the temporal attention mechanism in brains, we propose a hierarchical attention network for learning fine-grained feature representations, called HAN, where the features learnt by implementing a sequence of spatial self-attention operations corresponding to multiple moments are aggregated progressively. The proposed HAN consists of four modules: a self-attention backbone module for learning a sequence of features with self-attention operations, a spatial feature self-organizing module for facilitating the model training, a hierarchical aggregation module for aggregating the re-organized features via a Long Short-Term Memory network, and a context-aware module that is implemented as the forget block of the hierarchical aggregation module for preserving/forgetting the long-term memory by utilizing contextual information. Then, we propose a HAN-based method for open-set fine-grained recognition by integrating the proposed HAN network with a linear classifier, called HAN-OSFGR. Extensive experimental results on 3 fine-grained datasets and 2 coarse-grained datasets demonstrate that the proposed HAN-OSFGR outperforms 9 state-of-the-art open-set recognition methods significantly in most cases. |
WOS keywords | TEMPORAL ATTENTION ; DIFFICULTY |
Funding project | National Key Research and Development Program of China |
WOS research area | Engineering |
Language | English |
WOS accession number | WOS:001221132000022 |
Publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC |
Funding organization | National Key Research and Development Program of China |
Source URL | [http://ir.ia.ac.cn/handle/173211/58660] |
Collection | Institute of Automation_National Laboratory of Pattern Recognition_Robot Vision Group |
Corresponding author | Dong, Qiulei |
Author affiliations | 1. Chinese Acad Sci, Inst Automat, State Key Lab Multimodal Artificial Intelligence S, Beijing 100190, Peoples R China; 2. Chinese Acad Sci, Ctr Excellence Brain Sci & Intelligence Technol, Beijing 100190, Peoples R China; 3. Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China; 4. Univ Chinese Acad Sci, Coll Life Sci, Beijing 100049, Peoples R China |
Recommended citation (GB/T 7714) | Sun, Jiayin, Wang, Hong, Dong, Qiulei. Hierarchical Attention Network for Open-Set Fine-Grained Image Recognition[J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34(5): 3891-3904. |
APA | Sun, Jiayin, Wang, Hong, & Dong, Qiulei. (2024). Hierarchical Attention Network for Open-Set Fine-Grained Image Recognition. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 34(5), 3891-3904. |
MLA | Sun, Jiayin, et al. "Hierarchical Attention Network for Open-Set Fine-Grained Image Recognition". IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY 34.5 (2024): 3891-3904. |
Ingest method: OAI harvesting
Source: Institute of Automation
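
The abstract describes features from a sequence of self-attention operations being aggregated progressively by a Long Short-Term Memory network and then passed to a linear classifier for open-set recognition. The following is only a toy sketch of that data flow, not the authors' implementation. Everything in it is an assumption for illustration: the class `ToyLSTMAggregator`, the helper names, the random untrained weights, and the feature sizes are hypothetical; the forget gate is a standard LSTM gate rather than the paper's context-aware module; and the rejection rule (thresholding the maximum softmax probability) is a common open-set baseline that the abstract does not specify.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector (list of floats).
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def rand_matrix(rng, rows, cols, scale=0.1):
    # Hypothetical untrained weights, for illustration only.
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]

class ToyLSTMAggregator:
    """Aggregates a sequence of feature vectors one step at a time."""

    def __init__(self, dim, hidden, seed=0):
        rng = random.Random(seed)
        self.hidden = hidden
        # One weight matrix per gate, applied to [input ; previous hidden state].
        self.W = {g: rand_matrix(rng, hidden, dim + hidden) for g in "ifog"}

    def step(self, x, h, c):
        z = x + h  # list concatenation of input and previous hidden state
        i = [sigmoid(a) for a in matvec(self.W["i"], z)]    # input gate
        f = [sigmoid(a) for a in matvec(self.W["f"], z)]    # standard forget gate (assumption)
        o = [sigmoid(a) for a in matvec(self.W["o"], z)]    # output gate
        g = [math.tanh(a) for a in matvec(self.W["g"], z)]  # candidate memory
        c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
        return h, c

    def aggregate(self, features):
        # Feed the per-layer features in order; return the final hidden state.
        h = [0.0] * self.hidden
        c = [0.0] * self.hidden
        for x in features:
            h, c = self.step(x, h, c)
        return h

def softmax(logits):
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    s = sum(exps)
    return [e / s for e in exps]

def open_set_predict(h, W_cls, threshold):
    # Linear classifier followed by max-softmax rejection (assumed rule).
    # Returns (label, probs); label is -1 when the sample is rejected as unknown.
    probs = softmax(matvec(W_cls, h))
    best = max(range(len(probs)), key=probs.__getitem__)
    return (best if probs[best] >= threshold else -1), probs

# Usage: 4 hypothetical per-layer feature vectors of size 8, 3 known classes.
rng = random.Random(1)
layer_features = [[rng.uniform(-1, 1) for _ in range(8)] for _ in range(4)]
agg = ToyLSTMAggregator(dim=8, hidden=6)
h = agg.aggregate(layer_features)
W_cls = rand_matrix(rng, 3, 6)
label, probs = open_set_predict(h, W_cls, threshold=0.5)
```

Raising `threshold` makes the sketch reject more samples as unknown, while lowering it toward 0 reduces it to ordinary closed-set classification.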