Dense Attention: A Densely Connected Attention Mechanism for Vision Transformer
Document Type: Conference Paper
Authors | Nannan Li1,2; Yaran Chen1,2; Dongbin Zhao1,2 |
Publication Date | 2023-05 |
Conference Date | June 18 - 23, 2023 |
Conference Location | Queensland, Australia |
Abstract | Recently, the Vision Transformer has demonstrated impressive capability in image understanding. The multi-head self-attention mechanism is fundamental to its strong performance. However, self-attention incurs a high computational cost, so training the model requires powerful computational resources or longer time. This paper designs a novel and efficient attention mechanism, Dense Attention, to overcome this problem. Dense attention focuses on features from multiple views through a dense connection paradigm. Benefiting from attention over these comprehensive features, dense attention can i) remarkably strengthen the image representation of the model, and ii) partially replace the multi-head self-attention mechanism to allow model slimming. To verify the effectiveness of dense attention, we implement it in prevalent Vision Transformer models, including the non-pyramid architecture DeiT and the pyramid architecture Swin Transformer. Experimental results on ImageNet classification show that dense attention indeed improves performance: +1.8%/+1.3% for DeiT-T/S and +0.7%/+1.2% for Swin-T/S, respectively. Dense attention also demonstrates its transferability on the CIFAR10 and CIFAR100 recognition benchmarks, with classification accuracies of 98.9% and 89.6%, respectively. Furthermore, dense attention mitigates the performance loss caused by pruning the number of heads. Code and pre-trained models will be available. |
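The abstract describes dense attention only at a high level: features collected from multiple earlier views are combined through a dense connection paradigm, partially standing in for multi-head self-attention. The paper's actual formulation is not given in this record, so the following is a minimal hypothetical sketch of that idea in NumPy; the view-scoring rule (a simple content norm) is a placeholder assumption, not the authors' method.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def dense_attention(block_outputs):
    """Hypothetical sketch of a densely connected attention step.

    Instead of full multi-head self-attention, attend over the token
    features produced by all preceding blocks (a DenseNet-style
    connection pattern, as suggested by the abstract).

    block_outputs: list of (tokens, dim) arrays, one per earlier block.
    Returns a (tokens, dim) array mixing the multi-view features.
    """
    # Stack the dense connections: (views, tokens, dim)
    views = np.stack(block_outputs)
    # Placeholder scoring: rate each view per token by its feature norm
    # (the paper's learned scoring is not specified in this record).
    scores = np.linalg.norm(views, axis=-1)          # (views, tokens)
    weights = softmax(scores, axis=0)                # normalize over views
    # Weighted sum across views -> attended representation per token
    return (weights[..., None] * views).sum(axis=0)  # (tokens, dim)

# Usage: combine three earlier block outputs for 4 tokens of dim 8
rng = np.random.default_rng(0)
feats = [rng.standard_normal((4, 8)) for _ in range(3)]
out = dense_attention(feats)
print(out.shape)  # (4, 8)
```

Because the per-token mixing weights cost O(views × tokens) rather than the O(tokens²) of pairwise self-attention, a mechanism of this shape could plausibly support the model-slimming claim, though the real savings depend on the paper's exact design.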
Language | English |
Source URL | [http://ir.ia.ac.cn/handle/173211/52213] |
Collection | State Key Laboratory of Management and Control for Complex Systems_Deep Reinforcement Learning |
Author Affiliations | 1. School of Artificial Intelligence, University of Chinese Academy of Sciences; 2. State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences |
Recommended Citation (GB/T 7714) | Nannan Li, Yaran Chen, Dongbin Zhao. Dense Attention: A Densely Connected Attention Mechanism for Vision Transformer[C]. In: . Queensland, Australia. June 18 - 23, 2023. |
Deposit Method: OAI Harvesting
Source: Institute of Automation
Unless otherwise stated, all content in this system is protected by copyright, and all rights are reserved.