Searching Multi-Rate and Multi-Modal Temporal Enhanced Networks for Gesture Recognition
Document type: Journal article
Authors | Zitong, Yu4; Benjia, Zhou5; Jun, Wan6 |
Journal | IEEE Transactions on Image Processing |
Publication date | 2021 |
Volume | 30; Pages: 5626-5640 |
Abstract | Gesture recognition has attracted considerable attention owing to its great potential in applications. Although great progress has been made recently in multi-modal learning methods, existing methods still lack effective integration to fully explore synergies among spatio-temporal modalities for gesture recognition. This is partly because existing manually designed network architectures are inefficient at jointly learning from multiple modalities. In this paper, we propose the first neural architecture search (NAS)-based method for RGB-D gesture recognition. The proposed method includes two key components: 1) enhanced temporal representation via the proposed 3D Central Difference Convolution (3D-CDC) family, which captures rich temporal context by aggregating temporal difference information; and 2) optimized backbones for multi-sampling-rate branches and lateral connections among varied modalities. The resultant multi-modal multi-rate network provides a new perspective for understanding the relationship between RGB and depth modalities and their temporal dynamics. Comprehensive experiments on three benchmark datasets (IsoGD, NvGesture, and EgoGesture) demonstrate state-of-the-art performance in both single- and multi-modality settings. The code is available at https://github.com/ZitongYu/3DCDC-NAS. |
Language | English |
Source URL | [http://ir.ia.ac.cn/handle/173211/57115] |
Collection | State Key Laboratory of Multimodal Artificial Intelligence Systems |
Corresponding authors | Jun, Wan; Guoying, Zhao |
Author affiliations | 1.Alibaba Group 2.Tianjin University 3.Westlake University 4.University of Oulu 5.Macau University of Science and Technology 6.Institute of Automation, Chinese Academy of Sciences |
Recommended citation, GB/T 7714 | Zitong, Yu,Benjia, Zhou,Jun, Wan,et al. Searching Multi-Rate and Multi-Modal Temporal Enhanced Networks for Gesture Recognition[J]. IEEE Transactions on Image Processing,2021,30:5626-5640. |
APA | Zitong, Yu.,Benjia, Zhou.,Jun, Wan.,Pichao, Wang.,Haoyu, Chen.,...&Guoying, Zhao.(2021).Searching Multi-Rate and Multi-Modal Temporal Enhanced Networks for Gesture Recognition.IEEE Transactions on Image Processing,30,5626-5640. |
MLA | Zitong, Yu,et al."Searching Multi-Rate and Multi-Modal Temporal Enhanced Networks for Gesture Recognition".IEEE Transactions on Image Processing 30(2021):5626-5640. |
Ingest method: OAI harvesting
Source: Institute of Automation