Chinese Academy of Sciences Institutional Repositories Grid
Multi-Scale Spatial Temporal Graph Neural Network for Skeleton-Based Action Recognition

Document type: Journal article

Authors: Feng, Dong (1,2,3); Wu, ZhongCheng (1,2); Zhang, Jun (1,3); Ren, TingTing (1,3)
Journal: IEEE ACCESS
Publication date: 2021
Volume: 9
Keywords: Skeleton-based action recognition; multi-scale spatial-temporal network; graph convolutional network; adaptive fusion
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2021.3073107
Corresponding author: Ren, TingTing (ttren@hmfl.ac.cn)
Abstract: Graph convolutional networks (GCNs) have achieved remarkable performance on skeleton-based action recognition. Existing GCN-based methods usually apply a fixed graph topology and a single fixed temporal convolution kernel to extract the spatial features of joints and the temporal features, which amounts to a single-scale perspective. In practice, human actions are coordinated by various body parts in the spatial domain and exhibit different characteristics in the temporal domain. It is therefore appropriate to model multi-scale information, which can enhance both explainability and stability but is ignored in the current literature. To address this issue, we propose a multi-scale spatial-temporal graph neural network (MSTGNN) to discover multi-scale discriminative features from the spatial and temporal aspects simultaneously. Our contributions are threefold: 1) For the spatial domain, inspired by the kinematics of human action, we develop three-scale graph data structures in a fine-to-coarse way. A novel hybrid spatial pooling module is then proposed to dynamically exploit global and comprehensive information step by step. 2) For the temporal domain, we design a multi-scale temporal convolution module that adaptively fuses the temporal features extracted by convolution kernels of different scales. 3) Because it uses a one-stream architecture instead of a multi-stream architecture, the proposed model can be trained in an end-to-end manner. MSTGNN achieves state-of-the-art performance with lower computational complexity. Experimental results on two large datasets (NTU-RGB+D and NTU-RGB+D-120) demonstrate the superiority of MSTGNN.
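WOS research areas: Computer Science; Engineering; Telecommunications
Language: English
WOS accession number: WOS:000641944500001
Publisher: IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Source URL: http://ir.hfcas.ac.cn:8080/handle/334002/121569
Collection: Hefei Institutes of Physical Science, Chinese Academy of Sciences
Corresponding author: Ren, TingTing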
Author affiliations:
1. Chinese Acad Sci, Hefei Inst Phys Sci, High Magnet Field Lab, Hefei 230031, Peoples R China
2. Univ Sci & Technol China, Sch Hefei Inst Phys Sci, Hefei 230031, Peoples R China
3. High Magnet Field Lab Anhui Prov, Hefei 230031, Peoples R China
Recommended citation
GB/T 7714
Feng, Dong,Wu, ZhongCheng,Zhang, Jun,et al. Multi-Scale Spatial Temporal Graph Neural Network for Skeleton-Based Action Recognition[J]. IEEE ACCESS,2021,9.
APA Feng, Dong,Wu, ZhongCheng,Zhang, Jun,&Ren, TingTing.(2021).Multi-Scale Spatial Temporal Graph Neural Network for Skeleton-Based Action Recognition.IEEE ACCESS,9.
MLA Feng, Dong,et al."Multi-Scale Spatial Temporal Graph Neural Network for Skeleton-Based Action Recognition".IEEE ACCESS 9(2021).

Ingest method: OAI harvesting

Source: Hefei Institutes of Physical Science


Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.