Chinese Academy of Sciences Institutional Repositories Grid
A Multitask Learning Approach Based on Cascaded Attention Network and Self-Adaption Loss for Speech Emotion Recognition

Document type: Journal article

Authors: Liu, Yang2; Xia, Yuqi2; Sun, Haoqin2; Meng, Xiaolei2; Bai, Jianxiong2; Guan, Wenbo2; Zhao, Zhen2; Li, Yongwei1
Journal: IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES
Publication date: 2023-06-01
Volume: E106A; Issue: 6; Pages: 876-885
ISSN: 0916-8508
Keywords: speech emotion recognition; non-personalized features; cascaded attention network; multitask learning; self-adaption loss
DOI: 10.1587/transfun.2022EAP1091
Corresponding author: Zhao, Zhen (zzqust@126.com)
Abstract: Speech emotion recognition (SER) has been a complex and difficult task for a long time due to emotional complexity. In this paper, we propose a multitask deep learning approach based on a cascaded attention network and self-adaption loss for SER. First, non-personalized features are extracted to represent the process of emotion change while reducing external variables' influence. Second, to highlight salient speech emotion features, a cascaded attention network is proposed, where spatial temporal attention can effectively locate the regions of speech that express emotion, while self-attention reduces the dependence on external information. Finally, the influence brought by the differences in gender and human perception of external information is alleviated by using a multitask learning strategy, where a self-adaption loss is introduced to determine the weights of different tasks dynamically. Experimental results on the IEMOCAP dataset demonstrate that our method gains an absolute improvement of 1.97% and 0.91% over state-of-the-art strategies in terms of weighted accuracy (WA) and unweighted accuracy (UA), respectively.
WOS keywords: NEURAL-NETWORK
Funding projects: National Natural Science Foundation of China (NSFC)[62201314] ; National Natural Science Foundation of China (NSFC)[62201571] ; Natural Science Foundation of Shandong Province[ZR2020QF007] ; Key Technology Tackling and Industrialization Demonstration projects of Qingdao[23-1-2-qdjh-18-gx]
WOS research areas: Computer Science ; Engineering
Language: English
Publisher: IEICE-INST ELECTRONICS INFORMATION COMMUNICATION ENGINEERS
WOS accession number: WOS:001018846400001
Funding organizations: National Natural Science Foundation of China (NSFC) ; Natural Science Foundation of Shandong Province ; Key Technology Tackling and Industrialization Demonstration projects of Qingdao
Source URL: [http://ir.ia.ac.cn/handle/173211/53585]
Collection: National Laboratory of Pattern Recognition_Intelligent Interaction
Author affiliations:
1. Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing 100089, Peoples R China
2. Qingdao Univ Sci & Technol, Sch Informat Sci & Technol, Qingdao 266061, Peoples R China
Recommended citation:
GB/T 7714: Liu, Yang, Xia, Yuqi, Sun, Haoqin, et al. A Multitask Learning Approach Based on Cascaded Attention Network and Self-Adaption Loss for Speech Emotion Recognition[J]. IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, 2023, E106A(6): 876-885.
APA: Liu, Yang., Xia, Yuqi., Sun, Haoqin., Meng, Xiaolei., Bai, Jianxiong., ... & Li, Yongwei. (2023). A Multitask Learning Approach Based on Cascaded Attention Network and Self-Adaption Loss for Speech Emotion Recognition. IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES, E106A(6), 876-885.
MLA: Liu, Yang, et al. "A Multitask Learning Approach Based on Cascaded Attention Network and Self-Adaption Loss for Speech Emotion Recognition". IEICE TRANSACTIONS ON FUNDAMENTALS OF ELECTRONICS COMMUNICATIONS AND COMPUTER SCIENCES E106A.6 (2023): 876-885.

Ingest method: OAI harvesting

Source: Institute of Automation
