Chinese Academy of Sciences Institutional Repositories Grid: Music Theory-Inspired Acoustic Representation for Speech Emotion Recognition

Music Theory-Inspired Acoustic Representation for Speech Emotion Recognition

Document type: Journal article


Authors	Li, Xingfeng 2; Shi, Xiaohan 1; Hu, Desheng 6; Li, Yongwei 5; Zhang, Qingchen 2; Wang, Zhengxia 4; Unoki, Masashi 3; Akagi, Masato 3
Journal	IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING
Publication date	2023
Volume	31 Pages: 2534-2547
Keywords	Affective computing ; speech emotion recognition ; acoustic representation ; music theory and speech analysis
ISSN	2329-9290
DOI	10.1109/TASLP.2023.3289312
Corresponding author	Li, Xingfeng(lixingfeng@hainanu.edu.cn)
Abstract	This research presents a music theory-inspired acoustic representation (hereafter, MTAR) to address improved speech emotion recognition. The recognition of emotion in speech and music is developed in parallel, yet a relatively limited understanding of MTAR for interpreting speech emotions is involved. In the present study, we use music theory to study representative acoustics associated with emotion in speech from vocal emotion expressions and auditory emotion perception domains. In experiments assessing the role and effectiveness of the proposed representation in classifying discrete emotion categories and predicting continuous emotion dimensions, it shows promising performance compared with extensively used features for emotion recognition based on the spectrogram, Mel-spectrogram, Mel-frequency cepstral coefficients, VGGish, and the large baseline feature sets of the INTERSPEECH challenges. This proposal opens up a novel research avenue in developing a computational acoustic representation of speech emotion via music theory.
WOS keywords	PERCEPTION ; EXPRESSION ; PATTERNS ; FEATURES ; PITCH ; PERSPECTIVE ; MODALITIES ; KNOWLEDGE ; INTERVALS ; COGNITION
Funding projects	Key Research and Development Program of Hainan Province[ZDYF2021GXJS017] ; National Natural Science Foundation of China[82160345] ; National Natural Science Foundation of China[62201571] ; Key Science and Technology Plan Project of Haikou[2021-016]
WOS research areas	Acoustics ; Engineering
Language	English
WOS accession number	WOS:001025466100003
Publisher	IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
Funding organizations	Key Research and Development Program of Hainan Province ; National Natural Science Foundation of China ; Key Science and Technology Plan Project of Haikou
Source URL	[http://ir.ia.ac.cn/handle/173211/53768]
Collection	National Laboratory of Pattern Recognition_Intelligent Interaction
Corresponding author	Li, Xingfeng
Author affiliations	1. Nagoya Univ, Sch Informat Sci, Nagoya 4648601, Japan ; 2. Hainan Univ, Grad Sch Comp Sci & Technol, Haikou 570288, Peoples R China ; 3. Japan Adv Inst Sci & Technol, Sch Informat Sci, Nomi 9231292, Japan ; 4. Hainan Univ, Sch Comp Sci & Technol, Haikou 570288, Peoples R China ; 5. Chinese Acad Sci, Inst Automat, Beijing 100190, Peoples R China ; 6. Taiyuan Univ Technol, Coll Informat & Comp, Taiyuan 030024, Peoples R China
Recommended citation GB/T 7714	Li, Xingfeng,Shi, Xiaohan,Hu, Desheng,et al. Music Theory-Inspired Acoustic Representation for Speech Emotion Recognition[J]. IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING,2023,31:2534-2547.
APA	Li, Xingfeng.,Shi, Xiaohan.,Hu, Desheng.,Li, Yongwei.,Zhang, Qingchen.,...&Akagi, Masato.(2023).Music Theory-Inspired Acoustic Representation for Speech Emotion Recognition.IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING,31,2534-2547.
MLA	Li, Xingfeng,et al."Music Theory-Inspired Acoustic Representation for Speech Emotion Recognition".IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING 31(2023):2534-2547.

Ingest method: OAI harvesting

Source: Institute of Automation
