EmotionNAS: Two-stream Neural Architecture Search for Speech Emotion Recognition
Document type: Conference paper
Authors | Haiyang Sun3,4; Zheng Lian3 |
Publication date | 2023 |
Conference date | 20-24 August 2023 |
Conference location | Dublin, Ireland |
English abstract | Speech emotion recognition (SER) is an important research topic in human-computer interaction. Existing works mainly rely on human expertise to design models. Despite their success, different datasets often require distinct structures and hyperparameters. Searching for an optimal model for each dataset is time-consuming and labor-intensive. To address this problem, we propose a two-stream neural architecture search (NAS) based framework, called “EmotionNAS”. Specifically, we take two-stream features (i.e., handcrafted and deep features) as the inputs, followed by NAS to search for the optimal structure for each stream. Furthermore, we incorporate complementary information in different streams through an efficient information supplement module. Experimental results demonstrate that our method outperforms existing manually-designed and NAS-based models, setting the new state-of-the-art record. |
Source URL | [http://ir.ia.ac.cn/handle/173211/57089] |
Collection | National Laboratory of Pattern Recognition_Intelligent Interaction |
Corresponding authors | Bin Liu; Jianhua Tao |
Author affiliations | 1.Ant Group 2.Department of Automation, Tsinghua University 3.State Key Laboratory of Multimodal Artificial Intelligence Systems 4.School of Artificial Intelligence, University of Chinese Academy of Sciences |
Recommended citation (GB/T 7714) | Haiyang Sun, Zheng Lian, Bin Liu, et al. EmotionNAS: Two-stream Neural Architecture Search for Speech Emotion Recognition[C]. In:. Dublin, Ireland. 20-24 August 2023. |
Ingestion method: OAI harvesting
Source: Institute of Automation