Chinese Academy of Sciences Institutional Repositories Grid: Multi-feature deep supervised voiceprint adversarial network for depression recognition from speech

Multi-feature deep supervised voiceprint adversarial network for depression recognition from speech

Document type: Journal article


Authors	Pan, Yuchen 5; Shang, Yuanyuan 2,5; Wang, Wei 4,5; Shao, Zhuhong 1,5; Han, Zhuojin 5; Liu, Tie 1,5; Guo, Guodong 3; Ding, Hui 1,5
Journal	BIOMEDICAL SIGNAL PROCESSING AND CONTROL
Publication date	2024-03-01
Volume	89 Pages: 15
Keywords	Adversarial learning; Audio processing; Attention mechanism; Deep neural network; Depression recognition; Feature enhancement
ISSN	1746-8094
DOI	10.1016/j.bspc.2023.105704
Abstract	Depression can induce a range of physiological effects, leading to notable distinctions in the acoustic characteristics exhibited by individuals with depression as opposed to those without. Designing efficient algorithms to accurately identify depression through speech poses a formidable challenge. In this paper, we propose the Multi-Feature Deep Supervised Voiceprint Adversarial Network (MFDS-VAN) for audio-based depression recognition. The MFDS-VAN assimilates extracted acoustic features and the audio waveform, subsequently generating predictions regarding the depression score. In order to attain more robust and discriminative spatial-temporal features associated with depression, the Encoding Network module merges long-term and short-term acoustic features with the unprocessed audio waveform, while the Regression Network module enables prediction of the depression score. The Deep Supervised Regression algorithm is designed by combining GE2E clustering and Huber regression for better network optimization. Furthermore, to enhance the representation of the MFDS-VAN while diminishing the influence of individual voiceprint information, we propose the Voiceprint Adversarial Network. Experimental results conducted on AVEC 2013, AVEC 2014, and AVEC 2017 datasets demonstrate that the MFDS-VAN significantly enhances robustness and performance in speech-based depression recognition. Our model achieves competitive results when compared to recent audio-based methodologies.
Funding	National Natural Science Foundation of China[61876112] ; National Natural Science Foundation of China[61601311] ; Natural Science Foundation of Beijing, China[L201022]
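WOS research area	Engineering
Language	English
WOS accession number	WOS:001116988300001
Publisher	ELSEVIER SCI LTD
Source URL	[http://119.78.100.204/handle/2XEOYT63/38489]
Collection	Institute of Computing Technology, Chinese Academy of Sciences: Journal Articles (English)
Corresponding authors	Shang, Yuanyuan; Wang, Wei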
Author affiliations	1. Beijing Key Lab Elect Syst Reliabil Technol, Beijing 100048, Peoples R China; 2. Beijing Adv Innovat Ctr Imaging Technol, Beijing 100048, Peoples R China; 3. West Virginia Univ, Lane Dept Comp Sci & Elect Engn, Morgantown, WV 26506 USA; 4. Chinese Acad Sci, Inst Comp Technol, Beijing 100000, Peoples R China; 5. Capital Normal Univ, Coll Informat Engn, Beijing 100048, Peoples R China
Recommended citation (GB/T 7714)	Pan, Yuchen,Shang, Yuanyuan,Wang, Wei,et al. Multi-feature deep supervised voiceprint adversarial network for depression recognition from speech[J]. BIOMEDICAL SIGNAL PROCESSING AND CONTROL,2024,89:15.
APA	Pan, Yuchen.,Shang, Yuanyuan.,Wang, Wei.,Shao, Zhuhong.,Han, Zhuojin.,...&Ding, Hui.(2024).Multi-feature deep supervised voiceprint adversarial network for depression recognition from speech.BIOMEDICAL SIGNAL PROCESSING AND CONTROL,89,15.
MLA	Pan, Yuchen,et al."Multi-feature deep supervised voiceprint adversarial network for depression recognition from speech".BIOMEDICAL SIGNAL PROCESSING AND CONTROL 89(2024):15.
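
The abstract names Huber regression as one ingredient of the Deep Supervised Regression algorithm but gives no formula. As a minimal sketch only, the textbook Huber loss is shown below; this is not the authors' implementation, and the threshold `delta`, the function names, and the sample scores are assumptions introduced for illustration.

```python
def huber_loss(pred, target, delta=1.0):
    """Textbook Huber loss for a single prediction.

    Quadratic for small errors, linear for large ones.
    `delta` is an assumed threshold, not a value from the paper.
    """
    err = abs(pred - target)
    if err <= delta:
        return 0.5 * err * err
    return delta * (err - 0.5 * delta)


def mean_huber(preds, targets, delta=1.0):
    """Average Huber loss over paired predictions and targets."""
    if len(preds) != len(targets) or not preds:
        raise ValueError("preds and targets must be non-empty and equal length")
    total = sum(huber_loss(p, t, delta) for p, t in zip(preds, targets))
    return total / len(preds)


if __name__ == "__main__":
    # Hypothetical depression scores, for illustration only.
    predicted = [10.0, 20.0]
    actual = [10.5, 23.0]
    print(mean_huber(predicted, actual))
```

Because the loss grows only linearly beyond `delta`, a single badly mispredicted score pulls the average less than it would under squared error, which is the usual reason for choosing Huber loss in score regression.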

Ingest method: OAI harvesting

Source: Institute of Computing Technology
