SpecMNet: Spectrum mend network for monaural speech enhancement
Document type: Journal article
Author | Fan, Cunhang (affiliation 5) |
Journal | APPLIED ACOUSTICS |
Publication date | 2022-06-15 |
Volume | 194; pages: 9 |
Keywords | Monaural speech enhancement ; Speech distortion ; Spectrum mend network ; SI-SNR ; BLSTM |
ISSN | 0003-682X |
DOI | 10.1016/j.apacoust.2022.108792 |
Corresponding authors | Yi, Jiangyan (jiangyan.yi@nlpr.ia.ac.cn) ; Lv, Zhao (kjlz@ahu.edu.cn) ; Tao, Jianhua (jhtao@nlpr.ia.ac.cn) |
Abstract (English) | Speech enhancement methods usually suffer from the speech distortion problem, which causes the enhanced speech to lose a great deal of significant speech information. This damages speech quality and intelligibility. To address this issue, we propose a spectrum mend network (SpecMNet) for monaural speech enhancement. SpecMNet aims to retrieve the lost information by mending the weighted enhanced spectrum with the weighted original spectrum. More specifically, the proposed algorithm consists of a pre-enhancement network and a mend network. The main task of the pre-enhancement network is to produce a pre-enhanced spectrum from which most of the noise has been removed. Because of the speech distortion problem, however, the pre-enhanced spectrum loses a great deal of speech components, whereas the original spectrum has lost no speech information. Therefore, we use the original spectrum to mend the pre-enhanced spectrum by adding the two weighted spectra, so that the lost speech information can be retrieved. The mend network predicts the mend weights for these two spectra. Finally, the mended spectrum is used as the enhanced output. Our experiments are conducted on the TIMIT + (100 Nonspeech Sounds and NOISEX-92) datasets. Experimental results demonstrate that the proposed SpecMNet approach is effective in alleviating the speech distortion problem. (c) 2022 Elsevier Ltd. All rights reserved. |
WOS keywords | NEURAL-NETWORK ; NOISE |
Funding projects | National Key Research and Development Program of China[2021ZD0201502] ; National Natural Science Foundation of China (NSFC)[61972437] ; Open Research Projects of Zhejiang Lab[2021KH0AB06] ; Open Projects Program of National Laboratory of Pattern Recognition[202200014] |
WOS research area | Acoustics |
Language | English |
WOS accession number | WOS:000798344800011 |
Publisher | ELSEVIER SCI LTD |
Funding organizations | National Key Research and Development Program of China ; National Natural Science Foundation of China (NSFC) ; Open Research Projects of Zhejiang Lab ; Open Projects Program of National Laboratory of Pattern Recognition |
Source URL | http://ir.ia.ac.cn/handle/173211/49551 |
Collection | National Laboratory of Pattern Recognition_Intelligent Interaction |
Corresponding authors | Yi, Jiangyan; Lv, Zhao; Tao, Jianhua |
Author affiliations | 1. Natl Inst Informat & Commun Technol, Kyoto, Japan ; 2. Univ Chinese Acad Sci, Beijing 100049, Peoples R China ; 3. Chinese Acad Sci, Inst Automat, NLPR, Beijing 100190, Peoples R China ; 4. Artificial Intelligence Res Inst, Zhejiang Lab, Hangzhou 311121, Peoples R China ; 5. Anhui Univ, Sch Comp Sci & Technol, Anhui Prov Key Lab Multimodal Cognit Computat, Hefei 230601, Peoples R China |
Recommended citation, GB/T 7714 | Fan, Cunhang,Zhang, Hongmei,Yi, Jiangyan,et al. SpecMNet: Spectrum mend network for monaural speech enhancement[J]. APPLIED ACOUSTICS,2022,194:9. |
APA | Fan, Cunhang.,Zhang, Hongmei.,Yi, Jiangyan.,Lv, Zhao.,Tao, Jianhua.,...&Li, Sheng.(2022).SpecMNet: Spectrum mend network for monaural speech enhancement.APPLIED ACOUSTICS,194,9. |
MLA | Fan, Cunhang,et al."SpecMNet: Spectrum mend network for monaural speech enhancement".APPLIED ACOUSTICS 194(2022):9. |
Ingest method: OAI harvesting
Source: Institute of Automation
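
The mending step described in the abstract (adding the weighted pre-enhanced spectrum and the weighted original spectrum) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the function name `mend_spectrum`, the per-bin elementwise weighting, and the toy weight values are hypothetical. In the paper, the mend weights are predicted by the mend network, whose architecture this record does not describe.

```python
from typing import List

# A magnitude spectrum laid out as [frames][frequency bins].
Spectrum = List[List[float]]


def mend_spectrum(pre_enhanced: Spectrum, original: Spectrum,
                  w_enhanced: Spectrum, w_original: Spectrum) -> Spectrum:
    """Return w_enhanced * pre_enhanced + w_original * original per bin.

    Assumption: weights are applied elementwise, one weight per
    time-frequency bin. The abstract only says the two weighted
    spectra are added.
    """
    mended = []
    for pe_row, or_row, we_row, wo_row in zip(
            pre_enhanced, original, w_enhanced, w_original):
        mended.append([we * pe + wo * o
                       for pe, o, we, wo in zip(pe_row, or_row, we_row, wo_row)])
    return mended


# Toy example: one frame, three frequency bins.
original = [[1.0, 0.8, 0.5]]      # noisy input magnitudes
pre_enhanced = [[0.6, 0.1, 0.0]]  # noise removed, but some speech lost too
w_enhanced = [[1.0, 1.0, 1.0]]    # hypothetical weights (network-predicted in the paper)
w_original = [[0.0, 0.5, 0.2]]

mended = mend_spectrum(pre_enhanced, original, w_enhanced, w_original)
```

In this toy case the second and third bins, which the pre-enhancement stage suppressed heavily, recover part of their energy from the original spectrum, while the first bin is left as the pre-enhanced value.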