Chinese Academy of Sciences Institutional Repositories Grid
Jointly Adversarial Enhancement Training for Robust End-to-End Speech Recognition

Document type: Conference paper

Authors: Bin Liu [1,6]; Shuai Nie [6]; Shan Liang [6]; Wenju Liu [6]; Meng Yu [5]; Lianwu Chen [4]; Shouye Peng [3]; Changliang Li [2]; Liang, Shan; Liu, Bin
Publication date: 2019-09
Conference date: 2019-9-15
Conference location: Graz, Austria
Keywords: End-to-end Speech Recognition; Robust Speech Recognition; Speech Enhancement; Generative Adversarial Networks
Abstract

Recently, end-to-end systems have made significant breakthroughs
in the field of speech recognition. However, a single
end-to-end architecture is not especially robust to input
variations caused by noise and reverberation, which leads to
dramatic performance degradation in real-world conditions. To alleviate
this issue, the mainstream approach is to use a well-designed
speech enhancement module as the front end of ASR. However,
enhancement modules can introduce speech distortions
and mismatches with the training conditions, which sometimes degrade ASR
performance. In this paper, we propose jointly adversarial
enhancement training to boost the robustness of end-to-end systems.
Specifically, during training we jointly compose a mask-based
enhancement network, an attention-based encoder-decoder
network, and a discriminant network. The discriminator
distinguishes between the enhanced features produced by the
enhancement network and clean features, which guides the
enhancement network to produce outputs closer to the realistic distribution.
With the joint optimization of the recognition, enhancement, and
adversarial losses, the compositional scheme is expected to learn
more robust representations for the recognition task automatically.
Systematic experiments on AISHELL-1 show that the
proposed method improves the noise robustness of end-to-end
systems and achieves a relative error rate reduction of 4.6%
over multi-condition training.
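
The joint objective described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the loss weights `w_enh` and `w_adv`, the use of mean squared error as the enhancement loss, and the non-saturating form of the adversarial loss are all assumptions made for illustration, since the abstract does not specify them. The recognition loss and the discriminator score are treated as given numbers rather than computed by real networks.

```python
import math

def apply_mask(noisy, mask):
    """Mask-based enhancement: scale each noisy feature by its mask value."""
    return [[m * x for m, x in zip(mrow, xrow)]
            for mrow, xrow in zip(mask, noisy)]

def mse(a, b):
    """Mean squared error between two equally shaped feature matrices.
    (Assumed enhancement loss; the abstract does not name one.)"""
    diffs = [(x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum(diffs) / len(diffs)

def generator_adversarial_loss(d_score_enhanced, eps=1e-12):
    """Assumed non-saturating GAN loss for the enhancement network:
    -log D(enhanced), where D(.) is the discriminator's probability
    that its input is a clean feature."""
    return -math.log(d_score_enhanced + eps)

def joint_loss(asr_loss, enh_loss, adv_loss, w_enh=1.0, w_adv=1.0):
    """Weighted sum of recognition, enhancement and adversarial losses.
    The weights w_enh and w_adv are hypothetical hyperparameters."""
    return asr_loss + w_enh * enh_loss + w_adv * adv_loss

# Toy features: 2 frames x 2 feature bins.
noisy = [[2.0, 4.0], [1.0, 3.0]]
mask = [[0.5, 0.5], [1.0, 1.0]]   # would come from the enhancement network
clean = [[1.0, 2.0], [1.0, 2.0]]

enhanced = apply_mask(noisy, mask)
enh_loss = mse(enhanced, clean)
adv_loss = generator_adversarial_loss(0.5)  # discriminator score is assumed
total = joint_loss(asr_loss=1.5, enh_loss=enh_loss, adv_loss=adv_loss)
print(enhanced, enh_loss, adv_loss, total)
```

In a real system the three terms would be backpropagated together through the enhancement and recognition networks, while the discriminator is trained with its own objective to separate enhanced from clean features.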

Proceedings publisher: ISCA
Proceedings place of publication: Austria
Language: English
Funding: National Natural Science Foundation of China [61573357]; National Natural Science Foundation of China [61503382]; National Natural Science Foundation of China [61403370]; National Natural Science Foundation of China [61273267]; National Natural Science Foundation of China [91120303]
Source URL: [http://ir.ia.ac.cn/handle/173211/38561]
Collection: National Laboratory of Pattern Recognition_Intelligent Interaction
Author affiliations:
1. School of Artificial Intelligence, University of Chinese Academy of Sciences, China
2. Kingsoft AI Lab, China
3. Xueersi Online School, China
4. Tencent AI Lab, Shenzhen, China
5. Tencent AI Lab, Bellevue, WA, USA
6. National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, China
Recommended citation
GB/T 7714
Bin Liu, Shuai Nie, Shan Liang, et al. Jointly Adversarial Enhancement Training for Robust End-to-End Speech Recognition[C]. In: . Graz, Austria. 2019-9-15.

Ingest method: OAI harvesting

Source: Institute of Automation


Unless otherwise stated, all content in this system is protected by copyright, and all rights are reserved.