Chinese Academy of Sciences Institutional Repositories Grid

Investigation of knowledge transfer approaches to improve the acoustic modeling of Vietnamese ASR system

Document type: Journal article


Authors	Liu, DY (Liu, Danyang)[ 1,2 ]; Xu, J (Xu, Ji)[ 1 ]; Zhang, PY (Zhang, Pengyuan)[ 1,2 ]; Yan, YH (Yan, Yonghong)[ 1,2,3 ]
Journal	IEEE-CAA JOURNAL OF AUTOMATICA SINICA
Publication date	2019
Volume	6 Issue: 5 Pages: 1187-1195
Keywords	Bottleneck feature (BNF); cross-lingual automatic speech recognition (ASR); progressive neural networks (Prognets) model; transfer learning
ISSN	2329-9266
DOI	10.1109/JAS.2019.1911693
Abstract	It is well known that automatic speech recognition (ASR) is a resource-consuming task: training a state-of-the-art deep neural network acoustic model takes a large amount of data. For low-resource languages where scripted speech is difficult to obtain, data sparsity is the main problem limiting the performance of speech recognition systems. In this paper, several knowledge transfer methods are investigated to overcome the data sparsity problem with the help of high-resource languages. The first is a pre-training and fine-tuning (PT/FT) method, in which the parameters of the hidden layers are initialized from a well-trained neural network. The second is progressive neural networks (Prognets). Thanks to the lateral connections in the network architecture, Prognets are immune to the forgetting effect and superior in knowledge transfer. Finally, bottleneck features (BNF) are extracted using cross-lingual deep neural networks and serve as enhanced features to improve the performance of the ASR system. Experiments are conducted on a low-resource Vietnamese dataset. The results show that all three methods yield significant gains over the baseline system, and that the Prognets acoustic model performs best. Further improvements can be obtained by combining the Prognets model with bottleneck features.
Language	English
WOS accession number	WOS:000489759800010
Source URL	[http://ir.xjipc.cas.cn/handle/365002/7218]
Collection	Xinjiang Technical Institute of Physics and Chemistry_Multilingual Information Technology Research Laboratory
Author affiliations	1. Chinese Acad Sci, Xinjiang Tech Inst Phys & Chem, Xinjiang Lab Minor Speech & Language Informat Pro, Urumqi 830011, Peoples R China; 2. Univ Chinese Acad Sci, Sch Elect Elect & Commun Engn, Beijing 101408, Peoples R China; 3. Chinese Acad Sci, Inst Acoust, Key Lab Speech Acoust & Content Understanding, Beijing 100190, Peoples R China
Recommended citation GB/T 7714	Liu, DY, Xu, J, Zhang, PY, et al. Investigation of knowledge transfer approaches to improve the acoustic modeling of Vietnamese ASR system[J]. IEEE-CAA JOURNAL OF AUTOMATICA SINICA, 2019, 6(5): 1187-1195.
APA	Liu, DY, Xu, J, Zhang, PY, & Yan, YH. (2019). Investigation of knowledge transfer approaches to improve the acoustic modeling of Vietnamese ASR system. IEEE-CAA JOURNAL OF AUTOMATICA SINICA, 6(5), 1187-1195.
MLA	Liu, DY, et al. "Investigation of knowledge transfer approaches to improve the acoustic modeling of Vietnamese ASR system." IEEE-CAA JOURNAL OF AUTOMATICA SINICA 6.5 (2019): 1187-1195.
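
The three transfer methods named in the abstract can be illustrated with a toy forward pass. The following is only a minimal sketch under stated assumptions, not the authors' implementation: this record does not give the paper's network topology, so the layer sizes, the `Column` class, the helper names, and the lateral weight matrices `U2` and `U3` are all hypothetical, and training (back-propagation) is omitted. It uses plain Python lists so that it runs without any toolkit.

```python
import copy
import random


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def relu(v):
    return [max(0.0, a) for a in v]


def add(a, b):
    return [p + q for p, q in zip(a, b)]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class Column:
    """Hypothetical acoustic model: input -> hidden -> bottleneck -> output."""

    def __init__(self, n_in, n_hid, n_bn, n_out, rng):
        self.W1 = rand_matrix(n_hid, n_in, rng)
        self.W2 = rand_matrix(n_bn, n_hid, rng)
        self.W3 = rand_matrix(n_out, n_bn, rng)

    def hidden(self, x):
        h1 = relu(matvec(self.W1, x))
        h2 = relu(matvec(self.W2, h1))  # bottleneck activations
        return h1, h2


def init_by_pretraining(source, n_out_target, rng):
    """PT/FT: copy the source hidden layers; the output layer is new.

    Fine-tuning on target-language data would then update all weights.
    """
    n_in, n_hid, n_bn = len(source.W1[0]), len(source.W1), len(source.W2)
    target = Column(n_in, n_hid, n_bn, n_out_target, rng)
    target.W1 = copy.deepcopy(source.W1)
    target.W2 = copy.deepcopy(source.W2)
    return target


def prognet_forward(source, target, U2, U3, x):
    """Prognets: the frozen source column feeds the target column through
    lateral connections (U2, U3); the source weights are never modified."""
    s1, s2 = source.hidden(x)
    t1 = relu(matvec(target.W1, x))
    t2 = relu(add(matvec(target.W2, t1), matvec(U2, s1)))
    return add(matvec(target.W3, t2), matvec(U3, s2))


def append_bnf(source, x):
    """BNF: concatenate cross-lingual bottleneck activations to the input."""
    _, bottleneck = source.hidden(x)
    return x + bottleneck


rng = random.Random(0)
source = Column(n_in=4, n_hid=6, n_bn=3, n_out=5, rng=rng)  # high-resource model
frame = [0.1, -0.2, 0.3, 0.4]                               # one toy feature frame

finetune_start = init_by_pretraining(source, n_out_target=2, rng=rng)
target = Column(n_in=4, n_hid=6, n_bn=3, n_out=2, rng=rng)
U2, U3 = rand_matrix(3, 6, rng), rand_matrix(2, 3, rng)
logits = prognet_forward(source, target, U2, U3, frame)
enhanced = append_bnf(source, frame)
```

In this sketch the difference between the methods is visible in what happens to the source weights: PT/FT copies them and would then overwrite the copy during fine-tuning, whereas the Prognets forward pass only reads them, which is the property the abstract describes as immunity to the forgetting effect. The combination mentioned in the last sentence of the abstract would correspond to feeding `enhanced` rather than `frame` into a Prognets model whose input size is enlarged accordingly.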

Ingestion method: OAI harvesting

Source: Xinjiang Technical Institute of Physics and Chemistry

