Chinese Academy of Sciences Institutional Repositories Grid
Investigation of Stacked Deep Neural Networks and Mixture Density Networks for Acoustic-to-Articulatory Inversion

Document type: Conference paper

Authors: Xie, Xurong; Liu, Xunying; Lee Tan; Wang Lan
Publication date: 2018
Conference date: 2018
Conference location: Taiwan
Abstract: Acoustic-to-articulatory inversion, which predicts articulatory movement from the acoustic signal, is useful for many applications such as talking heads, speech recognition, and education. DNN-based technologies have achieved state-of-the-art performance in this area. This paper investigates different stacked network architectures for acoustic-to-articulatory inversion. Two levels of DNNs or mixture density networks (MDNs) can be connected using different types of auxiliary features extracted from the first-level network, including bottleneck features, directly generated features, and articulatory features predicted via the MLPG algorithm. In the experiments, stacked systems using DNNs, time-delay DNNs (TDNNs), RNNs, and MDNs were evaluated on both the MNGU0 English EMA database and the AIMSL Chinese EMA database. On the default configuration of the MNGU0 data using LSF acoustic features, the proposed stacked system using feed-forward MDNs with ellipsoid variance and MLPG-generated features achieved an RMSE of 0.718 mm, which is similar to the RNN and RNN-MDN BLSTM systems, whose training is slower and more difficult.
Source URL: http://ir.siat.ac.cn:8080/handle/172644/13715
Collection: Shenzhen Institutes of Advanced Technology_Institute of Advanced Integration Technology (集成所)
Recommended citation
GB/T 7714
Xie, Xurong, Liu, Xunying, Lee Tan, et al. Investigation of Stacked Deep Neural Networks and Mixture Density Networks for Acoustic-to-Articulatory Inversion[C]. In: . Taiwan. 2018.

Ingestion method: OAI harvesting

Source: Shenzhen Institutes of Advanced Technology
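
The abstract reports its headline result as an RMSE in millimetres. As a rough illustration of how such a figure can be computed, the sketch below assumes that the error is taken per articulatory channel over all frames and then averaged across channels. That averaging convention, the function names `rmse_per_channel` and `mean_rmse`, and the toy values are assumptions made for this example and are not taken from the paper.

```python
import math


def rmse_per_channel(predicted, reference):
    """Return the RMSE of each articulatory channel.

    Both arguments are lists of frames; each frame is a list of channel
    values (assumed here to be EMA coordinates in millimetres).
    """
    if not predicted or len(predicted) != len(reference):
        raise ValueError("inputs must be non-empty with equal frame counts")
    n_frames = len(reference)
    n_channels = len(reference[0])
    squared_error_sums = [0.0] * n_channels
    for pred_frame, ref_frame in zip(predicted, reference):
        if len(pred_frame) != n_channels or len(ref_frame) != n_channels:
            raise ValueError("every frame must have the same channel count")
        for c in range(n_channels):
            diff = pred_frame[c] - ref_frame[c]
            squared_error_sums[c] += diff * diff
    # Root of the mean squared error, computed separately for each channel.
    return [math.sqrt(total / n_frames) for total in squared_error_sums]


def mean_rmse(predicted, reference):
    """Average the per-channel RMSE values (an assumed convention)."""
    per_channel = rmse_per_channel(predicted, reference)
    return sum(per_channel) / len(per_channel)


# Toy values invented purely for illustration: 2 frames, 2 channels.
reference = [[0.0, 1.0], [2.0, 3.0]]
predicted = [[1.0, 1.0], [1.0, 5.0]]
print(mean_rmse(predicted, reference))
```

With real data, `predicted` would hold the trajectories generated by the inversion system and `reference` the measured EMA trajectories for the same frames.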

