Investigation of Stacked Deep Neural Networks and Mixture Density Networks for Acoustic-to-Articulatory Inversion
Document type: Conference paper
Authors | Xie,Xurong; Liu,Xunying; Lee Tan; Wang Lan |
Publication date | 2018 |
Conference date | 2018 |
Conference location | Taiwan |
English abstract | Acoustic-to-articulatory inversion predicting articulatory movement based on the acoustic signal is useful for many applications like talking head, speech recognition, and education. DNN based technologies have achieved the state-of-the-art performance in the area. This paper investigates different stacked network architectures for acoustic-to-articulatory inversion. Two levels of DNNs or mixture density networks (MDNs) can be connected using different types of auxiliary features, including bottleneck features, directly generated features, and predicted articulatory features via MLPG algorithm extracted from the first level network. For the experiments, stacked systems using DNNs, time-delay DNNs (TDNNs), RNNs and MDNs were evaluated on both the MNGU0 English EMA database and AIMSL Chinese EMA database. Finally, on the default configurations of MNGU0 data using LSF acoustic features, the proposed stacked system using feed-forward MDNs with ellipsoid variance and MLPG generated features got 0.718 mm in RMSE, which is similar to the RNN and RNN-MDN BLSTM systems with slower and more difficult training stage. |
Source URL | [http://ir.siat.ac.cn:8080/handle/172644/13715] |
Collection | Shenzhen Institutes of Advanced Technology (深圳先进技术研究院)_集成所 |
Recommended citation (GB/T 7714) | Xie,Xurong,Liu,Xunying,Lee Tan,et al. Investigation of Stacked Deep Neural Networks and Mixture Density Networks for Acoustic-to-Articulatory Inversion[C]. In: . Taiwan. 2018. |
Ingestion method: OAI harvesting
Source: Shenzhen Institutes of Advanced Technology