Chinese Academy of Sciences Institutional Repositories Grid
Phoneme dependent speaker embedding and model factorization for multi-speaker speech synthesis and adaptation

Document type: Conference paper

Authors: Fu, Ruibo1,3; Tao, Jianhua1,2,3; Wen, Zhengqi1; Zheng, Yibin1,3
Publication date: 2019-05
Conference date: May 12-17, 2019
Conference location: Brighton, UK
Keywords: speech synthesis; speaker adaptation; speaker embedding; phoneme representation
Pages: 6930-6934
Abstract

This paper presents an architecture for speaker adaptation in a long short-term memory (LSTM) based Mandarin statistical parametric speech synthesis system. Conventional methods rely on fixed, global, utterance-level speaker representations designed for the speaker recognition task. The proposed method instead extracts speaker representations at both the utterance level and the phoneme level, so that it can describe more phoneme-level pronunciation characteristics. An attention mechanism dynamically combines the representations from each level to train a task-specific, phoneme-dependent speaker embedding. To handle the unbalanced database and avoid over-fitting, the model is factorized into an average model and an adaptation model, which are combined by an attention mechanism. We investigate the performance of speaker representations extracted by different methods. Experimental results confirm the adaptability of the proposed speaker embedding and model factorization structure, and listening tests demonstrate that the proposed method achieves better adaptation performance than the baselines in terms of naturalness and speaker similarity.
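
As a rough illustration of the attention-based combination described in the abstract, the sketch below mixes one utterance-level and one phoneme-level speaker vector into a single embedding. It is not the authors' implementation: the scaled dot-product scoring, the function name `combine_speaker_representations`, and the `query` vector (standing in for a learned, phoneme-dependent attention query) are all assumptions, since the abstract does not specify the form of the attention.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length lists of floats."""
    return sum(x * y for x, y in zip(a, b))


def combine_speaker_representations(utt_vec, phone_vec, query):
    """Attention-weighted sum of an utterance-level and a phoneme-level vector.

    Assumption: scaled dot-product scoring against a hypothetical `query`;
    in a trained model the query would be learned, here it is supplied by hand.
    Returns the combined embedding and the two attention weights.
    """
    candidates = [utt_vec, phone_vec]
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, c) / scale for c in candidates])
    embedding = [
        sum(w * c[i] for w, c in zip(weights, candidates))
        for i in range(len(utt_vec))
    ]
    return embedding, weights


# Toy vectors: a query aligned with the utterance-level vector favours it.
embedding, weights = combine_speaker_representations(
    utt_vec=[1.0, 0.0], phone_vec=[0.0, 1.0], query=[2.0, 0.0]
)
```

With a zero query the two levels receive equal weight. In the setting the abstract describes, the weights would vary from phoneme to phoneme, which is what would make the resulting embedding phoneme dependent.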

Proceedings publisher: IEEE
Language: English
Source URL: [http://ir.ia.ac.cn/handle/173211/39591]
Collection: National Laboratory of Pattern Recognition_Intelligent Interaction
Corresponding author: Tao, Jianhua
Author affiliations: 1. National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences
2. School of Artificial Intelligence, University of Chinese Academy of Sciences
3. CAS Center for Excellence in Brain Science and Intelligence Technology
Recommended citation
GB/T 7714
Fu, Ruibo, Tao, Jianhua, Wen, Zhengqi, et al. Phoneme dependent speaker embedding and model factorization for multi-speaker speech synthesis and adaptation[C]. In: . Brighton, UK. May 12-17, 2019.

Ingest method: OAI harvesting

Source: Institute of Automation


Unless otherwise stated, all content in this system is protected by copyright, and all rights are reserved.