|
Authors | Zou, Yuxiang (1,2); Dong, Linhao (1,2); Xu, Bo (1)
|
Publication date | 2019
|
Conference dates | 2019.9.15-2019.9.19
|
Conference location | Austria
|
English abstract | Recent character-based end-to-end text-to-speech (TTS) systems
have shown promising performance in natural speech generation,
especially for English. However, for Chinese TTS, the
character-based model tends to generate speech with wrong
pronunciation due to the label sparsity issue. To address this
issue, we introduce an additional learning task of character-to-pinyin
mapping to boost the pronunciation learning of characters,
and leverage a pre-trained dictionary network to correct
pronunciation mistakes through joint training. Specifically, our
model predicts pinyin labels as an auxiliary task to help learn
better hidden representations of Chinese characters, where
pinyin is a standard phonetic representation for Chinese characters.
The dictionary network acts as a tutor to further
help hidden representation learning. Experiments demonstrate
that employing the pinyin auxiliary task and an external dictionary
network clearly enhances the naturalness and intelligibility
of the speech synthesized directly from Chinese character sequences. |
Source URL | [http://ir.ia.ac.cn/handle/173211/39136]  |
Collection | Research Center for Digital Content Technology and Services_Auditory Models and Cognitive Computing
|
Corresponding author | Xu, Bo |
Affiliations | 1. Institute of Automation, Chinese Academy of Sciences 2. University of Chinese Academy of Sciences
|
Recommended citation (GB/T 7714) |
Zou, Yuxiang, Dong, Linhao, Xu, Bo. Boosting Character-Based Chinese Speech Synthesis via Multi-Task Learning and Dictionary Tutoring[C]. In: . Austria. 2019.9.15-2019.9.19.
|
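
Illustrative note: the abstract describes joint training with a pinyin auxiliary task and a dictionary "tutor", but this record does not state the loss functions or weights. The sketch below is only one plausible way to combine such terms, not the authors' formulation. The cross-entropy form of the pinyin loss, the mean-squared-error form of the tutoring loss, the weights `pinyin_weight` and `tutor_weight`, and every function name are assumptions made for illustration.

```python
import math


def cross_entropy(logits, target_index):
    """Cross-entropy of one softmax prediction (assumed form of the pinyin loss)."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - logits[target_index]


def mean_squared_error(a, b):
    """Mean squared distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def joint_loss(tts_loss, pinyin_logits, pinyin_targets,
               char_hidden, tutor_hidden,
               pinyin_weight=1.0, tutor_weight=1.0):
    """Hypothetical joint objective: TTS loss + pinyin auxiliary loss + tutoring loss.

    pinyin_logits / pinyin_targets: one prediction and one label per character.
    char_hidden / tutor_hidden: per-character hidden vectors from the TTS encoder
    and from the pre-trained dictionary network (both assumed to share a dimension).
    """
    pinyin_loss = sum(
        cross_entropy(l, t) for l, t in zip(pinyin_logits, pinyin_targets)
    ) / len(pinyin_targets)
    tutor_loss = sum(
        mean_squared_error(h, d) for h, d in zip(char_hidden, tutor_hidden)
    ) / len(char_hidden)
    return tts_loss + pinyin_weight * pinyin_loss + tutor_weight * tutor_loss
```

Under these assumptions, setting `pinyin_weight` or `tutor_weight` to zero recovers a model trained without the corresponding component, which is how the contribution of each term could be compared.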