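Chinese Academy of Sciences Institutional Repositories Grid: Several Studies and Improvements on the Acoustic Model in a Text-Independent Pronunciation Quality Assessment System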

Chinese Academy of Sciences Institutional Repositories Grid

Several Studies and Improvements on the Acoustic Model in a Text-Independent Pronunciation Quality Assessment System

Document type: Journal article


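Authors	Jiang Tonghai (蒋同海); Qi Yaohui (齐耀辉); Ge Fengpei (葛凤培); Yan Yonghong (颜永红)
Journal	网络新媒体技术 (Journal of Network New Media)
Publication date	2012
Volume	1 Issue: 2 Pages: 47-53
Keywords	text-independent; pronunciation quality assessment; acoustic model; MAP; speaker-based cepstral mean and variance normalization
ISSN	2095-347X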
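Abstract (Chinese, translated)	In a text-independent pronunciation quality assessment system, the spoken content of the test speech must first be recognized before it can be assessed accurately. Real evaluation data often contain many factors that degrade recognition accuracy, including noise, dialectal accent, channel noise, and a casual speaking style. To address these factors, this paper studies the acoustic model in depth: background noise is added to the training data to make the model more robust to noise; speaker-based cepstral mean and variance normalization (SCMVN) is adopted to reduce the influence of the channel and of individual speaker characteristics; maximum a posteriori (MAP) adaptation is performed with read speech from the same region as the test speech, so that the model captures the pronunciation characteristics of the local dialectal accent; and MAP adaptation is performed with spontaneous spoken data, so that the model better describes the more casual pronunciation found in natural speech. Experimental results show that with these measures the recognition accuracy on the test speech improves by 44.1% relative, which in turn raises the correlation coefficient between machine scores and expert scores by 6.3% relative.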
Abstract (English)	To give an accurate assessment in text-independent pronunciation quality assessment, the text of the test speech must first be recognized. Real evaluation data contain several adverse factors that lower recognition accuracy, such as noise, accent, channel noise, and a spontaneous speaking style. In this paper we address these factors by improving the acoustic model of the speech recognition system. Background noise is added to the training data to improve noise robustness. Speaker-based Cepstral Mean and Variance Normalization (SCMVN) is adopted to alleviate channel distortion and the impact of inter-speaker variability. Maximum a Posteriori (MAP) adaptation is performed with read speech from the same region as the test data, tuning the acoustic model to match the pronunciation characteristics of the accent. MAP adaptation is also performed with spontaneous speech, tuning the acoustic model to describe the spontaneous style of spoken language. According to the experimental results, the word correct rate of speech recognition improves by 44.1% relative, and the correlation coefficient between machine and expert scores improves by 6.3% relative.
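Date available	2013-05-03
Source URL	[http://ir.xjipc.cas.cn/handle/365002/2420]
Collection	Xinjiang Technical Institute of Physics and Chemistry_Multilingual Information Technology Research Laboratory
Author affiliations	Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences; Key Laboratory of Speech Acoustics and Content Understanding, Chinese Academy of Sciences; School of Information and Electronics, Beijing Institute of Technology; College of Physics Science and Information Engineering, Hebei Normal University
Recommended citation GB/T 7714	Jiang Tonghai, Qi Yaohui, Ge Fengpei, et al. Several studies and improvements on the acoustic model in a text-independent pronunciation quality assessment system[J]. 网络新媒体技术, 2012, 1(2): 47-53.
APA	Jiang, T., Qi, Y., Ge, F., & Yan, Y. (2012). Several studies and improvements on the acoustic model in a text-independent pronunciation quality assessment system. 网络新媒体技术, 1(2), 47-53.
MLA	Jiang, Tonghai, et al. "Several Studies and Improvements on the Acoustic Model in a Text-Independent Pronunciation Quality Assessment System." 网络新媒体技术 1.2 (2012): 47-53.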
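
The abstract names two acoustic-model techniques, SCMVN and MAP adaptation, without giving formulas. The sketch below illustrates the textbook form of each and is not taken from the paper: `scmvn` normalizes every cepstral dimension to zero mean and unit variance over all frames of one speaker, and `map_adapt_mean` applies the standard relevance-factor MAP update to a single Gaussian mean with hard frame assignments. The function names, the relevance factor `tau`, and the plain-list data layout are assumptions made for illustration only.

```python
from statistics import fmean, pstdev


def scmvn(utterances, eps=1e-8):
    """Speaker-based cepstral mean and variance normalization (sketch).

    utterances: all utterances of ONE speaker; each utterance is a list of
    frames, each frame a list of cepstral coefficients (floats).
    Statistics are pooled over every frame of the speaker, not per utterance.
    """
    frames = [frame for utt in utterances for frame in utt]
    dims = len(frames[0])
    means = [fmean([f[d] for f in frames]) for d in range(dims)]
    stds = [pstdev([f[d] for f in frames]) for d in range(dims)]
    return [
        [[(f[d] - means[d]) / max(stds[d], eps) for d in range(dims)] for f in utt]
        for utt in utterances
    ]


def map_adapt_mean(prior_mean, frames, tau=10.0):
    """MAP update of one Gaussian mean (sketch, hard frame assignment).

    new_mean = (tau * prior_mean + sum of adaptation frames) / (tau + n)
    tau is a hypothetical relevance factor: large tau keeps the prior,
    small tau follows the adaptation data.
    """
    n = len(frames)
    dims = len(prior_mean)
    sums = [sum(f[d] for f in frames) for d in range(dims)]
    return [(tau * prior_mean[d] + sums[d]) / (tau + n) for d in range(dims)]


if __name__ == "__main__":
    speaker = [[[1.0, 10.0], [3.0, 30.0]]]  # one utterance, two frames
    print(scmvn(speaker))
    print(map_adapt_mean([0.0], [[2.0], [2.0]], tau=2.0))
```

In a full recognizer the MAP update would use soft occupation probabilities per Gaussian component rather than hard counts; the single-Gaussian form is used here only to keep the arithmetic visible.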

Ingest method: OAI harvesting

Source: Xinjiang Technical Institute of Physics and Chemistry

Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.