Chinese Academy of Sciences Institutional Repositories Grid
Research on Key Techniques of Children's Speech Recognition (儿童语音识别中的关键技术研究)

Document type: Dissertation

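Author: Ma Ruitang (马瑞堂)
Degree type: Master of Engineering
Defense date: 2007-06-10
Degree-granting institution: Graduate School of the Chinese Academy of Sciences (中国科学院研究生院)
Place of conferral: Institute of Automation, Chinese Academy of Sciences
Supervisor: Li Chengrong (李成荣)
Keywords: children's speech recognition; children's speech analysis; vocal tract length normalization; human-machine speech interaction
Alternative title: Research on Key Techniques of Children's Speech Recognition
Degree discipline: Computer Application Technology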
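Chinese abstract (translated): After decades of painstaking exploration and research, speech recognition technology has advanced greatly and is gradually being applied in daily life. However, some problems in speech recognition, children's speech recognition in particular, remain a major obstacle to its wider adoption. In our deployed system, we found that 84% of the speech data came from children, yet a system trained on adult speech suffers a sharp drop in recognition performance when applied to children's speech. This thesis focuses on key techniques for children's speech recognition. The work falls into the following areas: 1. Analysis of the characteristics of children's speech. Using an existing children's speech database, we extracted the fundamental frequency and formants, analyzed how children's speech varies with age, and identified the differences between children's and adult speech. 2. Study of adaptation techniques for children's speech. We compared the performance of models trained separately on male voices, female voices, and mixed speech, and applied speaker adaptation based on vocal tract length normalization to children's speech recognition. Building on this, we proposed a method based on dynamic adjustment of a ratio threshold, which further improved the recognition rate. 3. Study of human-machine speech interaction techniques and modules. We describe the DSP platform, recognition system optimization, dialogue management, and related techniques, as well as applications of the interaction module.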
English abstract: Speech recognition technology has matured considerably after many years of research and is now used in daily life. However, shortcomings remain in practical applications, especially in recognizing children's speech. We found that 84% of the data recorded by the robot came from children, yet recognition experiments using acoustic models trained on adult speech and tested on children's speech show clear performance degradation. This thesis focuses on the key techniques of children's speech recognition. The work covers several aspects: 1. Children's speech analysis. Based on the children's speech database, we measured pitch and formant frequencies, analyzed the effects of age on children's speech, and identified the differences between children's and adults' speech. 2. Research on children's speech adaptation techniques. Recognition experiments were carried out using several different acoustic models: one trained on children's speech, one on boys' speech, and one on girls' speech. To improve children's speech recognition performance, a new approach based on vocal tract length normalization that dynamically adjusts the scale threshold is introduced. 3. Research on speech interaction technology and the interaction module. The thesis describes the embedded system implementation on a DSP, optimization techniques, dialogue management, and related topics, as well as applications of the module.
Language: Chinese
Other identifier: 200428014628011
Source URL: [http://ir.ia.ac.cn/handle/173211/7409]
Collection: Graduates_Master's Theses
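Recommended citation
GB/T 7714
Ma Ruitang. Research on Key Techniques of Children's Speech Recognition [D]. Institute of Automation, Chinese Academy of Sciences. Graduate School of the Chinese Academy of Sciences. 2007.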

Ingest method: OAI harvesting

Source: Institute of Automation


Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.