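Research and Application of Confidence Measure in Speech Recognition (语音识别中的置信度研究与应用)
Document type: Dissertation
Author | Liang Jia'en (梁家恩) |
Degree | Doctor of Engineering |
Defense date | 2006-06-08 |
Degree-granting institution | Graduate School of the Chinese Academy of Sciences |
Place of conferral | Institute of Automation, Chinese Academy of Sciences |
Advisor | Xu Bo (徐波) |
Keywords | speech recognition (ASR); keyword spotting (KWS); confidence measure; online garbage model; MCE training |
Alternative title | Research and Application of Confidence Measure in Speech Recognition |
Degree discipline | Pattern Recognition and Intelligent Systems |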
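Chinese abstract (translated) | This thesis studies general methods for computing confidence measures, and their concrete applications, across different speech recognition scenarios and under different grammar constraints. It decomposes confidence computation into a purely acoustic part and a language-dependent part, analyzes the general methods for each, and gives a unified computation method. Within that framework it studies the concrete computation methods and their results under weak-grammar, statistical-grammar, and strong-grammar constraints. The main work and contributions are as follows: (1) A general method for computing confidence in speech recognition is formulated. Confidence is divided into a purely acoustic part and a language-dependent part, each with its own general computation method, so that confidence computation under different grammar constraints can be studied within one unified framework. (2) Taking a telephone keyword spotting system as an example, confidence algorithms under weak grammar constraints are studied. Building on acoustic confidence derived from an online garbage model, an acoustic confidence optimization method based on the MCE criterion is introduced, and local grammar information is used to improve confidence performance, giving a 13.8% relative reduction in the equal error rate of the keyword spotting system. (3) Taking a large-vocabulary continuous speech recognition system as an example, confidence algorithms under statistical grammar are studied. Confidence computation based on word-graph posterior probability is cast as the computation of linguistic confidence, and the thesis points out that this method is consistent with online-garbage-model-based confidence in that both improve performance by introducing competing paths. On the 2004 "863" continuous speech test set, the confidence measure reaches an equal error rate of 22.7%. (4) Taking a telephone key slot spotting system as an example, confidence algorithms under strong grammar are studied. The thesis mainly shows how dynamic expansion of the slot grammar compresses the search space and reduces search errors, and gives a method that uses confidence information to reduce the effect of garbage words preceding the key slot. With these two improvements, the slot recognition rate of the system rises from 47.1% to 65.2%. |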
English abstract | This thesis discusses general computation methods for confidence measures (CM) and their applications under different grammar constraints. The confidence measure is divided into two parts: acoustic CM and linguistic CM. By combining these two parts, a unified computation algorithm for confidence measures is proposed, within which the detailed algorithms and their applications are explored under each grammar constraint. The contributions of the thesis are as follows: (1) First, the confidence measure is divided into two parts, each with a general computational method, which makes it possible to cast different grammar-constraint conditions into one unified framework. (2) Second, taking a CTS keyword spotting system as an example, the confidence measure algorithm under weak grammar constraints is studied. An MCE-optimized acoustic confidence measure and a context-enhanced verification method are introduced, which use discriminative training and local linguistic information to improve performance. In the CTS keyword spotting system, the EER drops by 13.8% relative. (3) Third, taking an LVCSR system as an example, a statistical-grammar-based confidence measure is explored. The word graph posterior probability based CM can be regarded as a linguistic confidence measure, and it is shown to be similar to the online garbage model based CM. On the 2004 "863" national LVCSR evaluation set, the EER of the confidence measure is 22.7%. (4) Finally, for a key slot spotting system, the confidence measure under strong grammar constraints is explored, showing how dynamic extension of the slot grammar compresses the search space and suppresses search errors; a CM-based pruning method reduces search errors as well. With these methods, the slot accuracy of the system increases from 47.1% to 65.2%. |
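Language | Chinese |
Other identifier | 200418014690010 |
Source URL | [http://ir.ia.ac.cn/handle/173211/5943] |
Collection | Graduates_Doctoral Dissertations |
Recommended citation (GB/T 7714) | Liang Jia'en. Research and Application of Confidence Measure in Speech Recognition [D]. Institute of Automation, Chinese Academy of Sciences. Graduate School of the Chinese Academy of Sciences. 2006. |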
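The abstract refers to two quantities without defining them in this record: an acoustic confidence based on an online garbage model, and the equal error rate (EER) used to evaluate a confidence measure. The sketch below illustrates one common textbook form of each; it is not the thesis's own formulation, which this record does not give. The function names, the `top_n` parameter, and the toy data are all assumptions made for illustration.

```python
def online_garbage_confidence(frame_loglikes, aligned_states, top_n=3):
    """Average per-frame log-likelihood ratio against an online garbage score.

    frame_loglikes: list of frames, each a list of per-state log-likelihoods.
    aligned_states: state index the decoder aligned to each frame.
    top_n: how many best-scoring states form the garbage score (hypothetical
           default; the thesis's setting is not stated in this record).
    """
    if not aligned_states or len(frame_loglikes) != len(aligned_states):
        raise ValueError("need one aligned state per frame")
    total = 0.0
    for scores, state in zip(frame_loglikes, aligned_states):
        # Garbage score: mean of the top-N state log-likelihoods in this frame.
        best = sorted(scores, reverse=True)[:top_n]
        garbage = sum(best) / len(best)
        # Positive when the hypothesized state beats the local competition.
        total += scores[state] - garbage
    return total / len(aligned_states)


def equal_error_rate(scores, labels):
    """EER of a confidence measure.

    scores: confidence per hypothesis (higher means more trusted).
    labels: True where the hypothesis was actually correct.
    A hypothesis is accepted when its score is >= the threshold.
    """
    n_pos = sum(1 for ok in labels if ok)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need both correct and incorrect hypotheses")
    best_gap, eer = None, None
    # Candidate thresholds: every observed score, plus "reject everything".
    for thr in sorted(set(scores)) + [float("inf")]:
        false_accept = sum(1 for s, ok in zip(scores, labels)
                           if s >= thr and not ok) / n_neg
        false_reject = sum(1 for s, ok in zip(scores, labels)
                           if s < thr and ok) / n_pos
        gap = abs(false_accept - false_reject)
        if best_gap is None or gap < best_gap:
            # Report the midpoint of the two rates where they are closest.
            best_gap, eer = gap, (false_accept + false_reject) / 2
    return eer


# Toy data, invented for illustration only.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
toy_labels = [True, True, False, True, False, False]
toy_eer = equal_error_rate(toy_scores, toy_labels)
```

Averaging the top-scoring states gives a competitor score without training a separate filler model, which is why this kind of garbage model is called "online". The EER sweep uses observed scores as thresholds and does not interpolate, so on small test sets it returns the closest achievable operating point.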
Ingestion method: OAI harvesting
Source: Institute of Automation