Chinese Academy of Sciences Institutional Repositories Grid
Automatic Knowledge Generation Technique of Human-Computer Dialog System (人机口语对话系统的知识自动生成技术)

Document type: Thesis

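Author: Huang Yunzhu (黄韵竹)
Degree type: Master of Engineering
Defense date: 2011-05-27
Degree-granting institution: Graduate University of Chinese Academy of Sciences (中国科学院研究生院)
Place of conferral: Institute of Automation, Chinese Academy of Sciences (中国科学院自动化研究所)
Advisor: Li Chengrong (李成荣)
Keywords: human-computer dialog system; word class expansion; first-order predicate logic; dependency parsing; knowledge base generation
Alternative title: Automatic Knowledge Generation Technique of Human-Computer Dialog System
Major: Pattern Recognition and Intelligent Systems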
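Chinese abstract (translated): Human-computer spoken dialog technology makes human-computer interaction simpler and more natural. Building such a system, however, consumes a great deal of manpower and material resources. Two open research problems are how to automatically collect training corpora for domain-restricted language models and how to construct the knowledge base of a spoken dialog system. Addressing these problems, this thesis focuses on the domain of everyday conversation and proposes semi-automatic methods for expanding language-model training corpora and for building a spoken-dialog knowledge base. The main contents and contributions are as follows:
1. At the level of word-level expansion, a word class expansion method is proposed, and experiments demonstrate its contribution to a speech recognition system.
2. A semi-automatic method for generating first-order predicate knowledge representations is proposed. The method relies on dependency parsing: stop words are first removed from each sentence, the sentence is then parsed, the parse result and a keyword table are used to convert the sentence into first-order predicate form, and finally the predicate knowledge base is generated. Experiments show that the knowledge base generated by this method achieves a high detection rate.
3. The idea of word classes is applied to the spoken-dialog knowledge base. Texts are classified by sentence pattern; only one sentence is kept for each pattern, and the others are stored as same-class words in a word-class lookup table, which is then further expanded. This greatly reduces the size of the knowledge base and speeds up system processing.
4. Using word-class corpus expansion and first-order predicate knowledge representation, the speech globe system is improved.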
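Contribution 3 can be illustrated with a minimal sketch. It is not the thesis implementation: the word-class table, the `<CLASS>` placeholder notation, the function names, and the whitespace tokenization are all assumptions made for illustration (Chinese text would need a word segmenter instead of `split()`).

```python
from collections import defaultdict

# Hypothetical word-class table: class label -> member words (illustrative only).
WORD_CLASSES = {
    "CITY": {"Beijing", "Shanghai", "Paris"},
    "FOOD": {"noodles", "rice"},
}

# Reverse lookup: word -> class label.
WORD_TO_CLASS = {w: c for c, words in WORD_CLASSES.items() for w in words}


def to_pattern(sentence):
    """Replace each word that belongs to a word class with its class label."""
    tokens = sentence.split()  # assumption: whitespace-separated tokens
    pattern = tuple(
        f"<{WORD_TO_CLASS[t]}>" if t in WORD_TO_CLASS else t for t in tokens
    )
    fillers = [(WORD_TO_CLASS[t], t) for t in tokens if t in WORD_TO_CLASS]
    return pattern, fillers


def compact(sentences):
    """Keep one entry per sentence pattern and record the class members seen."""
    patterns = []               # one representative per sentence pattern
    seen = set()
    lookup = defaultdict(set)   # word-class lookup table
    for s in sentences:
        pattern, fillers = to_pattern(s)
        if pattern not in seen:
            seen.add(pattern)
            patterns.append(" ".join(pattern))
        for cls, word in fillers:
            lookup[cls].add(word)
    return patterns, dict(lookup)


corpus = [
    "I want to visit Beijing",
    "I want to visit Paris",
    "I like noodles",
    "I like rice",
]
patterns, lookup = compact(corpus)
```

In this toy corpus, four sentences collapse to two stored patterns, while the words that filled each slot go into the lookup table. That is the sense in which the knowledge base shrinks.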
English abstract: Human-computer dialog technology makes human-computer interaction simpler and more natural. However, building a human-computer dialog system requires a lot of manpower and resources. How to automatically collect a training corpus for a language model in a restricted domain, and how to build the knowledge base of a human-computer dialog system, are two challenges in current research. Focusing on the daily chatting domain, we propose semi-automatic approaches to extending the training corpus of the language model and to building the knowledge base for dialog. The main contents and contributions are as follows:
1. At the level of word-level expansion, a word class expansion method is introduced. We illustrate the contribution of the method to the speech recognition system through experiments.
2. A semi-automatic method of generating first-order predicate knowledge is proposed. The method uses dependency parsing. We first remove stop words from the sentences, then analyze the sentences with dependency parsing, next convert the sentences into first-order predicate logic form according to the parsing result and a keyword list, and finally generate the predicate logic knowledge base. Experimental results show that the method can reach the application level.
3. The idea of word classes is applied to the knowledge base of the dialog system. We classify the text according to sentence structure and keep only one sentence for each structure. The others are saved into a list of word classes, and the word classes are then further expanded. This method can greatly reduce the size of the knowledge base and improve processing speed.
4. Applying word class expansion and first-order predicate logic knowledge representation methods, we improve the speech globe system.
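The pipeline in contribution 2 (stop-word removal, dependency parsing, conversion to predicate form, knowledge base generation) can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the method as implemented in the thesis: the stop-word list, the keyword list, the `predicate(arg, ...)` output format, and every function name are hypothetical, and `toy_parse` is only a stand-in for a real dependency parser that assumes subject-verb-object word order.

```python
STOP_WORDS = {"the", "a", "please", "really"}        # hypothetical stop-word list
KEYWORDS = {"cat", "fish", "eats", "bird", "sings"}  # hypothetical keyword list


def remove_stop_words(tokens):
    """Step 1: drop stop words before parsing."""
    return [t for t in tokens if t.lower() not in STOP_WORDS]


def toy_parse(tokens):
    """Step 2 (stand-in): return (dependent, relation, head) triples.

    Assumption: tokens are in subject-verb-object order. A real system
    would call a dependency parser here instead.
    """
    if len(tokens) < 2:
        raise ValueError("need at least a subject and a verb")
    subj, verb = tokens[0], tokens[1]
    triples = [(verb, "root", None), (subj, "nsubj", verb)]
    if len(tokens) > 2:
        triples.append((tokens[2], "obj", verb))
    return triples


def to_predicate(triples, keywords):
    """Step 3: the root becomes the predicate; keyword dependents become arguments."""
    root = next(dep for dep, rel, _ in triples if rel == "root")
    args = [
        dep
        for dep, rel, head in triples
        if head == root and rel in ("nsubj", "obj") and dep in keywords
    ]
    return f"{root}({', '.join(args)})"


def build_knowledge_base(sentences):
    """Step 4: collect the distinct predicate facts, in order of first appearance."""
    kb = []
    for s in sentences:
        tokens = remove_stop_words(s.split())
        fact = to_predicate(toy_parse(tokens), KEYWORDS)
        if fact not in kb:
            kb.append(fact)
    return kb


kb = build_knowledge_base(["the cat eats a fish", "the bird really sings"])
```

Here the keyword list acts as a filter on predicate arguments, which is one plausible reading of "according to the parsing result and a keyword list"; the abstract does not specify how the list is actually used.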
Language: Chinese
Other identifier: 200828014628036
Source URL: http://ir.ia.ac.cn/handle/173211/7569
Collection: Graduates_Master's Theses
Recommended citation
GB/T 7714
黄韵竹. 人机口语对话系统的知识自动生成技术[D]. 中国科学院自动化研究所. 中国科学院研究生院. 2011.

Ingest method: OAI harvesting

Source: Institute of Automation


Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.