Chinese Academy of Sciences Institutional Repositories Grid
Machine-Printed Symbol Recognition under Strong Noise (强噪声条件下的印刷体符号识别)

Document type: Thesis

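Author: Dong Jianxiong (董建雄)
Degree: Master of Engineering
Defense date: 1999-05-01
Degree-granting institution: Institute of Automation, Chinese Academy of Sciences
Place of conferral: Institute of Automation, Chinese Academy of Sciences
Advisor: Liu Yingjian (刘迎建)
Major: Pattern Recognition and Intelligent Systems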
Abstract: Symbol recognition has advanced considerably over the past twenty years. In recent years, however, applications have placed higher demands on the technology. Fields such as bank bill and tax form recognition require very high recognition accuracy and reliability, which general-purpose OCR techniques no longer meet. Against this background, the thesis analyzes the symbol recognition problem in depth, proposes a set of techniques for classifier combination and recognition reliability, and applies them to an automatic processing system for value-added tax invoices. The main contents are as follows.

First, the thesis analyzes the research background of machine-printed symbol recognition under strong noise, identifies the nature and difficulties of the problem, surveys the current state of symbol recognition, and points out techniques that can improve the performance of a symbol recognition system. For the backpropagation (BP) classifier, it proposes a new training algorithm that incorporates the AdaBoost machine learning algorithm and is reported to improve the generalization ability of the neural network. In the comparison of features, several representative methods are implemented and compared experimentally, leading to the conclusion that combining feature vectors is an effective technique. The thesis also proposes a multi-classifier combination scheme based on classifying samples by quality, which obtains good results and offers a new approach to character recognition under noisy conditions. Finally, it introduces the concept of symbol recognition reliability and proposes a practical solution based on multi-expert decisions and image context.

Compared with existing techniques, the classifier combination method and the reliability scheme are presented as original contributions, and they are reported to clearly improve symbol recognition performance under strong noise.
Language: Chinese
Other identifier: 555
Source URL: http://ir.ia.ac.cn/handle/173211/7287
Collection: Graduates_Master's Theses
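Recommended citation
GB/T 7714
Dong Jianxiong (董建雄). 强噪声条件下的印刷体符号识别 [Machine-printed symbol recognition under strong noise][D]. Institute of Automation, Chinese Academy of Sciences, 1999.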

Ingest method: OAI harvesting

Source: Institute of Automation

