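Research on Methods for Online Handwritten Japanese Document Analysis (联机手写日文文档分析方法研究)
Document type: Dissertation
Author | Zhou Xiangdong (周祥东) |
Degree | Doctor of Engineering |
Defense date | 2009-02-18 |
Degree-granting institution | Graduate School of the Chinese Academy of Sciences (中国科学院研究生院) |
Place conferred | Institute of Automation, Chinese Academy of Sciences (中国科学院自动化研究所) |
Advisor | Liu Chenglin (刘成林) |
Keywords | ink document analysis; stroke classification; text line grouping; character string recognition; conditional random field |
Alternative title | Methods for Online Handwritten Japanese Document Analysis |
Degree program | Pattern Recognition and Intelligent Systems |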
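Chinese abstract (translated) | With the widespread use of pen-input devices such as tablet PCs, writing tablets and digital pens, users can enter text and draw figures and tables on a larger writing surface, while the devices capture the pen traces and store them as online documents. Compared with traditional pen and paper, online documents are potentially more convenient for automatic processing, but unconstrained free-form writing also makes that processing challenging. To analyze an online handwritten document, strokes must first be grouped into document structure objects such as text lines and drawings, and each kind of object is then recognized separately. This dissertation studies three main problems in online handwritten document analysis: stroke classification, text line segmentation and character string recognition. The methods were evaluated on Japanese document data but apply equally to Chinese documents. The main contributions are as follows. (1) To exploit the spatial relationships between strokes effectively, a Markov random field (MRF) is used to classify the strokes of online handwritten Japanese documents into text and non-text. The likelihood potentials of the MRF are obtained by converting the outputs of a support vector machine (SVM) classifier into probabilities. Experiments show that MRF-based stroke classification is more accurate than both a hidden Markov model (HMM) based method and classification of individual strokes by SVM alone. (2) Text lines in online handwritten documents may be written in arbitrary directions, need not be parallel to one another and may lie arbitrarily close together. To handle this, the dissertation proposes an effective text line segmentation method that combines the temporal and spatial information of the stroke sequence. The parameters of the main steps are obtained by supervised learning from data, so very little prior knowledge is used. The algorithm is evaluated from several perspectives in the experiments, which demonstrate its effectiveness. (3) To achieve higher character segmentation and recognition accuracy in handwritten character string recognition, a conditional random field (CRF) is used within a maximum a posteriori (MAP) framework to evaluate candidate segmentation-recognition hypotheses of a string. The framework integrates character classifier outputs, linguistic context and geometric context, and effectively overcomes the effect of varying candidate segmentation length. To improve performance, several kinds of geometric feature functions are used, including unary and binary ones as well as class-dependent and class-independent ones. Experimental results show that, compared with a method based on a normalized path evaluation criterion, the CRF-based method achieves higher segmentation and recognition accuracy, and that geometric features improve system performance significantly. |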
English abstract | With the increasing use of tablet PCs and electronic whiteboards, users can freely input heterogeneous structures such as text, drawings and table forms on a large writing area. Digital ink captured by such pen-based input devices offers many potential advantages over traditional pen and paper. Nevertheless, the free and heterogeneous structure of handwritten ink documents brings new challenges to automatic processing. For ink document analysis, the ink strokes should first be grouped into structural units such as text lines, mathematical formulas and drawings, which are then recognized separately. This dissertation focuses on three important issues of ink document analysis: stroke classification, text line grouping and character string recognition. The methods were evaluated on Japanese documents, but are applicable to Chinese documents as well. The major contributions of this dissertation are as follows: (1) To utilize the spatial relationships between strokes in online handwritten documents more effectively, we propose an approach based on Markov random field (MRF) for separating text and non-text ink strokes. The likelihood clique potentials of the MRF are derived by converting the classifier outputs of support vector machines (SVMs) to probabilities. Experimental results show that the MRF-based method achieves higher classification accuracy than the hidden Markov model (HMM) based method and individual stroke classification using SVM. (2) To extract text lines from online handwritten documents, we propose an effective approach to text line grouping that combines temporal and spatial information. To cope with arbitrarily oriented, non-parallel and interfering text lines, the approach comprises decision functions optimized by supervised learning from data, and hence involves few hand-set parameters and uses little prior knowledge. The text line segmentation algorithms were evaluated in experiments with respect to multiple performance metrics, and the proposed algorithm was demonstrated to be superior. (3) To improve the performance of handwritten character string recognition, we propose a principled maximum a posteriori (MAP) framework based on conditional random field (CRF) for fusing the information of character recognition, geometric context and linguistic context based on candidate character segmentation. This framework can effectively overcome the variable length of candidate segmentation. To measure the geoem... |
Language | Chinese |
Other identifier | 200518014628063 |
Source URL | [http://ir.ia.ac.cn/handle/173211/6140] |
Collection | Graduates_Doctoral Dissertations |
Recommended citation (GB/T 7714) | 周祥东. 联机手写日文文档分析方法研究[D]. 中国科学院自动化研究所. 中国科学院研究生院. 2009. |
Ingest method: OAI harvesting
Source: Institute of Automation
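The abstract's first contribution (MRF likelihood potentials obtained by converting SVM outputs to probabilities) can be illustrated with a small sketch. This is not the dissertation's implementation. The sigmoid parameters `a` and `b`, the Potts-style pairwise penalty `beta`, the neighbor lists and the use of iterated conditional modes (ICM) for inference are all assumptions chosen for illustration; the record does not say which conversion or inference procedure the author used.

```python
import math

def svm_to_prob(score, a=-1.0, b=0.0):
    """Map an SVM decision value to P(text) with a sigmoid.
    a and b are hypothetical; in practice they would be fitted on held-out data."""
    return 1.0 / (1.0 + math.exp(a * score + b))

def icm_labels(scores, neighbors, beta=1.0, sweeps=10):
    """Label each stroke as text (1) or non-text (0) by greedily lowering
    energy = sum_i -log P(y_i) + beta * (number of neighbor pairs with different labels).
    ICM is an assumed inference choice and only finds a local minimum."""
    eps = 1e-12
    probs = [svm_to_prob(s) for s in scores]
    # unary[i][y]: cost of giving stroke i label y (0 = non-text, 1 = text)
    unary = [(-math.log(max(1.0 - p, eps)), -math.log(max(p, eps))) for p in probs]
    # start from the independent per-stroke decision
    labels = [1 if p >= 0.5 else 0 for p in probs]
    for _ in range(sweeps):
        changed = False
        for i in range(len(scores)):
            costs = []
            for y in (0, 1):
                disagreement = sum(beta for j in neighbors[i] if labels[j] != y)
                costs.append(unary[i][y] + disagreement)
            best = 0 if costs[0] < costs[1] else 1
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:
            break
    return labels

# Four strokes in a chain; the third has a weak non-text score.
scores = [2.0, 1.5, -0.2, 1.8]
neighbors = [[1], [0, 2], [1, 3], [2]]
print(icm_labels(scores, neighbors, beta=0.0))  # independent decisions
print(icm_labels(scores, neighbors, beta=1.0))  # with spatial smoothing
```

With `beta=0.0` each stroke is decided on its own score, which corresponds to the individual SVM classification the abstract compares against. A positive `beta` lets confidently classified neighbors override a weak score, which is the kind of spatial context the MRF is meant to capture.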