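Chinese Academy of Sciences Institutional Repositories Grid: Handwritten Chinese Character Recognition Based on Large-Scale Feature Learning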

Chinese Academy of Sciences Institutional Repositories Grid

基于大规模特征学习的手写汉字识别研究 (Handwritten Chinese Character Recognition Based on Large-Scale Feature Learning)

Document type: Dissertation


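Author	周明可 (Zhou Mingke)
Degree type	Doctor of Engineering
Defense date	2015-04-26
Degree-granting institution	University of Chinese Academy of Sciences
Place of conferral	Institute of Automation, Chinese Academy of Sciences
Advisor	刘成林 (Liu Cheng-Lin)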
Keywords	handwritten Chinese character recognition; GPU parallel computing; discriminative feature learning; quadratic correlation; feature dimensionality promotion; training set expansion
Alternative title	Handwritten Chinese Character Recognition Based on Large-Scale Feature Learning
Major	Pattern Recognition and Intelligent Systems
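Abstract (translated from Chinese)	Handwritten Chinese character recognition can be applied in many sectors of the national economy. For example, postal address recognition makes automatic mail sorting possible, which saves a great deal of labor. Recognizing bank bills, tax forms, books, and manuscripts converts document images into electronic form, which eases later management, search, and transmission. As a basic component of handwritten Chinese document recognition, handwritten Chinese character recognition has long attracted the attention of many researchers. Its main difficulties are that the set of character classes is very large, that many characters closely resemble one another, and that differences in writing style cause the same character written by different writers to vary greatly in shape. These difficulties have kept recognition performance on freely written Chinese characters from being satisfactory. In view of this, building on traditional recognition methods, this thesis improves handwritten Chinese character recognition performance through accelerated classifier training and large-scale feature learning. The main work and contributions are as follows. (1) Parallel acceleration of training with graphics processing units (GPUs). Enlarging the training set is a common way to improve the generalization performance of a classifier. However, large training sets make some classifiers hard to train, especially with training methods based on discriminative learning. GPUs have a large number of floating-point units and are well suited to large-scale parallel computation. This thesis uses GPUs to parallelize the training of discriminative feature extraction (DFE) and of the discriminative learning quadratic discriminant function (DLQDF) classifier, speeding up their training by factors of 30 and 10, respectively, which makes training on large-scale data sets feasible. (2) A large-scale feature learning method that improves recognition performance. To improve the discriminative power of the features, this thesis starts from the original gradient direction histogram features and uses the correlation information between features to perform a quadratic dimensionality promotion, which yields tens of thousands of quadratic features. Then, in the high-dimensional feature space formed by the quadratic features and the gradient features, discriminative learning is used to obtain a low-dimensional feature subspace. Because a large amount of quadratic information is introduced into the feature vector and discriminative learning is used, the resulting low-dimensional features are quadratic features with strong discriminative power. Finally, a classifier is trained on this subspace. In addition, to strengthen the generalization performance of both the feature learning and the classifier, this thesis expands the training set with synthesized samples. On handwritten Chinese character recognition, using the proposed feature learning method and the DLQDF classifier, we obtain performance comparable to that of a deep convolutional neural network (deep CNN), while the computational cost of training and recognition is much lower than that of a deep CNN.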
English abstract	Machine recognition of Chinese handwriting has many applications. For example, mail address recognition enables automatic mail sorting, which saves a great deal of human labor. Bank check reading, tax form processing, and the transcription of books and handwritten notes convert documents into digital form, which is convenient for administration, search, and transmission. Handwritten Chinese character recognition (HCCR), an integral part of handwritten Chinese text recognition, has drawn much attention from the community. The difficulty of HCCR lies in the large number of character classes, the presence of many confusable character pairs, and the variability of writing styles. Because of these difficulties, the performance of HCCR is still not satisfactory. In this thesis, building on traditional recognition methods, we improve the performance of HCCR in two ways: accelerating classifier training and large-scale feature learning. The proposed methods are summarized as follows. 1. Accelerating classifier training using graphics processing units (GPUs). Training set expansion is a commonly adopted and effective strategy for improving the generalization performance of classifiers. However, a larger training set means a longer training time, especially for methods based on discriminative learning. In this thesis, we propose to accelerate the training of discriminative feature extraction (DFE) and of the discriminative learning quadratic discriminant function (DLQDF) classifier by parallelizing the computation on GPUs. Speedups of thirty times and ten times were achieved, respectively. 2. Improving recognition accuracy through large-scale feature learning. To enhance the discriminative ability of the features, we increase the feature dimensionality by using the statistical and spatial correlation of the original gradient direction histogram features. This generates tens of thousands of quadratic features. A low-dimensional subspace is then learned from the quadratic features and the original gradient features by discriminative learning. Because quadratic information is integrated with discriminative learning, the resulting subspace features are quadratic as well as discriminative. To further improve the generalization capability of the learned features and of the classifiers, we expand the training set with synthesized samples. In HCCR experiments, with the proposed feature learning method and the DLQDF classifier, we achieved recognition performance comparable to deep conv...
Language	Chinese
Other identifier	201118014628077
Source URL	[http://ir.ia.ac.cn/handle/173211/6667]
Collection	Graduates_Doctoral Dissertations
Recommended citation (GB/T 7714)	周明可. 基于大规模特征学习的手写汉字识别研究[D]. 中国科学院自动化研究所. 中国科学院大学. 2015.
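
The abstract describes raising the feature dimensionality by adding quadratic (product) terms to the gradient features and then projecting the result into a low-dimensional subspace. The following is only a minimal sketch of that general idea, not the thesis's implementation. The function names, the use of all index pairs, and the random projection matrix `W` are assumptions made for illustration: the thesis selects pairs using statistical and spatial correlation and learns the subspace discriminatively, and neither procedure is specified in this record.

```python
import numpy as np


def all_pairs(d):
    """Return all index pairs (i, j) with i <= j for a d-dimensional vector.

    Assumption: the thesis selects pairs by correlation; using every pair
    here is a stand-in that keeps the sketch self-contained.
    """
    i, j = np.triu_indices(d)
    return np.stack([i, j], axis=1)


def quadratic_expand(x, pairs):
    """Append the products x[i] * x[j] for the given pairs to the vector x."""
    x = np.asarray(x, dtype=np.float64)
    products = x[pairs[:, 0]] * x[pairs[:, 1]]
    return np.concatenate([x, products])


# Toy 3-dimensional "gradient feature" vector (hypothetical values).
x = np.array([1.0, 2.0, 3.0])
z = quadratic_expand(x, all_pairs(x.size))  # 3 original + 6 quadratic terms

# Placeholder projection to a 2-dimensional subspace. In the thesis this
# matrix would come from discriminative learning; here it is random.
rng = np.random.default_rng(0)
W = rng.standard_normal((2, z.size))
y = W @ z
```

With `d` original features, using every pair adds `d * (d + 1) / 2` quadratic terms, which is why some form of pair selection and a subsequent dimensionality reduction matter at scale.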

Ingest method: OAI harvesting

Source: Institute of Automation
