Chinese Academy of Sciences Institutional Repositories Grid
Handwritten Chinese/Japanese Text Recognition Using Semi-Markov Conditional Random Fields

Document type: Journal article

Authors: Zhou, Xiang-Dong; Wang, Da-Han; Tian, Feng; Liu, Cheng-Lin; Nakagawa, Masaki
Journal: IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
Publication date: 2013
Volume: 35; Issue: 10; Pages: 2413-2426
Keywords: character string recognition; semi-Markov conditional random field; lattice pruning; beam search
ISSN: 0162-8828
Abstract: This paper proposes a method for handwritten Chinese/Japanese text (character string) recognition based on semi-Markov conditional random fields (semi-CRFs). The high-order semi-CRF model is defined on a lattice containing all possible segmentation-recognition hypotheses of a string to elegantly fuse the scores of candidate character recognition and the compatibilities of geometric and linguistic contexts by representing them in the feature functions. Based on given models of character recognition and compatibilities, the fusion parameters are optimized by minimizing the negative log-likelihood loss with a margin term on a training string sample set. A forward-backward lattice pruning algorithm is proposed to reduce the computation in training when trigram language models are used, and beam search techniques are investigated to accelerate decoding. We evaluate the performance of the proposed method on unconstrained online handwritten text lines from three databases. On the test sets of the CASIA-OLHWDB (Chinese) and TUAT Kondate (Japanese) databases, the character-level correct rates are 95.20 and 95.44 percent, and the accurate rates are 94.54 and 94.55 percent, respectively. On the test set (online handwritten texts) of the ICDAR 2011 Chinese handwriting recognition competition, the proposed method outperforms the best system in the competition.
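Indexed in: SCI
Language: English
WOS accession number: WOS:000323175200008
Date made public: 2014-12-16
Source URL: [http://ir.iscas.ac.cn/handle/311060/16723]
Collection: Institute of Software_Institute of Software Library_Journal Articles
Recommended citation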
GB/T 7714: Zhou, Xiang-Dong, Wang, Da-Han, Tian, Feng, et al. Handwritten Chinese/Japanese Text Recognition Using Semi-Markov Conditional Random Fields[J]. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2013, 35(10): 2413-2426.
APA: Zhou, Xiang-Dong, Wang, Da-Han, Tian, Feng, Liu, Cheng-Lin, & Nakagawa, Masaki. (2013). Handwritten Chinese/Japanese Text Recognition Using Semi-Markov Conditional Random Fields. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 35(10), 2413-2426.
MLA: Zhou, Xiang-Dong, et al. "Handwritten Chinese/Japanese Text Recognition Using Semi-Markov Conditional Random Fields". IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 35.10 (2013): 2413-2426.

Ingest method: OAI harvesting

Source: Institute of Software

Unless otherwise stated, all content in this system is protected by copyright, with all rights reserved.