Chinese Academy of Sciences Institutional Repositories Grid: A benchmark for unconstrained online handwritten Uyghur word recognition

A benchmark for unconstrained online handwritten Uyghur word recognition

Document type: Journal article


Authors	Simayi, Wujiahemaiti (2); Ibrahim, Mayire (2); Zhang, Xu-Yao (1); Liu, Cheng-Lin (1); Hamdulla, Askar (2)
Journal	INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION
Publication date	2020-07-28
Pages	14
Keywords	Online handwriting recognition; Uyghur alphabet; Database; Out-of-vocabulary words; Recurrent neural network; 1D convolution
ISSN	1433-2833
DOI	10.1007/s10032-020-00354-0
Corresponding author	Hamdulla, Askar (askar@xju.edu.cn)
Abstract	Despite some interesting results from different research groups, neither a public database for Uyghur online handwriting recognition nor a baseline study is yet available for comparison. To fill this void, we present a database of Uyghur online handwritten words and carry out the first benchmark experiments using it. The database contains 125,020 samples of 2030 words collected from 393 writers. In view of the characteristics of the Uyghur lexicon, two out-of-vocabulary datasets are provided specifically for evaluation. We carry out unconstrained handwritten word recognition experiments on the database using recurrent neural networks as the base model. Recognition results are obtained using connectionist temporal classification without lexicon search or an external language model. Concatenated and averaged bidirectional recurrent layers are compared for better generalization. Based on the Uyghur Unicode representation, we compare models that use different alphabets, based on both character types and character forms. To improve generalization, we propose a 1D convolutional model that uses 1D convolutional layers for sequence feature extraction. In our experiments, the proposed 1D convolutional model and its variations surpassed the base recurrent model on the out-of-vocabulary words by a clear margin. A character accurate rate (CAR) of 83.23% is obtained when out-of-vocabulary samples are used for testing. The highest recognition rate is 94.95% CAR, obtained when the test set shares the same lexicon as the training set. The experiments in this paper can serve as baseline references for future studies using this database.
WOS keywords	CHINESE ; NETWORKS ; DATABASE
Funding projects	National Key Research and Development Plan of China[2017YFC0820603] ; National Science Foundation of China (NSFC)[61462081] ; National Science Foundation of China (NSFC)[61263038]
WOS research area	Computer Science
Language	English
WOS accession number	WOS:000553232000001
Publisher	SPRINGER HEIDELBERG
Funding organizations	National Key Research and Development Plan of China ; National Science Foundation of China (NSFC)
Source URL	[http://ir.ia.ac.cn/handle/173211/40274]
Collection	Institute of Automation_National Laboratory of Pattern Recognition_Pattern Analysis and Learning Group
Author affiliations	1. Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit NLPR, Beijing 100190, Peoples R China 2. Xinjiang Univ, Inst Informat Sci & Engn, Urumqi, Peoples R China
Recommended citation (GB/T 7714)	Simayi, Wujiahemaiti, Ibrahim, Mayire, Zhang, Xu-Yao, et al. A benchmark for unconstrained online handwritten Uyghur word recognition[J]. INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION, 2020: 14.
APA	Simayi, Wujiahemaiti, Ibrahim, Mayire, Zhang, Xu-Yao, Liu, Cheng-Lin, & Hamdulla, Askar. (2020). A benchmark for unconstrained online handwritten Uyghur word recognition. INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION, 14.
MLA	Simayi, Wujiahemaiti, et al. "A benchmark for unconstrained online handwritten Uyghur word recognition". INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION (2020): 14.

Ingestion method: OAI harvesting

Source: Institute of Automation
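
The abstract reports results obtained with connectionist temporal classification without lexicon search, scored by character accurate rate (CAR). The sketch below is illustrative only and is not taken from the paper: it assumes best-path (greedy) CTC decoding over per-frame label indices with blank index 0, and it assumes CAR is computed as (N - edit distance) / N over the reference characters, a common convention that this record does not confirm. All function names are hypothetical.

```python
from itertools import groupby


def ctc_greedy_decode(frame_labels, blank=0):
    """Best-path CTC decoding: merge repeated labels, then drop blanks.

    frame_labels is assumed to hold the most likely label index per frame.
    """
    return [label for label, _ in groupby(frame_labels) if label != blank]


def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (row-by-row DP)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1]


def character_accurate_rate(refs, hyps):
    """Assumed CAR definition: (N - total edit errors) / N, N = reference length."""
    total = sum(len(r) for r in refs)
    errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return (total - errors) / total
```

Because decoding here uses no lexicon, any label sequence can be emitted, which is the property that allows evaluation on out-of-vocabulary words.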
