Chinese Academy of Sciences Institutional Repositories Grid
A benchmark for unconstrained online handwritten Uyghur word recognition

Document type: Journal article

Authors: Simayi, Wujiahemaiti (2); Ibrahim, Mayire (2); Zhang, Xu-Yao (1); Liu, Cheng-Lin (1); Hamdulla, Askar (2)
Journal: INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION
Publication date: 2020-07-28
Pages: 14
Keywords: Online handwriting recognition; Uyghur alphabet; Database; Out-of-vocabulary words; Recurrent neural network; 1D Convolution
ISSN: 1433-2833
DOI: 10.1007/s10032-020-00354-0
Corresponding author: Hamdulla, Askar (askar@xju.edu.cn)
Abstract: Despite interesting results from several research groups, no public database or baseline study for Uyghur online handwriting recognition is yet available for comparison. To fill this gap, we present a database of Uyghur online handwritten words and carry out the first benchmark experiments on it. The database contains 125,020 samples of 2030 words collected from 393 writers. To reflect the characteristics of the Uyghur lexicon, two out-of-vocabulary datasets are provided specifically for evaluation. We carry out unconstrained handwritten word recognition experiments on the database using recurrent neural networks as the base model. Recognition results are obtained using connectionist temporal classification without lexicon search or an external language model. Concatenated and averaged bidirectional recurrent layers are compared for better generalization. Based on the Unicode representation of Uyghur, we compare models using different alphabets, based on both character types and character forms. To improve generalization, we propose a 1D convolutional model that uses 1D convolutional layers for sequence feature extraction. In our experiments, the proposed 1D convolutional model and its variations surpassed the recurrent base model on out-of-vocabulary words by a clear margin. A character accuracy rate (CAR) of 83.23% was obtained when out-of-vocabulary samples were used for testing. The highest recognition rate was 94.95% CAR when the test set shared the same lexicon as the training set. The experiments in this paper can serve as baseline references for future studies using this database.
WOS keywords: CHINESE; NETWORKS; DATABASE
Funding projects: National Key Research and Development Plan of China [2017YFC0820603]; National Science Foundation of China (NSFC) [61462081]; National Science Foundation of China (NSFC) [61263038]
WOS research area: Computer Science
Language: English
WOS accession number: WOS:000553232000001
Publisher: SPRINGER HEIDELBERG
Funding organizations: National Key Research and Development Plan of China; National Science Foundation of China (NSFC)
Source URL: http://ir.ia.ac.cn/handle/173211/40274
Collection: Institute of Automation_National Laboratory of Pattern Recognition_Pattern Analysis and Learning Group
Author affiliations: 1. Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit NLPR, Beijing 100190, Peoples R China
2. Xinjiang Univ, Inst Informat Sci & Engn, Urumqi, Peoples R China
Recommended citation
GB/T 7714: Simayi, Wujiahemaiti, Ibrahim, Mayire, Zhang, Xu-Yao, et al. A benchmark for unconstrained online handwritten Uyghur word recognition[J]. INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION, 2020: 14.
APA: Simayi, Wujiahemaiti, Ibrahim, Mayire, Zhang, Xu-Yao, Liu, Cheng-Lin, & Hamdulla, Askar. (2020). A benchmark for unconstrained online handwritten Uyghur word recognition. INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION, 14.
MLA: Simayi, Wujiahemaiti, et al. "A benchmark for unconstrained online handwritten Uyghur word recognition". INTERNATIONAL JOURNAL ON DOCUMENT ANALYSIS AND RECOGNITION (2020): 14.
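
Illustrative note: the abstract reports results as a character accuracy rate (CAR) obtained from connectionist temporal classification output without lexicon search. The sketch below shows one common way such numbers can be computed. It is not taken from the paper. It assumes best-path (greedy) CTC decoding, a blank label with index 0, and CAR defined as (N - edit errors) / N over all reference characters. The function names are hypothetical.

```python
from itertools import groupby

BLANK = 0  # assumed index of the CTC blank label (not specified in the record)


def ctc_greedy_decode(frame_labels, blank=BLANK):
    """Best-path CTC decoding: collapse repeated labels, then drop blanks."""
    return [label for label, _ in groupby(frame_labels) if label != blank]


def edit_distance(ref, hyp):
    """Levenshtein distance between two label sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_accuracy_rate(refs, hyps):
    """Assumed definition: CAR = (N - edit errors) / N, N = reference characters."""
    total = sum(len(r) for r in refs)
    if total == 0:
        raise ValueError("reference set contains no characters")
    errors = sum(edit_distance(r, h) for r, h in zip(refs, hyps))
    return (total - errors) / total
```

In use, the per-frame argmax labels of a network would be passed through `ctc_greedy_decode`, and the decoded sequences compared with the ground-truth character sequences via `character_accuracy_rate`. Because no lexicon is consulted, the same procedure applies unchanged to out-of-vocabulary words.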

Ingest method: OAI harvesting

Source: Institute of Automation

