Mixed-Supervised Scene Text Detection With Expectation-Maximization Algorithm
Document type: Journal article
Author | Zhao, Mengbiao1,2 |
Journal | IEEE TRANSACTIONS ON IMAGE PROCESSING |
Publication date | 2022 |
Volume | 31, pages 5513-5528 |
Keywords | Costs ; Annotations ; Training ; Labeling ; Detectors ; Data models ; Benchmark testing ; Mixed-supervised learning ; scene text detection ; weak supervision forms ; expectation-maximization algorithm |
ISSN | 1057-7149 |
DOI | 10.1109/TIP.2022.3197987 |
Corresponding author | Liu, Cheng-Lin(liucl@nlpr.ia.ac.cn) |
Abstract | Scene text detection is an important and challenging task in computer vision. To detect arbitrarily-shaped text, most existing methods require heavy data labeling effort to produce polygon-level text region labels for supervised training. To reduce the cost of data labeling, we study mixed-supervised arbitrarily-shaped text detection by combining various weak supervision forms (e.g., image-level tags and coarse, loose, and tight bounding boxes), which are far easier to annotate. Existing weakly-supervised learning methods (such as multiple instance learning) do not promote full object coverage. To approximate the performance of fully-supervised detection, we propose an Expectation-Maximization (EM) based mixed-supervised learning framework that trains a scene text detector using only a small amount of polygon-level annotated data combined with a large amount of weakly annotated data. The polygon-level labels are treated as latent variables and recovered from the weak labels by the EM algorithm. A new contour-based scene text detector is also proposed to facilitate the use of weak labels in our mixed-supervised learning framework. Extensive experiments on six scene text benchmarks show that (1) using only 10% strongly annotated data and 90% weakly annotated data, our method yields performance comparable to that of fully supervised methods, and (2) with 100% strongly annotated data, our method achieves state-of-the-art performance on five scene text benchmarks (CTW1500, Total-Text, ICDAR-ArT, MSRA-TD500, and C-SVT) and competitive results on the ICDAR2015 dataset. We will make our weakly annotated datasets publicly available. |
WOS keywords | LOCALIZATION |
Funding projects | National Key Research and Development Program[2020AAA0108003] ; National Natural Science Foundation of China (NSFC)[61733007] ; National Natural Science Foundation of China (NSFC)[61721004] |
WOS research areas | Computer Science ; Engineering |
Language | English |
WOS accession number | WOS:000844128200003 |
Publisher | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC |
Funding organizations | National Key Research and Development Program ; National Natural Science Foundation of China (NSFC) |
Source URL | [http://ir.ia.ac.cn/handle/173211/49873] |
Collection | Institute of Automation_National Laboratory of Pattern Recognition_Pattern Analysis and Learning Group |
Corresponding author | Liu, Cheng-Lin |
Author affiliations | 1. Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China ; 2. Chinese Acad Sci, Inst Automat, Natl Lab Pattern Recognit, Beijing 100190, Peoples R China |
Recommended citation: GB/T 7714 | Zhao, Mengbiao,Feng, Wei,Yin, Fei,et al. Mixed-Supervised Scene Text Detection With Expectation-Maximization Algorithm[J]. IEEE TRANSACTIONS ON IMAGE PROCESSING,2022,31:5513-5528. |
APA | Zhao, Mengbiao,Feng, Wei,Yin, Fei,Zhang, Xu-Yao,&Liu, Cheng-Lin.(2022).Mixed-Supervised Scene Text Detection With Expectation-Maximization Algorithm.IEEE TRANSACTIONS ON IMAGE PROCESSING,31,5513-5528. |
MLA | Zhao, Mengbiao,et al."Mixed-Supervised Scene Text Detection With Expectation-Maximization Algorithm".IEEE TRANSACTIONS ON IMAGE PROCESSING 31(2022):5513-5528. |
Ingest method: OAI harvesting
Source: Institute of Automation