Chinese Academy of Sciences Institutional Repositories Grid
Layout-Aware Single-Image Document Flattening

Document type: Journal article

Authors: Li, Pu (1,2); Quan, Weize (1,2); Guo, Jianwei (1,2); Yan, Dong-Ming (1,2)
Journal: ACM Transactions on Graphics
Publication date: 2023-12-02
Volume: 43; Issue: 1; Pages: 9:1-17
Keywords: Document Image Rectification; Document Layout Analysis; Deep Neural Networks; Geometric Models
DOI: https://doi.org/10.1145/3627818
Abstract

Single image rectification of document deformation is a challenging task. Although some recent deep learning-based methods have attempted to solve this problem, they cannot achieve satisfactory results when dealing with document images with complex deformations. In this article, we propose a new efficient framework for document flattening. Our main insight is that most layout primitives in a document have rectangular outline shapes, making unwarping local layout primitives essentially homogeneous with unwarping the entire document. The former task is clearly more straightforward to solve than the latter due to the more consistent texture and relatively smooth deformation. On this basis, we propose a layout-aware deep model working in a divide-and-conquer manner. First, we employ a transformer-based segmentation module to obtain the layout information of the input document. Then a new regression module is applied to predict the global and local UV maps. Finally, we design an effective merging algorithm to correct the global prediction with local details. Both quantitative and qualitative experimental results demonstrate that our framework achieves favorable performance against state-of-the-art methods. In addition, the current publicly available document flattening datasets have limited 3D paper shapes without layout annotation and also lack a general geometric correction metric. Therefore, we build a new large-scale synthetic dataset by utilizing a fully automatic rendering method to generate deformed documents with diverse shapes and exact layout segmentation labels. We also propose a new geometric correction metric based on our paired document UV maps. Code and dataset will be released at https://github.com/BunnySoCrazy/LA-DocFlatten.
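The abstract does not spell out how a predicted UV map is turned into a flattened image. As a rough illustration only, the sketch below assumes a backward-mapping convention: for every pixel of the flattened output, the UV map stores the normalized (u, v) position to sample in the warped photo. The function names `unwarp_with_uv` and `identity_uv`, the [0, 1] coordinate range, and the bilinear resampling are assumptions made for this example and are not taken from the paper or its released code.

```python
import numpy as np


def unwarp_with_uv(image, uv):
    """Resample a warped photo with a backward UV map (assumed convention).

    image: (H, W, C) array, the warped input photo.
    uv:    (h, w, 2) array; uv[y, x] = (u, v) in [0, 1] gives the normalized
           horizontal and vertical position in `image` whose color is copied
           to output pixel (y, x).
    Returns an (h, w, C) float array.
    """
    H, W = image.shape[:2]
    # Convert normalized coordinates to continuous pixel coordinates.
    xs = np.clip(uv[..., 0], 0.0, 1.0) * (W - 1)
    ys = np.clip(uv[..., 1], 0.0, 1.0) * (H - 1)
    # Integer corners of the cell that contains each sample point.
    x0 = np.floor(xs).astype(int)
    y0 = np.floor(ys).astype(int)
    x1 = np.minimum(x0 + 1, W - 1)
    y1 = np.minimum(y0 + 1, H - 1)
    # Bilinear weights, with a trailing axis so they broadcast over channels.
    wx = (xs - x0)[..., None]
    wy = (ys - y0)[..., None]
    img = image.astype(np.float64)
    top = img[y0, x0] * (1 - wx) + img[y0, x1] * wx
    bottom = img[y1, x0] * (1 - wx) + img[y1, x1] * wx
    return top * (1 - wy) + bottom * wy


def identity_uv(h, w):
    """UV map that samples each output pixel from the same relative position."""
    u, v = np.meshgrid(np.linspace(0.0, 1.0, w), np.linspace(0.0, 1.0, h))
    return np.stack([u, v], axis=-1)
```

Under these assumptions, an identity UV map of the same resolution reproduces the input, and replacing u with 1 - u mirrors it horizontally. A network's global or local UV prediction would take the place of `identity_uv` in practice.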

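Language: English
Source URL: http://ir.ia.ac.cn/handle/173211/57111
Collection: National Laboratory of Pattern Recognition, 3D Visual Computing
Corresponding author: Guo, Jianwei
Author affiliations:
1. MAIS, Institute of Automation, Chinese Academy of Sciences
2. School of Artificial Intelligence, UCAS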
Recommended citation
GB/T 7714: Li, Pu, Quan, Weize, Guo, Jianwei, et al. Layout-Aware Single-Image Document Flattening[J]. ACM Transactions on Graphics, 2023, 43(1): 9:1-17.
APA: Li, Pu, Quan, Weize, Guo, Jianwei, & Yan, Dong-Ming. (2023). Layout-Aware Single-Image Document Flattening. ACM Transactions on Graphics, 43(1), 9:1-17.
MLA: Li, Pu, et al. "Layout-Aware Single-Image Document Flattening." ACM Transactions on Graphics 43.1 (2023): 9:1-17.

Ingest method: OAI harvesting

Source: Institute of Automation

