Lightweight, highly accurate line and paragraph detection

Unified Line and Paragraph Detection by Graph Convolutional Networks

Overview

This paper presents a method for detecting lines and paragraphs in documents by framing the task as a unified two-level clustering problem. The approach starts from a set of text detection boxes that approximately correspond to words.

- Text line: a cluster of word-level text boxes.
- Paragraph: a cluster of text lines.

Together, these clusters form a hierarchical two-level tree that represents the document layout.

Methodology

A Graph Convolutional Network (GCN) predicts relationships between the text boxes. The network outputs are then used to construct both the line-level and the paragraph-level clusters. The method is unified in that both levels of the layout are built from the same network outputs.

Results

- Experiments show that the approach achieves state-of-the-art performance for paragraph detection.
- The method is effective on public benchmarks and on real-world images.
- The method is highly efficient while maintaining detection quality.

Metadata

- Authors: Shuang Liu, Renshen Wang, Michalis Raptis, Yasuhisa Fujii
- Date: Submitted on March 17, 2022
- Categories: Computer Vision and Pattern Recognition (cs.CV), Machine Learning (cs.LG)
- DOI: 10.48550/arXiv.2203.09638
- Publication status: Accepted to DAS 2022 as an oral paper

Access and Licensing

Available in PDF and TeX source formats on arXiv. Licensed under the Creative Commons Attribution 4.0 International License.

---

This work contributes to document layout analysis by using a graph neural network to perform multi-level clustering, which improves the detection of textual structures such as lines and paragraphs.
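
The two-level clustering described under Methodology can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes, purely for illustration, that the network emits two probabilities for each candidate edge between word boxes (same line, same paragraph) and that clusters are formed as connected components at a threshold of 0.5. The names `UnionFind`, `cluster`, and `build_layout`, the edge format, and the example scores are all hypothetical.

```python
from collections import defaultdict


class UnionFind:
    """Disjoint-set structure used to merge items into clusters."""

    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, x):
        # Path halving keeps the trees shallow.
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            # The smaller index becomes the root.
            self.parent[max(ra, rb)] = min(ra, rb)


def cluster(n_items, scored_edges, threshold=0.5):
    """Connected components over edges whose score is >= threshold (assumed rule)."""
    uf = UnionFind(n_items)
    for a, b, score in scored_edges:
        if score >= threshold:
            uf.union(a, b)
    groups = defaultdict(list)
    for i in range(n_items):
        groups[uf.find(i)].append(i)
    return sorted(groups.values())


def build_layout(n_words, edges, threshold=0.5):
    """Return paragraphs -> lines -> word indices.

    edges: (word_a, word_b, p_same_line, p_same_paragraph) tuples,
    a hypothetical stand-in for the network outputs.
    """
    # Level 1: words are clustered into lines.
    lines = cluster(n_words, [(a, b, pl) for a, b, pl, _ in edges], threshold)
    line_of = {w: li for li, words in enumerate(lines) for w in words}
    # Level 2: lines are clustered into paragraphs using cross-line edges.
    line_edges = [
        (line_of[a], line_of[b], pp)
        for a, b, _, pp in edges
        if line_of[a] != line_of[b]
    ]
    paragraphs = cluster(len(lines), line_edges, threshold)
    return [[lines[li] for li in par] for par in paragraphs]


# Six word boxes: words 0-1, 2-3, and 4-5 each share a line;
# the first two lines belong to one paragraph (made-up scores).
example_edges = [
    (0, 1, 0.90, 0.90),
    (2, 3, 0.95, 0.90),
    (4, 5, 0.90, 0.80),
    (0, 2, 0.10, 0.80),
    (1, 3, 0.05, 0.90),
    (2, 4, 0.10, 0.20),
    (3, 5, 0.10, 0.10),
]
layout = build_layout(6, example_edges)
print(layout)  # [[[0, 1], [2, 3]], [[4, 5]]]
```

The nested list mirrors the two-level tree: the outer level holds paragraphs, the middle level holds lines, and the leaves are word-box indices. Thresholded connected components are only one simple way to turn edge scores into clusters; the paper may use a different rule.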