2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2022
DOI: 10.1109/wacv51458.2022.00259

Post-OCR Paragraph Recognition by Graph Convolutional Networks

Cited by 19 publications (21 citation statements)
References 24 publications
“…These methods fail to produce word or line level detections and can only be used in company with standalone text detectors, increasing the complexity of the pipeline. Another branch of work [54] takes a hierarchical view and apply graph-based models on the finest granularity, i.e. individual words, to analyze the layout.…”
Section: Layout Analysis (mentioning)
confidence: 99%
“…We therefore carefully select the following baselines representing non-end-to-end methods: Commercial solution: The GCP API, as mentioned above, is a commercial solution that produces text detection and recognition results at word, line and paragraph level. GCN Post-Processing: The GCN [20] based postprocessing method (GCN-PP) [54] applies the GCN on text line bounding boxes to cluster lines into paragraphs. Object detection baselines: PubLayNet [62] formulates the layout analysis as an instance segmentation task predicting text clusters as pixel masks.…”
Section: Baselines (mentioning)
confidence: 99%
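The statement above describes clustering text lines into paragraphs from their bounding boxes. The grouping step can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the cited paper's method: the `gap_rule` function is a hypothetical hand-written heuristic standing in for the trained GCN's link prediction, and all names (`Box`, `gap_rule`, `cluster_lines`, `max_gap_ratio`) are invented for this example. It also only links vertically consecutive lines, which would not handle multi-column layouts.

```python
from typing import Callable, Dict, List, Tuple

# (x_min, y_min, x_max, y_max); y grows downward, as in image coordinates.
Box = Tuple[float, float, float, float]


def gap_rule(upper: Box, lower: Box, max_gap_ratio: float = 0.5) -> bool:
    """Hypothetical stand-in for a learned link classifier (NOT a GCN).

    Links two lines when they overlap horizontally and the vertical gap
    between them is small relative to the upper line's height.
    """
    height = upper[3] - upper[1]
    vertical_gap = lower[1] - upper[3]
    overlaps_horizontally = min(upper[2], lower[2]) > max(upper[0], lower[0])
    return overlaps_horizontally and vertical_gap <= max_gap_ratio * height


def cluster_lines(
    boxes: List[Box],
    same_paragraph: Callable[[Box, Box], bool] = gap_rule,
) -> List[List[int]]:
    """Group line indices into paragraphs with union-find over linked lines."""
    order = sorted(range(len(boxes)), key=lambda i: boxes[i][1])  # top to bottom
    parent = list(range(len(boxes)))

    def find(i: int) -> int:
        # Follow parent pointers to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Only vertically consecutive lines are considered as candidate links.
    for a, b in zip(order, order[1:]):
        if same_paragraph(boxes[a], boxes[b]):
            parent[find(b)] = find(a)

    groups: Dict[int, List[int]] = {}
    for i in order:
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


# Four lines: two close together, a large gap, then two more close together.
lines = [(10, 10, 200, 30), (10, 34, 190, 54), (10, 90, 200, 110), (10, 114, 120, 134)]
paragraphs = cluster_lines(lines)
```

In a learned setup, `same_paragraph` would presumably be replaced by a model's per-edge prediction, while the union-find grouping could stay the same.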
“…Graph convolutional networks (GCNs) are becoming a prominent type of neural networks due to their capability of handling non-Euclidean data [4]. They naturally fit many problems in OCR and document analysis, and have been applied to help form lines [5] [6] [7], paragraphs [3] or other types of document entities [8]. Besides the quality gain from these GCN models, another benefit from these approaches is that we can potentially combine all the machine learning tasks and build a single, unified, multi-task GCN model.…”
Section: Introduction (mentioning)
confidence: 99%