2020
DOI: 10.1007/s12650-020-00702-6

Reverse-engineering bar charts using neural networks

Abstract: Reverse-engineering bar charts extracts textual and numeric information from the visual representations of bar charts to support application scenarios that require the underlying information. In this paper, we propose a neural network-based method for reverse-engineering bar charts. We adopt a neural network-based object detection model to simultaneously localize and classify textual information. This approach improves the efficiency of textual information extraction. We design an encoder-decoder framework tha…

Cited by 21 publications (21 citation statements)
References 42 publications

“…The author uses custom CNN architecture and achieves average classification accuracy of 97%. Bar charts are researched in BarChartAnalyzer [18] and by Zhou et al [19]. BarChartAnalyzer uses CNN that classifies the bar chart into seven subtypes (simple bar, grouped bar, stacked bar, and a combination of different orientations).…”
Section: Related Work (mentioning)
confidence: 99%
“…The average classification accuracy is 85%. In [19] authors proposed a new method for extracting textual and numerical information from bar charts. For textual information, extraction Region-based CNN combined with Tesseract Optical Character Recognition (OCR) engine is used.…”
Section: Related Work (mentioning)
confidence: 99%
“…ReVision [39] used a combination of feature identification and patch clustering to not only classify figures but also reverse-engineer their data to enable re-visualization to other chart formats, and Jung et al and Dai et al [2,16] similarly trained classifiers to categorize published charts in addition to extracting features and text. Last year, Fangfang Zhou et al [53] took a fully neural network-based approach to interpreting bar charts, using a Faster-RCNN [36] to locate and classify textual chart elements, and an attentional encoder-decoder to extract numerical information. To our knowledge the prior work focuses entirely on 2D charts, leaving the problem of interpreting 3D surface plots like that in Figure 2 unaddressed.…”
Section: Related Work (mentioning)
confidence: 99%
“…Recent interest in automatic document processing and conversion, such as in summarization and question answering tasks, has increased the importance of the extraction of underlying tabular data from chart images embedded in the converted documents. Chart analysis methods have evolved substantially in recent years from human-in-the-loop platforms relying on manual annotations [8,15], through early data extraction algorithms [2], hybrid neural-algorithmic pipelines [7,13], to end-to-end processing by a neural network [9,12,16].…”
Section: Introduction (mentioning)
confidence: 99%
“…Commonly a two-stage approach is used, first detecting the chart regions in the documents, and then applying some data extraction process to the detected charts. While the scope of the detection stage can be quite wide, including many types of charts [7], current tabular data extraction systems are mostly limited to the bar charts [2,7,9,16], with few exceptions. One of the possible reasons is that standard object detectors, employed in recent works, better cope with (and enable easy inference from) objects like rectangular bars and text elements, less so with pie segments, while elements like line or area plots defy handling by box proposals.…”
Section: Introduction (mentioning)
confidence: 99%