Proceedings of the First Workshop on Scholarly Document Processing 2020
DOI: 10.18653/v1/2020.sdp-1.10

DeepPaperComposer: A Simple Solution for Training Data Preparation for Parsing Research Papers

Abstract: We present DeepPaperComposer, a simple solution for preparing highly accurate (100%) training data, without manual labeling, for extracting content from scholarly articles using convolutional neural networks (CNNs). We used our approach to generate data and trained CNNs to extract eight categories of both textual (titles, abstracts, authors, headers, figure and table captions, and body texts) and nontextual content (figures and tables) from 2916 IEEE VIS conference papers spanning 30 years, of which a third were scanned b…

Citation Types: 0 supporting, 9 mentioning, 0 contrasting

Year Published: 2021–2024

Cited by 11 publications (9 citation statements)
References 25 publications (26 reference statements)

“…We make use of metadata from this collection in our work to gather paper PDFs prior to automatic extraction. Conceptually, our new VIS30K extends this line of work by leveraging new image-based extraction methods [27] and search tools to make the figure and table data accessible.…”
Section: Related Work (mentioning)
confidence: 99%
“…After model prediction, we used heuristics in [27] to combine both models' results by merging the bounding boxes from Faster R-CNN (better localization) with any additional images/bounding boxes detected by YOLOv3 (better detection) into an initial set of labeled bounding boxes. We further tightened or expanded these bounding boxes to acquire accurate regions for each figure and table.…”
Section: Table Caption (mentioning)
confidence: 99%
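
The merging heuristic quoted above can be sketched in a few lines of Python, assuming axis-aligned (x1, y1, x2, y2) boxes: keep every Faster R-CNN box (better localization) and add only those YOLOv3 boxes (better detection) that overlap none of them. The function names and the IoU threshold are illustrative assumptions, not taken from the cited paper [27].

def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union else 0.0

def merge_detections(frcnn_boxes, yolo_boxes, iou_thresh=0.5):
    """Keep all Faster R-CNN boxes; add each YOLOv3 box that
    overlaps none of them below the IoU threshold."""
    merged = list(frcnn_boxes)
    for yb in yolo_boxes:
        if all(iou(yb, fb) < iou_thresh for fb in frcnn_boxes):
            merged.append(yb)
    return merged

The resulting set would then be tightened or expanded per box, as the statement describes, to obtain accurate figure and table regions.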
“…This dataset is also cross-linked to vispubdata [34] so one can find images by keyword search. Their open-source models [44] can make the subsequent data collection easier with minimum human intervention.…”
Section: Resources Focused On Collections Of Refereed Literature (mentioning)
confidence: 99%
“…Compared with Yang et al., our approach does not require another neural network for feature engineering. Ling and Chen [25] also used a rendering solution, randomizing figure and table positions to extract those two categories. Our work broadens this approach by randomizing many document structural parts to acquire both structural and semantic labels.…”
Section: Document Parts and Layout Analysis (mentioning)
confidence: 99%
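
The layout-randomization idea in this last statement can be illustrated with a short, self-contained sketch: stack labeled structural parts at randomized positions on a synthetic page and record each part's bounding box as a training label for a detection CNN. Every name and dimension below (Part, compose_page, the US-letter page size in points) is a hypothetical illustration, not code from DeepPaperComposer.

import random
from dataclasses import dataclass

@dataclass
class Part:
    label: str   # e.g. "title", "figure", "table", "body_text"
    width: int
    height: int

def compose_page(parts, page_w=612, page_h=792, margin=36):
    """Shuffle parts and stack them top to bottom with random gaps,
    returning (label, (x1, y1, x2, y2)) pairs as ground-truth labels."""
    random.shuffle(parts)
    annotations, y = [], margin
    for p in parts:
        if y + p.height > page_h - margin:
            break  # part does not fit; a real composer would start a new page
        x = random.randint(margin, max(margin, page_w - margin - p.width))
        annotations.append((p.label, (x, y, x + p.width, y + p.height)))
        y += p.height + random.randint(4, 24)
    return annotations

Because the page is composed programmatically, every bounding box and category label is known exactly at render time, which is how a rendering-based pipeline can claim fully accurate training data without manual annotation.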