2021
DOI: 10.48550/arxiv.2106.00978
Preprint
A Span Extraction Approach for Information Extraction on Visually-Rich Documents

Abstract: Information extraction (IE) for visually-rich documents (VRDs) has achieved SOTA performance recently thanks to the adaptation of Transformer-based language models, which shows the great potential of pre-training methods. In this paper, we present a new approach to improve the capability of language model pre-training on VRDs. Firstly, we introduce a new query-based IE model that employs span extraction instead of using the common sequence labeling approach. Secondly, to extend the span extraction formulation,…
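The query-based span extraction idea from the abstract can be sketched as follows: given a queried field, each document token is scored as a candidate start and end of the field's value, and the best-scoring (start, end) pair is returned. This is a minimal illustration only, not the paper's actual architecture; the concatenation-based query fusion, the weight vectors `w_start`/`w_end`, and all names are assumptions for the sketch.

```python
import numpy as np

def extract_span(token_reprs, query_repr, w_start, w_end):
    """Pick the best value span for one queried field.

    token_reprs: (n_tokens, d_tok) representations of the OCR tokens.
    query_repr:  (d_query,) representation of the queried field.
    w_start, w_end: scoring vectors of size d_tok + d_query.
    Returns the (start, end) token indices with the highest combined score.
    """
    n = token_reprs.shape[0]
    # Fuse the query with every token (illustrative fusion: concatenation).
    fused = np.concatenate(
        [token_reprs, np.tile(query_repr, (n, 1))], axis=1
    )
    start_logits = fused @ w_start  # score each token as a span start
    end_logits = fused @ w_end      # score each token as a span end
    # Search all valid pairs with start <= end.
    best, best_score = (0, 0), -np.inf
    for i in range(n):
        for j in range(i, n):
            score = start_logits[i] + end_logits[j]
            if score > best_score:
                best_score, best = score, (i, j)
    return best
```

In practice the start/end logits would come from a pre-trained Transformer encoder over the query and document tokens, and the pair search would be restricted to a maximum span length; the exhaustive loop here is just for clarity.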

Cited by 1 publication (2 citation statements)
References 9 publications
“…Majumder et al. (2020) present a field-value pairing framework that learns the representations of fields and value candidates in the same feature space using metric learning. Nguyen et al. (2021) propose a span extraction approach to extract the start and end of a value for each queried field. Gao et al. (2021) introduce a field extraction system that can be trained with large-scale unlabeled documents.…”
Section: Related Work
confidence: 99%
“…Our baseline. We implement our baseline following (Majumder et al., 2020; Nguyen et al., 2021). Unlike our method, which utilizes a unified transformer to deeply model interactions among the query words and the OCR words, our baseline models the interactions in a shallower way (see Section A for details).…”
Section: Experimental Settings
confidence: 99%