2021
DOI: 10.1016/j.engappai.2020.104100

Transformers-based information extraction with limited data for domain-specific business documents

Cited by 24 publications
(7 citation statements)
References 8 publications
“…Liu et al [14] proposed a pattern-based approach to extract disease and drug combination pairs from MEDLINE abstracts. Nguyen et al [15] utilized the NLP model Transformer to extract information from domain-specific business documents with limited training data. M. Kerroumi et al [16] proposed a multimodal approach, VisualWordGrid, to extract information from documents with rich visual characteristics, such as tables.…”
Section: Data Integration (mentioning)
confidence: 99%
“…With modern advancements in deep learning technology and the increased need for processing large text datasets, researchers have been optimizing the task of automated text segmentation. Common applications of this natural language processing (NLP) task include information retrieval (Oh et al, 2007;Nguyen et al, 2021), topic segmentation (Arnold et al, 2019;Aumiller et al, 2021), and document summarization (Chuang and Yang, 2000). These tasks can take either linear or hierarchical approaches, with the latter taking into account structural representation of topics within documents (Glavaš and Swapna, 2020).…”
Section: Related Work (mentioning)
confidence: 99%
“…Cho et al (2020) describe a method of automatically classifying the types of documents through a neural network for scanned business documents [29], Lee et al, (2018) presents the automatic classification according to the KSIC (Korea Standard Industry Code) [30], and Yun et al, (2018) describes the automatic classification method for business documents for which document classification is not defined [31]. In addition, Tien et al, (2020) describe a method for deriving meaning for unlabeled documents and then performs classification [32].…”
Section: Review Of Advanced Research For Bda (mentioning)
confidence: 99%