2013 12th International Conference on Document Analysis and Recognition 2013
DOI: 10.1109/icdar.2013.28

Intellix -- End-User Trained Information Extraction for Document Archiving

Cited by 79 publications (41 citation statements)
References 16 publications
“…For the more specific task of extracting information from business documents several works use a pattern matching approach. Schuster et al [2013], Rusinol et al [2013] and Cesarini et al [2003] require users to annotate which words should be extracted for a given document template, then automatically generate patterns matching those words. At test time, these patterns generate candidate words, which are scored using heuristics.…”
Section: Related Work (mentioning)
confidence: 99%
“…Flexible template-based extraction systems [1], [6]- [8] locate the required text in the document by using the distance and direction from important surrounding keywords such as field labels. Cesarini et al [6] only look at the nearest keyword whereas d'Andecy et al [7] computes distances and angles from every other word in the document and predicts the final location by averaging over the distances and angles from all words weighted by their itf-df scores.…”
Section: Related Work (mentioning)
confidence: 99%
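
The keyword-relative localization described in the statement above can be illustrated with a small sketch. It is not taken from any of the cited systems: the `Anchor` class, the `predict_location` function, and the choice to average the projected points (rather than averaging distances and angles separately) are assumptions made for illustration, and the weights stand in for whatever word score a real system would use.

```python
import math
from dataclasses import dataclass


@dataclass
class Anchor:
    """A word on the page with a learned offset to the target field (hypothetical)."""
    x: float         # word position on the page
    y: float
    distance: float  # learned distance from the word to the field
    angle: float     # learned direction to the field, in radians
    weight: float    # word importance score (assumed to be given)


def predict_location(anchors):
    """Weighted average of the field positions that each anchor votes for."""
    total = sum(a.weight for a in anchors)
    if total <= 0:
        raise ValueError("at least one anchor needs a positive weight")
    # Each anchor projects its own estimate: position plus polar offset.
    px = sum(a.weight * (a.x + a.distance * math.cos(a.angle)) for a in anchors)
    py = sum(a.weight * (a.y + a.distance * math.sin(a.angle)) for a in anchors)
    return px / total, py / total
```

With zero offsets the prediction reduces to the weighted centroid of the anchor words, which makes the effect of the weights easy to check by hand.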
“…We adopted forty-year-old data frames [21]. Our example-based approach for user interaction has some similarities with the end-user-provided training examples used commercially for scanned business documents [22]. Some aspects of our templates, like the use of literals and semantic tags, were anticipated in [23].…”
Section: Prior Work (mentioning)
confidence: 99%