2012
DOI: 10.1007/978-3-642-32281-5_41

WYSIWYE: An Algebra for Expressing Spatial and Textual Rules for Information Extraction

Abstract: The visual layout of a webpage can provide valuable clues for certain types of Information Extraction (IE) tasks. In traditional rule-based IE frameworks, these layout cues are mapped to rules that operate on the HTML source of the webpages. In contrast, we have developed a framework in which the rules can be specified directly at the layout level. This has many advantages, since the higher level of abstraction leads to simpler extraction rules that are largely independent of the source code of the page, and, …

Cited by 3 publications (2 citation statements)
References 15 publications
“…For future work, further studies to advance experiments from evolving data streams—those generated by mechanisms that change or fluctuate over time, by implementing the Weka/MOA package, designed specifically for data stream mining including new adaptations with Deep Learning algorithms. Furthermore, we intend to advance the development of a module for a pre-processing face as proposed [ 77 ], prioritizing data collection, transformation, and preparation of datasets and images. This is essential for the crossing of data and construction of rules to ensure the quality of the information to the user and decision maker.…”
Section: Discussion (mentioning)
confidence: 99%
“…Some more recent works are presented in [26]-[28]. The former presents an algebra for expressing spatial and textual rules that can be defined directly at a layout level.…”
Section: Related Work (mentioning)
confidence: 99%