2013
DOI: 10.4028/www.scientific.net/amr.756-759.1590

Web Data Extraction Based on Tag Path Clustering

Abstract: Fully automatic methods for extracting structured data from the Web have been studied extensively. Existing methods suffice for simple extraction, but they often fail on more complicated Web pages. This paper introduces a method based on tag path clustering to extract structured data. The method obtains the complete collection of tag paths by parsing the DOM tree of the Web document. The tag paths are then clustered using an introduced similarity measure so that the data area can be located, then taking advanta…
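The first step named in the abstract, collecting tag paths from the DOM, can be illustrated with a minimal Python sketch. It is not the paper's implementation: the class `TagPathCollector`, the helpers `collect_tag_paths` and `group_identical_paths`, and the sample markup are all hypothetical names chosen for illustration. The paper's similarity measure is not given in the text available here, so the sketch substitutes the simplest possible stand-in, grouping paths that are exactly identical, in place of real clustering.

```python
from collections import Counter
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not stay on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class TagPathCollector(HTMLParser):
    """Record the root-to-element tag path of every element in a document."""

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open tags, root first
        self.paths = []   # one path string per element seen

    def handle_starttag(self, tag, attrs):
        self.paths.append("/".join(self.stack + [tag]))
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the innermost matching open tag; ignore stray closers.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass


def collect_tag_paths(html):
    """Parse an HTML string and return the tag path of each element."""
    collector = TagPathCollector()
    collector.feed(html)
    collector.close()
    return collector.paths


def group_identical_paths(paths):
    """Assumed stand-in for clustering: count occurrences of each exact path."""
    return Counter(paths)


# Hypothetical sample page: a heading followed by a list of three records.
sample = (
    "<html><body><h1>Items</h1><ul>"
    "<li><a href='#'>A</a></li>"
    "<li><a href='#'>B</a></li>"
    "<li><a href='#'>C</a></li>"
    "</ul></body></html>"
)
groups = group_identical_paths(collect_tag_paths(sample))
```

In this sample the paths that repeat (`html/body/ul/li` and `html/body/ul/li/a`) are the ones belonging to the record list, which is the intuition behind using repeated tag paths to locate a data area. A real similarity measure would also group paths that are close but not identical.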

Cited by 2 publications (2 citation statements)
References 3 publications
“…Separately, attention has been used for contextual learning in OD (Li et al, 2013;Hsieh et al, 2019;Morabia et al, 2020) and image captioning (You et al, 2016). Attention mechanisms have also been employed over graphs to learn an optimal representation of nodes while taking graph structure into account (Veličković et al, 2017).…”
Section: Related Work (mentioning)
confidence: 99%
“…Luo et al [34] use attention over a BiLSTM-CRF layer for Named Entity Recognition (NER) on biomedical data. Separately, attention has been used for contextual learning in OD [20,26,37] and image captioning [56]. Attention mechanisms have also been employed over graphs to learn an optimal representation of nodes while taking graph structure into account [52].…”
Section: Related Work (mentioning)
confidence: 99%