2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology 2008
DOI: 10.1109/wiiat.2008.241

Discriminating Meaningful Web Tables from Decorative Tables Using a Composite Kernel

Cited by 11 publications (15 citation statements)
References 6 publications
“…However, this is the obstacle of their approach. Jeong-Woo Son et al [5] have proposed an approach to discriminate web tables using a composite kernel which combines a parse tree kernel and a linear kernel. They proposed three kinds of features to capture both kinds of web table information which is composed of structural and content ones.…”
Section: Structure-based (mentioning)
confidence: 99%
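
The statement above says only that the composite kernel combines a parse tree kernel with a linear kernel; it does not say how. As a rough illustration, the sketch below assumes a weighted sum with a hypothetical mixing weight `alpha`, and uses a simplified subtree-counting recursion in the style of Collins and Duffy over trees written as `(label, children)` tuples. The function names, the tree encoding, and the `decay` parameter are all assumptions made for this example, not details taken from the cited paper.

```python
from typing import Iterator, Sequence, Tuple

# Hypothetical tree encoding: (tag_label, tuple_of_child_trees)
Tree = Tuple[str, tuple]


def production(node: Tree) -> Tuple[str, Tuple[str, ...]]:
    """Return the node label together with the labels of its children."""
    label, children = node
    return (label, tuple(child[0] for child in children))


def iter_nodes(tree: Tree) -> Iterator[Tree]:
    """Yield every node of the tree in pre-order."""
    yield tree
    for child in tree[1]:
        yield from iter_nodes(child)


def common_subtrees(n1: Tree, n2: Tree, decay: float = 1.0) -> float:
    """Weighted count of common subtrees rooted at n1 and n2."""
    if production(n1) != production(n2):
        return 0.0
    score = decay
    # Matching productions have the same number of children, so zip is safe.
    for c1, c2 in zip(n1[1], n2[1]):
        score *= 1.0 + common_subtrees(c1, c2, decay)
    return score


def tree_kernel(t1: Tree, t2: Tree, decay: float = 1.0) -> float:
    """Sum the common-subtree counts over all pairs of nodes."""
    return sum(
        common_subtrees(a, b, decay)
        for a in iter_nodes(t1)
        for b in iter_nodes(t2)
    )


def linear_kernel(x: Sequence[float], y: Sequence[float]) -> float:
    """Plain dot product over content feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def composite_kernel(t1: Tree, x1: Sequence[float],
                     t2: Tree, x2: Sequence[float],
                     alpha: float = 0.5) -> float:
    """Assumed combination: alpha * tree kernel + (1 - alpha) * linear kernel."""
    return alpha * tree_kernel(t1, t2) + (1.0 - alpha) * linear_kernel(x1, x2)


# A toy one-row, two-cell table structure and made-up content features.
toy_table: Tree = ("table", (("tr", (("td", ()), ("td", ()))),))
print(composite_kernel(toy_table, [1.0, 2.0], toy_table, [3.0, 4.0]))
```

A weighted sum of two valid kernels is itself a valid kernel for `alpha` between 0 and 1, which is why it is a common way to mix structural and content similarity; the cited paper may use a different combination.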
“…Figure 1 illustrates the taxonomy of Information Extraction which consists of different type of data as input and the approaches that have been proposed for extracting information from semistructured data. The web tables provide more organized information, summarized information, and conciseness in expressing knowledge (Jeong-Woo Son et al 2008). Therefore, focus is given more on the structure-based which is the main focus of this chapter.…”
Section: Concepts Of Information Extraction (IE) (mentioning)
confidence: 99%
“…They argued that there is a need to divide a web page into information blocks or several segments before organizing the content into hierarchical groups and during this process (partition a web page) some of the attribute labels of values may be missing. Structure-based: The structure based approaches employ assumptions about the general structure of tables (i.e., <TABLE> tags) on the web pages (Wolfgang Gatterbauer et al 2007;Jeong-Woo Son et al 2008). Wolfgang Gatterbauer et al (2007) have proposed an approach for extracting information from web tables.…”
Section: Semantic-based (mentioning)
confidence: 99%
“…For example, table tags exist in HTML, but they are often used for formatting web page layout. Previous work focused on detecting tables from PDF, HTML and ASCII documents using Optical Character Recognition [13], machine learning algorithms such as C4.5 decision trees [17] or SVM [22,19], and heuristics [26].…”
Section: Introduction (mentioning)
confidence: 99%