Proceedings of the 7th ACM International Conference on Web Search and Data Mining 2014
DOI: 10.1145/2556195.2556266
Using linked data to mine RDF from Wikipedia's tables

Abstract: The tables embedded in Wikipedia articles contain rich, semi-structured encyclopaedic content. However, the cumulative content of these tables cannot be queried against. We thus propose methods to recover the semantics of Wikipedia tables and, in particular, to extract facts from them in the form of RDF triples. Our core method uses an existing Linked Data knowledge-base to find pre-existing relations between entities in Wikipedia tables, suggesting the same relations as holding for other entities in analogous…
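As a rough illustration of the core idea, here is a minimal Python sketch, assuming DBpedia's public SPARQL endpoint and the SPARQLWrapper library; the entity URIs, table rows, and helper function are hypothetical, not the paper's implementation.

```python
from SPARQLWrapper import SPARQLWrapper, JSON

# Hypothetical sketch: find relations that already hold in the
# knowledge base between the linked entities of some table rows,
# then suggest those relations for the remaining rows.
endpoint = SPARQLWrapper("https://dbpedia.org/sparql")
endpoint.setReturnFormat(JSON)

def relations_between(subject_uri, object_uri):
    """Return predicates linking two entities in the knowledge base."""
    endpoint.setQuery(f"""
        SELECT DISTINCT ?p WHERE {{ <{subject_uri}> ?p <{object_uri}> . }}
    """)
    results = endpoint.query().convert()
    return [b["p"]["value"] for b in results["results"]["bindings"]]

# Rows of a hypothetical Wikipedia table whose cells have been
# linked to DBpedia entities: (column A entity, column B entity).
rows = [
    ("http://dbpedia.org/resource/Dublin",
     "http://dbpedia.org/resource/Republic_of_Ireland"),
    ("http://dbpedia.org/resource/Paris",
     "http://dbpedia.org/resource/France"),
]

# Relations observed for any row become candidate relations for the
# same pair of columns in every row, yielding candidate RDF triples.
candidates = set()
for subj, obj in rows:
    candidates.update(relations_between(subj, obj))

for subj, obj in rows:
    for pred in candidates:
        print(f"<{subj}> <{pred}> <{obj}> .")
```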

Cited by 56 publications (50 citation statements)
References 16 publications

Citation statements:
“…Any predicates found in the query result are considered as relations between the two entities. The work is later extended in Muñoz et al [24] by adding a machine learning process to filter triples that are likely to be incorrect, exploiting features derived from both the knowledge base and the text content from the target cells.…”
Section: Semantic Table Interpretation (mentioning)
confidence: 99%
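
The machine-learning filter mentioned above could look roughly like the following sketch, assuming scikit-learn; the feature set and training data are hypothetical stand-ins for the knowledge-base and cell-text features the quote describes.

```python
from sklearn.ensemble import RandomForestClassifier

# Hypothetical features per candidate triple: some derived from the
# knowledge base (e.g. how often the predicate links entities of the
# same types) and some from the text of the source table cells
# (e.g. string similarity between cell text and entity label).
X_train = [
    # [kb_predicate_frequency, type_match, cell_string_similarity]
    [0.85, 1.0, 0.90],  # triple verified as correct
    [0.10, 0.0, 0.20],  # triple verified as incorrect
    [0.60, 1.0, 0.75],
    [0.05, 0.0, 0.55],
]
y_train = [1, 0, 1, 0]  # 1 = keep the triple, 0 = discard it

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

# Keep a new candidate triple only if the model is confident enough.
candidate = [[0.70, 1.0, 0.80]]
keep = clf.predict_proba(candidate)[0][1] > 0.5
print("keep triple:", keep)
```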
“…Current methods are non-efficient because they typically adopt an exhaustive strategy that examines the entire table content, e.g., column classification depends on every cell in the column. This results in quadratic growth of the number of computations and knowledge base queries with respect to the size of tables, as such operations are usually required for every pair of candidates, e.g., candidate relation lookup between every pair of entities on the same row [26,23,24], or similarity computation between every pair of candidate entity and concept in a column [21]. This can be redundant as Zwicklbauer et al [46] have empirically shown that comparable accuracy can be obtained by using only a fraction of data (i.e., sample) from the column.…”
Section: Remark (mentioning)
confidence: 99%
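
To make the quadratic-growth remark concrete, here is a small Python sketch contrasting exhaustive pairwise lookup with the sampling idea attributed to Zwicklbauer et al; the lookup function is a hypothetical stand-in for a knowledge-base query.

```python
import itertools
import random

def candidate_relation_lookup(e1, e2):
    """Hypothetical stand-in for one knowledge-base query between two entities."""
    return set()  # would return predicates linking e1 and e2

row = [f"entity_{i}" for i in range(50)]  # linked entities on one table row

# Exhaustive strategy: one query per pair of entities on the row,
# i.e. n*(n-1)/2 queries -- quadratic in the number of entities.
all_pairs = list(itertools.combinations(row, 2))
print(len(all_pairs))  # 1225 queries for 50 entities

# Sampling strategy: query only a fixed-size sample of pairs, which
# can reportedly achieve comparable accuracy at a fraction of the cost.
for e1, e2 in random.sample(all_pairs, k=100):
    candidate_relation_lookup(e1, e2)
```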
“…Therefore, several approaches for interpreting tables from Wikipedia with LOD have been proposed. Munoz et al [47,48] propose methods for triplifying Wikipedia tables, called WikiTables, using existing LOD knowledge bases, like DBpedia and YAGO. Following the idea of the previous approaches, this approach starts by extracting entities from the tables, and then discovering existing relations between them.…”
Section: Using LOD to Interpret Semi-structured Data (mentioning)
confidence: 99%
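
A minimal sketch of the triplification step the quote describes, assuming the rdflib library; the table rows, entity links, and discovered predicate are illustrative rather than actual WikiTables output.

```python
from rdflib import Graph, URIRef

DBR = "http://dbpedia.org/resource/"
DBO = "http://dbpedia.org/ontology/"

# Illustrative table rows whose cells have been linked to DBpedia
# entities, plus a relation assumed to have been discovered between
# the two entity columns.
rows = [
    ("Dublin", "Republic_of_Ireland"),
    ("Paris", "France"),
]
discovered_predicate = URIRef(DBO + "country")

g = Graph()
for city, country in rows:
    # Emit one RDF triple per row using the discovered relation.
    g.add((URIRef(DBR + city), discovered_predicate, URIRef(DBR + country)))

print(g.serialize(format="nt"))
```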