Fourth Workshop on Exploiting AI Techniques for Data Management 2021
DOI: 10.1145/3464509.3464892
Pre-Trained Web Table Embeddings for Table Discovery

Cited by 6 publications (7 citation statements)
References 28 publications
“…With a fastText embedding model, we first get cell embeddings by averaging token embeddings in each cell and then aggregate cell embeddings to get a column embedding. More interestingly, web table embedding models [8] consider each cell as a single token (they concatenate tokens in a cell with underscores) and output embeddings at cell level. Nevertheless, we aggregate cell embeddings to derive the column embedding.…”
Section: Choices of the Base Encoder
confidence: 99%
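The cited passage describes two ways to embed table columns: the fastText route averages token embeddings within each cell and then aggregates cell embeddings into a column embedding, while web table embeddings treat each cell as a single underscore-joined token. A minimal sketch of both conventions, using a toy hand-made token table in place of a real fastText model (all names and vector values here are hypothetical):

```python
import numpy as np

# Toy token-embedding table standing in for a trained fastText model
# (hypothetical 2-d vectors for illustration only).
TOKEN_VECS = {
    "new": np.array([1.0, 0.0]),
    "york": np.array([0.0, 1.0]),
    "boston": np.array([1.0, 1.0]),
}

def cell_embedding(cell: str) -> np.ndarray:
    """fastText-style: average the embeddings of the tokens inside one cell."""
    vecs = [TOKEN_VECS[t] for t in cell.lower().split() if t in TOKEN_VECS]
    return np.mean(vecs, axis=0) if vecs else np.zeros(2)

def column_embedding(cells: list[str]) -> np.ndarray:
    """Aggregate cell embeddings (here: mean) into a single column embedding."""
    return np.mean([cell_embedding(c) for c in cells], axis=0)

def wte_cell_token(cell: str) -> str:
    """Web-table-embedding convention: one cell becomes one underscore-joined token."""
    return "_".join(cell.lower().split())

# column_embedding(["New York", "Boston"]) -> array([0.75, 0.75])
# wte_cell_token("New York") -> "new_york"
```

The aggregation function (mean here) is a design choice; the cited work only requires that cell embeddings are combined into a column-level representation.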
“…Word embedding models have previously been used to find union-able tables. Two state-of-the-art choices are fastText and WTE (web table embeddings [8]). Language models have not thus far been used for the union-ability problem.…”
Section: Choices of the Base Encoder
confidence: 99%
“…A model is composed of a collection of real-valued vectors, each associated with a relevant DB term. The process of deriving vectors from DB-derived text is called relational embedding, which is a very active area of research [2,6,14]. We provide initial vectors by textifying the database intelligently [3,19] and operate in two modes: assigning vectors to either whole tuples or to columns (per-tuple).…”
Section: Relational Embedding
confidence: 99%
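The passage above mentions "textifying" a database, i.e. serializing relational data into text so a word-embedding model can be trained on it. A minimal sketch of one such serialization, assuming a simple table-name plus attribute/value concatenation (this is an illustrative scheme, not the cited papers' exact method):

```python
# Hypothetical "textification" of one relational tuple into a token sequence.
def textify_tuple(table: str, row: dict) -> str:
    """Serialize a tuple as: table name, then each attribute followed by its value."""
    parts = [table] + [f"{col} {val}" for col, val in row.items()]
    return " ".join(parts)

# textify_tuple("employees", {"name": "Ada", "dept": "R&D"})
# -> "employees name Ada dept R&D"
```

In the per-tuple mode described above, one such sentence per row would feed the embedding model; a per-column mode would instead group the values of each attribute.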
“…Such a model is composed of a collection of real-valued vectors, each associated with a relevant DB term. The process of deriving vectors from DB-derived text is called relational embedding, which is a very active area of research [2,8,20].…”
Section: Word Vectors Model
confidence: 99%