2019 IEEE International Conference on Data Mining (ICDM) 2019
DOI: 10.1109/icdm.2019.00011

Dataset Recommendation via Variational Graph Autoencoder

Abstract: This paper targets the design of a query-based dataset recommendation system that accepts a query denoting a user's research interest as a set of research papers and returns a list of recommended datasets ranked by their potential usefulness for the user's research need. The motivation for building such a system is to save users from spending time on heavy literature review work to find usable datasets. We start by constructing a two-layer network: one layer of citation network, and the other layer of d…

Cited by 18 publications (11 citation statements)
References 32 publications
“…Deepwalk [20], Node2vec [39], Metapath2vec [44]. (d) Network and Content Representation Learning Techniques: Text-associated Deepwalk (TADW) [45], NRL+ LSI [46], NRL+ Doc2vec [47], GCN [48], HVGAE [6]. (e) Topic Modeling based Techniques: HFT [23], CDL [30], CVAE-CF [49], JMARS [29], CTFP [28].…”
Section: Methods (mentioning)
confidence: 99%
“…These approaches satisfy the needs of a specific domain and are inapplicable to our studied problem. In our study, we need to jointly model the relationship between homogeneous nodes (paper) and the relationship between heterogeneous nodes (paper and dataset), also the text information of nodes (paper) for retrieving datasets based on the query specified as a list of papers as studied in [6]. However, our work is different from [6] that learn paper and dataset representation via Variational Autoencoders, as we model the shared topics between paper and dataset nodes, to help us understand paper-paper citation and paper-dataset citation as well as to facilitate the query based dataset recommendation.…”
Section: B Query Based Recommendation (mentioning)
confidence: 99%
“…Scientists are encouraged by funding agencies to publish datasets using the FAIR principles [41]. It is widely acknowledged that open and FAIR datasets contribute to both the transparency of science, to its quality, its reproducibility. Because of the increasing importance of open datasets for modern science, a number of dataset search engines can be found online nowadays, including Google Dataset Search, Mendeley Data, Microsoft Research Open Data and others. These dataset search engines help researchers to find datasets based on an input query consisting of keywords.…”
Section: Introduction (mentioning)
confidence: 99%
“…Ellefi et al [11] provide a dataset recommendation approach by considering the overlap between the schema of two datasets, which achieved perfect recall and a precision of 0.53. Altaf et al [1] provide a dataset recommendation method based on a set of research papers given by the user, achieving 0.92 recall score and 0.18 precision score. Giseli et al [28] present two approaches for dataset recommendation, based on Bayesian classifiers and on Social Network connections, which achieved a mean average precision score of around 0.6.…”
Section: Introduction (mentioning)
confidence: 99%