2022
DOI: 10.1108/dta-09-2021-0261

Modular framework for similarity-based dataset discovery using external knowledge

Abstract: Purpose: Semantic retrieval and discovery of datasets published as open data remains a challenging task. The datasets inherently originate in the globally distributed web jungle, lacking the luxury of centralized database administration, database schemes, shared attributes, vocabulary, structure and semantics. The existing dataset catalogs provide basic search functionality relying on keyword search in brief, incomplete or misleading textual metadata attached to the datasets. The search results are thus often in…

Citations: cited by 3 publications (2 citation statements)
References: 41 publications
“…Everyone recognizes the massive rise of data. In addition, this data needs to be stored for future use without incurring a high cost (Necasky et al., 2022; Sharifpour et al., 2023). On a similar note, research data has also increased significantly as a result of digitalization.…”
Section: Introduction (mentioning)
confidence: 99%
“…Furthermore, there has recently been growing interest in approaches based on dataset similarity rather than simple query matching. It is impractical for users to understand the perfect query words that represent the required datasets in advance [9]. This allows users to avoid constructing appropriate query words and to carry richer semantics [10].…”
Section: Introduction (mentioning)
confidence: 99%