2018
DOI: 10.1093/database/bax104

Probabilistic and machine learning-based retrieval approaches for biomedical dataset retrieval

Abstract: The bioCADDIE dataset retrieval challenge brought together different approaches to retrieval of biomedical datasets relevant to a user’s query, expressed as a text description of a needed dataset. We describe experiments in applying a data-driven, machine learning-based approach to biomedical dataset retrieval as part of this challenge. We report on a series of experiments carried out to evaluate the performance of both probabilistic and machine learning-driven techniques from information retrieval, as applied…

citations
Cited by 7 publications
(4 citation statements)
references
References 18 publications
“…While messy, unstructured data makes up the majority of data created on a daily basis, estimated to be 95% of all "big data" [35]. Alongside IR's ability to read unstructured data, semistructured data (data that possess attributes of both structured and unstructured data), and structured data, it is the ability to consume the sheer quantity of data [36], the ability to filter data [37], as well as the ability to process and categorize data that has provided its ultimate strength [31]. Categorizing data into various classes, the act of grouping alike data, can be accomplished in a variety of ways, one of which is clustering.…”
Section: Importance Of Information Retrieval Systems
Citation type: mentioning
confidence: 99%
“…This classification is available (in tabular form) at shorturl.at/D1234. [24], [15] [18], [17], [20], [22], [25], Techniques [27], [28], [43], [64], [30], [32], [34], [38], [40], [45], [46], [47], [49], [54], [55], [56], [57], [59], [60], [61], [66], [68], [69], [70], [73], [74], [75], [77], [26], [42], [44], [11], [13], [36]. [12], [14], [16], [19], [21], [23], [29], [31], [35], [39], [41], …”
Section: Data Extraction and Classification
Citation type: mentioning
confidence: 99%
“…In addition, the biomedical and healthCAre Data Discovery Index Ecosystem (bioCADDIE) dataset retrieval challenge was organized in 2016 to evaluate the effectiveness of information retrieval (IR) techniques in identifying relevant biomedical datasets in DataMed (3). Among the teams that participated in this shared task, the prevalent approaches to searching datasets with a query were probabilistic or machine learning-based IR (4), medical subject headings (MeSH) term-based query expansion (5), word embeddings and named entity identification (6), and re-ranking (7). Similarly, a specialized search engine named Omicseq was developed for retrieving omics data (8).…”
Section: Introduction
Citation type: mentioning
confidence: 99%