2018
DOI: 10.1093/jamia/ocx121

DataMed – an open source discovery index for finding biomedical datasets

Abstract: Our manual review shows that the ingestion pipeline could achieve an accuracy of 90% and core elements of DATS had varied frequency across repositories. On a manually curated benchmark dataset, the DataMed search engine achieved an inferred average precision of 0.2033 and a precision at 10 (P@10, the number of relevant results in the top 10 search results) of 0.6022, by implementing advanced natural language processing and terminology services. Currently, we have made the DataMed system publicly available as…
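To make the P@10 figure in the abstract concrete, here is a minimal sketch of how precision at k is commonly computed for a single query. The function name, document IDs, and relevance judgments are hypothetical and are not taken from the DataMed evaluation. The sketch also assumes the usual convention of dividing by k even when fewer than k results are returned.

```python
from typing import Sequence, Set


def precision_at_k(ranked_ids: Sequence[str], relevant_ids: Set[str], k: int = 10) -> float:
    """Return the fraction of the top-k ranked results that are judged relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_ids[:k]
    # Count how many of the top-k results appear in the relevance judgments.
    hits = sum(1 for doc_id in top_k if doc_id in relevant_ids)
    # Assumed convention: divide by k, not by the number of results returned.
    return hits / k


# Hypothetical ranking and judgments for one query (illustrative only).
ranked = ["d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8", "d9", "d10", "d11"]
relevant = {"d1", "d3", "d4", "d7", "d9", "d10", "d11"}
score = precision_at_k(ranked, relevant, k=10)
```

Under these assumptions, a reported value such as 0.6022 would be the mean of this per-query score over all benchmark queries, which corresponds to roughly six relevant datasets among the first ten results.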

Cited by 59 publications (41 citation statements)
References 36 publications

“…This does not mean that all data providers must expose metadata about their holdings in DATS. The bioCADDIE Project has demonstrated the flexibility of DATS by mapping and ingesting metadata from more than 70 data repositories into DataMed (Chen et al, 2018). However, the capabilities of data repositories vary widely.…”
Section: Discussion
confidence: 99%
“…Apache Solr is an open source search platform built on Apache Lucene library. Apache Lucene provides rich features to handle document such as full-text search and real-time indexing for various applications [18,19]. The Reuters news data were filtered in Apache Solr to retrieve articles that mentioned the 10 public health issues.…”
Section: Filtering News Articles On Public Health Issues
confidence: 99%
“…DataMed [16,17] is an open source system that facilitates discovery and indexing of biomedical datasets. In this system, users can access an integrated data using unified schema (called Data Tag Suite) and search for biomedical articles using a proposed search engine.…”
Section: Related Work
confidence: 99%