2020
DOI: 10.1016/j.jbi.2020.103399

A content-based literature recommendation system for datasets to improve data reusability – A case study on Gene Expression Omnibus (GEO) datasets

citations
Cited by 40 publications
(38 citation statements)
references
References 19 publications
“…There are already numerous public repositories for genomic and gene-expression data, such as the Sequence Read Archive (SRA)/European Nucleotide Archive (ENA) and Gene Expression Omnibus (GEO), respectively. Recently, GEO has been used in a case study to improve dataset reusability with a literature recommendation system (Patra et al, 2020) and is highly recommended over other databases for the submission of RNA-Seq datasets (Bhandary et al, 2018). In medical research, information loss stems from large amounts of gathered data remaining inaccessible for reuse by a wider audience (sometimes even the authors) after the initial publication (Wade, 2014).…”
Section: Preventing Information Loss (mentioning)
confidence: 99%
“…In metagenomics, where datasets tend to be in the gigabyte range, appropriate archiving of workflow intermediates for reuse can decrease the costs of re-analysis (Ten Hoopen et al, 2017). Additionally, many datasets deposited in sequence repositories like GEO were collected at enormous effort and used only once and so reusing them greatly increases their utility (Patra et al, 2020). The labour efficiency of reuse is also illustrated by the famous eGFP browser (Winter et al, 2007) which presents the content of RNA-Seq datasets to biologists in a simple way.…”
Section: Maximising Time Labour and Cost Efficiency (mentioning)
confidence: 99%