2017
DOI: 10.1158/0008-5472.can-17-0598

Developing Cancer Informatics Applications and Tools Using the NCI Genomic Data Commons API

Abstract: The NCI Genomic Data Commons (GDC) was launched in 2016 and makes available over 2 petabytes (PB) of cancer genomic and associated clinical data to the research community. This dataset continues to grow and currently includes over 14,500 patients. The GDC is an example of a biomedical data commons, which collocates biomedical data with storage and computing infrastructure and commonly used web services, software applications, and tools to create a secure, interoperable, and extensible resource for researchers.…

Citations: Cited by 35 publications (29 citation statements)
References: References 4 publications
“…Examples include developing algorithms for identifying particular types of cells in cell images and searching for these cells or processing BAM files to compute data quality scores and searching for BAM files with particular data quality problems. When data are curated and integrated with a common data model, synthetic cohorts can be created through a query, such as ‘find all males over 50 years of old that smoked and have a KRAS mutation [ 45 ].’…”
Section: Platforms For Data Sharing (mentioning)
confidence: 99%
“…The raw data generated from these systems is a need to improve and optimize all the necessary tools that perform the query processing, data synchronization, real-time data accumulation and determination of automatic cloud storage capacity, despite the availability of these tools at this point of time. In agreement to the need in improving the tools and having robust pipelines, National Cancer Institute in the United States initiate the NCI Cancer Research Data Commons (NCRDC) to improve the preventions, diagnostics, treatments on cancer diseases through open science efforts [80]. NDRDC is a cloud-based infrastructure consisting of multiple nodes that house processed data, raw data, metadata and analyzed data from cross-domains.…”
Section: Improve Tools and Pipelines (mentioning)
confidence: 99%
“…That software ecosystem was then described as “OpenHealth” with reference to the OpenData policy (Burwell et al, 2013). A multitude of BigData health-related resources has since become available, from the National Institutes of Health such as NCI’s Genome Data Commons (Wilson et al, 2017), to Population Health outcomes data collected by the health departments of a number of US states such as New York (NY. State of New York-Open Data Health-Health Data NY, 2018).…”
Section: Introduction (mentioning)
confidence: 99%