Luyao Ren scite author profile

Clinical applications of precision oncology require accurate tests that can distinguish true cancer specific mutations from errors introduced at each step of next-generation sequencing (NGS). To date, no bulk sequencing study has addressed the effects of cross-site reproducibility, nor the biological, technical and computational factors that influence variant identification. Here we report a systematic interrogation of somatic mutations in paired tumor-normal cell lines to identify factors affecting detection reproducibility and accuracy at six different centers. Using whole genome sequencing (WGS) and whole-exome sequencing (WES), we evaluated the reproducibility of different sample types with varying input amount and tumor purity, and multiple library construction protocols, followed by processing with nine bioinformatics pipelines. We found that read coverage and callers affected both WGS and WES reproducibility, but WES performance was influenced by insert fragment size, genomic copy content and the global imbalance score (GIV; G > T/C > A). Finally, taking into account library preparation protocol, tumor content, read coverage and bioinformatics processes concomitantly, we recommend actionable practices to improve the reproducibility and accuracy of NGS experiments for cancer mutation detection.

show abstract

Establishing community reference samples, data and call sets for benchmarking cancer mutation detection using whole-genome sequencing

Fang¹,

Zhu²,

Zhao³

et al. 2021

Nat Biotechnol

View full text Add to dashboard Cite

PreMedKB: an integrated precision medicine knowledgebase for interpreting relationships between diseases, genes, variants and drugs

Wang

Xia

et al. 2018

View full text Add to dashboard Cite

One important aspect of precision medicine aims to deliver the right medicine to the right patient at the right dose at the right time based on the unique ‘omics’ features of each individual patient, thus maximizing drug efficacy and minimizing adverse drug reactions. However, fragmentation and heterogeneity of available data makes it challenging to readily obtain first-hand information regarding some particular diseases, drugs, genes and variants of interest. Therefore, we developed the Precision Medicine Knowledgebase (PreMedKB) by seamlessly integrating the four fundamental components of precision medicine: diseases, genes, variants and drugs. PreMedKB allows for search of comprehensive information within each of the four components, the relationships between any two or more components, and importantly, the interpretation of the clinical meanings of a patient's genetic variants. PreMedKB is an efficient and user-friendly tool to assist researchers, clinicians or patients in interpreting a patient's genetic profile in terms of discovering potential pathogenic variants, recommending therapeutic regimens, designing panels for genetic testing kits, and matching patients for clinical trials. PreMedKB is freely accessible and available at http://www.fudan-pgx.org/premedkb/index.html#/home.

show abstract

Assessing reproducibility of inherited variants detected with short-read whole genome sequencing

et al. 2022

View full text Add to dashboard Cite

Background Reproducible detection of inherited variants with whole genome sequencing (WGS) is vital for the implementation of precision medicine and is a complicated process in which each step affects variant call quality. Systematically assessing reproducibility of inherited variants with WGS and impact of each step in the process is needed for understanding and improving quality of inherited variants from WGS. Results To dissect the impact of factors involved in detection of inherited variants with WGS, we sequence triplicates of eight DNA samples representing two populations on three short-read sequencing platforms using three library kits in six labs and call variants with 56 combinations of aligners and callers. We find that bioinformatics pipelines (callers and aligners) have a larger impact on variant reproducibility than WGS platform or library preparation. Single-nucleotide variants (SNVs), particularly outside difficult-to-map regions, are more reproducible than small insertions and deletions (indels), which are least reproducible when > 5 bp. Increasing sequencing coverage improves indel reproducibility but has limited impact on SNVs above 30×. Conclusions Our findings highlight sources of variability in variant detection and the need for improvement of bioinformatics pipelines in the era of precision medicine with WGS.

show abstract

Inferring Program Transformations From Singular Examples via Big Code

Jiang

Ren

Xiong

et al. 2019

View full text Add to dashboard Cite

Quartet DNA reference materials and datasets for comprehensively evaluating germline variants calling performance

Ren

Duan

et al. 2022

Preprint

View full text Add to dashboard Cite

Current methods for evaluating the accuracy of germline variant calls are restricted to easy-to-detect high-confidence regions, thus ignoring a substantial portion of difficult variants beyond the benchmark regions. We established four DNA reference materials from immortalized cell lines derived from a Chinese Quartet including parents and monozygotic twins. We integrated benchmark calls of 4.2 million small variants and 15,000 structural variants from multiple platforms and bioinformatic pipelines for evaluating the reliability of germline variant calls inside the benchmark regions. The genetic built-in-truth of the Quartet family design not only improved sensitivity of benchmark calls by removing additional false positive variants with apparently high quality, but also enabled estimation of the precision of variants calls outside the benchmark regions. Batch effects of variant calling in large-scale DNA sequencing efforts can be effectively identified with the concurrent use of the Quartet DNA reference materials along with study samples, and can be alleviated by training a machine learning model with the Quartet reference datasets to remove potential artifact calls. Matched RNA and protein reference materials were also established in the Quartet project, thereby enabling benchmark calls constructed from DNA reference materials for evaluation of variants calling performance on RNA and protein data. The Quartet DNA reference materials from this study are a resource for objective and comprehensive assessment of the accuracy of germline variant calls throughout the whole-genome regions.

show abstract

Identifying Redundancies in Fork-based Development

Ren

Zhou

Kästner

et al. 2019

View full text Add to dashboard Cite

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

hi@scite.ai

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Luyao Ren

Genomic and Transcriptomic Landscape of Triple-Negative Breast Cancers: Subtypes and Treatment Strategies

Toward best practice in cancer mutation detection with whole-genome and whole-exome sequencing

Establishing community reference samples, data and call sets for benchmarking cancer mutation detection using whole-genome sequencing

PreMedKB: an integrated precision medicine knowledgebase for interpreting relationships between diseases, genes, variants and drugs

Assessing reproducibility of inherited variants detected with short-read whole genome sequencing

Inferring Program Transformations From Singular Examples via Big Code

Quartet DNA reference materials and datasets for comprehensively evaluating germline variants calling performance

Identifying Redundancies in Fork-based Development

Contact Info

Product

Resources

About