Wan‐Ping Lee scite author profile

Fast and accurate genomic analyses using genome graphs

The human reference genome serves as the foundation for genomics by providing a scaffold for alignment of sequencing reads, but it currently reflects only a single consensus haplotype, which impairs read alignment and downstream analysis accuracy. Reference genome structures incorporating known genetic variation have been shown to improve the accuracy of genomic analyses, but have so far remained computationally prohibitive for routine large-scale use. Here we present a graph genome implementation that enables read alignment across 2,800 diploid genomes encompassing 12.6 million SNPs and 4.0 million indels. Our Graph Genome Pipeline requires 6.5 hours to process a 30x coverage WGS sample on a system with 36 CPU cores, compared with 11 hours required by the GATK Best Practices pipeline. Using complementary benchmarking experiments based on real and simulated data, we show that using a graph genome reference improves read mapping sensitivity and produces a 0.5% increase in variant calling recall, or about 20,000 additional variants detected per sample, while variant calling specificity is unaffected. Structural variations (SVs) incorporated into a graph genome can be genotyped accurately under a unified framework. Finally, we show that iterative augmentation of graph genomes yields incremental gains in variant calling accuracy. Our implementation is a significant advance towards fulfilling the promise of graph genomes to radically enhance the scalability and accuracy of genomic analyses.

FusorSV: an algorithm for optimally combining data from multiple structural variation detection methods

Becker, Lee, Leone et al. 2018

Genome Biol

Comprehensive and accurate identification of structural variations (SVs) from next-generation sequencing data remains a major challenge. We develop FusorSV, which uses a data mining approach to assess performance and merge callsets from an ensemble of SV-calling algorithms. It includes a fusion model built using analysis of 27 deep-coverage human genomes from the 1000 Genomes Project. We identify 843 novel SV calls that were not reported by the 1000 Genomes Project for these 27 samples. Experimental validation of a subset of these calls yields a validation rate of 86.7%. FusorSV is available at https://github.com/TheJacksonLaboratory/SVE.

Electronic supplementary material: The online version of this article (10.1186/s13059-018-1404-6) contains supplementary material, which is available to authorized users.

Expectations and blind spots for structural variation detection from long-read assemblies and short-read genome sequencing technologies

Zhao, Collins, Lee et al. 2021

The American Journal of Human Genetics

One reference genome is not enough

Yang, Lee et al. 2019

Genome Biol

Voltage island aware floorplanning for power and timing optimization

Lee, Liu, Chang 2006

Power consumption is a crucial concern in nanometer chip design. Researchers have shown that using multiple supply voltages (MSV) is an effective method for reducing power consumption. The underlying idea behind MSV is the trade-off between power saving and performance. In this paper, we present an effective voltage assignment technique based on dynamic programming. Given a netlist without reconvergent fanouts, dynamic programming guarantees an optimal voltage assignment. We then generate a level shifter for each net that connects two blocks in different voltage domains, and perform power-network aware floorplanning for the MSV design. Experimental results show that our floorplanner is very effective in optimizing power consumption under timing constraints.
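As a rough illustration of why the absence of reconvergent fanouts matters, the sketch below runs a dynamic program over a tree-shaped fan-in cone. It is not the paper's algorithm: the gate names, the two voltage levels, the integer power and delay values, and the function `min_power` are all hypothetical, and level shifters and floorplanning are ignored. Because each gate's fan-in subtrees share no gates, they can be optimized independently for a given remaining delay budget.

```python
from functools import lru_cache

# Hypothetical netlist: gate -> (fan-in gates, {voltage: (power, delay)}).
# The fan-in structure is a tree, i.e. there are no reconvergent fanouts.
GATES = {
    "g1": ((), {"high": (4, 1), "low": (2, 2)}),
    "g2": ((), {"high": (4, 1), "low": (2, 2)}),
    "g3": (("g1", "g2"), {"high": (6, 1), "low": (3, 3)}),
}

INF = float("inf")


@lru_cache(maxsize=None)
def min_power(gate: str, budget: int) -> float:
    """Least total power of the fan-in tree rooted at `gate` such that the
    signal reaches the gate's output within `budget` time units.
    Returns INF when no voltage assignment meets the budget."""
    fanin, options = GATES[gate]
    best = INF
    for power, delay in options.values():
        remaining = budget - delay
        if remaining < 0:
            continue  # this voltage alone already violates the budget
        # Subtrees are disjoint, so each child is optimized independently.
        total = power + sum(min_power(child, remaining) for child in fanin)
        best = min(best, total)
    return best
```

With a budget of 5 time units every gate in this toy netlist can run at the low voltage; tightening the budget to 4 forces the output gate to the high voltage, and a budget of 1 is infeasible.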

Biomonitoring of alkylphenols exposure for textile and housekeeping workers

Chen, Lee, Chung et al. 2005

International Journal of Environmental Analytical Chemistry

A Provably Good Approximation Algorithm for Power Optimization Using Multiple Supply Voltages

Liu, Lee, Chang 2007

Polygenic Risk Scores in Alzheimer’s Disease Genetics: Methodology, Applications, Inclusion, and Diversity

Clark, Leung, Lee et al. 2022

Journal of Alzheimer’s Disease (JAD)

The success of genome-wide association studies (GWAS) completed in the last 15 years has reinforced a key fact: polygenic architecture makes a substantial contribution to variation in susceptibility to complex disease, including Alzheimer’s disease. One straightforward way to capture this architecture and predict which individuals in a population are most at risk is to calculate a polygenic risk score (PRS). This score aggregates the risk conferred across multiple genetic variants, ultimately representing an individual’s predicted genetic susceptibility to a disease. PRS have received increasing attention after being used successfully for complex traits. This has brought with it renewed attention to new methods that improve the accuracy of risk prediction. While these applications are initially informative, their utility is far from equitable: the majority of PRS models use samples drawn heavily, if not entirely, from individuals of European descent. This basic approach raises concerns of health equity if applied inaccurately to other population groups, or health disparity if we fail to use them at all. In this review we examine the methods of calculating PRS and some of their previous uses in disease prediction. We also advocate, with supporting scientific evidence, for the inclusion of data from diverse populations in existing and future studies of population risk via PRS.
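The aggregation described above is commonly computed as a weighted sum of risk-allele dosages, PRS = Σ_j β_j · x_j. The following is a minimal Python sketch under that assumption, with per-variant effect sizes (for example, GWAS log odds ratios) and dosages of 0, 1, or 2; the variant IDs, weights, and the function name are hypothetical and are not taken from the review.

```python
from typing import Mapping


def polygenic_risk_score(
    dosages: Mapping[str, float], weights: Mapping[str, float]
) -> float:
    """Weighted sum of risk-allele dosages.

    Variants that have a weight but no genotype for this individual are
    skipped here; real pipelines often impute them instead.
    """
    return sum(w * dosages[v] for v, w in weights.items() if v in dosages)


# Hypothetical effect sizes and one individual's risk-allele counts.
weights = {"rs_a": 0.25, "rs_b": 0.5, "rs_c": 1.0}
dosages = {"rs_a": 2, "rs_b": 1, "rs_c": 0}

score = polygenic_risk_score(dosages, weights)  # 0.25*2 + 0.5*1 + 1.0*0
```

Because the weights are estimated in a particular GWAS cohort, a score computed this way inherits that cohort's ancestry composition, which is the equity concern the abstract raises.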

