Weichen Zhou scite author profile

Long-read and strand-specific sequencing technologies together facilitate the de novo assembly of high-quality haplotype-resolved human genomes without parent–child trio data. We present 64 assembled haplotypes from 32 diverse human genomes. These highly contiguous haplotype assemblies (average contig N50: 26 Mbp) integrate all forms of genetic variation even across complex loci. We identify 107,590 structural variants (SVs), of which 68% are not discovered by short-read sequencing, and 278 SV hotspots (spanning megabases of gene-rich sequence). We characterize 130 of the most active mobile element source elements and find that 63% of all SVs arise by homology-mediated mechanisms. This resource enables reliable graph-based genotyping from short reads of up to 50,340 SVs, resulting in the identification of 1,526 expression quantitative trait loci as well as SV candidates for adaptive selection within the human population.

TBX6 Null Variants and a Common Hypomorphic Allele in Congenital Scoliosis

Wu, Xuan, Xiao et al. 2015, N Engl J Med

BACKGROUND: Congenital scoliosis is a common type of vertebral malformation. Genetic susceptibility has been implicated in congenital scoliosis. METHODS: We evaluated 161 Han Chinese persons with sporadic congenital scoliosis, 166 Han Chinese controls, and 2 pedigrees, family members of which had a 16p11.2 deletion, using comparative genomic hybridization, quantitative polymerase-chain-reaction analysis, and DNA sequencing. We carried out tests of replication using an additional series of 76 Han Chinese persons with congenital scoliosis and a multi-center series of 42 persons with 16p11.2 deletions. RESULTS: We identified a total of 17 heterozygous TBX6 null mutations in the 161 persons with sporadic congenital scoliosis (11%); we did not observe any null mutations in TBX6 in 166 controls (P<3.8×10⁻⁶). These null alleles include copy-number variants (12 instances of a 16p11.2 deletion affecting TBX6) and single-nucleotide variants (1 nonsense and 4 frame-shift mutations). However, the discordant intrafamilial phenotypes of 16p11.2 deletion carriers suggest that heterozygous TBX6 null mutation is insufficient to cause congenital scoliosis. We went on to identify a common TBX6 haplotype as the second risk allele in all 17 carriers of TBX6 null mutations (P<1.1×10⁻⁶). Replication studies involving additional persons with congenital scoliosis who carried a deletion affecting TBX6 confirmed this compound inheritance model. In vitro functional assays suggested that the risk haplotype is a hypomorphic allele. Hemivertebrae are characteristic of TBX6-associated congenital scoliosis. CONCLUSIONS: Compound inheritance of a rare null mutation and a hypomorphic allele of TBX6 accounted for up to 11% of congenital scoliosis cases in the series that we analyzed.

A robust benchmark for detection of germline large deletions and insertions

et al. 2020

Many diseases have been linked to SVs, most often defined as genomic changes at least 50 bp in size, but SVs are challenging to detect accurately. Conditions linked to SVs include autism [1], schizophrenia, cardiovascular disease [2], Huntington's disease and several other disorders [3]. Far fewer SVs exist in germline genomes relative to small variants, but SVs affect more base pairs, and each SV might be more likely to affect phenotype [4-6]. Although next-generation sequencing technologies can detect many SVs, each technology and analysis method has different strengths and weaknesses. To enable the community to

High-coverage whole-genome sequencing of the expanded 1000 Genomes Project cohort including 602 trios

Byrska-Bishop, Evani, Zhao et al. 2022, Cell

Identification and characterization of occult human-specific LINE-1 insertions using long-read sequencing technology

Zhou, Emery, Flasch et al. 2019

Long Interspersed Element-1 (LINE-1) retrotransposition contributes to inter- and intra-individual genetic variation and occasionally can lead to human genetic disorders. Various strategies have been developed to identify human-specific LINE-1 (L1Hs) insertions from short-read whole genome sequencing (WGS) data; however, they have limitations in detecting insertions in complex repetitive genomic regions. Here, we developed a computational tool (PALMER) and used it to identify 203 non-reference L1Hs insertions in the NA12878 benchmark genome. Using PacBio long-read sequencing data, we identified L1Hs insertions that were absent in previous short-read studies (90/203). Approximately 81% (73/90) of the L1Hs insertions reside within endogenous LINE-1 sequences in the reference assembly and the analysis of unique breakpoint junction sequences revealed 63% (57/90) of these L1Hs insertions could be genotyped in 1000 Genomes Project sequences. Moreover, we observed that amplification biases encountered in single-cell WGS experiments led to a wide variation in L1Hs insertion detection rates between four individual NA12878 cells; under-amplification limited detection to 32% (65/203) of insertions, whereas over-amplification increased false positive calls. In sum, these data indicate that L1Hs insertions are often missed using standard short-read sequencing approaches and long-read sequencing approaches can significantly improve the detection of L1Hs insertions present in individual genomes.

Predictive model for inflammation grades of chronic hepatitis B: Large‐scale analysis of clinical parameters and gene expressions

Zhou, Zhang et al. 2017, Liver International

A robust benchmark for germline structural variant detection

Zook, Nf, Nd et al. 2019, Preprint

New technologies and analysis methods are enabling genomic structural variants (SVs) to be detected with ever-increasing accuracy, resolution, and comprehensiveness. Translating these methods to routine research and clinical practice requires robust benchmark sets. We developed the first benchmark set for identification of both false negative and false positive germline SVs, which complements recent efforts emphasizing increasingly comprehensive characterization of SVs. To create this benchmark for a broadly consented son in a Personal Genome Project trio with broadly available cells and DNA, the Genome in a Bottle (GIAB) Consortium integrated 19 sequence-resolved variant calling methods, both alignment- and de novo assembly-based, from short-, linked-, and long-read sequencing, as well as optical and electronic mapping. The final benchmark set contains 12745 isolated, sequence-resolved insertion and deletion calls ≥50 base pairs (bp) discovered by at least 2 technologies or 5 callsets, genotyped as heterozygous or homozygous variants by long reads. The Tier 1 benchmark regions, for which any extra calls are putative false positives, cover 2.66 Gbp and 9641 SVs supported by at least one diploid assembly. Support for SVs was assessed using svviz with short-, linked-, and long-read sequence data. In general, there was strong support from multiple technologies for the benchmark SVs, with 90% of the Tier 1 SVs having support in reads from more than one technology. The Mendelian genotype error rate was 0.3%, and genotype concordance with manual curation was >98.7%. We demonstrate the utility of the benchmark set by showing it reliably identifies both false negatives and false positives in high-quality SV callsets from short-, linked-, and long-read sequencing and optical mapping.
GIAB is working towards a new version of the benchmark set that will use new technologies and methods such as PacBio Circular Consensus Sequencing and ultralong Oxford Nanopore sequencing to expand to more challenging genome regions and include more challenging SVs such as inversions. We are also developing a robust integration process to make calls on GRCh37 and GRCh38 for all seven GIAB samples.

Learning by explaining to oneself and a peer enhances learners’ theta and alpha oscillations while watching video lectures

Pi, Zhang, Zhou et al. 2020, Brit J Educational Tech

In the present study, we tested the effectiveness of three learning strategies (self‐explanation, learning by teaching and passive viewing) used by students who were learning from video lectures. Effectiveness was measured not only with traditional measures, but also with electroencephalography (EEG). Using a within‐subjects design, 26 university students viewed three sets of short lectures, each presenting a different set of English vocabulary words and were asked to use a different learning strategy for each set of lectures. Participants’ EEG signals were assessed while watching the videos; learning experience (self‐reported motivation and engagement) and learning performance (vocabulary recall test score) were assessed after watching the videos. Repeated measures ANOVAs showed that the self‐explaining and teaching strategies were more beneficial than the passive viewing strategy, as indicated by higher EEG theta and alpha band power, a more positive learning experience (higher motivation and engagement) and better learning performance. However, whereas the teaching strategy elicited greater neural oscillations related to working memory and attention compared to the self‐explanation strategy, the two groups did not differ on self‐reported learning experience or learning performance. Our findings are discussed in terms of potential application in courses using video lectures and in terms of their heuristic value for future research on the neural processes that differentiate learning strategies.

What is already known about this topic: Watching video lectures does not always result in learners actively making sense of the learning material. Self‐explaining facilitates deep learning from viewing video lectures and in traditional educational settings. Learning by teaching also facilitates deep learning in traditional educational settings.

What this paper adds: Learning by teaching resulted in the highest theta and alpha band power in EEG assessment while viewing video lectures. Compared with passive viewing, learning by teaching enhanced students’ motivation to try to understand the material; in addition, both learning by teaching and self‐explaining enhanced the amount of mental effort students put into understanding the material. Learning was increased via both self‐explaining and teaching strategies after viewing video lectures.

Implications for practice and/or policy: Learners are encouraged to generate explanations during pauses in video lectures or after viewing them, in order to increase learning. Learners are also encouraged to learn by teaching, as this strategy can increase learning and also increase neural oscillations associated with memory and attention.

Weichen Zhou

Haplotype-resolved diverse human genomes and integrated analysis of structural variation
