Qi Zhao scite author profile

The Sequence of the Human Genome

A 2.91-billion base pair (bp) consensus sequence of the euchromatic portion of the human genome was generated by the whole-genome shotgun sequencing method. The 14.8-billion bp DNA sequence was generated over 9 months from 27,271,853 high-quality sequence reads (5.11-fold coverage of the genome) from both ends of plasmid clones made from the DNA of five individuals. Two assembly strategies—a whole-genome assembly and a regional chromosome assembly—were used, each combining sequence data from Celera and the publicly funded genome effort. The public data were shredded into 550-bp segments to create a 2.9-fold coverage of those genome regions that had been sequenced, without including biases inherent in the cloning and assembly procedure used by the publicly funded group. This brought the effective coverage in the assemblies to eightfold, reducing the number and size of gaps in the final assembly over what would be obtained with 5.11-fold coverage. The two assembly strategies yielded very similar results that largely agree with independent mapping data. The assemblies effectively cover the euchromatic regions of the human chromosomes. More than 90% of the genome is in scaffold assemblies of 100,000 bp or more, and 25% of the genome is in scaffolds of 10 million bp or larger. Analysis of the genome sequence revealed 26,588 protein-encoding transcripts for which there was strong corroborating evidence and an additional ∼12,000 computationally derived genes with mouse matches or other weak supporting evidence. Although gene-dense clusters are obvious, almost half the genes are dispersed in low G+C sequence separated by large tracts of apparently noncoding sequence. Only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. Duplications of segmental blocks, ranging in size up to chromosomal lengths, are abundant throughout the genome and reveal a complex evolutionary history. 
Comparative genomic analysis indicates vertebrate expansions of genes associated with neuronal function, with tissue-specific developmental regulation, and with the hemostasis and immune systems. DNA sequence comparisons between the consensus sequence and publicly funded genome data provided locations of 2.1 million single-nucleotide polymorphisms (SNPs). A random pair of human haploid genomes differed at a rate of 1 bp per 1250 on average, but there was marked heterogeneity in the level of polymorphism across the genome. Less than 1% of all SNPs resulted in variation in proteins, but the task of determining which SNPs have functional consequences remains an open challenge.
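
The coverage figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the constant and function names are hypothetical, the genome size behind the stated 5.11-fold figure is not given in the abstract and is inferred here, and adding the two coverage values is a simplifying assumption, since the 2.9-fold figure applies only to regions the public effort had sequenced.

```python
# Figures quoted in the abstract; all names here are illustrative.
TOTAL_READ_BP = 14.8e9           # total sequence generated from reads
NUM_READS = 27_271_853           # high-quality sequence reads
CONSENSUS_BP = 2.91e9            # length of the consensus sequence
CELERA_COVERAGE = 5.11           # stated fold coverage from Celera reads
SHREDDED_PUBLIC_COVERAGE = 2.9   # stated fold coverage from shredded public data


def mean_read_length(total_bp: float, num_reads: int) -> float:
    """Average number of base pairs per read."""
    return total_bp / num_reads


def fold_coverage(total_bp: float, genome_bp: float) -> float:
    """Total sequenced bases divided by genome length."""
    return total_bp / genome_bp


mean_len = mean_read_length(TOTAL_READ_BP, NUM_READS)

# Coverage relative to the 2.91-billion bp consensus (an assumption:
# the abstract does not say which genome size the 5.11 figure uses).
naive_cov = fold_coverage(TOTAL_READ_BP, CONSENSUS_BP)

# Genome size implied by the stated 5.11-fold coverage (inferred).
implied_genome_bp = TOTAL_READ_BP / CELERA_COVERAGE

# Simple additive estimate of the combined coverage (assumption).
combined_cov = CELERA_COVERAGE + SHREDDED_PUBLIC_COVERAGE

print(f"mean read length:      {mean_len:.0f} bp")
print(f"coverage vs consensus: {naive_cov:.2f}-fold")
print(f"implied genome size:   {implied_genome_bp / 1e9:.2f} billion bp")
print(f"combined coverage:     {combined_cov:.2f}-fold")
```

Under these assumptions the mean read length comes out near 543 bp and the combined coverage near eightfold, which is consistent with the abstract's statement that the shredded public data brought the effective coverage to eightfold.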

The Genome Sequence of Drosophila melanogaster

Adams, Celniker, Holt et al. 2000, Science

5,369

3,418

The fly Drosophila melanogaster is one of the most intensively studied organisms in biology and serves as a model system for the investigation of many developmental and cellular processes common to higher eukaryotes, including humans. We have determined the nucleotide sequence of nearly all of the ∼120-megabase euchromatic portion of the Drosophila genome using a whole-genome shotgun sequencing strategy supported by extensive clone-based sequence and a high-quality bacterial artificial chromosome physical map. Efforts are under way to close the remaining gaps; however, the sequence is of sufficient accuracy and contiguity to be declared substantially complete and to support an initial analysis of genome structure and preliminary gene annotation and interpretation. The genome encodes ∼13,600 genes, somewhat fewer than the smaller Caenorhabditis elegans genome, but with comparable functional diversity.

Targeting CXCL12 from FAP-expressing carcinoma-associated fibroblasts synergizes with anti–PD-L1 immunotherapy in pancreatic cancer

Feig, Jones, Kraman et al. 2013, Proc. Natl. Acad. Sci. U.S.A.

1,488

1,379

Significance: Cancer immune evasion is well described. In some cases, this may be overcome by enhancing T-cell responses. We show that despite the presence of antitumor T cells, immunotherapeutic antibodies are ineffective in a murine pancreatic cancer model recapitulating the human disease. Removing the carcinoma-associated fibroblasts (CAFs) that express fibroblast activation protein (FAP) from tumors permitted immune control of tumor growth and uncovered the efficacy of these immunotherapeutic antibodies. FAP+ CAFs are the only tumoral source of chemokine (C-X-C motif) ligand 12 (CXCL12), and administering AMD3100, an inhibitor of chemokine (C-X-C motif) receptor 4, a CXCL12 receptor, also revealed the antitumor effects of an immunotherapeutic antibody and greatly diminished cancer cells. These findings may have wide clinical relevance because FAP+ cells are found in almost all human adenocarcinomas.

The Genome Sequence of the Malaria Mosquito Anopheles gambiae

Holt, Subramanian, Halpern et al. 2002, Science

1,816

1,177

Anopheles gambiae is the principal vector of malaria, a disease that afflicts more than 500 million people and causes more than 1 million deaths each year. Tenfold shotgun sequence coverage was obtained from the PEST strain of A. gambiae and assembled into scaffolds that span 278 million base pairs. A total of 91% of the genome was organized in 303 scaffolds; the largest scaffold was 23.1 million base pairs. There was substantial genetic variation within this strain, and the apparent existence of two haplotypes of approximately equal frequency (“dual haplotypes”) in a substantial fraction of the genome likely reflects the outbred nature of the PEST strain. The sequence produced a conservative inference of more than 400,000 single-nucleotide polymorphisms that showed a markedly bimodal density distribution. Analysis of the genome sequence revealed strong evidence for about 14,000 protein-encoding transcripts. Prominent expansions in specific families of proteins likely involved in cell adhesion and immunity were noted. An expressed sequence tag analysis of genes regulated by blood feeding provided insights into the physiological adaptations of a hematophagous insect.

Genome Sequence of Aedes aegypti, a Major Arbovirus Vector

Nene, Wortman, Lawson et al. 2007, Science

1,018

1,059

We present a draft sequence of the genome of Aedes aegypti, the primary vector for yellow fever and dengue fever, which at ~1.38 Gbp is ~5-fold larger than the genome of the malaria vector Anopheles gambiae. Nearly 50% of the Aedes aegypti genome consists of transposable elements. These contribute to a ~4- to 6-fold increase in average gene length and in the size of intergenic regions relative to Anopheles gambiae and Drosophila melanogaster. Nevertheless, chromosomal synteny is generally maintained among all three insects, although conservation of orthologous gene order is higher (~2-fold) between the two mosquito species than between either of them and the fruit fly. Three methods have provided transcriptional evidence for 80% of the 15,419 predicted protein-coding genes in Aedes aegypti. An increase in genes encoding odorant binding, cytochrome P450, and cuticle domains relative to Anopheles gambiae suggests that members of these protein families underpin some of the biological differences between the two mosquito species.

Comparative Genomics of the Eukaryotes

Rubin, Yandell, Wortman et al. 2000, Science

1,515

873

A comparative analysis of the genomes of Drosophila melanogaster, Caenorhabditis elegans, and Saccharomyces cerevisiae, and of the proteins they are predicted to encode, was undertaken in the context of cellular, developmental, and evolutionary processes. The nonredundant protein sets of flies and worms are similar in size and are only twice that of yeast, but different gene families are expanded in each genome, and the multidomain proteins and signaling pathways of the fly and worm are far more complex than those of yeast. The fly has orthologs to 177 of the 289 human disease genes examined and provides the foundation for rapid analysis of some of the basic processes involved in human disease.

Design of Wide-Spectrum Inhibitors Targeting Coronavirus Main Proteases

et al. 2005

The genus Coronavirus contains about 25 species of coronaviruses (CoVs), which are important pathogens that cause highly prevalent and often severe or fatal diseases in humans and animals. No licensed specific drugs are available to prevent their infection. Different host receptors for cellular entry, poorly conserved structural proteins (antigens), and the high mutation and recombination rates of CoVs pose a significant problem in the development of wide-spectrum anti-CoV drugs and vaccines. CoV main proteases (Mpros), which are key enzymes in viral gene expression and replication, were revealed to share a highly conserved substrate-recognition pocket by comparison of four crystal structures and a homology model representing all three genetic clusters of the genus Coronavirus. This conclusion was further supported by enzyme activity assays. Mechanism-based irreversible inhibitors were designed, based on this conserved structural region, and a uniform inhibition mechanism was elucidated from the structures of Mpro-inhibitor complexes from severe acute respiratory syndrome-CoV and porcine transmissible gastroenteritis virus. A structure-assisted optimization program has yielded compounds with fast in vitro inactivation of multiple CoV Mpros, potent antiviral activity, and extremely low cellular toxicity in cell-based assays. Further modification could rapidly lead to the discovery of a single agent with clinical potential against existing and possible future emerging CoV-related diseases.

Structures of Two Coronavirus Main Proteases: Implications for Substrate Binding and Antiviral Drug Design

Xue, Yang et al. 2008, J Virol

424

459

