Yao Tong scite author profile

Advances in next-generation sequencing (NGS) in recent years have greatly increased the availability of mitochondrial (mt) genome sequences. The mt genome is a powerful tool for comparative studies and for resolving the phylogenetic relationships among insect lineages. The mt genomes of phytophagous scarabs of the subfamilies Cetoniinae and Dynastinae were under-represented in GenBank. Previous research recovered the subfamily Rutelinae as a paraphyletic group because the few representatives of the subfamily Dynastinae clustered within Rutelinae, but the position of Dynastinae was still unclear. In the present study, we sequenced 18 mt genomes from Dynastinae and Cetoniinae using NGS to re-assess the phylogenetic relationships within Scarabaeidae. All sequenced mt genomes contained 37 genes (13 protein-coding genes, 22 tRNAs, and two ribosomal RNAs) and one long control region, but the gene order differed between Cetoniinae and Dynastinae species. All mt genomes of Dynastinae species showed the same gene rearrangement of trnQ-NCR-trnI-trnM, whereas all mt genomes of Cetoniinae species showed the ancestral insect gene order of trnI-trnQ-trnM. Phylogenetic analyses (IQ-TREE and MrBayes) were conducted using the 13 protein-coding genes, based on nucleotide and amino acid datasets. In the maximum likelihood (ML) and Bayesian inference (BI) trees, we recovered the monophyly of Rutelinae, Cetoniinae, Dynastinae, and Sericinae, and the non-monophyly of Melolonthinae. Cetoniinae was shown to be a sister clade to (Dynastinae + Rutelinae).

Genome-wide detection of short tandem repeat expansions by long-read sequencing

Liu

Tong

Wang

2020

BMC Bioinformatics

Background: A short tandem repeat (STR), or “microsatellite”, is a tract of DNA in which a specific motif (typically < 10 base pairs) is repeated multiple times. STRs are abundant throughout the human genome, and specific repeat expansions may be associated with human diseases. Long-read sequencing coupled with bioinformatics tools enables the estimation of repeat counts for STRs. However, with the exception of a few well-known disease-relevant STRs, normal ranges of repeat counts for most STRs in human populations are not well known, preventing the prioritization of STRs that may be associated with human diseases. Results: In this study, we extend a computational tool, RepeatHMM, to infer normal ranges of 432,604 STRs using 21 long-read sequencing datasets on human genomes, and build a genomic-scale database called RepeatHMM-DB with normal repeat ranges for these STRs. Evaluation on 13 well-known repeats shows that the inferred repeat ranges provide good estimates of the repeat ranges reported in the literature from population-scale studies. This database, together with a repeat expansion estimation tool such as RepeatHMM, enables genomic-scale scanning of repeat regions in newly sequenced genomes to identify disease-relevant repeat expansions. As a case study of using RepeatHMM-DB, we evaluate the CAG repeats of ATXN3 for 20 patients with spinocerebellar ataxia type 3 (SCA3) and 5 unaffected individuals, and correctly classify each individual. Conclusions: In summary, RepeatHMM-DB can facilitate prioritization and identification of disease-relevant STRs from whole-genome long-read sequencing data on patients with undiagnosed diseases. RepeatHMM-DB is incorporated into RepeatHMM and is available at https://github.com/WGLab/RepeatHMM.
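The idea of comparing an estimated repeat count against a normal range can be illustrated with a minimal sketch. This is not RepeatHMM's method: RepeatHMM uses a hidden Markov model to cope with long-read sequencing errors, whereas the code below assumes an error-free sequence and exact motif matching. The function names `longest_repeat_run` and `classify_repeat`, and the range passed in the usage note, are hypothetical and chosen only for illustration.

```python
import re
from typing import Tuple


def longest_repeat_run(sequence: str, motif: str) -> int:
    """Return the largest number of consecutive exact copies of `motif`.

    Assumption: the sequence is error-free, so exact matching is enough.
    Real long reads contain errors and need a tolerant model instead.
    """
    if not motif:
        raise ValueError("motif must be a non-empty string")
    # Match one or more back-to-back copies of the motif.
    pattern = re.compile(f"(?:{re.escape(motif.upper())})+")
    best = 0
    for match in pattern.finditer(sequence.upper()):
        best = max(best, len(match.group(0)) // len(motif))
    return best


def classify_repeat(count: int, normal_range: Tuple[int, int]) -> str:
    """Compare a repeat count with an inclusive (low, high) normal range."""
    low, high = normal_range
    if count > high:
        return "expanded"
    if count < low:
        return "below-range"
    return "normal"
```

For example, `classify_repeat(longest_repeat_run(read, "CAG"), (low, high))` would flag a read whose CAG run exceeds `high`, where `low` and `high` stand in for a range looked up in a resource such as RepeatHMM-DB.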

OCRDetector: Accurately Detecting Open Chromatin Regions via Plasma Cell-Free DNA Sequencing Data

Wang

Chen

Zhang

et al. 2021

IJMS

Open chromatin regions (OCRs) are special regions of the human genome that can be accessed by DNA regulatory elements. Several studies have reported that a series of OCRs are associated with mechanisms involved in human diseases, such as cancers. Identifying OCRs using ATAC-seq or DNase-seq is often expensive. It has become popular to detect OCRs from plasma cell-free DNA (cfDNA) sequencing data, because both the fragmentation modes of cfDNA and the sequencing coverage in OCRs are significantly different from those in other regions. However, it is a challenging computational problem to accurately detect OCRs from plasma cfDNA-seq data, as multiple factors—e.g., sequencing and mapping bias, insufficient read depth, etc.—often mislead the computational model. In this paper, we propose a novel bioinformatics pipeline, OCRDetector, for detecting OCRs from whole-genome cfDNA sequencing data. The pipeline calculates the window protection score (WPS) waveform and the cfDNA sequencing coverage. To validate the proposed pipeline, we compared the percentage overlap of our OCRs with those obtained by other methods. The experimental results show that 81% of the TSS regions of housekeeping genes are detected, and our results have obvious tissue specificity. In addition, the overlap percentage between our OCRs and the high-confidence OCRs obtained by ATAC-seq or DNase-seq is greater than 70%.
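The window protection score mentioned above can be sketched as follows. The sketch uses the commonly cited definition of WPS (fragments that completely span a window centered on a position, minus fragments with an endpoint inside that window). Whether OCRDetector computes it in exactly this way, and with which window size, is an assumption; the function name, the inclusive coordinate convention, and the default 120 bp window are illustrative choices, not taken from the abstract.

```python
from typing import List, Sequence, Tuple


def window_protection_score(
    fragments: Sequence[Tuple[int, int]],
    start: int,
    end: int,
    window: int = 120,
) -> List[int]:
    """Compute a WPS value for every position in [start, end).

    `fragments` holds (frag_start, frag_end) pairs with inclusive
    coordinates. For each position, a window of `window` bp is centered
    on it; the score is the number of fragments spanning the whole
    window minus the number with at least one endpoint inside it.
    """
    half = window // 2
    scores = []
    for pos in range(start, end):
        w_start, w_end = pos - half, pos + half
        spanning = 0
        ending = 0
        for f_start, f_end in fragments:
            if f_start <= w_start and f_end >= w_end:
                spanning += 1
            elif w_start <= f_start <= w_end or w_start <= f_end <= w_end:
                ending += 1
        scores.append(spanning - ending)
    return scores
```

This brute-force loop is quadratic in the worst case and is meant only to make the definition concrete; a genome-scale implementation would use sorted fragment endpoints or coverage arrays instead.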

Cryptic Species Exist in Vietnamella sinensis Hsu, 1936 (Insecta: Ephemeroptera) from Studies of Complete Mitochondrial Genomes

Tong

Ayivi

et al. 2022

Insects

Ephemeroptera (Insecta: Pterygota) are distributed worldwide, with more than 3500 species. During the last decade, the phylogenetic relationships within Ephemeroptera have been a hot topic of research, especially regarding the phylogenetic relationships among Vietnamellidae. In this study, three mitochondrial genomes from three populations of Vietnamella sinensis collected from Tonglu (V. sinensis TL), Chun’an (V. sinensis CN), and Qingyuan (V. sinensis QY) in Zhejiang Province, China were compared to discuss the potential existence of cryptic species. We also established their phylogenetic relationship by combining the mt genomes of 69 Ephemeroptera downloaded from NCBI. The mt genomes of V. sinensis TL, V. sinensis CN, and V. sinensis QY showed the same gene arrangement with lengths of 15,674 bp, 15,674 bp, and 15,610 bp, respectively. Comprehensive analyses of these three mt genomes revealed significant differences in mt genome organization, genetic distance, and divergence time. Our results showed that the specimens collected from Chun’an and Tonglu in Zhejiang Province, China belonged to V. sinensis, and the specimens collected from Qingyuan, Zhejiang Province, China were a cryptic species of V. sinensis. In maximum likelihood (ML) and Bayesian inference (BI) phylogenetic trees, the monophyly of the family Vietnamellidae was supported, and Vietnamellidae was closely related to Ephemerellidae.

A DNA microarray for differentiation of the Chinese medicinal herb Dendrobium officinale (Fengdou Shihu) by its 5 S ribosomal DNA intergenic spacer region

Sze

Zhang

Shaw

et al. 2008

Biotech and App Biochem

A DNA microarray was constructed for high-throughput identification of the plant source of commercial FDSH [Fengdou Shihu (Dendrobium officinale)]. The 5 S rDNA (ribosomal DNA) intergenic spacer region in D. officinale, D. nobile, D. moniliforme, D. hercoglossum, D. williamsonii, D. capillipes, D. wilsonii and D. jenkinsii was amplified by a single primer pair and sequenced. The sequences showed polymorphism. They were spotted onto a glass slide and hybridized with fluorescently labelled 5 S sequences from commercial Shihu. The DNA microarray enabled the differentiation of D. officinale from the other species tested. FDSH could thus be distinguished from its adulterants. DNA microarrays therefore provide a high-throughput and reliable approach for identifying plant sources, and the method presented here is useful for the authentication of FDSH.

The Mitochondrial Genomes of 18 New Pleurosticti (Coleoptera: Scarabaeidae) Exhibit a Novel trnQ-NCR-trnI-trnM Gene Rearrangement and Clarify Phylogenetic Relationships of Subfamilies within Scarabaeidae
