The quality of data generated by high-throughput DNA sequencing tools must be rapidly assessed in order to determine how useful the data may be in making biological discoveries; higher quality data leads to more confident results and conclusions. Due to the ever-increasing size of data sets and the importance of rapid quality assessment, tools that analyze sequencing data should quickly produce easily interpretable graphics. Quack addresses these issues by generating information-dense visualizations from FASTQ files at a speed far surpassing other publicly available quality assurance tools in a manner independent of sequencing technology.
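To make the kind of summary behind such quality graphics concrete, the following is a minimal sketch of computing the mean Phred score at each read position from a FASTQ stream. It is not Quack's implementation: the function name `mean_quality_by_position`, the Phred+33 offset, and the assumption of unwrapped four-line records are all illustrative choices.

```python
from io import StringIO


def mean_quality_by_position(handle, offset=33):
    """Return the mean Phred score at each read position.

    Assumes four-line FASTQ records (no wrapped sequence or quality lines)
    and Phred+33 quality encoding by default.
    """
    totals, counts = [], []
    while True:
        header = handle.readline()
        if not header:  # end of input
            break
        handle.readline()  # sequence line (unused here)
        handle.readline()  # '+' separator line
        quality = handle.readline().rstrip("\n")
        for i, char in enumerate(quality):
            if i == len(totals):  # first read that reaches this position
                totals.append(0)
                counts.append(0)
            totals[i] += ord(char) - offset
            counts[i] += 1
    return [total / count for total, count in zip(totals, counts)]


# Two toy reads of different lengths: 'I' encodes Q40 and '5' encodes Q20.
example = StringIO("@r1\nACGT\n+\nIIII\n@r2\nACG\n+\n5II\n")
means = mean_quality_by_position(example)
print(means)
```

Because reads may differ in length, each position is averaged only over the reads that reach it, which is why the sketch tracks a count per position.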
We employed phylogenomic methods to study molecular evolutionary processes and phylogeny in the geographically widely dispersed New World diploid cottons (Gossypium, subg. Houzingenia). Whole genome resequencing data (average of 33× genomic coverage) were generated to reassess the phylogenetic history of the subgenus and provide a temporal framework for its diversification. Phylogenetic analyses indicate that the subgenus likely originated following transoceanic dispersal from Africa about 6.6 Ma, but that nearly all of the biodiversity evolved following rapid diversification in the mid-Pleistocene (0.5–2.0 Ma), with multiple long-distance dispersals required to account for range expansion to Arizona, the Galapagos Islands, and Peru. Comparative analyses of cpDNA versus nuclear data indicate that this history was accompanied by several clear cases of interspecific introgression. Repetitive DNAs contribute roughly half of the total 880 Mb genome, but most transposable element families are relatively old and stable among species. In the genic fraction, pairwise synonymous mutation rates average 1% per Myr, with nonsynonymous changes being about seven times less frequent. Over 1.1 million indels were detected and phylogenetically polarized, revealing a 2-fold bias toward deletions over small insertions. We suggest that this genome down-sizing bias counteracts genome size growth by TE amplification and insertions, and helps explain the relatively small genomes that are restricted to this subgenus. Compared with the rate of nucleotide substitution, the rate of indel occurrence is much lower, averaging about 17 nucleotide substitutions per indel event.
Long-distance insular dispersal is associated with divergence and speciation because of founder effects and strong genetic drift. The cotton tribe (Gossypieae) has experienced multiple transoceanic dispersals, generating an aggregate geographic range that encompasses much of the tropics and subtropics worldwide. Two genera in the Gossypieae, Kokia and Gossypioides, exhibit a remarkable geographic disjunction, being restricted to the Hawaiian Islands and Madagascar/East Africa, respectively. We assembled and used de novo genome sequences to address questions regarding the divergence of these two genera from each other and from their sister-group, Gossypium. In addition, we explore processes underlying the genome downsizing that characterizes Kokia and Gossypioides relative to other genera in the tribe. Using 13,000 gene orthologs and synonymous substitution rates, we show that the two disjuncts last shared a common ancestor ∼5 Ma, or half as long ago as their divergence from Gossypium. We report relative stasis in the transposable element fraction. In comparison to Gossypium, there is loss of ∼30% of the gene content in the two disjunct genera and a history of genome-wide accumulation of deletions. In both genera, there is a genome-wide bias toward deletions over insertions, and the number of gene losses exceeds the number of gains by ∼2- to 4-fold. The genomic analyses presented here elucidate genomic consequences of the demographic and biogeographic history of these closest relatives of Gossypium, and enhance their value as phylogenetic outgroups.
In recent years, a bioinformatics method for interpreting genome-wide association study (GWAS) data using metabolic pathway analysis has been developed and successfully used to find significant pathways and mechanisms explaining phenotypic traits of interest in plants. However, the many scripts implementing this method were not straightforward to use, had to be customized for each project, required user supervision, and took more than 24 h to process data. PAST (Pathway Association Study Tool), a new implementation of this method, has been developed to address these concerns. PAST has been implemented as a package for the R language. Two user-interfaces are provided; PAST can be run by loading the package in R and calling its methods, or by using an R Shiny guided user interface. In testing, PAST completed analyses in approximately half an hour to one hour by processing data in parallel and produced the same results as the previously developed method. PAST has many user-specified options for maximum customization. Thus, to promote a powerful new pathway analysis methodology that interprets GWAS data to find biological mechanisms associated with traits of interest, we developed a more accessible, efficient, and user-friendly tool. These attributes make PAST accessible to researchers interested in associating metabolic pathways with GWAS datasets to better understand the genetic architecture and mechanisms affecting phenotypes.
Summary Maize (Zea mays mays) oil is a rich source of polyunsaturated fatty acids (FAs) and energy, making it a valuable resource for human food, animal feed, and bio‐energy. Although this trait has been studied via conventional genome‐wide association study (GWAS), the single nucleotide polymorphism (SNP)‐trait associations generated by GWAS may miss the underlying associations when traits are based on many genes, each with small effects that can be overshadowed by genetic background and environmental variation. Detecting these SNPs statistically is also limited by the levels set for false discovery rate. A complementary pathways analysis that emphasizes the cumulative aspects of SNP‐trait associations, rather than just the significance of single SNPs, was performed to understand the balance of lipid metabolism, conversion, and catabolism in this study. This pathway analysis indicated that acyl‐lipid pathways, including biosynthesis of wax esters, sphingolipids, phospholipids and flavonoids, along with FA and triacylglycerol (TAG) biosynthesis, were important for increasing oil and FA content. The allelic variation found among the genes involved in many degradation pathways, and many biosynthesis pathways leading from FAs and carbon partitioning pathways, was critical for determining final FA content, changing FA ratios and, ultimately, determining final oil content. The pathways and pathway networks identified in this study, and especially the acyl‐lipid associated pathways identified beyond what had been found with GWAS alone, provide a real opportunity to precisely and efficiently guide the genetic improvement of high‐oil maize.
Background A key use of high throughput sequencing technology is the sequencing and assembly of full genome sequences. These genome assemblies are commonly assessed using statistics relating to contiguity of the assembly. Measures of contiguity are not strongly correlated with information about the biological completion or correctness of the assembly, and a commonly reported metric, N50, can be misleading. Over the years, multiple research groups have rejected the overuse of N50 and sought to develop more informative metrics. Results This paper presents a review of problems that arise from relying solely on contiguity as a measure of genome assembly quality as well as current alternative methods. Alternative methods are compared on the basis of how informative they are about the biological quality of the assembly and how easy they are to use. A comprehensive method for using multiple metrics of measuring assembly quality is presented. Conclusions This study aims to report on the status of assembly assessment methods and compare them, as well as to offer a comprehensive method that incorporates multiple facets of quality assessment. Weaknesses and strengths of varying methods are presented and explained, with recommendations based on speed of analysis and user friendliness.
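Since N50 is central to the argument above, a short sketch of how it is computed may help show why it can mislead. The contig lengths below are invented for illustration and do not come from the paper, and the function name `n50` is an assumption of this sketch.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length of the contig at which the running total, taken
    from the longest contig downward, first reaches half of the total
    assembly length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("N50 is undefined for an empty list of contigs")


# Hypothetical assembly of five contigs (arbitrary length units).
separate = [80, 70, 50, 30, 20]

# The same total length, but with the 80 and 70 contigs joined into one.
# If that join were a misassembly, N50 would still reward it.
joined = [150, 50, 30, 20]

print(n50(separate), n50(joined))
```

The second assembly scores more than twice as high even though nothing about its biological correctness has improved, which illustrates why contiguity statistics are usually read alongside completeness and correctness measures.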
Aeromonas veronii is a Gram-negative species ubiquitous in different aquatic environments and capable of causing a variety of diseases to a broad host range. Aeromonas species have the capability to carry and acquire antimicrobial resistance (AMR) elements, and currently multi-drug resistant (MDR) Aeromonas isolates are commonly found across the world. A. veronii strain MS-17-88 is an MDR strain isolated from catfish in the southeastern United States. The present study was undertaken to uncover the mechanism of resistance in MDR A. veronii strain MS-17-88 through the detection of genomic features. To achieve this, genomic DNA was extracted, sequenced, and assembled. The A. veronii strain MS-17-88 genome comprised 5,178,226 bp with 58.6% G+C, and it encoded several AMR elements, including imiS, ampS, mcr-7.1, mcr-3, catB2, catB7, catB1, floR, vat(F), tet(34), tet(35), tet(E), dfrA3, and tetR. The phylogeny and resistance profile of a large collection of A. veronii strains, including MS-17-88, were evaluated. Phylogenetic analysis showed a close relationship between MS-17-88 and strain Ae5 isolated from fish in China and strain ARB3 isolated from pond water in Japan, indicating a common ancestor of these strains. Analysis of phage elements revealed 58 intact, 63 incomplete, and 15 questionable phage elements among the 53 A. veronii genomes. The average phage element number is 2.56 per genome, and strain MS-17-88 is one of two strains having the maximum number of identified prophage elements (6 elements each). The profile of resistance against various antibiotics across the 53 A. veronii genomes revealed the presence of tet(34), mcr-7.1, mcr-3, and dfrA3 in all genomes (100%). By comparison, sul1 and sul2 were detected in 7.5% and 1.8% of A. veronii genomes. Nearly 77% of strains carried tet(E), and 7.5% of strains carried floR. This result suggested a low abundance and prevalence of sulfonamide and florfenicol resistance genes compared with tetracycline resistance among A. veronii strains. Overall, the present study provides insights into the resistance patterns among 53 A. veronii genomes, which can inform therapeutic options for fish affected by A. veronii.
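The per-genome phage figure can be checked from the counts given in the abstract. The sketch below uses only those reported numbers; the variable names are illustrative, and reading the reported 2.56 as a truncated rather than rounded value is an assumption.

```python
# Counts reported in the abstract for the 53 A. veronii genomes.
intact, incomplete, questionable = 58, 63, 15
genomes = 53

total_elements = intact + incomplete + questionable
average = total_elements / genomes

# Truncating to two decimals reproduces the reported 2.56; conventional
# rounding would give a slightly higher value.
truncated = int(average * 100) / 100

print(total_elements, round(average, 3), truncated)
```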
Maize (Zea mays mays L.) is a staple crop of economic, industrial, and food security importance. Damage to the growing ears by corn earworm [Helicoverpa zea (Boddie)] is a major economic burden and increases secondary fungal infections and mycotoxin levels. To identify biochemical pathways associated with native resistance mechanisms, a genome-wide association analysis was performed, followed by pathway analysis using a gene-set enrichment-based approach. The gene-set enrichment exposed the cumulative effects of genes in pathways to identify those that contributed the most to resistance. Single nucleotide polymorphism-trait associations were linked to genes including transcription factors, protein kinases, hormone-responsive proteins, hydrolases, pectinases, xylogluconases, and the flavonol synthase gene (in the maysin biosynthesis pathway). The most significantly associated metabolic pathways identified included those that modified cell wall components, especially homogalacturonan, wax esters, and fatty acids; those involved in antibiosis, especially 2,4-dihydroxy-7-methoxy-1,4-benzoxazin-3-one (DIMBOA), flavonoids, and phenolics; and those involved in plant growth, including N uptake and energy production. The pathways identified in this study, and especially the cell wall-associated pathways, identified here for the first time, provide clues to resistance mechanisms that could guide the identification of new resistant ideotypes and candidate genes for creation of resistant maize germplasm via selection of natural variants or gene editing.