A canine coronavirus (CCoV) has now been reported from two independent human samples from Malaysia (respiratory, collected in 2017–2018; CCoV-HuPn-2018) and Haiti (urine, collected in 2017); these two viruses were nearly genetically identical. In an effort to identify any novel adaptations associated with this apparent shift in tropism, we carried out detailed evolutionary analyses of the spike gene of this virus in the context of related Alphacoronavirus 1 species. The spike 0-domain retains homology to CCoV2b (enteric infections) and Transmissible Gastroenteritis Virus (TGEV; enteric and respiratory). This domain is subject to relaxed selection pressure and an increased rate of molecular evolution. It contains unique amino acid substitutions, including within a region important for sialic acid binding and pathogenesis in TGEV. Overall, the spike gene is extensively recombinant, with a feline coronavirus type II strain playing a prominent role in the recombinant history of the virus. The molecular divergence time for a segment of the gene where temporal signal could be determined was estimated at around 60 years ago. We hypothesize that the virus had an enteric origin, but that it may be losing that particular tropism, possibly because of mutations in the sialic acid binding region of the spike 0-domain.
Feline Coronaviruses (FCoVs) commonly cause mild enteric infections in felines worldwide (termed Feline Enteric Coronavirus [FECV]), with around 12% developing into deadly Feline Infectious Peritonitis (FIP; Feline Infectious Peritonitis Virus [FIPV]). Genomic differences between FECV and FIPV have been reported, yet the putative genotypic basis of the highly pathogenic phenotype remains unclear. Here, we used state-of-the-art molecular evolutionary genetic statistical techniques to identify and compare differences in natural selection pressure between FECV and FIPV sequences, as well as to identify FIPV- and FECV-specific signals of positive selection. We analyzed full-length FCoV protein-coding genes thought to contain mutations associated with FIPV (Spike, ORF3abc, and ORF7ab). We identified two sites exhibiting differences in natural selection pressure between FECV and FIPV: one within the S1/S2 furin cleavage site, and the other within the fusion domain of Spike. We also found 15 sites subject to positive selection associated with FIPV within Spike, 11 of which have not previously been suggested as possibly relevant to FIP development. These sites fall within Spike protein subdomains that participate in host cell receptor interaction, immune evasion, tropism shifts, host cellular entry, and viral escape. There were 14 sites (12 novel) within Spike under positive selection associated with the FECV phenotype, almost exclusively within the S1/S2 furin cleavage site and adjacent C domain, along with a signal of relaxed selection in FIPV relative to FECV, suggesting that furin cleavage functionality may not be needed for FIPV. Positive selection inferred in ORF7b was associated with the FECV phenotype, and included 24 positively selected sites, while ORF7b had signals of relaxed selection in FIPV. We found evidence of positive selection in ORF3c in FCoV-wide analyses, but no specific association with the FIPV or FECV phenotype.
We hypothesize that some combination of mutations in FECV may contribute to FIP development, and that there is unlikely to be one singular “switch” mutational event. This work expands our understanding of the complexities of FIP development and provides insights into how evolutionary forces may alter pathogenesis in coronavirus genomes.
Recombination contributes to the genetic diversity found in coronaviruses and is known to be a prominent mechanism whereby they evolve. It is apparent, both from controlled experiments and in genome sequences sampled from nature, that patterns of recombination in coronaviruses are non-random and that this is likely attributable to a combination of sequence features that favour the occurrence of recombination breakpoints at specific genomic sites, and selection disfavouring the survival of recombinants within which favourable intra-genome interactions have been disrupted. Here we leverage available whole-genome sequence data for six coronavirus subgenera to identify specific patterns of recombination that are conserved between multiple subgenera and then identify the likely factors that underlie these conserved patterns. Specifically, we confirm the non-randomness of recombination breakpoints across all six tested coronavirus subgenera, locate conserved recombination hot- and cold-spots, and determine that the locations of transcriptional regulatory sequences are likely major determinants of conserved recombination breakpoint hot-spot locations. We find that while the locations of recombination breakpoints are not uniformly associated with degrees of nucleotide sequence conservation, they display significant tendencies in multiple coronavirus subgenera to occur in low guanine-cytosine content genome regions, in non-coding regions, at the edges of genes, and at sites within the Spike gene that are predicted to be minimally disruptive of Spike protein folding. While it is apparent that sequence features such as transcriptional regulatory sequences are likely major determinants of where the template-switching events that yield recombination breakpoints most commonly occur, it is evident that selection against misfolded recombinant proteins also strongly impacts observable recombination breakpoint distributions in coronavirus genomes sampled from nature.
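The notion of non-random breakpoint placement described above can be illustrated with a simple permutation test. The following is only a minimal sketch, not the method used in the study: the function names, the fixed non-overlapping window scheme, and the uniform null model are all assumptions made for illustration. It asks whether the densest window of observed breakpoints is denser than expected if the same number of breakpoints were scattered uniformly along the genome.

```python
import random


def max_window_count(positions, genome_length, window):
    """Largest number of breakpoints falling in any non-overlapping window."""
    n_windows = (genome_length + window - 1) // window
    counts = [0] * n_windows
    for p in positions:
        counts[p // window] += 1
    return max(counts)


def clustering_p_value(positions, genome_length, window=1000,
                       n_perm=2000, seed=0):
    """Permutation p-value for breakpoint clustering (hypothetical helper).

    Null model (an assumption): breakpoints are placed independently and
    uniformly along the genome. The test statistic is the breakpoint count
    in the densest window.
    """
    rng = random.Random(seed)
    observed = max_window_count(positions, genome_length, window)
    at_least_as_extreme = 0
    for _ in range(n_perm):
        shuffled = [rng.randrange(genome_length) for _ in positions]
        if max_window_count(shuffled, genome_length, window) >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimate away from an impossible p = 0.
    return (at_least_as_extreme + 1) / (n_perm + 1)


# Illustrative, made-up breakpoint coordinates on a 30 kb genome.
clustered = [5000 + 40 * i for i in range(20)]   # all inside one 1 kb window
spread = [1000 * i + 500 for i in range(20)]     # one breakpoint per window
p_clustered = clustering_p_value(clustered, 30000)
p_spread = clustering_p_value(spread, 30000)
```

A small p-value for the clustered coordinates and a large one for the evenly spread coordinates is the expected qualitative behaviour. A realistic analysis would also need to account for uncertainty in breakpoint positions and for variation in sequence diversity along the genome, which this sketch ignores.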
Inference and interpretation of evolutionary processes, in particular of the types and targets of natural selection affecting coding sequences, are critically influenced by the assumptions built into statistical models for such analyses. If certain aspects of the substitution process (even when they are not of direct interest) are presumed absent or are modeled with too crude a simplification, estimates of key model parameters can become biased, often systematically, and lead to poor statistical performance. Here, we performed a detailed characterization of how modeling instantaneous multi-nucleotide (or multi-hit, MH) substitutions impacts dN/dS-based inference of episodic diversifying selection at the level of the entire alignment. The inclusion of MH reduces the rate at which positive selection is called (1.37-fold, or 26.8%), based on the analysis of N = 9,861 empirical data-sets, while offering significantly better statistical fit to sequence data in 8.37% of cases. Through additional simulation studies, we show that this reduction is not simply due to loss of power because of additional model complexity. After a detailed examination of 21 benchmark alignments and a new high-resolution analysis showing which parts of the alignment provide support for positive selection, we reveal that MH substitutions occurring along shorter branches in the tree are largely responsible for discrepant results in selection detection. Our results add to the growing body of literature that examines decades-old modeling assumptions and finds them to be problematic for biological data analysis. Because multi-nucleotide substitutions have a significant impact on natural selection detection even at the level of an entire gene, we recommend that routine selection analysis of this type consider their inclusion.
To facilitate this procedure, we developed a simple model-testing framework for selection detection that can screen an alignment for positive selection in the presence of two biologically important confounding processes: synonymous rate variation and multi-nucleotide instantaneous substitutions.
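The kind of nested-model comparison that underlies statements such as "significantly better statistical fit" can be sketched as a likelihood ratio test. This is an illustrative sketch only, not the framework described above: the function names, the log-likelihood values, and the choice of two extra parameters for the richer model are all hypothetical. It also assumes the standard chi-squared asymptotic approximation, which may not hold when parameters lie on a boundary.

```python
import math


def chi2_sf(stat, df):
    """Upper-tail chi-squared probability (closed forms for df = 1 or 2)."""
    if stat <= 0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("this sketch only supports df = 1 or df = 2")


def likelihood_ratio_test(lnl_null, lnl_alt, df):
    """Return (test statistic, p-value) for two nested models.

    lnl_null: maximized log-likelihood of the simpler model
    lnl_alt:  maximized log-likelihood of the richer model
    df:       number of extra free parameters in the richer model
    """
    stat = max(0.0, 2.0 * (lnl_alt - lnl_null))
    return stat, chi2_sf(stat, df)


# Made-up log-likelihoods for one alignment: a model without multi-hit
# substitutions versus one with two additional rate parameters (assumed).
stat, p_value = likelihood_ratio_test(-1000.0, -997.0, df=2)
prefers_richer_model = p_value < 0.05
```

In practice the log-likelihoods would come from fitting both models to the same alignment and tree, and a screening procedure would repeat the comparison for each confounding process of interest.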