The incidence, distribution, and variation of simple sequence repeats (SSRs) in viruses are instrumental in understanding the functional and evolutionary aspects of repeat sequences. Full-length genome sequences retrieved from NCBI were used for extraction and analysis of repeat sequences using IMEx software. We have also developed two MATLAB-based tools for extracting gene locations from GenBank in tabular format and simulating these data with SSR incidence data. The present study, encompassing 147 Mycobacteriophage genomes, revealed 25,284 SSRs and 1,127 compound SSRs (cSSRs) through IMEx. Mono- to hexa-nucleotide motifs were present. The SSR count per genome ranged from 78 (M100) to 342 (M58), while cSSR incidence ranged from 1 (M138) to 17 (M28, M73). Though cSSRs were present in all the genomes, their frequency varied, with the SSR-to-cSSR conversion percentage ranging from 1.08 (M138 with 93 SSRs) to 8.33 (M116 with 96 SSRs). The SSRs were predominantly localized to coding regions (∼78%). Interestingly, genomes of around 50 kb contained a similar number of SSRs/cSSRs to a 110 kb genome, suggesting functional relevance for SSRs, which was substantiated by variation in motif constitution between species with different host ranges. The three species with a broad host range (M97, M100, M116) have around 90% of their mono-nucleotide repeat motifs composed of G or C, and only M16 has both A and T mono-nucleotide motifs. Around 20% of the di-nucleotide repeat motifs in the genomes exhibiting a broad host range were CT/TC, which were either absent or represented to a much lesser extent in the other genomes.
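The conversion percentages quoted above are consistent with dividing the cSSR count by the SSR count for each genome. A minimal Python sketch of that arithmetic follows; the function name `cssr_percent` is hypothetical, and the cSSR count of 8 for M116 is inferred from the reported 8.33% of 96 SSRs rather than stated in the abstract.

```python
def cssr_percent(cssr_count, ssr_count):
    """Return the SSR-to-cSSR conversion percentage for one genome.

    Assumes the percentage is (number of cSSRs / number of SSRs) * 100,
    which reproduces the M138 figure reported in the abstract.
    """
    if ssr_count <= 0:
        raise ValueError("ssr_count must be positive")
    return 100.0 * cssr_count / ssr_count


# M138: 1 cSSR among 93 SSRs (both values from the abstract).
print(round(cssr_percent(1, 93), 2))

# M116: 96 SSRs from the abstract; 8 cSSRs is an inferred value.
print(round(cssr_percent(8, 96), 2))
```

Under these assumptions the two calls give 1.08 and 8.33, matching the reported range.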
Compiling simple sequence repeats (SSRs) in viruses and analyzing their incidence, distribution, and variation would be instrumental in understanding the functional and evolutionary aspects of repeat sequences. The present study encompasses the analysis of SSRs across 30 species of alphaviruses. The full-length genome sequences, accessed from NCBI, were used for extraction and analysis of repeat sequences using IMEx software. The repeats of different motif sizes (mono- to penta-nucleotide) observed therein exhibited variable incidence across the species. As expected, mononucleotide A/T was the most prevalent, followed by dinucleotide AG/GA and trinucleotide AAG/GAA, in these genomes. The conversion of SSRs to imperfect microsatellites or compound microsatellites (cSSRs) is low. cSSRs, primarily constituted by variant motifs, accounted for up to 12.5% of the SSRs. Interestingly, seven species lacked cSSRs in their genomes. The SSRs and cSSRs are predominantly localized to the coding-region ORFs for non-structural and structural proteins. The relative frequencies of different classes of simple and compound microsatellites within and across genomes have been highlighted.
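To illustrate what extracting perfect SSRs of different motif sizes involves, here is a minimal regex-based Python sketch. It is not IMEx and does not reproduce its algorithm or parameters; the function name `find_ssrs` and the minimum repeat counts in `MIN_REPEATS` are assumptions chosen only for illustration.

```python
import re

# Hypothetical minimum repeat counts per motif length (mono- to penta-nucleotide).
# These thresholds are illustrative and are not taken from the abstract.
MIN_REPEATS = {1: 6, 2: 3, 3: 3, 4: 3, 5: 3}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return perfect SSRs as (start, end, motif, repeat_count) tuples.

    Coordinates are 0-based and end-exclusive. This simplified scan uses
    non-overlapping regex matches, so it can miss a repeat that begins
    inside a run already consumed by a skipped match.
    """
    seq = seq.upper()
    hits = []
    for size, n in sorted(min_repeats.items()):
        # A motif of `size` bases followed by at least n - 1 further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ATAT").
            if any(motif == motif[:k] * (size // k)
                   for k in range(1, size) if size % k == 0):
                continue
            hits.append((m.start(), m.end(), motif, (m.end() - m.start()) // size))
    return sorted(hits)


# Toy sequence: a run of seven A's and three copies of AG.
print(find_ssrs("GGAAAAAAACCAGAGAGTT"))
```

On the toy sequence this reports one mononucleotide repeat (A, 7 copies) and one dinucleotide repeat (AG, 3 copies). Tallying the returned motifs by length gives the kind of per-class incidence the abstract describes.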
Background: Microsatellites have evoked the interest of researchers owing to their applications in different fields such as DNA fingerprinting, genetic mapping, population genetics, forensics, paternity studies and evolution. Objectives: The present study focused on the analysis of simple sequence repeats (SSRs) in genomes of seven species from three genera of the Filoviridae family. Materials and Methods: Genome sequences of seven species from the Filoviridae family were accessed from the National Center for Biotechnology Information (NCBI), microsatellites were extracted using the IMEx software, and statistical analysis was performed using Microsoft Office Excel 2007. Results: A total of 516 microsatellites and 14 compound simple sequence repeats (cSSRs, also known as compound microsatellites) were extracted. Evidently, the conversion of SSRs to cSSRs was low. Mononucleotide A/T was the most prevalent, followed by dinucleotide AC/CA and trinucleotide AAC/CAA. The highest incidence of SSRs with mono-/di-nucleotide motifs was observed in the RNA-dependent RNA polymerase (RDRP) gene, whereas tri-nucleotide motifs were maximally localized in the nucleoprotein (NP) gene.
Conclusions: The salient features of simple and compound microsatellites in the Filoviridae family have been highlighted herein. Microsatellite regions, with higher mutation rates than the rest of the genome, play a crucial role in genome evolution by acting as a source of quantitative genetic variation. The SSR mutation rate is known to be affected by motif length, motif sequence, number of repeats, and purity of repetition. The functional role of tandem repeats in viruses remains to be fully elucidated. However, with the repetitive sequences reportedly acting as hot spots for recombination, we postulate their involvement in genetic events such as recombination, replication, and repair mechanisms that drive sequence diversity and so form the genetic basis of adaptation.
Most viral diseases of plants are caused by RNA viruses, which drastically reduce crop yield. In order to generate resistance against RNA viruses infecting plants, we isolated the dicer 1 protein (CaDcr1), a member of the RNase III family (enzymes that cleave double-stranded RNA), from the opportunistic fungus Candida albicans. In vitro analysis revealed that CaDcr1 cleaved dsRNA of the coat protein gene of cucumber mosaic virus (genus Cucumovirus, family Bromoviridae). Furthermore, we developed transgenic tobacco plants (Nicotiana tabacum cv. Xanthi) over-expressing CaDcr1 by Agrobacterium-mediated transformation. Transgenic tobacco lines were able to suppress infection by an Indian isolate of potato virus X (genus Potexvirus, family Alphaflexiviridae). The present study demonstrates that CaDcr1 can cleave double-stranded replicative intermediates and provide plants with tolerance against RNA viruses.