Six codification variants based on Euclidean space, the same space in which SOM processing operates, have been tested using two SOM models: the classical Kohonen SOM and growing cell structures. They have been applied to two different sets of sequences: 32 sequences of small sub-unit ribosomal RNA from organisms belonging to the three domains of life, and 44 sequences of the reverse transcriptase region of the pol gene of human immunodeficiency virus type 1 belonging to different groups and sub-types. Our results show that the most important factor affecting the accuracy of sequence clustering is the assignment of an extra weight to the presence of alignment-derived gaps. Although each of the codification variants shows a different level of taxonomic consistency, the results are in agreement with sequence-based phylogenetic reconstructions and anticipate a broad applicability of this codification method.
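To make the idea of a gap-weighted Euclidean codification concrete, the following is a minimal sketch rather than any of the six variants tested in the study. It assumes a four-letter nucleotide alphabet, a one-hot code per aligned position, and an extra gap axis scaled by a hypothetical `gap_weight` parameter; the names `encode` and `distance` are illustrative only.

```python
import numpy as np

ALPHABET = "ACGT"  # assumed alphabet; the abstract does not specify one


def encode(aligned_seq, gap_weight=2.0):
    """Turn an aligned sequence into a flat vector.

    Each position becomes a one-hot block over ALPHABET plus one extra
    axis for gaps. The gap axis is scaled by gap_weight (hypothetical
    parameter) so that gaps weigh more than substitutions.
    """
    width = len(ALPHABET) + 1
    vec = np.zeros((len(aligned_seq), width))
    for i, ch in enumerate(aligned_seq.upper()):
        if ch == "-":
            vec[i, -1] = gap_weight
        elif ch in ALPHABET:
            vec[i, ALPHABET.index(ch)] = 1.0
        # any other symbol (e.g. N) is left as an all-zero block
    return vec.ravel()


def distance(seq_a, seq_b, gap_weight=2.0):
    """Euclidean distance between two aligned sequences of equal length."""
    diff = encode(seq_a, gap_weight) - encode(seq_b, gap_weight)
    return float(np.linalg.norm(diff))
```

With `gap_weight` above 1, a residue-versus-gap mismatch contributes more to the distance than a substitution, which is one simple way to give alignment-derived gaps the extra weight the abstract describes.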
Populations of RNA viruses are composed of complex and dynamic mixtures of variant genomes that are termed mutant spectra or mutant clouds. This also applies to SARS-CoV-2, and mutations that are detected at low frequency in an infected individual can be dominant (represented in the consensus sequence) in subsequent variants of interest or variants of concern. Here we briefly review the main conclusions of our work on mutant spectrum characterization of hepatitis C virus (HCV) and SARS-CoV-2 at the nucleotide and amino acid levels and address the following two new questions derived from previous results: (i) what the composition of the SARS-CoV-2 mutant and deletion spectrum in diagnostic samples is when examined at progressively lower cut-off mutant frequency values in ultra-deep sequencing; and (ii) how the frequency distribution of minority amino acid substitutions in SARS-CoV-2 compares with that of HCV, also sampled from infected patients. The main conclusions are the following: (i) the number of different mutations found at low frequency in SARS-CoV-2 mutant spectra increases dramatically (50- to 100-fold) as the cut-off frequency for mutation detection is lowered from 0.5% to 0.1%, and (ii) in contrast to HCV, SARS-CoV-2 mutant spectra exhibit a deficit of intermediate frequency amino acid substitutions. The possible origin and implications of mutant spectrum differences among RNA viruses are discussed.
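The effect of lowering the detection cut-off can be illustrated with a short sketch. The mutation labels and frequencies below are invented toy values, not data from the study, and `count_mutations` is a hypothetical helper; frequencies are expressed as fractions, so 0.5% is 0.005.

```python
def count_mutations(freqs, cutoff):
    """Count distinct mutations whose frequency is at or above the cut-off."""
    return sum(1 for f in freqs.values() if f >= cutoff)


# Toy mutant spectrum (hypothetical labels and frequencies).
toy_spectrum = {
    "mut_a": 0.98,    # dominant, present in the consensus sequence
    "mut_b": 0.40,
    "mut_c": 0.004,
    "mut_d": 0.002,
    "mut_e": 0.0012,
    "mut_f": 0.0008,  # below both cut-offs
}

at_half_percent = count_mutations(toy_spectrum, 0.005)
at_tenth_percent = count_mutations(toy_spectrum, 0.001)
```

In this toy spectrum, more mutations pass the 0.1% cut-off than the 0.5% cut-off; the size of that increase in real samples is what the abstract reports.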
Human Immunodeficiency Virus type 1 (HIV-1), because of its high mutation rates, large population sizes, and rapid replication, exhibits complex evolutionary strategies. For the analysis of evolutionary processes, the graphical representation of fitness landscapes provides a significant advantage. The experimental determination of viral fitness remains, in general, difficult, and consequently most published fitness landscapes have been artificial, theoretical or estimated. Self-Organizing Maps (SOM) are a class of Artificial Neural Network (ANN) for the generation of topologically ordered maps. Here, three-dimensional (3D) data-driven fitness landscapes, derived from a collection of sequences from HIV-1 viruses after “in vitro” passages and labelled with the corresponding experimental fitness values, were created by SOM. These maps were used for the visualization and study of the evolutionary process of HIV-1 “in vitro” fitness recovery, by directly relating fitness values to viral sequences. In addition to representing the sequence space search carried out by the viruses, these landscapes could also be applied to the analysis of related variants such as members of viral quasispecies. SOM maps permit the visualization of the complex evolutionary pathways in HIV-1 fitness recovery. SOM fitness landscapes have enormous potential for the study of evolution in related viruses, whether from “in vitro” work or from “in vivo” clinical studies of human, animal or plant viral infections.
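A fitness landscape of this kind can be sketched in a few lines of NumPy. This is an illustrative toy under stated assumptions, not the authors' pipeline: it assumes sequences are already encoded as numeric vectors, uses a small rectangular grid with a Gaussian neighbourhood and linearly decaying learning rate, and takes the mean fitness of the samples mapped to each node as the landscape height. All function names and parameters are hypothetical.

```python
import numpy as np


def best_matching_unit(weights, x):
    """Return the (row, col) of the node whose weight vector is closest to x."""
    dists = np.linalg.norm(weights - x, axis=-1)
    return np.unravel_index(np.argmin(dists), dists.shape)


def train_som(data, rows=4, cols=4, epochs=50, lr0=0.5, sigma0=1.5, seed=0):
    """Train a basic Kohonen SOM on data of shape (n_samples, n_features)."""
    rng = np.random.default_rng(seed)
    weights = rng.random((rows, cols, data.shape[1]))
    # Grid coordinates of every node, shape (rows, cols, 2).
    grid = np.stack(
        np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1
    )
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in rng.permutation(data):
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                   # decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 1e-3      # shrinking neighbourhood
            bmu = np.array(best_matching_unit(weights, x))
            d2 = np.sum((grid - bmu) ** 2, axis=-1)   # squared grid distance
            h = np.exp(-d2 / (2.0 * sigma ** 2))      # Gaussian neighbourhood
            weights += lr * h[..., None] * (x - weights)
            step += 1
    return weights


def fitness_landscape(weights, data, fitness):
    """Mean fitness of the samples mapped to each node; NaN for empty nodes."""
    rows, cols, _ = weights.shape
    total = np.zeros((rows, cols))
    count = np.zeros((rows, cols))
    for x, f in zip(data, fitness):
        r, c = best_matching_unit(weights, x)
        total[r, c] += f
        count[r, c] += 1
    return np.where(count > 0, total / np.maximum(count, 1), np.nan)
```

The array returned by `fitness_landscape` can be drawn as a 3D surface over the map grid, with node position as the horizontal plane and mean fitness as height.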
The study provides for the first time the haplotype profile and its variation in the course of its adaptation to a cell culture environment in the absence of external selective constraints. The deep sequencing-based self-organized maps document a two-layer haplotype distribution with an ample basal platform and a lower number of protruding peaks.
The new era of information and the needs of our society require continuous change in software and technology. Changes occur very quickly and software systems must evolve at the same pace, which implies that the decision-making process of software architectures should be (semi-)automated to satisfy changing needs and to avoid wrong decisions. This issue is critical since suboptimal architecture design decisions may lead to high cost and poor software quality. Therefore, systematic and (semi-)automated mechanisms that help software architects during the decision-making process are required. Architectural patterns are one of the most important features of software applications, but the same pattern can be implemented in different ways, leading to results of different quality. When an application needs to evolve, knowledge extracted from similar applications is useful for driving decisions, since quality pattern implementations can be reproduced in similar applications to improve specific quality attributes. Therefore, clustering methods are especially suitable for classifying similar pattern implementations. In this paper, we apply a novel unsupervised clustering technique, based on the well-known artificial neural network model Self-Organizing Maps, to classify implementations of the Model-View-Controller (MVC) pattern from a quality point of view. Software quality is analyzed by 24 metrics organized into the categories of Count/Size, Maintainability, Duplications, Complexity, and Design Quality. The main goal of this work is twofold: to identify the quality features that establish the similarity of MVC applications without software architect bias, and to classify MVC applications by means of Self-Organizing Maps based on quality metrics. To that end, this work performs an exploratory study by conducting two analyses with a dataset of 87 Java MVC applications characterized by the 24 metrics and two attributes that describe the technology dimension of the application.
The stated findings provide a knowledge base that can help in the decision-making process for the architecture of Java MVC applications.