Repetitive elements are adjacent, repeating sequence patterns, also called motifs, which can be of different lengths and can occur as exact or approximate copies. They have been widely used as molecular markers in population biology. Given the sizes of sequenced genomes, various bioinformatics tools have been developed for the extraction of repetitive elements from DNA sequences. However, currently available tools do not combine the identification of repetitive elements in both genomes and proteomes, a user-friendly web interface, and exhaustive searches. ProGeRF is a web site for extracting repetitive regions from genome and proteome sequences. It was designed to be an efficient, fast, accurate and, above all, user-friendly web tool that offers many ways to view and analyse the results. ProGeRF (Proteome and Genome Repeat Finder) is freely available both as a web tool and as a stand-alone program whose source code users can download. It was developed using a hash table approach to extract perfect and imperfect repetitive regions from a (multi)FASTA file while maintaining linear time complexity.
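To illustrate the general idea of a hash-table-based repeat search, the following is a minimal sketch in Python. It is not ProGeRF's source code: the function name `find_perfect_tandem_repeats`, its parameters, and the restriction to perfect (exact) repeats of a single fixed motif length are assumptions made for this example. The abstract does not describe how imperfect repeats are handled, so that part is omitted.

```python
from collections import defaultdict


def find_perfect_tandem_repeats(seq, motif_len, min_copies=2):
    """Return sorted (start, motif, copies) tuples for runs of adjacent,
    exact copies of a motif of length `motif_len`.

    Hypothetical helper for illustration only; not taken from ProGeRF.
    """
    # Hash table: motif (k-mer) -> list of start positions in the sequence.
    positions = defaultdict(list)
    for i in range(len(seq) - motif_len + 1):
        positions[seq[i:i + motif_len]].append(i)

    repeats = []
    for motif, starts in positions.items():
        start_set = set(starts)
        for s in starts:
            # Only begin counting at the head of a run, so every
            # occurrence is visited a bounded number of times.
            if s - motif_len in start_set:
                continue
            copies = 1
            while s + copies * motif_len in start_set:
                copies += 1
            if copies >= min_copies:
                repeats.append((s, motif, copies))
    return sorted(repeats)


if __name__ == "__main__":
    for start, motif, copies in find_perfect_tandem_repeats("GGACACACTT", 2):
        print(start, motif, copies)
```

Because each run is counted only from its first copy, the work stays roughly proportional to the sequence length for a fixed motif length. Note that this sketch also reports the same repeat in a shifted phase (for example `CA` inside `ACACAC`); a real tool would typically merge such overlapping hits.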
Bioinformatics and other well-established sciences, such as molecular biology, genetics, and biochemistry, provide a scientific approach for the analysis of data generated through “omics” projects that may be used in studies of chronobiology. The results of studies that apply these techniques demonstrate how significantly they have aided the understanding of chronobiology. However, bioinformatics tools alone cannot eliminate the need for an understanding of the field of research or of the data to be considered, nor can such tools replace analysts and researchers. It is often necessary to evaluate the results of a data mining effort to determine their degree of reliability, and this requires familiarity with the field of investigation. It is evident that the knowledge accumulated through chronobiology, together with the use of tools derived from bioinformatics, has contributed to the recognition and understanding of the patterns and biological rhythms found in living organisms. The current work aims to support the development of new and important applications in chronobiology research in the near future.
Begonia jaguarensis L. Kollmann, R. S. Lopes & Peixoto (Begoniaceae), a new species from the north of Espírito Santo state, Brazil, is described and illustrated. This new species is related to Begonia thelmae L. B. Sm. & Wassh., with which it is compared. A diagnosis, description, conservation status, pictures, a map, and comments on geographic distribution are also provided.
The public health system is extremely dependent on the use of vaccines to immunize the population against a series of dangerous infectious diseases, preventing the system from collapsing and millions of people from dying every year. However, to develop these vaccines and effectively monitor these diseases, it is necessary to use accurate diagnostic methods capable of identifying highly immunogenic regions within a given pathogenic protein. Existing experimental methods are expensive and time-consuming, and they require the screening of a large number of potential candidate epitopes, which makes them extremely laborious, especially when applied to larger microorganisms. In recent decades, researchers have developed in silico prediction methods based on machine learning to identify these markers, drastically reduce the list of potential candidate epitopes for experimental tests and, consequently, reduce the laborious task associated with their mapping. Despite these efforts, the available tools and methods still suffer from low accuracy, slow diagnosis, and reliance on offline training. Thus, we developed a method to predict linear B-cell epitopes that is based on a Fuzzy-ARTMAP neural network architecture, called BepFAMN (B Epitope Prediction Fuzzy ARTMAP Artificial Neural Network). It was trained using a linear averaging scheme on 15 properties that include an amino acid ratio scale and a set of 14 physicochemical scales. The database used was obtained from the IEDB website, from which the amino acid sequences with the annotations of their positive and negative epitopes were taken. To train and validate the knowledge models, five-fold cross-validation and competition techniques were used. The BepiPred-2.0 database, an independent database, was used for the tests. In our experiment, the validation dataset reached sensitivity = 91.50%, specificity = 91.49%, accuracy = 91.49%, MCC = 0.83, and an area under the ROC curve (AUC) of approximately 0.9289.
On the testing dataset, BepFAMN achieves a significant improvement, with sensitivity = 81.87%, specificity = 74.75%, accuracy = 78.27%, MCC = 0.56, and AUC = 0.7831. These values demonstrate that BepFAMN outperforms all other linear B-cell epitope prediction tools currently in use. In addition, the architecture provides mechanisms for online training, which allow the user to add a newly found linear B-cell epitope and improve the model without the need to retrain it on the whole dataset. This contributes to a considerable reduction in the number of potential linear epitopes to be experimentally validated, reducing laboratory time and accelerating the development of diagnostic tests, vaccines, and immunotherapeutic approaches.
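The averaging of per-residue property scales mentioned in this abstract can be pictured with a small sketch. The abstract does not give the window size, the weighting, or the identity of the 14 physicochemical scales, so everything below is an assumption for illustration: the function name `window_average`, the window of 7 residues, the shrinking window at the sequence ends, and the use of the Kyte-Doolittle hydropathy scale as an example property.

```python
# Kyte-Doolittle hydropathy values, used here only as an example scale.
# Whether this scale is among the 14 used by BepFAMN is not stated.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def window_average(sequence, scale, window=7):
    """Return one smoothed value per residue: the unweighted mean of the
    scale over a window centred on that residue.

    Hypothetical helper; the window is truncated at the sequence ends.
    """
    half = window // 2
    values = [scale[aa] for aa in sequence]
    profile = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        profile.append(sum(values[lo:hi]) / (hi - lo))
    return profile


if __name__ == "__main__":
    # Arbitrary example peptide, not taken from the IEDB data.
    for aa, score in zip("MKTAYIAKQR", window_average("MKTAYIAKQR", KYTE_DOOLITTLE)):
        print(aa, round(score, 2))
```

Under these assumptions, repeating the calculation for each of the 15 properties would give a 15-value feature vector per residue, which is the kind of input a classifier such as a Fuzzy-ARTMAP network could be trained on.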