Transposable elements (TEs) are repetitive nucleotide sequences that make up a large portion of eukaryotic genomes. They can move and duplicate within a genome, increasing genome size and contributing to genetic diversity within and across species. Accurate identification and classification of the TEs present in a genome is an important step towards understanding their effects on genes and their role in genome evolution. We introduce TE-Learner, a machine learning framework that automatically identifies TEs in a given genome and assigns a classification to them. We present an implementation of our framework for LTR retrotransposons, a type of TE characterized by long terminal repeats (LTRs) at their boundaries. We evaluate the predictive performance of our framework on the well-annotated genomes of Drosophila melanogaster and Arabidopsis thaliana, and we compare our results for three LTR retrotransposon superfamilies with the results of three widely used methods for TE identification or classification: RepeatMasker, Censor and LtrDigest. In contrast to these methods, TE-Learner is the first to incorporate machine learning techniques, and it outperforms them in terms of predictive performance while learning models and making predictions efficiently. Moreover, we show that our method was able to identify TEs that none of the above methods could find, and we investigate the TE-Learner predictions that did not correspond to an official annotation. It turns out that many of these predictions are in fact strongly homologous to a known TE.
The main goal of our research was to search for SSRs in the Eucalyptus EST FORESTs database using SSR-motif mining software. With this objective, we created a database for cataloging Eucalyptus EST-derived SSRs and developed a bioinformatics tool, named Satellyptus, for finding and analyzing microsatellites in the Eucalyptus EST database. The search for microsatellites in the FORESTs database, which contains 71,115 Eucalyptus EST sequences (52.09 Mb), revealed 20,530 SSRs in 15,621 ESTs. The SSR abundance detected in the Eucalyptus EST database (29%, or roughly one microsatellite every four sequences) is considered very high for plants. Among the categories of SSR motifs, the dimeric (37%) and trimeric (33%) ones predominated. The AG/CT motif was the most frequent (35.15%), followed by the trimeric CCG/CGG (12.81%). From a random sample of 1,217 sequences, 343 microsatellites in 265 SSR-containing sequences were identified. Approximately 48% of these ESTs containing microsatellites were homologous to proteins with known biological function. Most of the microsatellites detected in Eucalyptus ESTs were positioned at either the 5’ or 3’ end. Our next priority is the design of flanking primers for codominant SSR loci, which could lead to the development of a set of microsatellite-based markers suitable for marker-assisted Eucalyptus breeding programs.
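As an illustration of what mining SSR motifs in EST sequences involves, the following is a minimal Python sketch that finds perfect tandem repeats with a regular expression. It is not the Satellyptus implementation, which the abstract does not describe; the function name `find_ssrs`, the minimum repeat thresholds, and the example sequence are all assumptions made for illustration.

```python
import re

# Hypothetical thresholds (not from the abstract): minimum number of
# tandem copies required for each motif length.
MIN_REPEATS = {2: 6, 3: 5, 4: 4}

def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for each perfect SSR in seq."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # A motif of motif_len bases followed by at least (min_rep - 1) copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are just a shorter unit repeated (e.g. "AAA", "AGAG").
            if any(motif == motif[:k] * (motif_len // k)
                   for k in range(1, motif_len) if motif_len % k == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // motif_len))
    return hits

# Made-up example sequence containing one dimeric and one trimeric repeat.
example = "GATC" + "AG" * 7 + "TTCA" + "CCG" * 5 + "ATGC"
print(find_ssrs(example))  # [(4, 'AG', 7), (22, 'CCG', 5)]
```

A production tool would also need to handle imperfect and compound repeats and to merge complementary motifs (such as AG and CT) into a single class, none of which this sketch attempts.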
The Citrus ESTs Sequencing Project (CitEST), conducted at Centro APTA Citros Sylvio Moreira/IAC, has identified and catalogued ESTs representing a set of citrus genes expressed under relevant stress responses, including diseases such as citrus variegated chlorosis (CVC), caused by Xylella fastidiosa. All sweet orange (Citrus sinensis L. Osb.) varieties are susceptible to X. fastidiosa. Mandarins (C. reticulata Blanco), on the other hand, are considered tolerant or resistant to the disease: the bacterium can be sporadically detected within the trees, but no disease symptoms or economic losses are observed. To study their genetic responses to the presence of X. fastidiosa, we compared EST libraries of leaf tissue of sweet orange Pêra IAC (a cultivar highly susceptible to X. fastidiosa) and mandarin 'Ponkan' (tolerant), both artificially infected with the bacterium. Using an in silico differential display, 172 genes were found to be significantly differentially expressed under these conditions. Sweet orange showed an increase in the expression of photosynthesis-related genes, which could reveal a strategy to counterbalance a possible reduction in photosynthetic activity resulting from early effects of bacterial colonization in affected plants. Mandarin, in contrast, showed an active multi-component defense response against the bacterium, similar to the non-host resistance pattern.
Research question: The paper investigates the influence of football clubs' investments on their performance in the first division of the Brazilian league. Motivation: Football is the most important sport in Brazil and a cultural symbol of the country, so it deserves to be better understood. It is therefore necessary to investigate the financial amounts involved in management actions in order to understand the influence of the investments clubs make in marketing and in improving their championship results and, thus, to delineate the relation between investment and success in football. Idea: The central hypothesis of this study is that the clubs with the highest expenditures on football will also achieve the best classifications, since these expenses are related to better training conditions and salaries, which may contribute to the recruitment of the best athletes. Data: The study used data collected over the internet and provided by the clubs. Only clubs belonging to the first division were included, a total of 19 clubs divided into 3 groups according to the value of their investment in football. Tools: The study presents descriptive and inferential analyses, adopting a qualitative-quantitative approach to understanding the data. Given the number of clubs participating in the study, we chose non-parametric inferential analyses for the intra- and inter-group evaluations, using an alpha value of 0.05 as the criterion. Findings: The results showed that clubs in groups G1 and G2 had similar expenditure dynamics, while G3 clubs showed a slight fluctuation. In addition, clubs in groups G1 and G2 had the largest expenditures, while G3 clubs had the lowest expenditures during the period analyzed. The clubs in groups G1 and G2 also achieved the best positions in the championships, corroborating the research hypothesis.
Contribution: One can conclude that the clubs' investment directly influences their classification in the championships, and that substantial investment is necessary to win the championship. However, this study has some limitations, such as its sample size (only 19 teams). We therefore emphasize the need for further studies.