Adoption of messaging communication and voice assistants has grown rapidly in recent years. This creates a demand for tools that speed up prototyping of feature-rich dialogue systems. The open-source library DeepPavlov is tailored for the development of conversational agents. The library prioritises efficiency, modularity, and extensibility, with the goal of making it easier to develop dialogue systems from scratch and with limited data available. It supports both modular and end-to-end approaches to implementing conversational agents. A conversational agent consists of skills, and every skill can be decomposed into components. Components are usually models that solve typical NLP tasks such as intent classification and named entity recognition, or resources such as pre-trained word vectors. A sequence-to-sequence chitchat skill, a question answering skill, or a task-oriented skill can be assembled from components provided in the library.
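The agent/skill/component decomposition described above can be illustrated with a minimal sketch in plain Python. This is not DeepPavlov's API: the names `Skill`, `Agent`, `tokenize`, and `classify_intent` are hypothetical, and the keyword-based intent classifier is only a stand-in for a trained model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Assumption: a component is any callable that reads a shared state dict
# and returns an extended copy of it. This is an illustrative convention,
# not the interface used by DeepPavlov.
State = Dict[str, object]
Component = Callable[[State], State]


def tokenize(state: State) -> State:
    """Hypothetical component: lowercase whitespace tokenizer."""
    return {**state, "tokens": str(state["utterance"]).lower().split()}


def classify_intent(state: State) -> State:
    """Hypothetical component: toy keyword rule standing in for a model."""
    tokens = state["tokens"]
    intent = "greet" if ("hello" in tokens or "hi" in tokens) else "unknown"
    return {**state, "intent": intent}


@dataclass
class Skill:
    """A skill runs its components in order, then tries to produce a reply."""
    name: str
    components: List[Component]
    respond: Callable[[State], Optional[str]]

    def __call__(self, utterance: str) -> Optional[str]:
        state: State = {"utterance": utterance}
        for component in self.components:
            state = component(state)
        return self.respond(state)


@dataclass
class Agent:
    """An agent asks each skill in turn and returns the first non-empty reply."""
    skills: List[Skill]
    fallback: str = "Sorry, I did not understand."

    def reply(self, utterance: str) -> str:
        for skill in self.skills:
            answer = skill(utterance)
            if answer is not None:
                return answer
        return self.fallback


greeting_skill = Skill(
    name="greeting",
    components=[tokenize, classify_intent],
    respond=lambda s: "Hello!" if s["intent"] == "greet" else None,
)
agent = Agent(skills=[greeting_skill])
```

In this sketch a new skill is added by listing further components and a `respond` function; the first-match selection in `Agent.reply` is one simple design choice among many.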
Background: Modeling of a complex biological process can explain the results of experimental studies and help predict its characteristics. Among such processes is transcription in the presence of competing RNA polymerases. This process involves collisions of RNA polymerases followed by transcription termination. Results: A mathematical and computer simulation model is developed to describe the competition of RNA polymerases during gene transcription on complementary DNA strands. For example, in barley (Hordeum vulgare) the polymerase competition occurs in the locus containing the plastome genes psbA, rpl23, rpl2 and four bacterial-type promoters. In heat shock experiments on isolated chloroplasts, a twofold decrease of psbA transcripts and an even larger increase of rpl23-rpl2 transcripts were observed, which is well reproduced by the model. The model predictions are in good agreement with virtually all relevant experimental data (knockout, heat shock, chromatogram data, etc.). The model makes it possible to hypothesize a mechanism of cell response to knockout and heat shock, as well as a mechanism of gene expression regulation in the presence of RNA polymerase competition. The model is implemented for multiprocessor platforms with MPI and is supported on Linux and MS Windows. The source code, written in C++, is available under the GNU General Public License from the laboratory website. A user-friendly GUI version is also provided at http://lab6.iitp.ru/en/rivals. Conclusions: The developed model is in good agreement with virtually all relevant experimental data. The model can be applied to estimate the intensities of binding of the holoenzyme and the phage-type RNA polymerase to their promoters using data on gene transcription levels, as well as to predict characteristics of RNA polymerases and the transcription process that are difficult to measure directly, e.g., the intensity (frequency) of holoenzyme binding to the promoter in relation to its nucleotide composition and the type of σ-subunit, the number of aborted transcription initiations, etc. The model can be used to make functional predictions, e.g., the heat shock response in isolated chloroplasts and changes of gene transcription levels under knockout of different σ-subunits or RNA polymerases or due to gene expression regulation. Reviewers: This article was reviewed by Dr. Anthony Almudevar, Dr. Aniko Szabo, Dr. Yuri Wolf (nominated by Dr. Peter Olofsson) and Prof. Marek Kimmel.
Background: One of the main aims of phylogenomics is the reconstruction of objects defined in the leaves along the whole phylogenetic tree to minimize the specified functional, which may also include the phylogenetic tree generation. Such objects can include nucleotide and amino acid sequences, chromosomal structures, etc. The structures can have any set of linear and circular chromosomes, variable gene composition and include any number of paralogs, as well as any weights of individual evolutionary operations to transform a chromosome structure. Many heuristic algorithms have been proposed for this purpose but, to our knowledge, only a few of them are exact algorithms with low (linear, cubic or similar) polynomial computational complexity. The algorithms naturally start from the calculation of both the distance between two structures and the shortest sequence of operations transforming one structure into another. Such a calculation is in itself an NP-hard problem. Results: A general model of chromosomal structure rearrangements is considered. Exact algorithms with almost linear or cubic polynomial complexities have been developed to solve the problems for the case of any chromosomal structure but with certain limitations on operation weights. The computer programs are tested on biological data for the problem of mitochondrial or plastid chromosomal structure reconstruction. To our knowledge, no other computer programs are available for this model. Conclusions: The exactness of the proposed algorithms and their low polynomial complexities have been proved. The reconstructed evolutionary trees of mitochondrial and plastid chromosomal structures as well as the ancestral states of the structures appear to be reasonable. Electronic supplementary material: The online version of this article (doi:10.1186/s12859-016-0878-z) contains supplementary material, which is available to authorized users.
Background: Perfectly or highly conserved DNA elements have been found in vertebrates, invertebrates, and plants by various methods. However, little is known about such elements in protists. The evolutionary distance between apicomplexans can be very high, in particular, due to the positive selection pressure on them. This complicates the identification of highly conserved elements in alveolates, a difficulty that the proposed algorithm overcomes. Results: A novel algorithm is developed to identify highly conserved DNA elements. It is based on the identification of dense subgraphs in a specially built multipartite graph (whose parts correspond to genomes). Specifically, the algorithm does not rely on genome alignments or on pre-identified perfectly conserved elements; instead, it performs a fast search for pairs of words (in different genomes) of maximum length with the difference below the specified edit distance. Each such pair defines an edge whose weight equals the maximum (or total) length of the words assigned to its ends. The graph composed of these edges is then compacted by merging some of its edges and vertices. The dense subgraphs are identified by a cellular automaton-like algorithm; each subgraph defines a cluster composed of similar inextensible words from different genomes. Almost all clusters are considered to be predicted highly conserved elements. The algorithm is applied to the nuclear genomes of the superphylum Alveolata, and the corresponding phylogenetic tree is built and discussed. Conclusion: We proposed an algorithm for the identification of highly conserved elements. The multitude of identified elements was used to infer the phylogeny of Alveolata. Electronic supplementary material: The online version of this article (doi:10.1186/s12859-016-1257-5) contains supplementary material, which is available to authorized users.
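The graph construction described above can be illustrated with a heavily simplified sketch. It is not the authors' implementation, and it makes several assumptions: words have a fixed length `k` rather than maximum length, similarity is measured by Hamming distance rather than edit distance, the search is a brute-force scan rather than a fast one, and connected components (via union-find) stand in for the dense subgraphs that the paper finds with a cellular automaton-like procedure. All function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def hamming(a, b):
    """Number of mismatching positions between two equal-length words."""
    return sum(x != y for x, y in zip(a, b))


def build_edges(genomes, k, max_diff):
    """Build edges of a multipartite graph whose parts are genomes.

    genomes: dict mapping genome name -> sequence string.
    A vertex is (genome name, start position); an edge joins two words
    from different genomes that differ in at most max_diff positions.
    The edge weight is the word length k (simplifying assumption).
    """
    edges = []
    for (g1, s1), (g2, s2) in combinations(genomes.items(), 2):
        for i in range(len(s1) - k + 1):
            w1 = s1[i:i + k]
            for j in range(len(s2) - k + 1):
                if hamming(w1, s2[j:j + k]) <= max_diff:
                    edges.append(((g1, i), (g2, j), k))
    return edges


def clusters(edges):
    """Group vertices into connected components with union-find.

    Stand-in for dense-subgraph detection; not the paper's method.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, v, _ in edges:
        parent[find(u)] = find(v)

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    return list(groups.values())


# Toy input, invented for illustration only.
genomes = {"A": "AAACGTACGT", "B": "CCACGTACGA"}
edges = build_edges(genomes, k=6, max_diff=0)
found = clusters(edges)
```

Each edge connects vertices in different parts only, which is what makes the graph multipartite; raising `max_diff` admits more divergent word pairs and typically merges clusters.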