One of the greatest challenges facing modern molecular biology is understanding the complex mechanisms that regulate gene expression. A fundamental step in this process is the characterization of regulatory motifs that play key roles in regulating gene expression at the transcriptional and post-transcriptional levels. In particular, transcription is modulated by the interaction of transcription factors with their corresponding binding sites. Weeder Web is a web interface to Weeder, an algorithm for the automatic discovery of conserved motifs in a set of related regulatory DNA sequences. The motifs found are in turn likely to be instances of binding sites for some transcription factor. Besides providing access to the program, the interface has been designed to make the program as simple to use as possible and to require very little prior knowledge about the length and conservation of the motifs to be found. In fact, the interface automatically starts several runs of the program, each with different parameters, and provides the user with an overall summary of the results as well as some 'advice' on which motifs look more interesting according to their statistical significance and some simple considerations. The web interface is available at www.pesolelab.it by following the 'Tools' link.
Pattern discovery in unaligned DNA sequences is a challenging problem in both computer science and molecular biology. Several methods and techniques have been proposed so far, but in most cases signals in DNA sequences are very complicated and elude detection. Exact exhaustive methods can solve the problem only for short signals with a limited number of mutations. In this work, we extend exhaustive enumeration to longer patterns as well. In more detail, the basic version of the algorithm presented in this paper, given as input a set of sequences and an error ratio epsilon < 1, finds all patterns that occur in at least q sequences of the set with at most epsilon*m mutations, where m is the length of the pattern. The only restriction is imposed on the location of mutations along the signal: a valid occurrence of a pattern can present at most [epsilon*i] mismatches in its first i nucleotides, and so on for each prefix length i. However, we show how the algorithm can also be used when no assumption can be made about the position of mutations. In this case, it is also possible to estimate the probability of finding a signal as a function of the signal length, the error ratio, and the input parameters. Finally, we discuss some significance measures that can be used to sort the patterns output by the algorithm.
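The position-restricted mismatch rule described above can be illustrated with a minimal sketch. This is not the paper's enumeration algorithm: it only checks one candidate pattern against a set of sequences by brute force. The function names are hypothetical, and reading [epsilon*i] as a floor is an assumption.

```python
from math import floor


def matches_with_prefix_bound(pattern: str, window: str, eps: float) -> bool:
    """Check one alignment of pattern against an equally long window.

    The window is accepted only if every prefix of length i contains at
    most floor(eps * i) mismatches (floor is an assumed reading of the
    bracket notation in the text).
    """
    mismatches = 0
    for i, (p, w) in enumerate(zip(pattern, window), start=1):
        if p != w:
            mismatches += 1
        if mismatches > floor(eps * i):
            return False
    return True


def occurs_in(pattern: str, sequence: str, eps: float) -> bool:
    """True if some window of the sequence is a valid occurrence."""
    m = len(pattern)
    return any(
        matches_with_prefix_bound(pattern, sequence[s:s + m], eps)
        for s in range(len(sequence) - m + 1)
    )


def support(pattern: str, sequences: list[str], eps: float) -> int:
    """Number of input sequences containing a valid occurrence."""
    return sum(occurs_in(pattern, seq, eps) for seq in sequences)


if __name__ == "__main__":
    seqs = ["GGACGAACGAGG", "TCGTACGTAAAA", "ACGTACGTCC"]
    q = 2  # hypothetical quorum: pattern must occur in at least q sequences
    print(support("ACGTACGT", seqs, eps=0.25) >= q)
```

With eps = 0.25 and m = 8, no mismatch is allowed in the first three positions, one is allowed from position four, and two only at position eight. A window that mismatches in its first nucleotide is therefore rejected even though its total mismatch count is below epsilon*m, which is exactly the restriction on mutation location that the text describes.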
Systems biology has experienced dramatic growth in the number, size, and complexity of computational models. To reproduce simulation results and reuse models, researchers must exchange unambiguous model descriptions. We review the latest edition of the Systems Biology Markup Language (SBML), a format designed for this purpose. A community of modelers and software authors developed SBML Level 3 over the past decade. Its modular form consists of a core suited to representing reaction‐based models and packages that extend the core with features suited to other model types including constraint‐based models, reaction‐diffusion models, logical network models, and rule‐based models. The format leverages two decades of SBML and a rich software ecosystem that transformed how systems biologists build and interact with models. More recently, the rise of multiscale models of whole cells and organs, and new data sources such as single‐cell measurements and live imaging, has precipitated new ways of integrating data with models. We provide our perspectives on the challenges presented by these developments and how SBML Level 3 provides the foundation needed to support this evolution.
Prostate cancer is the most common malignant tumor in men, but prostate Magnetic Resonance Imaging (MRI) analysis remains challenging. Besides whole prostate gland segmentation, the capability to differentiate between the blurry boundary of the Central Gland (CG) and Peripheral Zone (PZ) can lead to differential diagnosis, since the frequency and severity of tumors differ in these regions. To tackle the prostate zonal segmentation task, we propose a novel Convolutional Neural Network (CNN), called USE-Net, which incorporates Squeeze-and-Excitation (SE) blocks into U-Net, i.e., one of the most effective CNNs in biomedical image segmentation. Specifically, the SE blocks are added after every Encoder (Enc USE-Net) or Encoder-Decoder block (Enc-Dec USE-Net). This study evaluates the generalization ability of CNN-based architectures on three T2-weighted MRI datasets, each one consisting of a different number of patients and heterogeneous image characteristics, collected by different institutions. The following mixed scheme is used for training/testing: (i) training on either each individual dataset or multiple prostate MRI datasets and (ii) testing on all three datasets with all possible training/testing combinations. USE-Net is compared against three state-of-the-art CNN-based architectures (i.e., U-Net, pix2pix, and Mixed-Scale Dense Network), along with a semi-automatic continuous max-flow model. The results show that training on the union of the datasets generally outperforms training on each dataset separately, allowing for both intra-/cross-dataset generalization. Enc USE-Net shows good overall generalization under any training condition, while Enc-Dec USE-Net remarkably outperforms the other methods when trained on all datasets. These findings reveal that the SE blocks' adaptive feature recalibration provides excellent cross-dataset generalization when testing is performed on samples of the datasets used during training.
Therefore, we should consider multi-dataset training and SE
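To make the "adaptive feature recalibration" performed by an SE block concrete, here is a minimal, dependency-free sketch of the generic Squeeze-and-Excitation operation: global average pooling per channel, a two-layer bottleneck with ReLU and sigmoid, then channel-wise rescaling. It is not the USE-Net implementation; all function names and the toy weights are hypothetical, and a real model would use a deep learning framework with learned weights.

```python
import math


def squeeze(feature_map):
    """Global average pooling: one scalar per channel of a [C][H][W] map."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]


def dense(vec, weights, bias):
    """Fully connected layer; weights[j][i] maps input i to output j."""
    return [
        sum(w * x for w, x in zip(row, vec)) + b
        for row, b in zip(weights, bias)
    ]


def excite(z, w1, b1, w2, b2):
    """Bottleneck MLP: ReLU hidden layer, then sigmoid gates in (0, 1)."""
    hidden = [max(0.0, h) for h in dense(z, w1, b1)]
    return [1.0 / (1.0 + math.exp(-s)) for s in dense(hidden, w2, b2)]


def se_block(feature_map, w1, b1, w2, b2):
    """Rescale every channel by its gate (channel-wise recalibration)."""
    gates = excite(squeeze(feature_map), w1, b1, w2, b2)
    return [
        [[g * v for v in row] for row in channel]
        for channel, g in zip(feature_map, gates)
    ]


if __name__ == "__main__":
    # Toy input: 2 channels of size 2x2; weights are made up for illustration.
    x = [[[1, 1], [1, 1]], [[2, 2], [2, 2]]]
    w1, b1 = [[0.5, -0.5]], [0.0]          # 2 channels -> 1 hidden unit
    w2, b2 = [[1.0], [-1.0]], [0.0, 0.0]   # 1 hidden unit -> 2 gates
    print(se_block(x, w1, b1, w2, b2))
```

The gates depend on the input's own channel statistics, so the same block can weight channels differently for images from different institutions. This is the mechanism the abstract refers to, shown here only in its generic form.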