Highlights
• Histone 3 lysine 14 is essential and required for developmental patterning
• H3K14ac decorates a set of tissue-specific genes that lack canonical histone marks
• H3K14 is necessary for expression of genes marked uniquely by H3K14 acetylation
• H3K14ac is recognized by the Brahma bromodomain
Human carbonic anhydrase IX (CAIX) has emerged as a promising biomarker for cancer prognosis because it is overexpressed in various cancers and shows restricted expression in normal tissue. However, limited information is available on its biophysical behavior. The unfolding of CAIX in aqueous urea solution was studied using an all-atom molecular dynamics simulation approach. The results revealed a stable intermediate state along the unfolding pathway of CAIX. At intermediate concentrations of urea (2.0-4.0 M), the protein displays a native-like structure, with a large fraction of its secondary structure and hydrophobic contacts remaining intact and only small, confined overall motions. Beyond 4.0 M urea, the unfolding is more gradual, and at 8.0 M urea the structure is largely collapsed due to the solvent effect. The hydrophobic contact analysis suggests that contacts in the terminal α-helices separate first, and that this loss then propagates to contacts in the centrally located β-sheets. The 60-65% reduction in tertiary contacts at 7.0-8.0 M urea suggests the presence of residual structure in the unfolded state, which is confirmed by structural snapshots. Free energy landscape analysis suggests that the unfolding of CAIX proceeds through different intermediate states.
Bacterial populations are routinely characterized based on microscopic examination, colony formation, and biochemical tests. However, in the recent past, bacterial identification, classification, and nomenclature have been strongly influenced by genome sequence information. Advances in bioinformatics and growth in genome databases have placed genome-based metadata analysis in the hands of researchers who will require taxonomic experience to resolve intricacies. To achieve this, different tools are now available to quantitatively measure genome relatedness within members of the same species, and genome-wide average nucleotide identity (gANI) is one such reliable tool to measure genome similarity. A genome assembly with a gANI score of <95% at the intraspecies level is generally considered indicative of a separate species. In this study, we analysed 300 whole-genome sequences belonging to 26 different bacterial species available in the NCBI Genome database and calculated their similarity at the intraspecies level based on gANI score. At the intraspecies level, nine bacterial species showed less than 90% gANI and more than 10% unaligned regions. We suggest the appropriate use of available bioinformatics resources after genome assembly to arrive at the proper bacterial identification, classification, and nomenclature, and to avoid erroneous species assignments and disparity due to diversity at the intraspecies level.
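The 95% gANI cutoff described above can be illustrated with a minimal sketch. The abstract does not give the formula or the tool used, so the following assumes a simplified length-weighted average identity over aligned segments; the names `AlignedSegment`, `weighted_ani`, `unaligned_fraction`, and `same_species` are hypothetical and do not reproduce any particular gANI implementation.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class AlignedSegment:
    """One aligned region between two genomes (hypothetical structure)."""
    length: int       # aligned length in base pairs
    identity: float   # percent nucleotide identity, 0 to 100


def weighted_ani(segments: Sequence[AlignedSegment]) -> float:
    """Length-weighted mean percent identity over all aligned segments.

    Assumption: a simplified stand-in for a gANI-style score.
    """
    total = sum(s.length for s in segments)
    if total == 0:
        raise ValueError("no aligned segments")
    return sum(s.length * s.identity for s in segments) / total


def unaligned_fraction(segments: Sequence[AlignedSegment], genome_length: int) -> float:
    """Fraction of the query genome not covered by any aligned segment."""
    aligned = sum(s.length for s in segments)
    return 1.0 - aligned / genome_length


def same_species(ani: float, threshold: float = 95.0) -> bool:
    """Apply the 95% cutoff stated in the abstract."""
    return ani >= threshold


# Illustrative, made-up segments (not data from the study).
segments = [AlignedSegment(1000, 98.0), AlignedSegment(3000, 94.0)]
ani = weighted_ani(segments)
print(ani, same_species(ani), unaligned_fraction(segments, genome_length=5000))
```

A pair of genomes scoring below the cutoff, or leaving a large unaligned fraction, would be flagged for taxonomic review rather than reassigned automatically.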
Recent large datasets measuring the gene expression of millions of possible gene promoter sequences provide a resource to design and train optimized deep neural network architectures to predict expression from sequences. High predictive performance due to the modeling of dependencies within and between regulatory sequences is an enabler for biological discoveries in gene regulation through model interpretation techniques. To understand the regulatory code that delineates gene expression, we have designed a novel deep-learning model (CRMnet) to predict gene expression in Saccharomyces cerevisiae. Our model outperforms the current benchmark models and achieves a Pearson correlation coefficient of 0.971 and a mean squared error of 3.200. Interpreting the informative genomic regions identified by model saliency maps, and overlapping those maps with known yeast motifs, supports the conclusion that our model can successfully locate the binding sites of transcription factors that actively modulate gene expression. We compare our model's training times on a large compute cluster with GPUs and Google TPUs to indicate practical training times on similar datasets.
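The two evaluation metrics quoted above, the Pearson correlation coefficient and the mean squared error, can be computed as in the following sketch. It uses the standard textbook definitions in plain Python; the function names and the toy values are illustrative assumptions and are not taken from the CRMnet code or data.

```python
import math
from typing import Sequence


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average of squared differences between measured and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation: covariance divided by the product of standard deviations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Toy values for illustration only (not results from the paper).
measured = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 4.0, 6.0, 8.0]
print(pearson_r(measured, predicted), mean_squared_error(measured, predicted))
```

The toy example shows why both metrics are reported together: predictions can be perfectly correlated with the measurements while still carrying a large squared error if they are off in scale.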