Motivation: Recombinant protein production is a widely used technique in the biotechnology and biomedical industries, yet only a quarter of target proteins are soluble and can therefore be purified. Results: We have discovered that global structural flexibility, which can be modeled by normalised B-factors, accurately predicts the solubility of 12,216 recombinant proteins expressed in Escherichia coli. We have optimised these B-factors, and derived a new set of values for solubility scoring that further improves prediction accuracy. We call this new predictor the ‘Solubility-Weighted Index’ (SWI). Importantly, SWI outperforms many existing protein solubility prediction tools. Furthermore, we have developed ‘SoDoPE’ (Soluble Domain for Protein Expression), a web interface that allows users to choose a protein region of interest for predicting and maximising both protein expression and solubility. Availability: The SoDoPE web server and source code are freely available at https://tisigner.com/sodope and https://github.com/Gardner-BinfLab/TISIGNER-ReactJS, respectively. The code and data for reproducing our analysis can be found at https://github.com/Gardner-BinfLab/SoDoPE_paper_2020. Supplementary information: Supplementary data are available at Bioinformatics online.
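As a rough illustration of sequence-level scoring from per-residue weights, the sketch below averages one weight per amino acid over a sequence. It is only a sketch under stated assumptions: the abstract does not say how the weights are combined, so the arithmetic mean is an assumption, the function name `flexibility_index` is hypothetical, and the values in `PLACEHOLDER_WEIGHTS` are invented placeholders, not the published SWI values.

```python
from statistics import mean

# HYPOTHETICAL weights for illustration only; these are NOT the published
# SWI values and NOT real normalised B-factors.
PLACEHOLDER_WEIGHTS = {aa: 0.5 for aa in "ACDEFGHIKLMNPQRSTVWY"}
PLACEHOLDER_WEIGHTS.update({"E": 0.9, "K": 0.8, "W": 0.2, "C": 0.1})


def flexibility_index(sequence, weights=PLACEHOLDER_WEIGHTS):
    """Average the per-residue weights over a protein sequence.

    Assumption: the score is an arithmetic mean of one weight per residue.
    """
    residues = sequence.upper()
    if not residues:
        raise ValueError("sequence is empty")
    unknown = set(residues) - set(weights)
    if unknown:
        raise ValueError(f"residues without a weight: {sorted(unknown)}")
    return mean(weights[aa] for aa in residues)


if __name__ == "__main__":
    # With the placeholder weights, a charged stretch scores higher
    # than a tryptophan/cysteine-rich one.
    print(flexibility_index("EEKK"))
    print(flexibility_index("WWCC"))
```

With real weights substituted for the placeholders, the same function could be applied to a chosen sub-region of a protein simply by slicing the sequence before scoring.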
Nasopharyngeal carcinoma (NPC) originates from the epithelial cells of the nasopharynx, is associated with Epstein–Barr virus (EBV), and has its highest incidence and mortality rates in Southeast Asia. Late presentation is a common issue, and early detection could be the key to reducing the disease burden. The sensitivity of plasma EBV DNA, an established NPC biomarker, for Stage I NPC is controversial. Most newly reported NPC biomarkers have neither been externally validated nor compared to the established ones. This makes it difficult to plan cost‐effective early detection strategies. Our study systematically evaluated six established and four new biomarkers in NPC cases, population controls and hospital controls. We showed that BamHI‐W 76 bp remains the most sensitive plasma biomarker, with 96.7% (29/30), 96.7% (58/60) and 97.4% (226/232) sensitivity to detect Stage I, early stage and all NPC, respectively. Its specificity was 94.2% (113/120) against population controls and 90.4% (113/125) against hospital controls. The diagnostic accuracy of BamHI‐W 121 bp and ebv‐miR‐BART7‐3p was validated. That of hsa‐miR‐29a‐3p and hsa‐miR‐103a‐3p was not, possibly due to the lower number of advanced stage NPC cases included in this subset. Decision tree modeling suggested that a combination of BamHI‐W 76 bp and VCA IgA or EA IgG may increase the specificity or sensitivity to detect NPC. EBNA1 99 bp could identify NPC patients with poor prognosis in early and advanced stage NPC. Our findings provide evidence for improvement in NPC screening strategies, covering considerations of opportunistic screening, combining biomarkers to increase sensitivity or specificity, and testing biomarkers from a single sampled specimen to avoid the logistic problems of resampling.
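The percentages quoted for BamHI‐W 76 bp follow directly from the counts given in parentheses. The short sketch below recomputes them; the helper name `percent` and the dictionary layout are illustrative choices, and only the counts themselves come from the abstract.

```python
def percent(hits, total):
    """Return hits/total as a percentage rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * hits / total, 1)


# Counts as quoted in the abstract for BamHI-W 76 bp.
# Sensitivity: positive tests among NPC cases.
SENSITIVITY_COUNTS = {
    "Stage I": (29, 30),
    "early stage": (58, 60),
    "all NPC": (226, 232),
}
# Specificity: negative tests among controls.
SPECIFICITY_COUNTS = {
    "population controls": (113, 120),
    "hospital controls": (113, 125),
}

if __name__ == "__main__":
    for label, (hits, total) in SENSITIVITY_COUNTS.items():
        print(f"sensitivity, {label}: {percent(hits, total)}%")
    for label, (hits, total) in SPECIFICITY_COUNTS.items():
        print(f"specificity, {label}: {percent(hits, total)}%")
```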
Hepatitis B virus (HBV) is a major human pathogen that causes liver diseases. The main HBV RNAs are unspliced transcripts that encode the key viral proteins. Recent studies have shown that some of the HBV spliced transcript isoforms are predictive of liver cancer, yet the roles of these spliced transcripts remain elusive. Furthermore, there are nine major HBV genotypes common in different regions of the world, and these genotypes may express different spliced transcript isoforms. To systematically study the HBV splice variants, we transfected human hepatoma cells, Huh7, with four HBV genotypes (A2, B2, C2 and D3), followed by deep RNA-sequencing. We found that 13–28% of HBV RNAs were splice variants, which were reproducibly detected across independent biological replicates. These comprised 6 novel and 10 previously identified splice variants. In particular, a novel, singly spliced transcript was detected in genotypes A2 and D3 at high levels. The biological relevance of these splice variants was supported by their identification in HBV-positive liver biopsy and serum samples, and in HBV-infected primary human hepatocytes. Interestingly, the levels of HBV splice variants varied across the genotypes, but the spliced pregenomic RNAs SP1 and SP9 were the two most abundant splice variants. Counterintuitively, these singly spliced SP1 and SP9 variants had a suboptimal 5′ splice site, supporting the idea that splicing of HBV RNAs is tightly controlled by the viral post-transcriptional regulatory RNA element.
Introns in mRNA leaders are common in complex eukaryotes, but often overlooked. These introns are spliced out before translation, leaving exon-exon junctions in the mRNA leaders (leader EEJs). Our multi-omic approach shows that the number of leader EEJs inversely correlates with translation of the main protein, as does the number of upstream open reading frames (uORFs). Across the five species studied, the lowest levels of translation were observed for mRNAs with both leader EEJs and uORFs (29%). This class of mRNAs also has ribosome footprints on uORFs, with strong triplet periodicity indicating uORF translation. Furthermore, the positions of both the leader EEJ and the uORF are conserved between human and mouse. Thus, the uORF, in combination with the leader EEJ, predicts lower expression for nearly one-third of eukaryotic proteins.
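To make the notion of counting uORFs in a leader concrete, here is a minimal sketch. It assumes a simple working definition that is not taken from the study: a uORF is an ATG (AUG) in the leader followed by an in-frame stop codon that also lies within the leader. The function name `find_uorfs` is hypothetical, and the study's own criteria may differ (for example, it may also count uORFs that overlap the main coding sequence).

```python
import re

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_uorfs(leader):
    """Return (start, end) pairs for ATG...stop ORFs fully inside the leader.

    Coordinates are 0-based, with end exclusive and including the stop codon.
    Assumption: only uORFs that terminate within the leader are reported.
    """
    seq = leader.upper().replace("U", "T")
    uorfs = []
    # The lookahead lets overlapping start codons be found as well.
    for match in re.finditer(r"(?=ATG)", seq):
        start = match.start()
        # Walk codon by codon from the codon after the start.
        for pos in range(start + 3, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                uorfs.append((start, pos + 3))
                break
    return uorfs


if __name__ == "__main__":
    example_leader = "GGAUGAAAUAGCC"  # toy sequence, not real data
    print(find_uorfs(example_leader))
    print(len(find_uorfs(example_leader)), "uORF(s) in the toy leader")
```

The number of uORFs per leader is then just the length of the returned list, which is the kind of per-transcript count that could be compared against a translation measure.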
Human Motion Analysis (HMA) is currently one of the most active research domains, motivated by a number of real-world applications such as video surveillance, sports analysis and healthcare monitoring. However, most of these real-world applications face high levels of uncertainty that can affect their operation. Hence, fuzzy set theory has been applied to HMA, with great success in the recent past. In this paper, we review the fuzzy set oriented approaches for HMA, identify how fuzzy sets may improve HMA, and outline future perspectives. To the best of our knowledge, no survey in the current literature has discussed and reviewed fuzzy approaches to HMA. For ease of understanding, we conceptually classify human motion into three broad levels: Low-Level (LoL), Mid-Level (MiL), and High-Level (HiL) HMA.
Structured RNA elements may control virus replication, transcription and translation, and their distinct features are being exploited by novel antiviral strategies. Viral RNA elements continue to be discovered using combinations of experimental and computational analyses. However, the wealth of sequence data, notably from deep viral RNA sequencing, viromes, and metagenomes, makes computational approaches an essential discovery tool. In this review, we describe practical approaches being used to discover functional RNA elements in viral genomes. In addition to success stories in new and emerging viruses, these approaches have revealed some surprising new features of well-studied viruses, e.g., human immunodeficiency virus, hepatitis C virus, influenza, and dengue viruses. Some notable discoveries were facilitated by new comparative analyses of diverse viral genome alignments. Importantly, comparative approaches for finding RNA elements embedded in coding and non-coding regions differ. With the exponential growth of computer power, we have progressed from stem-loop prediction on single sequences to cutting-edge 3D prediction, and from the command line to user-friendly web interfaces. Despite these advances, many powerful, user-friendly prediction tools and resources are underutilized by the virology community.
Many viruses contain RNA elements that modulate splicing and/or promote nuclear export of their RNAs. The RNAs of the major human pathogen, hepatitis B virus (HBV), contain a large (~600 bases) composite cis-acting 'post-transcriptional regulatory element' (PRE). This element promotes expression from these naturally intronless transcripts. Indeed, the related woodchuck hepadnavirus PRE (WPRE) is used to enhance expression in gene therapy and other expression vectors. These PREs are likely to act through a combination of mechanisms, including promotion of RNA nuclear export. Functional components of both the HBV PRE and WPRE are two conserved RNA cis-acting stem-loop (SL) structures, SLα and SLβ. They lie within the coding regions of the polymerase (P) gene, and of both the P and X genes, respectively. Based on previous studies using mutagenesis and/or nuclear magnetic resonance (NMR), here we propose two covariance models for SLα and SLβ. The model for the 30-nucleotide SLα contains a G-bulge and a CNGG(U) apical loop, of which the first and the fourth loop residues form a CG pair and the fifth loop residue is bulged out, as observed in the NMR structure. The model for the 23-nucleotide SLβ contains a 7-base-pair stem and a 9-nucleotide loop. Comparison of the models with other RNA structural elements, as well as similarity searches of the human transcriptome and viral genomes, demonstrates that SLα and SLβ are specific to HBV transcripts. However, they are well conserved among the hepadnaviruses of non-human primates, the woodchuck and the ground squirrel.