DNA-binding proteins (DBPs) play crucial roles in numerous cellular processes, including nucleotide recognition, transcriptional control and the regulation of gene expression. Most existing computational techniques for identifying DBPs are applicable mainly to human and mouse datasets. Even though some models have been tested on Arabidopsis, they produce poor accuracy when applied to other plant species. Therefore, it is imperative to develop an effective computational model for predicting plant DBPs. In this study, we developed a comprehensive computational model for plant-specific DBP identification. Five shallow learning and six deep learning models were initially used for prediction, and the shallow learning methods outperformed the deep learning algorithms. In particular, the support vector machine achieved the highest repeated 5-fold cross-validation accuracy, with 94.0% area under the receiver operating characteristic curve (AUC-ROC) and 93.5% area under the precision-recall curve (AUC-PR). On an independent dataset, the developed approach achieved 93.8% AUC-ROC and 94.6% AUC-PR. When compared with existing state-of-the-art tools on an independent dataset, the proposed model achieved much higher accuracy. Overall, the results suggest that the developed computational model is more efficient and reliable than the existing models for the prediction of DBPs in plants. For the convenience of experimental scientists, the developed prediction server PlDBPred is publicly accessible at https://iasri-sg.icar.gov.in/pldbpred/. The source code is also provided at https://iasri-sg.icar.gov.in/pldbpred/source_code.php for prediction on large datasets.
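The two metrics reported above, AUC-ROC and AUC-PR, can be computed from labels and classifier scores as in the following minimal sketch. It is not the PlDBPred implementation: the function names and the toy labels and scores are hypothetical, AUC-PR is approximated by step-wise average precision, and tied scores are not given special treatment in the precision-recall function.

```python
def auc_roc(labels, scores):
    """AUC-ROC via the rank-sum identity: the fraction of
    (positive, negative) pairs in which the positive scores higher."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # A correctly ordered pair counts 1, a tied pair counts 0.5.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve.
    Assumes scores are distinct (no tie handling)."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive example")
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / n_pos


# Toy data (hypothetical): 1 = DBP, 0 = non-DBP.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(round(auc_roc(labels, scores), 4))
print(round(average_precision(labels, scores), 4))
```

Reporting both metrics is useful because AUC-PR is more sensitive than AUC-ROC to how well the positive class is ranked when classes are imbalanced.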
Identifying the factors that determine RBP-RNA interactions remains a major challenge. Binding involves sparse motifs and a suitable sequence context. The present work describes an approach to detect RBP binding sites in RNAs using an ultra-fast inexact k-mer search for statistically significant seeds. The seeds work as anchors to evaluate the context and binding potential using flanking-region information with a deep feed-forward neural network. The developed models were also supported by MD-simulation studies. The implemented software, RBPSpot, scored consistently high on all performance metrics, including an average accuracy of ~90% across a large number of validated datasets. It outperformed the compared tools, including some with much more complex deep-learning models, in a comprehensive benchmarking process. RBPSpot can identify RBP binding sites in the human system and can also be used to develop new models, making it a valuable resource for regulatory system studies.
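The seed-and-flank idea described above can be illustrated with a brute-force sketch. It is not RBPSpot's algorithm: the abstract does not specify how the ultra-fast search or the significance test work, so this version simply scans every window, allows a fixed number of mismatches (Hamming distance), and returns the flanking regions that a downstream model might consume. The function names, the example sequence and seed, and the mismatch threshold are all assumptions.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_inexact_kmer(sequence, seed, max_mismatches=1):
    """Return (start, window, mismatches) for every window of the
    sequence that is within max_mismatches of the seed k-mer."""
    k = len(seed)
    hits = []
    for start in range(len(sequence) - k + 1):
        window = sequence[start:start + k]
        d = hamming(window, seed)
        if d <= max_mismatches:
            hits.append((start, window, d))
    return hits


def flanks(sequence, start, k, width):
    """Left and right flanking regions around a seed hit, clipped
    at the sequence ends."""
    left = sequence[max(0, start - width):start]
    right = sequence[start + k:start + k + width]
    return left, right


# Hypothetical RNA sequence and seed, for illustration only.
rna = "GGUGCAUGAAUGCAUGCC"
for start, window, d in find_inexact_kmer(rna, "UGCAUG", max_mismatches=1):
    print(start, window, d, flanks(rna, start, 6, 3))
```

A linear scan like this costs O(n·k) per seed; a production tool would likely rely on indexing to reach the speed the abstract describes.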
RNA-binding proteins (RBPs) are essential for post-transcriptional gene regulation in eukaryotes, including splicing control, mRNA transport and decay. Accurate identification of RBPs is therefore important for understanding gene expression and the regulation of cell state. A number of computational models have been developed to detect RBPs. These methods used datasets from several eukaryotic species, mainly mice and humans. Although some models have been tested on Arabidopsis, these techniques fall short of correctly identifying RBPs in other plant species. Therefore, a powerful computational model for identifying plant-specific RBPs is needed. In this study, we present a novel computational model for identifying RBPs in plants. Five deep learning models and ten shallow learning algorithms were used for prediction with 20 sequence-derived and 20 evolutionary feature sets. The highest repeated five-fold cross-validation accuracy, 91.24% AU-ROC and 91.91% AU-PRC, was achieved by the light gradient boosting machine. When evaluated on an independent dataset, the developed approach achieved 94.00% AU-ROC and 94.50% AU-PRC. The proposed model achieved significantly higher accuracy for predicting plant-specific RBPs than the currently available state-of-the-art RBP prediction models. Although certain models have already been trained and assessed on the model organism Arabidopsis, this is the first comprehensive computational model for the discovery of plant-specific RBPs. The web server RBPLight was also developed and is publicly accessible at https://iasri-sg.icar.gov.in/rbplight/ for researchers who wish to identify RBPs in plants.
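Repeated five-fold cross-validation, the evaluation protocol named above, reshuffles the data and re-partitions it into five folds several times. The sketch below generates the index splits only. It is an illustration rather than the RBPLight pipeline: the number of repeats is not given in the abstract and is an assumed parameter, and the splits are not stratified by class.

```python
import random


def repeated_kfold(n_samples, k=5, repeats=3, seed=0):
    """Yield (repeat, fold, train_idx, test_idx) for repeated k-fold
    cross-validation. Each repeat uses a fresh shuffle of the indices."""
    rng = random.Random(seed)
    for r in range(repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)
        # Deal shuffled indices into k folds of near-equal size.
        folds = [idx[f::k] for f in range(k)]
        for f in range(k):
            test = sorted(folds[f])
            train = sorted(i for g, fold in enumerate(folds) if g != f for i in fold)
            yield r, f, train, test


# Hypothetical usage: 23 samples, 5 folds, 2 repeats -> 10 splits.
for r, f, train, test in repeated_kfold(23, k=5, repeats=2, seed=1):
    print(r, f, len(train), len(test))
```

Averaging a metric over all repeats and folds reduces the dependence of the estimate on any single random partition.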
Defining nutrient management zones (MZs) is crucial for the implementation of site-specific management. The determination of MZs is based on several factors, including crop, soil, climate, and terrain characteristics. This study aims to delineate MZs by means of geostatistical and fuzzy clustering algorithms using remotely sensed and laboratory data and, subsequently, to compare the zone maps in the north-eastern Himalayan region of India. For this study, 896 grid-wise representative soil samples (0–25 cm depth) were collected from the study area (1615 km²). The soils were analysed for soil reaction (pH), soil organic carbon, available macronutrients (N, P and K) and micronutrients (Fe, Mn, Zn and Cu). The predicted soil maps were developed using regression kriging, where 28 digital elevation model-derived terrain attributes and two vegetation derivatives were used as environmental covariates. The coefficient of determination (R²) and root mean square error were used to evaluate the model's performance. The predicted soil parameters were accurate, and regression kriging identified the highest variability for the majority of the soil variables. Further, to define the management zones, geographically weighted principal component analysis and the possibilistic fuzzy c-means clustering method were employed, and the optimum number of clusters was identified using the fuzzy performance index and normalized classification entropy. The management zones were constructed from all pixel points at 30 m spatial resolution (17,86,985 data points). The area was divided into four distinct zones, which could be managed differently. MZ 1 covers the largest share of the area (43.3%), followed by MZ 2 (29.4%), MZ 3 (27.0%) and MZ 4 (0.3%). The MZ map would thus not only serve as a guide for judicious location-specific nutrient management, but would also help policymakers bring about sustainable changes in the north-eastern Himalayan region of India.
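The two cluster-validity indices named above can be computed from a fuzzy membership matrix as in the sketch below. The abstract does not give the formulas, so the definitions used here are assumptions taken from common usage: the fuzzy performance index as 1 − (cF − 1)/(c − 1), where F is the partition coefficient, and the normalized classification entropy as H/(1 − c/n), where H is the mean membership entropy. Other normalizations of the entropy (for example, division by log c) also appear in the literature. Function names and the toy matrices are hypothetical.

```python
import math


def fuzzy_performance_index(U):
    """U has one row per data point and one column per cluster, with
    each row summing to 1. Returns 0 for a crisp partition and 1 when
    every point belongs equally to all clusters (assumed definition)."""
    n, c = len(U), len(U[0])
    F = sum(u * u for row in U for u in row) / n  # partition coefficient
    return 1.0 - (c * F - 1.0) / (c - 1.0)


def normalized_classification_entropy(U):
    """Mean membership entropy H divided by (1 - c/n); 0 for a crisp
    partition (assumed normalization)."""
    n, c = len(U), len(U[0])
    H = -sum(u * math.log(u) for row in U for u in row if u > 0) / n
    return H / (1.0 - c / n)


# Toy membership matrices for two clusters (hypothetical values).
crisp = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
fuzzy = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]
print(fuzzy_performance_index(crisp), normalized_classification_entropy(crisp))
print(fuzzy_performance_index(fuzzy), normalized_classification_entropy(fuzzy))
```

In typical use, the clustering is rerun for a range of cluster counts and the count at which both indices are lowest is taken as the optimum.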
The formation of mature miRNAs and their expression is a highly controlled process that depends strongly on post-transcriptional regulatory events. Recent findings suggest that several RNA-binding proteins (RBPs) beyond Drosha/Dicer are involved in the processing of miRNAs. Deciphering conditional networks for these RBP-miRNA interactions may help explain the spatio-temporal nature of miRNAs and can also be used to predict miRNA profiles. To this end, >25 TB of data from different platforms (CLIP-seq/RNA-seq/miRNA-seq) were studied to develop Bayesian causal networks capable of reasoning about miRNA biogenesis. The networks explained miRNA formation well when tested across a large number of conditions and experimentally validated data. The networks were modeled into an XGBoost machine learning system, in which expression information of the network components was found capable of quantitatively explaining miRNA formation levels and profiles. The models were developed for 1,204 human miRNAs whose expression levels could be estimated accurately from RNA-seq data alone, without separate miRNA profiling experiments such as miRNA-seq or arrays. The first of its kind, miRbiom performed consistently well, with high average accuracy (91%), when tested across a large number of experimentally established datasets from several conditions. It has been implemented as an interactive open-access web server where, besides finding miRNA profiles, downstream functional analysis can also be done. miRbiom will help obtain accurate predictions of human miRNA profiles in the absence of profiling experiments and will be an asset for regulatory research. The study also shows the importance of RBP interaction information for better understanding miRNAs and their functional projectiles, and lays the foundation for such studies and software in the future.
Background: Picrorhiza kurroa Royle ex Benth., a rich source of phytochemicals, is a promising high-altitude medicinal herb of the Himalaya. Its medicinal potential is attributed to picrosides, i.e. iridoid glycosides, which are synthesized in an organ-specific manner through highly complex pathways. Here, we present a large-scale proteome reference map of P. kurroa, covering four morphologically differentiated organs and two developmental stages. Results: We identified 5186 protein accessions (FDR < 1%), providing deep coverage of protein abundance spanning around six orders of magnitude. Most of the identified proteins are associated with metabolic processes, response to abiotic stimuli and cellular processes. Organ-specific sub-proteomes highlight organ-specialized functions and offer insights for exploring tissue profiles of specific protein classes. With respect to P. kurroa development, the vegetative phase is enriched in growth-related processes, whereas the generative phase invests more energy in secondary metabolic pathways. Furthermore, stress-responsive proteins, RNA-binding proteins (RBPs) and post-translational modifications (PTMs), particularly phosphorylation and ADP-ribosylation, play an important role in the adaptation of P. kurroa to the alpine environment. The proteins involved in the synthesis of secondary metabolites are well represented in the P. kurroa proteome. Phytochemical analysis revealed that marker compounds accumulated most highly in the rhizome and, overall, during the late stage of development. Conclusions: This report is the first extensive proteomic description of P. kurroa dissected by organ and developmental stage, providing a platform for future studies related to stress tolerance and medical applications.