Protein secondary structure elements (SSE) like alpha-helices and beta-strands are monitored throughout the simulation. The plot above reports SSE distribution by residue index throughout the protein structure. The plot below summarizes the SSE composition for each trajectory frame over the course of the simulation, and the plot at the bottom monitors each residue and its SSE assignment over time.
Advances in cancer research have given researchers and clinicians access to massive amounts of data that can aid cancer patients and extend existing knowledge. However, despite the existence of reliable sources for extracting this data, it remains a challenge to accurately comprehend and draw conclusions from the entirety of the available information. Therefore, the current study aimed to design and develop a database of the variants identified in 5 different cancer types using 20 different cancer exomes. The exome data were retrieved from NCBI SRA, and an NGS data clean-up protocol was implemented to obtain the best-quality reads. The reads that passed the quality checks were used for variant calling, and the called variants were then processed and filtered. The filtered data were normalized, and the normalized data were used to develop the database. MutaXome, which stands for mutations in cancer exome, was designed in SQL, with the front end in Bootstrap and HTML and the back end in PHP. The normalized data containing the variants, including single nucleotide polymorphisms (SNPs), were added to MutaXome, which contains detailed information on each type of identified variant. This database, available online at http://www.vidyalab.rf.gd/ , serves as a knowledge base for cancer exome variations and could be enriched in prospective studies by linking it to a decision support system.
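The idea of loading normalized variant data into a relational database can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not describe the actual MutaXome schema, so every table and column name here is hypothetical, the inserted rows are placeholders rather than real variants, and Python's built-in `sqlite3` module stands in for the SQL and PHP stack named above.

```python
import sqlite3

# Hypothetical normalized schema: names are illustrative, not MutaXome's own.
SCHEMA = """
CREATE TABLE cancer_type (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE exome (
    id             INTEGER PRIMARY KEY,
    run_accession  TEXT NOT NULL UNIQUE,
    cancer_type_id INTEGER NOT NULL REFERENCES cancer_type(id)
);
CREATE TABLE variant (
    id       INTEGER PRIMARY KEY,
    exome_id INTEGER NOT NULL REFERENCES exome(id),
    chrom    TEXT NOT NULL,
    pos      INTEGER NOT NULL,
    ref      TEXT NOT NULL,
    alt      TEXT NOT NULL,
    kind     TEXT NOT NULL
);
"""


def build_demo_db():
    """Create an in-memory database and load a few placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO cancer_type (id, name) VALUES (1, 'example_cancer')")
    conn.execute(
        "INSERT INTO exome (id, run_accession, cancer_type_id) "
        "VALUES (1, 'RUN_PLACEHOLDER', 1)"
    )
    # Placeholder variants, not real calls from any exome.
    conn.executemany(
        "INSERT INTO variant (exome_id, chrom, pos, ref, alt, kind) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        [
            (1, "chr1", 100, "A", "G", "SNP"),
            (1, "chr2", 200, "AT", "A", "deletion"),
        ],
    )
    conn.commit()
    return conn


def variants_for_cancer(conn, cancer_name):
    """Return all variants recorded for one cancer type, ordered by position."""
    rows = conn.execute(
        """
        SELECT v.chrom, v.pos, v.ref, v.alt, v.kind
        FROM variant v
        JOIN exome e ON e.id = v.exome_id
        JOIN cancer_type c ON c.id = e.cancer_type_id
        WHERE c.name = ?
        ORDER BY v.chrom, v.pos
        """,
        (cancer_name,),
    )
    return rows.fetchall()
```

Splitting cancer types, exomes and variants into separate tables is one common way to normalize such data, since each exome and cancer type is then stored once and referenced by key.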
A decision support system (DSS) that classifies various cancers helps clinicians and researchers make better decisions that can aid early cancer diagnosis, thereby reducing the chance of an incorrect diagnosis. Thus, this work aimed to design a classification model that can accurately predict 5 different cancer types, represented by 20 cancer exomes, using the mutations identified from whole-exome cancer analysis. Initially, a basic model was designed using supervised machine learning classification algorithms, namely K-nearest neighbor (KNN), support vector machine (SVM), decision tree, naïve Bayes and random forest (RF), among which decision tree and random forest performed better in terms of preliminary model accuracy. However, the output predictions were incorrect owing to low training scores. Thus, 16 essential features were selected for model improvement, which was carried out using 2 approaches. All imbalanced datasets were balanced using SMOTE. In the first approach, models were trained on all features from the 20 cancer exome datasets using decision tree and random forest. On the balanced datasets, the decision tree model showed an accuracy of 77%, while the RF model improved the accuracy to 82% and predicted all 5 cancer types correctly. The area under the curve for the RF model was closer to 1 than that of the decision tree model. In the second approach, 15 datasets were used for training and 5 for testing. However, only 2 cancer types were predicted correctly. To cross-validate the RF model, the Matthews correlation coefficient (MCC) was computed. For the first approach, the test MCC and the cross-validation MCC were 0.7796 and 0.9356, respectively. Likewise, for the second approach, the MCC was 0.9365, corroborating the accuracy of the designed model. The model was successfully deployed as a web application using Streamlit for easy use. This study offers insights that could simplify cancer classification.
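For readers unfamiliar with the MCC, the following minimal sketch shows how a multiclass Matthews correlation coefficient can be computed from true and predicted labels. It is not the authors' code: the function name `multiclass_mcc` is hypothetical, and the formula is the standard multiclass generalization of the MCC, which reduces to the familiar binary form when there are two classes.

```python
from collections import Counter
from math import sqrt


def multiclass_mcc(y_true, y_pred):
    """Multiclass Matthews correlation coefficient.

    Returns a value in [-1, 1]; 1 means perfect prediction and 0 means
    no better than chance. Returns 0.0 when the denominator is zero
    (for example, when every sample is assigned to a single class).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    s = len(y_true)                                        # total samples
    c = sum(1 for t, p in zip(y_true, y_pred) if t == p)   # correct predictions
    t_counts = Counter(y_true)                             # true occurrences per class
    p_counts = Counter(y_pred)                             # predictions per class
    labels = set(t_counts) | set(p_counts)

    numerator = c * s - sum(t_counts[k] * p_counts[k] for k in labels)
    cov_true = s * s - sum(n * n for n in t_counts.values())
    cov_pred = s * s - sum(n * n for n in p_counts.values())
    denominator = sqrt(cov_true * cov_pred)
    return numerator / denominator if denominator else 0.0
```

Unlike plain accuracy, the MCC takes the full confusion matrix into account, which makes it a more informative summary when class sizes differ.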
Aedes aegypti is a dangerous vector of the etiological agents of dengue, chikungunya, Zika and yellow fever, and repellents are essential for combating it. However, chronic overuse of synthetic repellents has raised the possibility of adverse side effects in humans. As a consequence, researchers are now shifting their focus to developing natural alternatives to these repellents. The present study therefore aimed to devise a standard protocol that screens the whole proteome of A. aegypti and identifies the major proteins that can be targeted by natural bioactives to produce repellents. To study the binding of the natural actives to the targets, a whole proteome analysis was carried out by finding the reference proteome of the organism, performing a literature survey to identify the potential targets, examining the circadian rhythm of A. aegypti to identify the proteins expressed in the dark and light cycles, and shortlisting the targets by analyzing the common conserved domains of the query sequences. Twenty protein target categories were identified, from which 309 protein sequences were modelled using the standalone tool RaptorX. These structures were validated using Ramachandran plots from SAVES v6.0. Molecular docking studies using POAP between the selected representatives of the twenty protein targets and the natural bioactives revealed negative binding energies. The complexes with the lowest binding energies were taken forward for 100 ns molecular dynamics simulation studies, from which the stabilities of the docked complexes were noted and the conformational changes induced during simulation were revealed. This protocol allows a whole proteome analysis that identifies the major protein targets that natural bioactives can act upon and further reveals how effective these bioactives are against the proteins, suggesting that the methodology can also be applied to whole proteome analysis of other organisms.
Keywords: Whole proteome, Aedes aegypti, conserved domains, homology modelling, docking, molecular dynamic simulations
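The step of choosing which docked complexes go forward to simulation can be sketched as below. This is a hedged illustration, not the study's pipeline: it assumes that the complexes taken forward are those with the numerically lowest (most favorable) binding energy for each target, the function name `best_ligand_per_target` is hypothetical, and any target names, ligand names and energies passed to it are placeholders rather than results from the study.

```python
def best_ligand_per_target(docking_results):
    """Pick, for each target, the ligand with the lowest binding energy.

    docking_results: iterable of (target, ligand, binding_energy) tuples,
    with binding energies in consistent units (for example kcal/mol).
    Returns a dict mapping target -> (ligand, binding_energy).
    """
    best = {}
    for target, ligand, energy in docking_results:
        # Keep the first entry for a target, or replace it if this one is lower.
        if target not in best or energy < best[target][1]:
            best[target] = (ligand, energy)
    return best
```

A call such as `best_ligand_per_target([("target_A", "ligand_1", -5.0), ("target_A", "ligand_2", -7.5)])` would keep `ligand_2` for `target_A`, since -7.5 is the lower energy.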