Background: MicroRNAs (miRNAs) are noncoding RNA molecules heavily involved in human tumors, and only a few of them circulate in the human body. Finding a tumor-associated miRNA signature, that is, the minimum set of miRNAs that must be measured to discriminate both among different types of cancer and between tumor and normal tissues, is of utmost importance. Feature selection techniques from machine learning can help; however, they often provide naive or biased results. Results: An ensemble feature selection strategy for miRNA signatures is proposed. miRNAs are chosen based on consensus on feature relevance among high-accuracy classifiers of different typologies. This methodology aims to identify signatures that are considerably more robust and reliable when used in clinically relevant prediction tasks. Using the proposed method, a 100-miRNA signature is identified in a dataset of 8023 samples extracted from TCGA. When eight state-of-the-art classifiers are run with the 100-miRNA signature and with the original 1046 features, global accuracy differs by only 1.4%. Importantly, this 100-miRNA signature is sufficient to distinguish between tumor and normal tissues. The approach is then compared against other feature selection methods, such as UFS, RFE, EN, LASSO, Genetic Algorithms, and EFS-CLA. The proposed approach provides better accuracy when tested in 10-fold cross-validation with different classifiers. It is also applied to several GEO datasets across different platforms, with some classifiers showing more than 90% classification accuracy, which demonstrates its cross-platform applicability. Conclusions: The 100-miRNA signature is sufficiently stable to provide almost the same classification accuracy as the complete TCGA dataset, and it is further validated on several GEO datasets, across different types of cancer and platforms. 
Furthermore, a bibliographic analysis confirms that 77 out of the 100 miRNAs in the signature appear in lists of circulating miRNAs used in cancer studies, in stem-loop or mature-sequence form. The remaining 23 miRNAs offer potentially promising avenues for future research.
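The consensus idea described in this abstract can be illustrated with a minimal sketch. The abstract does not state the exact voting rule, so the one used here is an assumption: each classifier contributes its top-k features by relevance, and a feature enters the signature if at least `min_votes` classifiers selected it. The function name `consensus_signature`, the model names, the feature names, and all scores are hypothetical toy values, not data from the study.

```python
from collections import Counter


def consensus_signature(importances_by_model, top_k, min_votes):
    """Select features that appear in the top_k of at least min_votes models.

    importances_by_model maps a model name to a dict of
    {feature name: relevance score}. This voting rule is an assumed
    stand-in for the consensus step, not the authors' exact procedure.
    """
    votes = Counter()
    for scores in importances_by_model.values():
        # Rank by descending relevance; break ties by name for determinism.
        ranked = sorted(scores, key=lambda f: (-scores[f], f))
        votes.update(ranked[:top_k])
    return sorted(f for f, n in votes.items() if n >= min_votes)


# Hypothetical relevance scores from three classifiers of different types.
importances = {
    "random_forest": {"mirna_1": 0.40, "mirna_2": 0.30, "mirna_3": 0.20, "mirna_4": 0.10},
    "logistic_regression": {"mirna_1": 0.35, "mirna_3": 0.33, "mirna_4": 0.22, "mirna_2": 0.10},
    "linear_svm": {"mirna_2": 0.45, "mirna_1": 0.30, "mirna_4": 0.15, "mirna_3": 0.10},
}

signature = consensus_signature(importances, top_k=2, min_votes=2)
print(signature)  # ['mirna_1', 'mirna_2']
```

Raising `min_votes` makes the signature stricter (only features every model agrees on survive), while raising `top_k` lets each model nominate more candidates.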
The G-protein coupled estrogen receptor 1 (GPER/GPR30) is a seven-helix transmembrane (7TM) receptor involved in the growth and proliferation of breast cancer. Because no crystal structure of GPER/GPR30 is available, molecular modeling studies were carried out in this work to build a three-dimensional structure, which was subsequently refined by molecular dynamics (MD) simulations (up to 120 ns). Furthermore, we explored GPER/GPR30's molecular recognition properties by using reported agonist ligands (G1, estradiol (E2), tamoxifen, and fulvestrant) and antagonist ligands (G15 and G36) in subsequent docking studies. Our results identified the E2 binding site on GPER/GPR30, as well as other receptor cavities that accept large-volume ligands through π-π, hydrophobic, and hydrogen bond interactions. Snapshots of the MD trajectory at 14 and 70 ns showed almost identical binding motifs for G1 and G15. It was also observed that C107 interacts with the acetyl oxygen of G1 (at 14 ns) and that at 70 ns the residue E275 interacts with the acetyl group and with the oxygen from the other agonist, whereas the isopropyl group of G36 is oriented toward Met141, suggesting that both C107 and E275 could be involved in protein activation. This contribution suggests that GPER1 undergoes large structural changes, which explain its great capacity to accept diverse ligands, and that the same ligand can be recognized in different binding poses depending on the structural conformation of GPER.
Docking studies include many conformations to predict binding free energies (scoring functions) and to search (scoring sampling) for the most representative binding conformations. Therefore, several biological properties, from side chain residues to complete protein motions, have been included in docking studies to improve theoretical predictions.
Molecular dynamics (MD) simulation is a computational method that employs Newton's laws to evaluate the motions of water, ions, small molecules, and macromolecules, or of more complex systems such as whole viruses, in order to reproduce the behavior of the biological environment, including water molecules and lipid membranes. Structural motions, such as those that depend on the temperature and on the solute/solvent, are very important for studying the recognition pattern of ligand-protein or protein-protein complexes. MD simulations are very useful in that sense because these motions can be modeled with this methodology. Furthermore, MD simulations for drug design provide insights into the structural cavities required to design novel structures with higher affinity for the target. Applying MD simulations to drug design can also help to refine the three-dimensional (3D) structure of targets, giving better sampling of the binding poses and more reliable affinity values than traditional docking procedures, because MD incorporates biological conditions that include structural motions. This work analyzes the concepts and applicability of MD simulations for drug design: because molecular structural motions are considered, MD helps to identify hot spots, to decipher structural details in the reported protein sites, and to eliminate sites that could be structural artifacts arising from the structural characterization conditions. Moreover, better free energy values for protein-ligand recognition can be obtained, and these can be validated by experimental procedures owing to the robustness of MD simulation methods.
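To make the statement that MD "employs Newton's laws" concrete, the following is a minimal sketch of one common way to integrate Newton's equations of motion numerically, the velocity Verlet scheme. It is applied here to a single particle on a one-dimensional harmonic spring, which is an assumed toy system and not a biomolecular force field. The names `velocity_verlet`, `spring_force`, and `total_energy`, and all parameter values, are illustrative.

```python
import math


def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate Newton's second law (F = m * a) with velocity Verlet.

    Returns the list of (position, velocity) pairs, including the start.
    """
    a = force(x) / mass
    trajectory = [(x, v)]
    for _ in range(n_steps):
        x = x + v * dt + 0.5 * a * dt * dt   # advance position
        a_new = force(x) / mass              # force at the new position
        v = v + 0.5 * (a + a_new) * dt       # advance velocity with the mean acceleration
        a = a_new
        trajectory.append((x, v))
    return trajectory


# Toy system (an assumption for illustration): unit mass on a unit spring.
k = 1.0
mass = 1.0


def spring_force(x):
    return -k * x


def total_energy(x, v):
    return 0.5 * mass * v * v + 0.5 * k * x * x


dt = 0.01
n_steps = 1000
traj = velocity_verlet(1.0, 0.0, spring_force, mass, dt, n_steps)

x_final, v_final = traj[-1]
# Analytic solution for x(0) = 1, v(0) = 0 is cos(omega * t).
x_exact = math.cos(math.sqrt(k / mass) * dt * n_steps)
energy_drift = abs(total_energy(x_final, v_final) - total_energy(*traj[0]))
print(x_final, x_exact, energy_drift)
```

In this sketch the numerical position stays close to the analytic one and the total energy stays nearly constant. Production MD engines apply the same kind of update to every atom, with forces derived from a molecular force field, a thermostat, and a solvent model.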