As the research and development (R&D) of new molecular entities (NMEs) becomes increasingly difficult, novel drug delivery systems (DDSs) are attracting widespread attention. This review surveyed the current distribution of Food and Drug Administration (FDA)-approved pharmaceutical products, evaluated the technical barriers to entry for generic drugs, and highlighted the successes and failures of advanced drug delivery systems. Based on the ratio of generic to new drugs and a four-quadrant classification scheme for evaluating the commercialization potential of DDSs, the results showed that traditional dosage forms (e.g., conventional tablets, capsules and injections) with lower technical barriers were easier to reproduce, while advanced drug delivery systems (e.g., inhalations and nanomedicines) with high technical barriers faced less competition and had greater market potential. Our study provides comprehensive insight into FDA-approved products and an in-depth analysis of the technical barriers for advanced drug delivery systems. In the future, the R&D of new molecular entities may be combined with advanced delivery technologies to turn drug candidates into more therapeutically effective formulations.
Machine learning-based scoring functions (MLSFs) have attracted extensive attention recently and are expected to serve as rescoring tools for structure-based virtual screening (SBVS). However, a major concern is whether MLSFs trained for generic use, rather than for a given target, are consistently applicable to virtual screening (VS). In this study, a systematic assessment was carried out to re-evaluate the effectiveness of 14 reported MLSFs in VS. Overall, most of these MLSFs could hardly achieve satisfactory results on any dataset, and they could not even outperform the baseline of classical SFs such as Glide SP. An exception was RFscore-VS trained on the Directory of Useful Decoys-Enhanced dataset, which was superior for most targets. However, in most cases it showed rather limited performance on targets that were dissimilar to the proteins in the corresponding training sets. We also used the top three docking poses rather than only the top one for rescoring and retrained the models with updated versions of the training set, but only minor improvements were observed. Taken together, generic MLSFs may generalize too poorly to be applicable to real VS campaigns. Therefore, this type of method should be used for VS with considerable caution.
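The top-three-pose rescoring mentioned above can be illustrated with a minimal Python sketch. It assumes that each ligand keeps the best rescored value among its retained poses, that docking ranks start at 1, and that a higher rescored value is better; the abstract states none of these details, and the function name and data are hypothetical.

```python
def best_score_per_ligand(poses, top_n=3):
    """Keep each ligand's best rescored value among its top_n docking poses.

    poses: iterable of (ligand_id, docking_rank, rescored_value) tuples.
    Assumptions of this sketch: docking_rank starts at 1 and a higher
    rescored_value means a better pose.
    """
    best = {}
    for ligand_id, rank, score in poses:
        if rank > top_n:
            continue  # pose falls outside the retained top_n poses
        if ligand_id not in best or score > best[ligand_id]:
            best[ligand_id] = score
    return best


# Hypothetical rescoring output for two ligands.
poses = [
    ("lig_a", 1, 5.1), ("lig_a", 2, 6.3), ("lig_a", 4, 9.0),
    ("lig_b", 1, 7.2), ("lig_b", 3, 6.8),
]
top1 = best_score_per_ligand(poses, top_n=1)
top3 = best_score_per_ligand(poses, top_n=3)
# Rank ligands for screening by their best retained score.
ranking = sorted(top3, key=top3.get, reverse=True)
```

In this toy data, widening the window from one pose to three changes the score assigned to lig_a but not the final ligand ordering.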
Accurately estimating protein–ligand binding affinity remains a key challenge in computer-aided drug design (CADD). In many cases, the binding affinities predicted by classical scoring functions (SFs) have been shown to correlate poorly with experimentally measured biological activities. In the past few years, machine learning (ML)-based SFs have gradually emerged as potential alternatives and have outperformed classical SFs in a series of studies. In this study, to better gauge the potential of classical SFs, we conducted a comparative assessment of 25 commonly used SFs. The scoring power was systematically estimated by replacing the original multiple linear regression method with state-of-the-art ML methods to refit the individual energy terms. The results show that the newly developed ML-based SFs consistently performed better than the classical ones. In particular, gradient boosting decision tree (GBDT) and random forest (RF) achieved the best predictions in most cases. The newly developed ML-based SFs were also tested on another benchmark modified from PDBbind v2007, and the impacts of structural and sequence similarities were evaluated. The results indicated that the superiority of the ML-based SFs could be fully guaranteed only when the training set contained sufficient similar targets. Moreover, the effect of combining features from multiple SFs was explored, and the results indicated that combining NNscore2.0 with one to four other classical SFs could yield the best scoring power. However, no generic target-specific SF or SF combination could be derived.
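As a rough illustration of how scoring power might be quantified, the sketch below computes Pearson's correlation coefficient between predicted and measured affinities. The abstract does not name its metric; Pearson's r is assumed here because it is the measure commonly used for scoring power in benchmarks of this kind, and the function name is hypothetical.

```python
import math


def pearson_r(predicted, measured):
    """Pearson correlation between predicted and measured affinities."""
    n = len(predicted)
    if n != len(measured) or n < 2:
        raise ValueError("need two equal-length sequences with >= 2 values")
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    # Sums of products of deviations from the means.
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    if var_p == 0 or var_m == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_p * var_m)
```

A value near 1 would indicate that the predicted affinities track the experimental ones closely, while a value near 0 would indicate no linear relationship.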
The microtubule-associated protein tau is critical for the development and maintenance of the nervous system. Tau dysfunction is associated with a variety of neurodegenerative diseases called tauopathies, which are characterized by neurofibrillary tangles formed by abnormally aggregated tau protein. Studying the aggregation mechanism of tau protein is of great significance for elucidating the etiology of tauopathies. The hexapeptide 306VQIVYK311 (PHF6) of R3 has been shown to play a vital role in promoting tau aggregation. In this study, long-timescale all-atom molecular dynamics simulations in explicit solvent were performed to investigate the mechanisms of spontaneous aggregation and template-induced misfolding of PHF6, and the dimerization at the early stage of nucleation was further analyzed with a Markov state model (MSM). Our results show that PHF6 can spontaneously aggregate to form multimers enriched in β-sheet structure, and that the β-sheets in these multimers preferentially adopt a parallel arrangement. A PHF6 monomer can be induced to form a β-sheet structure on either side of the template, but in different ways: on the left side the β-sheet forms more easily but does not extend well, whereas on the right side the monomer can form an extended β-sheet structure. Furthermore, MSM analysis shows that dimer formation occurs mainly in three steps. First, the separated monomers collide with each other at random orientations; then a dimer with a short β-sheet structure at the N-terminus forms; finally, the β-sheets elongate to form an extended parallel β-sheet dimer. During these processes, multiple intermediate states are identified, and multiple paths lead from the disordered coil structure to a parallel β-sheet dimer. Moreover, the residues I308, V309, and Y310 play an essential role in the dimerization. In summary, our results uncover the aggregation and misfolding mechanism of PHF6 at the atomic level, which can provide useful theoretical guidance for the rational design of effective therapeutic drugs against tauopathies.
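The core of a Markov state model is a transition matrix estimated from a trajectory that has been discretized into states. The sketch below uses a simple row-normalized count estimate at a fixed lag time; it is not the procedure used in the study, and the three state labels and the toy trajectory are hypothetical stand-ins for the three dimerization steps described above.

```python
def transition_matrix(trajectory, n_states, lag=1):
    """Row-normalized transition counts from a discrete state trajectory."""
    counts = [[0] * n_states for _ in range(n_states)]
    for i in range(len(trajectory) - lag):
        counts[trajectory[i]][trajectory[i + lag]] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # A state that is never left gets an all-zero row in this sketch.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Hypothetical state labels:
# 0 = separated monomers
# 1 = dimer with a short N-terminal beta-sheet
# 2 = extended parallel beta-sheet dimer
traj = [0, 0, 1, 1, 2, 2, 2, 1, 2, 2]
T = transition_matrix(traj, n_states=3)
```

Each entry T[i][j] estimates the probability of moving from state i to state j within one lag time. Production MSM workflows typically add steps omitted here, such as clustering of structural features into states and lag-time validation.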
Structure-based drug design depends on detailed knowledge of the three-dimensional (3D) structures of protein–ligand binding complexes, but accurate prediction of ligand-binding poses is still a major challenge for molecular docking because of the deficiencies of scoring functions (SFs) and the neglect of protein flexibility upon ligand binding. In this study, based on a cross-docking dataset purpose-built from the PDBbind database, we developed several XGBoost-trained classifiers to discriminate near-native binding poses from decoys, and systematically assessed their performance with and without the cross-docked poses in the training and test sets. The results show that using Extended Connectivity Interaction Features (ECIF), Vina energy terms and docking pose ranks as features achieves the best performance, as validated by random splitting or refined-core splitting and by testing on the re-docked or cross-docked poses. In addition, although performance decreases significantly in the threefold clustered cross-validation, the inclusion of the Vina energy terms effectively secures the lower limit of model performance and thus improves generalization capability. Furthermore, our results highlight the importance of incorporating cross-docked poses into the training of SFs that are intended to have a wide application domain and high robustness for binding pose prediction. The source code and the newly developed cross-docking datasets are freely available at https://github.com/sc8668/ml_pose_prediction and https://zenodo.org/record/5525936, respectively, under an open-source license. We believe that our study may provide valuable guidance for the development and assessment of new machine learning-based SFs (MLSFs) for the prediction of protein–ligand binding poses.
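The idea behind a threefold clustered cross-validation is that all members of a cluster land in the same fold, so a model is never tested on a complex whose close relatives it saw during training. The following is a minimal sketch of such a split; how the clusters were actually defined and balanced in the study is not stated in the abstract, so the greedy assignment, the function name, and the example identifiers are all assumptions.

```python
def clustered_folds(cluster_of, n_folds=3):
    """Assign whole clusters to folds so that no cluster spans two folds.

    cluster_of: dict mapping a complex identifier to its cluster label.
    Greedy heuristic (an assumption of this sketch): place the largest
    clusters first, each into the currently smallest fold.
    """
    members = {}
    for item, cluster in cluster_of.items():
        members.setdefault(cluster, []).append(item)
    folds = [[] for _ in range(n_folds)]
    ordered = sorted(members, key=lambda c: (-len(members[c]), str(c)))
    for cluster in ordered:
        smallest = min(range(n_folds), key=lambda i: len(folds[i]))
        folds[smallest].extend(sorted(members[cluster]))
    return folds


# Hypothetical complexes c1..c7 grouped into clusters A..D.
cluster_of = {
    "c1": "A", "c2": "A", "c3": "A",
    "c4": "B", "c5": "B",
    "c6": "C",
    "c7": "D",
}
folds = clustered_folds(cluster_of)
```

Each fold then serves once as the test set while the other two are used for training, which is a stricter test of generalization than a random split.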