Virtual screening (VS) based on molecular docking is an efficient
method for retrieving novel hit compounds in drug discovery.
However, the accuracy of current docking scoring functions (SFs)
is usually insufficient. In this study, to improve the screening
power of SFs, a novel approach named EAT-Score was proposed that directly
uses the energy auxiliary terms (EAT) provided by molecular docking
scoring as input to eXtreme Gradient Boosting (XGBoost). Here, EAT specifically
refers to the output of the Molecular Operating Environment (MOE)
scoring, including the energy scores of five different classical SFs
and the Protein–Ligand Interaction Fingerprint (PLIF) terms.
The ability of EAT-Score to discriminate actives from decoys was
rigorously validated on the DUD-E diverse subset using several
performance metrics. The results showed that EAT-Score performed much
better than classical SFs in VS, with its AUC values improving by
around 0.3. Meanwhile, EAT-Score achieved comparable or
even better prediction performance than other state-of-the-art
VS methods, such as some machine learning (ML)-based SFs and classical
SFs implemented in docking programs, in terms of AUC, LogAUC, or BEDROC.
Furthermore, Shapley additive explanations (SHAP) analysis showed that
the EAT-Score model can capture important binding pattern
information from protein–ligand complexes, which may be very helpful in interpreting
the ligand binding mechanism for a certain target and thereby guiding
drug design.
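
As an illustration of the AUC metric used above to compare scoring functions, the following minimal sketch computes ROC AUC for a ranked list of actives and decoys. It is not the authors' evaluation code: the function name `roc_auc`, the label convention (1 = active, 0 = decoy), and the toy scores are assumptions made for this example. It also assumes that a higher score means "more likely active", so a docking energy for which lower is better would need to be negated first.

```python
def roc_auc(labels, scores):
    """ROC AUC via the Mann-Whitney U statistic.

    labels: 1 for an active, 0 for a decoy (assumed convention).
    scores: higher means more likely active (assumed convention).
    Tied scores receive the average of the ranks they span.
    """
    pairs = sorted(zip(scores, labels))  # ascending by score
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the end of the current run of tied scores.
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    n_actives = sum(1 for _, label in pairs if label == 1)
    n_decoys = n - n_actives
    if n_actives == 0 or n_decoys == 0:
        raise ValueError("need at least one active and one decoy")

    # Sum of ranks held by actives, converted to the U statistic.
    active_rank_sum = sum(r for r, (_, label) in zip(ranks, pairs) if label == 1)
    u_statistic = active_rank_sum - n_actives * (n_actives + 1) / 2
    return u_statistic / (n_actives * n_decoys)


# Toy example: three actives and three decoys with made-up scores.
toy_labels = [1, 1, 0, 0, 0, 1]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.6]
toy_auc = roc_auc(toy_labels, toy_scores)
```

Here the AUC equals the fraction of (active, decoy) pairs in which the active is ranked above the decoy, so 0.5 corresponds to random ranking and 1.0 to perfect separation.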