“…As for molecular property prediction, the SPOC models gave results comparable to those of literature models on these datasets. [38] Taking the FreeSolv dataset as an example (Figure 6b), SPOC performs significantly better (test RMSE 1.03 kcal/mol) than the Extended-Connectivity Fingerprint (ECFP) based models (Kernel Ridge Regression (KRR), RF, GP, [39] XGBoost [40]) and better than most of the graph-based models (Directed Acyclic Graph model (DAG), [41] Graph Convolutional model (GC), Information Maximizing Graph Neural Networks (EIGNN), [42] Weave, Message Passing Neural Network (MPNN) [43] and baseline Chemception models [44]). It is inferior only to specifically designed graph networks such as EAGNN [42] and Attentive FP, [45] and to SMILES-X, [46] which uses SMILES directly as model input. It should be noted that the FreeSolv dataset comprises both experimental values and values calculated by molecular dynamics simulation, and the RMSE of the simulation-based method is ~1.5 kcal/mol.…”
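
For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how a test RMSE (in kcal/mol) is computed from measured and predicted values. The function name `rmse` and the sample numbers are hypothetical illustrations only; they are not taken from the FreeSolv dataset or from the SPOC models.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean of the squared residuals, then the square root.
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical hydration free energies in kcal/mol (illustrative only).
measured = [-3.0, -1.0, 0.5, -6.0]
predicted = [-2.0, -1.0, 1.5, -7.0]

print(f"RMSE = {rmse(measured, predicted):.2f} kcal/mol")
```

Because RMSE carries the units of the target, the model's 1.03 kcal/mol can be compared directly with the ~1.5 kcal/mol quoted for the simulation-based method.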