The degree of π-orbital overlap (DPO) model has been demonstrated
to be an excellent quantitative structure–property relationship
(QSPR) that can map two-dimensional structural information of polycyclic
aromatic hydrocarbons (PAHs) and thienoacenes to their electronic
properties, namely, band gaps, electron affinities, and ionization
potentials. However, its applications are limited by inefficient
manual procedures for parameter optimization and descriptor
formulation. In this work,
we developed a machine learning (ML)-based method for efficiently
optimizing DPO parameters and proposed a truncated DPO descriptor,
which is simple enough to be extracted automatically from simplified
molecular-input line-entry system (SMILES) strings of PAHs and thienoacenes.
Compared with our previous studies, the ML-based methodology
optimizes DPO parameters with one-quarter of the data while
achieving the same level of accuracy, predicting the electronic
properties listed above to within 0.1 eV. The truncated DPO model
achieves accuracy similar to that of the full DPO model. Consequently, the ML-based
DPO approach coupled with the truncated DPO model opens new possibilities
for developing automated pipelines for high-throughput screening and
for investigating new QSPRs for new chemical classes.