The retention time ($\log t_R$) of pesticidal
compounds in a reverse-phase high-performance liquid chromatography
(HPLC) analysis has a direct relationship with lipophilicity, which
could be related to the ecotoxicity potential of the compounds. The
novel quantitative read-across structure–property relationship
(q-RASPR) modeling approach uses similarity-based descriptors for
predictive model generation. These models have been shown to enhance
external predictivity in previous studies for several end points.
The current study describes the development of a q-RASPR model using
experimental retention time data ($\log t_R$) from HPLC experiments on 823 environmentally significant pesticide
residues collected from a large compound database. To model the retention
time ($\log t_R$) end point, 0D–2D
descriptors have been used along with the read-across-derived similarity
descriptors. The developed partial least squares (PLS) model was rigorously
validated by various internal and external validation metrics as recommended
by the Organization for Economic Co-operation and Development (OECD).
The final q-RASPR model shows good fit, robustness, and external
predictivity ($n_{\text{train}} = 618$, $R^2 = 0.82$, $Q^2_{\text{LOO}} = 0.81$,
$n_{\text{test}} = 205$, and $Q^2_{F1} = 0.84$), and it outperforms the
external predictivity of the previously reported quantitative structure–property
relationship (QSPR) model. The modeled descriptors indicate that
lipophilicity is the most important chemical property, and it
correlates positively with the retention time ($\log t_R$). Several other
characteristics, such as the number of multiple bonds (nBM) and graph
density (GD), have a substantial inverse relationship with the retention
time end point. The software tools utilized in this study are user-friendly,
and most of them are free, which makes our methodology quite cost-effective
when compared to experimentation. In summary, q-RASPR is an efficient
technique that offers better external predictivity, interpretability,
and transferability, and it has the potential to be employed as
an alternative approach for retention time prediction and ecotoxicity
potential identification.
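
For readers unfamiliar with the validation statistics quoted above, the following is a minimal sketch of how R2, Q2(LOO), and Q2(F1) are commonly defined in the QSAR/QSPR literature. It is an illustration and not the software used in this study. The function names and the toy numbers are hypothetical. The sketch assumes that the leave-one-out predictions have already been produced by refitting the model once per training compound, and that Q2(LOO) is referenced to the full training-set mean (some definitions use the mean of each reduced training set instead).

```python
from statistics import mean


def r2(y_obs, y_fit):
    """Goodness of fit: 1 - SS_res / SS_tot on the training set."""
    y_bar = mean(y_obs)
    ss_res = sum((y - f) ** 2 for y, f in zip(y_obs, y_fit))
    ss_tot = sum((y - y_bar) ** 2 for y in y_obs)
    return 1.0 - ss_res / ss_tot


def q2_loo(y_obs, y_loo_pred):
    """Internal robustness: same form as r2, but y_loo_pred[i] must come
    from a model refitted WITHOUT training compound i (assumption: the
    caller supplies these leave-one-out predictions)."""
    y_bar = mean(y_obs)
    press = sum((y - p) ** 2 for y, p in zip(y_obs, y_loo_pred))
    ss_tot = sum((y - y_bar) ** 2 for y in y_obs)
    return 1.0 - press / ss_tot


def q2_f1(y_test, y_test_pred, y_train_mean):
    """External predictivity: test-set residuals are compared with the
    deviation of the test responses from the TRAINING-set mean."""
    press = sum((y - p) ** 2 for y, p in zip(y_test, y_test_pred))
    ss_ref = sum((y - y_train_mean) ** 2 for y in y_test)
    return 1.0 - press / ss_ref


# Toy values for illustration only; these are not data from the study.
y_train = [1.0, 2.0, 3.0, 4.0]
y_train_fit = [1.1, 1.9, 3.2, 3.8]
y_test = [2.0, 3.0]
y_test_pred = [2.1, 2.8]

fit_r2 = r2(y_train, y_train_fit)
ext_q2 = q2_f1(y_test, y_test_pred, mean(y_train))
```

Because Q2(F1) uses the training-set mean as its reference, a model that predicts the test set no better than that mean scores 0, and a perfect prediction scores 1.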