J. Brown scite author profile

Molecular modeling frequently constructs classification models for the prediction of two‐class entities, such as compound bio(in)activity, chemical property (non)existence, protein (non)interaction, and so forth. The models are evaluated using well-known metrics such as accuracy or true positive rates. However, these frequently used metrics, when applied to retrospective and/or artificially generated prediction datasets, can overestimate true performance in actual prospective experiments. Here, we systematically consider metric value surface generation as a consequence of data balance, and propose the computation of an inverse cumulative distribution function taken over a metric surface. The proposed distribution analysis can aid in the selection of metrics when formulating study design. In addition to theoretical analyses, a practical example in chemogenomic virtual screening highlights the care required in metric selection and interpretation.
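The idea of a metric surface and an inverse cumulative distribution taken over it can be illustrated with a minimal Python sketch. The abstract does not give the paper's exact definitions, so everything here is an assumption made for illustration: the surface is taken to be the metric value at every possible (TP, TN) outcome for a fixed number of positives and negatives, and the inverse cumulative distribution is taken to be the fraction of that surface at or above a threshold. The function names (`metric_surface`, `inverse_cdf`) are hypothetical.

```python
from fractions import Fraction


def accuracy(tp, tn, n_pos, n_neg):
    """Accuracy for a confusion matrix with tp true positives and tn true negatives."""
    return Fraction(tp + tn, n_pos + n_neg)


def true_positive_rate(tp, tn, n_pos, n_neg):
    """True positive rate; tn and n_neg are unused but keep the signature uniform."""
    return Fraction(tp, n_pos)


def metric_surface(metric, n_pos, n_neg):
    """Assumed definition: metric value at every (TP, TN) outcome for a fixed class balance."""
    return [
        metric(tp, tn, n_pos, n_neg)
        for tp in range(n_pos + 1)
        for tn in range(n_neg + 1)
    ]


def inverse_cdf(surface, threshold):
    """Assumed definition: fraction of surface points whose value is >= threshold."""
    hits = sum(1 for value in surface if value >= threshold)
    return Fraction(hits, len(surface))


if __name__ == "__main__":
    threshold = Fraction(9, 10)
    # Compare a balanced dataset with a 10:90 imbalanced one of the same size.
    for n_pos, n_neg in [(50, 50), (10, 90)]:
        surface = metric_surface(accuracy, n_pos, n_neg)
        share = inverse_cdf(surface, threshold)
        print(f"positives={n_pos} negatives={n_neg} "
              f"share of surface with accuracy >= 0.9: {float(share):.4f}")
```

Under these assumptions, comparing the printed shares for different class balances gives a rough sense of how easily a given metric value can arise from the data balance alone, which is the kind of consideration the abstract suggests should inform metric selection.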

The prediction of aerial radiation patterns from near-field measurements

Brown

Jull

1961

Proc. IEE, B Electron. Commun. Eng. UK

Advancing drug discovery via GPU-based deep learning

Gawehn

Hiss

Brown

et al. 2018

Expert Opinion on Drug Discovery

ADMET Predictability at Boehringer Ingelheim: State‐of‐the‐Art, and Do Bigger Datasets or Algorithms Make a Difference?

Aleksić

Seeliger

Brown

2021

Molecular Informatics

Computational methods assisting drug discovery and development are routine in the pharmaceutical industry. Digital recording of ADMET assays has provided a rich source of data for development of predictive models. Despite the accumulation of data and the public availability of advanced modeling algorithms, the utility of prediction in ADMET research is not clear. Here, we present a critical evaluation of the relationships between data volume, modeling algorithm, chemical representation and grouping, and temporal aspect (time sequence of assays) using an in-house ADMET database. We find no large difference between prediction algorithms, nor any systemic and substantial gain from increasingly large datasets. Temporal-based data enlargement led to performance improvement in only a limited number of assays, and with fractional improvement at best. Assays that are well, intermediately, or poorly suited for ADMET predictions, and reasons for such behavior, are systematically identified, generating realistic expectations for areas in which computational models can be used to guide decision making in molecular design and development.

J. Brown

Artificial dielectrics having refractive indices less than unity

Classifiers and their Metrics Quantified

