Converting Double-Stranded DNA-Encoded Libraries (DELs) to Single-Stranded Libraries for More Versatile Selections

DNA-encoded library (DEL) is an efficient high-throughput screening technology platform in drug discovery and is also gaining momentum in academic research. Today, the majority of DELs are assembled and encoded with double-stranded DNA tags (dsDELs) and have been selected against numerous biological targets; however, dsDELs are not amenable to some of the recently developed selection methods, such as cross-linking-based selection against immobilized targets and live-cell-based selections, which require DELs encoded with single-stranded DNA (ssDELs). Herein, we present a simple method to convert dsDELs to ssDELs using exonuclease digestion without library redesign and resynthesis. We show that dsDELs can be efficiently converted to ssDELs and used for affinity-based selections either with purified proteins or on live cells.


Machine-Learning-Based Data Analysis Method for Cell-Based Selection of DNA-Encoded Libraries

Hou, Xie, Gui, et al. 2023. ACS Omega


DNA-encoded library (DEL) is a powerful ligand discovery technology that has been widely adopted in the pharmaceutical industry. DEL selections are typically performed with a purified protein target immobilized on a matrix or in solution phase. Recently, DELs have also been used to interrogate targets in complex biological environments, such as membrane proteins on live cells. However, due to the complex landscape of the cell surface, the selection inevitably involves significant nonspecific interactions, and the selection data are much noisier than those from selections with purified proteins, making reliable hit identification highly challenging. Researchers have developed several approaches to denoise DEL datasets, but it remains unclear whether they are suitable for cell-based DEL selections. Here, we report the proof-of-principle of a new machine-learning (ML)-based approach to process cell-based DEL selection datasets by using a Maximum A Posteriori (MAP) estimation loss function, a probabilistic framework that can account for and quantify uncertainties of noisy data. We applied the approach to a DEL selection dataset, where a library of 7,721,415 compounds was selected against purified carbonic anhydrase 2 (CA-2) and a cell line expressing the membrane protein carbonic anhydrase 12 (CA-12). The extended-connectivity fingerprint (ECFP)-based regression model using the MAP loss function was able to identify true binders and reliable structure-activity relationships (SAR) from the noisy cell-based selection datasets. In addition, the regularized enrichment metric (known as MAP enrichment) could also be calculated directly without involving the specific machine-learning model, effectively suppressing low-confidence outliers and enhancing the signal-to-noise ratio. Future applications of this method will focus on de novo ligand discovery from cell-based DEL selections.
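The abstract does not give the formula for MAP enrichment, so the following is only a minimal sketch of the general idea of a regularized enrichment metric, not the authors' method. It assumes sequencing counts are Poisson-distributed with a Gamma prior on each rate, takes the posterior mode (the MAP estimate) of the selection and control rates, and reports their ratio. The function names, the prior parameters `shape` and `rate`, and the example counts are all hypothetical.

```python
import math


def map_rate(count, exposure, shape=2.0, rate=1.0):
    """Posterior mode of a Poisson rate under a Gamma(shape, rate) prior.

    Assumption (not from the source): counts ~ Poisson(lambda * exposure),
    so the posterior is Gamma(shape + count, rate + exposure) and its mode
    is (shape + count - 1) / (rate + exposure).
    """
    if shape <= 1.0:
        raise ValueError("shape must be > 1 so the mode is strictly positive")
    return (shape + count - 1.0) / (rate + exposure)


def naive_enrichment(k_sel, n_sel, k_ctrl, n_ctrl):
    """Plain ratio of read frequencies; unbounded when the control count is 0."""
    if k_ctrl == 0:
        return math.inf
    return (k_sel / n_sel) / (k_ctrl / n_ctrl)


def regularized_enrichment(k_sel, n_sel, k_ctrl, n_ctrl, shape=2.0, rate=1.0):
    """Ratio of MAP rate estimates for the selection and control samples."""
    return map_rate(k_sel, n_sel, shape, rate) / map_rate(k_ctrl, n_ctrl, shape, rate)


if __name__ == "__main__":
    depth = 1_000_000  # hypothetical total reads per sample
    # (selection count, control count) for two hypothetical compounds
    examples = {"low-count outlier": (3, 0), "well-sampled binder": (300, 100)}
    for name, (k_sel, k_ctrl) in examples.items():
        naive = naive_enrichment(k_sel, depth, k_ctrl, depth)
        weak = regularized_enrichment(k_sel, depth, k_ctrl, depth)
        strong = regularized_enrichment(k_sel, depth, k_ctrl, depth, shape=11.0)
        print(f"{name}: naive={naive:.2f} weak_prior={weak:.2f} strong_prior={strong:.2f}")
```

In this sketch the prior acts like a pseudo-count: a compound with only a handful of reads is pulled toward an enrichment of 1, while a well-sampled compound is barely affected, and a stronger prior (larger `shape`) shrinks low-count outliers more aggressively.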


A Machine-learning-based Data Analysis Method for Cell-based Selection of DNA-encoded libraries (DELs)

Hou, Xie, Gui, et al. 2022. Preprint


DNA-encoded library (DEL) is a powerful ligand discovery technology that has been widely adopted in the pharmaceutical industry. DEL selections are typically performed with a purified protein target immobilized on a matrix or in solution phase. Recently, DELs have also been used to interrogate targets in complex biological environments, such as membrane proteins on live cells. However, due to the complex landscape of the cell surface, the selection inevitably involves significant non-specific interactions, and the selection data are much noisier than those from selections with purified proteins, making reliable hit identification highly challenging. Researchers have developed several approaches to denoise DEL datasets, but it remains unclear whether they are suitable for cell-based DEL selections. Here, we propose a new machine-learning (ML)-based approach to process cell-based DEL selection datasets by using a Maximum A Posteriori (MAP) estimation loss function, a probabilistic framework that can account for and quantify uncertainties of noisy data. We applied the approach to a DEL selection dataset, where a library of 7,721,415 compounds was selected against purified carbonic anhydrase 2 (CA-2) and a cell line expressing the membrane protein carbonic anhydrase 12 (CA-12). The Extended-Connectivity Fingerprint (ECFP)-based regression model using the MAP loss function was able to identify true binders and reliable structure-activity relationships (SAR) from the noisy cell-based selection datasets. In addition, the regularized enrichment metric (known as MAP enrichment) could also be calculated directly without involving the specific machine learning model, effectively suppressing low-confidence outliers and enhancing the signal-to-noise ratio.



Yuhan Gui

Converting Double-Stranded DNA-Encoded Libraries (DELs) to Single-Stranded Libraries for More Versatile Selections
