2020
DOI: 10.1093/bioinformatics/btaa544

DeepCDA: deep cross-domain compound–protein affinity prediction through LSTM and convolutional neural networks

Abstract: Motivation: An essential part of drug discovery is the accurate prediction of the binding affinity of new compound-protein pairs. Most of the standard computational methods assume that compounds or proteins of the test data are observed during the training phase. However, in real-world situations, the test and training data are sampled from different domains with different distributions. To cope with this challenge, we propose a deep learning-based approach that consists of three steps. In the…

Cited by 145 publications (106 citation statements)
References 34 publications
“…(i) Convolutional layer: The convolutional layer is a major building block of CNN, which contains a set of learnable filters where each filter is convolved with the input of the layer to encode the local knowledge of the small receptive field. This process helps conserve the dimensional relationship between numeric values in the vectors [33]. Thus, a 1D convolutional layer was used to construct a convolution kernel and then derive features encoded in the embedding layer [34].…”
Section: Methods (mentioning)
confidence: 99%
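The filter-over-embedding idea described in this statement can be illustrated with a minimal sketch. It assumes a toy two-dimensional embedding table and a single hand-picked filter of width 2; the names `conv1d`, `embedding`, and `kernel` are hypothetical and are not taken from the citing paper or from DeepCDA.

```python
from typing import Dict, List


def conv1d(embedded: List[List[float]], kernel: List[List[float]],
           bias: float = 0.0) -> List[float]:
    """Slide one filter over a sequence of embedding vectors (no padding, stride 1).

    Each output value is the sum of element-wise products between the filter
    and the window of embedding vectors it currently covers, i.e. the local
    receptive field.
    """
    width = len(kernel)
    outputs = []
    for start in range(len(embedded) - width + 1):
        total = bias
        for offset in range(width):
            for value, weight in zip(embedded[start + offset], kernel[offset]):
                total += value * weight
        outputs.append(total)
    return outputs


# Hypothetical embedding table: each symbol maps to a 2-dimensional vector.
embedding: Dict[str, List[float]] = {
    "A": [1.0, 0.0],
    "C": [0.0, 1.0],
    "G": [1.0, 1.0],
}

sequence = "ACGA"
embedded = [embedding[symbol] for symbol in sequence]

# Hypothetical filter of width 2; in a trained network these weights are learned.
kernel = [[1.0, 0.0], [0.0, 1.0]]

features = conv1d(embedded, kernel)
print(features)
```

A sequence of length 4 and a filter of width 2 give 3 output positions. A real 1D convolutional layer applies many such filters in parallel and follows them with a non-linearity.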
“…The protein is commonly encoded through sequence-based models. DeepDTA [11], DeepConv-DTI [9], GraphDTA [10], Tsubaki et al. [5], MT-DTI [12] and TransformerCPI [3] apply 1D-CNN layers to encode protein sequences, while DeepAffinity [1] and DeepCDA [7] combine 1D-CNN layers with recurrent neural network (RNN) or long short-term memory (LSTM) layers, respectively. The compound is encoded with sequence-based or graph-based models, depending on the input information.…”
Section: Some of the Most Readily Available Data Representations in CPI Datasets Are (mentioning)
confidence: 99%
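As a rough illustration of what feeding convolutional features into an LSTM involves, the sketch below runs a single-unit LSTM cell over a short list of scalar features. It is a simplification under stated assumptions: the hidden state is one number, the weights are hypothetical, and the input values stand in for the output of a 1D-CNN layer. It is not the architecture of DeepCDA or DeepAffinity.

```python
import math
from typing import Dict, List, Tuple

# Each gate has (input weight, recurrent weight, bias); all values are hypothetical.
GateWeights = Dict[str, Tuple[float, float, float]]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x: float, h: float, c: float, w: GateWeights) -> Tuple[float, float]:
    """One step of a single-unit LSTM cell with scalar input and state."""
    i = sigmoid(w["i"][0] * x + w["i"][1] * h + w["i"][2])    # input gate
    f = sigmoid(w["f"][0] * x + w["f"][1] * h + w["f"][2])    # forget gate
    o = sigmoid(w["o"][0] * x + w["o"][1] * h + w["o"][2])    # output gate
    g = math.tanh(w["g"][0] * x + w["g"][1] * h + w["g"][2])  # candidate value
    c_new = f * c + i * g          # update the cell state
    h_new = o * math.tanh(c_new)   # expose a gated view of the cell state
    return h_new, c_new


def encode_sequence(features: List[float], w: GateWeights) -> float:
    """Run the cell over the features and return the final hidden state."""
    h, c = 0.0, 0.0
    for x in features:
        h, c = lstm_step(x, h, c, w)
    return h


# Stand-in for 1D-CNN output at three sequence positions (assumed values).
cnn_features = [2.0, 1.0, 1.0]

weights: GateWeights = {
    "i": (0.5, 0.1, 0.0),
    "f": (0.3, 0.2, 1.0),
    "o": (0.4, 0.1, 0.0),
    "g": (0.6, 0.2, 0.0),
}

protein_code = encode_sequence(cnn_features, weights)
print(protein_code)
```

The convolution summarizes local windows of the sequence, and the recurrent cell then carries information across positions, so the final hidden state depends on the order of the features as well as their values.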
“…Given a compound-protein pair, CPI prediction methods aim to predict a binary value indicating whether the compound and the protein interact [3][4][5][6], a numeric value indicating their binding affinity [1,[7][8][9][10][11][12], or identify binding sites for a specific compound within the protein [13][14][15][16]. Existing CPI prediction methods are diverse in terms of feature engineering and machine learning models.…”
Section: Introduction (mentioning)
confidence: 99%
“…Overall, until recently, three types of machine learning methods, including supervised, semi-supervised, and unsupervised have been applied to the scope of drug discovery processes [15]. It has been also shown that some modified and improved versions of the present approaches such as deep neural networks could yield better predictive models [16], and overfitting and insufficient amounts of data have been the main challenges in front of the researchers who have been engaged in generating an appropriate predictive model [17][18][19]. ii) The theory-based researches: normally, researchers formulate the relationship among the biological entities based on the numerical and experimental experiences [20,21].…”
Section: Introduction (mentioning)
confidence: 99%