2013
DOI: 10.1021/ci300348u
Coping with Unbalanced Class Data Sets in Oral Absorption Models

Abstract: Class imbalance occurs frequently in drug discovery datasets. In oral absorption datasets, in the literature, there are considerably more of highly-absorbed compounds compared with poorly-absorbed compounds. This produces models that are biased towards highly-absorbed compounds which lack generalization to industry settings where more early stage drug candidates are poorly-absorbed. This paper presents two strategies to cope with unbalanced class datasets: Under-sampling the majority high absorption class and …
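
The abstract is cut off before it describes the procedure, so the following is only a minimal sketch of generic random under-sampling, not the authors' method. The function name `undersample_majority`, the `label_of` accessor, and the toy compound list are all hypothetical and chosen for illustration.

```python
import random
from collections import Counter


def undersample_majority(records, label_of, seed=0):
    """Randomly drop records so every class matches the smallest class size.

    Generic random under-sampling; an illustrative assumption, not the
    procedure from the paper (its abstract is truncated above).
    """
    rng = random.Random(seed)

    # Group the records by class label.
    by_class = {}
    for rec in records:
        by_class.setdefault(label_of(rec), []).append(rec)

    # Every class is reduced to the size of the rarest class.
    target = min(len(group) for group in by_class.values())

    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, target))

    rng.shuffle(balanced)
    return balanced


# Hypothetical toy data: 8 highly-absorbed and 3 poorly-absorbed compounds.
compounds = [(f"cpd{i}", "high") for i in range(8)]
compounds += [(f"cpd{i}", "low") for i in range(8, 11)]

balanced = undersample_majority(compounds, label_of=lambda rec: rec[1])
counts = Counter(label for _, label in balanced)
print(counts)
```

With the toy data, both classes end up with three compounds each. The trade-off of this approach is that majority-class records are discarded, so the balanced training set is smaller than the original.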

Cited by 28 publications (37 citation statements)
References 69 publications (147 reference statements)
“…It is a well-established fact that compounds with higher logP have poor aqueous solubility and are more likely to pass through lipid bilayer of biological membranes (34). The general trend in the literature with regards to the role of lipophilicity in pharmacokinetic processes indicates that more lipophilic compounds have higher oral absorption, plasma protein binding, and volume of distribution (35)(36)(37) and are more prone to P450 metabolism (30,35). This may lead to the reduced chance of excretion through bile as the intact drug.…”
Section: Structural Features Of Compounds For Biliary Excretion (mentioning)
confidence: 99%
“…For instance, Chen et al 16 used the under-sampling method for toxicity modeling of Tetrahymena pyriformis. Sun et al applied the same method to the prediction of cytochrome P450 profiles of environmental chemicals. Newby et al 17 modeled imbalanced oral absorption data sets. Chen et al compared the over-sampling approach with under-sampling and showed that the under-sampling method performed more consistently.…”
Section: Introduction (mentioning)
confidence: 99%
“…The median Π values and the corresponding mean FA values have a similar trend as one moves from the negative region to the positive region. Interestingly, it is evident from Figure 5 and Table 1 (Newby, Freitas, & Ghafourian, 2013); it may be due to a reluctance of scientists, companies and journals to publish negative results.…”
Section: Discussion (mentioning)
confidence: 99%
“…There is limited information available on the pharmaceutical formulation factors like hydration and different polymorphs of the drug employed in the clinical trials. In the literature, there is a relatively low number of reported poorly absorbed compounds (Newby, Freitas, & Ghafourian, 2013); it may be due to a reluctance of scientists, companies and journals to publish negative results.…”
Section: Discussion (mentioning)
confidence: 99%