2018
DOI: 10.1039/c8en00061a

Curation of datasets, assessment of their quality and completeness, and nanoSAR classification model development for metallic nanoparticles

Abstract: Workflow for curation of datasets, assessment of their quality and completeness, and nanoSAR model development.

Cited by 33 publications (47 citation statements)
References 42 publications
“…In this study, the experimental protocols were not taken under consideration. Future studies may integrate weighting scoring rules for each published article based on the measurement methods to evaluate the quality of the p-chem data as displayed in [21].…”
Section: Discussion (mentioning)
confidence: 99%
“…Surface area appeared as either important or not, probably largely affected by low data completeness. NP type is found to be both highly and lowly influencing; Trinh, Ha, Choi, Byun and Yoon [21] had only one or two types of metal NP in their dataset, and this could explain why this variable provided little information in building the models. Particle size appeared as significant across all studies, highlighting the importance of size in the manifestation of toxicological effects.…”
Section: Ref (mentioning)
confidence: 99%
“…Furthermore, 8% of the studies mention that their dataset had equal outcome classes, while, on the other hand, 4% of studies tackled the imbalance issue by resampling the training dataset. Resampling can be done by applying the Synthetic Minority Oversampling Technique (SMOTE), which is a supervised instance algorithm that oversamples the minority instances using the k-nearest-neighbor (kNN) [60,67,77]. This method balances the dataset by generating more data points.…”
Section: Class Balancing (mentioning)
confidence: 99%
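
The resampling idea quoted above can be illustrated with a minimal sketch. This is not the implementation used in the cited studies: it is a simplified, standard-library-only version of the SMOTE interpolation step, and the function name `smote_sketch`, its parameters, and the toy data are all hypothetical. Each synthetic point is placed on the line segment between a minority-class sample and one of its k nearest minority-class neighbours.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority-class neighbours.

    Simplified illustration only; name and signature are hypothetical.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other points
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base point.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        u = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + u * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: four minority-class points with two features each.
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote_sketch(minority, n_new=6, k=2)
print(len(new_points))
```

Because every synthetic point is a convex combination of two existing minority samples, the generated points stay inside the range already spanned by the minority class. In practice, resampling of this kind is applied to the training split only, as the quoted statement indicates.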
“…Class imbalance reflects an unequal distribution of class values within a dataset and poses a challenging problem because classifiers exhibit biases of the results. This has been rarely accounted for properly during training [60,67,72,74]. The most common technique used was SMOTE.…”
Section: The Framework (mentioning)
confidence: 99%
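
As a rough illustration of what "unequal distribution of class values" means in practice, the sketch below computes a simple imbalance ratio (majority count divided by minority count) for a list of class labels. The helper name `imbalance_ratio` and the example labels are hypothetical and are not taken from the cited work.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Return (majority count / minority count, per-class counts).

    Hypothetical helper for illustration; a ratio of 1.0 means balanced classes.
    """
    counts = Counter(labels)
    if len(counts) < 2:
        raise ValueError("need at least two classes")
    return max(counts.values()) / min(counts.values()), dict(counts)


# Hypothetical labels: 8 "non-toxic" samples versus 2 "toxic" samples.
labels = ["non-toxic"] * 8 + ["toxic"] * 2
ratio, counts = imbalance_ratio(labels)
print(ratio, counts)
```

A ratio well above 1 signals that a classifier could score high accuracy by favouring the majority class, which is the bias the quoted statement refers to.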