2022
DOI: 10.1021/acs.jcim.2c00744

HyFactor: A Novel Open-Source, Graph-Based Architecture for Chemical Structure Generation

Abstract: Graph-based architectures are becoming increasingly popular as a tool for structure generation. Here, we introduce HyFactor, a novel open-source architecture in which, as in the InChI linear notation, the number of hydrogens attached to each heavy atom is considered instead of the bond types. HyFactor was benchmarked on the ZINC 250K, MOSES, and ChEMBL data sets against a conventional graph-based architecture, ReFactor, which is our implementation of the DEFactor architecture reported in the literature. On…

Cited by 2 publications (2 citation statements)
References 31 publications
“…The number of heavy atoms in molecules was no more than 50. The canonical order of atoms in molecules was obtained from a breadth-first search (BFS) algorithm, similar as it was done by Mercado et al 14 The curated ChEMBL data set was then split into a training set (80% of data or 1.3M molecules) and a test set (20% of data or 327K molecules) as it was done for the training of HyFactor architecture 8 .…”
Section: Data (mentioning)
confidence: 99%
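
The data preparation described in this statement (breadth-first atom ordering followed by an 80/20 split) can be sketched roughly as below. This is only an illustration, not the citing authors' code: the function names `bfs_atom_order` and `split_train_test`, the adjacency-list input, the sorted-neighbour tie-break, and the seeded shuffle are all assumptions, since the statement does not give these details.

```python
from collections import deque
import random


def bfs_atom_order(adjacency, start=0):
    """Return atom indices in breadth-first order, beginning at `start`.

    `adjacency` maps each heavy-atom index to a list of neighbour indices.
    Visiting neighbours in sorted index order is an assumption made here to
    keep the result deterministic; the quoted text gives no tie-break rule.
    """
    order = []
    seen = {start}
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        order.append(atom)
        for neighbour in sorted(adjacency[atom]):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return order


def split_train_test(items, train_fraction=0.8, seed=0):
    """Shuffle `items` with a fixed seed and split them into two lists.

    The 80/20 ratio comes from the quoted text; the shuffle and the seed
    are assumptions for this sketch.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Heavy-atom graph of isobutane: central carbon 1 bonded to carbons 0, 2, 3.
isobutane = {0: [1], 1: [0, 2, 3], 2: [1], 3: [1]}
order_from_0 = bfs_atom_order(isobutane, start=0)
order_from_2 = bfs_atom_order(isobutane, start=2)

# Ten placeholder molecule identifiers, split 80/20.
train, test = split_train_test(range(10))
```

The starting atom changes the resulting order, so a real pipeline would also need a rule for choosing it; that rule is not stated in the quoted text.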
“…The input graph is a hydrogen-count labelled graph (HLG) 8 where each atom is encoded in a vector with the following properties: element number, period, group, number of electrons on the last subshell + atom's charge, number of last shells, the total number of hydrogens, whether or not the atom is in a ring, number of neighbours and counts of single, double, triple and aromatic bonds near current atom. Each property is normalised with a natural logarithm for numerical stability.…”
Section: Encoding (mentioning)
confidence: 99%
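
A minimal sketch of the per-atom vector described in this statement follows. The field names, the `AtomProperties` container, and the `encode_atom` function are hypothetical. The statement says only that each property is normalised with a natural logarithm; using `log(1 + x)` so that zero-valued properties stay finite is an assumption of this sketch, not something the source confirms.

```python
import math
from dataclasses import dataclass, astuple


@dataclass
class AtomProperties:
    """Raw per-atom properties, in the order listed in the quoted text."""
    element_number: int
    period: int
    group: int
    subshell_electrons_plus_charge: int  # electrons on last subshell + charge
    last_shells: int                     # "number of last shells" (as quoted)
    total_hydrogens: int
    in_ring: bool
    neighbours: int
    single_bonds: int
    double_bonds: int
    triple_bonds: int
    aromatic_bonds: int


def encode_atom(props):
    """Return the log-normalised feature vector for one atom.

    Assumption: log1p (natural log of 1 + x) is used so that properties
    equal to zero map to 0.0 instead of negative infinity.
    """
    return [math.log1p(float(value)) for value in astuple(props)]


# Illustrative values for a methyl carbon in ethane; the numbers are
# placeholders for this sketch, not data taken from the cited work.
methyl_carbon = AtomProperties(
    element_number=6, period=2, group=14,
    subshell_electrons_plus_charge=2, last_shells=2,
    total_hydrogens=3, in_ring=False, neighbours=1,
    single_bonds=1, double_bonds=0, triple_bonds=0, aromatic_bonds=0,
)
vector = encode_atom(methyl_carbon)
```

The vector has one entry per listed property (12 in this sketch), with the ring flag encoded as 0 or 1 before the logarithm is applied.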