2020
DOI: 10.1186/s13321-020-00453-4

Atomic ring invariant and Modified CANON extended connectivity algorithm for symmetry perception in molecular graphs and rigorous canonicalization of SMILES

Abstract: We propose a new invariant (the product of the corresponding primes for the ring size of each bond of an atom) as a simple, unambiguous ring invariant of an atom that allows distinguishing symmetry classes in highly symmetrical molecular graphs using traditional local and distance atom invariants. We also propose modifications of Weininger's CANON algorithm to avoid its ambiguities (swapping and leveling ranks, incorrect determination of symmetry classes in non-aromatic annulenes, arbitrary selection of atom…
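The ring invariant described in the abstract can be illustrated with a minimal sketch. The abstract is truncated and does not say which prime corresponds to which ring size, so several choices here are assumptions rather than the authors' definitions: the "ring size of a bond" is taken to be the size of the smallest ring containing that bond, ring size n is mapped to the n-th prime, and acyclic bonds contribute a factor of 1. The function names (`nth_prime`, `smallest_ring_size`, `ring_invariants`) are hypothetical.

```python
from collections import deque


def nth_prime(n):
    """Return the n-th prime (1-indexed): 2, 3, 5, 7, 11, 13, ..."""
    count, candidate = 0, 1
    while count < n:
        candidate += 1
        # Trial division is enough for the small ring sizes seen in molecules.
        if all(candidate % d for d in range(2, int(candidate ** 0.5) + 1)):
            count += 1
    return candidate


def smallest_ring_size(adj, u, v):
    """Size of the smallest ring containing bond u-v, or 0 if the bond is acyclic.

    Assumption: "ring size of a bond" means the smallest ring through it.
    Found by a breadth-first search from u to v that may not use the bond itself.
    """
    dist = {u: 0}
    queue = deque([u])
    while queue:
        a = queue.popleft()
        for b in adj[a]:
            if a == u and b == v:
                continue  # skip the direct bond u-v
            if b not in dist:
                dist[b] = dist[a] + 1
                if b == v:
                    return dist[b] + 1  # path length plus the closing bond
                queue.append(b)
    return 0


def ring_invariants(adj):
    """Per-atom product of primes over the ring sizes of the atom's bonds."""
    result = {}
    for atom, neighbours in adj.items():
        product = 1
        for nb in neighbours:
            size = smallest_ring_size(adj, atom, nb)
            if size:  # assumption: acyclic bonds contribute a factor of 1
                product *= nth_prime(size)
        result[atom] = product
    return result


# Two fused three-membered rings sharing the bond 0-1 (a bicyclobutane skeleton).
fused = {0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1], 3: [0, 1]}
print(ring_invariants(fused))  # {0: 125, 1: 125, 2: 25, 3: 25}
```

Under these assumptions the two bridgehead atoms (three ring bonds each) receive 5 × 5 × 5 = 125, while the other two atoms receive 5 × 5 = 25, so the invariant separates the two classes from ring membership alone.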

Citations: cited by 6 publications (5 citation statements)
References: 26 publications
“…Many popular canonicalization procedures in chemistry are variants of the classical Morgan algorithm [21], which however uses the bond type of edges as an initial atom invariant and therefore is not domain-independent. Furthermore, non-equivalent atoms can still be assigned identical extended connectivity values, in particular in highly symmetrical molecules [23, 36], a problem that is particularly relevant to inorganic cluster chemistry. In contrast, TUCAN only relies on the atomic number as a chemistry-specific invariant.…”
Section: Results (mentioning)
confidence: 99%
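The extended connectivity values mentioned in this statement can be sketched as follows. This is a simplified textbook form of the Morgan iteration, not the procedure of any of the cited papers, and it assumes vertex degree as the starting invariant: each atom's value is repeatedly replaced by the sum of its neighbours' values until the number of distinct values stops increasing. The function name `extended_connectivity` is hypothetical.

```python
def extended_connectivity(adj):
    """Morgan-style extended connectivity values for a graph given as an adjacency dict.

    Assumption: the initial invariant is the vertex degree. Iteration stops as
    soon as a round fails to increase the number of distinct values, and the
    values from the last successful round are returned.
    """
    ec = {atom: len(neighbours) for atom, neighbours in adj.items()}
    n_classes = len(set(ec.values()))
    while True:
        new_ec = {atom: sum(ec[nb] for nb in adj[atom]) for atom in adj}
        new_n_classes = len(set(new_ec.values()))
        if new_n_classes <= n_classes:
            return ec
        ec, n_classes = new_ec, new_n_classes


# A five-atom chain: the end atoms, their neighbours, and the centre are separated.
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(extended_connectivity(chain))  # {0: 2, 1: 3, 2: 4, 3: 3, 4: 2}

# A regular graph (every atom has degree 2): all values stay identical.
ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
print(extended_connectivity(ring))  # {0: 2, 1: 2, 2: 2, 3: 2, 4: 2, 5: 2}
```

In any regular graph every atom starts with the same value and the neighbour sums stay equal in every round, so this iteration cannot split atoms into classes. That is the kind of limitation in highly symmetrical structures the statement refers to, and it motivates adding further invariants.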
“…The generation of a unique SMILES representation requires the specification of a canonical atom ordering [20, 21]. Over the past decades, various algorithms have been developed to achieve such a canonicalization [20, 21, 29–32]. While the matrix canonicity criterion of enu already leads to a canonical ordering of the atoms, this order is not necessarily suitable to create elegant (i.e., easily readable) SMILES strings.…”
Section: Methods (mentioning)
confidence: 99%
“…The conversion of the 3D molecular structure of Arglecin into its 2D representation and subsequent SMILES code can be described mathematically using the concept of graph theory [32]. The 3D molecular structure of Arglecin can be represented as a graph, wherein atoms are represented by nodes and chemical bonds between atoms are represented by edges.…”
Section: Sample Smiles String Representation (mentioning)
confidence: 99%