An enhanced version of an algorithm is discussed that encodes a description of the chemical environment of carbon atoms in a manner that correlates with carbon-13 nuclear magnetic resonance (13C NMR) chemical shifts. The encoding algorithm uses a vector-based approach in which the first dimension of the vector represents the chemical shift of the carbon atom, the second dimension represents the collective influence on its chemical shift of the atoms one bond away from the carbon, and each successive dimension represents the influence of the atoms one bond further away. This encoding algorithm is a key component of a 13C NMR spectrum simulation procedure in which each of the carbons in a large database of known structures and spectra is represented as a vector. Database search methods based on vector comparisons are used to find the closest matching chemical environments, and their associated chemical shifts, for each of the carbons in a structure input by a user. Enhancements to the original algorithm include an expansion of the number of atom classes treated, the addition of a scheme to treat aromatic systems as a special case, and the use of an expanded vector format to regain some of the information lost by collapsing the molecular structure to a vector representation. To test this algorithm, a database of structures and spectra was split into training and test sets consisting of 16 959 and 4240 structures, respectively. Experiments were performed to optimize several parameters associated with the encoding algorithm, and the retrieved (i.e., predicted) chemical shifts were then compared with the actual chemical shifts for the structures in the test set. For the optimal parameter settings found, the median of the mean absolute deviations in chemical shifts for the structures in the test set was 1.30 ppm; this value was obtained with an expanded vector representation based on 15 dimensions.
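The retrieval and evaluation steps described above can be illustrated with a minimal Python sketch. It is not the published implementation: the Euclidean distance, the three-dimensional toy vectors, the shift values, and the names `distance`, `predict_shift`, `median_of_mad`, and `DATABASE` are all assumptions chosen for illustration. The sketch shows only the general idea of looking up the closest stored environment vector and of summarizing per-structure mean absolute deviations by their median.

```python
import math
from statistics import mean, median


def distance(u, v):
    """Euclidean distance between two equal-length environment vectors.

    The choice of metric is an assumption; the abstract does not name one.
    """
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def predict_shift(query, database, k=1):
    """Average the shifts (ppm) of the k stored vectors closest to `query`."""
    nearest = sorted(database, key=lambda entry: distance(query, entry[0]))[:k]
    return mean(shift for _, shift in nearest)


def median_of_mad(structures):
    """Median over structures of the mean absolute deviation per structure.

    `structures` is a list with one entry per structure; each entry is a
    list of (predicted, actual) shift pairs for that structure's carbons.
    """
    per_structure = [mean(abs(p - a) for p, a in pairs) for pairs in structures]
    return median(per_structure)


# Hypothetical toy database: (environment vector, chemical shift in ppm).
# The numbers are invented and do not come from the encoding algorithm.
DATABASE = [
    ((1.0, 0.5, 0.1), 14.0),
    ((1.0, 2.0, 0.3), 60.0),
    ((2.0, 1.5, 0.8), 128.0),
]
```

Under these assumptions, `predict_shift((1.0, 1.9, 0.3), DATABASE)` retrieves the shift stored with the second entry, because that vector lies closest to the query. Raising `k` averages over several neighbours instead of taking the single best match.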