Computerised chemical substructure searching began at Shell Research Sittingbourne during the 1960s with a file of chemical structures coded in string form using the IUPAC (Dyson) chemical notation. Work on the coding of structures in this notation had begun about 1961, when the definitive rules were first published, and whilst the file was being created it was also being searched for compounds containing similar chemical fragments by using once-off programs. For this to evolve into a fully fledged computer system, certain drawbacks in the notation had to be overcome and, based on a modified notation, a series of programs had been written by 1966 to enable registration, substructure searching, molecular formula checking and other activities to be carried out. After many years of use, it was decided that a move towards an interactive graphics system should be made, and several ideas were followed up. When, during these deliberations, Molecular Design Limited (MDL) announced their product MOLEX, it was examined and assessed to be a suitable replacement for the old system, provided that the structures and data already stored could be satisfactorily transferred. MOLEX, which is now known as MACCS (Molecular ACCess System), was subsequently purchased and has now been in use at Sittingbourne Research Centre for a year and a half.
This paper draws upon experience in the development of computer software which generates and analyzes systematic organic chemical nomenclatures and highlights some general points where these nomenclatures cause difficulties in computer processing. It is suggested that the tackling of these points would, through consequent simplification, tightening, and improved consistency of the rules, help scientists and students in their use of nomenclature both by manual means and by computer. This might gradually lead to the development of a new systematic nomenclature, formally defined, which could co-exist for some time with current nomenclatures, supported by software to provide interconversion between it and them.

INTRODUCTION

Only quite recently has progress been made, other than for indexing purposes, in computer recognition and translation of general systematic organic chemical nomenclature. That it has taken so long to be able to recognize and generate systematic nomenclatures and to interconvert them with structural representations results from the many problems encountered in analyzing and generating nomenclatures by computer. This has led us to consider whether it is time for nomenclature to respond to the needs of computer processing through simplification, consistency, and tightening of the rules. This would have the added benefit that scientists could more readily form, understand, and use systematic nomenclature both with and without computer systems.

The principal use of chemical nomenclature is to give a compound a label that can be spoken, written, and used in printed indexes and from which the structure can be perceived by scientists. While trivial nomenclature has the benefit of conciseness, only systematic nomenclature, which to a certain extent gives pronunciation and semantics to a structure, is of use for unambiguously labeling a structure with a name that can safely be communicated worldwide.
The compilation by Lees and Smith⁶ of the papers presented at a symposium held in 1981 to discuss the use and the problems of nomenclature, and to help overcome the confusion and misunderstandings that existed, remains a valuable record of many of the issues. Typical of these is that IUPAC systematic organic nomenclature⁷ has some similarities to a natural language⁸ in that it has evolved. It therefore includes some commonly used trivial names, contains redundant information, allows a structure to have more than one name, and requires expertise in applying the rules to determine the name for a given structure.

As mechanization was applied to chemical information, stricter linear representations, or notations, of structures were developed. Such notations represent structures in coded form using symbols which consist to a greater or lesser extent of the ordinary characters for the chemical elements accompanied by other alphanumeric and special characters. Examples include the Wiswesser line notation (WLN)⁹ and the Dyson-IUPAC notation.¹⁰ These chemical notations were readily adapted to computer use: registration and s...