ChemEngine: harvesting 3D chemical structures of supplementary data from PDF files

Karthikeyan, Muthukumarasamy; Vyas, Renu

doi:10.1186/s13321-016-0175-x

Cited by 3 publications

(2 citation statements)

References 25 publications

(23 reference statements)


“…Tools like ChemEngine have been implemented to automatically extract 3D molecular XYZ coordinates and atom information from articles with the aim to directly generate computable molecular structures. 434 This system used pattern recognition and regular expressions to detect molecular coordinates and distinguish it from surrounding nonmolecular free text. After generating the atom coordinate matrix data from the previously detected molecular coordinates, tools like ChemEngine build molecules using the bond matrix and the atom connectivity.…”

Section: Linking Documents To Structures (mentioning)

confidence: 99%

“…Authors can also present chemical structural information in documents, especially in case of supporting/Supporting Information of scientific articles, in the form of plain text 3D X, Y, Z atom coordinate values. Tools like ChemEngine have been implemented to automatically extract 3D molecular XYZ coordinates and atom information from articles with the aim to directly generate computable molecular structures. This system used pattern recognition and regular expressions to detect molecular coordinates and distinguish it from surrounding nonmolecular free text.…”

Section: Linking Documents To Structures (mentioning)

confidence: 99%
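
The regular-expression step described in the citation statements above can be illustrated with a minimal sketch. It is not ChemEngine's actual implementation: the pattern `ATOM_LINE`, the function `extract_xyz_blocks`, and the `min_atoms` threshold are hypothetical names chosen for illustration, and the sketch assumes each atom appears on its own line as an element symbol followed by three decimal coordinates. It only separates coordinate-like lines from surrounding free text; building bonds from a bond matrix is not attempted.

```python
import re

# Assumed line format (hypothetical): element symbol, then x, y, z as
# signed decimals, e.g. "O 0.000 0.000 0.117".
ATOM_LINE = re.compile(
    r"^\s*([A-Z][a-z]?)\s+(-?\d+\.\d+)\s+(-?\d+\.\d+)\s+(-?\d+\.\d+)\s*$"
)


def extract_xyz_blocks(text, min_atoms=2):
    """Group consecutive coordinate-like lines into candidate molecules.

    Returns a list of blocks; each block is a list of
    (symbol, x, y, z) tuples. Runs shorter than min_atoms are dropped.
    """
    blocks, current = [], []
    for line in text.splitlines():
        match = ATOM_LINE.match(line)
        if match:
            symbol, x, y, z = match.groups()
            current.append((symbol, float(x), float(y), float(z)))
        else:
            # A non-matching line ends the current run of atom lines.
            if len(current) >= min_atoms:
                blocks.append(current)
            current = []
    # Flush a run that reaches the end of the text.
    if len(current) >= min_atoms:
        blocks.append(current)
    return blocks


sample = (
    "Optimized geometry of water\n"
    "O 0.000 0.000 0.117\n"
    "H 0.000 0.757 -0.471\n"
    "H 0.000 -0.757 -0.471\n"
    "Energies are in hartree.\n"
)
for block in extract_xyz_blocks(sample):
    print(len(block), "atoms:", [atom[0] for atom in block])
```

A pattern this simple would also accept prose lines that happen to start with a capitalised one- or two-letter token followed by three decimals, so a real extractor would likely need to validate symbols against the periodic table and handle atom indices, atomic numbers, or scientific notation.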


Information Retrieval and Text Mining Technologies for Chemistry

et al. 2017


Efficient access to chemical information contained in scientific literature, patents, technical reports, or the web is a pressing need shared by researchers and patent attorneys from different chemical disciplines. Retrieval of important chemical information in most cases starts with finding relevant documents for a particular chemical compound or family. Targeted retrieval of chemical documents is closely connected to the automatic recognition of chemical entities in the text, which commonly involves the extraction of the entire list of chemicals mentioned in a document, including any associated information. In this Review, we provide a comprehensive and in-depth description of fundamental concepts, technical implementations, and current technologies for meeting these information demands. A strong focus is placed on community challenges addressing systems performance, more particularly CHEMDNER and CHEMDNER patents tasks of BioCreative IV and V, respectively. Considering the growing interest in the construction of automatically annotated chemical knowledge bases that integrate chemical information and biological data, cheminformatics approaches for mapping the extracted chemical names into chemical structures and their subsequent annotation together with text mining applications for linking chemistry with biological information are also presented. Finally, future trends and current challenges are highlighted as a roadmap proposal for research in this emerging field.

