2022
DOI: 10.1021/acs.jcim.2c00733

MolMiner: You Only Look Once for Chemical Structure Recognition

Abstract: Molecular structures are commonly depicted in 2D printed forms in scientific documents such as journal papers and patents. However, these 2D depictions are not machine readable. Due to a backlog of decades and an increasing amount of printed literature, there is a high demand for translating printed depictions into machine-readable formats, which is known as Optical Chemical Structure Recognition (OCSR). Most OCSR systems developed over the last three decades use a rule-based approach, which vectorizes the dep…

citations
Cited by 14 publications
(10 citation statements)
references
References 22 publications
“…In recent years, deep-learning-based OCSR tools have been developed 16–18 in conjunction with remarkable advancements in computer vision and natural language processing 19, 20. While several publications have claimed to have developed tools that are capable of recognising chemical depictions with high accuracy, most of these tools are either proprietary or entirely unavailable 16, 21–23. Among the few open-source OCSR software solutions 15, 24, there is no system that combines chemical structure image segmentation, classification, and translation within a comprehensive workflow.…”
Section: Introduction (mentioning)
confidence: 99%
“…In evaluating the segmentation capabilities of MolMiner-ImgDet [20], DECIMER [18], and ChemSAM, we reference several considerations due to the varied availability of model implementations and the specific challenges presented by chemical structure segmentation. These considerations include the completeness of segmentation, the proportion of structures accurately segmented from the document layout, the recognition rate of colored structures, and the proportion of non-structural elements.…”
Section: Results (mentioning)
confidence: 99%
“…There have been closed-source projects like CLiDE 39 or the recently published MolMiner 23 that combine a segmentation step with an OCSR step in their workflow. CLiDE is a fully commercial tool, MolMiner permits limited access to registered users and offers unlimited access to users who wish to obtain an enterprise licence.…”
Section: Discussion (mentioning)
confidence: 99%
“…In recent years, deep-learning-based OCSR tools have been developed 16, 17, 18 in conjunction with remarkable advancements in computer vision and natural language processing 19, 20. While several publications have claimed to have developed tools that are capable of recognizing chemical depictions with high accuracy, most of these tools are either proprietary or entirely unavailable 16, 21–23. Among the few open-source OCSR software solutions 15, 24, there is no system that combines chemical structure image segmentation, classification, and translation within a comprehensive workflow.…”
Section: Introduction (mentioning)
confidence: 99%