2016
DOI: 10.1093/database/baw061

Chemical entity recognition in patents by combining dictionary-based and statistical approaches

Abstract: We describe the development of a chemical entity recognition system and its application in the CHEMDNER-patent track of BioCreative 2015. This community challenge includes a Chemical Entity Mention in Patents (CEMP) recognition task and a Chemical Passage Detection (CPD) classification task. We addressed both tasks by an ensemble system that combines a dictionary-based approach with a statistical one. For this purpose the performance of several lexical resources was assessed using Peregrine, our open-source in…

Citations: cited by 21 publications (15 citation statements)
References: 38 publications (61 reference statements)
“…Extraction of chemical compounds from chemical-related patents has recently been studied, focusing on patent titles and abstracts (28, 31, 52) or full texts (3, 20, 21, 27, 51). The majority of these studies concentrated on identifying chemical compounds in text while disregarding the structures of the extracted compounds (31, 52).…”
Section: Discussion (mentioning)
confidence: 99%
“…Obtaining high precision and recall values in the first step is essential for the success of the second step. Based on the findings of our previous studies (27, 28), we used an ensemble approach combining dictionary-based and morphology-based approaches to obtain high precision and recall. These approaches require a small annotated corpus (26, 33) and can provide a structural representation of the extracted compounds.…”
Section: Discussion (mentioning)
confidence: 99%
“…There is growing interest in hybrid machine learning and dictionary systems such as the one described in [10], which obtains interesting performance on chemical entity recognition in patent texts. The authors of [55] use different approaches for different entity types (machine learning for chemical names, dictionary-based for organism and assay entities); given the complementary application, this is not a hybrid approach in the strict sense.…”
Section: Discussion (mentioning)
confidence: 99%
“…Two or more of the previously mentioned approaches are used together to combine their strengths and, hopefully, overcome their weaknesses. For example, [9] and [10] successfully use a hybrid dictionary-machine learning approach.…”
Section: Introduction (mentioning)
confidence: 99%