2011
DOI: 10.1186/1758-2946-3-41

OSCAR4: a flexible architecture for chemical text-mining

Abstract: The Open-Source Chemistry Analysis Routines (OSCAR) software, a toolkit for the recognition of named entities and data in chemistry publications, has been developed since 2002. Recent work has resulted in the separation of the core OSCAR functionality and its release as the OSCAR4 library. This library features a modular API (based on reduction of surface coupling) that permits client programmers to easily incorporate it into external applications. OSCAR4 offers a domain-independent architecture upon which che…

Cited by 174 publications (160 citation statements)
References 26 publications (22 reference statements)
“…State-of-the-art recognizers can be used to tag organisms (Gerner et al 2010), chemical entities (Jessop et al 2011) and genes, proteins and other biological entities (Settles 2005). The vocabulary on AMPs, eg derived from the UniProt knowledge base (Magrane and Consortium 2011) and antibiotics lexicon derived from the ARDB database (Liu and Pop 2009) and the antibiotics list in Wikipedia (Wikipedia 2012), can support the pattern matching of AMP mentions and further development of suitable recognizers.…”
Section: Data Screening and Processing (mentioning)
confidence: 99%
“…We also compare our approach with the existing techniques proposed in [9,2,16]. Note that in [9] a CRF based machine learning system is developed for the chemical name identification.…”
Section: Comparison With the Existing Systems (mentioning)
confidence: 97%
“…PathTexts [81] consisting of a pathway visualizer, text-mining algorithms, and annotation tools is available for systems biologists. Other important text-mining tools specifically built for medical informatics are Biocontrast [82] and BioText Quest [83]. AbNER [84], an open-source software tool for biomedical text mining, provides a GUI for tagging genes, proteins, and other entity names in the given text.…”
Section: Biomedical Text Mining (mentioning)
confidence: 99%
“…This can save users valuable time of redrawing structures from printed material, as it directly transforms the "images" into "real structures" that can then be saved into chemical databases. Programs such as CLiDE [82], OSRA [83], and ChemOCR [84] are the known relevant softwares that recognize structures, reactions, and text from scanned images of printed chemistry literature. OSRA is a utility designed to convert graphical representations of chemical structures, as they appear in journal articles, patent documents, textbooks, trade magazines, etc.…”
Section: Image To Structure Tools (mentioning)
confidence: 99%