2014
DOI: 10.1101/011403
Preprint

ENVIRONMENTS and EOL: identification of Environment Ontology terms in text and the annotation of the Encyclopedia of Life

Abstract: Summary: The association of organisms to their environments is a key issue in exploring biodiversity patterns. This knowledge has traditionally been scattered, but textual descriptions of taxa and their habitats are now being consolidated in centralized resources. However, structured annotations are needed to facilitate large-scale analyses. Therefore, we developed ENVIRONMENTS, a fast dictionary-based tagger capable of identifying Environment Ontology (ENVO) terms in text. We evaluate the accuracy of the tagg…

Cited by 2 publications (5 citation statements)
References 7 publications (5 reference statements)
“…In particular, the frequency of occurrence of each word is noted. Concretely, this is done offline by using a named entity recognition (NER) system [8] and placing results into an SQLite3 database that is automatically downloaded on the first run of seqenv.…”
Section: Methods (mentioning)
confidence: 99%
“…In general, we have found the isolation source metadata to be the most dependable source of environmental information and the results presented here are restricted to that field. A custom named entity recognition (NER) system based on [8] is then used to label the resulting text with terms from the EnvO ontology [3]. An ontology is a formal specification of the terms in a particular knowledge domain and the relations among them.…”
Section: Introduction (mentioning)
confidence: 99%
“…Concretely, this is done offline by using a named entity recognition (NER) system. The NER algorithm is an optimized dictionary-based tagger; it searches for keywords associated with each ENVO term but also uses a stop-list of problematic words [8].…”
Section: Methods (mentioning)
confidence: 99%
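
The statement above describes a dictionary-based tagger that matches keywords for each ENVO term and skips a stop-list of problematic words. A minimal sketch of that general idea follows. It is not the ENVIRONMENTS implementation: the dictionary entries, the placeholder identifiers, the stop-list contents, and the `tag` function are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical mini-dictionary: keyword -> placeholder term identifier.
# These identifiers are NOT real ENVO accessions.
DICTIONARY = {
    "forest": "ENVO:EXAMPLE_1",
    "hot spring": "ENVO:EXAMPLE_2",
    "soil": "ENVO:EXAMPLE_3",
    "sample": "ENVO:EXAMPLE_4",
}

# Hypothetical stop-list: keywords too ambiguous to tag reliably.
STOP_LIST = {"sample"}


def tag(text):
    """Return (start, end, matched text, term id) for each dictionary hit."""
    keywords = [k for k in DICTIONARY if k not in STOP_LIST]
    if not keywords:
        return []
    # Try longer keywords first so multi-word names win over their substrings.
    keywords.sort(key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(k) for k in keywords) + r")\b",
        re.IGNORECASE,
    )
    return [
        (m.start(), m.end(), m.group(0), DICTIONARY[m.group(0).lower()])
        for m in pattern.finditer(text)
    ]


for hit in tag("Isolated from forest soil near a Hot Spring sample"):
    print(hit)
```

In this sketch "sample" is in the dictionary but never tagged, because the stop-list filters it out before the pattern is built.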
“…The ability of the NER engine to tag text with ENVO terms was evaluated in [8] through comparison to a manually curated corpus; this resulted in 87.8% precision and 77.0% recall, corresponding to an F1 score of 82.0%. The results were placed into an SQLite3 database that is automatically downloaded on the first run of seqenv.…”
Section: Methods (mentioning)
confidence: 99%
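
The F1 score quoted above is the harmonic mean of precision and recall, so the reported 82.0% can be checked directly from the two figures in the statement. The short sketch below does that arithmetic; the helper name `f1_score` is an illustrative choice, not something taken from the cited papers.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Figures quoted in the citation statement: 87.8% precision, 77.0% recall.
f1 = f1_score(0.878, 0.770)
print(f"{f1:.1%}")  # prints 82.0%
```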