2003
DOI: 10.1093/bioinformatics/btg1023

GENIA corpus—a semantically annotated corpus for bio-textmining

Abstract: GENIA corpus version 3.0 consisting of 2000 MEDLINE abstracts has been released with more than 400,000 words and almost 100,000 annotations for biological terms.

citations
Cited by 873 publications
(609 citation statements)
references
References 1 publication
“…The in-domain monolingual corpora include (cf. Table 2): the Cochrane database of reviews of primary research in human health care and health policy [63], DrugBank -a bioinformatics and cheminformatics resource describing drugs [64], Gene Regulation Event Corpus (GREC) -a semantically annotated English corpus of abstracts of biomedical texts [65], the GENIA corpus of biomedical literature compiled and annotated within the GENIA project [66], the Foundational Model of Anatomy Ontology (FMA) -a knowledge source for biomedical informatics concerned with symbolic representation of the phenotypic structure of the human body [67], English texts extracted from the UMLS Metathesaurus [40], the Patient Information Leaflet Corpus (PIL) -a collection of documents giving instructions to patients about their medication [69], and finally, a large set of texts extracted from HONcode-certified sites (HON) that have been identified by language-detection libraries [70,71] to be English-language [72].…”
Section: In-domain Monolingual Corpora (mentioning)
confidence: 99%
“…[11] is consistent with [14] only if either (a) every instance of non-Saccharomyces Fungus polarized growth is co-localized with an instance of Saccharomyces polarized group or (b) there is Fungus polarized growth only in Saccharomyces. (a) we take to be biologically false; but (b) implies that 'site of polarized growth (sensu Saccharomyces)' and 'site of polarized growth (sensu Fungi)' in fact refer, confusingly, to the same class, and thus that the latter should be removed from GO's cellular component ontology.…”
Section: Problems With 'Sensu' (mentioning)
confidence: 95%
“…from which we can infer that: [13] bud tip is a site of polarized growth (sensu Fungi) and from there to: [14] every instance of bud tip has an instance of Fungus polarized growth located therein.…”
Section: Problems With 'Sensu' (mentioning)
confidence: 99%
“…Finally, the event extractor finds the binary relation using the syntactic information of a given sentence, co-occurrence statistics between two named entities, and pattern information of an event verb. General medical term was trained with UMLS meta-thesaurus [12] and the biological entity and its interaction was trained with GENIA [13] corpus. The underlying NLP approaches for named entity recognition are based on the system of Hwang et al [14] and Lee et al [15] with collaborations.…”
Section: Interaction Extraction (mentioning)
confidence: 99%