2006
DOI: 10.1075/ubli.5.15mor

Morpho-syntactic Tagging of the Spanish C-ORAL-ROM Corpus

Cited by 10 publications (4 citation statements)
References 0 publications
“…For this reason, state-of-the-art lexicons [ 14 , 36 , 49 ] gather verb terms, and we proceed similarly in MedLexSp. From a list of medical verbs, we generated conjugated variants by using a python script and the lexicon of a Spanish part-of-speech tagger [ 50 ]: e.g. sangrar (‘to bleed’) sangra (‘he/she/it bleeds’), sangrando (‘bleeding’), sangrado (‘bled’)... Then, the CUI of each noun term was assigned to the corresponding verb term.…”
Section: Construction and Content (mentioning)
confidence: 99%
“… The subset of medical terms encoded in the lexicon of the SPACCC PoS tagger [ 72 ] was processed and their linguistic information was added to MedLexSp. The GRAMPAL tagger [ 50 ] was applied to predict the part-of-speech of mono-word terms for which no morphological data were obtained using the previous methods. Lastly, with regard to multi-words, the procedure was to leverage the information of the head word, given that the head determines the analysis of the constituent.…”
Section: Construction and Content (mentioning)
confidence: 99%
“…The Spanish dataset is taken from the FinT-esp corpus (Moreno-Sandoval et al, 2020) and consists of 262 documents with a distribution utterly similar to the Greek dataset (see Table 3). The originals were carefully edited by hand, and fragments not containing the narrative (tables, footnotes, headers, etc.)…”
Section: Spanish Dataset (mentioning)
confidence: 99%