Proceedings of the 2007 Workshop on Computational Approaches to Semitic Languages: Common Issues and Resources (Semitic '07), 2007
DOI: 10.3115/1654576.1654587

Can you tag the modal? You should

Abstract: Computational linguistics methods are typically first developed and tested in English. When applied to other languages, assumptions from English data are often applied to the target language. One of the most common such assumptions is that a "standard" part-of-speech (POS) tagset can be used across languages with only slight variations. We discuss in this paper a specific issue related to the definition of a POS tagset for Modern Hebrew, as an example to clarify the method through which such variations can be …

Cited by 8 publications (6 citation statements)
References: 5 publications
“…These are not mere technical differences, but derive from different perspectives on the data. The Hebrew Treebank (TB) tag set is syntactic in nature ("if the word in this particular position functions as an adverb, tag it as an adverb, even though it is listed in the dictionary only as a noun"), whereas the KC tag set (Adler 2007; Netzer et al 2007; Adler et al 2008b) takes a lexical approach to POS tagging ("a word can assume only POS tags that would be assigned to it in a dictionary"). The lexical approach does not accommodate generic modification POS tags such as MOD, nor does it allow listing of demonstrative pronouns as adjectives.…”
Section: A Resource Incompatibility Issue (citation type: mentioning)
confidence: 99%
“…This kind of disagreement naturally appears also between the KC and TB. See Adler et al (2008b) and Netzer et al (2007) for further discussion on these two interesting cases.…”
Section: A Resource Incompatibility Issue (citation type: mentioning)
confidence: 99%
“…pers., pron. suff., punt., sost., vb. One could also envisage the refining of the tagset by adding: interrogative, modal, negation, and quantifier (Adler, 2007) (Netzer and Elhadad, 1998) (Netzer et al, 2007).…”
Section: Building the Resource (citation type: mentioning)
confidence: 99%
“…pers., pron. suf., punt., sost., vb. One could also envisage the refining of the tagset by adding: interrogative, modal, negation, and quantifier (Adler, 2007) (Netzer and Elhadad, 1998) (Netzer et al, 2007).…”
Section: Building the Resource (citation type: mentioning)
confidence: 99%