Proceedings of the Workshop on Multiword Expressions Identification, Interpretation, Disambiguation and Applications - MWE '09 2009
DOI: 10.3115/1698239.1698247

Mining complex predicates in Hindi using a parallel Hindi-English corpus

Abstract: A complex predicate is a noun, a verb, an adjective or an adverb followed by a light verb, where the combination behaves as a single verbal unit. Complex predicates (CPs) are used abundantly in Hindi and other languages of the Indo-Aryan family. Detecting and interpreting CPs is an important and somewhat difficult task. Linguistic and statistical methods have had limited success in mining them. In this paper, we present a simple method for detecting CPs of all kinds using a Hindi-English parallel corpus. A CP …

Citations: cited by 17 publications (10 citation statements)
References: 6 publications
“…The major confusion with automatic tagging is to differentiate the VBL (light verb) and the VB (verb) tags. Though considered complex predicates [Sinha 2009] VBL words are syntactically very similar to VB words. Several of the cardinals were tagged as nouns owing to syntactic similarity.…”
Section: Urdu-specific POS Tagger (mentioning)
confidence: 99%
“…For details of the CRULP tagset, please refer to the CRULP Web site. We use this tagset in our work for two reasons: (1) the availability of annotated corpus that uses the CRULP tagset, and (2) the existence of the light verb tags VBL and VBLI which is helpful for semantic role labeling [Sinha 2009]. …”
Section: Part Of Speech Tagger (mentioning)
confidence: 99%
“…The automatic detection benefits especially from parallel corpora representing valuable sources of data in which CPs can be automatically recognized via word alignment, see e.g. (Chen et al, 2015), (de Medeiros Caseli et al, 2010), (Sinha, 2009), (Zarrieß and Kuhn, 2009).…”
Section: Related Work (mentioning)
confidence: 99%
“…Third, there are languages that abound in verb + noun constructions or multiword verbs (Hindi (Sinha, 2009; Sinha, 2011), Bengali, Estonian (Kaalep and Muischnek, 2006; Kaalep and Muischnek, 2008; Muischnek and Kaalep, 2010), Persian (Mansoory and Bijankhan, 2008)): verbal concepts are mostly expressed by combining a noun with a light verb (Mansoory and Bijankhan, 2008).…”
Section: Semi-compositional Constructions and Their Verbal Counterparts (mentioning)
confidence: 99%
“…For instance, one-to-many alignment can be exploited: if a word corresponds to several words in the other language, it is highly probable that the other language equivalent can be considered as a multiword expression (see e.g. Caseli et al (2009), Caseli et al (2010), Zarrieß and Kuhn (2009), Sinha (2009), Attia et al (2010) or Haugereid and Bond (2011)). However, this method cannot identify multiword expressions that are aligned to another multiword expression in the other language.…”
Section: Related Work On the Automatic Identification Of Multiword Ex (mentioning)
confidence: 99%
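
The one-to-many alignment heuristic described in the last citation statement can be illustrated with a minimal sketch. It is not the method of Sinha (2009) or of any other cited paper. The function name `one_to_many_candidates`, the toy sentence pair, and the hand-written alignment links are all assumptions made for illustration; a real pipeline would take its links from a word aligner.

```python
from collections import defaultdict


def one_to_many_candidates(src_tokens, tgt_tokens, alignment):
    """Return (source word, target phrase) pairs where a single source
    word is linked to two or more adjacent target words.

    `alignment` is an iterable of (source_index, target_index) links.
    """
    targets_by_source = defaultdict(set)
    for s, t in alignment:
        targets_by_source[s].add(t)

    candidates = []
    for s in sorted(targets_by_source):
        idxs = sorted(targets_by_source[s])
        if len(idxs) < 2:
            continue  # one-to-one link: not a multiword candidate
        if idxs[-1] - idxs[0] != len(idxs) - 1:
            continue  # target words are not adjacent: skipped in this sketch
        phrase = " ".join(tgt_tokens[i] for i in idxs)
        candidates.append((src_tokens[s], phrase))
    return candidates


# Hypothetical toy data: English source, romanized Hindi target,
# with alignment links written by hand.
src = ["he", "helped", "me"]
tgt = ["usne", "meri", "madad", "ki"]
links = [(0, 0), (1, 2), (1, 3), (2, 1)]

candidates = one_to_many_candidates(src, tgt, links)
print(candidates)
```

Here the single English verb "helped" is linked to the noun plus light verb sequence "madad ki", so that sequence is proposed as a candidate. As the quoted statement notes, a heuristic of this kind finds nothing when a multiword expression on one side is aligned to another multiword expression on the other side.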