2019
DOI: 10.1162/tacl_a_00253

Joint Transition-Based Models for Morpho-Syntactic Parsing: Parsing Strategies for MRLs and a Case Study from Modern Hebrew

Abstract: In standard NLP pipelines, morphological analysis and disambiguation (MA&D) precedes syntactic and semantic downstream tasks. However, for languages with complex and ambiguous word-internal structure, known as morphologically rich languages (MRLs), it has been hypothesized that syntactic context may be crucial for accurate MA&D, and vice versa. In this work we empirically confirm this hypothesis for Modern Hebrew, an MRL with complex morphology and severe word-level ambiguity, in a novel transition-bas…

citations
Cited by 29 publications
(72 citation statements)
references
References 21 publications
“…Utilizing term frequency-inverse document frequency, we represented each of the police reports with its own sparse vector. In order to improve the accuracy of the TF-IDF representation with Hebrew, we performed the following steps: (1) we removed Hebrew stop words, and (2) we used Hebrew tokenization and lemmatization utilizing the 'YAP' parser tool [25].…”
Section: Baselines - Representation Methods (mentioning)
confidence: 99%
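The preprocessing described in the statement above (stop-word removal, lemmatization, then TF-IDF weighting into a sparse vector) can be sketched roughly as follows. This is a minimal illustration, not the citing authors' code: `STOP_WORDS`, `LEMMAS`, `preprocess`, and `tfidf` are hypothetical names, the lemma table is an English toy stand-in for the output of a Hebrew morphological analyzer such as YAP, and the unsmoothed `log(N / df)` weighting is an assumption.

```python
import math
from collections import Counter

# Hypothetical stand-ins: a real pipeline would use a Hebrew stop-word list
# and lemmas produced by a morphological analyzer, not these toy tables.
STOP_WORDS = {"the", "a", "of"}
LEMMAS = {"reports": "report", "cars": "car", "stolen": "steal"}


def preprocess(text):
    """Lowercase, split on whitespace, drop stop words, map tokens to lemmas."""
    tokens = text.lower().split()
    return [LEMMAS.get(t, t) for t in tokens if t not in STOP_WORDS]


def tfidf(docs):
    """Return one sparse vector (dict of term -> weight) per document."""
    tokenized = [preprocess(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents in which each term occurs.
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks)
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


docs = ["the cars stolen", "a report of the cars", "reports"]
vectors = tfidf(docs)
```

Each document keeps only the terms it contains, which is what makes the representation sparse; terms that occur in fewer documents receive a larger weight.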
“…In our experiments, the police reports used are written in Hebrew, which makes the extraction task very difficult [17,25,37]. Hebrew is a morphologically rich language, and its text tends to be very ambiguous; each token can have a large set of possible inflections, according to quantity, gender, and tense [19].…”
Section: Introduction (mentioning)
confidence: 99%
“…Assuming the inflection to be in nominative-case, this information enables the nominal to form one of the two possible syntactic relations with the main verb in the sentence, namely, kartā (subject) or karma (object), in the syntactic analysis of the sentence. The cyclic dependency between morphological and syntax-level tasks is well known (Tsarfaty 2006), and these tasks are often solved jointly (More et al 2019). Similarly, the potential error propagation from word segmentation to its downstream tasks in pipeline models is also well established for multiple languages (Hatori et al 2012; Zhang and Yang 2018).…”
Section: Figure (mentioning)
confidence: 99%
“…The performance of a system depends highly on the choice of feature function used for the task. In MRLs, hand-crafted features still form a crucial component in contributing to the performance of the state-of-the-art systems for tasks such as morphological parsing and dependency parsing (More et al, 2019; Seeker and Çetinoglu 2015). But Krishna et al (2018) learn a feature function using the Path Ranking Algorithm (PRA) (Lao and Cohen 2010) for the joint task of word segmentation and morphological parsing.…”
Section: Figure (mentioning)
confidence: 99%