Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies (2017)
DOI: 10.18653/v1/k17-3004

IMS at the CoNLL 2017 UD Shared Task: CRFs and Perceptrons Meet Neural Networks

Abstract: This paper presents the IMS contribution to the CoNLL 2017 Shared Task. In the preprocessing step we employed a CRF POS/morphological tagger and a neural tagger predicting supertags. On some languages, we also applied word segmentation with the CRF tagger and sentence segmentation with a perceptron-based parser. For parsing we took an ensemble approach by blending multiple instances of three parsers with very different architectures. Our system achieved the third place overall and the second place for the surp…
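As a rough illustration of the kind of feature-based CRF tagging named in the preprocessing step, the sketch below trains a tiny POS tagger with the sklearn-crfsuite library. The feature template, toy data, and helper names are invented for illustration; this is not the paper's actual tagger or feature set.

```python
import sklearn_crfsuite

def token_features(sent, i):
    """Hand-crafted features for token i, a typical CRF tagging template."""
    w = sent[i]
    return {
        "lower": w.lower(),
        "suffix3": w[-3:].lower(),
        "is_title": w.istitle(),
        "is_digit": w.isdigit(),
        "prev": sent[i - 1].lower() if i > 0 else "<BOS>",
        "next": sent[i + 1].lower() if i + 1 < len(sent) else "<EOS>",
    }

def featurize(sent):
    return [token_features(sent, i) for i in range(len(sent))]

# Toy training data: one tokenized sentence with gold POS tags.
train_sents = [["The", "dog", "barks", "."]]
train_tags = [["DET", "NOUN", "VERB", "PUNCT"]]

crf = sklearn_crfsuite.CRF(algorithm="lbfgs", c1=0.1, c2=0.1, max_iterations=50)
crf.fit([featurize(s) for s in train_sents], train_tags)
print(crf.predict([featurize(["The", "cat", "sleeps", "."])]))
```

The same machinery extends to morphological tagging or character-level word segmentation by swapping the label set and feature template.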

Cited by 24 publications (23 citation statements)
References 16 publications
“…These representations are used to score arcs, which are greedily added to the tree. Björkelund et al. (2017) perform best on Arabic, using an ensemble of many different types of bottom-up discriminative parsers. They have each of twelve parsers score potential arcs, learn a weighting function to combine them, and use the Chu-Liu-Edmonds algorithm (Chu, 1965; Edmonds, 1967) to output final parses.…”
Section: Related Work
confidence: 99%
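As a sketch of the blending-and-decoding scheme this statement describes, the snippet below combines several arc-score matrices with fixed weights (the paper learns the weighting function) and decodes a tree as a maximum spanning arborescence. networkx's Edmonds implementation stands in for the parsers' own Chu-Liu-Edmonds decoder, and all names here are illustrative assumptions.

```python
import numpy as np
import networkx as nx

def blend_arc_scores(parser_scores, weights):
    """Weighted sum of per-parser arc-score matrices.

    parser_scores: list of (n x n) arrays with scores[h, d] = score of
    the arc head h -> dependent d; node 0 is the artificial root.
    """
    return sum(w * s for w, s in zip(weights, parser_scores))

def decode_tree(scores):
    """Maximum spanning arborescence rooted at node 0 (Chu-Liu-Edmonds).

    Returns heads, where heads[d] is the chosen head of dependent d
    and heads[0] = -1 for the root.
    """
    n = scores.shape[0]
    g = nx.DiGraph()
    for h in range(n):
        for d in range(1, n):  # no arcs into the root, no self-loops
            if h != d:
                g.add_edge(h, d, weight=scores[h, d])
    tree = nx.maximum_spanning_arborescence(g)  # root forced to 0
    heads = [-1] * n
    for h, d in tree.edges:
        heads[d] = h
    return heads

# Three toy "parsers" scoring a 4-word sentence (plus the root node).
rng = np.random.default_rng(0)
toy_scores = [rng.normal(size=(5, 5)) for _ in range(3)]
print(decode_tree(blend_arc_scores(toy_scores, [0.5, 0.3, 0.2])))
```

Because no edges point into node 0, any spanning arborescence is necessarily rooted there, which is how the artificial-root convention of dependency parsing maps onto the generic Edmonds solver.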
“…Baselines for UD v2.0. We compare to the top-performing models for EN, JA, VI, ZH from the CoNLL 2017 shared task: UDPipe 1.2 (Straka and Straková, 2017), Stanford (Dozat et al., 2017), FBAML (Qian and Liu, 2017), TRL (Kanayama et al., 2017), and IMS (Björkelund et al., 2017).…”
Section: Multilingual Experiments on Clean Data
confidence: 99%
“…Outside of the East Asian context, word segmentation research focuses mainly on languages with complex morphology and/or extensive compounding, such as Finnish, Turkish, German, Arabic and Hebrew, where splitting coarse-grained surface forms into smaller units leads to a significant reduction in vocabulary size and thus a lower proportion of out-of-vocabulary words [31][32][33][34][35]. Moreover, even in languages that normally use explicit word delimiters, there are special types of text specific to the web domain, such as Uniform Resource Locators (URLs) and hashtags, whose analysis requires a word segmentation procedure [35,36].…”
Section: Related Work
confidence: 99%
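One widely used splitting procedure of this kind is byte-pair encoding. The minimal sketch below runs the core merge loop on a toy vocabulary to show how repeatedly merging the most frequent symbol pair yields subword units, shrinking the effective vocabulary. The toy counts are invented, and this is a generic illustration rather than the method of any specific cited work.

```python
import collections
import re

def get_stats(vocab):
    """Count frequencies of adjacent symbol pairs across the vocabulary."""
    pairs = collections.defaultdict(int)
    for word, freq in vocab.items():
        symbols = word.split()
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs

def merge_vocab(pair, vocab):
    """Merge every occurrence of the given symbol pair into one symbol."""
    bigram = re.escape(" ".join(pair))
    pattern = re.compile(r"(?<!\S)" + bigram + r"(?!\S)")
    return {pattern.sub("".join(pair), word): freq
            for word, freq in vocab.items()}

# Toy corpus counts; symbols start as characters plus an end-of-word marker.
vocab = {"l o w </w>": 5, "l o w e r </w>": 2,
         "n e w e s t </w>": 6, "w i d e s t </w>": 3}

for step in range(8):
    pairs = get_stats(vocab)
    if not pairs:
        break
    best = max(pairs, key=pairs.get)
    vocab = merge_vocab(best, vocab)
    print(step, best)  # learned merge operations, most frequent first
```

Applying the learned merges to unseen words decomposes them into known subword units, which is the mechanism behind the out-of-vocabulary reduction the passage describes.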