Niko Schenk scite author profile

2017

We introduce an attention-based Bi-LSTM for Chinese implicit discourse relations and demonstrate that modeling argument pairs as a joint sequence can outperform word order-agnostic approaches. Our model benefits from a partial sampling scheme and is conceptually simple, yet achieves state-of-the-art performance on the Chinese Discourse Treebank. We also visualize its attention activity to illustrate the model's ability to selectively focus on the relevant parts of an input sequence.

Unsupervised Learning of Prototypical Fillers for Implicit Semantic Role Labeling

2016

Gold annotations for supervised implicit semantic role labeling are extremely sparse and costly. As a lightweight alternative, this paper describes an approach based on unsupervised parsing that requires no iSRL-specific training data: we induce prototypical roles from large amounts of explicit SRL annotations paired with their distributed word representations. An evaluation shows performance competitive with supervised methods on the SemEval 2010 data, and our method can easily be applied to predicates (or languages) for which no training annotations are available.

Do We Really Need All Those Rich Linguistic Features? A Neural Network-Based Approach to Implicit Sense Labeling

Donandt

et al. 2016

We describe our contribution to the CoNLL 2016 Shared Task on shallow discourse parsing. Our system extends the two best parsers from the previous year's competition by integrating a novel implicit sense labeling component. It is grounded in a highly generic, language-independent feedforward neural network architecture that incorporates weighted word embeddings for argument spans, which obviates the need for (traditional) hand-crafted features. Despite its simplicity, our system overall outperforms all results from 2015 on 5 out of 6 evaluation sets for English and achieves an absolute improvement in F1-score of 3.2% on the PDTB test section for non-explicit sense classification.

Annotating a Low-Resource Language with LLOD Technology: Sumerian Morphology and Syntax

et al. 2018

This paper describes work on the morphological and syntactic annotation of Sumerian cuneiform as a model for low-resource languages in general. Cuneiform texts are invaluable sources for the study of history, languages, economy, and cultures of Ancient Mesopotamia and its surrounding regions. Assyriology, the discipline dedicated to their study, has vast research potential, but lacks the modern means for computational processing and analysis. Our project, Machine Translation and Automated Analysis of Cuneiform Languages, aims to fill this gap by bringing together corpus data, lexical data, linguistic annotations and object metadata. The project’s main goal is to build a pipeline for machine translation and annotation of Sumerian Ur III administrative texts. The rich and structured data is then to be made accessible in the form of (Linguistic) Linked Open Data (LLOD), which should open it to a larger research community. Our contribution is two-fold: in terms of language technology, our work represents the first attempt to develop an integrative infrastructure for the annotation of morphology and syntax on the basis of RDF technologies and LLOD resources. With respect to Assyriology, we work towards producing the first syntactically annotated corpus of Sumerian.

A Minimalist Approach to Shallow Discourse Parsing and Implicit Relation Recognition

2015

We describe a minimalist approach to shallow discourse parsing in the context of the CoNLL 2015 Shared Task. Our parser integrates a rule-based component for argument identification and data-driven models for the classification of explicit and implicit relations. We place special emphasis on the evaluation of implicit sense labeling: we present different feature sets and show that (i) word embeddings are competitive with traditional word-level features, and (ii) they can be used to considerably reduce the total number of features. Despite its simplicity, our parser is competitive with other systems in terms of sense recognition and thus provides a solid ground for further refinement.