Initial Experiments in Data-Driven Morphological Analysis for Finnish

Silfverberg, Miikka; Hulden, Mans

doi:10.18653/v1/w18-0209

Cited by 9 publications

(26 citation statements)

References 8 publications

“…Classically, rule-based analyzers have been augmented with statistical guessers which provide analyses for out-of-lexicon word forms (Lindén, 2009). Recently, purely data-driven morphological analysis has received increasing attention (Nicolai and Kondrak, 2017; Silfverberg and Hulden, 2018; Moeller et al, 2018; Silfverberg and Tyers, 2019). Purely data-driven systems learn an analysis model from a data set of morphologically analyzed word forms and can then be applied to unseen word forms.…”

Section: Discussion (mentioning)

confidence: 99%

A Report on the Third

Zampieri, Malmasi, Scherrer et al. 2019

Proceedings of the Sixth Workshop On

Self Cite

In this paper, we present the findings of the Third VarDial Evaluation Campaign organized as part of the sixth edition of the workshop on Natural Language Processing (NLP) for Similar Languages, Varieties and Dialects (VarDial), co-located with NAACL 2019. This year, the campaign included five shared tasks: one task rerun, German Dialect Identification (GDI), and four new tasks, namely Cross-lingual Morphological Analysis (CMA), Discriminating between Mainland and Taiwan variation of Mandarin Chinese (DMT), Moldavian vs. Romanian Cross-dialect Topic identification (MRC), and Cuneiform Language Identification (CLI). A total of 22 teams submitted runs across the five shared tasks. After the end of the competition, we received 14 system description papers, which are published in the VarDial workshop proceedings and referred to in this report.

“…Conversion systems for transforming sound signals of arbitrary sound events into their corresponding onomatopoeic representations have been developed using deep learning techniques. Moreover, a totally data‐driven approach called representation learning has also been investigated for training networks using embedding vectors corresponding to sound signals.…”

Section: Recent Research Trends In Environmental Sound Processing (mentioning)

confidence: 99%

Environmental sound processing and its applications

Miyazaki, Toda, Hayashi et al. 2019

IEEJ Transactions Elec Engng

As part of the effort to develop techniques for understanding environments using sound, many studies in the field of computational auditory scene analysis have focused on using computers to perform functions carried out naturally by the human auditory system. Thanks to recent progress in machine‐learning techniques, these environmental sound‐processing techniques have significantly improved and a widening variety of applications has resulted in considerable interest in this field. In this review, we introduce the fundamental techniques of environmental sound processing, as well as recent advances in front‐end and back‐end processing and potential applications for these techniques. Prospects for further progress in the field of environmental sound processing and the challenges still to be overcome are also discussed. © 2019 Institute of Electrical Engineers of Japan. Published by John Wiley & Sons, Inc.

“…The second one is created using data from the Turku Dependency Treebank. This dataset was originally presented by Silfverberg and Hulden (2018). We explicitly do not use any data from the Unimorph project.…”

Section: Data (mentioning)

confidence: 99%

“…Our second dataset was presented by Silfverberg and Hulden (2018). It is the Finnish part of the Universal Dependencies treebank v1 (Pyysalo et al, 2015) which has been analyzed using the OMorFi morphological analyzer (Pirinen et al, 2017).…”

Section: Finnish Treebank Data (mentioning)

confidence: 99%

“…It is the Finnish part of the Universal Dependencies treebank v1 (Pyysalo et al, 2015) which has been analyzed using the OMorFi morphological analyzer (Pirinen et al, 2017). We used the splits into training, development and test sets provided by Silfverberg and Hulden (2018). In contrast to the Uralic Wikipedia datasets, which is a type-level resource consisting of analyses for unique word forms, the Finnish treebank data is a token-level resource consisting of morphologically analyzed running text.…”

Section: Finnish Treebank Data (mentioning)

confidence: 99%

Proceedings of the Fifth International Workshop on Computational Linguistics for Uralic Languages

2019

This paper describes an initial set of experiments in data-driven morphological analysis of Uralic languages. The paper differs from previous work in that our work covers both lemmatization and generating ambiguous analyses. While hand-crafted finite-state transducers represent the state of the art in morphological analysis for most Uralic languages, we believe that there is a place for data-driven approaches, especially with respect to making up for lack of completeness in the lexicon. We present results for nine Uralic languages that show that, at least for basic nominal morphology for six out of the nine languages, data-driven methods can achieve an F-score of over 90%, providing results that approach those of finite-state techniques. We also compare our system to an earlier approach to Finnish data-driven morphological analysis (Silfverberg and Hulden, 2018) and show that our system outperforms this baseline.
