Developing a Stemmer for German Based on a Comparative Analysis of Publicly Available Stemmers

Weissweiler, Leonie; Fraser, Alexander

doi:10.1007/978-3-319-73706-5_8

Cited by 12 publications

(9 citation statements)

References 6 publications


“…The distribution between Android and iOS in the reviews (77% Android) roughly matches that of the distribution of Android and iOS market shares in Germany (64% Android). Looking at all ratings, including those without a review, for Android, 36% of the ratings were 5-star ratings, and 35% were 1-star ratings.…”

Section: A App Reviews (supporting)

confidence: 62%

Public Perception of the German COVID-19 Contact-Tracing App Corona-Warn-App

Beierle, Dhakal, Cohrdes et al. 2021

Preprint


Several governments introduced or promoted the use of contact-tracing apps during the ongoing COVID-19 pandemic. In Germany, the related app is called Corona-Warn-App, and by end of 2020, it had 22.8 million downloads. Contact tracing is a promising approach for containing the spread of the novel coronavirus. It is only effective if there is a large user base, which brings new challenges like app users unfamiliar with using smartphones or apps. As Corona-Warn-App is voluntary to use, reaching many users and gaining a positive public perception is crucial for its effectiveness. Based on app reviews and tweets, we are analyzing the public perception of Corona-Warn-App. We collected and analyzed all 78,963 app reviews for the Android and iOS versions from release (June 2020) to beginning of February 2021, as well as all original tweets until February 2021 containing #CoronaWarnApp (43,082). For the reviews, the most common words and n-grams point towards technical issues, but it remains unclear, to what extent this is due to the app itself, the used Exposure Notification Framework, system settings on the user's phone, or the user's misinterpretations of app content. For Twitter data, overall, based on tweet content, frequent hashtags, and interactions with tweets, we conclude that the German Twitter-sphere widely reports adopting the app and promotes its use.


“…Then, we tokenized tweets using the regular expression, re, Python package [17]. When tokenizing, we changed tweets to lowercase and stemmed tweets with Porter stemmer for English [18] and with Cistem for German [19]. For English tweets, we employed a basic tokenizer to tokenize tweets without stemming for POS tagging.…”

Section: Pre-processing (mentioning)

confidence: 99%

Multi-Class Detection of Abusive Language Using Automated Machine Learning

Jorgensen, Choi, Niemann et al. 2020

WI2020 Zentrale Tracks


Abusive language detection online is a daunting task for moderators. We propose Automated Machine Learning (Auto-ML) to semi-automate abusive language detection and to assist moderators. In this paper, we show that multi-class classification powered by Auto-ML is successful in detecting abusive language in English and German as well as and better than the state-of-the-art machine learning models. We also highlight how we combatted the imbalanced data problem in our data-sets through feature selection and undersampling methods. We propose Auto-ML as a promising approach to the field of abusive language detection, especially for small companies who may have little machine learning knowledge and computing resources.
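The pre-processing steps named in the citation statement for this paper (lowercasing, regular-expression tokenization, then stemming) can be sketched roughly as follows. This is only an illustration under assumptions, not the cited authors' code: the token pattern, the function names `preprocess` and `toy_stem`, and the stemmer interface are hypothetical. In particular, `toy_stem` is a placeholder and is neither the Porter stemmer nor CISTEM; a real stemmer would be passed in its place.

```python
import re
from typing import Callable, List

# Assumed token pattern: runs of word characters (Unicode-aware in Python 3).
# The pattern used in the cited work is not stated in the excerpt.
TOKEN_PATTERN = re.compile(r"\w+")


def preprocess(text: str, stem: Callable[[str], str] = lambda t: t) -> List[str]:
    """Lowercase the text, tokenize it with a regex, and stem each token.

    `stem` is any function mapping a token to its stem; by default tokens
    are returned unchanged.
    """
    tokens = TOKEN_PATTERN.findall(text.lower())
    return [stem(token) for token in tokens]


def toy_stem(token: str) -> str:
    """Placeholder stemmer (hypothetical): drops a trailing 's' from
    tokens longer than three characters. Not Porter, not CISTEM."""
    if len(token) > 3 and token.endswith("s"):
        return token[:-1]
    return token


if __name__ == "__main__":
    print(preprocess("Die Häuser SIND alt!"))
    print(preprocess("Cats chase dogs", stem=toy_stem))
```

Passing the stemmer as a parameter keeps the tokenization step language-independent, so an English and a German stemmer could be swapped in per tweet language.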


“…Dedicated methods that tackle rich target-side morphology have also shown good results in phrase-based translation systems previously (Huck et al, 2017c). Future work on neural machine translation could for instance follow a two-step prediction paradigm (Conforti et al, 2018), or improve over our current version of linguistically informed word segmentation by means of a better linguistic analysis (Weissweiler and Fraser, 2017).…”

Section: Preprocessing (mentioning)

confidence: 99%

LMU Munich’s Neural Machine Translation Systems at WMT 2018

Huck, Stojanovski, Hangya et al. 2018

Proceedings of the Third Conference on Machine Translation: Shared Task Papers

Self Cite


We present the LMU Munich machine translation systems for the English-German language pair. We have built neural machine translation systems for both translation directions (English→German and German→English) and for two different domains (the biomedical domain and the news domain). The systems were used for our participation in the WMT18 biomedical translation task and in the shared task on machine translation of news. The main focus of our recent system development efforts has been on achieving improvements in the biomedical domain over last year's strong biomedical translation engine for English→German (Huck et al., 2017a). Considerable progress has been made in the latter task, which we report on in this paper.
