Proceedings of the Workshop on Deep Learning Approaches for Low-Resource NLP 2018
DOI: 10.18653/v1/w18-3403

Multi-task learning for historical text normalization: Size matters

Abstract: Historical text normalization suffers from small datasets that exhibit high variance, and previous work has shown that multi-task learning can be used to leverage data from related problems in order to obtain more robust models. Previous work has been limited to datasets from a specific language and a specific historical period, however, and it is not clear whether the results generalize. It therefore remains an open question when historical text normalization benefits from multi-task learning. We explore the benefits of mu…
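One common way to realize the multi-task learning the abstract refers to is hard parameter sharing: a single character-level encoder is shared across the normalization task and related auxiliary tasks or datasets, while each task keeps its own output layer. The sketch below is illustrative only; the module choices, layer sizes, and two-head setup are assumptions, not details taken from the paper.

```python
# A minimal sketch of hard parameter sharing for character-level normalization.
# Not the authors' architecture; all names and sizes here are illustrative.
import torch
import torch.nn as nn

class SharedEncoderMTL(nn.Module):
    def __init__(self, vocab_size, emb_dim=128, hidden_dim=256, task_vocab_sizes=(100, 100)):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hidden_dim, batch_first=True)  # shared across all tasks
        # one output head per task (e.g. main normalization task plus an auxiliary task)
        self.heads = nn.ModuleList(nn.Linear(hidden_dim, v) for v in task_vocab_sizes)

    def forward(self, char_ids, task_id):
        states, _ = self.encoder(self.embed(char_ids))  # shared representations
        return self.heads[task_id](states)              # task-specific predictions
```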

Cited by 13 publications (8 citation statements) | References 6 publications (10 reference statements)
“…Data. We experiment on the ten datasets from Bollmann et al. (2018), which represent eight different languages: German (two datasets; Bollmann et al., 2017; Odebrecht et al., 2017); English, Hungarian, Icelandic, and Swedish (Pettersson, 2016); Slovene (two datasets; Ljubešić et al., 2016); and Spanish and Portuguese (Vaamonde, 2015). We treat the two datasets for German and Slovene as different languages.…”
Section: Historical Text Normalization (Norm) | mentioning
confidence: 99%
“…Both encoder and decoder have a single hidden layer. We use the default model in Open-NMT (Klein et al., 2017) as our implementation and employ the hyperparameters from Bollmann et al. (2018). In the original paper, early stopping is done by training for 50 epochs, and the best model regarding development accuracy is applied to the test set.…”
Section: Historical Text Normalization (Norm) | mentioning
confidence: 99%
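The model-selection scheme quoted in this excerpt (train for a fixed 50 epochs, keep the checkpoint with the best development accuracy, evaluate only that checkpoint on the test set) can be illustrated with a short sketch. Assumptions: a PyTorch-style model interface and user-supplied training and evaluation callables; this is not the OpenNMT implementation the citing authors used.

```python
# A minimal sketch of epoch-based training with best-dev checkpoint selection.
# `train_one_epoch` and `evaluate_accuracy` are user-supplied callables.
import copy

def train_with_dev_selection(model, train_one_epoch, evaluate_accuracy,
                             train_data, dev_data, test_data, n_epochs=50):
    best_dev_acc = float("-inf")
    best_state = None
    for epoch in range(n_epochs):
        train_one_epoch(model, train_data)               # one pass over the training data
        dev_acc = evaluate_accuracy(model, dev_data)     # e.g. word-level normalization accuracy
        if dev_acc > best_dev_acc:                       # keep the best checkpoint so far
            best_dev_acc = dev_acc
            best_state = copy.deepcopy(model.state_dict())
    model.load_state_dict(best_state)                    # restore the best-dev model
    return evaluate_accuracy(model, test_data)           # evaluate it once on the test set
```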
“…One active research question, however, is what information in specific tasks should be shared, as well as what indicators can be used to predetermine the cost-benefit trade-offs of MTL for a given application. Findings have shown that label distributions (Martínez Alonso and Plank, 2017), data sizes (Bollmann et al., 2018) and single-task loss curves (Bingel and Søgaard, 2017) have all been respective indicators for MTL performance. Different tasks, data sizes, and settings can all show different relative performance gains (Adouane and Bernardy, 2020).…”
Section: Introduction | mentioning
confidence: 99%
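The indicators listed in this excerpt (label distributions, data sizes, single-task loss curves) can be computed per task before committing to multi-task training. A rough sketch follows; the entropy measure and the loss-curve statistic are simplifications chosen for illustration, not the exact features used in the cited papers.

```python
# Rough per-task indicators of whether MTL is likely to help, for illustration only.
import math
from collections import Counter

def label_entropy(labels):
    """Shannon entropy of the task's label distribution."""
    counts = Counter(labels)
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def loss_curve_slope(dev_losses):
    """Crude slope of the second half of a single-task dev loss curve."""
    tail = dev_losses[len(dev_losses) // 2:]
    return (tail[-1] - tail[0]) / max(len(tail) - 1, 1)

def mtl_candidate_report(labels, dev_losses, n_train_examples):
    return {
        "label_entropy": label_entropy(labels),
        "loss_slope": loss_curve_slope(dev_losses),
        "train_size": n_train_examples,   # dataset size, the factor stressed by the paper's title
    }
```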