2022
DOI: 10.48550/arxiv.2204.06251
Preprint

Experimental Standards for Deep Learning in Natural Language Processing Research

Abstract: The field of Deep Learning (DL) has undergone explosive growth during the last decade, with a substantial impact on Natural Language Processing (NLP) as well. Yet, as with other fields employing DL techniques, there has been a lack of common experimental standards compared to more established disciplines. Starting from fundamental scientific principles, we distill ongoing discussions on experimental standards in DL into a single, widely-applicable methodology. Following these best practices is crucial to stren…

Citations: cited by 2 publications (3 citation statements)
References: 55 publications
“…In our opinion, there are some fundamental steps that need to be made by researchers in order to improve experimental analysis and, ultimately, make progress in science. In this sense, we share the ideas discussed by [36], and we believe that there are some necessary best practices regarding choice of data, source code, models, experimental setting, and analysis that should be documented in any research paper that presents experimental results.…”
Section: How Does This Study Reflect On Current Knowledge About Issue... (mentioning)
confidence: 80%
“…We believe that only a better documentation, both in the research paper and in the source code, could overcome most (if not all) the issues we encountered in this reproducibility study. The best practices suggested by [36] are one of the best starting points for a checklist of all the things a researcher should take into account "before" any experimental analysis. In order to mitigate issues like the ones related to the stopping criterion (an approach that is described in the paper but that is missing in the code), we believe that only a more accurate check on all the steps is the solution.…”
Section: What Are the Problems And Challenges Encountered? (mentioning)
confidence: 99%
“…Weights & Biases (Biewald, 2020) was used to track and manage hyperparameter searches and experiments. In general, we follow many of the experimental guidelines and suggestions laid out by Ulmer et al (2022a).…”
Section: B Calibration Metrics (mentioning)
confidence: 99%