2022
DOI: 10.48550/arxiv.2207.09152
Preprint

Benchmarking Transformers-based models on French Spoken Language Understanding tasks

Oralie Cattan,
Sahar Ghannay,
Christophe Servan
et al.

Abstract: In the last five years, the rise of self-attentional Transformer-based architectures has led to state-of-the-art performance on many natural language processing tasks. Although these approaches are increasingly popular, they require large amounts of data and computational resources. There is still a substantial need for benchmarking methodologies on under-resourced languages in data-scarce application conditions. Most pre-trained language models have been massively studied using the English language, and only a …

Cited by 1 publication (1 citation statement)
References 16 publications
“…Pre-training Compute and CO2 Impact: Our model was trained for 8 days on 6 A40 GPUs, compared to CamemBERT which was trained on 256 V100 GPUs for one day, which is roughly equivalent to 28 days of training on 6 A40 GPUs, since an NVIDIA A40 GPU is about 1.5x faster than a V100 GPU on language modeling tasks according to recent benchmarks. Following the reports by Luccioni et al. (2022) and Cattan et al. (2022) on the environmental impact of language model training, we use Lannelongue et al.'s (2021) online carbon footprint calculator to provide the following estimates: CamemBERTa's pre-training used 700 kWh and emitted 36 kg CO2, compared to 3.32 MWh and 170 kg for CamemBERT.…”
Section: Pre-training Dataset Choice
Citation type: mentioning (confidence: 99%)
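The compute comparison in the quoted statement is plain arithmetic, and it checks out. Below is a minimal Python sketch that reproduces it; the GPU counts, days, energy, and emission figures are taken directly from the quote, while the 1.5x A40-vs-V100 speedup is the benchmark ratio the citing authors assume, not a value from this page.

# Sanity-check of the compute-equivalence arithmetic in the citation statement.
# All figures come from the quote; A40_SPEEDUP is the citing authors' assumption.

V100_GPUS, V100_DAYS = 256, 1   # CamemBERT pre-training budget
A40_GPUS, A40_DAYS = 6, 8       # CamemBERTa pre-training budget
A40_SPEEDUP = 1.5               # assumed A40/V100 throughput ratio

# Express CamemBERT's budget in A40 GPU-days, then spread it over 6 GPUs.
camembert_on_6_a40 = (V100_GPUS * V100_DAYS / A40_SPEEDUP) / A40_GPUS
print(f"CamemBERT equivalent: {camembert_on_6_a40:.1f} days on 6 A40s")  # ~28.4

# Carbon intensity implied by each quoted energy/emission pair (g CO2 per kWh).
for name, kwh, kg_co2 in [("CamemBERTa", 700, 36), ("CamemBERT", 3320, 170)]:
    print(f"{name}: {1000 * kg_co2 / kwh:.0f} g CO2/kWh")

Both quoted energy/emission pairs imply roughly 51 g CO2 per kWh, so the two estimates are internally consistent and in line with a low-carbon electricity grid such as France's.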