Hierarchical Pre-training for Sequence Labelling in Spoken Dialog

Chapuis, Emile; Colombo, Pierre; Manica, Matteo; Labeau, Matthieu; Clavel, Chloé

doi:10.18653/v1/2020.findings-emnlp.239

Cited by 20 publications

(13 citation statements)

References 49 publications


“…This work paves the way for using and developing new alternative methods to improve the learning (e.g new estimator of mutual information (Colombo et al, 2021a), Wasserstein Barycenters (Colombo et al, 2021b), Data Depths (Staerman et al, 2021), Extreme Value Theory (Jalalzai et al, 2020)). A future line of research involves using this methods for emotion (Colombo et al, 2019; Witon et al, 2018) and dialog act (Chapuis et al, 2021, 2020a) classification with pretrained model tailored for spoken language (Dinkar et al, 2020).…”

Section: Discussion (mentioning)

confidence: 99%

Improving Multimodal fusion via Mutual Dependency Maximisation

Colombo, Chapuis, Labeau et al., 2021

Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Self Cite


Multimodal sentiment analysis is a trending area of research, and multimodal fusion is one of its most active topics. Acknowledging that humans communicate through a variety of channels (i.e., visual, acoustic, linguistic), multimodal systems aim at integrating different unimodal representations into a synthetic one. So far, considerable effort has been made on developing complex architectures allowing the fusion of these modalities. However, such systems are mainly trained by minimising simple losses such as L1 or cross-entropy. In this work, we investigate unexplored penalties and propose a set of new objectives that measure the dependency between modalities. We demonstrate that our new penalties lead to a consistent improvement (up to 4.3 on accuracy) across a large variety of state-of-the-art models on two well-known sentiment analysis datasets: CMU-MOSI and CMU-MOSEI. Our method not only achieves a new SOTA on both datasets but also produces representations that are more robust to modality drops. Finally, a by-product of our methods includes a statistical network which can be used to interpret the high dimensional representations learnt by the model.


“…In addition, we use the TweetEval Dataset Xiong et al [2019] for tweet classification on sentiment (59,899), hate detection (12,970) and emotion recognition (5,052). Furthermore, the SILICONE Dataset Chapuis et al [2020] is also used for the tasks of emotion detection (Semaine 13,708) and utterance sentiment analysis (Meld-S 5,627).…”

Section: Datasets (mentioning)

confidence: 99%

XAI for Transformers: Better Explanations through Conservative Propagation

Ameen, Schnake, Eberle et al., 2022

Preprint


Transformers have become an important workhorse of machine learning, with numerous applications. This necessitates the development of reliable methods for increasing their transparency. Multiple interpretability methods, often based on gradient information, have been proposed. We show that the gradient in a Transformer reflects the function only locally, and thus fails to reliably identify the contribution of input features to the prediction. We identify Attention Heads and LayerNorm as main reasons for such unreliable explanations and propose a more stable way for propagation through these layers. Our proposal, which can be seen as a proper extension of the well-established LRP method to Transformers, is shown both theoretically and empirically to overcome the deficiency of a simple gradient-based approach, and achieves state-of-the-art explanation performance on a broad range of Transformer models and datasets.


“…Among available contrast measures, the Fisher-Rao distance is parameter-free and thus, it is easy to use in practice while the AB-Divergence achieves better results but requires to select α and β. Future work includes extending our metrics to new tasks such as SLU (Chapuis et al 2020, 2021; Dinkar et al 2020; Colombo, Clavel, and Piantanida 2021), controlled sentence generation (Colombo et al 2019, 2021b) and multi-modal learning (Colombo et al 2021a; Garcia et al 2019).…”

Section: Summary and Concluding Remarks (mentioning)

confidence: 99%

InfoLM: A New Metric to Evaluate Summarization & Data2Text Generation

Colombo, Clavel, Piantanida, 2022

AAAI

Self Cite


Assessing the quality of natural language generation (NLG) systems through human annotation is very expensive. Additionally, human annotation campaigns are time-consuming and include non-reusable human labour. In practice, researchers rely on automatic metrics as a proxy of quality. In the last decade, many string-based metrics (e.g., BLEU or ROUGE) have been introduced. However, such metrics usually rely on exact matches and thus do not robustly handle synonyms. In this paper, we introduce InfoLM, a family of untrained metrics that can be viewed as a string-based metric that addresses the aforementioned flaws thanks to a pre-trained masked language model. This family of metrics also makes use of information measures, allowing InfoLM to be adapted to different evaluation criteria. Using direct assessment, we demonstrate that InfoLM achieves statistically significant improvement and two figure correlation gains in many configurations compared to existing metrics on both summarization and data2text generation tasks.
