“…This generalization capability is further improved with various tuning methods, such as instruction tuning (Sanh et al., 2022; Wei et al., 2022a; Chung et al., 2022; Muennighoff et al., 2022). However, LLMs and their instruction-tuned variants struggle to generalize across languages, leading to a disparity in performance (Xue et al., 2021; Gehrmann et al., 2022; Scao et al., 2022; Chowdhery et al., 2022; Yong et al., 2023; Zhang et al., 2023; Asai et al., 2023; Kabra et al., 2023). Moreover, these models have limited language coverage, concentrated mostly in the Indo-European language family, as indicated in Figure 1.…”