Proceedings of the 28th International Conference on Computational Linguistics 2020
DOI: 10.18653/v1/2020.coling-main.51

Arabizi Language Models for Sentiment Analysis

Abstract: Arabizi is a written form of spoken Arabic, relying on Latin characters and digits. It is informal and does not follow any conventional rules, raising many NLP challenges. In particular, Arabizi has recently emerged as the Arabic language in online social networks, becoming of great interest for opinion mining and sentiment analysis. Unfortunately, only a few Arabizi resources exist, and state-of-the-art language models such as BERT do not consider Arabizi. In this work, we construct and release two datasets: (i) …
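
As a rough illustration of the task the paper addresses, the sketch below fine-tunes a generic BERT-style encoder for binary sentiment classification of Arabizi text using the Hugging Face transformers library. The checkpoint name and the toy Arabizi examples are stand-ins chosen for illustration only; the paper's own pre-trained Arabizi models are not assumed to be available under any identifier used here.

    # Minimal sketch: fine-tuning a BERT-style encoder on Arabizi sentiment labels.
    # "bert-base-multilingual-cased" is a generic stand-in, NOT the paper's model.
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    MODEL_NAME = "bert-base-multilingual-cased"
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)

    # Hypothetical Arabizi examples: Latin letters and digits standing in for Arabic sounds.
    texts = ["film ra2e3 bgad", "5edma seya, mch mabsout"]  # roughly: "really great movie" / "bad service, not happy"
    labels = torch.tensor([1, 0])  # 1 = positive, 0 = negative

    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

    model.train()
    loss = model(**batch, labels=labels).loss  # cross-entropy over the two labels
    loss.backward()
    optimizer.step()

In practice one would iterate this step over a labeled Arabizi corpus such as the datasets the paper releases; the single gradient step above only shows the shape of the training loop.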

Cited by 4 publications (2 citation statements)
References 30 publications (24 reference statements)
“…Regarding the specific case of Arabic dialects written in Arabizi, a recent BERT-based model has been pre-trained on 7 million Egyptian tweets and displayed effective results on a sentiment analysis task (Baert et al., 2020). Another very recent model, at the time of writing, was pre-trained on 4 million Algerian tweets and also demonstrated interesting results on sentiment analysis (Abdaoui et al., 2021).…”
Section: Discussion
confidence: 99%
“…Pre-trained language models (PLMs) have significantly advanced the state of the art on various natural language processing tasks, such as sentiment analysis (Bataa and Wu, 2019; Baert et al., 2020), text classification (Sun et al., 2019a; Arslan et al., 2021), and question answering (Yang et al., 2019). Despite the remarkable results, PLMs have a large number of parameters, which makes them expensive to deploy (Yang et al., 2019).…”
Section: Introduction
confidence: 99%
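
The deployment-cost point in the quote above is easy to make concrete: counting the trainable parameters of a standard BERT-base encoder shows the scale involved. The checkpoint below is a generic stand-in, not a model from the cited papers.

    # Count the trainable parameters of a standard BERT-base encoder.
    # "bert-base-uncased" is a generic stand-in, not a model from the cited papers.
    from transformers import AutoModel

    model = AutoModel.from_pretrained("bert-base-uncased")
    n_params = sum(p.numel() for p in model.parameters())
    print(f"{n_params / 1e6:.0f}M parameters")  # roughly 110M for BERT base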