Mohamed Amine Menacer scite author profile

Mohamed Amine Menacer

5 Publications

52 Citation Statements Received

47 Citation Statements Given

Affiliations

Lorraine Research Laboratory in Computer Science and its Applications, Université de Lorraine

Publications

CALYOU: A Comparable Spoken Algerian Corpus Harvested from YouTube

Abidi, Menacer, Smaïli (2017)

This paper addresses the comparability of comments extracted from YouTube. The comments are in spoken Algerian, which may be local Arabic, Modern Standard Arabic, or French. This diversity of expression gives rise to many data-processing problems. In this article, several alignment methods are proposed and tested. The method that aligns best is a Word2Vec-based approach applied iteratively. This repeated application of Word2Vec significantly improves the comparability results: a dictionary-based approach leads to a Recall of 4, while our approach achieves a Recall of 33 at rank 1. Thanks to this approach, we built CALYOU, a comparable corpus of spoken Algerian, from YouTube.
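To illustrate the metric reported in this abstract: recall at rank k is commonly read as the fraction of source items whose true counterpart appears among the top k ranked candidates. The following is a minimal sketch under that assumption. The function name, the dictionary layout, and the toy identifiers are hypothetical and are not taken from the CALYOU paper.

```python
def recall_at_k(ranked_candidates, gold, k=1):
    """Fraction of source items whose gold match is in the top-k candidates.

    ranked_candidates: dict mapping a source id to a list of candidate ids,
        best candidate first (hypothetical layout, for illustration only).
    gold: dict mapping a source id to its single correct candidate id.
    """
    if not gold:
        raise ValueError("gold must contain at least one pair")
    hits = sum(
        1
        for source, target in gold.items()
        if target in ranked_candidates.get(source, [])[:k]
    )
    return hits / len(gold)


# Toy data: three source comments, each with two ranked candidates.
ranked = {
    "c1": ["d2", "d1"],
    "c2": ["d2", "d3"],
    "c3": ["d1", "d3"],
}
gold = {"c1": "d1", "c2": "d2", "c3": "d3"}

print(recall_at_k(ranked, gold, k=1))
print(recall_at_k(ranked, gold, k=2))
```

With this toy data only "c2" has its correct match ranked first, while all three matches appear within the top two candidates.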

Machine Translation on a Parallel Code-Switched Corpus

Menacer, Langlois, Jouvet, et al. (2019)

Code-switching (CS) is the phenomenon that occurs when a speaker alternates between two or more languages within an utterance or discourse. In this work, we investigate the existence of code-switching in formal text, namely the proceedings of multilingual institutions. Our study is carried out on Arabic-English code-mixing in a parallel corpus extracted from official documents of the United Nations. We build a parallel code-switched corpus with two reference translations, one in pure Arabic and the other in pure English. We also carry out a human evaluation of this resource, with the aim of using it to evaluate the translation of code-switched documents. To the best of our knowledge, no corpus of this kind exists, which makes the one we propose unique. This paper examines several methods for translating a code-switched corpus: conventional statistical machine translation, end-to-end neural machine translation, and multitask learning.

Development of the Arabic Loria Automatic Speech Recognition system (ALASR) and its evaluation for Algerian dialect

Menacer, Mella, Fohr, et al. (2017)

Procedia Computer Science

An enhanced automatic speech recognition system for Arabic

Menacer, Mella, Fohr, et al. (2017)

Automatic speech recognition for Arabic is a very challenging task. Although the classical techniques for Automatic Speech Recognition (ASR) can be applied efficiently to Arabic speech recognition, it is essential to take the specificities of the language into account to improve system performance. In this article, we focus on Modern Standard Arabic (MSA) speech recognition. We introduce the challenges related to the Arabic language, namely its complex morphology and the absence of short vowels in written text, which leads to several potential, often conflicting, vowelizations for each grapheme. We develop an ASR system for MSA using the Kaldi toolkit. Several acoustic and language models are trained. We obtain a Word Error Rate (WER) of 14.42 for the baseline system and a relative improvement of 12.2 by rescoring the lattice and by rewriting the output with the correct hamza above or below Alif.
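For readers unfamiliar with the metric in this abstract, WER is conventionally the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the recognizer output, divided by the number of reference words. The sketch below assumes that standard definition and whitespace tokenization. It is not the Kaldi scoring pipeline used by the authors, and the function names are hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words
    # processed so far (previous row) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,         # deletion of a reference word
                    curr[j - 1] + 1,     # insertion of a hypothesis word
                    prev[j - 1] + cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_improvement(baseline, improved):
    """Relative reduction of an error rate with respect to the baseline."""
    return (baseline - improved) / baseline


print(word_error_rate("the cat sat", "the dog sat"))
print(relative_improvement(20.0, 15.0))
```

The relative improvement helper shows how a relative figure relates two error rates. The inputs in the example are made-up values, not results from the paper.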

An Integrated AMIS Prototype for Automated Summarization and Translation of Newscasts and Reports

Grega, Smaïli, Leszczuk, et al. (2018)
