2022
DOI: 10.1007/s13369-022-06588-w
Improving Neural Machine Translation for Low Resource Algerian Dialect by Transductive Transfer Learning Strategy

Abstract: This study is the first work on a transductive transfer learning approach for low-resource neural machine translation applied to the Algerian Arabic dialect. The transductive approach is based on a fine-tuning transfer learning strategy that transfers knowledge from the parent model to the child model. This strategy helps to solve the learning problem using limited parallel corpora. We tested the approach on a sequence-to-sequence model with and without the attention mechanism. We first trained the models on a…
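The fine-tuning transfer strategy the abstract describes (train a parent model on a high-resource language pair, then continue training the same parameters on the scarce dialect data) can be illustrated with a minimal sketch. This is a toy linear model, not the paper's actual sequence-to-sequence architecture; the data shapes, learning rate, and training routine are all illustrative stand-ins.

```python
import numpy as np

def train(weights, X, Y, lr=0.1, epochs=200):
    """Gradient descent on a linear map W (a stand-in for NMT training)."""
    W = weights.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ W - Y) / len(X)
        W -= lr * grad
    return W

rng = np.random.default_rng(0)
W_true = rng.normal(size=(4, 4))

# Parent task: plenty of parallel data (the high-resource pair).
X_parent = rng.normal(size=(500, 4))
Y_parent = X_parent @ W_true

# Child task: a closely related mapping with very little data (the dialect).
W_child_true = W_true + 0.1 * rng.normal(size=(4, 4))
X_child = rng.normal(size=(8, 4))
Y_child = X_child @ W_child_true

# Transfer: initialize the child model with the parent's weights, then fine-tune.
W_parent = train(np.zeros((4, 4)), X_parent, Y_parent)
W_transfer = train(W_parent, X_child, Y_child, epochs=50)
# Baseline: train the child model from scratch on the same limited data.
W_scratch = train(np.zeros((4, 4)), X_child, Y_child, epochs=50)

X_test = rng.normal(size=(100, 4))
Y_test = X_test @ W_child_true

def err(W):
    return float(np.mean((X_test @ W - Y_test) ** 2))

print("fine-tuned:", err(W_transfer), "from scratch:", err(W_scratch))
```

Because the fine-tuned model starts close to the child task's solution, it reaches a lower test error than the from-scratch baseline under the same small training budget, which is the effect the transductive strategy exploits.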

Cited by 8 publications (3 citation statements)
References 19 publications
“…Overall, the multi-perspective feature integration strategy in machine translation tasks aims to enable the model to capture and flexibly utilize various linguistic features more acutely by deeply analyzing the multi-level and multi-dimensional features of the source language and seamlessly connecting them to the encoding and decoding processes of the Transformer architecture, especially when dealing with complex translation situations or facing low-resource linguistic data, this strategy can significantly enhance the model's translation accuracy, robustness and generalization ability [28].…”
Section: B. Application of Multi-View Information in the Encoding and ...
Citation type: mentioning
Confidence: 99%
“…It typically consists of two temporally concatenated LSTM networks, one for encoding the input past sequence to a state vector and another for decoding the output future sequence from this vector; thus, the temporal dependencies in both the input and the output sequences are considered. The seq2seq structure has been widely used in language translation, speech recognition, and text generation fields. Unfortunately, no previous study was reported on the application of seq2seq to HAB forecasting.…”
Section: Introduction
Citation type: mentioning
Confidence: 99%
“…The existing multilingual models are highly limited in scope, as they do not concentrate on Arabic, let alone dialects. The majority of these models are primarily trained on MSA, which exhibits substantial disparities in morphological, syntactic, and other linguistic aspects compared to the Moroccan dialect [8]. Meanwhile, multidialectal models lack the specificity to accurately represent the Moroccan dialect and often result in the loss of dialect-specific features.…”
Section: Introduction
Citation type: mentioning
Confidence: 99%