“…, wn), WSD is the task of assigning the appropriate sense to a polysemous word in T, that is, identifying a mapping A from words to senses such that A(i) ⊆ SensesD(wi), where SensesD(wi) is the set of senses encoded in a knowledge source D for word wi, and A(i) is the subset of the senses of wi that are appropriate in the context T. The mapping A can assign more than one sense to each word wi ∈ T; however, only the most appropriate sense is selected, that is, |A(i)| = 1.” A knowledge source can be in various lexical formats.…
Section: A Task Definition
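The definition quoted above can be sketched directly in code: a scoring function rates each candidate sense of a word in its context, and the constraint |A(i)| = 1 corresponds to keeping only the highest-scoring sense. This is a minimal illustration, not any particular system; the sense inventory, glosses, and scorer below are all hypothetical.

```python
# Sketch of the WSD mapping A: pick the single best sense per word,
# so that |A(i)| = 1 for every word that has candidate senses.
# senses_d plays the role of the knowledge source D; the scoring
# function is a stand-in for any disambiguation model.

def disambiguate(context, senses_d, score):
    """Return a mapping A: word index -> single best sense."""
    mapping = {}
    for i, word in enumerate(context):
        candidates = senses_d.get(word, [])
        if candidates:  # only words known to the inventory get a sense
            mapping[i] = max(candidates, key=lambda s: score(s, context))
    return mapping

# Toy inventory and scorer: prefer the sense whose gloss shares
# the most words with the surrounding context.
senses = {"bank": ["bank%financial", "bank%river"]}
glosses = {
    "bank%financial": {"money", "deposit"},
    "bank%river": {"river", "slope"},
}

def overlap_score(sense, context):
    return len(glosses[sense] & set(context))

print(disambiguate(["deposit", "money", "bank"], senses, overlap_score))
# -> {2: 'bank%financial'}
```

Any real system differs only in the scorer: gloss-overlap counting gives a Lesk-style method, while a neural model would rank senses by embedding similarity instead.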
“…This linguistic characteristic is referred to as polysemy. Polysemy poses challenges in natural language processing (NLP) applications such as machine translation [1], information retrieval [2], and text summarization [3], where accurately determining the intended meaning of a polysemous word from its context is important. Resolving polysemy is crucial for improving the accuracy and precision of these applications.…”
Word Sense Disambiguation (WSD) serves as an intermediate task for enhancing text understanding in Natural Language Processing (NLP) applications, including machine translation, information retrieval, and text summarization. Its role is to improve the effectiveness and efficiency of these applications by ensuring that the appropriate sense of a polysemous word is selected in each context. WSD is recognized as an AI-complete problem and has been studied since the 1950s. One of the earliest proposed solutions to polysemy in NLP is the Lesk algorithm, which researchers have adapted to many languages over the years. This study proposes a simplified, Lesk-based algorithm to resolve polysemy for Setswana. Instead of the combinatorial comparison of candidate senses on which Lesk is based, which incurs high computational cost, this study models word sense glosses using Bidirectional Encoder Representations from Transformers (BERT) and a cosine similarity measure, a combination that has been shown to perform well in WSD. The proposed algorithm was evaluated on Setswana and obtained an accuracy of 86.66% and an error rate of 14.34%, surpassing the accuracy of Lesk-based algorithms reported for other languages.
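The gloss-similarity step described in the abstract can be illustrated as follows. This is a minimal sketch under stated assumptions: a toy bag-of-words vectorizer stands in for the BERT sentence encoder, and the candidate sense whose gloss vector is most cosine-similar to the context vector wins. The sense labels and glosses are invented for illustration.

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words embedding; a stand-in for a BERT sentence encoder."""
    return Counter(text.lower().split())

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[w] * v[w] for w in u)
    norm = (math.sqrt(sum(c * c for c in u.values()))
            * math.sqrt(sum(c * c for c in v.values())))
    return dot / norm if norm else 0.0

def best_sense(context, gloss_by_sense):
    """Pick the sense whose gloss is most similar to the context."""
    ctx_vec = embed(context)
    return max(gloss_by_sense,
               key=lambda s: cosine(ctx_vec, embed(gloss_by_sense[s])))

# Illustrative glosses for the ambiguous English word "bank".
glosses = {
    "bank_1": "a financial institution that accepts deposits of money",
    "bank_2": "the sloping land beside a body of water such as a river",
}
print(best_sense("she walked along the river to the water", glosses))
# -> bank_2
```

Note the contrast with classic Lesk: each candidate gloss is compared once against the context rather than pairwise against every other candidate's gloss, which is the source of the complexity reduction the abstract describes.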
“…They depended on rigid structures and lacked fluency (Oliver, 2020, p. 125), and were found incapable of handling complex rhetorical devices, ambiguity, metaphors, and other creative language (Hasselberger, 2021). However, statistical machine translation (SMT) improved outcomes by training on vast volumes of parallel text corpora (Saxena et al., 2022). SMT modeled probabilistic mappings between source and target languages (Sharma & Singh, 2021).…”
This paper examines how well ChatGPT and DeepL, two AI tools, translate literary works. ChatGPT can translate text in addition to performing other tasks, while DeepL is a machine-translation service built on neural networks. The paper compares how ChatGPT and DeepL translate books, poems, and dialogues against translations produced by humans. It also discusses the advantages and drawbacks of using machine translation for literary purposes, including issues of creativity, style, and cultural adaptation. Drawing on both recent and earlier studies of machine translation technologies and their interplay with human translation, the paper concludes that ChatGPT and DeepL are useful but imperfect tools for literary translation that require human review and refinement. It contributes to machine translation and natural language processing by examining how two cutting-edge AI tools, ChatGPT and DeepL, can be applied to literary translation, and to literary studies and the digital humanities by exploring what machine translation can and cannot do for creative writing and dialogue systems. The paper aims to encourage researchers, translators, writers, and users from different fields to collaborate and communicate with one another.