2021
DOI: 10.3390/app11156896
Creating Welsh Language Word Embeddings

Abstract: Word embeddings are representations of words in a vector space that models semantic relationships between words by means of distance and direction. In this study, we adapted two existing methods, word2vec and fastText, to automatically learn Welsh word embeddings taking into account syntactic and morphological idiosyncrasies of this language. These methods exploit the principles of distributional semantics and, therefore, require a large corpus to be trained on. However, Welsh is a minoritised language, hence …
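A minimal sketch of the kind of subword-aware training the abstract describes, using gensim's FastText implementation; the corpus file, hyperparameters, and probe word are illustrative assumptions rather than the authors' published configuration.

```python
# Hedged sketch: train subword-aware Welsh embeddings with gensim's FastText.
# "welsh_corpus.txt" (one tokenised sentence per line) is a hypothetical file.
from gensim.models import FastText
from gensim.models.word2vec import LineSentence

sentences = LineSentence("welsh_corpus.txt")

model = FastText(
    sentences,
    vector_size=300,   # embedding dimensionality
    window=5,          # context window size
    min_count=5,       # drop very rare tokens
    sg=1,              # skip-gram objective
    min_n=3,           # minimum character n-gram length
    max_n=6,           # maximum character n-gram length
    epochs=10,
)

# Character n-grams let the model compose vectors for inflected or mutated
# forms it never saw in training, which matters for Welsh morphology.
print(model.wv.most_similar("iaith", topn=5))  # "iaith" = "language"
```

The skip-gram objective is often preferred over CBOW when training data is limited, which is relevant when the training corpus comes from a minoritised language.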

Cited by 4 publications (3 citation statements)
References 21 publications
“…Existing research has focused primarily on using digital language resources for second- and foreign-language learning among students and children [1,28,29]. Very few studies have focused on native and, in particular, minority languages and their e-learning and analysis; the Welsh language is one example [34]. The characteristics of native-language usage compared with second-language usage were examined only within the context of Slovene speakers' preferences for communication in online activities during second-language learning [28].…”
Section: Literature Review
Citation type: mentioning (confidence: 99%)
“…Each item of the set is modelled as a finite mixture over an underlying set of topic probabilities (Sathi & Ramanujapura, 2016). To derive an accurate categorization of user feedback, the authors used the document-embedding widget before topic modelling to obtain an embedding for each n-gram, employing the pre-trained fastText model for English and obtaining one vector per document (Alghamdi & Alfalqi, 2015; Corcoran et al., 2021). The LDAvis and multidimensional scaling (MDS) tools were used to evaluate the topics that emerged from the LDA.…”
Section: Thematic Analysis Of Positive and Negative Reviews
Citation type: mentioning (confidence: 99%)
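To make the two steps in this excerpt concrete, the sketch below derives one pre-trained fastText vector per document and then fits an LDA model whose topics are rendered with pyLDAvis, whose inter-topic map uses multidimensional scaling. The toy reviews, model path, and topic count are assumptions for illustration, not the cited study's setup.

```python
# Hedged sketch: one fastText vector per document, then LDA inspected with
# pyLDAvis. Toy data and hyperparameters are illustrative assumptions.
import fasttext.util
import gensim
import pyLDAvis.gensim_models

docs = [
    "the app keeps crashing after the latest update",
    "great interface and very easy to navigate",
    "login fails whenever the network is slow",
    "love the new dark mode and the clean layout",
]

# Step 1: one dense vector per document from the pre-trained English model
# (cc.en.300.bin; note this is a large download).
fasttext.util.download_model("en", if_exists="ignore")
ft = fasttext.load_model("cc.en.300.bin")
doc_vectors = [ft.get_sentence_vector(d) for d in docs]

# Step 2: LDA topic model over the same documents; pyLDAvis lays the topics
# out with multidimensional scaling for visual evaluation.
tokens = [d.split() for d in docs]
dictionary = gensim.corpora.Dictionary(tokens)
bow = [dictionary.doc2bow(t) for t in tokens]
lda = gensim.models.LdaModel(bow, num_topics=2, id2word=dictionary, passes=10)
vis = pyLDAvis.gensim_models.prepare(lda, bow, dictionary)
pyLDAvis.save_html(vis, "lda_topics.html")
```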
“…In tasks like predicting lexical complexity, leveraging Transformer models in conjunction with various traditional linguistic features has proven effective in enhancing the performance of deep learning systems [10]. Similarly, accommodating syntactic and morphological peculiarities is crucial, especially for languages like Welsh, a minority language, which necessitates adaptations to existing word embedding methods for optimal results [11]. Hence, the imperative lies in crafting specific word embedding methodologies tailored to the nuances of particular texts or tasks, recognizing the intricate interplay between linguistic structure and neural representations.…”
Section: Introduction
Citation type: mentioning (confidence: 99%)
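The morphological point in this excerpt can be made concrete with fastText's subword mechanism: a vector for an unseen mutated or inflected Welsh form is composed from its character n-grams. The toy corpus and word pair below are illustrative assumptions only.

```python
# Hedged sketch: fastText composes vectors for out-of-vocabulary forms from
# character n-grams, which suits Welsh mutations (e.g. "cymraeg" -> "gymraeg").
from gensim.models import FastText

# Tiny toy corpus; a real model would be trained on a large Welsh corpus.
corpus = [
    ["mae", "hi", "yn", "siarad", "cymraeg"],
    ["rydw", "i", "yn", "dysgu", "cymraeg"],
]
model = FastText(corpus, vector_size=50, min_count=1, min_n=3, max_n=5, epochs=50)

# "gymraeg" (soft mutation of "cymraeg") never occurs in the corpus, yet its
# overlapping n-grams still yield a usable vector.
print("gymraeg" in model.wv.key_to_index)         # False: out of vocabulary
print(model.wv.similarity("cymraeg", "gymraeg"))  # computable via shared n-grams
```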