Combining Embeddings of Input Data for Text Classification

Parcheta, Zuzanna; Sanchis-Trilles, Germán; Casacuberta, Francisco; Rendahl, Robin

doi:10.1007/s11063-020-10312-w

Cited by 7 publications

(7 citation statements)

References 19 publications

“…However, semantic features used in the models rely on pre-trained word embeddings, which limits the effect of the model. Parcheta et al [25] studied the influence of embeddings extracted by combining different methods on text classification models.…”

Section: Deep Learning-based Methods (mentioning)

confidence: 99%

Bert-Enhanced Text Graph Neural Network for Classification

Yang, Cui (2021). Entropy.

Text classification is a fundamental research direction that aims to assign tags to text units. Recently, graph neural networks (GNNs) have shown useful properties for processing textual information, and pre-trained language models have achieved promising results in many tasks. However, many text processing methods either cannot model the structure of a single text unit or ignore its semantic features. To address these problems and make full use of both the structural and the semantic information in a text, we propose a Bert-Enhanced text Graph Neural Network model (BEGNN). For each text, we construct a separate text graph from the co-occurrence relationships between words and use a GNN to extract text features. In addition, we employ Bert to extract semantic features. The former captures structural information, while the latter focuses on semantic information. Finally, we let these two features of different granularity interact and aggregate them to obtain a more effective representation. Experiments on standard datasets demonstrate the effectiveness of BEGNN.
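The abstract above says a text graph is built per document from word co-occurrence. A minimal sketch of that idea follows, assuming a fixed sliding window and raw co-occurrence counts as edge weights. The function name `build_cooccurrence_graph`, the `window` parameter, and the weighting are illustrative assumptions and are not taken from the BEGNN paper.

```python
from collections import defaultdict


def build_cooccurrence_graph(tokens, window=3):
    """Return {(word_a, word_b): count} for words that share a sliding window.

    Hypothetical helper: edges are undirected (the pair is sorted) and
    weighted by how often the two words fall inside the same window.
    """
    edges = defaultdict(int)
    for i, word in enumerate(tokens):
        # Pair the current word with the next (window - 1) words.
        for j in range(i + 1, min(i + window, len(tokens))):
            other = tokens[j]
            if other == word:
                continue  # skip self-loops
            edges[tuple(sorted((word, other)))] += 1
    return dict(edges)


tokens = "graph neural networks model text as a graph".split()
graph = build_cooccurrence_graph(tokens, window=3)
for (u, v), weight in sorted(graph.items()):
    print(u, v, weight)
```

Each document gets its own small graph this way, whose nodes are the distinct words of that document; a GNN would then operate on it, which is outside the scope of this sketch.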

“…[18] Dynamically Gated Convolutional Neural Network. [25] Research on effect of different embedding technologies when they are used together.…”

Section: Related Researches (mentioning)

confidence: 99%

Bert-Enhanced Text Graph Neural Network for Classification

Yang, Cui (2021). Entropy.

“…This approach achieves better results than all reported results for this subtask. In regard to the first subtask, Parcheta et al [38] experimented using multiple text encoding techniques, such as byte pair encoding (BPE) [39], GloVe and BERT. To generate the BERT embeddings, they used a small multilingual model that was trained using 104 different languages.…”

Section: Germeval 2017 Results (mentioning)

confidence: 99%

Domain Adaptation of Transformer-Based Models Using Unlabeled Data for Relevance and Polarity Classification of German Customer Feedback

et al. 2023

Understanding customer feedback is becoming a necessity for companies to identify problems and improve their products and services. Text classification and sentiment analysis can play a major role in analyzing this data by using a variety of machine and deep learning approaches. In this work, different transformer-based models are utilized to explore how efficient these models are when working with a German customer feedback dataset. In addition, these pre-trained models are further analyzed to determine if adapting them to a specific domain using unlabeled data can yield better results than off-the-shelf pre-trained models. To evaluate the models, two downstream tasks from the GermEval 2017 are considered. The experimental results show that transformer-based models can reach significant improvements compared to a fastText baseline and outperform the published scores and previous models. For the subtask Relevance Classification, the best models achieve a micro-averaged F1-Score of 96.1 % on the first test set and 95.9 % on the second one, and a score of 85.1 % and 85.3 % for the subtask Polarity Classification.

“…RAKE (Rapid Automatic Keyword Extraction) is a common algorithm used across most applications in natural language processing; this algorithm uses a list of stop words and delimiters to extract relevant phrases and words from a target text [44]. It extracts keywords based on a scoring system which it implements using stop-lists.…”

Section: Data Preprocessing (mentioning)

confidence: 99%

Machine Learning Driven Mental Stress Detection on Reddit Posts Using Natural Language Processing

Inamdar, Chapekar, Gite, et al. (2023). Hum-Cent Intell Syst.

People’s mental conditions are often reflected in their social media activity due to the internet's anonymity. Psychiatric issues are often detected through such activities and can be addressed in their early stages, potentially preventing the consequences of unattended mental disorders like depression and anxiety. In this paper, the authors have implemented machine learning models and used various embedding techniques to classify posts from the famous social media blog site Reddit as stressful and non-stressful. The dataset used contains user posts that can be analyzed to detect patterns in the social media activity of those diagnosed with mental disorders. This paper uses different NLP (Natural Language Processing) tools such as ELMo (Embeddings from Language Models) word embeddings, BERT (Bidirectional Encoder Representations from Transformers) tokenizers, and BoW (Bag of Words) approach to create word/sentence data that can be fed to machine learning models. The results of each method have been discussed. The results achieved a top F1 score of 0.76, a Precision score of 0.71, and a Recall of 0.74 using only the preprocessed texts and machine learning algorithms to classify the posts. The results achieved by this paper are significant and have the potential to be applied in real-world scenarios to analyze mental stress among social media users. Although this paper focuses on data from Reddit, the techniques used can be transferred to similar social media platforms and could help solve the growing mental health crisis.
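The citation statement for this paper describes RAKE as splitting text at stop words and delimiters and then scoring the resulting phrases. A minimal sketch of that procedure follows, assuming the degree/frequency word score from the original RAKE formulation. The tiny `STOP_WORDS` set and the function name `rake_keywords` are illustrative assumptions, not the citing paper's code.

```python
import re
from collections import defaultdict

# Illustrative stop list only; real use needs a full stop-word list.
STOP_WORDS = {"a", "an", "and", "the", "of", "is", "in", "to", "for", "on"}


def rake_keywords(text, stop_words=STOP_WORDS):
    """Return (phrase, score) pairs, highest score first."""
    phrases = []
    # Punctuation acts as a phrase delimiter.
    for fragment in re.split(r"[.,;:!?()\n]+", text.lower()):
        current = []
        for word in re.findall(r"[a-z0-9'-]+", fragment):
            if word in stop_words:
                # A stop word closes the current candidate phrase.
                if current:
                    phrases.append(current)
                    current = []
            else:
                current.append(word)
        if current:
            phrases.append(current)

    freq = defaultdict(int)    # how often each word occurs
    degree = defaultdict(int)  # co-occurrences within phrases, self included
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            degree[word] += len(phrase)

    word_score = {w: degree[w] / freq[w] for w in freq}
    scored = {" ".join(p): sum(word_score[w] for w in p) for p in phrases}
    return sorted(scored.items(), key=lambda kv: (-kv[1], kv[0]))


sample = "Keyword extraction is the task of automatic keyword detection."
for phrase, score in rake_keywords(sample):
    print(f"{score:.1f}  {phrase}")
```

Longer phrases made of words that tend to appear in long phrases score highest, which is why multi-word candidates usually rank above single words.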
