2021
DOI: 10.14569/ijacsa.2021.0120124

Text Coherence Analysis based on Misspelling Oblivious Word Embeddings and Deep Neural Network

Abstract: Text coherence analysis is more challenging than other subfields of Natural Language Processing (NLP), such as text generation, translation, or text summarization. There are many text coherence methods in NLP; most are graph-based or entity-based methods designed for short text documents. For long text documents, however, existing methods yield low accuracy, which is the biggest challenge in text coherence analysis in both English and Bengali. This is because existing…

Cited by 12 publications (3 citation statements)
References 24 publications
“…Feature engineering of textual data is also known as vectorization, where words within a text document are encoded as binary, numeric, or floating-point vectors. In this study, Word2Vec [35], TF-IDF [36], and BERT [37] feature extraction methods were used on the textual datasets.…”
Section: Feature Engineering (mentioning)
confidence: 99%
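
To make the vectorization idea in the statement above concrete, here is a minimal sketch of TF-IDF weighting in plain Python. It is not the citing study's pipeline: the function name `tf_idf`, the whitespace tokenizer, the two sample documents, and the specific variant (term frequency as count divided by document length, inverse document frequency as log(N / df)) are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumed variant: tf = count / len(doc), idf = log(N / df).
    Library implementations often add smoothing and normalization.
    """
    # Naive tokenization: lowercase and split on whitespace (an assumption).
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical sample documents, not taken from any dataset in the source.
docs = ["the cat sat", "the dog barked"]
vectors = tf_idf(docs)
```

In this variant a term that occurs in every document (here "the") receives weight zero, while terms unique to one document receive a positive floating-point weight, which is the sense in which words become floating-point vectors.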
“…While CNNs and RNNs perform well with context-free features, textual content features with contextual information provide a better representation of words and yield better classification results. Recently, different language models have gained popularity in different NLP tasks [22][23][24]. A few studies have used embeddings from language model (ELMO), such as Bidirectional Encoder Representations from Transformers (BERT), that have outperformed several baseline methods in fake news detection [25,26].…”
Section: Introduction (mentioning)
confidence: 99%
“…This method is based on the study of latent semantic analysis (LSA), a method that compares units of textual information and determines their semantic relationship. In the following years, several coherence analysis methods were proposed by various researchers; however, no method has proved to be perfect [8].…”
Section: Introduction (mentioning)
confidence: 99%
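
The idea of comparing units of text to judge how related they are can be sketched as follows. This is a simplified stand-in and not LSA itself: LSA would first project the term vectors into a reduced space with a truncated singular value decomposition, whereas this sketch compares raw word-count vectors of adjacent sentences with cosine similarity. The function names and the sample sentences are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse {term: count} vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def adjacent_similarity(sentences):
    """Similarity of each sentence to the one that follows it.

    Uses raw bag-of-words counts (an assumption); LSA proper would
    compare vectors after dimensionality reduction instead.
    """
    bags = [Counter(s.lower().split()) for s in sentences]
    return [cosine(bags[i], bags[i + 1]) for i in range(len(bags) - 1)]


# Hypothetical sentences for illustration only.
sentences = [
    "the cat sat on the mat",
    "the cat slept",
    "stocks fell sharply",
]
scores = adjacent_similarity(sentences)
```

Under these assumptions, the first pair of sentences shares vocabulary and scores higher than the second pair, which shares none. A low score between neighbors is one crude signal of a break in coherence.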