2022
DOI: 10.3390/app12105119

Systematic Comparison of Vectorization Methods in Classification Context

Abstract: Natural language processing has been the subject of numerous studies in the last decade. These have focused on the various stages of text processing, from text preparation to vectorization to final text comprehension. The goal of vector space modeling is to project words in a language corpus into a vector space in such a way that words that are similar in meaning are close to each other. Currently, there are two commonly used approaches to the topic of vectorization. The first focuses on creating word vectors …

Cited by 16 publications (8 citation statements)
References 14 publications
“…The count vectorizer is built unconditionally on the number of word events in the text. The count vectorization technique performs both the counting of how many times a token occurs and the process of tokenization [17]. The count vectorizer is based on various parameters that are used to purify the type of features.…”
Section: Feature Extraction 331 Count Vectorizer (mentioning)
confidence: 99%
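
The two operations named in the statement above, tokenization and counting how often each token occurs, can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the implementation used in the citing or cited paper: the function names `tokenize` and `count_vectorize`, the lowercase alphanumeric tokenization rule, and the sample sentences are all assumptions made for the example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (assumed rule)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def count_vectorize(documents):
    """Return (vocabulary, matrix); matrix[i][j] counts vocabulary[j] in documents[i]."""
    tokenized = [tokenize(doc) for doc in documents]
    # The vocabulary is the sorted set of all tokens seen in any document.
    vocabulary = sorted({tok for doc in tokenized for tok in doc})
    index = {tok: j for j, tok in enumerate(vocabulary)}
    matrix = []
    for tokens in tokenized:
        row = [0] * len(vocabulary)
        for tok, n in Counter(tokens).items():
            row[index[tok]] = n
        matrix.append(row)
    return vocabulary, matrix


# Hypothetical sample documents, chosen only for illustration.
docs = ["the cat sat on the mat", "the dog sat"]
vocabulary, matrix = count_vectorize(docs)
print(vocabulary)
print(matrix)
```

Each row of `matrix` is one document and each column is one vocabulary entry, so the first row records that "the" occurs twice in the first sentence. Library implementations typically expose further parameters (for example stop-word lists or frequency cut-offs) that filter which tokens become features; those are not modeled here.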
“…Machine learning these days is one of the many technologies employed in medicine and particularly medical data analysis. To this day, there are countless approaches of machine learning that are applied in various medical problems, whether it is hospital readmission, diagnosis or treatment plans, and even newer methods and applications are being developed as of now [22][23][24][25].…”
Section: Related Work On Machine Learning For Medical Systems (mentioning)
confidence: 99%
“…Feature extraction through NLP requires the application of multiple steps (Figure 3 [10, 16–19, 22]):…”
Section: Figure 2 Transformation Of Text With Patient Notes To Table ... (mentioning)
confidence: 99%
“…To do this, the key characteristics are assigned as elements of the vector. For example, if degree distribution, clustering coefficient, and betweenness centrality are chosen as key characteristics, then each network can be represented as a vector with these elements (Krzeszewska et al, 2022). The vectorization process is an important step because it enables the networks to be compared using mathematical techniques such as CS.…”
Section: Vectorization (mentioning)
confidence: 99%
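
The idea in the statement above, representing each network as a vector of key characteristics and then comparing the vectors, can be sketched as follows. The sketch assumes that "CS" stands for cosine similarity, which the excerpt does not spell out. It also assumes each characteristic has already been reduced to a single number (for instance, mean degree as a summary of the degree distribution). The numeric values and the names `network_a` and `network_b` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical feature vectors:
# [mean degree, clustering coefficient, mean betweenness centrality]
network_a = [2.4, 0.31, 0.12]
network_b = [2.6, 0.28, 0.10]

similarity = cosine_similarity(network_a, network_b)
print(f"similarity = {similarity:.4f}")
```

Because the characteristics sit on different scales, the largest one (here the assumed mean degree) dominates the result; in practice the features would usually be normalized before comparison.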