2018
DOI: 10.1007/978-3-030-03493-1_32

Machine Learning Methods Based Preprocessing to Improve Categorical Data Classification

Cited by 3 publications
(5 citation statements)
References 6 publications
“…Table 15 and Table 16, and Figure 10 show the compression of a real world census data taken from Base de Datos-Censo de Población y Vivienda 2010 ( ). The dataset is a subset of the original census used for imputation of missing data [30].…”
Section: Results (mentioning)
confidence: 99%
“…Yet, researchers can prefer clustering data using SOM for the following features: the network learns without a teacher, output neurons are ordered, the learning process is performed on the basis of competition and emergence of the winning neuron, a topological map is created and the neighborhood of clusters identified as a learning result. Traditionally, the method can handle numerical data, but owing to the dummy coding, SOM can also operate on categorical data [5,26]. Despite some weaknesses, the method is very attractive for data exploratory analysis, widely used to solve a large range of problems, and can be considered as an appropriate clustering algorithm for high dimensional dataset [3,5,10,26].…”
Section: Methods (mentioning)
confidence: 99%
“…Traditionally, the method can handle numerical data, but owing to the dummy coding, SOM can also operate on categorical data [5,26]. Despite some weaknesses, the method is very attractive for data exploratory analysis, widely used to solve a large range of problems, and can be considered as an appropriate clustering algorithm for high dimensional dataset [3,5,10,26].…”
Section: Methods (mentioning)
confidence: 99%
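
The dummy coding mentioned in the two statements above can be illustrated with a minimal sketch. It is not taken from the cited papers: the function names `dummy_code`, `squared_distance`, and `best_matching_unit`, the colour values, and the toy weight vectors are all hypothetical. The point is only that once each category becomes a 0/1 indicator column, a distance-based method such as a SOM can compare records numerically.

```python
def dummy_code(values):
    """Map each categorical value to a 0/1 indicator vector, one column per category."""
    categories = sorted(set(values))
    index = {cat: i for i, cat in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1
        rows.append(row)
    return categories, rows


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_matching_unit(x, weights):
    """Index of the weight vector closest to x (the 'winning neuron' in SOM terms)."""
    return min(range(len(weights)), key=lambda j: squared_distance(x, weights[j]))


# Hypothetical data: a single categorical attribute with three levels.
categories, encoded = dummy_code(["red", "green", "red", "blue"])

# Hypothetical weights of two output neurons, for illustration only.
toy_weights = [[0.9, 0.1, 0.0], [0.1, 0.1, 0.8]]
winner = best_matching_unit(encoded[0], toy_weights)
```

Only the encoding and the winner lookup are shown here; a full SOM would also update the winning neuron and its neighbours during training, which this sketch leaves out.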
“…Each neuron effectively calculates a "weighted sum" of its input, adds a bias and decides whether or not to fire. This can be expressed as a function f applied to a linear classifier $\mathbf{w}^{T}\mathbf{x} + b$ (8). The decision function f could be a non-linear threshold function, a non-linear distance function or a probability-like sigmoid function.…”
Section: Artificial Neural Network (ANN) (mentioning)
confidence: 99%
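
The neuron described in the statement above, a decision function f applied to the weighted sum of the inputs plus a bias, can be sketched as follows. This is an illustrative example under assumptions, not the citing paper's implementation: the names `weighted_sum`, `threshold`, `sigmoid`, and `neuron`, and the numeric inputs, are hypothetical.

```python
import math


def weighted_sum(w, x, b):
    """Linear part of the neuron: the dot product of w and x, plus the bias b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def threshold(z):
    """Hard threshold decision: fire (1) if z is non-negative, otherwise 0."""
    return 1 if z >= 0 else 0


def sigmoid(z):
    """Probability-like decision: squashes z into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(w, x, b, f):
    """Apply the decision function f to the weighted sum of the inputs."""
    return f(weighted_sum(w, x, b))


# Hypothetical weights, inputs, and bias, chosen only for illustration.
fires = neuron([1.0, -1.0], [2.0, 1.0], -0.5, threshold)   # z = 2.0 - 1.0 - 0.5 = 0.5
probability = neuron([1.0, -1.0], [1.0, 1.0], 0.0, sigmoid)  # z = 0.0
```

Passing f as a parameter mirrors the statement's point that the same linear part can feed different decision functions.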
“…In contrast, categorical variables tend to hide (even mask) a great deal of the interesting information in a dataset [4,5,6]. It is not so easy to see trends and make predictions or forecasts when categorical variables dominate the dataset [7,8]. This makes it crucial to develop systematic methods and heuristics for dealing with such variables.…”
Section: Introduction (mentioning)
confidence: 99%