2022
DOI: 10.1016/j.engappai.2022.104807
Influence of statistical feature normalisation methods on K-Nearest Neighbours and K-Means in the context of industry 4.0

Cited by 14 publications (5 citation statements)
References 36 publications
“…(where w, x, b are the weight vector, feature vector and bias term) is used to determine the class for each instance. The k-nearest neighbour classifier [20] uses the Euclidean distance to measure the similarity between instances. The Euclidean distance is defined as…”
Section: Machine Learning Algorithms (mentioning, confidence: 99%)
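The excerpt is truncated before the formula itself; the standard Euclidean distance it refers to is d(x, y) = sqrt(sum_i (x_i - y_i)^2). Below is a minimal Python sketch of a k-nearest-neighbour classifier built on that distance; the function names and toy data are illustrative assumptions, not taken from the cited paper.

```python
import numpy as np
from collections import Counter

def euclidean(x, y):
    # d(x, y) = sqrt(sum_i (x_i - y_i)^2)
    return np.sqrt(np.sum((x - y) ** 2))

def knn_predict(X_train, y_train, x_query, k=3):
    # Rank training instances by distance to the query point
    dists = [euclidean(x, x_query) for x in X_train]
    nearest = np.argsort(dists)[:k]
    # Majority vote among the labels of the k nearest neighbours
    return Counter(y_train[i] for i in nearest).most_common(1)[0][0]

# Toy usage (illustrative data): two classes in 2-D feature space
X = np.array([[1.0, 2.0], [1.5, 1.8], [5.0, 8.0], [6.0, 9.0]])
y = np.array([0, 0, 1, 1])
print(knn_predict(X, y, np.array([1.2, 1.9]), k=3))  # -> 0
```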
“…The higher the number of clusters 𝑘, the finer the partitioning of the data: compactness among the members of each cluster increases and the corresponding SSE decreases. While the number of clusters is below the optimum, intra-cluster compactness improves substantially as 𝑘 increases, and the SSE falls sharply [19]. Once 𝑘 reaches the optimal value, further increases bring only small gains in compactness and the SSE gradient flattens; the elbow method exploits this change in slope to determine 𝑘, which also facilitates applying the K-means algorithm in the next step.…”
Section: K-means Clustering Algorithm (mentioning, confidence: 99%)
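As a sketch of the elbow procedure this excerpt describes: fit K-means over a range of 𝑘, record the SSE (exposed as inertia_ in scikit-learn), and pick the 𝑘 after which the decrease flattens. The synthetic blobs and the candidate range are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Illustrative data: three well-separated Gaussian blobs in 2-D
X = np.vstack([rng.normal(loc=c, scale=0.5, size=(50, 2)) for c in (0, 5, 10)])

# SSE (inertia) for each candidate k; the "elbow" is where the drop slows
sse = {}
for k in range(1, 8):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    sse[k] = km.inertia_

for k, v in sse.items():
    print(f"k={k}: SSE={v:.1f}")
# SSE falls steeply up to k=3, then flattens, so the elbow suggests k=3.
```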
“…Max-Min normalization, however, does not handle newly introduced outliers well, while Logistic normalization assumes the data are distributed around zero, which does not hold for our research dataset. We therefore chose the Z-Score standardization method, which removes the distortion introduced by features of different magnitudes and keeps the data points comparable [21].…”
Section: Data Preprocessing (mentioning, confidence: 99%)
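To make the trade-off concrete: Min-Max rescaling maps each feature to [0, 1] and is therefore dominated by any extreme value, while Z-Score standardization centres each feature on its mean and divides by its standard deviation, so features of very different magnitudes become comparable. A minimal sketch follows; the toy matrix is an illustrative assumption, not data from the cited paper.

```python
import numpy as np

def z_score(X):
    # Z-Score: (x - mean) / std per feature; result has mean 0, std 1
    return (X - X.mean(axis=0)) / X.std(axis=0)

def min_max(X):
    # Max-Min: rescale each feature to [0, 1]; sensitive to new outliers
    return (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

# Illustrative feature matrix with very different magnitudes per column;
# 9000 in the second column acts like an outlier
X = np.array([[1.0, 1000.0],
              [2.0, 1500.0],
              [3.0, 9000.0]])

print(z_score(X))   # columns centred and scaled independently
print(min_max(X))   # second column squashed toward 0 by the outlier
```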