Prediction of Euroleague Games based on Supervised Classification Algorithm k-Nearest Neighbours

Horvat, Tomislav; Job, Josip; Medved, Vladimir

doi:10.5220/0006893502030207

Cited by 6 publications

(8 citation statements)

References 8 publications

“…The achieved prediction was higher compared to other researcher's prediction result. In paper by Horvat et al (2018), the authors proposed a model based on k-NN for predicting Euroleague games. They applied several models using different values of k and a number of seasons and two data preparation variants based on different feature sets.…”

Section: Evaluation Of the Results (mentioning)

confidence: 99%

“…Without feature selection, the authors achieved the accuracy of 67%. Horvat et al (2018) proposed a model for feature selection based on the feature information gain. The authors presented two feature selection variants and two data preparation algorithms.…”

Section: Feature Selection Based On Feature Selection Methods (mentioning)

confidence: 99%
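
Feature selection by information gain, as mentioned in the statement above, can be illustrated with a small sketch. This is not the cited authors' implementation: the toy outcomes and the feature names `better_form` and `played_friday` are hypothetical, and the features are assumed to be discrete. Information gain is computed as the entropy of the outcome labels minus the weighted entropy that remains after splitting on a feature.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(feature_values, labels):
    """Label entropy minus the weighted entropy after splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Hypothetical toy data: four game outcomes and two made-up binary features.
outcomes = ["home", "home", "away", "away"]
better_form = ["yes", "yes", "no", "no"]      # separates the outcomes perfectly
played_friday = ["yes", "no", "yes", "no"]    # carries no information

gains = {
    "better_form": information_gain(better_form, outcomes),
    "played_friday": information_gain(played_friday, outcomes),
}
# Rank features from most to least informative.
ranked = sorted(gains, key=gains.get, reverse=True)
```

A selection step would then keep the top-ranked features; how many to keep is a tuning choice that this sketch does not address.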

“…Most papers do not show a comparison table with prediction results using the initial feature set and results obtained by using some type of feature selection. The papers with the presented improvements of the prediction results using additional feature selection are Loeffelholz et al (2009), Trawinski (2010), Buursma (2011), Ping-Feng et al (2017), Ganguly and Frank (2018), and Horvat et al (2018). Other researchers only note that additional feature selection contributes to better prediction results.…”

Section: Feature Selection Based On Feature Selection Methods (mentioning)

confidence: 99%

The use of machine learning in sport outcome prediction: A review

Horvat

Job

2020

WIREs Data Min & Knowl

Self Cite

The increase in the volume of structured and unstructured data related to more than just sport events leads to the development and increased use of techniques that extract information and employ machine-learning algorithms in predicting process outcomes based on input but not necessarily output data. Taking sports into consideration, predicting outcomes, and extracting valuable information has become appealing not only to sports workers but also to the wider audience, particularly in the areas of team management and sports betting. The aim of this article is to review the existing machine learning (ML) algorithms in predicting sport outcomes. Over 100 papers were analyzed and only some of these papers were taken into consideration. Almost all of the analyzed papers use some sort of feature selection and feature extraction, most often prior to using the machine-learning algorithm. As an evaluation method of ML algorithms, researchers, in most cases, use data segmentation with data being chronologically distributed. In addition to data segmentation, researchers also use the k-cross-evaluation method. Sport predictions are usually treated as a classification problem with one class being predicted and rare cases being predicted as numerical values. Mostly used ML models are neural networks using data segmentation.

“…The method fundamentally relies on a metric distance value. The most common metric is Euclidean distance (7), although other metrics can be used as well [11].…”

Section: K-nn Algorithm (mentioning)

confidence: 99%
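
The reliance of k-NN on a distance metric, noted in the statement above, can be shown with a minimal sketch. It is not the model from the cited paper: the two-dimensional points, the labels, and the function names are illustrative assumptions. The metric is passed in as a parameter so that Euclidean distance can be swapped for another one, here Manhattan distance.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """Manhattan (city-block) distance, one possible alternative metric."""
    return sum(abs(x - y) for x, y in zip(a, b))

def knn_predict(train_points, train_labels, query, k=3, distance=euclidean):
    """Return the majority label among the k training points nearest to query."""
    neighbours = sorted(zip(train_points, train_labels),
                        key=lambda pair: distance(pair[0], query))
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two features per game, label is the winning side.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (3.0, 3.2), (3.1, 2.9)]
labels = ["home", "home", "home", "away", "away"]

pred_near_home = knn_predict(points, labels, (1.1, 1.0), k=3)
pred_near_away = knn_predict(points, labels, (3.0, 3.0), k=3)
pred_manhattan = knn_predict(points, labels, (1.1, 1.0), k=3, distance=manhattan)
```

With an odd k and two classes the vote cannot tie; for other settings a tie-breaking rule would have to be chosen.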

“…Without feature selection, authors achieved accuracy of 67%. In [11], the authors proposed a model based on k-nearest neighbours (K-NN) for predicting Euroleague games. The authors used several models using different k and number of seasons.…”

mentioning

confidence: 99%

The Impact of Selecting a Validation Method in Machine Learning on Predicting Basketball Game Outcomes

2020

Self Cite

Interest in sports predictions as well as the public availability of large amounts of structured and unstructured data are increasing every day. As sporting events are not completely independent events, but characterized by the influence of the human factor, the adequate selection of the analysis process is very important. In this paper, seven different classification machine learning algorithms are used and validated with two validation methods: Train&Test and cross-validation. Validation methods were analyzed and critically reviewed. The obtained results are analyzed and compared. Analyzing the results of the used machine learning algorithms, the best average prediction results were obtained by using the nearest neighbors algorithm and the worst prediction results were obtained by using decision trees. The cross-validation method obtained better results than the Train&Test validation method. The prediction results of the Train&Test validation method by using disjoint datasets and up-to-date data were also compared. Better results were obtained by using up-to-date data. In addition, directions for future research are also explained.

Symmetry 2020, 12, 431

The aim of this paper is, through the comparison of the classification machine learning algorithms in predicting basketball game outcomes, to define which algorithm, validation method, and data preparation method produces better prediction results. This paper demonstrates what impact the different validation methods have on the prediction accuracy when using different ML algorithms. Moreover, the impact of selecting a validation method on the prediction results when applying ML to the disjoint datasets or the up-to-date data is revealed, thereby enabling the formation of recommendations for the most appropriate combination of the ML algorithm and validation method, depending on the available datasets.

After this introduction and the overview of sport outcome-related researches, the second chapter provides the basic information about classification machine learning algorithms and the validation methods applied in this research. The third chapter describes the data acquisition and data preparation procedures. The research results are presented and discussed in the fourth chapter, and the conclusions are given at the end of paper.

Related Literature Review

The most common algorithm in predicting outcomes in sports are neural networks coupled with the Train&Test validation method. The authors of [1] used a variety of neural networks and Train&Test validation for predicting game outcomes in the National Basketball Association (NBA) league, with the best results of more than 70%. In [2], the authors used 37 algorithms in the Waikato Environment for Knowledge Analysis (WEKA) and Train&Test validation method. The result with the best yield was 72.8%, showing that the best classifiers have 5% better precision than the referent classifier, which favors the team with the better rating. The authors of [3] used logistic regression, Naïve Bayes, Support Vector Machine (S...
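
The difference between the two validation methods named in this abstract, Train&Test and cross-validation, can be sketched as follows. The excerpt does not give the paper's exact procedure, so the 80/20 split fraction, the number of folds, and the function names are assumptions made for illustration. Games are assumed to be stored in chronological order.

```python
def train_test_split_chronological(games, train_fraction=0.8):
    """Train&Test: fit on the earlier games, evaluate once on the later ones."""
    cut = int(len(games) * train_fraction)
    return games[:cut], games[cut:]

def k_fold_indices(n_samples, k):
    """Cross-validation: every sample is used for testing in exactly one fold."""
    # Distribute any remainder so that fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples)
                     if i < start or i >= start + size]
        folds.append((train_idx, test_idx))
        start += size
    return folds

# Hypothetical data: ten games identified by their chronological position.
games = list(range(10))
train, test = train_test_split_chronological(games)
folds = k_fold_indices(len(games), k=5)
```

In the Train&Test split the model never sees games that come after those it is evaluated on, whereas in plain k-fold cross-validation some training games can be later than the test games, which is one reason the two methods can report different accuracies.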

A performance evaluation of neural network features and functions settings on the model accuracy

Bozděch

2022

Preprint

Not only in sports is a neural network the most used type of artificial intelligence. With software development, anyone can create a neural network model, but little is known about how to prepare the data and how to set up the model algorithms to their maximum performance. For these reasons, this study aims to determine whether features or function settings have a greater effect on model accuracy. An initial feature dataset (n = 18882) was obtained from publicly available sources. Each of the six different feature settings consisted of 96 models. A total of 384 models were created, in which their testing accuracy and the percentage difference between the training and testing phases were further analyzed. No statistically significant differences were found between the accuracy of the function's settings, but statistically significant differences were confirmed between the feature settings. The study found that feature settings, especially the reduction of the number of outputs, are a more important factor in increasing the model accuracy than function settings. The literature, however, focuses more on function settings and treats feature settings merely as one way to improve the model.
