The large number of features recorded from GPS and inertial sensors (external load) and well-being questionnaires (internal load) can be used together in a multi-dimensional, non-linear, machine-learning-based model to better predict non-contact injuries. In this study, we hypothesize that such models can better inform about injury risk by considering the evolution of both internal and external loads over two horizons (one week and one month). Predictive models were trained on GPS data, subjective questionnaire data, and injury data collected from 40 elite male soccer players over one season. The machine-learning classification algorithms that performed best on external and internal load features were compared using standard performance metrics such as accuracy, precision, recall, and the area under the receiver operating characteristic curve. In particular, tree-based algorithms, which are non-linear and readily interpretable, were favored because they help explain how internal and external load features affect injury risk. For 1-week injury prediction, models using internal load features were more accurate than those using external load features, whereas for 1-month injury prediction, the best classifier performance was reached by combining internal and external load features.
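The performance metrics named above can be illustrated with a minimal sketch. It assumes binary labels (1 = injury, 0 = no injury) and a classifier that outputs a risk score per sample; the function names and the 0.5 decision threshold are hypothetical choices for illustration, not taken from the study.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision and recall for binary labels (1 = injury)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Guard against division by zero when no positive is predicted/present.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise ranking.

    Equals the probability that a randomly chosen positive sample receives
    a higher score than a randomly chosen negative one (ties count 0.5).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (invented for illustration only).
y_true = [0, 0, 1, 0, 1, 0, 0, 1]
scores = [0.1, 0.4, 0.8, 0.2, 0.3, 0.6, 0.05, 0.9]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # hypothetical threshold

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

Because injuries are rare events, accuracy alone can be misleading, which is one reason to report precision, recall, and AUC alongside it.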
In real-world applications, data are often uncertain or imperfect. In most classification approaches, they are transformed into precise data. However, this uncertainty is itself information that should be part of the learning process. Data uncertainty can take several forms: probabilities, (fuzzy) sets of possible values, expert assessments, etc. We therefore need a model that is flexible and generic enough to represent and process this uncertainty, such as belief functions. Decision trees are well-known classifiers that are usually learned from precise datasets. In this paper, we propose a methodology to learn decision trees from uncertain data in the belief function framework. In the proposed method, the tree parameters are estimated by maximizing an evidential likelihood function computed from belief functions, using the recently proposed E²M algorithm, which extends the classical EM algorithm. Experiments with promising results compare the obtained trees with classical CART decision trees.
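As an illustration of likelihood maximization from uncertain labels, the following is a minimal sketch of an E²M-style iteration for the class proportions in a single tree leaf. It assumes that each sample's uncertain label is summarized by a contour function (a plausibility per class) and that samples are cognitively independent, so the evidential likelihood is the product over samples of the sum over classes of proportion times plausibility. The function name and the toy inputs are hypothetical; the abstract does not specify these details.

```python
def e2m_class_proportions(contours, n_iter=200, tol=1e-10):
    """Estimate class proportions from uncertain labels.

    contours[i][k] is the plausibility that sample i belongs to class k
    (a one-hot row is a precise label, an all-ones row is total ignorance).
    """
    n = len(contours)
    k = len(contours[0])
    theta = [1.0 / k] * k  # start from uniform proportions
    for _ in range(n_iter):
        expected_counts = [0.0] * k
        for pl in contours:
            # E-step: combine the current model with the label plausibilities.
            weights = [theta[j] * pl[j] for j in range(k)]
            total = sum(weights)
            if total == 0.0:
                raise ValueError("sample is incompatible with current estimate")
            for j in range(k):
                expected_counts[j] += weights[j] / total
        # M-step: proportions are the normalized expected counts.
        new_theta = [c / n for c in expected_counts]
        converged = max(abs(a - b) for a, b in zip(new_theta, theta)) < tol
        theta = new_theta
        if converged:
            break
    return theta


# Precise labels reduce to ordinary frequency counting.
precise = e2m_class_proportions([[1, 0], [1, 0], [0, 1], [1, 0]])

# Partially uncertain labels (invented values for illustration).
uncertain = e2m_class_proportions([[1, 0.2], [0.3, 1], [1, 1], [1, 0]])
```

With precise labels the estimate coincides with the usual class frequencies, which is the case handled by classical tree learners such as CART.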
In machine learning, ensemble learning methodologies are known to improve predictive accuracy and robustness. They consist of learning many classifiers whose outputs are then combined using different techniques. Bagging, or bootstrap aggregating, is one of the best-known ensemble methodologies and is usually applied to a single base classification algorithm, i.e. the same type of classifier is learned multiple times on bootstrapped versions of the initial learning dataset. In this paper, we propose a bagging methodology that involves different types of classifiers. The classifiers' probabilistic outputs are used to build mass functions, which are then combined within the belief function framework. Three different ways of building mass functions are proposed, and preliminary experiments on benchmark datasets showing the relevance of the approach are presented.
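The combination step can be sketched as follows. This example assumes one simple construction, reliability discounting of the probabilistic output, and combines the resulting mass functions with Dempster's rule; the abstract does not state which three constructions or which combination rule the authors use, so the functions `mass_from_probabilities` and `dempster_combine`, the reliability values, and the class labels are all hypothetical.

```python
from itertools import product


def mass_from_probabilities(probs, reliability):
    """Build a mass function from a classifier's class probabilities.

    Each singleton class gets reliability * probability; the remaining
    mass (1 - reliability) goes to the whole frame, i.e. to ignorance.
    """
    frame = frozenset(probs)
    mass = {frozenset([c]): reliability * p for c, p in probs.items() if p > 0}
    mass[frame] = mass.get(frame, 0.0) + (1.0 - reliability)
    return mass


def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule."""
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb  # mass assigned to the empty set
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Renormalize so that the masses sum to one.
    return {s: w / (1.0 - conflict) for s, w in combined.items()}


# Outputs of two different classifier types for one sample (invented values).
m_tree = mass_from_probabilities({"a": 0.7, "b": 0.3}, reliability=0.9)
m_knn = mass_from_probabilities({"a": 0.4, "b": 0.6}, reliability=0.8)
m_all = dempster_combine(m_tree, m_knn)

# One possible decision rule: pick the singleton class with the largest mass.
decision = max((s for s in m_all if len(s) == 1), key=m_all.get)
```

Keeping some mass on the whole frame lets a less reliable classifier influence the combined result less than a more reliable one.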
The emergence of the first Fitness-Fatigue impulse response models (FFMs) has allowed the sport science community to investigate the relationship between training and performance. In these models, athletic performance is described by first-order transfer functions that represent the antagonistic Fitness and Fatigue responses to training. On this basis, the mathematical structure allows the optimal sequence of training doses that maximizes athletic performance at a given time point to be determined. Despite several improvements, and although FFMs are still widely used, their ability to describe and predict sport performance remains limited. The main causes may be a simplification of the physiological processes induced by exercise on which the model relies, and a univariate treatment of the factors responsible for athletic performance. In this context, machine-learning approaches appear valuable for modelling sport performance. The weaknesses of FFMs may be overcome by embedding physiological representations of training effects into non-linear and multivariate learning algorithms. Thus, ensemble learning methods may benefit from combining individual responses based on physiological knowledge within supervised machine-learning algorithms to better predict athletic performance. In conclusion, the machine-learning approach is not an alternative to FFMs, but rather a way to take advantage of models based on physiological assumptions within powerful machine-learning models.
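The model structure described above can be sketched in a few lines. The example assumes the classical two-component formulation, in which predicted performance equals a baseline plus a Fitness term minus a Fatigue term, each being an exponentially decaying sum of past training doses. The function name, the discrete daily time step, and all parameter values are hypothetical and chosen only for illustration.

```python
import math


def fitness_fatigue_performance(loads, p0, k1, tau1, k2, tau2):
    """Predict performance on each day from the daily training doses.

    loads[s] is the training dose on day s. The prediction for day t uses
    only doses from days before t:
        p(t) = p0 + k1 * sum_s loads[s] * exp(-(t - s) / tau1)
                  - k2 * sum_s loads[s] * exp(-(t - s) / tau2)
    """
    predictions = []
    for t in range(len(loads)):
        past = list(enumerate(loads[:t]))
        fitness = sum(w * math.exp(-(t - s) / tau1) for s, w in past)
        fatigue = sum(w * math.exp(-(t - s) / tau2) for s, w in past)
        predictions.append(p0 + k1 * fitness - k2 * fatigue)
    return predictions


# A single training dose on day 0 followed by 60 days of rest.
loads = [100.0] + [0.0] * 60
perf = fitness_fatigue_performance(
    loads, p0=500.0, k1=1.0, tau1=45.0, k2=2.0, tau2=15.0  # illustrative values
)
```

With a larger but faster-decaying Fatigue term, the predicted performance first drops below the baseline and later rises above it. Such a model output could in turn be supplied as an input feature to a multivariate learner, in the spirit of the combination proposed above.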