Leonardo Augusto Coelho Ribeiro scite author profile

2 Publications

3 Citation Statements Received

0 Citation Statements Given

Affiliations

Federal University of Lavras

Publications

Disentangling data dependency using cross-validation strategies to evaluate prediction quality of cattle grazing activities using machine learning algorithms and wearable sensor data

Ribeiro, Bresolin, Rosa, et al. 2021

Wearable sensors have been explored as an alternative for real-time monitoring of cattle feeding behavior in grazing systems. To evaluate the performance of predictive models such as machine learning (ML) techniques, data cross-validation (CV) approaches are often employed. However, due to data dependencies and confounding effects, poorly performed validation strategies may significantly inflate the prediction quality. In this context, our objective was to evaluate the effect of different CV strategies on the prediction of grazing activities in cattle using wearable sensor (accelerometer) data and ML algorithms. Six Nellore bulls (average live weight of 345 ± 21 kg) had their behavior visually classified as grazing or not-grazing for a period of 15 days. Elastic Net Generalized Linear Model (GLM), Random Forest (RF), and Artificial Neural Network (ANN) were employed to predict grazing activity (grazing or not-grazing) using 3-axis accelerometer data. For each analytical method, three CV strategies were evaluated: holdout, leave-one-animal-out (LOAO), and leave-one-day-out (LODO). Algorithms were trained using similar dataset sizes (holdout: n = 57,862; LOAO: n = 56,786; LODO: n = 56,672). Overall, GLM delivered the worst prediction accuracy (53%) compared to the ML techniques (65% for both RF and ANN), and ANN performed slightly better than RF for LOAO (73%) and LODO (64%) across CV strategies. The holdout yielded the highest nominal accuracy values for all three ML approaches (GLM: 59%, RF: 76%, and ANN: 74%), followed by LODO (GLM: 49%, RF: 61%, and ANN: 63%) and LOAO (GLM: 52%, RF: 57%, and ANN: 57%). With a larger dataset (i.e., more animals and grazing management scenarios), it is expected that accuracy could be increased. Most importantly, the greater prediction accuracy observed for holdout CV may simply indicate a lack of data independence and the presence of carry-over effects from animals and grazing management. 
Our results suggest that generalizing predictive models to unknown (not used for training) animals or grazing management may incur poor prediction quality. The results highlight the need for using management knowledge to define the validation strategy that is closer to the real-life situation, i.e., the intended application of the predictive model.
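The accuracy figures reported above can be cross-checked with simple arithmetic. The sketch below assumes that the "overall" accuracies (53% for GLM, 65% for RF and ANN) are unweighted means across the three CV strategies; the abstract does not say how they were computed, so this is only a consistency check, not the authors' procedure. The function and variable names are illustrative.

```python
# Per-strategy accuracies (%) as reported in the abstract.
accuracy = {
    "holdout": {"GLM": 59, "RF": 76, "ANN": 74},
    "LODO": {"GLM": 49, "RF": 61, "ANN": 63},
    "LOAO": {"GLM": 52, "RF": 57, "ANN": 57},
}


def mean_by_model(table):
    """Unweighted mean accuracy per model across CV strategies (assumed)."""
    models = list(next(iter(table.values())))
    return {m: sum(row[m] for row in table.values()) / len(table) for m in models}


def holdout_gap(table, strategy):
    """Percentage points by which holdout exceeds another strategy, per model."""
    return {m: table["holdout"][m] - table[strategy][m] for m in table["holdout"]}


by_model = mean_by_model(accuracy)
gap_loao = holdout_gap(accuracy, "LOAO")

for model, value in by_model.items():
    print(f"{model}: mean accuracy {value:.1f}%, holdout minus LOAO {gap_loao[model]} points")
```

Under that assumption the per-model means round to 53%, 65%, and 65%, matching the overall values in the abstract. The holdout figures exceed the LOAO figures by 7 (GLM), 19 (RF), and 17 (ANN) percentage points, which is the size of the inflation the authors attribute to data dependency.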

PSXI-22 Prediction quality of cattle behavior traits evaluated through different cross-validation strategies using wearable sensor data and machine learning algorithms

Ribeiro, Bresolin, Rosa, et al. 2020

Wearable sensors have been adopted as an alternative for real-time monitoring of cattle feeding behavior in grazing systems. However, even when machine learning (ML) techniques are used, confounding effects, such as those introduced by the cross-validation strategy, may inflate the prediction quality. Our objective was to evaluate the effect of different cross-validation strategies on the prediction of grazing activities in cattle using wearable sensor data and ML algorithms. Six Nellore bulls (345 ± 21 kg) had their behavior visually classified as grazing or not-grazing for a period of 15 days. Generalized Linear Model (GLM), Random Forest (RF), and Artificial Neural Network (ANN) were employed to predict behavior (grazing or not-grazing) using 3-axis accelerometer data. For each analytical method, three cross-validation strategies were evaluated: holdout, leave-one-animal-out (LOAO), and leave-one-day-out (LODO). Algorithms were trained using similar dataset sizes (holdout: n = 57,862; LOAO: n = 56,786; LODO: n = 56,672). Regardless of the cross-validation strategy, GLM achieved the worst prediction accuracy (53%) compared to the ML techniques (65% for both RF and ANN). ANN performed slightly better than RF for LOAO (73%) and LODO (64%) cross-validation strategies. The holdout yielded the highest accuracy values for all three ML approaches (GLM: 59%, RF: 76%, and ANN: 74%), followed by LODO (58%) and LOAO (55%). In conclusion, the GLM approach was not adequate to predict grazing behavior, regardless of the cross-validation strategy. The greater prediction accuracy observed for holdout cross-validation may simply indicate a lack of data independence and the presence of carry-over effects from animals and grazing management. Our results suggest that generalizing predictive models to unknown (not used for training) animals or grazing management may incur poor prediction quality. 
The results highlight the need for using biological knowledge to define the validation strategy that is closer to the real-life situation.
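The difference between the three validation strategies comes down to how records are assigned to the training and test sets. The following is a minimal sketch in plain Python, not the authors' code: the `Record` type, the toy data (6 animals by 15 days, one record each), and the 80/20 holdout ratio are all hypothetical and chosen only to mirror the study design.

```python
import random
from collections import namedtuple

# Hypothetical record layout: one labeled observation per animal per day.
Record = namedtuple("Record", ["animal", "day", "label"])


def leave_one_group_out(records, key):
    """Yield (held_out_group, train_idx, test_idx) for each distinct group."""
    for group in sorted({key(r) for r in records}):
        test_idx = [i for i, r in enumerate(records) if key(r) == group]
        train_idx = [i for i, r in enumerate(records) if key(r) != group]
        yield group, train_idx, test_idx


# Toy data only: the labels are arbitrary placeholders, not real behavior.
records = [
    Record(animal=a, day=d, label=(a + d) % 2)
    for a in range(1, 7)
    for d in range(1, 16)
]

# LOAO: each fold tests on one animal never seen in training.
loao = list(leave_one_group_out(records, key=lambda r: r.animal))
# LODO: each fold tests on one day never seen in training.
lodo = list(leave_one_group_out(records, key=lambda r: r.day))

# Random holdout (assumed 80/20): records are split without regard to group.
indices = list(range(len(records)))
random.Random(0).shuffle(indices)
cut = int(0.8 * len(indices))
holdout_train, holdout_test = indices[:cut], indices[cut:]

# Animals that appear on both sides of the holdout split.
shared_animals = (
    {records[i].animal for i in holdout_train}
    & {records[i].animal for i in holdout_test}
)
print(f"LOAO folds: {len(loao)}, LODO folds: {len(lodo)}")
print(f"Animals shared between holdout train and test: {sorted(shared_animals)}")
```

In the group-based folds the held-out animal or day contributes nothing to training, whereas the random holdout places records from the same animals (and days) on both sides of the split. That overlap is the kind of data dependency the abstract suggests may explain the higher holdout accuracy.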
