PMID- 34223900
OWN - NLM
STAT- MEDLINE
DCOM- 20210910
LR  - 20220716
IS  - 1525-3163 (Electronic)
IS  - 0021-8812 (Print)
IS  - 0021-8812 (Linking)
VI  - 99
IP  - 9
DP  - 2021 Sep 1
TI  - Disentangling data dependency using cross-validation strategies to evaluate prediction quality of cattle grazing activities using machine learning algorithms and wearable sensor data.
LID - 10.1093/jas/skab206 [doi]
LID - skab206
AB  - Wearable sensors have been explored as an alternative for real-time monitoring of cattle feeding behavior in grazing systems. To evaluate the performance of predictive models such as machine learning (ML) techniques, data cross-validation (CV) approaches are often employed. However, due to data dependencies and confounding effects, poorly performed validation strategies may significantly inflate the prediction quality. In this context, our objective was to evaluate the effect of different CV strategies on the prediction of grazing activities in cattle using wearable sensor (accelerometer) data and ML algorithms. Six Nellore bulls (average live weight of 345 +/- 21 kg) had their behavior visually classified as grazing or not-grazing for a period of 15 d. Elastic Net Generalized Linear Model (GLM), Random Forest (RF), and Artificial Neural Network (ANN) were employed to predict grazing activity (grazing or not-grazing) using 3-axis accelerometer data. For each analytical method, three CV strategies were evaluated: holdout, leave-one-animal-out (LOAO), and leave-one-day-out (LODO). Algorithms were trained using similar dataset sizes (holdout: n = 57,862; LOAO: n = 56,786; LODO: n = 56,672). Overall, GLM delivered the worst prediction accuracy (53%) compared with the ML techniques (65% for both RF and ANN), and ANN performed slightly better than RF for LOAO (57%) and LODO (63%) across CV strategies.
      The holdout yielded the highest nominal accuracy values for all three ML approaches (GLM: 59%, RF: 76%, and ANN: 74%), followed by LODO (GLM: 49%, RF: 61%, and ANN: 63%) and LOAO (GLM: 52%, RF: 57%, and ANN: 57%). With a larger dataset (i.e., more animals and grazing management scenarios), it is expected that accuracy could be increased. Most importantly, the greater prediction accuracy observed for holdout CV may simply indicate a lack of data independence and the presence of carry-over effects from animals and grazing management. Our results suggest that generalizing predictive models to unknown (not used for training) animals or grazing management may incur poor prediction quality. The results highlight the need for using management knowledge to define the validation strategy that is closer to the real-life situation, i.e., the intended application of the predictive model.
CI  - (c) The Author(s) 2021. Published by Oxford University Press on behalf of the American Society of Animal Science. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
FAU - Coelho Ribeiro, Leonardo Augusto
AU  - Coelho Ribeiro LA
AD  - Department of Animal Science, University of Lavras, Lavras, MG 37200-900, Brazil.
FAU - Bresolin, Tiago
AU  - Bresolin T
AUID- ORCID: 0000-0002-3196-5150
AD  - Department of Animal and Dairy Sciences, University of Wisconsin, Madison, WI 53706, USA.
FAU - Rosa, Guilherme Jordao de Magalhaes
AU  - Rosa GJM
AUID- ORCID: 0000-0001-9172-6461
AD  - Department of Animal and Dairy Sciences, University of Wisconsin, Madison, WI 53706, USA.
FAU - Rume Casagrande, Daniel
AU  - Rume Casagrande D
AD  - Department of Animal Science, University of Lavras, Lavras, MG 37200-900, Brazil.
FAU - Danes, Marina de Arruda Camargo
AU  - Danes MAC
AUID- ORCID: 0000-0003-4196-8328
AD  - Department of Animal Science, University of Lavras, Lavras, MG 37200-900, Brazil.
FAU - Dorea, Joao Ricardo Reboucas
AU  - Dorea JRR
AUID- ORCID: 0000-0001-9849-7358
AD  - Department of Animal and Dairy Sciences, University of Wisconsin, Madison, WI 53706, USA.
LA  - eng
GR  - University of Wisconsin-Madison/
GR  - Wisconsin Alumni Research Foundation/
GR  - Wisconsin Institutes for Discovery/
GR  - National Science Foundation/
GR  - U.S. Department of Energy's Office of Science/
GR  - Coordenacao de Aperfeicoamento de Pessoal de Nivel Superior/
GR  - APQ-01436-18/Fundacao de Amparo a Pesquisa do Estado de Minas Gerais/
PT  - Journal Article
PL  - United States
TA  - J Anim Sci
JT  - Journal of animal science
JID - 8003002
SB  - IM
MH  - Algorithms
MH  - Animals
MH  - Cattle
MH  - Linear Models
MH  - *Machine Learning
MH  - Male
MH  - Neural Networks, Computer
MH  - *Wearable Electronic Devices
PMC - PMC8418637
OTO - NOTNLM
OT  - accelerometer
OT  - grazing
OT  - machine learning
OT  - validation
EDAT- 2021/07/06 06:00
MHDA- 2021/09/11 06:00
PMCR- 2022/07/05
CRDT- 2021/07/05 12:18
PHST- 2021/03/13 00:00 [received]
PHST- 2021/07/02 00:00 [accepted]
PHST- 2021/07/06 06:00 [pubmed]
PHST- 2021/09/11 06:00 [medline]
PHST- 2021/07/05 12:18 [entrez]
PHST- 2022/07/05 00:00 [pmc-release]
AID - 6314786 [pii]
AID - skab206 [pii]
AID - 10.1093/jas/skab206 [doi]
PST - ppublish
SO  - J Anim Sci. 2021 Sep 1;99(9):skab206. doi: 10.1093/jas/skab206.
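
The abstract contrasts holdout validation with leave-one-animal-out (LOAO) and leave-one-day-out (LODO). The sketch below illustrates how grouped folds of that kind can be built so that no animal (or day) appears in both the training and the test set. It is not the authors' code: the helper name `leave_one_group_out` and the toy `records` are hypothetical, and only the Python standard library is assumed.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterator, List, Sequence, Tuple


def leave_one_group_out(
    groups: Sequence[Hashable],
) -> Iterator[Tuple[Hashable, List[int], List[int]]]:
    """Yield (held_out_group, train_indices, test_indices), one fold per group."""
    by_group: Dict[Hashable, List[int]] = defaultdict(list)
    for idx, group in enumerate(groups):
        by_group[group].append(idx)
    for held_out, test_idx in by_group.items():
        held = set(test_idx)
        # Everything not belonging to the held-out group is used for training.
        train_idx = [i for i in range(len(groups)) if i not in held]
        yield held_out, train_idx, test_idx


# Hypothetical toy data: one (animal_id, day) label per accelerometer window.
records = [
    (animal, day)
    for animal in ("A", "B", "C")
    for day in (1, 2)
    for _ in range(2)
]
animals = [animal for animal, _ in records]
days = [day for _, day in records]

loao_folds = list(leave_one_group_out(animals))  # one fold per animal
lodo_folds = list(leave_one_group_out(days))     # one fold per day

for held_out, train_idx, test_idx in loao_folds:
    print(f"LOAO held-out animal {held_out}: train={len(train_idx)} test={len(test_idx)}")
for held_out, train_idx, test_idx in lodo_folds:
    print(f"LODO held-out day {held_out}: train={len(train_idx)} test={len(test_idx)}")
```

A random holdout split, by contrast, would typically place windows from the same animal and the same day on both sides of the split, which is the data dependency the abstract warns can inflate accuracy.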