Forecasting the severity of occupational injuries should be a top priority for every industry. Machine learning can support this kind of predictive analysis, so this study proposes a feature-optimized predictive model for anticipating occupational injury severity. A public database of 66,405 occupational injury records from OSHA is analyzed using five machine learning models: Support Vector Machine, K-Nearest Neighbors, Naïve Bayes, Decision Tree, and Random Forest. In the model comparison, Random Forest outperformed the other models, with higher accuracy and F1-score, which highlights the potential of ensemble learning as a more accurate prediction approach in the field of occupational injury. In constructing the model, this study also proposes a feature optimization technique that revealed the three most important features for model development: ‘nature of injury’, ‘type of event’, and ‘affected body part’. Redeveloping and optimizing the Random Forest model with hyperparameter tuning improved its accuracy by 0.5%, to 0.895 and 0.954 for the prediction of hospitalization and amputation, respectively. Feature optimization is essential in giving Safety and Health Practitioners insight for future corrective and preventive injury strategies. This study shows promising potential for smart workplace surveillance.
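The two metrics used for the model comparison above, accuracy and F1-score, can be illustrated with a minimal sketch. The labels below are invented for illustration only (1 = hospitalization, 0 = no hospitalization is an assumed encoding, not taken from the study), and the function names are hypothetical helpers rather than the authors' implementation.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of records whose predicted label matches the true label."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Harmonic mean of precision and recall for the `positive` class."""
    if len(y_true) != len(y_pred):
        raise ValueError("inputs must be of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    # F1 = 2TP / (2TP + FP + FN); defined as 0.0 when there are no positives at all.
    return 2 * tp / denom if denom else 0.0


# Invented toy labels (assumption): 1 = hospitalization, 0 = no hospitalization.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
print(f"accuracy={acc:.3f}, F1={f1:.3f}")
```

Reporting F1 alongside accuracy is useful when one outcome is much rarer than the other, since accuracy alone can look high for a model that rarely predicts the rare class.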
Workplace accidents can cause catastrophic losses to a company, including human injuries and fatalities. Occupational injury reports may provide a detailed description of how the incidents occurred, so the narrative is a useful source of information for extracting, classifying, and analyzing occupational injuries. This study provides a systematic review of text mining and Natural Language Processing (NLP) applications for extracting text narratives from occupational injury reports. A systematic search was conducted through multiple databases, including Scopus, PubMed, and Science Direct. Only original studies that examined the application of machine learning and deep learning-based Natural Language Processing models for occupational injury analysis were included. A total of , out of articles were reviewed in this study, following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA). The review found that various machine learning and deep learning-based NLP models, such as K-means, Naïve Bayes, Support Vector Machine, Decision Tree, and K-Nearest Neighbors, were applied to predict occupational injury. In addition to these models, deep neural networks were used to classify the type of accident and identify the causal factors. However, deep learning models are rarely used for extracting information from occupational injury reports, because these techniques are very recent and are only beginning to make inroads into decision-making in occupational safety and health as a whole. Despite that, this paper argues that there is large and promising potential in applying NLP and text-based analytics to occupational injury research. Therefore, the improvement of data balancing techniques and the development of an automated decision-making