Construction is one of the most injury-prone industries worldwide, and the health and safety of workers on construction sites has been widely discussed for decades. In many countries, companies are required to report workplace safety incidents to the relevant authorities using catastrophe investigation reports, and such data is made publicly available under open data policies. These open datasets may be well structured or may require further preparation to be usable. Some take the form of reports, which require qualitative textual analysis to extract useful information. The purpose of this study is to extract safety hazard factors from an open dataset obtained from the US Occupational Safety and Health Administration (OSHA), and to analyse those factors further using statistical techniques. For each reported case, text analysis was carried out on the narrative data field, which describes the circumstances leading to the incident, to extract safety hazard factors. These hazard factors were categorized into human factors, technical factors, external environmental factors, organizational factors and other factors. The results showed that hazards related to human factors are the most common. Descriptive statistics also showed that the most frequently reported nature of accident was fracture and the most frequent accident event was a fall to a lower level. Such information can provide insight into the accidents that occurred and help relevant authorities devise strategies to improve construction site safety.
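The categorisation and descriptive statistics described in this abstract can be illustrated with a minimal Python sketch. The records, field names (`nature`, `event`, `factors`) and helper functions below are hypothetical and written for illustration only; they are not the study's data or code.

```python
from collections import Counter

# Hypothetical, hand-written records for illustration only (not data from the study).
# Each record holds the nature of the accident, the accident event, and the
# hazard-factor categories assigned after reading the narrative field.
records = [
    {"nature": "fracture", "event": "fall to lower level", "factors": ["human"]},
    {"nature": "fracture", "event": "struck by object", "factors": ["human", "technical"]},
    {"nature": "laceration", "event": "fall to lower level", "factors": ["organizational"]},
]

def frequency_table(rows, field):
    """Count how often each value of a single-valued categorical field occurs."""
    return Counter(row[field] for row in rows)

def factor_frequencies(rows):
    """Count hazard-factor categories; one case may carry several categories."""
    return Counter(factor for row in rows for factor in row["factors"])

nature_counts = frequency_table(records, "nature")
event_counts = frequency_table(records, "event")
factor_counts = factor_frequencies(records)

print(nature_counts.most_common(1))
print(event_counts.most_common(1))
print(factor_counts.most_common(1))
```

Counting factors per case rather than per record allows an incident to contribute to more than one category, which is an assumption of this sketch and not something the abstract states.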
Construction is well known for its very high rate of injuries and accidents around the world. Although many researchers analyse the risks of this industry using various techniques, construction accidents still require much attention in safety science. The existing literature identifies hazards related to workers, technology, natural factors, surrounding activities and organisational factors as the primary causes of accidents. Yet little research has sought to ascertain the extent of these hazards from actual reported accidents. The study presented in this paper was therefore conducted to devise an approach for extracting sources of hazards from publicly available injury reports using Text Mining (TM) and Natural Language Processing (NLP) techniques. The paper presents a methodology for developing a rule-based extraction tool, with full details of lexicon building, the design of extraction rules, and the iterative process of testing and validation. In addition, the developed rule-based classifier was compared with, and found to outperform, existing statistical classifiers such as Support Vector Machine (SVM), Kernel SVM, K-nearest neighbours, Naïve Bayes and Random Forest. Findings from the developed tool identified the worker factor as the highest contributor to construction site accidents, followed by the technological factor, surrounding activities, the organisational factor, and the natural factor (1%). The tool could be used to quickly extract sources of hazards by converting widely available unstructured digital accident data into structured attributes, allowing better data-driven safety management.
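The idea of a lexicon-driven, rule-based extractor can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not give the study's lexicon or rules, so the trigger phrases, the `LEXICON` table and the function names below are all hypothetical, and the single rule used here (a whole-phrase, case-insensitive match) is far simpler than an iteratively validated rule set.

```python
import re

# Hypothetical lexicon mapping each hazard source to trigger phrases.
# These phrases are invented for illustration; they are not the study's lexicon.
LEXICON = {
    "worker": ["lost balance", "not wearing", "slipped", "misjudged"],
    "technological": ["ladder collapsed", "equipment failure", "malfunction"],
    "natural": ["wind", "rain", "ice", "lightning"],
    "surrounding activities": ["passing vehicle", "adjacent crew", "nearby crane"],
    "organisational": ["no training", "not trained", "no supervision"],
}

def compile_rules(lexicon):
    """Turn each category's phrases into one case-insensitive, whole-word regex."""
    rules = {}
    for category, phrases in lexicon.items():
        alternatives = "|".join(re.escape(phrase) for phrase in phrases)
        rules[category] = re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)
    return rules

RULES = compile_rules(LEXICON)

def extract_hazard_sources(narrative):
    """Return the sorted list of categories whose rule matches the narrative."""
    return sorted(category for category, rule in RULES.items() if rule.search(narrative))

# Hypothetical narrative written for this example.
narrative = "Employee was not wearing fall protection and slipped on ice while on the roof."
print(extract_hazard_sources(narrative))  # ['natural', 'worker']
```

The word-boundary anchors keep short terms such as "ice" from matching inside unrelated words such as "device". A practical tool would also need negation handling and context rules, which is presumably what the iterative testing and validation in the paper addresses.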