Unobserved heterogeneity in traffic crash data can hide relationships between contributory factors and injury severity. Few studies have explored different clustering methods for analyzing injury severity in crashes involving large trucks, and the mix of data types in traffic crash data has rarely been addressed. This study explored the application of the k-prototypes clustering method to address the unobserved heterogeneity in large truck-involved crashes that occurred in the United States between 2016 and 2019. The study segmented the entire dataset (EDS) into three homogeneous clusters. Four gradient boosted decision trees (GBDT) models were developed, one on the EDS and one on each cluster, to predict the injury severity in crashes involving large trucks. The input features included crash characteristics, truck characteristics, roadway attributes, time and location of the crash, and environmental factors. Each cluster-based GBDT model was compared with the EDS-based model. Two of the three cluster-based models showed significant improvement in predictive performance. Additionally, feature analysis using the SHAP (Shapley additive explanations) method identified a few new important features in each cluster and showed that some features affect severe injuries to different degrees in the individual clusters. The study concluded that the k-prototypes clustering-based GBDT model is a promising approach for revealing hidden insights, which can be used to improve safety measures, roadway conditions, and policies for the prevention of severe injuries in crashes involving large trucks.
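To make the handling of mixed data types concrete, the following is a minimal sketch of the dissimilarity measure commonly used by k-prototypes: squared Euclidean distance on the numeric features plus a weight `gamma` times the number of mismatched categorical features. The toy records, the feature names, the value of `gamma`, and the helper names `mixed_dissimilarity` and `assign_to_nearest` are all assumptions made for illustration; they are not taken from the study, and the sketch covers only the assignment step, not the full iterative algorithm.

```python
from typing import List, Sequence, Tuple

# A record (or prototype) is a pair: (numeric features, categorical features).
Record = Tuple[Sequence[float], Sequence[str]]


def mixed_dissimilarity(record: Record, prototype: Record, gamma: float) -> float:
    """Squared Euclidean distance on numeric parts plus gamma times
    the count of categorical mismatches (simple matching)."""
    num_x, cat_x = record
    num_q, cat_q = prototype
    numeric_cost = sum((a - b) ** 2 for a, b in zip(num_x, num_q))
    categorical_cost = sum(1 for a, b in zip(cat_x, cat_q) if a != b)
    return numeric_cost + gamma * categorical_cost


def assign_to_nearest(
    records: List[Record], prototypes: List[Record], gamma: float
) -> List[int]:
    """Return, for each record, the index of its closest prototype."""
    return [
        min(
            range(len(prototypes)),
            key=lambda k: mixed_dissimilarity(r, prototypes[k], gamma),
        )
        for r in records
    ]


# Hypothetical toy data: (scaled speed limit, scaled vehicle count),
# (weather, road type). Not drawn from the study's dataset.
records = [
    ((0.9, 0.2), ("clear", "interstate")),
    ((0.8, 0.3), ("clear", "interstate")),
    ((0.2, 0.1), ("snow", "local")),
]
prototypes = [
    ((0.85, 0.25), ("clear", "interstate")),
    ((0.2, 0.1), ("snow", "local")),
]

labels = assign_to_nearest(records, prototypes, gamma=0.5)
print(labels)
```

In a full k-prototypes run, this assignment step would alternate with a prototype update (means for numeric features, modes for categorical ones) until the assignments stop changing.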
The significance of large trucks for the expansion and well-being of the economy is well established. However, crashes involving large trucks significantly threaten overall road safety, and a significant proportion of fatal crashes involving large trucks occurs on interstate roadways in the United States. Despite this, few studies have focused on the heterogeneous effects of contributory factors on injury outcomes of interstate crashes involving large trucks. The current study explores the application of a k-prototypes clustering-based mixed logit model to identify and analyze the heterogeneous effects of contributory factors on injury outcomes in different scenarios of interstate crashes involving large trucks. Data from six years of crashes involving large trucks that occurred on interstate roadways in the state of Pennsylvania, US, were used in this study. The contributory factors included drivers’ demographics and behaviors; crash characteristics; vehicle-related factors; location and roadway attributes; and environmental factors. The results indicated that some of the contributory factors were significant for all scenarios of interstate crashes involving large trucks, although the magnitude of their effects varied across scenarios. Other contributory factors were exclusive to certain scenarios. Lastly, the identification of random parameters in the cluster-based models indicated that a cluster-based mixed logit model is a more effective approach for accurately estimating the effects of contributory factors on injury outcomes in large-truck interstate crashes. The empirical findings of this study can be used to develop more robust traffic laws and safety measures to reduce the frequency and severity of injury in different scenarios of interstate crashes involving large trucks.
In recent years, the number of studies on crashes involving large trucks has increased because of the importance of large trucks to the economy and the higher chance of fatalities in such crashes. However, none of the previous studies has given attention to the spatial concentrations of large-truck crashes. Moreover, the literature lacks exploration of granular-level land use and urban design factors. The current study used the DBSCAN (Density-Based Spatial Clustering of Applications with Noise) method to identify the spatial concentrations of crashes involving large trucks. Additionally, the study explored housing, population, employment, and road network density attributes along with crash characteristics, roadway attributes, location type, traffic conditions, drivers’ actions and behaviors, and environmental factors. Association rule analysis was employed to discover the contributory factors that lead to no injury, non-severe injuries, and severe injuries at the spatial concentrations of crashes involving large trucks. The findings indicated that rear-end collisions involving drunk drivers often lead to severe injuries in large-truck crashes. Non-interstate roads, speed limits from 40 to 80 kilometers per hour, high road network density, and medium and high population density are frequent conditions of non-severe injuries. Lastly, collisions between large trucks and fixed objects, sideswipe same-direction collisions, snowy roads, clear weather, and medium road network and employment density are likely to be associated with no-injury crashes involving large trucks. Road traffic authorities can use these insights to reduce the frequency and severity of crashes involving large trucks at their spatial concentrations.
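As an illustration of how association rule analysis scores a rule such as "rear-end collision and drunk driver, therefore severe injury", the sketch below computes the three standard rule metrics: support, confidence, and lift. The four toy crash records, the item labels, and the helper name `rule_metrics` are hypothetical and chosen only for illustration; they are not the study's data, thresholds, or implementation.

```python
from typing import Iterable, List, Set, Tuple


def rule_metrics(
    transactions: List[Set[str]],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> Tuple[float, float, float]:
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    n = len(transactions)
    a = frozenset(antecedent)
    c = frozenset(consequent)
    count_a = sum(1 for t in transactions if a <= t)          # records containing A
    count_c = sum(1 for t in transactions if c <= t)          # records containing C
    count_ac = sum(1 for t in transactions if (a | c) <= t)   # records containing both
    support = count_ac / n
    confidence = count_ac / count_a if count_a else 0.0
    lift = confidence / (count_c / n) if count_c else 0.0
    return support, confidence, lift


# Hypothetical crash records, each encoded as a set of attribute labels.
crashes = [
    {"rear_end", "drunk_driver", "severe"},
    {"rear_end", "drunk_driver", "severe"},
    {"rear_end", "no_injury"},
    {"fixed_object", "snowy_road", "no_injury"},
]

support, confidence, lift = rule_metrics(
    crashes, antecedent={"rear_end", "drunk_driver"}, consequent={"severe"}
)
print(support, confidence, lift)
```

A lift above 1 indicates that the antecedent and consequent co-occur more often than would be expected if they were independent, which is the usual basis for treating a rule as interesting.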