Classifying vehicular crashes by severity is crucial because crashes differ in their financial and injury costs. In addition, accurate prediction modeling can help avoid crashes by identifying the factors that influence them. Crash severity analysis requires prediction models that are both accurate and time-efficient. Moreover, statistical models are unable to identify the potential severity of crashes from the influencing factors they incorporate. Unlike previous studies, which applied data mining models to a limited set of severity classes (property damage only (PDO), fatality, and injury), the present study sought to predict crash frequency across five severity levels: PDO, fatality, severe injury, other visible injuries, and complaint of pain. A multinomial logistic regression (MLR) model and data mining approaches, including an artificial neural network-multilayer perceptron (ANN-MLP) and two decision tree techniques (Chi-square automatic interaction detector (CHAID) and C5.0), were applied to traffic crash records for State Highways in California, USA. Comparing the relative importance of the ten qualitative and ten quantitative independent variables incorporated in CHAID and C5.0 indicated that the cause of the crash (X1) and the number of vehicles (X5) were the most influential variables. However, the ANN-MLP model identified the cause of the crash (X1) and weather (X2) as the largest contributors. In addition, the MLR model showed that the driver's age (X11) accounts for a larger proportion of traffic crash severity. The sensitivity analysis demonstrated that C5.0 had the best performance for predicting road crash severity. C5.0 not only ran faster (0.05 s) than CHAID, MLP, and MLR but also achieved the highest accuracy on the training set: its overall prediction accuracy on the training data was approximately 88.09%, compared with 77.21% and 70.21% for the CHAID and MLP models, respectively. In general, the findings of this study revealed that C5.0 can be a promising tool for predicting road crash severity.
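The overall prediction accuracy quoted above is, in the usual sense, the share of crashes whose predicted severity level matches the recorded one. A minimal sketch of that calculation for a five-level problem follows; the label strings, function names, and toy records are illustrative assumptions, not the study's data or code.

```python
# Severity levels named in the abstract; the exact label strings are assumed.
LEVELS = [
    "PDO",
    "complaint of pain",
    "other visible injury",
    "severe injury",
    "fatality",
]


def confusion_matrix(actual, predicted, labels):
    """Count (actual, predicted) pairs; rows are actual, columns predicted."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def overall_accuracy(matrix):
    """Fraction of records on the diagonal (correctly classified)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


# Toy records for illustration only (not taken from the study).
toy_actual = ["PDO", "PDO", "fatality", "severe injury"]
toy_predicted = ["PDO", "complaint of pain", "fatality", "severe injury"]

toy_matrix = confusion_matrix(toy_actual, toy_predicted, LEVELS)
accuracy = overall_accuracy(toy_matrix)
```

Reporting the full matrix alongside the single accuracy figure can be useful here, because rare classes such as fatality contribute little to overall accuracy even when they are misclassified.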
Negative binomial-based safety performance functions (SPFs) have been extensively used by United States Department of Transportation professionals for predictive crash analysis. Recently, the Florida Department of Transportation (FDOT) has developed a context classification approach and incorporated it into crash prediction models, which has the potential to significantly enhance their accuracy and reliability. However, the additional modeling contexts and parameters make it more challenging to diagnose and remedy modeling problems. Particularly for roadway segments with low annual average daily traffic (AADT), short lengths, or low counts of severe crashes, the SPF models significantly underestimate the actual number of crashes. This uncertainty in SPF predictions can lead FDOT practitioners to reach misleading conclusions, such as failing to detect sites with genuinely high crash rates. This project intends to establish thresholds for certain SPF parameters to ensure that reliable crash predictions are obtained across various context classes. For this purpose, we (a) developed a functional statistical model that quantifies economic loss relative prediction errors as a function of AADT volume and (b) calculated the minimum context-specific AADT threshold for each combination of segment length group, roadway category, context classification, and crash severity. Applying the developed AADT thresholds yielded up to an 89% reduction in SPF prediction errors for the most represented context class. These results indicate that context-specific AADT thresholds perform well in significantly reducing prediction errors for the thresholded segments and contexts on Florida roadways.
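To make the idea of an AADT threshold concrete, the sketch below uses a generic log-linear segment SPF of the widely used form N = L * exp(a) * AADT^b and flags segments whose AADT falls below a minimum value. The coefficients, the threshold, and all function names are hypothetical placeholders; the abstract does not give FDOT's fitted models or the project's threshold values, and the plain relative error here is a simpler stand-in for the economic-loss error measure it describes.

```python
import math


def spf_predicted_crashes(aadt, length_miles, intercept, aadt_coef):
    """Generic log-linear SPF: N = L * exp(intercept) * AADT ** aadt_coef.

    intercept and aadt_coef are hypothetical; real values come from a
    fitted negative binomial model for a given context class.
    """
    return length_miles * math.exp(intercept) * aadt ** aadt_coef


def relative_error(observed, predicted):
    """Absolute prediction error relative to the observed count (> 0)."""
    return abs(observed - predicted) / observed


def meets_aadt_threshold(aadt, min_aadt):
    """True when the segment's AADT is at or above the minimum threshold."""
    return aadt >= min_aadt


# Illustrative placeholders only, not values from the project.
ASSUMED_INTERCEPT = -7.0
ASSUMED_AADT_COEF = 0.8
ASSUMED_MIN_AADT = 2000

predicted = spf_predicted_crashes(
    aadt=5000,
    length_miles=1.5,
    intercept=ASSUMED_INTERCEPT,
    aadt_coef=ASSUMED_AADT_COEF,
)
usable = meets_aadt_threshold(5000, ASSUMED_MIN_AADT)
```

In this sketch, a segment below the threshold would simply be excluded from SPF-based screening or handled by another method; how such segments are actually treated is not stated in the abstract.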