This study explores the potential of big data for advancing pedestrian risk analysis, including the investigation of contributing factors and the identification of crash hotspots. Massive amounts of data on Manhattan were collected, integrated, and processed from a variety of sources, including taxi trips, subway turnstile counts, traffic volumes, the road network, land use, sociodemographics, and social media. The study area was uniformly split into grid cells that serve as the basic geographical units of analysis; this cell-structured framework makes it easy to incorporate rich and diverse data into the risk analysis. The cost of each crash, weighted by injury severity, was assigned to cells according to their distance from the crash site using a kernel density function. A Tobit model was developed to relate grid-cell-specific contributing factors to crash costs, which are left-censored at zero. The potential for safety improvement (PSI), computed as the actual crash cost minus the cost estimated by the Tobit model for "similar" sites, was used to identify and rank pedestrian crash hotspots. The proposed hotspot identification method accounts for two important factors that are generally ignored: injury severity and the effects of exposure indicators. Big data thus enable both more precise estimation of the effects of risk factors, by providing richer data for modeling, and large-scale hotspot identification at higher resolution than conventional methods based on census tracts or traffic analysis zones.
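The kernel-weighted cost assignment and the PSI ranking described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not specify the kernel form or bandwidth, so a Gaussian kernel and a bandwidth parameter are assumed here, and the per-crash weights are normalized so each crash's severity-weighted cost is fully distributed over the grid.

```python
import numpy as np

def assign_crash_costs(cell_centers, crash_sites, crash_costs, bandwidth=100.0):
    """Distribute each severity-weighted crash cost over grid cells by a kernel
    of cell-to-crash distance. The Gaussian kernel and the bandwidth (in the
    coordinate units) are assumptions for illustration."""
    cell_centers = np.asarray(cell_centers, dtype=float)  # (n_cells, 2)
    crash_sites = np.asarray(crash_sites, dtype=float)    # (n_crashes, 2)
    crash_costs = np.asarray(crash_costs, dtype=float)    # (n_crashes,)
    # Pairwise cell-to-crash distances, shape (n_cells, n_crashes).
    d = np.linalg.norm(cell_centers[:, None, :] - crash_sites[None, :, :], axis=2)
    w = np.exp(-0.5 * (d / bandwidth) ** 2)
    # Normalize per crash so each crash's cost sums to 1 across cells.
    w /= w.sum(axis=0, keepdims=True)
    return w @ crash_costs  # observed cost per cell

def psi_ranking(observed_costs, predicted_costs):
    """PSI = observed cell cost minus the cost predicted for 'similar' sites
    (here taken as given, e.g. fitted values from a Tobit model).
    Returns the PSI values and cell indices ranked from highest PSI down."""
    psi = np.asarray(observed_costs, dtype=float) - np.asarray(predicted_costs, dtype=float)
    order = np.argsort(-psi)
    return psi, order
```

Cells with the largest positive PSI are those whose observed crash cost most exceeds what comparable sites would predict, and are therefore flagged as hotspots.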
Monitoring nonmotorized traffic is gaining attention in transportation studies. Most traditional pedestrian monitoring technologies count pedestrians passing a fixed location in the network, so it is not possible to anonymously track the movement of individuals or groups once they move outside a particular sensor's range. Moreover, most agencies lack continuous pedestrian counts, mainly because of technological limitations. Wireless data collection technologies, however, can capture crowd dynamics by scanning mobile devices. Data collection that takes advantage of mobile devices has attracted much interest in the transportation literature because of its low cost, ease of implementation, and the richness of the captured data. This paper investigates algorithms to filter and aggregate data collected by wireless sensors, as well as ways to fuse additional data sources to improve the estimation of various pedestrian-based performance measures. Procedures are presented to accurately filter noise in the collected data and to derive pedestrian flows, wait times, and counts from wireless sensors. The developed methods are applied to a 2-month data collection at a public transportation terminal using six sensors. The results indicate that if the penetration rate of discoverable devices is known, the number of pedestrians, pedestrian flows, and average wait times within the sensors' detection zone can be estimated accurately.
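The core estimation idea in the abstract, scaling detected discoverable devices by a known penetration rate and deriving wait times from detection timestamps, can be sketched as below. This is an illustrative simplification, not the paper's filtering pipeline: the function names and the dwell-time definition (last minus first detection per device) are assumptions.

```python
def estimate_pedestrian_count(detected_devices, penetration_rate):
    """Estimate total pedestrians in the detection zone by scaling the number
    of unique discoverable devices by the (assumed known) penetration rate,
    i.e. the fraction of pedestrians carrying a discoverable device."""
    if not 0.0 < penetration_rate <= 1.0:
        raise ValueError("penetration rate must be in (0, 1]")
    return detected_devices / penetration_rate

def average_wait_time(detections):
    """detections: mapping device_id -> list of detection timestamps (seconds).
    Dwell time per device is taken as last minus first detection; the mean
    over devices serves as an estimate of the average wait time."""
    dwell = [max(t) - min(t) for t in detections.values() if t]
    return sum(dwell) / len(dwell) if dwell else 0.0
```

For example, 30 unique devices seen under a 10% penetration rate would imply roughly 300 pedestrians; in practice the raw detections would first be filtered for noise, as the paper's procedures do.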