Single-loop detectors provide the most abundant source of traffic data in California, but loop data samples are often missing or invalid. A method is described that detects bad data samples and imputes missing or bad samples to form a complete grid of clean data, in real time. The diagnostics algorithm and the imputation algorithm that implement this method are operational on 14,871 loops in six districts of the California Department of Transportation. The diagnostics algorithm detects bad (malfunctioning) single-loop detectors from their volume and occupancy measurements. Its novelty is its use of time series of many samples, instead of basing decisions on single samples, as in previous approaches. The imputation algorithm models the relationship between neighboring loops as linear and uses linear regression to estimate the value of missing or bad samples. This gives a better estimate than previous methods because it uses historical data to learn how pairs of neighboring loops behave. Detection of bad loops and imputation of loop data are important because they allow algorithms that use loop data to perform analysis without requiring them to compensate for missing or incorrect data samples.
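The regression-based imputation described above can be illustrated with a minimal sketch. It assumes a single neighboring loop, uses `NaN` to mark missing or bad samples, and fits an ordinary least-squares line on historical data. The function names `fit_pair` and `impute` are hypothetical, and the way the operational system combines several neighbors is not stated in the abstract and is not modeled here.

```python
import numpy as np

def fit_pair(neighbor_hist, target_hist):
    """Fit target ~ a0 + a1 * neighbor on historical samples where both loops are valid.

    Assumption: invalid samples are encoded as NaN.
    """
    x = np.asarray(neighbor_hist, dtype=float)
    y = np.asarray(target_hist, dtype=float)
    ok = ~np.isnan(x) & ~np.isnan(y)
    a1, a0 = np.polyfit(x[ok], y[ok], 1)  # returns slope first, then intercept
    return a0, a1

def impute(target, neighbor, a0, a1):
    """Replace NaN samples in target with the regression estimate from the neighbor."""
    t = np.asarray(target, dtype=float).copy()
    n = np.asarray(neighbor, dtype=float)
    fill = np.isnan(t) & ~np.isnan(n)  # only fill where the neighbor itself is valid
    t[fill] = a0 + a1 * n[fill]
    return t

# Illustrative values only, not measured data.
a0, a1 = fit_pair([10, 20, 30, 40], [12, 22, 32, 42])
clean = impute([np.nan, 25.0], [50.0, 60.0], a0, a1)
```

Samples that are already valid are left untouched; only the gaps are filled from the neighbor's reading.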
An approach is presented for estimating future travel times on a freeway using flow and occupancy data from single-loop detectors and historical travel-time information. Linear regression, with the stepwise-variable-selection method and more advanced tree-based methods, is used. The analysis considers forecasts ranging from a few minutes into the future up to an hour ahead. Leave-a-day-out cross-validation was used to evaluate the prediction errors without underestimation. The current traffic state proved to be a good predictor for the near future, up to 20 min, whereas historical data are more informative for longer-range predictions. Tree-based methods and linear regression both performed satisfactorily, showing slightly different qualitative behaviors for each condition examined in this analysis. Unlike preceding works that rely on simulation, real traffic data were used. Although the current implementation uses measured travel times from probe vehicles, the ultimate goal is an autonomous system that relies strictly on detector data. In the course of presenting the prediction system, the manner in which travel times change from day to day was examined, and several metrics to quantify these changes were developed. The metrics can be used as input for travel-time prediction, but they also should be beneficial for other applications, such as calibrating traffic models and planning models.
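Leave-a-day-out cross-validation, as named above, can be sketched as follows: every sample from one day is held out, the model is fitted on the remaining days, and the held-out day is predicted. The sketch assumes plain least-squares regression without stepwise variable selection, and the function name `leave_a_day_out_rmse` and its arguments are hypothetical.

```python
import numpy as np

def leave_a_day_out_rmse(X, y, days):
    """Root-mean-square prediction error, holding out one whole day at a time.

    X    : (n_samples, n_features) predictors, e.g. flow and occupancy (assumed layout)
    y    : (n_samples,) observed travel times
    days : (n_samples,) day label of each sample
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    days = np.asarray(days)
    A = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    sq_errors = []
    for d in np.unique(days):
        held_out = days == d
        # Fit only on the other days, so the held-out day never informs the model.
        coef, *_ = np.linalg.lstsq(A[~held_out], y[~held_out], rcond=None)
        sq_errors.append((A[held_out] @ coef - y[held_out]) ** 2)
    return float(np.sqrt(np.mean(np.concatenate(sq_errors))))
```

Holding out whole days rather than random samples keeps strongly correlated samples from the same day out of the training set, which is why the error is not underestimated.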
Measures of structural similarity between known protein structures provide an objective basis for classifying protein folds and for revealing a global view of the protein structure universe. Here, we describe a rapid method to measure structural similarity based on the profiles of representative local features of Cα distance matrices of compared protein structures. We first extract a finite number of representative local feature (LF) patterns from the distance matrices of all protein fold families by medoid analysis. Then, each Cα distance matrix of a protein structure is encoded by labeling all its submatrices by the index of the nearest representative LF patterns. Finally, the structure is represented by the frequency distribution of these indices, which we call the LF frequency (LFF) profile of the protein. The LFF profile allows one to calculate structural similarity scores among a large number of protein structures quickly, and also to construct and update the "map" of the protein structure universe easily. The LFF profile method efficiently maps complex protein structures into a common Euclidean space without prior assignment of secondary structure information or structural alignment.

protein structural similarity | protein distance matrix | local protein structural features profile | protein fold | protein fold space
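The encoding steps above can be sketched roughly as follows. The sketch assumes the representative LF patterns are already available as an array of `w` by `w` matrices (the medoid analysis itself is not shown), that submatrices are taken with a sliding window of stride 1, and that "nearest" means smallest squared Euclidean distance. The window size, stride, normalization, and the names `distance_matrix` and `lff_profile` are all assumptions made for illustration, not details taken from the abstract.

```python
import numpy as np

def distance_matrix(coords):
    """Pairwise Euclidean distances between C-alpha coordinates, shape (n, 3) -> (n, n)."""
    c = np.asarray(coords, dtype=float)
    diff = c[:, None, :] - c[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def lff_profile(dmat, patterns, w):
    """Frequency of the nearest representative pattern over all w-by-w submatrices."""
    patterns = np.asarray(patterns, dtype=float)
    flat = patterns.reshape(len(patterns), -1)  # one row per representative pattern
    n = dmat.shape[0]
    if n < w:
        raise ValueError("structure is smaller than the window size")
    counts = np.zeros(len(patterns))
    for i in range(n - w + 1):
        for j in range(n - w + 1):
            sub = dmat[i:i + w, j:j + w].ravel()
            nearest = np.argmin(((flat - sub) ** 2).sum(axis=1))
            counts[nearest] += 1
    return counts / counts.sum()  # normalize counts to a frequency distribution
```

Because every structure is reduced to a fixed-length vector, two structures can then be compared without alignment, for example with `np.linalg.norm(profile_a - profile_b)`.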