To achieve intelligent perception of traffic video and to overcome the dispersion of computing modules and the separate processing of multiple tasks, the efficient chained centre network (ECCNet) is proposed as a unified framework that performs vehicle detection, vehicle classification, tracking, and speed estimation simultaneously. First, to balance the speed-accuracy trade-off, CA-CenterNet is presented; by embedding coordinate attention, it detects vehicles and classifies vehicle types more accurately to serve the cross-frame tasks. Second, 3D convolution is employed to construct self-adaptive branches for data association and speed estimation. This self-adaptive approach leverages deep learning to enhance tracking performance and avoids camera calibration by capturing motion information across frames. Moreover, a chained structure is adopted to reuse the backbone feature map, so the spatio-temporal information of adjacent frames can be extracted at almost no additional cost. Finally, the above single-frame and cross-frame tasks are integrated into a unified multi-task collaborative optimization model. The effectiveness of ECCNet is verified with experiments on the UA-DETRAC dataset. ECCNet achieves 55.5% MOTA in tracking, an F1 score of 0.76 in classification, and an MAE of 3.10 in speed estimation, at an inference speed of 32.5 Hz.
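For readers unfamiliar with the three reported metrics, the following is a minimal sketch of their standard textbook definitions. It is not the authors' evaluation code: the function names, argument names, and toy inputs are hypothetical, and the abstract does not state the unit of the speed MAE, so none is assumed here.

```python
def mota(false_negatives, false_positives, id_switches, num_gt_objects):
    """Multiple Object Tracking Accuracy: 1 - (FN + FP + IDSW) / GT,
    with all counts summed over every frame of the sequence."""
    if num_gt_objects == 0:
        raise ValueError("MOTA is undefined without ground-truth objects")
    errors = false_negatives + false_positives + id_switches
    return 1.0 - errors / num_gt_objects


def f1_score(true_positives, false_positives, false_negatives):
    """Harmonic mean of precision and recall for a single class."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    if predicted == 0 or actual == 0 or true_positives == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    return 2 * precision * recall / (precision + recall)


def mean_absolute_error(predicted_speeds, true_speeds):
    """Average absolute difference between estimated and reference speeds."""
    if len(predicted_speeds) != len(true_speeds) or not true_speeds:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(p - t) for p, t in zip(predicted_speeds, true_speeds))
    return total / len(true_speeds)


if __name__ == "__main__":
    # Toy numbers chosen for illustration only; they are not from the paper.
    print(mota(false_negatives=30, false_positives=10, id_switches=5,
               num_gt_objects=100))
    print(f1_score(true_positives=8, false_positives=2, false_negatives=2))
    print(mean_absolute_error([10.0, 20.0, 33.0], [12.0, 20.0, 30.0]))
```

MOTA can be negative when the tracker makes more errors than there are ground-truth objects, which is why it is reported as a percentage rather than a bounded score like F1.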