To achieve intelligent perception of traffic video and overcome the dispersion of computing modules and the separate processing of multiple tasks, the efficient chained centre network (ECCNet) is proposed as a unified framework that simultaneously accomplishes vehicle detection, vehicle classification, tracking, and speed estimation. First, to address the speed-accuracy trade-off, CA-CenterNet is presented, which embeds coordinate attention to detect vehicles and classify vehicle types more accurately in service of the cross-frame tasks. Second, 3D convolution is employed to construct self-adaptive branches for data association and speed estimation, respectively. This self-adaptive approach leverages deep learning to enhance tracking performance and to avoid camera calibration by capturing motion information across frames. Moreover, a chained structure is adopted to reuse the backbone feature map, so the spatio-temporal information of adjacent frames can be extracted at almost no additional cost. Finally, the above single-frame and cross-frame tasks are integrated into a unified multi-task collaborative optimization model. The effectiveness of ECCNet is verified by experiments on the UA-DETRAC dataset: ECCNet achieves 55.5% MOTA in tracking, a 0.76 F1 score in classification, and 3.10 MAE in speed estimation, at an inference speed of 32.5 Hz.