A tropical cyclone (TC) is an extreme tropical weather system that can cause serious disasters in the areas it passes through. Accurate TC track forecasting is key to reducing casualties and damage; however, long-term forecasting of TCs is challenging because of their highly dynamic and uncertain behavior. Existing TC track forecasting methods mainly rely on a single modality of source data and suffer from limited long-term forecasting capability and high computational complexity. In this paper, we address these challenges from a new perspective: by utilizing large-scale spatio-temporal multimodal historical data and advanced deep learning techniques. We propose and evaluate a novel multi-horizon TC track forecasting model named the Dual-Branched spatio-temporal Fusion Network (DBF-Net). DBF-Net contains a TC features branch that extracts temporal features from 2D state vectors and a pressure field branch that extracts spatio-temporal features from 3D reanalysis pressure fields. We show that with this design, DBF-Net can exploit the implicit associations in multimodal data, achieving advantages that unimodal methods lack. Extensive experiments on 39 years of historical TC track data from the Northwest Pacific show that DBF-Net achieves a significant accuracy improvement over previous TC track forecasting methods.
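As a rough illustration of the dual-branch data flow described above, the following sketch passes a sequence of TC state vectors through a simple recurrent branch, summarizes a sequence of gridded pressure fields with pooling, concatenates the two feature vectors, and maps them to one (latitude, longitude) offset per forecast horizon. It is not the DBF-Net implementation: the tanh recurrence, average pooling, concatenation-based fusion, linear output head, random placeholder weights, tensor shapes, and all function and parameter names are assumptions chosen for illustration.

```python
import math
import random


def make_matrix(rows, cols, rng):
    """Small random weight matrix (placeholder for learned parameters)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def track_branch(states, w_in, w_rec):
    """Temporal branch (assumed form): tanh recurrence over TC state vectors."""
    h = [0.0] * len(w_rec)
    for s in states:
        a = matvec(w_in, s)
        b = matvec(w_rec, h)
        h = [math.tanh(x + y) for x, y in zip(a, b)]
    return h


def pressure_branch(fields, pool=2):
    """Spatio-temporal branch (assumed form): average-pool each grid,
    then average the pooled grids over time."""
    pooled_steps = []
    for grid in fields:
        rows, cols = len(grid), len(grid[0])
        cells = []
        for i in range(0, rows, pool):
            for j in range(0, cols, pool):
                block = [grid[a][b]
                         for a in range(i, min(i + pool, rows))
                         for b in range(j, min(j + pool, cols))]
                cells.append(sum(block) / len(block))
        pooled_steps.append(cells)
    n = len(pooled_steps)
    return [sum(step[k] for step in pooled_steps) / n
            for k in range(len(pooled_steps[0]))]


def forecast(states, fields, params, horizons):
    """Fuse both branches by concatenation and emit one (d_lat, d_lon)
    pair per forecast horizon through a linear head."""
    h_track = track_branch(states, params["w_in"], params["w_rec"])
    h_press = pressure_branch(fields)
    fused = h_track + h_press  # list concatenation of the two feature vectors
    out = matvec(params["w_out"], fused)
    return [(out[2 * k], out[2 * k + 1]) for k in range(horizons)]


# Toy usage with synthetic inputs (all sizes are hypothetical).
rng = random.Random(0)
T, F, H, W, hidden, horizons = 4, 3, 4, 4, 5, 3
states = [[rng.random() for _ in range(F)] for _ in range(T)]
fields = [[[rng.random() for _ in range(W)] for _ in range(H)] for _ in range(T)]
params = {
    "w_in": make_matrix(hidden, F, rng),
    "w_rec": make_matrix(hidden, hidden, rng),
    "w_out": make_matrix(2 * horizons, hidden + (H // 2) * (W // 2), rng),
}
track = forecast(states, fields, params, horizons)
```

The sketch only shows how two modalities with different shapes can be reduced to feature vectors and combined into a single multi-horizon output; a trained model would replace the placeholder weights and fixed pooling with learned layers.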