Thanks to their mobility, flexibility, and adaptive altitude, unmanned aerial vehicles (UAVs) have been adopted in several applications in wireless communication systems. In particular, UAVs can function as aerial base stations (BSs) to increase cellular network capacity and provide persistent coverage. However, because of the line-of-sight channels between UAVs and user equipments (UEs), UAVs may generate strong co-channel interference in the network. In this article, a joint transmission coordinated multi-point technique is considered in the downlink of a cellular network to effectively manage the interference caused by UAVs. A two-timescale optimization approach is proposed to avoid exchanging excessive messages and channel information between the transmission points. The objective of the proposed method is to maximize the sum rate of the network subject to constraints on the UEs' quality of service and on the total transmit power of each ground BS and each UAV. The proposed approach is decomposed into a long-term problem and a short-term problem. The long-term problem jointly optimizes the UEs' clusters, the UAVs' locations, and the initial transmit beamforming vectors based on the large-scale channel gain. This problem is a mixed-integer nonlinear program, which is NP-hard. The short-term problem, in turn, obtains the transmit beamforming vectors according to the overall channel vector. This problem is likewise non-convex. To address these challenges, two algorithms are proposed for these two problems, based on difference of convex functions programming and the successive convex approximation technique. Extensive simulations show that the proposed approach outperforms comparable algorithms.
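To give a feel for the difference of convex functions (DC) programming idea mentioned above, the following minimal sketch applies it to a toy scalar problem. It is not the algorithm of this article: the objective f(x) = x^4 - 2x^2, the function name `dc_minimize`, and all parameter values are assumptions chosen only for illustration. The objective is written as g(x) - h(x) with g(x) = x^4 and h(x) = 2x^2 both convex; at each iteration h is replaced by its tangent at the current point, which leaves a convex surrogate that here can be minimized in closed form.

```python
import math


def dc_minimize(x0, iters=50, tol=1e-12):
    """Toy DC programming loop for f(x) = g(x) - h(x).

    Illustrative assumption: g(x) = x**4 and h(x) = 2*x**2 (both convex).
    Returns the final iterate and the objective value at every iterate.
    """
    def f(x):
        return x**4 - 2.0 * x**2

    x = x0
    history = [f(x)]
    for _ in range(iters):
        # Linearize h at the current iterate: h(x) ~ h(x_k) + slope * (x - x_k).
        slope = 4.0 * x  # h'(x_k)
        # Convex surrogate: x**4 - slope * x (+ constant).
        # Its minimizer solves 4 * x**3 = slope, i.e. x = cbrt(slope / 4).
        x_new = math.copysign(abs(slope / 4.0) ** (1.0 / 3.0), slope)
        history.append(f(x_new))
        converged = abs(x_new - x) < tol
        x = x_new
        if converged:
            break
    return x, history


if __name__ == "__main__":
    x_star, history = dc_minimize(0.1)
    print(f"stationary point: {x_star:.6f}, objective: {history[-1]:.6f}")
```

Starting from a positive initial point, the iterates approach the local minimizer x = 1, and the objective value does not increase from one iteration to the next, which is the property that makes this kind of successive convex surrogate attractive for non-convex problems. In a beamforming setting the surrogate would typically be handed to a convex solver instead of being solved in closed form.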