The multiple-unmanned-aerial-vehicle (multi-UAV) mobile edge network is a promising networking paradigm that uses multiple resource-limited, trajectory-planned unmanned aerial vehicles (UAVs) as edge servers, on which on-demand virtual network functions (VNFs) are deployed to provide low-delay virtualized network services for the requests of ground users (GUs), who often move randomly and have difficulty accessing the Internet. However, VNF deployment and UAV trajectory planning are both typical NP-complete problems, and the two operations are strongly coupled: each decision constrains the other. Achieving optimal virtualized service provision (i.e., maximizing the number of accepted GU requests within a given period T while minimizing the total energy consumption and request-acceptance cost across all UAVs) is therefore challenging. In this paper, we propose an improved online deep reinforcement learning (DRL) scheme to tackle this issue. First, we formulate the joint optimization of the two operations as a nonconvex mixed-integer nonlinear programming problem, which can be viewed as a sequence of one-frame joint VNF-deployment and UAV-trajectory-planning subproblems. Second, we propose an online DRL algorithm that jointly optimizes discrete (VNF deployment) and continuous (UAV trajectory planning) actions to solve each subproblem; its key idea is to model and exploit the coupling between the discrete and continuous actions. Finally, we evaluate the proposed scheme through extensive simulations, and the results demonstrate its effectiveness.
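To make the joint discrete-continuous action idea concrete, the sketch below shows one common way to realize such a hybrid action space in a DRL actor: the state is encoded once, a discrete head samples a VNF-deployment choice, and a continuous head produces the per-frame trajectory action conditioned on that choice, so the coupling between the two action types is captured inside the policy. This is a minimal illustrative sketch (PyTorch), not the paper's actual network; all names and dimensions (state_dim, n_deploy_choices, traj_dim) are assumptions.

```python
import torch
import torch.nn as nn


class HybridActor(nn.Module):
    """Illustrative actor with a discrete head (VNF deployment) and a
    continuous head (UAV trajectory) conditioned on the discrete choice."""

    def __init__(self, state_dim: int, n_deploy_choices: int, traj_dim: int, hidden: int = 128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        # Discrete head: logits over candidate VNF deployment decisions.
        self.deploy_head = nn.Linear(hidden, n_deploy_choices)
        # Continuous head: bounded trajectory action (e.g., normalized heading
        # and speed for the current frame), conditioned on the deployment choice.
        self.traj_head = nn.Sequential(
            nn.Linear(hidden + n_deploy_choices, hidden), nn.ReLU(),
            nn.Linear(hidden, traj_dim), nn.Tanh(),
        )

    def forward(self, state: torch.Tensor):
        h = self.encoder(state)
        deploy_logits = self.deploy_head(h)
        deploy_dist = torch.distributions.Categorical(logits=deploy_logits)
        deploy_action = deploy_dist.sample()
        one_hot = nn.functional.one_hot(deploy_action, deploy_logits.shape[-1]).float()
        traj_action = self.traj_head(torch.cat([h, one_hot], dim=-1))
        return deploy_action, traj_action, deploy_dist.log_prob(deploy_action)


# Example: one frame with a 10-dimensional state, 4 deployment options,
# and a 2-dimensional trajectory action (all values are illustrative).
actor = HybridActor(state_dim=10, n_deploy_choices=4, traj_dim=2)
deploy_a, traj_a, log_prob = actor(torch.randn(1, 10))
```

Conditioning the continuous head on the sampled discrete action is only one possible design; the paper's scheme may realize the coupling differently, but the sketch shows why the two action types cannot be optimized independently within a frame.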