We develop a probabilistic implementation of the nonlocal gate $\exp[i\xi\sigma_{n_A}\sigma_{n_B}]$ with $\xi \in [0, \pi/4]$, using a single non-maximally entangled state. We prove that nonlocal gates can be implemented with a fidelity greater than 79.3% and a consumption of less than 0.969 ebits and 2 classical bits when $\xi \le 0.353$. This provides a higher bound for the feasible operation than previous techniques [9,14,16]. In addition, gates with $\xi \ge 0.353$ can be implemented with probability 79.3% and a consumption of 0.969 ebits, which is the same efficiency as the distillation-based protocol [14,16], while our method saves extra classical resources. Gates with $\xi \to 0$ can be implemented with nearly unit probability and a small amount of entanglement. We also generalize some applications to multipartite systems, where we find that it is possible to implement certain nonlocal gates between many non-entangled partners using a non-maximally entangled multipartite state.

Entanglement has been examined as an essential resource in most applications of quantum information, such as enhanced classical communication, dense coding and quantum cryptography [1]. Among these tasks, the implementation of nonlocal quantum operations on spatially distributed systems is an important one, especially in quantum computation. This is because the only operation a quantum computer performs has been shown to be a collective unitary operation, which can also create entanglement between distributed groups. In particular, the latter effect implies that one can realize the above tasks, such as quantum teleportation [2], starting from an entangling operation. Moreover, nonlocal Hamiltonians directly give rise to the dynamical evolution of distributed quantum systems, so it is also worthwhile to explore other properties such as the structure and interconvertibility of nonlocal gates. In fact, many results have been reported in this direction [3,4,5,6,7,8,9,10,11,12,13,14,15].
* Electronic address: deteriorate@zju.edu.cn
† Electronic address: yxchen@zimp.edu.cn

Here, a prime problem is the efficiency of implementing a given nonlocal gate. It is accepted that a consumption of at least 1 ebit and 2 cbits is required to faithfully implement a general nonlocal operation [8]. On the other hand, remarkable progress has been made in Ref. [9], which shows that gates with small $\xi$ can be implemented using less entanglement and more classical communication. However, the scheme therein is inefficient for gates with larger $\xi$, and it usually requires an excessive amount of classical resources because of the many quantum channels required. This deficiency has been remedied by the recent work [14], which employs a probabilistic protocol to implement the nonlocal gates with a high fidelity when $\xi$ is small. The classical communication required therein is 2 bits, while the entanglement can be made arbitrarily small since it relates to the success fidelity. However, the best attainable fidelity decreases rapidly with increasing $\xi$, thus it is irresponsi...