We present the implementation and optimization of a GPU-accelerated molecular dynamics (MD) simulation of a high-speed collision molecular model, written in NVIDIA CUDA. Three optimization methods are described: spatial decomposition, use of shared memory, and use of a blockcell-link structure. These methods improve performance by reducing data transfer time between the CPU and the GPU and by reducing memory access time on the GPU. We test the GPU-accelerated MD algorithm on an NVIDIA Tesla C870. The performance of the GPU code, hosted on an AMD Athlon64 4400+, is compared with that of a CPU-only version. The GPU-accelerated simulation achieves a speedup of 38.37 times over the sequential version running on a single core of the CPU. The peak performance reached on the C870 is 15 GFLOPS.
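To make the cell-link idea concrete, the following is a minimal host-side C++ sketch of a plain cell-linked list, the general technique on which cell-link structures are built. It is not the blockcell-link variant or the CUDA kernel of this work: the names `Particle`, `CellList`, `countPairsCellList`, and `countPairsBruteForce` are hypothetical, and the cubic, non-periodic box with particle coordinates in [0, box) is an assumption made only for illustration. The sketch bins particles into cells at least as wide as the cutoff, so each particle needs to be checked only against particles in its own and adjacent cells.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical particle type; coordinates are assumed to lie in [0, box).
struct Particle { float x, y, z; };

static float dist2(const Particle& a, const Particle& b) {
    const float dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return dx * dx + dy * dy + dz * dz;
}

// Plain cell-linked list over a cubic, non-periodic box (illustrative assumption).
struct CellList {
    int n;                  // cells per dimension
    float cellSize;         // box / n, which is >= cutoff by construction
    std::vector<int> head;  // head[c]: first particle in cell c, or -1
    std::vector<int> next;  // next[i]: next particle in the same cell, or -1

    CellList(const std::vector<Particle>& p, float box, float cutoff)
        : n(std::max(1, static_cast<int>(box / cutoff))),
          cellSize(box / n),
          head(static_cast<std::size_t>(n) * n * n, -1),
          next(p.size(), -1) {
        // Push each particle onto the front of its cell's list.
        for (int i = 0; i < static_cast<int>(p.size()); ++i) {
            const int c = index(coord(p[i].x), coord(p[i].y), coord(p[i].z));
            next[i] = head[c];
            head[c] = i;
        }
    }

    // Cell coordinate along one axis, clamped to the valid range.
    int coord(float v) const {
        return std::min(n - 1, std::max(0, static_cast<int>(v / cellSize)));
    }
    int index(int cx, int cy, int cz) const { return (cz * n + cy) * n + cx; }
};

// Count pairs closer than `cutoff`, visiting only the 27 surrounding cells.
long countPairsCellList(const std::vector<Particle>& p, float box, float cutoff) {
    const CellList cl(p, box, cutoff);
    const float rc2 = cutoff * cutoff;
    long count = 0;
    for (int i = 0; i < static_cast<int>(p.size()); ++i) {
        const int cx = cl.coord(p[i].x), cy = cl.coord(p[i].y), cz = cl.coord(p[i].z);
        for (int dz = -1; dz <= 1; ++dz)
            for (int dy = -1; dy <= 1; ++dy)
                for (int dx = -1; dx <= 1; ++dx) {
                    const int x = cx + dx, y = cy + dy, z = cz + dz;
                    if (x < 0 || y < 0 || z < 0 || x >= cl.n || y >= cl.n || z >= cl.n)
                        continue;  // no periodic images in this sketch
                    for (int j = cl.head[cl.index(x, y, z)]; j != -1; j = cl.next[j])
                        if (j > i && dist2(p[i], p[j]) < rc2) ++count;  // each pair once
                }
    }
    return count;
}

// Reference O(N^2) count used to check the cell-list result.
long countPairsBruteForce(const std::vector<Particle>& p, float cutoff) {
    const float rc2 = cutoff * cutoff;
    long count = 0;
    for (std::size_t i = 0; i < p.size(); ++i)
        for (std::size_t j = i + 1; j < p.size(); ++j)
            if (dist2(p[i], p[j]) < rc2) ++count;
    return count;
}
```

Calling `countPairsCellList(particles, box, cutoff)` should return the same count as `countPairsBruteForce(particles, cutoff)` while examining far fewer candidate pairs when the box is much larger than the cutoff. On a GPU, the same `head`/`next` arrays could plausibly be traversed by one thread per particle, which is where staging a cell's particles in shared memory would reduce global memory traffic; that mapping is an assumption here, not a description of the authors' kernel.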