The chemical-potential multiphase lattice Boltzmann method (CP-LBM) has the advantages of satisfying the thermodynamic consistency and Galilean invariance, and it realizes a very large density ratio and easily expresses the surface wettability. Compared with the traditional central difference scheme, the CP-LBM uses the Thomas algorithm to calculate the differences in the multiphase simulations, which significantly improves the calculation accuracy but increases the calculation complexity. In this study, we designed and implemented a parallel algorithm for the chemical-potential model on a graphic processing unit (GPU). Several strategies were used to optimize the GPU algorithm, such as coalesced access, instruction throughput, thread organization, memory access, and loop unrolling. Compared with dual-Xeon 5117 CPU server, our methods achieved 95 times speedup on an NVIDIA RTX 2080Ti GPU and 106 times speedup on an NVIDIA Tesla P100 GPU. When the algorithm was extended to the environment with dual NVIDIA Tesla P100 GPUs, 189 times speedup was achieved and the workload of each GPU reached 96%.