Crowd formation transformation has wide-ranging applications in fields such as unmanned aerial vehicle formation control, crowd simulation, and large-scale performances. However, planning trajectories for hundreds of agents is a challenging and tedious task. When a formation-change scheme is modified, the adjustments must typically match the desired style of transformation, and existing methods often require manual intervention at each crucial step, resulting in substantial manual labor. Motivated by these challenges, this study introduces a novel generative adversarial network (GAN) for generating crowd formation transformations. The proposed GAN learns specific styles from a set of crowd formation transformation trajectories and can transfer those styles to a new crowd with an arbitrary number of individuals, requiring minimal manual intervention. The model incorporates a space–time transformer module that aggregates spatiotemporal information to learn distinct styles of formation transformation. Furthermore, this article investigates the relationship between the distribution of the training data and the length of the trajectory sequences, providing insights into the preprocessing of training data.
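The abstract does not specify the internals of the space–time transformer module. As a rough illustration only, the general idea of aggregating spatiotemporal information can be sketched as attention applied first across agents within each frame, then along each agent's timeline; the function names, the use of identity Q/K/V projections, and the single-head formulation below are all illustrative assumptions, not the paper's method:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(x):
    # Single-head self-attention over a set of tokens, with identity
    # Q/K/V projections for brevity. x: (n_tokens, d)
    scores = x @ x.T / np.sqrt(x.shape[-1])
    return softmax(scores, axis=-1) @ x

def space_time_block(traj):
    # traj: (T, N, d) — T frames, N agents, d features per agent.
    T, N, _ = traj.shape
    # Spatial step: agents attend to one another within each frame.
    spatial = np.stack([attention(traj[t]) for t in range(T)])
    # Temporal step: each agent attends over its own trajectory.
    return np.stack([attention(spatial[:, i]) for i in range(N)], axis=1)

traj = np.random.default_rng(0).normal(size=(8, 5, 4))  # 8 frames, 5 agents
out = space_time_block(traj)
print(out.shape)  # → (8, 5, 4)
```

The alternating space-then-time factorization keeps the cost at O(T·N²) + O(N·T²) per block rather than attending jointly over all T·N tokens, which matters when the crowd contains hundreds of agents.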