The fast Fourier transform (FFT) has applications in almost every frequency-related study, e.g., in image and signal processing and in radio astronomy. It is also used to solve partial differential equations arising in fluid flows, density functional theory, many-body theory, and other fields. A three-dimensional FFT of $N^3$ data has a large time complexity, $\mathcal{O}(N^3 \log_2 N)$. Hence, parallel algorithms are used to compute such FFTs. Popular libraries perform slab or pencil decomposition of the $N^3$ data. None of the existing libraries has achieved perfect inverse scaling of time $T$ with the number of cores $n$ ($T^{-1} \approx n$), because FFT requires all-to-all communication and clusters hitherto do not have physical all-to-all connections. Dragonfly, one of the popular interconnect topologies, supports hierarchical connections among its components. We show that aligning the all-to-all communication of FFT with the physical connections of the Dragonfly topology yields better scaling and reduces communication time.
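To make the role of all-to-all communication concrete, the following is a minimal single-process sketch of a slab-decomposed 3D transform. It is an illustration under stated assumptions, not the implementation used by any particular library: the "ranks" are simulated with plain Python lists, a naive $\mathcal{O}(N^2)$ DFT stands in for a real 1D FFT routine, the number of ranks is assumed to divide $N$, and the function names (`dft1d`, `dft2d`, `slab_dft3d`, `dft3d_direct`) are hypothetical.

```python
import cmath


def dft1d(v):
    """Naive 1D DFT; a stand-in for a real FFT routine (illustration only)."""
    n = len(v)
    return [
        sum(v[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def dft2d(plane):
    """2D DFT of one plane indexed as plane[y][z]."""
    rows = [dft1d(row) for row in plane]             # transform along z
    cols = [dft1d(list(col)) for col in zip(*rows)]  # transform along y
    return [list(r) for r in zip(*cols)]             # back to [ky][kz] order


def slab_dft3d(a, p):
    """3D DFT of a[x][y][z] with the x-axis split into p slabs (p divides N)."""
    n = len(a)
    w = n // p  # slab thickness

    # Step 1: simulated rank r owns planes x in [r*w, (r+1)*w) and
    # transforms each of them locally along y and z.
    slabs = [[dft2d(a[x]) for x in range(r * w, (r + 1) * w)] for r in range(p)]

    # Step 2: all-to-all exchange. Rank s gathers, from every rank, the rows
    # belonging to its own y-range, so it ends up holding all x for those y.
    recv = [
        [[slabs[x // w][x % w][y] for x in range(n)]
         for y in range(s * w, (s + 1) * w)]
        for s in range(p)
    ]

    # Step 3: each rank transforms along x, which is now local.
    out = [[[0j] * n for _ in range(n)] for _ in range(n)]
    for s in range(p):
        for y_local in range(w):
            y = s * w + y_local
            for z in range(n):
                line = dft1d([recv[s][y_local][x][z] for x in range(n)])
                for kx in range(n):
                    out[kx][y][z] = line[kx]
    return out


def dft3d_direct(a):
    """Reference: the 3D DFT evaluated directly from its definition."""
    n = len(a)
    r = range(n)
    return [
        [[sum(a[x][y][z]
              * cmath.exp(-2j * cmath.pi * (kx * x + ky * y + kz * z) / n)
              for x in r for y in r for z in r)
          for kz in r] for ky in r]
        for kx in r
    ]
```

In this sketch, step 2 is the only stage in which every simulated rank needs data from every other rank. In a distributed run that stage would correspond to an all-to-all exchange over the interconnect, which is the communication pattern discussed above.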