The Montgomery algorithm is the most common method for implementing modular multiplication. This work proposes a new systolic architecture for performing high-radix Montgomery multiplication on modern FPGAs, which are rich in dedicated hard-core multiplier resources; the architecture is suitable for use in public-key coprocessors. In modern FPGA design, building on the dedicated hard cores available on the device is the recommended practice. By following this design approach, the new multiplier architecture achieves high throughput. Compared with previous work based on the same type of architecture, the improved architecture saves nearly half of the dedicated multipliers in the FPGA.
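For readers unfamiliar with the high-radix form of the algorithm, the following is a minimal software sketch of word-serial Montgomery multiplication. It is only a functional model and not the systolic architecture proposed here. The function name `montgomery_multiply`, the parameter `word_bits`, and the radix of 2^16 are illustrative assumptions, and the inputs are assumed to satisfy 0 <= a, b < n with n odd.

```python
def montgomery_multiply(a, b, n, word_bits=16):
    """Return a * b * R^-1 mod n, where R = 2**(word_bits * num_words).

    Hypothetical reference model: processes one radix-2**word_bits digit
    of `a` per iteration. Requires n odd and 0 <= a, b < n.
    """
    if n % 2 == 0:
        raise ValueError("modulus must be odd")
    base = 1 << word_bits
    mask = base - 1
    # Number of radix-2**word_bits digits needed to cover the modulus.
    num_words = (n.bit_length() + word_bits - 1) // word_bits
    # Precomputed constant n' = -n^-1 mod base (needs Python 3.8+ for pow).
    n_prime = (-pow(n, -1, base)) % base

    s = 0
    for i in range(num_words):
        a_i = (a >> (i * word_bits)) & mask    # i-th digit of a
        s += a_i * b                           # partial product
        q = ((s & mask) * n_prime) & mask      # quotient digit
        s = (s + q * n) >> word_bits           # low word is now zero; exact shift
    # The loop keeps s < 2n, so one conditional subtraction suffices.
    if s >= n:
        s -= n
    return s


def montgomery_radix(n, word_bits=16):
    """Return R for the given modulus and word size (hypothetical helper)."""
    num_words = (n.bit_length() + word_bits - 1) // word_bits
    return 1 << (word_bits * num_words)
```

To obtain an ordinary modular product, the operands are first converted to Montgomery form (multiplied by R mod n), multiplied, and then converted back by a Montgomery multiplication with 1. In hardware, the digit-by-operand products inside the loop are the operations that would typically be mapped onto dedicated multipliers.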