Energy-efficient serial and parallel multiplier structures are explored to assess their suitability for the low-power and ultra-low-power design regimes. 16×16-bit serial and state-of-the-art parallel multipliers are compared in 45 nm CMOS. A multiplier structure is proposed by optimizing the architecture, gate sizes, and supply voltage. For high-speed applications, the proposed structure provides 15% more throughput than a two-cycle parallel multiplier at the same energy consumption. In the low-speed design region, it provides a 3.7× energy reduction compared to the serial multiplier.