Fully Homomorphic Encryption is a theoretically sound approach to cloud security, but it is not yet used in practice because of the tremendous computational cost of multiplying very large, million-bit operands. In this paper, we explore the design space of a software/hardware (SW/HW) co-designed accelerator that integrates fast software multiplication algorithms with a configurable hardware multiplier. The multiplier is based on a modified serial-parallel multiplier design, of which the schoolbook multiplier is a special case. We conduct an analytical performance study that explores key design-space parameters and compares the design with other approaches in the literature. Based on an actual FPGA implementation, we estimate a power consumption of 10 Watts and an area-time-power product of 20.20 billion transistor-sec-Watt, which could allow the design to scale well.