Based on the radix-4 Booth algorithm, a scheme that integrates tap folding and coefficient folding is proposed for designing a programmable finite impulse response (FIR) architecture with low power dissipation. In addition, without increasing hardware complexity or degrading computational performance, the input data are selected so as to lower the operating frequencies of the latches and multiplexers on the input-data path. Because the input data presented to the Booth decoders are switched less frequently, the power consumed in the Booth decoders is also reduced. The proposed and conventional FIR architectures are implemented in TSMC 0.18 μm CMOS technology, and their areas and power consumption are analyzed and compared. Under the same specifications and throughput rate, the results show that, compared with the conventional architectures, the proposed FIR architecture saves about 18.18% to 39.19% in area and 14.23% to 25.56% in power consumption.
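For readers unfamiliar with the radix-4 Booth algorithm on which the scheme is based, the following is a minimal software sketch of Booth recoding applied to FIR coefficient multiplication. It is an illustration only, not the proposed architecture: it does not model tap folding, coefficient folding, or the input-data selection hardware. The function names (`booth_radix4_digits`, `booth_multiply`, `fir_filter`) and the `bits` parameter are assumptions introduced for this example.

```python
from typing import List


def booth_radix4_digits(coeff: int, bits: int) -> List[int]:
    """Recode a signed two's-complement coefficient into radix-4 Booth
    digits in {-2, -1, 0, 1, 2}, least significant digit first."""
    if not -(1 << (bits - 1)) <= coeff < (1 << (bits - 1)):
        raise ValueError("coefficient does not fit in the given width")
    if bits % 2:
        bits += 1  # sign-extend to an even number of bits
    pattern = coeff & ((1 << bits) - 1)  # two's-complement bit pattern
    digits = []
    prev = 0  # implicit bit to the right of the LSB
    for i in range(0, bits, 2):
        b0 = (pattern >> i) & 1
        b1 = (pattern >> (i + 1)) & 1
        # Each overlapping bit triplet (b1, b0, prev) maps to one digit.
        digits.append(-2 * b1 + b0 + prev)
        prev = b1
    return digits


def booth_multiply(x: int, coeff: int, bits: int) -> int:
    """Multiply x by coeff by summing Booth-selected partial products.
    Each digit selects 0, +/-x, or +/-2x, weighted by 4**i."""
    digits = booth_radix4_digits(coeff, bits)
    return sum(d * x * (4 ** i) for i, d in enumerate(digits))


def fir_filter(samples: List[int], coeffs: List[int], bits: int) -> List[int]:
    """Direct-form FIR, y[n] = sum_k h[k] * x[n-k], with zero initial state
    and every product computed through booth_multiply."""
    out = []
    for n in range(len(samples)):
        acc = 0
        for k, h in enumerate(coeffs):
            if n - k >= 0:
                acc += booth_multiply(samples[n - k], h, bits)
        out.append(acc)
    return out


if __name__ == "__main__":
    print(booth_radix4_digits(7, 4))
    print(fir_filter([1, 2, 3, 4], [7, -3, 2], bits=4))
```

In this sketch a `bits`-wide coefficient yields about `bits / 2` partial products instead of `bits`, which is the usual motivation for radix-4 recoding in multiplier-based FIR designs.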