Motivation

During the past few years the nature of integrated circuit design has slowly changed: the continued scaling of the underlying technology has moved designs from being limited by the amount of functionality on a chip to being power-constrained. The nature of the power constraints may differ (e.g., the chips in cell phones vs. desktop processors), but in many cases today, and in most cases in the future, the performance one can achieve will depend on how efficiently the computation can be done per unit of energy. Historically, CMOS circuits have always shown a strong relationship between power and performance, but the power of the chip remained within the allowable power envelope; in this scenario, designers focused primarily on achieving the needed performance. Power, if considered, was only checked to ensure that it was not too high. In the power-limited scaling regime, achieving the highest performance requires using the most energy-efficient method available; otherwise one will overrun the specified power/energy budget.

This new relationship between peak achievable performance and energy efficiency changes the way one tends to think about design. Traditionally, architects try to create a machine organization that has the "best" performance. This design is then passed to the block designers, who in turn try to build the blocks to achieve peak performance. If energy efficiency is the key to achieving high performance, optimizing each layer individually will not lead to an optimal design; rather, it will lead to a design that dissipates too much power. Instead, one needs to optimize the design by applying the most power-efficient techniques first, until the desired performance or power is reached.

Merely optimizing for the most energy-efficient design is misleading, since this approach rarely achieves the needed performance.
Thus, the correct optimization typically either minimizes the energy consumption subject to a throughput constraint or maximizes the amount of computation for a given amount of energy. Both of these design optimizations can be achieved if the tradeoffs between energy and delay are known.

The dramatic increase in leakage currents in today's (and future) technologies adds another factor to the optimization problem. Since some of the leakage power can be traded off for the dynamic power of the design, the optimization needs to select the correct balance here as well. Furthermore, as the ratio of leakage-to-active power increases, the optimal architecture and circuits also change. From a power budget perspective, leaky gates are expensive since they cost watts even when they are inactive. Thus, for leaky technologies, one wants to keep the gates as active as possible, leading to deeply pipelined, rather than parallel, architectures.

Design methods that explore "true power minimization" need to work in a high-dimensional search space, where power and performance of different solutions are compared. This includes system architecture optimization ...