We propose a sub-mW H.264 baseline-profile motion estimation processor for portable video applications. It features a VLSI-oriented block partitioning strategy and a low-power SIMD/systolic-array datapath architecture, in which the datapath can be switched between a SIMD and a systolic array depending on the processing flow. The processor supports all seven block modes and can handle three reference frames for sequences ranging from CIF (352 × 288) at 30 fps to QCIF (176 × 144) at 15 fps, with quarter-pixel accuracy. It integrates 3.3 million transistors and occupies 2.8 × 3.1 mm² in a 130-nm CMOS technology. The proposed processor consumes 800 µW for a QCIF 15-fps sequence with one reference picture.
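
To put the quoted operating points in perspective, the following minimal sketch works out the macroblock throughput implied by the CIF and QCIF formats and a rough average energy per macroblock at the reported 800 µW. It assumes the standard 16 × 16 H.264 macroblock and the seven standard inter-prediction partition sizes; the function names are hypothetical, and the energy figure is simply average power divided by macroblock rate, not a measured per-block value.

```python
# Illustrative arithmetic only; names are hypothetical, not from the processor design.

MB_SIZE = 16  # H.264 macroblock edge length in luma pixels

# The seven H.264 inter-prediction partition sizes (width, height).
BLOCK_MODES = [(16, 16), (16, 8), (8, 16), (8, 8), (8, 4), (4, 8), (4, 4)]


def macroblocks_per_frame(width, height):
    """Number of 16x16 macroblocks in a frame of the given luma size."""
    return (width // MB_SIZE) * (height // MB_SIZE)


def macroblock_rate(width, height, fps):
    """Macroblocks that must be processed per second."""
    return macroblocks_per_frame(width, height) * fps


def partitions_per_macroblock(block_w, block_h):
    """How many blocks of this mode tile one macroblock."""
    return (MB_SIZE // block_w) * (MB_SIZE // block_h)


def energy_per_macroblock_nj(power_w, width, height, fps):
    """Average energy per macroblock in nanojoules (power / macroblock rate)."""
    return power_w / macroblock_rate(width, height, fps) * 1e9


if __name__ == "__main__":
    print("CIF  30 fps:", macroblock_rate(352, 288, 30), "macroblocks/s")
    print("QCIF 15 fps:", macroblock_rate(176, 144, 15), "macroblocks/s")
    for w, h in BLOCK_MODES:
        print(f"{w}x{h}: {partitions_per_macroblock(w, h)} blocks per macroblock")
    nj = energy_per_macroblock_nj(800e-6, 176, 144, 15)
    print(f"QCIF 15 fps at 800 uW: about {nj:.0f} nJ per macroblock")
```

Under these assumptions, CIF at 30 fps requires eight times the macroblock rate of QCIF at 15 fps, before accounting for the number of reference frames searched.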