Always-on image processing is crucial for many applications, such as face and attention detection, and is usually offloaded to dedicated, energy-efficient image processors. These processors must be flexible and scalable to keep pace with the rapid evolution of image sensors and always-on image processing workloads. One flexible architecture is the shared-memory cluster, in which multiple cores are tightly coupled to L1 memory. However, current clusters are not latency tolerant and follow a uniform memory access approach, which limits their frequency and scalability. The MemPool architecture [1] lifts these constraints by combining latency-tolerant cores, pipelined functional processing units, and a non-uniform memory access interconnect. This paper presents MinPool, a low-power image processor for always-on functions, implemented in TSMC's 65 nm technology and based on a tailored MemPool architecture. Thanks to an instruction set architecture extension tuned for image processing and a low-leakage process, it achieves high core utilization, with up to 0.98 instructions per cycle (IPC), and an energy efficiency of 65 GOPS/W on key image processing kernels.