Object recognition in huge image data sets or in live camera images at interactive frame rates is a very demanding task, especially on embedded systems. The recognition task comprises localizing a reference object in a search image and determining its rotation and scale. The Generalized Hough Transform (GHT) is known as a powerful and robust technique to support this task by transforming the search image into a 4D parameter space. However, the GHT itself is computationally expensive and memory-intensive. This paper presents a novel hardware architecture that performs a complete 4D GHT at interactive frame rates in an FPGA. The architecture is configurable in order to allow a trade-off between performance, accuracy and hardware usage. The proposed architecture has been implemented in a low-cost Zynq-7000 FPGA and successfully evaluated in two practical applications, namely groyne detection in aerial images and traffic sign detection.
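To make the 4D parameter space concrete, the following is a minimal software sketch of a textbook GHT with an R-table, voting over position (x, y), rotation and scale. It is not the hardware architecture presented in this paper; the function names, the 10-degree gradient quantization, the candidate rotations and scales, and the synthetic edge points are all assumptions chosen for illustration.

```python
import math
from collections import defaultdict

N_BINS = 36                      # assumed quantization: 10 degrees per gradient bin
BIN_WIDTH = 360.0 / N_BINS


def gradient_bin(phi_deg):
    """Quantize a gradient direction (degrees) into an R-table bin index."""
    return int((phi_deg % 360.0) / BIN_WIDTH) % N_BINS


def build_r_table(template_edges, ref_point):
    """Map each gradient bin to the displacements from edge point to reference point."""
    table = defaultdict(list)
    for x, y, phi in template_edges:
        table[gradient_bin(phi)].append((ref_point[0] - x, ref_point[1] - y))
    return table


def ght_4d(search_edges, r_table, rotations_deg, scales):
    """Accumulate votes in a sparse 4D space keyed by (x, y, rotation idx, scale idx)."""
    acc = defaultdict(int)
    for ti, theta in enumerate(rotations_deg):
        c = math.cos(math.radians(theta))
        sn = math.sin(math.radians(theta))
        for si, s in enumerate(scales):
            for x, y, phi in search_edges:
                # Undo the hypothesized rotation before the R-table lookup.
                for dx, dy in r_table.get(gradient_bin(phi - theta), ()):
                    # Rotate and scale the stored displacement, then vote.
                    vx = x + s * (dx * c - dy * sn)
                    vy = y + s * (dx * sn + dy * c)
                    acc[(round(vx), round(vy), ti, si)] += 1
    return acc


# Synthetic template: (x, y, gradient direction in degrees), reference point at origin.
template = [(4, 0, 5), (0, 8, 35), (-6, 2, 95),
            (-4, -10, 145), (8, -6, 215), (10, 6, 305)]
r_table = build_r_table(template, (0, 0))

# Place the template in a "search image": rotate 90 degrees, scale 2, shift to (40, 25).
theta0, s0, tx, ty = 90.0, 2.0, 40, 25
c0, sn0 = math.cos(math.radians(theta0)), math.sin(math.radians(theta0))
search = [(tx + s0 * (x * c0 - y * sn0),
           ty + s0 * (x * sn0 + y * c0),
           (phi + theta0) % 360.0) for x, y, phi in template]

rotations = [30.0 * k for k in range(12)]   # candidate rotations in degrees
scales = [1.0, 1.5, 2.0]                    # candidate scale factors

acc = ght_4d(search, r_table, rotations, scales)
best = max(acc, key=acc.get)                # accumulator cell with the most votes
votes = acc[best]
```

Here `best` holds the winning cell as (x, y, rotation index, scale index). The nested loops over rotations, scales, edge points and R-table entries illustrate why the full 4D transform is costly in both computation and accumulator memory, which is the cost the abstract refers to.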