Abstract: In highly interactive applications, low latency (the time between a user's action and the response to this action) is critical for a good user experience. Traditional GPU architectures can make very low latencies difficult to achieve. This is because they are designed first and foremost to implement the painter's algorithm, a rendering algorithm that trades off visual realism for moderate computational speed and high scene dynamism. The dataflow programming paradigm, along with dedicated toolchains such as Maxeler's MaxCompiler, enables the design of application-specific graphics accelerators. Such accelerators have the advantage that their architecture can be completely customised. In this paper we present a custom renderer that composites 2D sprites and maps to emulate a graphical user interface. It was designed to facilitate user interaction tests described in our previous work. Our design is ultra low latency, updating what is being driven to the display within 1 ms of receiving user input. This is far lower than the latency of traditional GPUs. We describe the operation of our renderer and our novel DVI display driver output stage. We measure a latency of under 1 ms for our renderer, with an end-to-end delay of 6 ms for our whole apparatus. We compare this with the end-to-end latency of the same apparatus built with a modern GPU, which we measure at 20 ms.