Abstract: To accelerate many computational software algorithms, designers are implementing them as computational circuits. These algorithms are diverse and include molecular dynamics, weather simulation, video encoding, and financial modelling. Circuit designers repeatedly synthesize and simulate circuits during debugging and incremental design, but because computational circuits are large, these steps are slow and reduce designer productivity. In this paper, we present an architecture and tool flow for rapidly compiling and then simulating or executing computational circuits. We use a motion estimation circuit to demonstrate the performance versus capacity scalability of our architecture, and show that its performance is comparable to that of an FPGA-based design.