FPGAs have been successfully used to accelerate many computationally bound applications, such as highperformance Monte-Carlo simulations, but the amount of programmer effort required in development, testing, and tuning is also very high, requiring a new custom design for each application. This paper presents Contessa, a pure-functional continuation-based language for describing path-based Monte-Carlo simulations, and a completely automated method for turning platform-independent Contessa programs into high-performance hardware implementations. Our approach exploits the large degree of thread-based parallelism available in Monte-Carlo simulations, allowing data-dependent control-flow and loopcarried dependencies to be expressed, while retaining highperformance. The Contessa toolchain is evaluated using five different simulation kernels, in comparison to both software and manually described hardware. When compared to an existing FPGA implementation, Contessa requires a quarter of the Handel-C source-code length, and doubles the clock rate to over 300MHz while requiring a similar number of resources, and also provides a 35 times speedup over a C++ implementation using an Opteron 2.2GHz.