A conflict probe is an air traffic management decision support tool that predicts aircraft-to-aircraft and aircraft-to-airspace conflicts. To earn the confidence of the air traffic controllers who use it, a conflict probe must predict these conflicts accurately. This paper discusses how a conflict probe's quantitative accuracy requirements can be tested using hypothesis testing techniques. It also argues that air traffic scenarios based on recorded field data are essential to evaluating a conflict probe, and that time shifting these scenarios can create the data samples needed for hypothesis testing. Finally, the paper compares three time shifting techniques: time compression, random time adjustment, and an implementation of a genetic algorithm.