Thread-level speculation and profile-guided parallelization techniques exploit the fact that many statically detected data and control flow dependences do not manifest themselves in every possible program execution. Instead, many of these may-dependences occur only infrequently, e.g. in corner cases, or not at all for any legal program input. While the effectiveness of dynamic parallelization techniques critically depends on the absence of such dependences, little is known about their nature. In this paper, we present an empirical analysis and characterization of the variability of both data dependences and control flow across program runs. We run the CBENCH benchmark suite with 100 randomly chosen input data sets and record complete control and data flow traces. Based on these traces, we build a whole-program control and data flow graph (CDFG) for each run and compare the resulting graphs to obtain a measure of the variance in the observed control and data flow. We show that, on average, cumulative profile information gathered from at least 55, and up to 100, different input data sets is needed to achieve full coverage of the data flow observed across all runs. For control flow, between 46 and 100 data sets are needed. This suggests that profile-guided parallelization must be applied with utmost care, as we observed misclassification of sequential loops as parallel even when up to 94 input data sets were used.
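
The cumulative-coverage measure described above can be sketched as follows. This is a minimal illustration under assumed representations (the paper's actual CDFG construction and comparison are more involved): each run is reduced to a set of observed dependence edges, and we count how many input data sets are needed before the union of per-run edge sets equals the union over all runs. The function name and the toy edge data are hypothetical.

```python
def runs_to_full_coverage(per_run_edges):
    """Return the number of runs whose cumulative union of observed
    dependence edges first equals the union over all runs
    (i.e., full coverage of the observed data or control flow)."""
    full = set().union(*per_run_edges)  # everything seen across all runs
    seen = set()
    for i, edges in enumerate(per_run_edges, start=1):
        seen |= edges
        if seen == full:
            return i
    return len(per_run_edges)

# Toy example: three runs, each yielding a set of (source, sink) edges.
runs = [
    {("a", "b"), ("b", "c")},   # run 1
    {("a", "b")},               # run 2: contributes no new edges
    {("a", "b"), ("c", "d")},   # run 3: a rarely manifesting may-dependence
]
print(runs_to_full_coverage(runs))  # → 3
```

In this toy case the third run contributes an edge seen nowhere else, so all three data sets are required for full coverage; the paper's observation that 55 to 100 (data flow) or 46 to 100 (control flow) inputs are needed corresponds to this count over the real traces.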