Detecting faults in evolving systems is important, and change impact analysis has been shown to be effective for finding faults during software evolution. For example, Chianti represents program edits as atomic changes, selects the tests affected by those changes, and determines the subset of affecting changes that might have caused each test failure. In practice, however, the number of affecting changes related to each test failure may still be overwhelming for manual inspection. In this paper, we present a novel approach, FAULTTRACER, which ranks program edits by their suspiciousness to reduce the developer effort needed to inspect affecting changes manually. FAULTTRACER adapts spectrum-based fault localization techniques, which assume that statements executed primarily by failed tests are more suspicious, and applies them in tandem with an enhanced change impact analysis to identify failure-inducing edits more precisely. We conducted an experimental study on 23 versions of four real-world Java programs from the Software-artifact Infrastructure Repository. The results show that FAULTTRACER localizes a real regression fault within the top three atomic changes for 14 of the 22 studied real failures. When ranking only method-level changes, FAULTTRACER reduces the number of changes to be inspected manually, compared with the existing ranking heuristic, by more than 50% on the data set of real regression faults and by more than 60% on the data set of seeded faults. The fault localization component of FAULTTRACER is 80% more effective than traditional spectrum-based fault localization and yields similar benefits whether used with our enhanced change impact analysis or with Chianti.
The runtime overhead for FAULTTRACER to collect extended call graphs averages 49.83 s per subject, only 8.26% more than the time Chianti takes to collect traditional call-graph information.

While Chianti [6] can identify the edits related to each failed test, the number of such affecting changes per failed test may still be too large for manual inspection. For example, even when the medium-sized Java program ant 6.0 [7] evolves to ant 7.0, the number of suspicious edits related to each failed test can be substantial, ranging from 22 to 182 changes [8].

The last decade has seen much progress in automated debugging [9] and fault localization [10][11][12][13][14]. Many techniques for identifying faulty statements are based on test spectra, which record the code coverage of each passed or failed test, and produce a ranked list of potentially faulty statements based on suspiciousness scores [10,11,13,14]. The main intuition is that statements executed primarily by failed tests are more suspicious than statements executed primarily by passed tests. For large programs, however, the number of suspicious statements can also be overwhelming for manual inspection. According to a recent study using xml-security 1.0, Tarantula localizes faults to 14.77% of the code on average, which amounts to thousands of statements [15]. Moreover, in the context of evolution, a vast majo...
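The spectrum-based intuition described above can be sketched with the well-known Tarantula suspiciousness formula. The following Python is an illustrative sketch only, not FAULTTRACER's implementation; the entity names and coverage counts are hypothetical.

```python
# Tarantula-style suspiciousness: entities covered mostly by failed tests
# score near 1.0, entities covered mostly by passed tests score near 0.0.

def tarantula_suspiciousness(failed_cover, passed_cover, total_failed, total_passed):
    """Score one program entity (a statement, or an atomic change).

    failed_cover / passed_cover: number of failed / passed tests covering it.
    total_failed / total_passed: total number of failed / passed tests.
    """
    fail_ratio = failed_cover / total_failed if total_failed else 0.0
    pass_ratio = passed_cover / total_passed if total_passed else 0.0
    denom = fail_ratio + pass_ratio
    return fail_ratio / denom if denom else 0.0

# Hypothetical coverage data: entity -> (failed tests covering, passed tests covering),
# out of 2 failed and 5 passed tests in total.
coverage = {
    "change_A": (2, 0),  # covered only by failed tests -> most suspicious
    "change_B": (1, 3),
    "change_C": (0, 5),  # covered only by passed tests -> least suspicious
}
ranked = sorted(
    coverage,
    key=lambda c: tarantula_suspiciousness(*coverage[c], 2, 5),
    reverse=True,
)
print(ranked)  # change_A ranked first, change_C last
```

FAULTTRACER's contribution is to apply such scoring to program edits (atomic changes) identified by change impact analysis rather than to raw statements, which shrinks the ranked list a developer must inspect.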