Modern gear fault detection began with algorithms based on the time synchronous average. Over the decades, many gear analyses have been proposed, but without evidence that they were significantly more powerful at fault detection than existing algorithms. This study presents a comprehensive comparison of gear fault detection algorithms to evaluate their performance. Using a large, statistically significant set of data from three nominal machines and one damaged machine, the condition indicator (CI) responses of 88 different analyses are compared in terms of their statistical significance in detecting a cracked tooth. The comparison includes the residual analysis, the energy operator (and its variants), the narrowband analysis (with a comparison of bandwidth requirements), the amplitude and frequency modulation analyses, an analysis of variance of the "factor" analyses (crest, shape, impulse, and margin), and other standard gear fault CIs. Further, the effect of CI selection on the assessment of gear component health is evaluated: given a set of CIs, a gear health indicator is built, showing that CIs with high statistical separability and low correlation improve fault detection power. This is validated on a third, dissimilar gear fault propagation test.
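
As a rough illustration of the four "factor" CIs named above, the sketch below computes crest, shape, impulse, and margin factors from a sequence of samples. The abstract does not give formulas, so the definitions here are the commonly used textbook ones and are an assumption, as are the function name `factor_cis` and the choice of peak as the maximum absolute value. In practice the input would be a processed signal such as a time synchronous average.

```python
import math


def factor_cis(signal):
    """Return crest, shape, impulse, and margin factors for a list of samples.

    Assumed (textbook) definitions, not taken from the study itself:
      crest   = peak / RMS
      shape   = RMS / mean(|x|)
      impulse = peak / mean(|x|)
      margin  = peak / (mean(sqrt(|x|)))**2
    where peak is the maximum absolute sample value.
    """
    n = len(signal)
    if n == 0:
        raise ValueError("signal must contain at least one sample")

    abs_vals = [abs(x) for x in signal]
    peak = max(abs_vals)
    if peak == 0:
        raise ValueError("factors are undefined for an all-zero signal")

    rms = math.sqrt(sum(x * x for x in signal) / n)
    mean_abs = sum(abs_vals) / n
    mean_sqrt = sum(math.sqrt(a) for a in abs_vals) / n

    return {
        "crest": peak / rms,
        "shape": rms / mean_abs,
        "impulse": peak / mean_abs,
        "margin": peak / mean_sqrt ** 2,
    }


# Hypothetical toy input: a single impulse in an otherwise quiet record.
example = factor_cis([0.0, 0.0, 0.0, 4.0])
```

All four factors are dimensionless ratios that grow as a signal becomes more impulsive, which is consistent with the abstract grouping them into a single family.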