Whole-exome sequencing (WES) plays a crucial role in diagnosing genetic diseases by identifying germline variants. However, reproducibility issues limit its clinical utility. We conducted a large-scale proficiency test across 89 clinical and commercial labs in China, using the well-characterized Quartet DNA reference materials, to evaluate the impact of experimental and bioinformatic factors on small variant detection performance. We observed significant variability in sequencing data quality and variant calling performance, with higher raw read quality and lower contamination levels improving variant detection. Our findings emphasized the collective influence of multiple factors on variant detection, with capture efficiency metrics, such as fold-80 penalty, on-target rate, and target region coverage, rather than base-by-base quality metrics on raw sequences, emerging as the most critical. Our study not only revealed the nationwide performance of WES in China but also provided actionable best practices for optimizing the entire WES process, from data generation to analysis, thereby enhancing the quality and reliability of variant detection.