Background
While higher genetic risk scores (GRS) have been statistically associated with increased disease risk (broad‐sense validity), the concepts and tools for assessing the validity of the GRS values reported by individual tests (narrow‐sense validity) are underdeveloped.
Methods
We propose two benchmarks for assessing the narrow‐sense validity of GRS. The baseline benchmark requires that the mean GRS value in a general population approximates 1.0. The calibration benchmark assesses the agreement between observed risks and estimated risks (GRS values). We assessed benchmark performance for three prostate cancer (PCa) GRS tests, derived from three SNP panels with increasing stringency of selection criteria, in a PCa chemoprevention trial where 714 of 3225 men were diagnosed with PCa during the 4‐year follow‐up.
Results
GRS from Panels 1, 2, and 3 were all statistically associated with PCa risk; P = 5.58 × 10⁻³, P = 1 × 10⁻³, and P = 1.5 × 10⁻¹³, respectively (broad‐sense validity). For narrow‐sense validity, the mean GRS value among men without PCa was 1.33, 1.09, and 0.98 for Panels 1, 2, and 3, respectively (baseline benchmark). For assessing the calibration benchmark, observed risks were calculated for seven groups of men with GRS values <0.3, 0.3–0.79, 0.8–1.19, 1.2–1.49, 1.5–1.99, 2–2.99, and ≥3. The calibration slope (higher is better) was 0.15, 0.12, and 0.60, and the bias score (lower is better) between the observed risks and GRS values was 0.08, 0.08, and 0.02 for Panels 1, 2, and 3, respectively.
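As an illustration only, the two benchmarks could be computed along the following lines. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the observed risk in each group is assumed to be the group's case fraction divided by the overall case fraction (so that it sits on the same relative scale as the GRS), and the calibration slope is assumed to be an ordinary least-squares slope of observed risk on mean GRS across groups. The abstract does not define the bias score, so it is not reproduced here.

```python
from bisect import bisect_right

# Upper-exclusive edges for the seven GRS groups listed in the Results:
# <0.3, 0.3-0.79, 0.8-1.19, 1.2-1.49, 1.5-1.99, 2-2.99, >=3
BIN_EDGES = [0.3, 0.8, 1.2, 1.5, 2.0, 3.0]


def baseline_mean(grs, is_case):
    """Mean GRS among unaffected individuals (baseline benchmark, expected near 1.0)."""
    controls = [g for g, c in zip(grs, is_case) if not c]
    return sum(controls) / len(controls)


def binned_risks(grs, is_case):
    """Return (mean GRS, observed relative risk) for each non-empty GRS group.

    Assumption: observed relative risk = case fraction in the group
    divided by the case fraction in the whole cohort.
    """
    overall = sum(is_case) / len(is_case)
    bins = [[] for _ in range(len(BIN_EDGES) + 1)]
    for g, c in zip(grs, is_case):
        bins[bisect_right(BIN_EDGES, g)].append((g, c))
    points = []
    for members in bins:
        if members:
            mean_grs = sum(g for g, _ in members) / len(members)
            case_fraction = sum(c for _, c in members) / len(members)
            points.append((mean_grs, case_fraction / overall))
    return points


def calibration_slope(points):
    """Least-squares slope of observed relative risk on mean GRS (assumed definition)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx
```

Here `grs` is a list of per-person GRS values and `is_case` a parallel list of 0/1 outcome indicators. Under these assumptions, a well-calibrated test would give a baseline mean close to 1.0 and a slope close to 1.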
Conclusion
Performance differed considerably among GRS tests. We recommend that all GRS tests be evaluated using the two benchmarks before clinical implementation for individual risk assessment.