Standard methods for case-control association studies of rare variation often treat disease outcome as a dichotomous phenotype. However, both theoretical and experimental studies have demonstrated that subjects with a family history of disease can be enriched for risk variation relative to subjects without such history. Assuming family history information is available, this observation motivates the idea of replacing the standard dichotomous outcome variable used in case-control studies with a more informative ordinal outcome variable that distinguishes controls (0), sporadic cases (1), and cases with a family history (2), with the expectation that we should observe an increasing number of risk variants with increasing category of the ordinal variable. To leverage this expectation, we propose a novel rare-variant association test that incorporates family history information and builds on our previous GAMuT framework (Broadaway et al., 2016) for rare-variant association testing of multivariate phenotypes. We use simulated data to show that, when family history information is available, our new method outperforms standard rare-variant association methods, such as burden and SKAT tests, that ignore family history. We further illustrate our method using a rare-variant study of cleft lip and palate.

Sequencing and exome-chip technologies facilitate the discovery of rare genetic variation influencing complex diseases. Many rare-variant association studies of complex diseases now exist, with most employing traditional case-control sampling designs for analysis (De Rubeis et al., 2014; Sanders et al., 2017).
Under such a design, studies typically test whether patterns of rare variation within a gene or region of interest differ between affected and unaffected subjects using either burden (Li and Leal, 2008) or variance-component (Wu et al., 2011) approaches based on an underlying logistic-regression framework that treats disease status as a simple dichotomous outcome variable. While such an analysis strategy is commonplace, there may exist helpful secondary information collected by the study that can facilitate the creation of a modified outcome variable that is more refined than the coarse dichotomous outcome typically considered. Use of this refined outcome variable within the study can reduce heterogeneity and potentially lead to more powerful analyses.
One valuable source of secondary information often collected in a case-control study (but rarely utilized) is whether a sample participant reports a family history of the disease under study. Subjects with a family history of disease demonstrate different patterns of genetic variation than their sporadic counterparts. In particular, several papers have noted that a sample of cases reporting affected relatives is more enriched for a causal variant than a sample of cases without such family history (Teng and Risch, 1999; Zöllner, 2012; Epstein et al., 2015) since more risk variants tend to se...