Using genome-wide genotypes, we characterized the genetic structure of 103,006 participants in the Kaiser Permanente Northern California multi-ethnic Genetic Epidemiology Research on Adult Health and Aging Cohort and analyzed the relationship to self-reported race/ethnicity. Participants endorsed any of 23 race/ethnicity/nationality categories, which were collapsed into seven major race/ethnicity groups. By self-report the cohort is 80.8% white and 19.2% minority; 93.8% endorsed a single race/ethnicity group, while 6.2% endorsed two or more. Principal component (PC) and admixture analyses were generally consistent with prior studies. Approximately 17% of subjects had genetic ancestry from more than one continent, and 12% were genetically admixed, considering only nonadjacent geographical origins. Self-reported whites were spread on a continuum along the first two PCs, indicating extensive mixing among European nationalities. Self-identified East Asian nationalities correlated with genetic clustering, consistent with extensive endogamy. Individuals of mixed East Asian-European genetic ancestry were easily identified; we also observed a modest amount of European genetic ancestry in individuals self-identified as Filipinos. Self-reported African Americans and Latinos showed extensive European and African genetic ancestry, and Native American genetic ancestry for the latter. Among 3741 genetically identified parent-child pairs, 93% were concordant for self-reported race/ethnicity; among 2018 genetically identified full-sib pairs, 96% were concordant; the lower rate for parent-child pairs was largely due to intermarriage. The parent-child pairs revealed a trend toward increasing exogamy over time; the presence in the cohort of individuals endorsing multiple race/ethnicity categories creates interesting challenges and future opportunities for genetic epidemiologic studies.
The success of genome-wide association studies has paralleled the development of efficient genotyping technologies. We describe the development of a next-generation microarray based on the new highly-efficient Affymetrix Axiom genotyping technology that we are using to genotype individuals of European ancestry from the Kaiser Permanente Research Program on Genes, Environment and Health (RPGEH). The array contains 674,517 SNPs, and provides excellent genome-wide as well as gene-based and candidate-SNP coverage. Coverage was calculated using an approach based on imputation and cross validation. Preliminary results for the first 80,301 saliva-derived DNA samples from the RPGEH demonstrate very high quality genotypes, with sample success rates above 94% and over 98% of successful samples having SNP call rates exceeding 98%. At steady state, we have produced 462 million genotypes per week for each Axiom system. The new array provides a valuable addition to the repertoire of tools for large scale genome-wide association studies.
The Kaiser Permanente (KP) Research Program on Genes, Environment and Health (RPGEH), in collaboration with the University of California-San Francisco, undertook genome-wide genotyping of >100,000 subjects that constitute the Genetic Epidemiology Research on Adult Health and Aging (GERA) cohort. The project, which generated >70 billion genotypes, represents the first large-scale use of the Affymetrix Axiom Genotyping Solution. Because genotyping took place over a short 14-month period, creating a near-real-time analysis pipeline for experimental assay quality control and final optimized analyses was critical. Because of the multi-ethnic nature of the cohort, four different ethnic-specific arrays were employed to enhance genome-wide coverage. All assays were performed on DNA extracted from saliva samples. To improve sample call rates and significantly increase genotype concordance, we partitioned the cohort into disjoint packages of plates with similar assay contexts. Using strict QC criteria, the overall genotyping success rate was 103,067 of 109,837 samples assayed (93.8%), with a range of 92.1-95.4% for the four different arrays. Similarly, the SNP genotyping success rate ranged from 98.1 to 99.4% across the four arrays, the variation depending mostly on how many SNPs were included as single copy vs. double copy on a particular array. The high quality and large scale of genotype data created on this cohort, in conjunction with comprehensive longitudinal data from the KP electronic health records of participants, will enable a broad range of highly powered genome-wide association studies on a diversity of traits and conditions.
KEYWORDS genome-wide genotyping; GERA cohort; Affymetrix Axiom; saliva DNA; quality control
The Genetic Epidemiology Research on Adult Health and Aging (GERA) resource is a cohort of >100,000 subjects who are participants in the Kaiser Permanente Medical Care Plan, Northern California Region (KPNC), Research Program on Genes, Environment and Health (RPGEH) (detailed description of the cohort and study design can be found in dbGaP, Study Accession: phs000674.v1.p1). Genome-wide genotyping was targeted for this cohort to enable large-scale genome-wide association studies by linkage to comprehensive longitudinal clinical data derived from extensive KPNC electronic health record databases. The cohort is multi-ethnic, with 20% minority representation (African American, East Asian, and Latino or mixed), and the remaining 80% non-Hispanic white. For this project, four ethnic-specific arrays were designed based on the Affymetrix Axiom Genotyping System (Hoffmann et al. 2011a,b). The genotyping assay experiment took place over a 14-month period and, to our knowledge, is the single largest genotyping experiment to date, producing >70 billion genotypes. The magnitude of the experiment, in conjunction with the long duration and simultaneous high throughput, required new protocols for assuring quality control (QC) during the assays and new genotyping strategies in postassay data analysis. Samp...
Growing evidence supports the hypothesis that narcolepsy with cataplexy is an autoimmune disease. Using genome-wide association (GWA) in narcolepsy patients versus controls, with replication and fine mapping across three ethnic groups (3406 individuals of European ancestry, 2414 Asians, and 302 African Americans), we found a novel association between SNP rs2305795 in the 3′UTR of the purinergic receptor subtype 2Y11 (P2RY11) gene and narcolepsy (p(Mantel-Haenszel)=6.1×10⁻¹⁰; odds ratio 1.28; n=5689). The disease-associated allele is correlated with a 3-fold lower expression of P2RY11 in CD8+ T lymphocytes (p=0.003) and natural killer (NK) cells (p=0.031) but not in other peripheral blood mononuclear cell (PBMC) types. The low expression variant is also associated with decreased P2RY11-mediated resistance to adenosine triphosphate (ATP)-induced cell death in T lymphocytes (p=0.0007) and NK cells (p=0.001). These results identify P2RY11 as an important regulator of immune cell survival, with possible implications in narcolepsy and other autoimmune diseases.
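The abstract above reports a Mantel-Haenszel p-value and a single odds ratio across three ethnic groups. As a rough illustration of how a stratified odds ratio can be pooled, the sketch below computes the standard Mantel-Haenszel estimate from one 2×2 table per stratum. It assumes the reported odds ratio is of this pooled type, which the abstract does not state; the function name and the allele counts are hypothetical and are not taken from the study.

```python
def mantel_haenszel_or(strata):
    """Pooled Mantel-Haenszel odds ratio.

    strata: iterable of (a, b, c, d) counts, one tuple per stratum, where
      a = cases with the allele,    b = controls with the allele,
      c = cases without the allele, d = controls without the allele.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d          # stratum size
        numerator += a * d / n     # weighted "concordant" product
        denominator += b * c / n   # weighted "discordant" product
    return numerator / denominator


# Hypothetical counts for two strata (illustration only, not study data).
example_strata = [(10, 20, 5, 40), (20, 10, 10, 20)]
pooled_or = mantel_haenszel_or(example_strata)
print(f"pooled OR = {pooled_or:.2f}")
```

Both hypothetical strata have a within-stratum odds ratio of 4, so the pooled estimate is also 4; with real data the strata generally differ and the estimate is a weighted compromise between them.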
Four custom Axiom genotyping arrays were designed for a genome-wide association (GWA) study of 100,000 participants from the Kaiser Permanente Research Program on Genes, Environment and Health. The array optimized for individuals of European race/ethnicity was previously described. Here we detail the development of three additional microarrays optimized for individuals of East Asian, African American, and Latino race/ethnicity. For these arrays, we decreased redundancy of high-performing SNPs to increase SNP capacity. The East Asian array was designed using greedy pairwise SNP selection. However, removing SNPs from the target set based on imputation coverage is more efficient than pairwise tagging. Therefore, we developed a novel hybrid SNP selection method for the African American and Latino arrays utilizing rounds of greedy pairwise SNP selection, followed by removal from the target set of SNPs covered by imputation. The arrays provide excellent genome-wide coverage and are valuable additions for large-scale GWA studies.
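The greedy pairwise SNP selection mentioned in the abstract above can be sketched as a set-cover loop: repeatedly pick the candidate SNP that tags the most still-uncovered target SNPs. This is a minimal sketch under stated assumptions, not the authors' implementation. The `r2` lookup, the 0.8 threshold, and the function name are hypothetical, and the imputation-based removal step of the hybrid method is not modeled.

```python
def greedy_pairwise_tags(r2, targets, candidates, threshold=0.8):
    """Greedy pairwise tag-SNP selection (illustrative sketch).

    r2: dict mapping (candidate, target) -> pairwise r^2 (assumed precomputed).
    Returns (chosen tag SNPs in selection order, targets left uncovered).
    """
    # For each candidate, the set of targets it tags at the given threshold.
    # A candidate that is itself a target always covers itself.
    covers = {
        c: {t for t in targets if c == t or r2.get((c, t), 0.0) >= threshold}
        for c in candidates
    }
    uncovered = set(targets)
    chosen = []
    while uncovered:
        # Pick the candidate covering the most still-uncovered targets.
        best = max(candidates, key=lambda c: len(covers[c] & uncovered))
        gained = covers[best] & uncovered
        if not gained:
            break  # remaining targets cannot be tagged by any candidate
        chosen.append(best)
        uncovered -= gained
    return chosen, uncovered


# Hypothetical toy data: s1 tags s2 and s3; s4 is tagged only by itself.
snps = ["s1", "s2", "s3", "s4"]
toy_r2 = {("s1", "s2"): 0.9, ("s1", "s3"): 0.85,
          ("s2", "s1"): 0.9, ("s3", "s1"): 0.85}
tags, missed = greedy_pairwise_tags(toy_r2, snps, snps)
print(tags, missed)
```

In a hybrid scheme like the one described, one would presumably alternate rounds of this loop with removal of targets already covered by imputation, so that later rounds spend array capacity only on SNPs that remain poorly covered.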
The Kaiser Permanente Research Program on Genes, Environment, and Health (RPGEH) Genetic Epidemiology Research on Adult Health and Aging (GERA) cohort includes DNA specimens extracted from saliva samples of 110,266 individuals. Because of its relationship to aging, telomere length measurement was considered an important biomarker to develop on these subjects. To assay relative telomere length (TL) on this large cohort over a short time period, we created a novel high throughput robotic system for TL analysis and informatics. Samples were run in triplicate, along with control samples, in a randomized design. As part of quality control, we determined the within-sample variability and employed thresholds for the elimination of outlying measurements. Of 106,902 samples assayed, 105,539 (98.7%) passed all quality control (QC) measures. As expected, TL in general showed a decline with age and a sex difference. While telomeres showed a negative correlation with age up to 75 years, in those older than 75 years, age positively correlated with longer telomeres, indicative of an association of longer telomeres with more years of survival in those older than 75. Furthermore, while females in general had longer telomeres than males, this difference was significant only for those older than age 50. An additional novel finding was that the variance of TL between individuals increased with age. This study establishes reliable assay and analysis methodologies for measurement of TL in large, population-based human studies. The GERA cohort represents the largest currently available such resource, linked to comprehensive electronic health and genotype data for analysis.
KEYWORDS relative telomere length; GERA cohort; saliva DNA; robotic assay; quantitative PCR
Telomeres are the protective DNA-protein complexes that cap the ends of eukaryotic chromosomes and are required for genome stability.
The essential telomeric DNA consists of a tract of a tandemly repeated short sequence specified and maintained by the highly regulated reverse transcriptase action of the cellular enzyme telomerase. Telomeric DNA is susceptible to natural terminal erosion through a variety of processes including the end replication problem of linear chromosomal DNA, which causes telomeres to get shorter each time a somatic cell divides (Olovnikov 1973;
Breast cancer risk is a polygenic trait. To identify breast cancer modifier alleles that have a high population frequency and low penetrance we used a comparative genomics approach. Quantitative trait loci (QTL) were initially identified by linkage analysis in a rat mammary carcinogenesis model followed by verification in congenic rats carrying the specific QTL allele under study. The ...
breast cancer genetics | cancer epidemiology | comparative genomics | noncoding elements | rat models
Despite immense efforts, the search for modifier genes underlying complex diseases has not been highly productive. Alleles of modifier genes that influence common disease risk have a moderate-to-high population frequency with a low penetrance. It has been suggested that alleles acting in this manner comprise the majority of genetic risk for many common diseases such as breast cancer (1, 2). It is estimated that if most risk alleles were identified, it would become possible to assign ≈90% of breast cancer risk to 50% of women (3). In most studies, candidate modifier genes are selected based on function, such as DNA repair or estrogen metabolism for breast cancer. Over 100 such candidate modifier genes have been tested in breast cancer case-control association studies (>400 SNPs); few show a consistent and significant association with risk in large sample populations (4). These results suggest the need for an alternative strategy to identify breast cancer modifier genes. Our laboratory has pursued the identification of candidate loci by using whole-genome linkage studies in inbred rat mammary cancer models, followed by fine-mapping in congenic rats. The rat was chosen because, similar to humans, it develops mammary carcinomas that are hormone-responsive and of ductal origin (5, 6).
Using a backcross of [Wistar-Kyoto (WKy) × Wistar-Furth (WF)]F1 × WF rats, we identified four mammary carcinoma susceptibility QTL, Mcs5, Mcs6, Mcs7, and Mcs8, on rat chromosomes 5, 7, 10, and 14, respectively (7). The WKy allele of Mcs5 acts to suppress mammary tumor multiplicity in a susceptible WF genetic background and has been shown to consist of at least three clustered loci; among these, Mcs5a confers a phenotype of resistance to mammary cancer (8). Here we show that Mcs5a is a compound QTL located in a noncoding genomic region. We identified polymorphisms within the human genomic region orthologous to rat Mcs5a that significantly associate with breast cancer risk in women.
Results
To further analyze the Mcs5a locus, WF.WKy congenic rat lines with different segments of the WKy allele were established and phenotyped for resistance to 7,12-dimethylbenzanthracene (DMBA)-induced carcinogenesis (Fig. 1). Mammary carcinoma susceptibility in DMBA-treated rats was reduced ≈50% for each congenic line O, WW, and XX (Fig. 1). The boundaries of the Mcs5a locus are given by the overlapping WKy sequences of congenic lines WW and XX, which define a genomic interval of ≈116 kb containing Mcs5a. By incorporating phenotypic data from additional congenic lines within ...