Candidate gene and genome-wide association studies (GWAS) represent two complementary approaches to uncovering genetic contributions to common diseases. We systematically reviewed the contributions of these approaches to our knowledge of genetic associations with cancer risk by analyzing the data in the Cancer Genome-wide Association and Meta Analyses database (Cancer GAMAdb). The database catalogs studies published since January 1, 2000, by study and cancer type. In all, we found that meta-analyses and pooled analyses of candidate genes reported 349 statistically significant associations and GWAS reported 269, for a total of 577 unique associations. Only 41 (7.1%) associations were reported in both candidate gene meta-analyses and GWAS, usually with similar effect sizes. When considering only noteworthy associations (defined as those with false-positive report probabilities ≤0.2) and accounting for indirect overlap, we found 202 associations, with 27 of those appearing in both meta-analyses and GWAS. Our findings suggest that meta-analyses of well-conducted candidate gene studies may continue to add to our understanding of the genetic associations in the post-GWAS era.
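The counts in the abstract above can be cross-checked with simple inclusion-exclusion. The following is a minimal sketch using only the figures quoted in the text; the variable names are illustrative and not taken from Cancer GAMAdb.

```python
# Counts quoted in the abstract above.
candidate_gene = 349  # significant associations from meta-/pooled analyses
gwas = 269            # significant associations from GWAS
in_both = 41          # associations reported by both approaches

# Inclusion-exclusion: an association found by both approaches is counted once.
unique_total = candidate_gene + gwas - in_both

# Share of unique associations reported by both approaches, in percent.
overlap_pct = round(100 * in_both / unique_total, 1)

print(f"unique associations: {unique_total}")
print(f"reported by both: {overlap_pct}%")
```

The result agrees with the 577 unique associations and the 7.1% overlap stated in the abstract.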
Differences in genetic ancestry and socioeconomic status (SES) among Latin American populations have been linked to health disparities for a number of complex diseases, such as diabetes. We used a population genomic approach to investigate the role that genetic ancestry and SES play in the epidemiology of type 2 diabetes (T2D) for two Colombian populations: Chocó (Afro-Latino) and Antioquia (Mestizo). Chocó has significantly higher predicted genetic risk for T2D compared to Antioquia, and the elevated predicted risk for T2D in Chocó is correlated with higher African ancestry. Despite its elevated predicted genetic risk, the population of Chocó has a threefold lower observed T2D prevalence than Antioquia, indicating that environmental factors better explain differences in T2D outcomes for Colombia. Chocó has substantially lower SES than Antioquia, suggesting that low SES in Chocó serves as a protective factor against T2D. The combination of lower prevalence of T2D and lower SES in Chocó may seem surprising given the protective nature of elevated SES in many populations in developed countries. However, low SES has also been documented to be a protective factor in rural populations in less developed countries, and this appears to be the case when comparing Chocó to Antioquia.
Background: Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the cause of coronavirus disease 2019 (COVID-19), has spread globally and is being surveilled with an international genome sequencing effort. Surveillance consists of sample acquisition, library preparation, and whole genome sequencing. This has necessitated a classification scheme detailing Variants of Concern (VOC) and Variants of Interest (VOI), and the rapid expansion of bioinformatics tools for sequence analysis. These bioinformatic tools are means for major actionable results: maintaining quality assurance and checks, defining population structure, performing genomic epidemiology, and inferring lineage to allow reliable and actionable identification and classification. Additionally, the pandemic has required public health laboratories to reach high throughput proficiency in sequencing library preparation and downstream data analysis rapidly. However, both processes can be limited by a lack of a standardized sequence dataset. Methods: We identified six SARS-CoV-2 sequence datasets from recent publications, public databases and internal resources. In addition, we created a method to mine public databases to identify representative genomes for these datasets. Using this novel method, we identified several genomes as either VOI/VOC representatives or non-VOI/VOC representatives. To describe each dataset, we utilized a previously published datasets format, which describes accession information and whole dataset information. Additionally, a script from the same publication has been enhanced to download and verify all data from this study. Results: The benchmark datasets focus on the two most widely used sequencing platforms: long read sequencing data from the Oxford Nanopore Technologies platform and short read sequencing data from the Illumina platform.
There are six datasets: three were derived from recent publications; two were derived from data mining public databases to answer common questions not covered by published datasets; one unique dataset representing common sequence failures was obtained by rigorously scrutinizing data that did not pass quality checks. The dataset summary table, data mining script and quality control (QC) values for all sequence data are publicly available on GitHub: https://github.com/CDCgov/datasets-sars-cov-2. Discussion: The datasets presented here were generated to help public health laboratories build sequencing and bioinformatics capacity, benchmark different workflows and pipelines, and calibrate QC thresholds to ensure sequencing quality. Together, improvements in these areas support accurate and timely outbreak investigation and surveillance, providing actionable data for pandemic management. Furthermore, these publicly available and standardized benchmark data will facilitate the development and adjudication of new pipelines.
Pathogen genetics is already a mainstay of public health investigation and control efforts; now advances in technology make it possible to investigate the role of human genetic variation in the epidemiology of infectious diseases. To describe trends in this field, we analyzed articles that were published from 2001 through 2010 and indexed by the HuGE Navigator, a curated online database of PubMed abstracts in human genome epidemiology. We extracted the principal findings from all meta-analyses and genome-wide association studies (GWAS) with an infectious disease-related outcome. Finally, we compared the representation of diseases in HuGE Navigator with their contributions to morbidity worldwide. We identified 3,730 articles on infectious diseases, including 27 meta-analyses and 23 GWAS. The number published each year increased from 148 in 2001 to 543 in 2010 but remained a small fraction (about 7%) of all studies in human genome epidemiology. Most articles were by authors from developed countries, but the percentage by authors from resource-limited countries increased from 9% to 25% during the period studied. The most commonly studied diseases were HIV/AIDS, tuberculosis, hepatitis B infection, hepatitis C infection, sepsis, and malaria. As genomic research methods become more affordable and accessible, population-based research on infectious diseases will be able to examine the role of variation in human as well as pathogen genomes. This approach offers new opportunities for understanding infectious disease susceptibility, severity, treatment, control, and prevention.
Background: The apolipoprotein E gene (apoE) has three major isoforms encoded by the ε2, ε3, and ε4 alleles, with the ε4 allele associated with hypercholesterolemia and the ε2 allele with the opposite effect. An inverse relationship between cholesterolemia and head and neck cancer (HNC) has been previously reported, although the relationship between apoE genotypes and HNC has not been explored to date. Methods: Four hundred and seventeen HNC cases and 436 hospital controls were genotyped for apoE polymorphisms. Adjusted odds ratios (ORs) and 95% confidence intervals (CI) from logistic regression were used to explore the relationship between HNC and putative risk factors. A gene-environment interaction analysis was done. Results: A borderline significant 40% decreased HNC risk (OR, 0.58; 95% CI, 0.31-1.05) was observed for individuals carrying at least one ε2 allele. Females carrying at least one ε2 allele showed a 60% risk reduction (OR, 0.43; 95% CI, 0.21-0.90) for HNC compared with ε3 homozygotes. A statistically significant interaction was found between alcohol use and the ε4 allele (P for interaction = 0.04), with a 2-fold increased risk (OR, 2.06; 95% CI, 0.95-4.48) among ever drinkers with an ε4 allele, with respect to ε3 homozygote nondrinkers. Conclusions: Our study provides novel evidence of a possible protective effect of the ε2 allele against HNC, probably due to its increased antioxidant properties. Impact: According to our results, apolipoprotein E may play a different role in carcinogenesis other than its well-known role in regulating blood serum cholesterol levels. Cancer Epidemiol Biomarkers Prev; 19(11); 2839-46.
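The odds ratios and confidence intervals quoted above can be sanity-checked against each other. The sketch below assumes the intervals are Wald intervals, exp(β ± 1.96·SE), which is the usual output of logistic regression but is not stated in the abstract. Under that assumption each OR should sit at the geometric midpoint of its limits, and the standard error of the log odds ratio can be backed out from the interval width. The function name `wald_summary` is illustrative.

```python
import math

# (label, reported OR, lower 95% limit, upper 95% limit), as quoted in the abstract.
REPORTED = [
    ("any ε2 allele", 0.58, 0.31, 1.05),
    ("females, any ε2 allele", 0.43, 0.21, 0.90),
    ("ever drinkers with ε4 allele", 2.06, 0.95, 4.48),
]

Z_95 = 1.96  # normal quantile for a two-sided 95% interval


def wald_summary(lower, upper):
    """Return (implied OR, implied SE of log OR) from Wald CI limits.

    Assumes the interval is symmetric on the log scale (an assumption,
    not something the abstract confirms).
    """
    implied_or = math.sqrt(lower * upper)  # geometric midpoint of the limits
    se_log_or = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    return implied_or, se_log_or


for label, odds_ratio, lower, upper in REPORTED:
    implied_or, se = wald_summary(lower, upper)
    print(f"{label}: reported OR {odds_ratio:.2f}, "
          f"implied OR {implied_or:.2f}, SE(log OR) {se:.2f}")
```

Small differences between the reported and implied odds ratios are expected, because the published limits are rounded to two decimals.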
Genome-wide association studies have uncovered thousands of genetic variants that are associated with a wide variety of human traits. Knowledge of how trait-associated variants are distributed within and between populations can provide insight into the genetic basis of group-specific phenotypic differences, particularly for health-related traits. We analyzed the genetic divergence levels for (i) individual trait-associated variants and (ii) collections of variants that function together to encode polygenic traits, between two neighboring populations in Colombia that have distinct demographic profiles: Antioquia (Mestizo) and Chocó (Afro-Colombian). Genetic ancestry analysis showed 62% European, 32% Native American, and 6% African ancestry for Antioquia compared to 76% African, 10% European, and 14% Native American ancestry for Chocó, consistent with demography and previous results. Ancestry differences can confound cross-population comparison of polygenic risk scores (PRS); however, we did not find any systematic bias in PRS distributions for the two populations studied here, and population-specific differences in PRS were, for the most part, small and symmetrically distributed around zero. Both genetic differentiation at individual trait-associated SNPs and population-specific PRS differences between Antioquia and Chocó largely reflected anthropometric phenotypic differences that can be readily observed between the populations along with reported disease prevalence differences. Cases where population-specific differences in genetic risk did not align with observed trait (disease) prevalence point to the importance of environmental contributions to phenotypic variance, for both infectious and complex, common disease. The results reported here are distributed via a web-based platform for searching trait-associated variants and PRS divergence levels at http://map.chocogen.com.
5067 Background: African-American (AA) men have the highest rates of prostate cancer (PCa) incidence and mortality in the U.S. Screening for PCa with prostate specific antigen (PSA) has allowed detection of early stage disease, but side effects of radical prostatectomy and radiation raise concerns about unfavorable risk:benefit ratios of PSA screening and subsequent therapy. Active surveillance (AS) is an option for early-stage PCa (ESPC), but only 10% of men eligible for AS choose this approach. The 2011 NIH State-of-the-Science Conference promoted the need to enhance decision-making (DM) about AS. In 2012, the U.S. Preventive Services Task Force recommended against PSA screening, while encouraging patient DM. Our study examined DM needs by men (N=204; 68% AA; screening PSA within normal limits) and their significant others (SO) (N=181; 65% AA) regarding AS and other ESPC options. Methods: This multi-center, mixed methods study (N=402; 51% rural) included 5 sites nationwide. Subjects completed quantitative questionnaires prior to focus groups (FG); 54 FG were held, with separate groups for men and SO. Results: After adjusting for education, comorbidities, insurance, age, health literacy, distance to treatment center, willingness to travel, income and numeracy score, AA men were significantly more likely to be influenced by convenience (OR: 2.84, 95% CI: 1.42-5.65) compared to Caucasians. Rural residence, however, did not affect DM. In qualitative analysis, numerous themes were identified relevant to choice of AS: physician treatment discussions being limited to their own specialty; confusion due to conflicting sources of information; convenience; worry about untreated cancer remaining and treatment toxicities; and lack of awareness of AS as an option. SO tended to value cure over avoiding side effects. 
Conclusions: While the impact of new PCa screening guidelines is uncertain, for AS to become a viable treatment option, providers will need to discuss it along with other therapeutic alternatives. SO are influential in DM and may be less enthusiastic about AS than men. For AA men, AS may be a particularly attractive option given the relative influence of convenience in DM.
Background: Resistance genes encoding β-lactamases (BLs) confer resistance to the widely prescribed antibiotic class, β-lactams. Therefore, the prevalence of BL genes in clinical or environmental samples is important for assessing the public health risk and the spreading of these genes into pathogens. However, identification of genes encoding BLs from short read metagenomes remains challenging due to the high frequency of shared amino acid functional domains and motifs in proteins encoded by BL genes and related, non-BL gene sequences. Accordingly, divergent BL homologs can be frequently missed during similarity searches, which has important practical consequences for monitoring antibiotic resistance. Results: To address this limitation, we built ROCker models that targeted either broad classes (e.g., class A, B, C and D) or individual families (e.g., TEM) of BLs and challenged them with mock 150 bp and 250 bp read data sets of known composition. ROCker identifies the most discriminant bit score thresholds in sliding windows along the target protein sequence and hence can account for non-discriminative domains shared by unrelated proteins. BL ROCker models showed a 0% false positive rate (FPR), a 0% to 4% false negative rate (FNR), and an up to 50-fold higher F1 score [2*precision*recall/(precision+recall)] compared to alternative methods, such as similarity searches using BLASTx with various e-value thresholds and BL Hidden Markov Models, or specialized tools for this purpose like DeepARG, ShortBRED and AMRFinder. The ROCker models and the underlying protein sequence reference data sets and phylogenetic trees for read placement are freely available through http://enve-omics.ce.gatech.edu/data/rocker-bla. Conclusions: Our results showcased the reliable detection and typing of short read sequences carrying BLs by ROCker models.
Application of these BL ROCker models on metagenomics, metatranscriptomics as well as high-throughput PCR gene amplicon data should facilitate the reliable detection and quantification of BL variants encoded by environmental or clinical isolates and microbiomes, and more accurate assessment of the associated public health risk compared to the current practice.
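The F1 score defined in the abstract above, 2*precision*recall/(precision+recall), can be computed directly from read classification counts. The following is a minimal sketch; the counts are hypothetical, chosen only to mirror the least favorable rates quoted (no false positives, 4% false negatives), and are not results from the study. The function name `f1_score` is illustrative and is not part of ROCker.

```python
def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall from classification counts."""
    if tp == 0:
        return 0.0  # no true positives: define F1 as zero and avoid dividing by zero
    precision = tp / (tp + fp)  # fraction of reads called BL that really are BL
    recall = tp / (tp + fn)     # fraction of true BL reads that were detected
    return 2 * precision * recall / (precision + recall)


# Hypothetical mock data set: 100 true BL reads, 96 detected, none falsely called.
tp, fp, fn = 96, 0, 4
print(f"FNR = {fn / (tp + fn):.0%}, F1 = {f1_score(tp, fp, fn):.3f}")
```

Because F1 combines precision and recall, a method can only score well if it both avoids false calls and misses few true BL reads.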