Significance: Theory predicts that chronic pathogens with vertical or familial transmission should become less virulent over time because of coevolution. Although transmitted in this way, Helicobacter pylori is the major causative agent of gastric cancer. In two distinct Colombian populations with similar levels of H. pylori infection but different incidences of gastric cancer, we examined human and pathogen ancestry in matched samples to assess whether their genomic variation affects the severity of premalignant lesions. Interaction between human Amerindian ancestry and H. pylori African ancestry accounted for the geographic disparity in clinical presentation. We conclude that coevolutionary relationships are important determinants of gastric disease risk and that the historical colonization of the Americas continues to influence health in modern American populations.
A major goal in infectious disease research is to identify the human and pathogenic genetic variants that explain differences in microbial pathogenesis. However, neither pathogenic strain nor human genetic variation in isolation has proven adequate to explain the heterogeneity of disease pathology. We suggest that disrupted co-evolution between a pathogen and its human host can explain variation in disease outcomes, and that genome-by-genome interactions should therefore be incorporated into genetic models of disease caused by infectious agents. Genetic epidemiological studies that fail to take both the pathogen and host into account can lead to false and misleading conclusions about disease etiology. We discuss our model in the context of three pathogens, Helicobacter pylori, Mycobacterium tuberculosis and human papillomavirus, and generalize the conditions under which it may be applicable.
Immunosuppression resulting from HIV infection increases the risk of progression to active tuberculosis (TB) both in individuals newly exposed to Mycobacterium tuberculosis (MTB) and in those with latent infections. We hypothesized that HIV-positive individuals who do not develop TB, despite living in areas where it is hyperendemic, provide a model of natural resistance. We performed a genome-wide association study of TB resistance using 581 HIV-positive Ugandans and Tanzanians enrolled in prospective cohort studies of TB; 267 of these individuals developed active TB, and 314 did not. A common variant, rs4921437 at 5q33.3, was significantly associated with TB (odds ratio = 0.37, p = 2.11 × 10⁻⁸). This variant lies within a genomic region that includes IL12B and is embedded in an H3K27Ac histone mark. The locus also displays consistent patterns of linkage disequilibrium across African populations and has signals of strong selection in populations from equatorial Africa. Along with prior studies demonstrating that therapy with IL-12 (the cytokine encoded in part by IL12B) is associated with longer survival following MTB infection in mice deficient in CD4 T cells, our results suggest that this pathway might be an excellent target for the development of new modalities for treating TB, especially for HIV-positive individuals. Our results also indicate that studying extreme disease resistance in the face of extensive exposure can increase the power to detect associations in complex infectious disease.
Summary: The number of effectively independent tests performed in genome-wide association studies (GWAS), and with it the corresponding genome-wide significance level, varies by population, so a common p-value threshold may be inappropriate. To assess this, we estimated the number of independent SNPs for all Phase 3 HapMap samples using (1) the LD pruning function in PLINK and (2) an autocorrelation-based approach, which served to verify the HapMap findings. Autocorrelation was also used to estimate the number of independent tests in whole-genome sequences from 1000 Genomes. Both approaches yielded consistent estimates of the number of independent SNPs, which were used to calculate new population-specific thresholds for genome-wide significance. African populations had the most stringent thresholds (1.49 × 10⁻⁷ for YRI at r² = 0.3), East Asian populations the least (3.75 × 10⁻⁷ for JPT at r² = 0.3). We also assessed how population-specific significance thresholds compared with a single multiple-testing threshold at the conventional 5 × 10⁻⁸ cutoff. Applied to a previously published GWAS of melanoma in Caucasians, our approach identified two additional genes, both previously associated with the phenotype. In a Chinese breast cancer GWAS, our approach identified 48 additional genes, 19 of which were in or near genes previously associated with the phenotype. We conclude that the conventional genome-wide significance threshold generates an excess of Type II errors, particularly in GWAS performed on more recently founded populations.
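The relationship between a count of independent tests and a significance threshold can be illustrated with a minimal sketch. It assumes a Bonferroni-style correction (threshold = alpha / number of independent tests, with alpha = 0.05); the abstract does not state the exact formula used, so both the formula and the function names here are assumptions for illustration only. The two thresholds are the ones reported above for YRI and JPT at r² = 0.3.

```python
def bonferroni_threshold(n_independent_tests: float, alpha: float = 0.05) -> float:
    """Significance threshold under an assumed Bonferroni-style correction."""
    if n_independent_tests <= 0:
        raise ValueError("number of independent tests must be positive")
    return alpha / n_independent_tests


def implied_independent_tests(threshold: float, alpha: float = 0.05) -> float:
    """Number of independent tests that a given threshold would imply (same assumption)."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    return alpha / threshold


# Thresholds reported in the abstract (r^2 = 0.3); everything derived from
# them below depends on the assumed alpha / N relationship.
reported_thresholds = {"YRI": 1.49e-7, "JPT": 3.75e-7}

for population, threshold in reported_thresholds.items():
    n_eff = implied_independent_tests(threshold)
    print(f"{population}: threshold {threshold:.2e} implies about {n_eff:,.0f} independent tests")

# The conventional cutoff corresponds to one million independent tests at alpha = 0.05.
print(f"conventional: {bonferroni_threshold(1_000_000):.1e}")
```

Under this assumption, a less stringent threshold simply reflects fewer effectively independent tests, which is why a population with more extensive linkage disequilibrium would receive a larger cutoff than the conventional 5 × 10⁻⁸.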
Populations in sub-Saharan Africa are shifting from rural to increasingly urban. Although the burden of cardiovascular disease is expected to increase with this changing landscape, few large studies have assessed a wide range of risk factors in urban and rural populations, particularly in West Africa. We conducted a cross-sectional, population-based survey of 3317 participants from Ghana (≥18 years old), of whom 2265 (57% female) were from a mid-sized city (Sunyani, population ~250,000) and 1052 (55% female) were from surrounding villages (populations <5000). We measured canonical cardiovascular disease risk factors (BMI, blood pressure, fasting glucose, lipids) and fibrinolytic markers (PAI-1 and t-PA), and assessed how their distributions and related clinical outcomes (including obesity, hypertension and diabetes) varied with urban residence and sex. Urban residence was strongly associated with obesity (OR: 7.8, 95% CI: 5.3–11.3), diabetes (OR 3.6, 95% CI: 2.3–5.7), and hypertension (OR 3.2, 95% CI: 2.6–4.0). Among the quantitative measures, most affected were total cholesterol (+0.81 standard deviations, 95% CI 0.73–0.88), LDL cholesterol (+0.89, 95% CI: 0.79–0.99), and t-PA (+0.56, 95% CI: 0.48–0.63). Triglycerides and HDL cholesterol profiles were similarly poor in both urban and rural environments, but significantly worse among rural participants after BMI-adjustment. For most of the risk factors, the strength of the association with urban residence did not vary with sex. Obesity was a major exception, with urban women at particularly high risk (26% age-standardized prevalence) compared to urban men (7%). Overall, urban residents had substantially worse cardiovascular risk profiles, with some risk factors at levels typically seen in the developed world.
One in three people has been infected with Mycobacterium tuberculosis (MTB), and the risk for MTB infection in HIV-infected individuals is even higher. We hypothesized that HIV-positive individuals living in tuberculosis-endemic regions who do not become infected with MTB are genetically resistant. Using an “experiment of nature” design that proved successful in our previous work, we performed a genome-wide association study of tuberculin skin test (TST) positivity using 469 HIV-positive patients from prospective study cohorts of tuberculosis from Tanzania and Uganda to identify genetic loci associated with MTB infection in the context of HIV infection. Among these individuals, 244 tested TST positive either at enrollment or during the >8-year follow-up, while 225 did not. We identified a genome-wide significant association between a dominant model of rs877356 and binary TST status in the combined cohort (odds ratio = 0.2671, p = 1.22 × 10⁻⁸). The association was replicated with similar significance when examining TST induration as a continuous trait. The variant lies in the 5q31.1 region, 57 kb downstream from IL9. Two-locus analyses of association of variants near rs877356 showed that a haplotype composed of rs877356 and an IL9 missense variant, rs2069885, had the most significant association (p = 1.59 × 10⁻¹²). We also replicated previously linked loci on chromosomes 2, 5, and 11. IL9 is a cytokine produced by mast cells and TH2 cells during inflammatory responses, providing a possible link between airway inflammation and protection from MTB infection. Our results indicate that studying uninfected, HIV-positive participants with extensive exposure increases the power to detect associations in complex infectious disease.
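For readers unfamiliar with the term, a "dominant model" collapses the usual 0/1/2 allele count into carrier versus non-carrier before testing for association. The sketch below shows only that recoding step; the function name is hypothetical, the abstract does not say which allele of rs877356 was counted, and no genotype data from the study are reproduced here.

```python
def dominant_coding(allele_count: int) -> int:
    """Recode a 0/1/2 count of the tested allele as 1 (carrier) or 0 (non-carrier)."""
    if allele_count not in (0, 1, 2):
        raise ValueError("allele count must be 0, 1, or 2")
    return 1 if allele_count >= 1 else 0


# Heterozygotes (1) and homozygotes (2) for the tested allele are treated alike.
for count in (0, 1, 2):
    print(count, "->", dominant_coding(count))
```

The recoded 0/1 variable would then enter a standard association test against binary TST status in place of the additive 0/1/2 count.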
In omic research, such as genome-wide association studies, researchers seek to repeat their results in other datasets to reduce false positive findings and thus provide evidence for the existence of true associations. Unfortunately, this standard validation approach cannot completely eliminate false positive conclusions, and it can also mask many true associations that might otherwise advance our understanding of pathology. These issues raise the question: How can we increase the amount of knowledge gained from high-throughput genetic data? To address this challenge, we present an approach that complements standard statistical validation methods by drawing attention to both potential false negative and false positive conclusions, as well as providing broad information for directing future research. The Diverse Convergent Evidence approach (DiCE) we propose integrates information from multiple sources (omics, informatics, and laboratory experiments) to estimate the strength of the available corroborating evidence supporting a given association. This process is designed to yield an evidence metric that has utility when etiologic heterogeneity, variable risk factor frequencies, and a variety of observational data imperfections might lead to false conclusions. We provide proof-of-principle examples in which DiCE identified strong evidence for associations that have established biological importance, when standard validation methods alone did not provide support. If used as an adjunct to standard validation methods, this approach can leverage multiple distinct data types to improve genetic risk factor discovery and validation, promote effective science communication, and guide future research directions.
Background: Metabolic syndrome (MetS) is diagnosed by the presence of at least 3 of the following: obesity, hypertension, hyperglycemia, hypertriglyceridemia, and low high-density lipoprotein. Individuals with MetS also typically have elevated plasma levels of the antifibrinolytic factor, plasminogen activator inhibitor-1 (PAI-1), but the relationships between PAI-1 and MetS diagnostic criteria are not clear. Understanding these relationships can elucidate the relevance of MetS to cardiovascular disease risk, because PAI-1 is associated with ischemic events and directly involved in thrombosis. Methods and Results: In a cross-sectional analysis of 2220 Ghanaian men and women from urban and rural locales, we found the age-standardized prevalence of MetS to be as high as 21.4% (urban women). PAI-1 level increased exponentially as the number of diagnostic criteria increased linearly (P < 10⁻¹³), supporting the conclusion that MetS components have a joint effect that is stronger than their additive contributions. Body mass index, triglycerides, and fasting glucose were more strongly correlated with PAI-1 than with canonical MetS criteria, and this pattern did not change when pair-wise correlations were conditioned on all other risk factors, supporting an independent role for PAI-1 in MetS. Finally, whereas the correlations between conventional risk factors did not vary significantly by sex or across urban and rural environments, correlations with PAI-1 were generally stronger among urban participants. Conclusions: MetS prevalence in the West African population we studied was comparable to that of the industrialized West. PAI-1 may serve as a key link between MetS, as currently defined, and the endpoints with which it is associated. Whether this association is generalizable will require follow-up.
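The "at least 3 of 5" diagnostic rule stated in the Background can be sketched in a few lines. The abstract does not give the clinical cutoffs that define each criterion, so this illustration takes each one as an already-determined yes/no flag; the dictionary keys and function names are hypothetical.

```python
# The five diagnostic criteria named in the abstract, as hypothetical flag names.
METS_CRITERIA = (
    "obesity",
    "hypertension",
    "hyperglycemia",
    "hypertriglyceridemia",
    "low_hdl",
)


def count_criteria(flags: dict) -> int:
    """Count how many of the five criteria are present for one participant."""
    missing = [name for name in METS_CRITERIA if name not in flags]
    if missing:
        raise KeyError(f"missing criteria: {missing}")
    return sum(bool(flags[name]) for name in METS_CRITERIA)


def has_mets(flags: dict, minimum: int = 3) -> bool:
    """Apply the 'at least 3 of the following' rule from the abstract."""
    return count_criteria(flags) >= minimum


# Illustrative participant (not study data): three criteria present.
participant = {
    "obesity": True,
    "hypertension": True,
    "hyperglycemia": False,
    "hypertriglyceridemia": True,
    "low_hdl": False,
}
print(count_criteria(participant), has_mets(participant))
```

The criteria count returned by `count_criteria` is the same quantity against which the abstract reports PAI-1 levels rising.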