The combined impact of common and rare exonic variants in COVID-19 host genetics is currently insufficiently understood. Here, common and rare variants from whole-exome sequencing data of about 4000 SARS-CoV-2-positive individuals were used to define an interpretable machine-learning model for predicting COVID-19 severity. First, variants were converted into separate sets of Boolean features, depending on the absence or the presence of variants in each gene. An ensemble of LASSO logistic regression models was used to identify the most informative Boolean features with respect to the genetic bases of severity. The Boolean features selected by these logistic models were combined into an Integrated PolyGenic Score that offers a synthetic and interpretable index for describing the contribution of host genetics to COVID-19 severity, as demonstrated through testing in several independent cohorts. Selected features belong to ultra-rare, rare, low-frequency, and common variants, including those in linkage disequilibrium with known GWAS loci. Notably, around one quarter of the selected genes are sex-specific. Pathway analysis of the selected genes associated with COVID-19 severity reflected the multi-organ nature of the disease. The proposed model might provide useful information for developing diagnostics and therapeutics, while also being able to guide bedside disease management.
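The feature-encoding step described in this abstract can be illustrated with a minimal sketch. It only shows how per-gene presence/absence features might be built and then summed into a toy index. The gene names, the function names, and the unweighted scoring rule are assumptions made for illustration; the abstract does not give the actual Integrated PolyGenic Score formula, and the LASSO feature-selection step is not reproduced here.

```python
from typing import Dict, Iterable, List


def boolean_features(variant_genes: Iterable[str], gene_panel: List[str]) -> Dict[str, bool]:
    """Encode one individual as presence/absence of a variant in each panel gene."""
    carried = set(variant_genes)
    return {gene: gene in carried for gene in gene_panel}


def toy_score(features: Dict[str, bool],
              severity_genes: List[str],
              protective_genes: List[str]) -> int:
    """Hypothetical index: severity-associated features minus protective ones.

    This is NOT the published score; it is an unweighted stand-in.
    """
    risk = sum(features[g] for g in severity_genes)
    protection = sum(features[g] for g in protective_genes)
    return risk - protection


# Hypothetical gene labels, not taken from the study.
panel = ["GENE_A", "GENE_B", "GENE_C"]
person = boolean_features(["GENE_A", "GENE_C"], panel)
print(person)
print(toy_score(person, severity_genes=["GENE_A", "GENE_B"], protective_genes=["GENE_C"]))
```

In a fuller pipeline, the severity and protective gene lists would come from the coefficients of the fitted logistic models rather than being supplied by hand.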
SARS-CoV-2 has caused a worldwide epidemic of enormous proportions, which resulted in different mortality rates in different countries for unknown reasons. We analyzed factors associated with mortality using data from the Italian national database of more than 4 million SARS-CoV-2-positive cases diagnosed between January 2020 and July 2021, including > 415 thousand hospitalized for coronavirus disease-19 (COVID-19) and > 127 thousand deceased. For patients for whom age, sex and date of infection detection were available, we determined the impact of these variables on mortality 30 days after the date of diagnosis or hospitalization. Multivariable weighted Cox analysis showed that each of the analyzed variables independently affected COVID-19 mortality. Specifically, in the overall series, age was the main risk factor for mortality, with HR > 100 in the age groups older than 65 years compared with a reference group of 15–44 years. Male sex presented a two-fold higher risk of death than female sex. Patients infected after the first pandemic wave (i.e. after 30 June 2020) had an approximately threefold lower risk of death than those infected during the first wave. Thus, in a series of all confirmed SARS-CoV-2-infected cases in an entire European nation, elderly age was by far the most significant risk factor for COVID-19 mortality, confirming that protecting the elderly should be a priority in pandemic management. Male sex and being infected during the first wave were additional risk factors associated with COVID-19 mortality.
Red cell polymorphisms can provide evidence of human migration and adaptation patterns. In Eurasia, the distribution of Diego blood group system polymorphisms remains unaddressed. To shed light on the dispersal of the Di antigen, we analyzed correlations among the frequencies of the DI*01 allele, the C2-M217 and C2-M401 Y-chromosome haplotypes (considered to be of Mongolian origin), and language affiliations in 75 Eurasian populations, including DI*01 frequency data from the HGDP-CEPH panel. We found that DI*01 reaches its highest frequency in Mongolia, Turkmenistan and Kyrgyzstan, having expanded southward and westward across Asia with Altaic-speaking nomadic carriers of C2-M217, and even more precisely C2-M401, from their homeland, presumably in Mongolia, between the third century BCE and the thirteenth century CE. The present study highlights gene-culture co-migration accompanying the demographic movements that occurred during the past two millennia in Central and East Asia. Additionally, this work contributes to a better understanding of the distribution of immunogenic erythrocyte polymorphisms, with a view to improving transfusion safety.
The risk of colorectal cancer (CRC) depends on environmental and genetic factors. Among environmental factors, an imbalance in the gut microbiota can increase CRC risk. The microbiota is also influenced by host genetics. However, it is not known whether germline variants influence CRC development by modulating microbiota composition. We investigated germline variants associated with the abundance of bacterial populations in the normal (non-involved) colorectal mucosa of 93 CRC patients and evaluated their possible role in disease. Using multivariable linear regression, we assessed the association between germline variants identified by genome-wide genotyping and bacterial abundances determined by 16S rRNA gene sequencing. We identified 37 germline variants associated with the abundance of the genera Bacteroides, Ruminococcus, Akkermansia, Faecalibacterium and Gemmiger and with alpha diversity. These variants are correlated with the expression of 58 genes involved in inflammatory responses, cell adhesion, apoptosis and barrier integrity. Genes and bacteria appear to be involved in the same processes. In fact, expression of the pro-inflammatory genes GAL, GSDMD and LY6H was correlated with the abundance of Bacteroides, which has pro-inflammatory properties; abundance of the anti-inflammatory genus Faecalibacterium correlated with expression of KAZN, which has barrier-enhancing functions. Both the microbiota composition and local inflammation are regulated, at least partially, by the same germline variants. These variants may regulate the microenvironment in which bacteria grow and predispose to the development of cancer. Identification of these variants is the first step to identifying higher-risk individuals and proposing tailored preventive treatments that increase beneficial bacterial populations.
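The variant-abundance association described in this abstract can be sketched in a much reduced form. The study used multivariable linear regression; the example below fits only a single-predictor least-squares line of a genus abundance on genotype dosage (0, 1 or 2 minor alleles), with no covariates. The function name and the numbers are assumptions made up for illustration, not study data.

```python
from typing import Sequence, Tuple


def ols_slope(dosage: Sequence[float], abundance: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares for abundance = intercept + slope * dosage."""
    n = len(dosage)
    mean_x = sum(dosage) / n
    mean_y = sum(abundance) / n
    sxx = sum((x - mean_x) ** 2 for x in dosage)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosage, abundance))
    slope = sxy / sxx  # change in abundance per additional minor allele
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up dosages and (transformed) abundances for six hypothetical patients.
dosage = [0, 1, 2, 0, 1, 2]
abundance = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
slope, intercept = ols_slope(dosage, abundance)
print(slope, intercept)
```

A positive slope would mean that abundance rises with each copy of the minor allele; a multivariable model would add covariates as further predictors.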
Emerging evidence suggests that the prognosis of patients with lung adenocarcinoma can be determined from germline variants and transcript levels in nontumoral lung tissue. Gene expression data from noninvolved lung tissue of 483 lung adenocarcinoma patients were tested for correlation with overall survival using multivariable Cox proportional hazards and multivariate machine learning models. For genes whose transcript levels are associated with survival, we used genotype data from 414 patients to identify germline variants acting as cis-expression quantitative trait loci (eQTLs). Associations of eQTL variant genotypes with gene expression and survival were tested. Levels of four transcripts were inversely associated with survival by Cox analysis (CLCF1, hazard ratio [HR] = 1.53; CNTNAP1, HR = 2.17; DUSP14, HR = 1.78; and MT1F, HR = 1.40). Machine learning analysis identified a signature of transcripts associated with lung adenocarcinoma outcome that largely overlapped with the transcripts identified by Cox analysis, including the three most significant genes (CLCF1, CNTNAP1, and DUSP14). Pathway analysis indicated that the signature is enriched for ECM components. We identified 32 cis-eQTLs for CNTNAP1, including 6 with an inverse correlation and 26 with a direct correlation between the number of minor alleles and transcript levels. Of these, all but one were prognostic: the six with an inverse correlation were associated with better prognosis (HR < 1) while the others were associated with worse prognosis. Our findings provide supportive evidence that genetic predisposition to lung adenocarcinoma outcome is a feature already present in patients' noninvolved lung tissue.
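The distinction between "direct" and "inverse" cis-eQTLs in this abstract rests on the sign of the relationship between minor-allele count and transcript level. A minimal sketch of that classification, using a plain Pearson correlation, might look as follows. The abstract does not state which statistic was used, so the Pearson choice, the function names, and the sample values are assumptions.

```python
from math import sqrt
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def eqtl_direction(minor_allele_counts: Sequence[float], expression: Sequence[float]) -> str:
    """Label the eQTL by the sign of the allele-count/expression correlation."""
    r = pearson_r(minor_allele_counts, expression)
    if r > 0:
        return "direct"
    if r < 0:
        return "inverse"
    return "none"


# Made-up genotypes (0/1/2 minor alleles) and transcript levels.
print(eqtl_direction([0, 1, 2, 1], [5.0, 6.1, 7.2, 6.0]))
print(eqtl_direction([0, 1, 2], [3.0, 2.0, 1.0]))
```

A real analysis would also test whether the correlation is statistically significant before labeling a variant as an eQTL.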
Background: SARS-CoV-2 has caused a worldwide epidemic of enormous proportions, which resulted in different mortality rates in different countries for unknown reasons. Aim: We aimed to evaluate which independent parameters are associated with risk of mortality from COVID-19 in a series that includes all Italian cases, i.e., more than 4 million individuals infected with the SARS-CoV-2 coronavirus. Methods: We analyzed factors associated with mortality using data from the Italian national database of SARS-CoV-2-positive cases, including more than 4 million cases, >415 thousand hospitalized for coronavirus disease-19 (COVID-19) and >127 thousand deceased. For patients for whom age, sex and date of infection detection were available, we determined the impact of these variables on mortality 30 days after the date of diagnosis or hospitalization. Results: Multivariable Cox analysis showed that each of the analyzed variables independently affected COVID-19 mortality. Specifically, in the overall series, age was the main risk factor for mortality, with HR >100 in the age groups older than 65 years compared with a reference group of 15-44 years. Male sex presented an excess risk of death (HR = 2.1; 95% CI, 2.0-2.1). Patients infected in the first pandemic wave (before 30 June 2020) had a greater risk of death than those infected later (HR = 2.7; 95% CI, 2.7-2.8). Conclusions: In a series of all confirmed SARS-CoV-2-infected cases in an entire European nation, elderly age was by far the most significant risk factor for COVID-19 mortality, confirming that protecting the elderly should be a priority in pandemic management. Male sex and being infected during the first wave were additional risk factors associated with COVID-19 mortality.
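The hazard ratio of 2.7 for first-wave versus later infection can be re-expressed with the later period as the exposure of interest by taking the reciprocal, which gives roughly a threefold lower risk. A short sketch of that arithmetic follows; the function name is an assumption, and the inputs are the rounded values reported above, so the result is only approximate.

```python
from typing import Tuple


def invert_hazard_ratio(hr: float, ci_low: float, ci_high: float) -> Tuple[float, float, float]:
    """Swap reference and comparison groups: HR -> 1/HR, with CI bounds flipped."""
    return 1.0 / hr, 1.0 / ci_high, 1.0 / ci_low


# First wave vs. later infection, as reported: HR = 2.7 (95% CI, 2.7-2.8).
hr, low, high = invert_hazard_ratio(2.7, 2.7, 2.8)
print(f"later vs. first wave: HR = {hr:.2f} (95% CI, {low:.2f}-{high:.2f})")
```

Because the published estimate and its lower bound both round to 2.7, the inverted estimate coincides with the upper bound of the inverted interval.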
SARS-CoV-2 has caused a worldwide epidemic of enormous proportions, which resulted in different mortality rates in different countries for unknown reasons. We analyzed factors associated with mortality using data from the Italian national database of SARS-CoV-2-positive cases, including more than 4 million cases, >415 thousand hospitalized for coronavirus disease-19 (COVID-19) and >127 thousand deceased. For patients for whom age, sex and date of infection detection were available, we determined the impact of these variables on mortality 30 days after the date of diagnosis or hospitalization. Multivariable Cox analysis showed that each of the analyzed variables independently affected COVID-19 mortality. Specifically, in the overall series, age was the main risk factor for mortality, with HR >100 in the age groups older than 65 years compared with a reference group of 15-44 years. Male sex presented a two-fold higher risk of death than female sex. Patients infected after the first pandemic wave (i.e., after 30 June 2020) had an approximately threefold lower risk of death than those infected during the first wave. Thus, in a series of all confirmed SARS-CoV-2-infected cases in an entire European nation, elderly age was by far the most significant risk factor for COVID-19 mortality, confirming that protecting the elderly should be a priority in pandemic management. Male sex and being infected during the first wave were additional risk factors associated with COVID-19 mortality.