BACKGROUND: There is considerable variation in disease behavior among patients infected with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the virus that causes coronavirus disease 2019 (Covid-19). Genomewide association analysis may allow for the identification of potential genetic factors involved in the development of Covid-19. METHODS: We conducted a genomewide association study involving 1980 patients with Covid-19 and severe disease (defined as respiratory failure) at seven hospitals in the Italian and Spanish epicenters of the SARS-CoV-2 pandemic in Europe. After quality control and the exclusion of population outliers, 835 patients and 1255 control participants from Italy and 775 patients and 950 control participants from Spain were included in the final analysis. In total, we analyzed 8,582,968 single-nucleotide polymorphisms and conducted a meta-analysis of the two case-control panels. RESULTS: We detected cross-replicating associations with rs11385942 at locus 3p21.31 and with rs657152 at locus 9q34.2, which were significant at the genomewide level (P<5×10⁻⁸) in the meta-analysis of the two case-control panels (odds ratio, 1.77; 95% confidence interval [CI], 1.48 to 2.11; P = 1.15×10⁻¹⁰; and odds ratio, 1.32; 95% CI, 1.20 to 1.47; P = 4.95×10⁻⁸, respectively). At locus 3p21.31, the association signal spanned the genes SLC6A20, LZTFL1, CCR9, FYCO1, CXCR6 and XCR1. The association signal at locus 9q34.2 coincided with the ABO blood group locus; in this cohort, a blood-group-specific analysis showed a higher risk in blood group A than in other blood groups (odds ratio, 1.45; 95% CI, 1.20 to 1.75; P = 1.48×10⁻⁴) and a protective effect in blood group O as compared with other blood groups (odds ratio, 0.65; 95% CI, 0.53 to 0.79; P = 1.06×10⁻⁵). CONCLUSIONS: We identified a 3p21.31 gene cluster as a genetic susceptibility locus in patients with Covid-19 with respiratory failure and confirmed a potential involvement of the ABO blood-group system.
(Funded by Stein Erik Hagen and others.)
Background. Respiratory failure is a key feature of severe Covid-19 and a critical driver of mortality but, for poorly understood reasons, affects less than 10% of patients infected with SARS-CoV-2. Methods. We included 1,980 patients with Covid-19 respiratory failure at seven centers in the Italian and Spanish epicenters of the SARS-CoV-2 pandemic in Europe (Milan, Monza, Madrid, San Sebastian and Barcelona) in a genome-wide association analysis. After quality control and exclusion of population outliers, 835 patients and 1,255 population-derived controls from Italy, and 775 patients and 950 controls from Spain were included in the final analysis. In total we analyzed 8,582,968 single-nucleotide polymorphisms (SNPs) and conducted a meta-analysis of both case-control panels. Results. We detected cross-replicating associations with rs11385942 at chromosome 3p21.31 and rs657152 at 9q34, which were genome-wide significant (P<5×10⁻⁸) in the meta-analysis of both study panels (odds ratio [OR], 1.77; 95% confidence interval [CI], 1.48 to 2.11; P=1.14×10⁻¹⁰; and OR, 1.32; 95% CI, 1.20 to 1.47; P=4.95×10⁻⁸, respectively). Among six genes at 3p21.31, SLC6A20 encodes a known interaction partner of angiotensin-converting enzyme 2 (ACE2). The association signal at 9q34 was located at the ABO blood group locus, and a blood-group-specific analysis showed higher risk for A-positive individuals (OR=1.45; 95% CI, 1.20 to 1.75; P=1.48×10⁻⁴) and a protective effect for blood group O (OR=0.65; 95% CI, 0.53 to 0.79; P=1.06×10⁻⁵). Conclusions. We report the first robust genetic susceptibility loci for the development of respiratory failure in Covid-19. The identified variants may help guide targeted exploration of severe Covid-19 pathophysiology.
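As a rough consistency check on the meta-analysis figures quoted above, a reported odds ratio and its 95% confidence interval can be converted back into an approximate standard error, z-score and two-sided P value, assuming a normal (Wald) approximation on the log-odds scale. The sketch below is an illustration only and not the authors' analysis pipeline; the function name is hypothetical, and because the published numbers are rounded, the result should match the reported P value only to within roughly an order of magnitude.

```python
import math

def z_and_p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate z-score and two-sided P value from an OR and its 95% CI.

    Assumes the CI is symmetric on the log scale (Wald interval), which is
    an assumption of this sketch, not something stated in the abstract.
    """
    log_or = math.log(odds_ratio)
    # Width of the 95% CI on the log scale equals 2 * z_crit * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = log_or / se
    # Two-sided tail probability of a standard normal: erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p

# Values quoted in the abstract for rs11385942 at 3p21.31.
z, p = z_and_p_from_or_ci(1.77, 1.48, 2.11)
print(f"z = {z:.2f}, approximate two-sided P = {p:.2e}")
```

For rs11385942 this gives a z-score of about 6.3 and a P value on the order of 10⁻¹⁰, in line with the reported result once rounding of the odds ratio and interval is taken into account.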
Background: The number of applications of deep learning algorithms in bioinformatics is increasing, as they usually achieve superior performance over classical approaches, especially when larger training datasets are available. In deep learning applications, discrete data, e.g. words or n-grams in language, or amino acids or nucleotides in bioinformatics, are generally represented as a continuous vector through an embedding matrix. Recently, learning this embedding matrix directly from the data as part of the continuous iteration of the model to optimize the target prediction – a process called ‘end-to-end learning’ – has led to state-of-the-art results in many fields. Although the use of embeddings is well described in the bioinformatics literature, the potential of end-to-end learning for single amino acids, as compared to more classical manually curated encoding strategies, has not been systematically addressed. To this end, we compared classical encoding matrices, namely one-hot, VHSE8 and BLOSUM62, to end-to-end learning of amino acid embeddings for two different prediction tasks using three widely used architectures, namely recurrent neural networks (RNN), convolutional neural networks (CNN), and the hybrid CNN-RNN. Results: By using different deep learning architectures, we show that end-to-end learning is on par with classical encodings for embeddings of the same dimension even when limited training data is available, and might allow for a reduction in the embedding dimension without performance loss, which is critical when deploying the models to devices with limited computational capacities. We found that the embedding dimension is a major factor in controlling the model performance. Surprisingly, we observed that deep learning models are capable of learning from random vectors of appropriate dimension. Conclusion: Our study shows that end-to-end learning is a flexible and powerful method for amino acid encoding. Further, due to the flexibility of deep learning systems, amino acid encoding schemes should be benchmarked against random vectors of the same dimension to disentangle the information content provided by the encoding scheme from the distinguishability effect provided by the scheme.
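To make the proposed baseline concrete, the following minimal sketch encodes a peptide with a one-hot table and with a table of fixed random vectors of the same dimension. It is not the study's code: the function names, the example peptide, the seed and the choice of standard-normal random vectors are all assumptions made for illustration, and the abstract does not state how the random vectors were drawn. An end-to-end learned embedding would start from a table of the same shape and update it during training, which is not shown here.

```python
import random

# The 20 standard amino acids, in a fixed (arbitrary) order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}

def one_hot_table():
    """Return a 20 x 20 table; row i is the one-hot vector of residue i."""
    n = len(AMINO_ACIDS)
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def random_table(dim, seed=0):
    """Return a 20 x dim table of fixed random vectors (assumed N(0, 1))."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in AMINO_ACIDS]

def encode(sequence, table):
    """Map each residue of the sequence to its row in the encoding table."""
    return [table[AA_INDEX[aa]] for aa in sequence]

peptide = "ACDK"  # hypothetical example peptide
onehot_encoded = encode(peptide, one_hot_table())
random_encoded = encode(peptide, random_table(dim=20, seed=0))
print(len(onehot_encoded), len(onehot_encoded[0]))
print(len(random_encoded), len(random_encoded[0]))
```

Both encodings give each residue a distinct vector of the same dimension, so a model trained on the random table receives distinguishability without any curated biochemical information, which is the comparison the conclusion argues for.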
Inflammatory bowel disease (IBD) is a chronic inflammatory disease of the gut. Genetic association studies have identified the highly variable human leukocyte antigen (HLA) region as the strongest susceptibility locus for IBD, and specifically DRB1*01:03 as a determining factor for ulcerative colitis (UC). However, for most of the association signal such a delineation could not be made due to tight structures of linkage disequilibrium within the HLA. The aim of this study was therefore to further characterize the HLA signal using a trans-ethnic approach. We performed a comprehensive fine mapping of single HLA alleles in UC in a cohort of 9272 individuals of African American, East Asian, Puerto Rican, Indian and Iranian descent and 40 691 previously analyzed Caucasians, additionally analyzing whole HLA haplotypes. We computationally characterized the binding of associated HLA alleles to human self-peptides and analyzed the physico-chemical properties of the HLA proteins and predicted self-peptidomes. Highlighting alleles of the HLA-DRB1*15 group and their correlated HLA-DQ-DR haplotypes, we identified consistent associations (regarding effect directions/magnitudes) across different ethnicities but also identified population-specific signals (regarding differences in allele frequencies). We observed that DRB1*01:03 is mostly present in individuals of Western European descent and hardly present in non-Caucasian individuals. We found peptides predicted to bind to risk HLA alleles to be rich in positively charged amino acids. We conclude that the HLA plays an important role in UC susceptibility across different ethnicities. This research further implicates specific features of peptides that are predicted to bind risk and protective HLA proteins.
Given the highly variable clinical phenotype of coronavirus disease 2019 (COVID-19), a deeper analysis of the host genetic contribution to severe COVID-19 is important to improve our understanding of underlying disease mechanisms. Here, we describe an extended GWAS meta-analysis of a well-characterized cohort of 3255 COVID-19 patients with respiratory failure and 12 488 population controls from Italy, Spain, Norway and Germany/Austria, including stratified analyses based on age, sex and disease severity, as well as targeted analyses of chromosome Y haplotypes, the human leukocyte antigen (HLA) region and the SARS-CoV-2 peptidome. By inversion imputation, we traced a reported association at 17q21.31 to a ~0.9-Mb inversion polymorphism that creates two highly differentiated haplotypes and characterized the potential effects of the inversion in detail. Our data, together with the 5th release of summary statistics from the COVID-19 Host Genetics Initiative including non-Caucasian individuals, also identified a new locus at 19q13.33, including NAPSA, a gene which is expressed primarily in alveolar cells responsible for gas exchange in the lung.