Chronic kidney disease (CKD), impairment of kidney function, is a serious public health problem, and the assessment of genetic factors influencing kidney function has substantial clinical relevance. Here, we report a meta-analysis of genome-wide association studies for kidney function–related traits, including 71,149 east Asian individuals from 18 studies in 11 population-, hospital- or family-based cohorts, conducted as part of the Asian Genetic Epidemiology Network (AGEN). Our meta-analysis identified 17 loci newly associated with kidney function–related traits, including the concentrations of blood urea nitrogen, uric acid and serum creatinine and estimated glomerular filtration rate based on serum creatinine levels (eGFRcrea) (P < 5.0 × 10⁻⁸). We further examined these loci with in silico replication in individuals of European ancestry from the KidneyGen, CKDGen and GUGC consortia, including a combined total of ~110,347 individuals. We identify pleiotropic associations among these loci with kidney function–related traits and risk of CKD. These findings provide new insights into the genetics of kidney function.
Empirical residue-residue pair potentials are used to screen possible complexes for protein-protein dockings. A correct docking is defined as a complex with not more than 2.5 Å root-mean-square distance from the known experimental structure. The complexes were generated by "ftdock" (Gabb et al. J Mol Biol 1997;272:106-120), which ranks using shape complementarity. The complexes studied were 5 enzyme-inhibitors and 2 antibody-antigens, starting from the unbound crystallographic coordinates, with a further 2 antibody-antigens where the antibody was from the bound crystallographic complex. The pair potential functions tested were derived both from observed intramolecular pairings in a database of nonhomologous protein domains, and from observed intermolecular pairings across the interfaces in sets of nonhomologous heterodimers and homodimers. Out of various alternate strategies, we found the optimal method used a mole-fraction calculated random model from the intramolecular pairings. For all the systems, a correct docking was placed within the top 12% of the pair potential score ranked complexes. A combined strategy was developed that incorporated "multidock," a side-chain refinement algorithm (Jackson et al. J Mol Biol 1998;276:265-285). This placed a correct docking within the top 5 complexes for enzyme-inhibitor systems, and within the top 40 complexes for antibody-antigen systems.
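To make the idea of a pair potential with a mole-fraction random model more concrete, here is a minimal Python sketch. It is not the authors' implementation: the function names, the pseudocount, the log-odds form, and the choice to estimate mole fractions from the residues that appear in the observed pairs are all assumptions made for illustration. Each residue-type pair is scored by comparing its observed count with the count expected if residues paired at random in proportion to their mole fractions.

```python
import math
from collections import Counter
from itertools import combinations_with_replacement


def pair_potentials(observed_pairs, pseudocount=1.0):
    """Log-odds score for every unordered residue-type pair.

    observed_pairs: iterable of (res_a, res_b) contacts (hypothetical input).
    Expected counts use a mole-fraction random model: the chance of a pair
    is the product of the overall fractions of its two residue types.
    """
    pair_counts = Counter(tuple(sorted(p)) for p in observed_pairs)

    # Each pair contributes one residue of each type (two for a self-pair).
    residue_counts = Counter()
    for (a, b), n in pair_counts.items():
        residue_counts[a] += n
        residue_counts[b] += n

    total_pairs = sum(pair_counts.values())
    total_residues = sum(residue_counts.values())

    scores = {}
    for a, b in combinations_with_replacement(sorted(residue_counts), 2):
        x_a = residue_counts[a] / total_residues
        x_b = residue_counts[b] / total_residues
        # Unlike pairs can form in two orders, hence the factor of 2.
        expected = total_pairs * x_a * x_b * (1 if a == b else 2)
        observed = pair_counts.get((a, b), 0)
        scores[(a, b)] = math.log((observed + pseudocount) / (expected + pseudocount))
    return scores


def score_complex(interface_pairs, scores):
    """Sum pair scores over the residue pairs across a docked interface."""
    return sum(scores.get(tuple(sorted(p)), 0.0) for p in interface_pairs)


# Toy data (invented): ASP-LYS contacts are over-represented.
toy_pairs = [("ASP", "LYS")] * 8 + [("ASP", "ASP")] + [("LYS", "LYS")]
toy_scores = pair_potentials(toy_pairs)
```

With this sign convention a higher total means the interface contains more pairs that occur more often than chance, so candidate complexes could be ranked by descending `score_complex` value.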
The results of the first Critical Assessment of Fully Automated Structure Prediction (CAFASP-1) are presented. The objective was to evaluate the success rates of fully automatic web servers for fold recognition which are available to the community. This study was based on the targets used in the third meeting on the Critical Assessment of Techniques for Protein Structure Prediction (CASP-3). However, unlike CASP-3, the study was not a blind trial, as it was held after the structures of the targets were known. The aim was to assess the performance of methods without the user intervention that several groups used in their CASP-3 submissions. Although it is clear that "human plus machine" predictions are superior to automated ones, this CAFASP-1 experiment is extremely valuable for users of our methods; it provides an indication of the performance of the methods alone, and not of the "human plus machine" performance assessed in CASP. This information may aid users in choosing which programs they wish to use and in evaluating the reliability of the programs when applied to their specific prediction targets. In addition, evaluation of fully automated methods is particularly important to assess their applicability at genomic scales. For each target, groups submitted the top-ranking folds generated from their servers. In CAFASP-1 we concentrated on fold-recognition web servers only and evaluated only recognition of the correct fold, and not, as in CASP-3, alignment accuracy. Although some performance differences appeared within each of the four target categories used here, overall, no single server has proved markedly superior to the others. The results showed that current fully automated fold recognition servers can often identify remote similarities when pairwise sequence search methods fail. Nevertheless, in only a few cases outside the family-level targets has the score of the top-ranking fold been significant enough to allow for a confident fully automated prediction. Because the goals, rules, and procedures of CAFASP-1 were different from those used at CASP-3, the results reported here are not comparable with those reported in CASP-3. Nevertheless, it is clear that current automated fold recognition methods cannot yet compete with "human-expert plus machine" predictions. Finally, CAFASP-1 has been useful in identifying the requirements for a future blind trial of automated server-based protein structure prediction.
An 'intrinsically disordered protein' (IDP) is assumed to be unfolded in the cell and perform its biological function in that state. We contend that most intrinsically disordered proteins are in fact proteins waiting for a partner (PWPs), parts of a multi-component complex that do not fold correctly in the absence of other components. Flexibility, not disorder, is an intrinsic property of proteins, exemplified by X-ray structures of many enzymes and protein-protein complexes. Disorder is often observed with purified proteins in vitro and sometimes also in crystals, where it is difficult to distinguish from flexibility. In the crowded environment of the cell, disorder is not compatible with the known mechanisms of protein-protein recognition, and, foremost, with its specificity. The self-assembly of multi-component complexes may, nevertheless, involve the specific recognition of nascent polypeptide chains that are incompletely folded, but then disorder is transient, and it must remain under the control of molecular chaperones and of the quality control apparatus that obviates the toxic effects it can have on the cell.
There is a pressing need for accurate in silico methods to predict the toxicity of molecules that are being introduced into the environment or are being developed into new pharmaceuticals. Predictive toxicology is in the realm of structure-activity relationships (SAR), and many approaches have been used to derive such SAR. Previous work has shown that inductive logic programming (ILP) is a powerful approach that circumvents several major difficulties, such as molecular superposition, faced by some other SAR methods. The ILP approach reasons with chemical substructures within a relational framework and yields chemically understandable rules. Here, we report a general new approach, support vector inductive logic programming (SVILP), which extends the essentially qualitative ILP-based SAR to quantitative modeling. First, ILP is used to learn rules, the predictions of which are then used within a novel kernel to derive a support-vector generalization model. For a highly heterogeneous dataset of 576 molecules with known fathead minnow fish toxicity, the cross-validated correlation coefficients (R²_CV) from a chemical descriptor method (CHEM) and SVILP are 0.52 and 0.66, respectively. The ILP, CHEM, and SVILP approaches correctly predict 55, 58, and 73%, respectively, of toxic molecules. In a set of 165 unseen molecules, the R² values from the commercial software TOPKAT and SVILP are 0.26 and 0.57, respectively. In all calculations, SVILP showed significant improvements in comparison with the other methods. The SVILP approach has a major advantage in that it uses ILP automatically and consistently to derive rules, mostly novel, describing fragments that are toxicity alerts. The SVILP is a general machine-learning approach and has the potential of tackling many problems relevant to chemoinformatics including in silico drug design.
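The step in which rule predictions feed a kernel can be illustrated with a small Python sketch. The abstract does not specify the kernel, so everything here is an assumption: molecules are represented as sets of fragment names, each "rule" is a hypothetical predicate standing in for a learned ILP clause, the base kernel simply counts rules that fire on both molecules, and a Gaussian (RBF) kernel is built on the distance that this count induces. The resulting kernel values are the kind of quantity a support-vector regressor could consume.

```python
import math


def rule_coverage(molecule, rules):
    """Indices of the rules whose predicate fires on this molecule."""
    return frozenset(i for i, rule in enumerate(rules) if rule(molecule))


def shared_rule_kernel(cov_a, cov_b):
    """Base kernel: number of rules that cover both molecules."""
    return float(len(cov_a & cov_b))


def rbf_rule_kernel(cov_a, cov_b, sigma=1.0):
    """Gaussian kernel on the distance induced by the shared-rule kernel."""
    sq_dist = (
        shared_rule_kernel(cov_a, cov_a)
        - 2.0 * shared_rule_kernel(cov_a, cov_b)
        + shared_rule_kernel(cov_b, cov_b)
    )
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


# Hypothetical rules standing in for learned ILP clauses.
rules = [
    lambda m: "nitro" in m,
    lambda m: "aromatic_ring" in m,
    lambda m: "halogen" in m,
]

# Invented molecules, described only by the fragments they contain.
mol_a = {"nitro", "aromatic_ring"}
mol_b = {"nitro"}

cov_a = rule_coverage(mol_a, rules)
cov_b = rule_coverage(mol_b, rules)
similarity = rbf_rule_kernel(cov_a, cov_b)
```

Two molecules covered by exactly the same rules get a kernel value of 1, and the value decays as the sets of firing rules diverge; `sigma` controls how quickly.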
SARS-CoV-2 is a novel virus causing mainly respiratory, but also gastrointestinal symptoms. Elucidating the molecular processes underlying SARS-CoV-2 infection, and how the genetic background of an individual is responsible for the variability in clinical presentation and severity of COVID-19, is essential in understanding this disease. Cell infection by the SARS-CoV-2 virus requires binding of its Spike (S) protein to the ACE2 cell surface protein and priming of the S by the serine protease TMPRSS2. One may expect that genetic variants leading to a defective TMPRSS2 protein can affect SARS-CoV-2 ability to infect cells. We used a range of bioinformatics methods to estimate the prevalence and pathogenicity of TMPRSS2 genetic variants in the human population, and assess whether TMPRSS2 and ACE2 are co-expressed in the intestine, similarly to what is observed in lungs. We generated a 3D structural model of the TMPRSS2 extracellular domain using the prediction server Phyre and studied 378 naturally-occurring TMPRSS2 variants reported in the GnomAD database. One common variant, p.V160M (rs12329760), is predicted damaging by both SIFT and PolyPhen2 and has a MAF of 0.25. Valine 160 is a highly conserved residue within the SRCS domain. The SRCS is found in proteins involved in host defence, such as CD5 and CD6, but its role in TMPRSS2 remains unknown. 84 rare variants (53 missense and 31 leading to a prematurely truncated protein, cumulative minor allele frequency (MAF) 7.34×10⁻⁴) cause structural destabilization and possibly protein misfolding, and are also predicted damaging by SIFT and PolyPhen2 prediction tools. Moreover, we extracted gene expression data from the human protein atlas and showed that both ACE2 and TMPRSS2 are expressed in the small intestine, duodenum and colon, as well as the kidneys and gallbladder. The implications of our study are that: (i) TMPRSS2 variants, in particular p.V160M with a MAF of 0.25, should be investigated as a possible marker of disease severity and prognosis in COVID-19 and (ii) in vitro validation of the co-expression of TMPRSS2 and ACE2 in the gastrointestinal tract is warranted.
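To give a feel for what the reported allele frequencies mean at the level of individuals, the following Python sketch converts a minor allele frequency into expected genotype frequencies. It assumes Hardy-Weinberg equilibrium, which the abstract does not state, and it treats the cumulative rare-variant MAF as if it were a single combined allele; both are simplifications, and the function name is hypothetical.

```python
import math


def genotype_frequencies(maf):
    """Expected genotype frequencies under Hardy-Weinberg equilibrium.

    maf: minor allele frequency q, with 0 <= q <= 1.
    Returns (homozygous major, heterozygous, homozygous minor).
    """
    if not 0.0 <= maf <= 1.0:
        raise ValueError("maf must lie between 0 and 1")
    p = 1.0 - maf
    return p * p, 2.0 * p * maf, maf * maf


# Common variant from the text: p.V160M with MAF 0.25.
hom_major, het, hom_minor = genotype_frequencies(0.25)
carriers_common = het + hom_minor  # at least one copy of the minor allele

# Cumulative MAF of the 84 rare variants, treated as one combined allele.
_, het_rare, hom_rare = genotype_frequencies(7.34e-4)
carriers_rare = het_rare + hom_rare
```

Under these assumptions a MAF of 0.25 corresponds to 37.5% heterozygotes and 6.25% minor-allele homozygotes, so 43.75% of individuals would carry at least one copy, whereas carriers of any of the rare variants would be on the order of 0.15%.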