Building a population-specific catalogue of single nucleotide variants (SNVs), indels and structural variants (SVs) with frequencies, termed a national pan-genome, is critical for further advancing clinical and public health genetics in large cohorts. Here we report a Danish pan-genome obtained from sequencing 10 trios to high depth (50×). We report 536k novel SNVs and 283k novel short indels from mapping approaches and develop a population-wide de novo assembly approach to identify 132k novel indels larger than 10 nucleotides with low false discovery rates. We identify a higher proportion of indels and SVs than previous efforts, showing the merits of high coverage and de novo assembly approaches. In addition, we use trio information to identify de novo mutations and use a probabilistic method to provide direct estimates of 1.27e−8 and 1.5e−9 per nucleotide per generation for SNVs and indels, respectively.
Predictions of intramolecular residue-residue contacts were assessed as part of the seventh community-wide Critical Assessment of Structure Prediction experiment (CASP7). As in past assessments, we focused on contacts that lie far apart in sequence as these are likely to be more informative in predicting protein structure. One lab did somewhat better than others according to our assessment, and there is some reason to think that this lab's results represent progress over CASP6. In general, contacts inferred from 3D structural predictions are similar in accuracy to those predicted by contact prediction methods. However, contact prediction methods were more accurate for some targets.
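As a rough illustration of how long-range contact predictions can be scored, the sketch below computes the fraction of predicted contacts that also occur in the native structure, after discarding pairs that are close in sequence. The function name, the input representation (residue index pairs) and the default separation threshold of 24 residues are assumptions made for this example; the abstract does not specify the exact cutoffs or measures used in the assessment.

```python
def long_range_precision(predicted, native, min_separation=24):
    """Fraction of predicted long-range contacts that are also native contacts.

    predicted, native: iterables of (i, j) residue index pairs.
    min_separation: minimum sequence separation |i - j| for a pair to count
    as long-range (an assumed default, not taken from the abstract).
    """
    def long_range(pairs):
        # Store each pair in sorted order so (i, j) and (j, i) are the same contact.
        return {
            (min(i, j), max(i, j))
            for i, j in pairs
            if abs(i - j) >= min_separation
        }

    pred = long_range(predicted)
    true = long_range(native)
    if not pred:
        return 0.0
    return len(pred & true) / len(pred)


if __name__ == "__main__":
    predicted = [(1, 30), (5, 40), (10, 12), (50, 2)]
    native = [(30, 1), (2, 50), (10, 12)]
    print(long_range_precision(predicted, native))
```

In this toy input the pair (10, 12) is ignored because it is too close in sequence, so only three predictions are scored.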
This article details the assessment process and evaluation results for two categories in the 8th Critical Assessment of Protein Structure Prediction experiment (CASP8). The domain prediction category was evaluated with a range of scores including the Normalized Domain Overlap score and a domain boundary distance measure. Residue‐residue contact predictions were evaluated with standard CASP measures, prediction accuracy, and Xd. In the domain boundary prediction category, prediction methods still make reliable predictions for targets that have structural templates, but continue to struggle to make good predictions for the few ab initio targets in CASP. There was little indication of improvement in the domain prediction category. The contact prediction category demonstrated that there was renewed interest among predictors and, despite the small sample size, the results suggested that there had been an increase in prediction accuracy. In contrast to CASP7, contact specialists predicted contacts more accurately than the majority of tertiary structure predictors. Despite this small success, the lack of free modeling targets makes it unlikely that either category will be included in their present form in CASP9. Proteins 2009. © 2009 Wiley‐Liss, Inc.
Purpose of review: Systemic lupus erythematosus (SLE) is caused by a combination of genetic and acquired immuno-deficiencies and environmental factors including infections. An association with Epstein-Barr virus (EBV) has been established by numerous studies over the past decades. Here, we review recent experimental studies on this, and present our integrated theory of SLE development. Recent findings: SLE patients have dysfunctional control of EBV infection resulting in frequent reactivations and disease progression. These comprise impaired functions of EBV-specific T-cells with an inverse correlation to disease activity and elevated serum levels of antibodies against lytic cycle EBV antigens. The presence of EBV proteins in renal tissue from SLE patients with nephritis indicates a direct involvement of EBV in SLE development. As expected for patients with immuno-deficiencies, studies reveal that SLE patients show dysfunctional responses to other viruses as well. An association with EBV infection has also been demonstrated for other autoimmune diseases including Sjögren's syndrome, rheumatoid arthritis, and multiple sclerosis. Summary: Collectively, the interplay between an impaired immune system and the cumulative effects of EBV and other viruses results in frequent reactivations of EBV and enhanced cell death, causing development of SLE and concomitant autoreactivities.
Background: It has repeatedly been shown that interacting protein families tend to have similar phylogenetic trees. These similarities can be used to predict the mapping between two families of interacting proteins (i.e. which proteins from one family interact with which members of the other). The correct mapping will be that which maximizes the similarity between the trees. The two families may comprise both orthologs and paralogs if members of the two families are present in more than one organism. This fact can be exploited to restrict the possible mappings, simply by disallowing links between proteins of different organisms. We present here an algorithm to predict the mapping between families of interacting proteins which is able to incorporate information regarding orthologs, or any other assignment of proteins to "classes" that may restrict possible mappings.
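The idea of restricting mappings by class can be sketched as follows. This is a minimal illustration, not the authors' algorithm: it assumes each tree is summarized by a symmetric matrix of pairwise distances, measures tree similarity as the Pearson correlation between corresponding distances, and searches exhaustively, which is only feasible for very small families. All function and parameter names are hypothetical.

```python
from itertools import permutations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def mapping_score(dist_a, dist_b, mapping):
    """Correlate distances in family A with those of the mapped partners in family B.

    mapping[i] is the index in family B paired with protein i of family A.
    """
    n = len(mapping)
    xs, ys = [], []
    for i in range(n):
        for j in range(i + 1, n):
            xs.append(dist_a[i][j])
            ys.append(dist_b[mapping[i]][mapping[j]])
    return pearson(xs, ys)


def best_restricted_mapping(dist_a, dist_b, class_a, class_b):
    """Exhaustively search mappings that only pair proteins of the same class."""
    n = len(dist_a)
    best, best_score = None, float("-inf")
    for perm in permutations(range(n)):
        # Skip mappings that link proteins from different classes (e.g. organisms).
        if any(class_a[i] != class_b[perm[i]] for i in range(n)):
            continue
        score = mapping_score(dist_a, dist_b, perm)
        if score > best_score:
            best, best_score = perm, score
    return best, best_score


if __name__ == "__main__":
    dist_a = [[0, 1, 4, 6], [1, 0, 5, 7], [4, 5, 0, 2], [6, 7, 2, 0]]
    dist_b = [[0, 2, 4, 5], [2, 0, 6, 7], [4, 6, 0, 1], [5, 7, 1, 0]]
    class_a = ["x", "x", "y", "y"]
    class_b = ["y", "y", "x", "x"]
    print(best_restricted_mapping(dist_a, dist_b, class_a, class_b))
```

With four proteins and two classes of two members each, the class constraint cuts the 24 possible mappings down to 4, which shows how such restrictions shrink the search space.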
Cancer genome projects are now being expanded in an attempt to provide complete landscapes of the mutations that exist in tumours. Although the importance of cataloguing genome variations is well recognized, there are obvious difficulties in bridging the gaps between high-throughput resequencing information and the molecular mechanisms of cancer evolution. Here, we describe the current status of the high-throughput genomic technologies, and the current limitations of the associated computational analysis and experimental validation of cancer genetic variants. We emphasize how the current cancer-evolution models will be influenced by the high-throughput approaches, in particular through efforts devoted to monitoring tumour progression, and how, in turn, the integration of data and models will be translated into mechanistic knowledge and clinical applications.
Rheumatoid arthritis (RA) is a chronic systemic autoimmune disorder of unknown etiology, which is characterized by inflammation in the synovium and joint damage. Although the pathogenesis of RA remains to be determined, a combination of environmental (e.g., viral infections) and genetic factors influences disease onset. Genetic factors in particular play a vital role in the onset of disease, as the heritability of RA is 50–60%, with the human leukocyte antigen (HLA) alleles accounting for at least 30% of the overall genetic risk. Some HLA-DR alleles encode a conserved sequence of amino acids, referred to as the shared epitope (SE) structure. By analyzing the structure of an HLA-DR molecule in complex with Epstein-Barr virus (EBV), the SE motif is suggested to play a vital role in the interaction of MHC II with the viral glycoprotein (gp) 42, an essential entry factor for EBV. EBV has been repeatedly linked to RA by several lines of evidence and, based on several findings, we suggest that EBV is able to induce the onset of RA in predisposed SE-positive individuals, by promoting entry of B-cells through direct contact between SE and gp42 in the entry complex.
An entire family of methodologies for predicting protein interactions is based on the observed fact that families of interacting proteins tend to have similar phylogenetic trees due to co-evolution. One application of this concept is the prediction of the mapping between the members of two interacting protein families (which protein within one family interacts with which protein within the other). The idea is that the real mapping would be the one maximizing the similarity between the trees. Since the exhaustive exploration of all possible mappings is not feasible for large families, current approaches use heuristic techniques which do not guarantee that the best solution is found. This is why it is important to check the results proposed by heuristic techniques and to manually explore other solutions. Here we present TSEMA, the server for efficient mapping assessment. This system calculates an initial mapping between two families of proteins based on a Monte Carlo approach and allows the user to interactively modify it based on performance figures and/or specific biological knowledge. All the explored mappings are graphically shown over a representation of the phylogenetic trees. The system is freely available at . Standalone versions of the software behind the interface are available upon request from the authors.
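A Monte Carlo search over mappings might look like the sketch below. The abstract does not describe TSEMA's actual procedure, so this is a generic Metropolis-style search under stated assumptions: trees are represented by pairwise distance matrices, similarity is the Pearson correlation between corresponding distances, and a move swaps the partners of two proteins. The names `monte_carlo_mapping`, `steps`, `temperature` and `seed` are hypothetical.

```python
import math
import random


def mapping_score(dist_a, dist_b, mapping):
    """Pearson correlation between A distances and the mapped B distances."""
    n = len(mapping)
    xs, ys = [], []
    for i in range(n):
        for j in range(i + 1, n):
            xs.append(dist_a[i][j])
            ys.append(dist_b[mapping[i]][mapping[j]])
    m = len(xs)
    mean_x, mean_y = sum(xs) / m, sum(ys) / m
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def monte_carlo_mapping(dist_a, dist_b, steps=2000, temperature=0.05, seed=0):
    """Metropolis-style search for a mapping with high tree similarity.

    Starts from the identity mapping and returns the best mapping visited.
    Like any heuristic, it is not guaranteed to find the global optimum.
    Requires at least two proteins per family.
    """
    rng = random.Random(seed)
    n = len(dist_a)
    current = list(range(n))
    current_score = mapping_score(dist_a, dist_b, current)
    best, best_score = current[:], current_score
    for _ in range(steps):
        # Propose a move: swap the partners assigned to two proteins.
        i, j = rng.sample(range(n), 2)
        current[i], current[j] = current[j], current[i]
        new_score = mapping_score(dist_a, dist_b, current)
        delta = new_score - current_score
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current_score = new_score
            if new_score > best_score:
                best, best_score = current[:], new_score
        else:
            # Reject the move by undoing the swap.
            current[i], current[j] = current[j], current[i]
    return best, best_score


if __name__ == "__main__":
    dist_a = [[0, 1, 4, 6], [1, 0, 5, 7], [4, 5, 0, 2], [6, 7, 2, 0]]
    dist_b = [[0, 2, 4, 5], [2, 0, 6, 7], [4, 6, 0, 1], [5, 7, 1, 0]]
    print(monte_carlo_mapping(dist_a, dist_b))
```

Because the search keeps the best mapping seen so far, its result can serve as the starting point that a user then inspects and adjusts by hand, in the spirit of the interactive workflow described above.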