Automated annotation of protein function is challenging. As the number of sequenced genomes rapidly grows, the overwhelming majority of protein products can only be annotated computationally. If computational predictions are to be relied upon, it is crucial that the accuracy of these methods be high. Here we report the results from the first large-scale community-based Critical Assessment of protein Function Annotation (CAFA) experiment. Fifty-four methods representing the state-of-the-art for protein function prediction were evaluated on a target set of 866 proteins from eleven organisms. Two findings stand out: (i) today’s best protein function prediction algorithms significantly outperformed widely-used first-generation methods, with large gains on all types of targets; and (ii) although the top methods perform well enough to guide experiments, there is significant need for improvement of currently available tools.
Genetic variation can modulate gene expression, and thereby phenotypic variation and susceptibility to complex diseases such as type 2 diabetes (T2D). Here we harnessed the potential of DNA and RNA sequencing in human pancreatic islets from 89 deceased donors to identify genes of potential importance in the pathogenesis of T2D. We present a catalog of genetic variants regulating gene expression (eQTL) and exon use (sQTL), including many long noncoding RNAs, which are enriched in known T2D-associated loci. Of 35 eQTL genes, whose expression differed between normoglycemic and hyperglycemic individuals, siRNA of tetraspanin 33 (TSPAN33), 5′-nucleotidase, ecto (NT5E), transmembrane emp24 protein transport domain containing 6 (TMED6), and p21 protein activated kinase 7 (PAK7) in INS1 cells resulted in reduced glucose-stimulated insulin secretion. In addition, we provide a genome-wide catalog of allelic expression imbalance, which is also enriched in known T2D-associated loci. Notably, allelic imbalance in paternally expressed gene 3 (PEG3) was associated with its promoter methylation and T2D status. Finally, RNA editing events were less common in islets than previously suggested in other tissues. Taken together, this study provides new insights into the complexity of gene regulation in human pancreatic islets and better understanding of how genetic variation can influence glucose metabolism.
Type 2 diabetes (T2D) is an increasing global health problem (1). Although genome-wide association studies (GWAS) have yielded more than 70 loci associated with T2D or related traits (2, 3), they have not provided the expected breakthrough in our understanding of the pathogenesis of the disease. They have nonetheless pointed at a central role of the pancreatic islets and β-cell dysfunction in the development of the disease (4, 5). It therefore seems pertinent to focus on human pancreatic islets to obtain insights into the molecular mechanisms causing the disease (6, 7).
Given that most SNPs associated with T2D lie in noncoding regions, the majority of causal variants are likely to regulate gene expression rather than protein function per se. Therefore, combination of DNA and RNA sequencing in the same individuals may help to disentangle the role these SNPs play in the pathogenesis of the disease (8). Although the human pancreatic islet transcriptome has been previously described (6, 9-18), using microarrays or RNA sequencing of a limited number of nondiabetic individuals, this has not allowed a more global analysis of the complexity of the islet transcriptome in T2D. Here we combined genotypic imputation, expression microarrays, and exome and RNA sequencing (ExomeSeq and RNA-Seq) in a large number of human pancreatic islets from deceased donors with and without T2D. This study identified a number of novel genes, including long intergenic noncoding RNAs (lincRNAs), whose expression and/or splicing influences insulin secretion and is associated with glycemia. In addition, we provide a catalog of RNA editing and allele-specific expr...
Alignment is the first step in most RNA-seq analysis pipelines, and the accuracy of downstream analyses depends heavily on it. Unlike most steps in the pipeline, alignment is particularly amenable to benchmarking with simulated data. We performed a comprehensive benchmarking of 14 common splice-aware aligners for base, read, and exon junction-level accuracy and compared default with optimized parameters. We found that performance varied by genome complexity, and accuracy and popularity were poorly correlated. The most widely cited tool underperforms for most metrics, particularly when using default settings.
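As a rough illustration of what read-level and junction-level accuracy can mean on simulated data, the following Python sketch compares aligner output against the known simulated origin of each read. The function names, the `(chrom, start)` and `(chrom, donor, acceptor)` tuples, and the exact-match criterion are assumptions made for illustration; they are not the metric definitions used in the benchmark.

```python
def read_level_accuracy(truth, aligned):
    """Fraction of simulated reads placed at their true (chrom, start).

    truth, aligned: dicts mapping read ID -> (chrom, start).
    Reads missing from `aligned` (unaligned reads) count as incorrect.
    """
    if not truth:
        raise ValueError("truth must contain at least one read")
    correct = sum(
        1 for read_id, locus in truth.items() if aligned.get(read_id) == locus
    )
    return correct / len(truth)


def junction_precision_recall(true_junctions, called_junctions):
    """Precision and recall over exon junctions.

    Each junction is a hypothetical (chrom, donor, acceptor) tuple and
    must match the simulated junction exactly to count as a hit.
    """
    true_set = set(true_junctions)
    called_set = set(called_junctions)
    hits = len(true_set & called_set)
    precision = hits / len(called_set) if called_set else 0.0
    recall = hits / len(true_set) if true_set else 0.0
    return precision, recall
```

Counting unaligned reads as incorrect is one possible design choice; a benchmark could equally report them as a separate category.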
Use of a novel triple-tracer approach to assess postprandial glucose metabolism. Am J Physiol Endocrinol Metab 284: E55-E69, 2003; 10.1152/ajpendo.00190.2001. Numerous studies have used the dual-tracer method to assess postprandial glucose metabolism. The present experiments were undertaken to determine whether the marked tracer nonsteady state that occurs with the dual-tracer approach after food ingestion introduces error when it is used to simultaneously measure both meal glucose appearance (Ra meal) and endogenous glucose production (EGP). To do so, a novel triple-tracer approach was designed: 12 subjects ingested a mixed meal containing [1-¹³C]glucose while [6-³H]glucose and [6,6-²H₂]glucose were infused intravenously in patterns that minimized the change in the plasma ratios of [6-³H]glucose to [1-¹³C]glucose and of [6,6-²H₂]glucose to endogenous glucose, respectively. Ra meal and EGP measured with this approach were essentially model independent, since non-steady-state error was minimized by the protocol. Initial splanchnic glucose extraction (ISE) was 12.9% ± 3.4%, and suppression of EGP (EGPS) was 40.3% ± 4.1%. In contrast, when calculated with the dual-tracer one-compartment model, ISE was higher (P < 0.05) and EGPS was lower (P < 0.005) than observed with the triple-tracer approach. These errors could only be prevented by using time-varying volumes different for Ra meal and EGP. Analysis of the dual-tracer data with a two-compartment model reduced but did not totally avoid the problems associated with marked postprandial changes in the tracer-to-tracee ratios. We conclude that results from previous studies that have used the dual-tracer one-compartment model to measure postprandial carbohydrate metabolism need to be reevaluated and that the triple-tracer technique may provide a useful approach for doing so.
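To see why holding the tracer-to-tracee ratio nearly constant makes the estimate essentially model independent, consider a minimal Python sketch of the commonly cited one-compartment (Steele-type) non-steady-state formula, Ra = (F - pV · C · d(TTR)/dt) / TTR. The function name, argument names, default values for `pool_fraction` and `volume`, and the interval-averaging scheme are assumptions for illustration, not the calculation used in the study.

```python
def one_compartment_ra(infusion_rates, conc, ttr, times,
                       pool_fraction=0.65, volume=200.0):
    """Rate of appearance over each sampling interval (one-compartment sketch).

    infusion_rates: tracer infusion rate for each interval (len(times) - 1)
    conc:           tracee (glucose) concentration at each sample time
    ttr:            plasma tracer-to-tracee ratio at each sample time
    times:          sample times, strictly increasing
    pool_fraction, volume: assumed effective distribution volume p * V
    """
    rates = []
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        mean_ttr = (ttr[i] + ttr[i - 1]) / 2
        mean_conc = (conc[i] + conc[i - 1]) / 2
        dttr_dt = (ttr[i] - ttr[i - 1]) / dt
        # The volume-dependent correction term vanishes when the ratio is flat.
        correction = pool_fraction * volume * mean_conc * dttr_dt
        rates.append((infusion_rates[i - 1] - correction) / mean_ttr)
    return rates
```

When `ttr` is constant across samples, the correction term is zero and the result reduces to the infusion rate divided by the ratio, whatever values are assumed for `pool_fraction` and `volume`. When the ratio changes quickly, the estimate depends directly on those assumed values.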
Keywords: glucose kinetics; initial splanchnic glucose uptake; nonsteady state
After an overnight fast, the amount of glucose entering the systemic circulation [i.e., endogenous glucose production (EGP)] approximates the amount of glucose leaving the circulation [i.e., glucose disappearance (Rd)]. Under these circumstances, EGP is primarily derived from the liver, with a small contribution coming from the kidney (6, 9, 10, 27). The situation becomes more complex after food ingestion when glucose entering the systemic circulation can originate from both the gut and EGP (9). An alteration in either of these processes can substantially influence glucose tolerance.
In a pioneering series of experiments, Steele et al. (25) introduced a dual-isotope method that enabled simultaneous in vivo measurement of both the systemic rate of appearance of the ingested glucose (Ra meal) and postprandial EGP. This approach utilizes two glucose tracers: one ingested and one infused intravenously. The intravenously infused tracer measures the rates of appearance (Ra) of the ingested tracer and of total glucose (i.e., labeled and unlabeled). Appearance of the ingested glucose is calculated by multiplying the Ra of th...
Background: Predicting protein function has become increasingly demanding in the era of next generation sequencing technology. Assigning a curator-reviewed function to every single sequence is impracticable. Bioinformatics tools that are easy to use and able to provide automatic and reliable annotations at a genomic scale are urgently needed. In this scenario, the Gene Ontology has provided the means to standardize the annotation classification with a structured vocabulary which can be easily exploited by computational methods. Results: Argot2 is a web-based function prediction tool able to annotate nucleic or protein sequences from small datasets up to entire genomes. It accepts as input a list of sequences in FASTA format, which are processed using BLAST and HMMER searches against the UniProtKB and Pfam databases, respectively; these sequences are then annotated with GO terms retrieved from the UniProtKB-GOA database and the terms are weighted using the e-values from BLAST and HMMER. The weighted GO terms are processed according to both their semantic similarity relations described by the Gene Ontology and their associated score. The algorithm is based on the original idea developed in a previous tool called Argot. The entire engine has been completely rewritten to improve both accuracy and computational efficiency, thus allowing for the annotation of complete genomes. Conclusions: The revised algorithm has already been employed and successfully tested during in-house genome projects of grape and apple, and has proven to have a high precision and recall in all our benchmark conditions. It has also been successfully compared with Blast2GO, one of the methods most commonly employed for sequence annotation. The server is freely accessible at http://www.medcomp.medicina.unipd.it/Argot2.
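The e-value weighting step described above can be sketched in Python as follows. This is a minimal illustration that assumes each hit contributes a weight of -log10(e-value) to every GO term it carries; the function name, the clamping constants, and the input layout are hypothetical, and the semantic-similarity processing that the tool applies afterwards is not covered.

```python
import math
from collections import defaultdict


def weight_go_terms(hits, min_evalue=1e-300):
    """Accumulate a per-GO-term weight from search hits.

    hits: iterable of (go_terms, evalue) pairs, one per BLAST or HMMER hit,
          where go_terms lists the GO IDs annotated on the hit.
    Assumed weighting: -log10(e-value), floored at 0 so that hits with
    e-value >= 1 add nothing, and clamped to avoid log10(0).
    """
    scores = defaultdict(float)
    for go_terms, evalue in hits:
        weight = max(0.0, -math.log10(max(evalue, min_evalue)))
        for term in go_terms:
            scores[term] += weight
    return dict(scores)
```

With this weighting, a GO term supported by several strong hits accumulates a higher score than one supported by a single weak hit, which is the property the later scoring step relies on.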