Single-cell genomics analysis requires normalization of feature counts that stabilizes variance while accounting for variable cell sequencing depth. We discuss some of the trade-offs among currently widely used methods, and analyze their performance on 526 single-cell RNA-seq datasets. The results lead us to recommend proportional fitting prior to log transformation, followed by an additional proportional fitting.
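The recommended sequence (proportional fitting, log transformation, then proportional fitting again) can be illustrated with a minimal NumPy sketch. This is not the authors' code: the function names `proportional_fit` and `pf_log1p_pf` are hypothetical, and the choice of the mean cell depth as the scaling target and of `log1p` as the log transformation are assumptions, since the abstract does not specify them.

```python
import numpy as np


def proportional_fit(counts, target=None):
    """Scale each cell (row) so that its total equals `target`.

    Assumption: when `target` is None, the mean row total is used.
    """
    counts = np.asarray(counts, dtype=float)
    depth = counts.sum(axis=1, keepdims=True)
    if target is None:
        target = depth.mean()
    # Leave all-zero cells untouched instead of dividing by zero.
    safe_depth = np.where(depth == 0, 1.0, depth)
    return counts * (target / safe_depth)


def pf_log1p_pf(counts):
    """Proportional fitting, then log(1 + x), then proportional fitting."""
    return proportional_fit(np.log1p(proportional_fit(counts)))


# Toy cells-by-genes count matrix with very different sequencing depths.
counts = np.array([[10, 0, 5],
                   [100, 20, 30],
                   [1, 1, 2]])
normalized = pf_log1p_pf(counts)
```

In this sketch, every non-empty cell has the same total after each proportional fitting step, so the second step removes the depth differences that the log transformation reintroduces.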
We present a command-line tool, called ffq, for querying metadata from genomic databases.
Motivation: Several genomic databases host data and metadata for an ever-growing collection of sequence datasets. While these databases have a shared hierarchical structure, there are no tools specifically designed to leverage it for metadata extraction. Results: We present a command-line tool, called ffq, for querying user-generated data and metadata from sequence databases. Given an accession or a paper’s DOI, ffq efficiently fetches metadata and links to raw data in JSON format. ffq’s modularity and simplicity make it extensible to any genomic database exposing its data for programmatic access. Availability and implementation: ffq is free and open source, and the code can be found here: https://github.com/pachterlab/ffq. Supplementary information: Supplementary data are available at Bioinformatics online.
Translation of mRNAs containing premature termination codons (PTCs) results in truncated protein products with deleterious effects. Nonsense-mediated decay (NMD) is a surveillance pathway responsible for detecting PTC-containing transcripts. Although the molecular mechanisms governing mRNA degradation have been extensively studied, the fate of the nascent protein product remains largely uncharacterized. Here, we use a fluorescent reporter system in mammalian cells to reveal a selective degradation pathway specifically targeting the protein product of an NMD mRNA. We show that this process is post-translational and dependent on the ubiquitin-proteasome system. To systematically uncover factors involved in NMD-linked protein quality control, we conducted genome-wide flow cytometry-based screens. Our screens recovered known NMD factors but suggested that protein degradation did not depend on the canonical ribosome quality control (RQC) pathway. A subsequent arrayed screen demonstrated that the protein and mRNA branches of NMD rely on a shared recognition event. Our results establish the existence of a targeted pathway for nascent protein degradation from PTC-containing mRNAs, and provide a reference for the field to identify and characterize required factors.
Many single-cell RNA-sequencing (scRNA-seq) data analysis workflows rely on methods that embed and visualize the properties of a k-nearest neighbor (kNN) graph in two dimensions. These visualizations are typically combined with categorical labels assigned to individual data points and can support a range of analysis tasks, despite the fact that these embeddings are known to distort the local and global properties of the graph. Rather than relying on a two-dimensional visualization, we introduce a method for quantitatively assessing the concordance between a set of labels and the k-nearest neighbor graph. Our method, called CONCORDEX, computes for each node the fraction of neighbors with the same or different label, and compares the result to the mean obtained with random labeling of the graph. CONCORDEX can be used for any categorical label and can be interpreted via an intuitive heatmap visualization. We demonstrate its utility for assessment of clustering results and "integration". Since CONCORDEX can be used to directly visualize properties of a kNN graph, we also use CONCORDEX to evaluate how well two-dimensional embeddings capture the local and global structure of the underlying graph. We have made CONCORDEX available as a Python-based command-line tool (https://github.com/pachterlab/concordex) and as a software package in Bioconductor (https://bioconductor.org/packages/concordexR).
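The idea described above (per-node fractions of neighbors by label, compared against random labelings) can be sketched in a few lines of NumPy. This is an illustrative sketch only, not the CONCORDEX package or its API: the function names, the brute-force Euclidean kNN search, the permutation count, and the toy data are all assumptions made for the example.

```python
import numpy as np


def knn_indices(X, k):
    """Brute-force Euclidean k-nearest neighbors (excluding each point itself)."""
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    return np.argsort(dist, axis=1)[:, :k]


def neighborhood_matrix(neighbors, labels):
    """Rows: label of a node; columns: mean fraction of its neighbors per label."""
    classes, codes = np.unique(np.asarray(labels), return_inverse=True)
    onehot = np.eye(len(classes))[codes]       # (n_nodes, n_classes)
    frac = onehot[neighbors].mean(axis=1)      # per-node neighbor fractions
    mat = np.vstack([frac[codes == c].mean(axis=0) for c in range(len(classes))])
    return classes, mat


def same_label_fraction(neighbors, labels):
    """Mean over nodes of the fraction of neighbors sharing the node's label."""
    labels = np.asarray(labels)
    return (labels[neighbors] == labels[:, None]).mean()


def compare_to_random(neighbors, labels, n_perm=100, seed=0):
    """Observed same-label fraction and its mean under shuffled labels."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    observed = same_label_fraction(neighbors, labels)
    shuffled = [same_label_fraction(neighbors, rng.permutation(labels))
                for _ in range(n_perm)]
    return observed, float(np.mean(shuffled))


# Toy data: two well-separated groups of 20 points each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.1, (20, 2)), rng.normal(5, 0.1, (20, 2))])
labels = np.array([0] * 20 + [1] * 20)

neighbors = knn_indices(X, k=5)
classes, mat = neighborhood_matrix(neighbors, labels)
observed, random_mean = compare_to_random(neighbors, labels)
```

The matrix `mat` is the kind of label-by-label summary that could be drawn as a heatmap: a strongly diagonal matrix suggests the labels agree with the kNN graph, while an observed same-label fraction close to the shuffled-label mean suggests they do not.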