conan.wang@griffith.edu.au; a.hofmann@griffith.edu.au.
Stemformatics is an established gene expression data portal containing over 420 public gene expression datasets derived from microarray, RNA sequencing and single cell profiling technologies. Developed for the stem cell community, it has a major focus on pluripotency, tissue stem cells, and staged differentiation. Stemformatics includes curated ‘collections’ of data relevant to cell reprogramming, as well as hematopoiesis and leukaemia. Rather than simply rehosting datasets as they appear in public repositories, Stemformatics uses a stringent set of quality control metrics and its own pipelines to process handpicked datasets from raw files. About 30% of datasets processed by Stemformatics fail the quality control metrics and never make it to the portal, ensuring that Stemformatics data are of high quality and have been processed in a consistent manner. Stemformatics provides easy-to-use and intuitive tools for biologists to visually explore the data, including interactive gene expression profiles, principal component analysis plots and hierarchical clusters, among others. The addition of tools that facilitate cross-dataset comparisons provides users with snapshots of gene expression in multiple cells and tissues, assisting the identification of cell-type restricted genes, or potential housekeeping genes. Stemformatics is freely available at stemformatics.org.
The Stemformatics myeloid atlas is an integrated transcriptome atlas of human macrophages and dendritic cells that systematically compares freshly isolated tissue-resident, cultured, and pluripotent stem cell-derived myeloid cells. Three classes of tissue-resident macrophage were identified: Kupffer cells and microglia; monocyte-associated; and tumor-associated macrophages. Culture had a major impact on all primary cell phenotypes. Pluripotent stem cell-derived macrophages were characterized by atypical expression of collagen and a highly efferocytotic phenotype. Myeloid subsets, and phenotypes associated with derivation, were reproducible across experimental series including data projected from single-cell studies, demonstrating that the atlas provides a robust reference for myeloid phenotypes. Implementation in Stemformatics.org allows users to visualize patterns of sample grouping or gene expression for user-selected conditions and supports temporary upload of your own microarray or RNA sequencing samples, including single-cell data, to benchmark against the atlas.
Gene expression atlases have transformed our understanding of the development, composition and function of human tissues. New technologies promise improved cellular or molecular resolution, and have led to the identification of new cell types, or better defined cell states. But as new technologies emerge, information derived on old platforms becomes obsolete. We demonstrate that it is possible to combine a large number of different profiling experiments summarised from dozens of laboratories and representing hundreds of donors, to create an integrated molecular map of human tissue. As an example, we combine 850 samples from 38 platforms to build an integrated atlas of human blood cells. We achieve robust and unbiased cell type clustering using a variance partitioning method, selecting genes with low platform bias relative to biological variation. Other than an initial rescaling, no other transformation to the primary data is applied through batch correction or renormalisation. Additional data, including single-cell datasets, can be projected for comparison, classification and annotation. The resulting atlas provides a multi-scaled approach to visualise and analyse the relationships between sets of genes and blood cell lineages, including the maturation and activation of leukocytes in vivo and in vitro. In allowing for data integration across hundreds of studies, we address a key reproducibility challenge which is faced by any new technology. This allows us to draw on the deep phenotypes and functional annotations that accompany traditional profiling methods, and provide important context to the high cellular resolution of single cell profiling. Here, we have implemented the blood atlas in the open access Stemformatics.org platform, drawing on its extensive collection of curated transcriptome data. The method is simple, scalable and amenable to rapid deployment in other biological systems or computational workflows.
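The gene selection step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it swaps in a simple one-way decomposition (eta-squared), scoring each gene by the fraction of its variance that lies between platforms and keeping genes below a cutoff. The function names, the toy data and the 0.2 threshold are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from statistics import mean


def platform_variance_fraction(values, platforms):
    """Fraction of one gene's total variance that lies between platforms.

    This is a one-way eta-squared, used here as a stand-in for a
    variance partitioning model (an assumption, not the published method).
    """
    grand_mean = mean(values)
    total_ss = sum((v - grand_mean) ** 2 for v in values)
    if total_ss == 0:
        return 0.0  # a constant gene carries no platform signal

    # Group the gene's values by the platform each sample came from.
    by_platform = defaultdict(list)
    for value, platform in zip(values, platforms):
        by_platform[platform].append(value)

    between_ss = sum(
        len(group) * (mean(group) - grand_mean) ** 2
        for group in by_platform.values()
    )
    return between_ss / total_ss


def select_low_platform_bias_genes(expression, platforms, threshold=0.2):
    """Keep genes whose platform variance fraction is at or below threshold.

    The default threshold is a hypothetical value for illustration.
    """
    return [
        gene
        for gene, values in expression.items()
        if platform_variance_fraction(values, platforms) <= threshold
    ]


# Toy data: four samples, two from each of two hypothetical platforms.
platforms = ["A", "A", "B", "B"]
expression = {
    "gene_bio": [1.0, 5.0, 1.0, 5.0],       # varies within each platform
    "gene_platform": [1.0, 1.0, 5.0, 5.0],  # varies only between platforms
}
selected = select_low_platform_bias_genes(expression, platforms)
print(selected)
```

In this toy case the gene whose variation tracks the platform is discarded and the gene that varies within each platform is kept. In practice the values would first be rescaled onto a common scale, and the retained genes would feed a dimensionality reduction such as principal component analysis.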
Graphical abstract: Recursive approach to generating a multi-scaled atlas. Top panel: The method integrates data from all cell types in the Stemformatics database, and shows clear division of samples into global categories of stromal, pluripotent or blood (inset) cell types. Bottom panel: Integration of only the blood cell subsets provides a blood atlas (stemformatics.org/atlas), onto which external samples (green) are projected. Samples are coloured by curated annotations derived from the original studies, and can be viewed at Stemformatics.org.
RNA profiling has been a mainstay descriptor of cellular systems for over two decades, but methods for measuring transcript abundance have changed dramatically over this period. The field was revolutionise...