Kevin R. Moon scite author profile

Single-cell RNA-sequencing technologies suffer from many sources of technical noise, including under-sampling of mRNA molecules, often termed ‘dropout’, which can severely obscure important gene-gene relationships. To address this, we developed MAGIC (Markov Affinity-based Graph Imputation of Cells), a method that shares information across similar cells, via data diffusion, to denoise the cell count matrix and fill in missing transcripts. We validate MAGIC on several biological systems and find it effective at recovering gene-gene relationships and additional structures. MAGIC reveals a phenotypic continuum, with the majority of cells residing in intermediate states that display stem-like signatures, and uncovers known and previously uncharacterized regulatory interactions, demonstrating that our approach can successfully uncover regulatory relations without perturbations.
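The data-diffusion idea in this abstract can be illustrated with a minimal sketch. This is not the MAGIC implementation: it assumes a fixed-bandwidth Gaussian kernel on raw cell-to-cell distances, and the function name diffusion_impute and the parameters sigma and t are hypothetical choices made for illustration only.

```python
import numpy as np

def diffusion_impute(X, sigma=1.0, t=3):
    """Smooth a cells-by-genes matrix X by sharing values across similar cells.

    Simplified illustration: fixed Gaussian bandwidth `sigma` and `t`
    diffusion steps (both are assumptions, not values from the paper).
    """
    X = np.asarray(X, dtype=float)
    # Pairwise squared Euclidean distances between cells (rows of X).
    sq_norms = np.sum(X ** 2, axis=1)
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    d2 = np.maximum(d2, 0.0)  # guard against tiny negative round-off
    # Gaussian affinities: similar cells get weights close to 1.
    A = np.exp(-d2 / (2.0 * sigma ** 2))
    # Row-normalize so each row is a probability distribution (Markov matrix).
    M = A / A.sum(axis=1, keepdims=True)
    # Raising M to the power t diffuses information over t steps on the graph.
    M_t = np.linalg.matrix_power(M, t)
    # Each imputed cell is a weighted average of the original cells.
    return M_t @ X

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    counts = rng.poisson(2.0, size=(20, 4)).astype(float)
    counts[rng.random(counts.shape) < 0.3] = 0.0  # simulate dropout
    print(diffusion_impute(counts, sigma=2.0, t=3).round(2))
```

Because every row of the powered Markov matrix is non-negative and sums to one, each imputed value is a weighted average of the observed values for that gene, so zeros caused by dropout are pulled toward the values seen in similar cells.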

Visualizing structure and transitions in high-dimensional biological data

Moon

et al. 2019

Recovering Gene Interactions from Single-Cell Data Using Data Diffusion

et al. 2018

Exploring single-cell data with deep multitasking neural networks

et al. 2019

Biomedical researchers are generating high-throughput, high-dimensional single-cell data at a staggering rate. As costs of data generation decrease, experimental design is moving towards measurement of many different single-cell samples in the same dataset. These samples can correspond to different patients, conditions, or treatments. While scalability of methods to datasets of these sizes is a challenge on its own, dealing with large-scale experimental design presents a whole new set of problems, including batch effects and sample …

Manifold learning-based methods for analyzing single-cell RNA-sequencing data

Moon

Stanley

Burkhardt

et al. 2018

Current Opinion in Systems Biology

116

122

Structural and developmental principles of neuropil assembly in C. elegans

Moyle

Barnes

Kuchroo

et al. 2021

Nature

Neuropil is a fundamental form of tissue organization within brains [1]. In neuropils, densely packed neurons synaptically interconnect into precise circuit architecture [2, 3], yet the structural and developmental principles governing this nanoscale precision remain largely unknown [4, 5]. Here, we use diffusion condensation, an iterative data coarse-graining algorithm [6], to identify nested circuit structures within the C. elegans neuropil (called the nerve ring). We show that the nerve ring neuropil is largely organized into four strata composed of related behavioral circuits. The stratified architecture of the neuropil is a geometrical representation of the functional segregation of sensory information and motor outputs, with specific sensory organs and muscle quadrants mapping onto particular neuropil strata. We identify groups of neurons with unique morphologies that integrate information across strata and that create neural structures that cage the strata within the nerve ring. We use high-resolution light-sheet microscopy [7, 8], coupled with lineage-tracing and cell-tracking algorithms [9, 10], to resolve the developmental sequence and reveal principles of cell position, migration and outgrowth that guide stratified neuropil organization. Our results uncover conserved structural design principles underlying nerve ring neuropil architecture and function, and a pioneer-neuron-based, temporal progression of outgrowth that guides the hierarchical development of the layered neuropil. Our findings provide a systematic blueprint for using structural and developmental approaches to understand neuropil organization within brains.

Exploring Single-Cell Data with Deep Multitasking Neural Networks

Amodio

Dijk

Srinivasan

et al. 2017

Preprint

Handling the vast amounts of single-cell RNA-sequencing and CyTOF data, which are now being generated in patient cohorts, presents a computational challenge due to the noise, complexity, sparsity and batch effects present. Here, we propose a unified deep neural network-based approach to automatically process and extract structure from these massive datasets. Our unsupervised architecture, called SAUCIE (Sparse Autoencoder for Unsupervised Clustering, Imputation, and Embedding), simultaneously performs several key tasks for single-cell data analysis including 1) clustering, 2) batch correction, 3) visualization, and 4) denoising/imputation. SAUCIE is trained to recreate its own input after reducing its dimensionality in a 2-D embedding layer which can be used to visualize the data. Additionally, it uses two novel regularizations: (1) an information dimension regularization to penalize entropy as computed on normalized activation values of the layer, and thereby encourage binary-like encodings that are amenable to clustering and (2) a Maximal Mean Discrepancy penalty to correct batch effects. Thus SAUCIE has a single architecture that denoises, batch-corrects, visualizes and clusters data using a unified …
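The batch-correction penalty named in this abstract is a mean discrepancy between two samples. As a rough illustration of what such a penalty measures, the sketch below computes a generic squared maximum mean discrepancy with a Gaussian kernel. It is not SAUCIE's code: the function names, the single fixed bandwidth, and the use of raw points instead of network activations are all assumptions made for this example.

```python
import numpy as np

def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq = (np.sum(a ** 2, axis=1)[:, None]
          + np.sum(b ** 2, axis=1)[None, :]
          - 2.0 * (a @ b.T))
    sq = np.maximum(sq, 0.0)  # guard against tiny negative round-off
    return np.exp(-sq / (2.0 * bandwidth ** 2))

def mmd2(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared mean discrepancy.

    The value is near zero when the two samples look alike under the
    kernel and grows as the samples (for example, two batches) drift apart.
    """
    k_xx = gaussian_kernel(x, x, bandwidth).mean()
    k_yy = gaussian_kernel(y, y, bandwidth).mean()
    k_xy = gaussian_kernel(x, y, bandwidth).mean()
    return k_xx + k_yy - 2.0 * k_xy

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    batch_a = rng.normal(size=(200, 2))
    batch_b = batch_a + 3.0  # hypothetical batch shifted by a constant offset
    print("same batch:   ", mmd2(batch_a, batch_a))
    print("shifted batch:", mmd2(batch_a, batch_b))
```

Used as a training penalty, a term like this would be added to the reconstruction loss so that the representations of different batches are pushed toward the same distribution.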

Ensemble estimation of multivariate f-divergence

Moon

Hero

2014

f-divergence estimation is an important problem in the fields of information theory, machine learning, and statistics. While several divergence estimators exist, relatively few of their convergence rates are known. We derive the MSE convergence rate for a density plug-in estimator of f-divergence. Then by applying the theory of optimally weighted ensemble estimation, we derive a divergence estimator with a convergence rate of O(1/T) that is simple to implement and performs well in high dimensions. We validate our theoretical results with experiments.

I. INTRODUCTION

f-divergence is a measure of the difference between distributions and is important to the fields of information theory, machine learning, and statistics [1]. Many different kinds of f-divergences have been defined including the Kullback-Leibler (KL) [2] and … . A special case of the KL divergence is mutual information which gives the capacities in data compression and channel coding [4]. Mutual information estimation has also been used in applications such as feature selection [5], fMRI data processing [6], and clustering [7]. Entropy is also a special case of divergence where one of the distributions is the uniform distribution. Entropy estimation is useful for intrinsic dimension estimation [8], texture classification and image registration [9], and many other applications. Additionally, divergence estimation is useful for empirically estimating the decay rates of error probabilities of hypothesis testing [4] and extending machine learning algorithms to distributional features [10], [11]. For other applications of divergence estimation, see [12].

We consider the problem of estimating the f-divergence when only two finite populations of independent and identically distributed (i.i.d.) samples are available from some unknown, nonparametric, smooth, d-dimensional distributions. While several estimators of divergence have been previously defined, the convergence rates are known for only a few of them.
Our first contribution is to derive convergence rates for kernel density plug-in f-divergence estimators with an adaptive k-nearest neighbor (k-nn) kernel. Our second contribution is to extend the theory of optimally weighted ensemble entropy estimation developed in [13] to obtain a divergence estimator with a convergence rate of O …
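To make the plug-in idea concrete, here is a minimal sketch of a k-nearest-neighbor plug-in estimate for the KL divergence, one special case of f-divergence. It is a standard textbook-style k-nn estimator, not the optimally weighted ensemble estimator described above; the function names and the default k=5 are hypothetical, and the brute-force distance computation is only suitable for small samples.

```python
import math
import numpy as np

def kth_nn_distance(queries, points, k, exclude_self=False):
    """Distance from each query row to its k-th nearest row in `points`.

    Set exclude_self=True when `queries` and `points` are the same array,
    so that the zero distance of a point to itself is skipped.
    """
    diff = queries[:, None, :] - points[None, :, :]
    dist = np.sqrt(np.sum(diff ** 2, axis=2))
    dist.sort(axis=1)
    # With self included, column 0 is the zero self-distance.
    return dist[:, k] if exclude_self else dist[:, k - 1]

def knn_kl_divergence(x, y, k=5):
    """k-nn plug-in estimate of KL(P || Q) from samples x ~ P and y ~ Q.

    Plugs k-nn density estimates into the average of log(p/q) over x.
    The unit-ball volume and k cancel in the ratio, leaving only the
    neighbor distances and the sample sizes.
    """
    n, d = x.shape
    m = y.shape[0]
    rho = kth_nn_distance(x, x, k, exclude_self=True)  # distances within x
    nu = kth_nn_distance(x, y, k)                      # distances from x to y
    return d * np.mean(np.log(nu / rho)) + math.log(m / (n - 1))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(0.0, 1.0, size=(1000, 1))
    y = rng.normal(1.0, 1.0, size=(1000, 1))
    # For unit-variance Gaussians whose means differ by 1, the true KL is 0.5.
    print(knn_kl_divergence(x, y, k=5))
```

A single choice of k gives one plug-in estimate; the ensemble approach in the abstract instead combines several such estimates with weights chosen to cancel bias terms, which this sketch does not attempt.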
