Spike sorting is a critical first step in extracting neural signals from large-scale multi-electrode array (MEA) data. This manuscript presents several new techniques that make MEA spike sorting more robust and accurate. Our pipeline is based on an efficient multi-stage "triage-then-cluster-then-pursuit" approach that initially extracts only clean, high-quality waveforms from the electrophysiological time series by temporarily skipping noisy or "collided" events (representing two neurons firing synchronously). This is accomplished by developing a neural network detection and denoising method followed by efficient outlier triaging. The denoised spike waveforms are then used to infer the set of spike templates through nonparametric Bayesian clustering. We use a divide-and-conquer strategy to parallelize this clustering step. Finally, we recover collided waveforms with matching-pursuit deconvolution techniques, and perform further split-and-merge steps to estimate additional templates from the pool of recovered waveforms. We apply the new pipeline to data recorded in the primate retina, where high firing rates and highly overlapping axonal units provide a challenging testbed for the deconvolution approach; in addition, the well-defined mosaic structure of receptive fields in this preparation provides a useful quality check on any spike sorting pipeline. We show that our pipeline improves on the state of the art in spike sorting (and outperforms manual sorting) on both real and semi-simulated MEA data with >500 electrodes; open-source code can be found at https://github.com/paninski-lab/yass.

* Equal contribution authors
‡ DARPA Neural Engineering System Design program BAA-16-09

datastream as efficiently as possible. Finally, scalability must be a key consideration.
To feasibly process the oncoming data deluge, we use parallel, scalable algorithms based on efficient data summarizations wherever possible and focus computational power on the "hard cases," using cheap, fast methods to handle easy cases.

To evaluate the resulting pipeline, we focus here on MEA data collected from the primate retina. This preparation is a useful spike sorting testbed for several important reasons. First, the two-dimensional MEA used here matches the approximately two-dimensional substrate of the retinal ganglion layer. Second, receptive fields of well-characterized retinal ganglion cell (RGC) types (e.g., ON parasols, OFF midgets, etc.) are known to approximately tile the visual field, providing useful side information for scoring different spike sorting pipelines. Third, many RGCs have moderately high firing rates and often have significant axonal projections that overlap with each other spatially on the MEA, making it challenging to demix spikes that overlap spatially and temporally from different RGCs.

We will first outline the methodology that forms the core of our pipeline in Section 2.1, then provide details of each module in the following subsections, and finally demonstrate the improvements in performance on 512-electrode primat...
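As a rough illustration of the final "pursuit" stage mentioned in the abstract, the sketch below shows a generic greedy matching-pursuit deconvolution on a toy single-channel trace. It is not the pipeline's implementation: the function name `matching_pursuit`, the parameters `threshold` and `max_spikes`, the single-channel setting, and the assumption that templates are already known are all hypothetical simplifications chosen for illustration. At each step the code picks the template and time shift that most reduce the squared residual error, subtracts that template, and repeats, which is how two temporally overlapping ("collided") waveforms can be separated.

```python
import numpy as np

def matching_pursuit(trace, templates, threshold=0.0, max_spikes=100):
    """Greedy deconvolution of a 1-D trace with known templates (toy sketch).

    trace:     (T,) array, the recording
    templates: (K, L) array, one waveform per hypothetical unit
    Returns a sorted list of (time, unit) pairs and the final residual.
    """
    residual = np.asarray(trace, dtype=float).copy()
    templates = np.asarray(templates, dtype=float)
    n_units, length = templates.shape
    norms = np.sum(templates ** 2, axis=1)  # squared norm of each template
    spikes = []
    for _ in range(max_spikes):
        # corr[k, t] = <residual[t:t+L], templates[k]>
        corr = np.stack(
            [np.correlate(residual, w, mode="valid") for w in templates]
        )
        # Reduction in squared error if template k is subtracted at time t:
        # 2 * <residual, w_k> - ||w_k||^2
        gain = 2.0 * corr - norms[:, None]
        k, t = np.unravel_index(np.argmax(gain), gain.shape)
        if gain[k, t] <= threshold:
            break  # no remaining placement improves the fit enough
        residual[t:t + length] -= templates[k]
        spikes.append((int(t), int(k)))
    return sorted(spikes), residual
```

Calling `matching_pursuit(trace, templates)` on a trace built by summing two shifted templates returns the recovered `(time, unit)` pairs together with the leftover residual. A greedy scheme of this kind is not guaranteed to recover every collision, so the residual is worth inspecting; a larger `threshold` trades missed spikes for fewer false positives.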