sumrep: a summary statistic framework for immune receptor repertoire comparison and model validation

Olson, Branden J.; Moghimi, Pejvak; Schramm, Chaim A.; Obraztsova, Anna; Ralph, D.; Vander Heiden, Jason A.; Shugay, Mikhail; Shepherd, Adrian J.; Lees, William D.; Matsen, F. A.

doi:10.1101/727784

Cited by 10 publications (11 citation statements)

References 37 publications (33 reference statements)

“…We next sought to quantify the similarity of model-generated sequences to real sequences, for each of the three models in consideration. To accomplish this task, we used the sumrep package (Olson et al, 2019) (https://github.com/matsengrp/sumrep/), a collaborative effort of the AIRR (Breden et al, 2017; Rubelt et al, 2017) software working group. This package calculates many summary statistics on immune receptor sequence repertoires and provides functions for comparing these summaries.…”

Section: Results (mentioning)

confidence: 99%

Deep generative models for T cell receptor protein sequences

Davidsen, Olson, DeWitt et al. 2019, eLife

Self Cite

Probabilistic models of adaptive immune repertoire sequence distributions can be used to infer the expansion of immune cells in response to stimulus, differentiate genetic from environmental factors that determine repertoire sharing, and evaluate the suitability of various target immune sequences for stimulation via vaccination. Classically, these models are defined in terms of a probabilistic V(D)J recombination model which is sometimes combined with a selection model. In this paper we take a different approach, fitting variational autoencoder (VAE) models parameterized by deep neural networks to T cell receptor (TCR) repertoires. We show that simple VAE models can perform accurate cohort frequency estimation, learn the rules of VDJ recombination, and generalize well to unseen sequences. Further, we demonstrate that VAE-like models can distinguish between real sequences and sequences generated according to a recombination-selection model, and that many characteristics of VAE-generated sequences are similar to those of real sequences.

“…[1] Two such approaches have been proposed for specific clone detection in Minimal Residual Diseases 45,46 as well as for the BCR, but not TCR, repertoire 47, still at a very low diversity level. The construction of such gold standard repertoires is currently very costly and remains a major challenge that the Adaptive Immune Receptor Repertoire Community (AIRR-C) 48, engaged in AIRR-seq standardization [49][50][51], may tackle in the future. Finally, in this study some data were pre-processed using proprietary (mPCR-1, mPCR-3) or published 30,52 (RACE-1_U and RACE-2_U) tools and then aligned and error-corrected using MiXCR (v2.1.10) 37.…”

Section: Detection Sensitivity of Rare TCRs Depends on the Methods (mentioning)

confidence: 99%

“…For RACE-1 and RACE-2, UMI pre-processing was performed following protocols published elsewhere 29,30. FASTQ and FASTA files were then processed for TRB and TRA sequence annotation using the MiXCR software (v2.1.10) with RNA-Seq parameters (-p rna-seq -s hsa) 50.…”

Section: TCR Deep Sequencing Data Processing (mentioning)

confidence: 99%

Benchmarking of T cell receptor repertoire profiling methods reveals large systematic biases

et al. 2020

Self Cite

Accurate profiling of T-cell receptor (TCR) repertoires is key to monitoring adaptive immunity. We systematically compared TCR sequences obtained with 9 methods applied to aliquots of the same T-cell sample. We observed marked differences in accuracy and intra- and inter-method reproducibility for alpha (TRA) and beta (TRB) TCR chains. Most methods showed lower ability to capture TRA than TRB diversity. Low RNA input generated non-representative repertoires. Results from 5'RACE-PCR methods were consistent among themselves, while differing from the RNA-based multiplex-PCR results. gDNA-based multiplex-PCR methods also differed from each other. Using an in silico meta-repertoire generated from 108 replicates, we found that one gDNA-based method and two non-UMI RNA-based methods were more sensitive than UMI methods in detecting rare clonotypes, despite the better clonotype quantification accuracy of the latter. This study delineates the advantages and limitations of different TCR sequencing methods, which should help the study, diagnosis and treatment of human diseases.

“…While these efforts raise concerns over the validity of DE cells as a biomarker or mechanism of T1D pathogenesis, our data provide an opportunity for further discovery. The data can be used to determine whether potential immune cell or immune repertoire motifs are associated with T1D, to compare against libraries of antibodies with known auto-specificities (Seay et al, 2016) as well as for sample normalization, motif discovery, and disease-specific immune subset analysis (Olson et al, 2019; Miho et al, 2019). Another interesting and under-explored aspect of our data is the analysis of the Aab+ control subjects, who have broken tolerance to a subset of T1D-associated autoantigens (Battaglia et al, 2020).…”

Section: Discussion (mentioning)

confidence: 99%