2019
DOI: 10.1101/759795
Preprint

immuneSIM: tunable multi-feature simulation of B- and T-cell receptor repertoires for immunoinformatics benchmarking

Abstract: B- and T-cell receptor repertoires of the adaptive immune system have become a key target for diagnostics and therapeutics research. Consequently, there is a rapidly growing number of bioinformatics tools for immune repertoire analysis. Benchmarking of such tools is crucial for ensuring reproducible and generalizable computational analyses. Currently, however, it remains challenging to create standardized ground truth immune receptor repertoires for immunoinformatics tool benchmarking. Therefore, we de…

Cited by 14 publications (20 citation statements)
References 43 publications
“…While these published datasets allowed us to benchmark basic run results, they do not contain complete variable domains, and were thus unable to confirm that Stitchr and Thimble were generating the correct nucleotide sequences. To generate gold-standard TCR sequences spanning the entirety of the variable domain necessary to rigorously assess this, we used the immuneSIM tool (16) to simulate V(D)J recombination, creating known TCR sequences from the IMGT germline reference database and predetermined generation probabilities (Figure 3B). This produces a repertoire with a normal distribution of variable domain lengths (Supplementary Figure 5A).…”
Section: Results (mentioning)
confidence: 99%
“…Of note, the concept of diversity measures creating equivalence classes has been noted previously for Hill diversity measures (Greiff et al, 2015b) and is here extended to include additional repertoire features immuneREF unifies single and composite features, frequency-dependent, and sequence-dependent similarity measures into one computational framework. Beyond quantifying the repertoire similarity of experimental immune repertoires, immuneREF also enables the comparison of simulated (Han et al, 2021; Marcou et al, 2017; Safonova et al, 2015; Weber et al, 2020)(Marcou et al, 2017; Safonova et al, 2015; Weber et al, 2019) and in vitro synthetic immune repertoires used for therapeutic antibody discovery (Mason et al, 2018). Furthermore, immuneREF may be used for data curation purposes in immune repertoire databases such as iReceptor (Corrie et al, 2018), VDJserver (Cowell et al, 2015), PIRD (Zhang et al, 2019), and Observed Antibody Space (Kovaltsuk et al, 2018).…”
Section: Discussion (mentioning)
confidence: 99%
“…Here, we developed a simulation framework and further exemplified how simulated repertoires can be leveraged to benchmark and develop bioinformatics software relating to experimental datasets. Although multiple simulation frameworks have been previously developed (Yermanos et al, 2017;Weber et al, 2020;Safonova et al, 2015;Davidsen and Matsen, 2018), there is a lack of software specifically tailored to generating adaptive immune receptors at the single-cell resolution and with corresponding transcriptome information. Echidna bridges the gap between single-cell transcriptome and immune repertoire sequencing, providing information relevant to both fields at the single-cell resolution.…”
Section: Discussion (mentioning)
confidence: 99%
“…Interpreting such datasets, however, remains challenging as the accompanying computational pipelines and software are still in their infancy (Yermanos, Agrafiotis, et al, 2021b;Borcherding et al, 2020;Sturm et al, 2020). Although multiple tools have been developed to simulate immune receptors and single-cell transcriptomes (Marcou et al, 2018;Weber et al, 2020;Yermanos et al, 2017;Davidsen et al, 2019;Safonova et al, 2015), these represent separate platforms and thus there remains a lack of software capable of simulating scSeq data of immune repertoires and transcriptomes. We therefore developed Echidna, an R package that simulates immune repertoires and their corresponding transcriptomes.…”
Section: Introduction (mentioning)
confidence: 99%