Killer immunoglobulin-like receptor (KIR) genes and human leukocyte antigen (HLA) genes are highly polymorphic across the population and play important roles in innate and adaptive immunity. We have developed a novel computational method, T1K, that efficiently and accurately infers KIR or HLA alleles from next-generation sequencing data. T1K is flexible and compatible with data from various sequencing applications, including RNA-seq and genomic sequencing. We applied T1K to CD8+ T cell single-cell RNA-seq data and found that KIR2DL4 allele expression levels were enriched in tumor-specific CD8+ T cells.
We applied our computational algorithm TRUST4 to assemble immune receptor (TCR/BCR) repertoires from approximately twelve thousand RNA-seq samples from The Cancer Genome Atlas (TCGA) and seven immunotherapy studies. From over 35 million assembled complete complementarity-determining region 3 (CDR3) sequences, we observed that CCL5 and MZB1 are the genes whose expression is most positively correlated with T-cell clonal expansion and B-cell clonal expansion, respectively. We analyzed amino acid evolution during B-cell receptor somatic hypermutation and identified tyrosine as the preferred residue. We found that IgG1+IgG3 antibodies together with FcRn were associated with complement-dependent cytotoxicity and antibody-dependent cellular cytotoxicity or phagocytosis. In addition to B-cell infiltration, we discovered that B-cell clonal expansion and IgG1+IgG3 antibodies are also correlated with better patient outcomes. Finally, we created a website, VisualizIRR, for users to interactively explore and visualize the immune repertoires in this study.
Killer immunoglobulin-like receptor (KIR) genes and human leukocyte antigen (HLA) genes play important roles in innate and adaptive immunity. They are highly polymorphic and cannot be genotyped with standard variant calling pipelines. Compared with HLA genes, many KIR genes are similar to each other in sequence and may be absent from a chromosome. Therefore, while many tools have been developed to genotype HLA genes using common sequencing data, none of them works for KIR genes. Even the specialized KIR genotypers cannot resolve all the KIR genes. Here we describe T1K, a novel computational method for the efficient and accurate inference of KIR or HLA alleles from RNA-seq, whole genome sequencing, or whole exome sequencing data. T1K jointly considers alleles across all genotyped genes, so it can reliably identify present genes and distinguish homologous genes, including the challenging KIR2DL5A/KIR2DL5B genes. This model also benefits HLA genotyping, where T1K achieves the highest accuracy in benchmarks. Moreover, T1K can call novel single nucleotide variants and process single-cell data. Applying T1K to tumor single-cell RNA-seq data, we found that KIR2DL4 expression was enriched in tumor-specific CD8+ T cells. T1K may open the opportunity for HLA and KIR genotyping across various sequencing applications.
Motivation: The chromatin profile measured by ATAC-seq, ChIP-seq, or DNase-seq experiments can identify genomic regions critical in regulating gene expression and provide insights into biological processes such as disease and development. However, quality control and processing of chromatin profiling data involve many steps, and different bioinformatics tools are used at each step. It can be challenging to manage the analysis. Results: We developed a Snakemake pipeline called CHIPS (CHromatin enrIchment ProcesSor) to streamline the processing of ChIP-seq, ATAC-seq, and DNase-seq data. The pipeline supports single- and paired-end data and can start from either FASTQ or BAM files. It includes basic steps such as read trimming, mapping, and peak calling. In addition, it calculates quality control metrics such as contamination profiles, the polymerase chain reaction (PCR) bottleneck coefficient, the fraction of reads in peaks, the percentage of peaks overlapping with the union of public DNaseI hypersensitivity sites, and the conservation profile of the peaks. For downstream analysis, it carries out peak annotation, motif finding, and regulatory potential calculation for all genes. The pipeline ensures that the processing is robust and reproducible. Availability: CHIPS is available at https://github.com/liulab-dfci/CHIPS.
Motivation: The chromatin profile measured by ATAC-seq, ChIP-seq, or DNase-seq experiments can identify genomic regions critical in regulating gene expression and provide insights into biological processes such as disease and development. However, quality control and processing of chromatin profiling data involve many steps, and different bioinformatics tools are used at each step. It can be challenging to manage the analysis. Results: We developed a Snakemake pipeline called CHIPS (CHromatin enrichment Processor) to streamline the processing of ChIP-seq, ATAC-seq, and DNase-seq data. The pipeline supports single- and paired-end data and can start from either FASTQ or BAM files. It includes basic steps such as read trimming, mapping, and peak calling. In addition, it calculates quality control metrics such as contamination profiles, the PCR bottleneck coefficient, the fraction of reads in peaks, the percentage of peaks overlapping with the union of public DNaseI hypersensitivity sites, and the conservation profile of the peaks. For downstream analysis, it carries out peak annotation, motif finding, and regulatory potential calculation for all genes. The pipeline ensures that the processing is robust and reproducible. Availability: CHIPS is available at https://bitbucket.org/plumbers/cidc_chips/src/master/. Contact: mtang@ds.dfci.harvard.edu; henry_long@dfci.harvard.edu