Gali Bai scite author profile

T1K: efficient and accurate KIR and HLA genotyping with next-generation sequencing data

Killer immunoglobulin-like receptor (KIR) genes and human leukocyte antigen (HLA) genes are highly polymorphic in a population and play important roles in innate and adaptive immunity. We have developed a novel computational method, T1K, that can efficiently and accurately infer KIR or HLA alleles from next-generation sequencing data. T1K is flexible and compatible with various sequencing platforms, including RNA-seq and genomic sequencing data. We applied T1K to CD8+ T cell single-cell RNA-seq data and found that KIR2DL4 allele expression levels were enriched in tumor-specific CD8+ T cells.

Comprehensive Characterizations of Immune Receptor Repertoire in Tumors and Cancer Immunotherapy Studies

Song, Ouyang, Cohen, et al. 2022

We applied our computational algorithm TRUST4 to assemble immune receptor (TCR/BCR) repertoires from approximately twelve thousand RNA-seq samples from The Cancer Genome Atlas (TCGA) and seven immunotherapy studies. From over 35 million assembled complete complementarity-determining region 3 (CDR3) sequences, we observed that CCL5 and MZB1 are the genes whose expression correlates most positively with T-cell clonal expansion and B-cell clonal expansion, respectively. We analyzed amino acid evolution during B-cell receptor somatic hypermutation and identified tyrosine as the preferred residue. We found that IgG1+IgG3 antibodies together with FcRn were associated with complement-dependent cytotoxicity and antibody-dependent cellular cytotoxicity or phagocytosis. In addition to B-cell infiltration, we discovered that B-cell clonal expansion and IgG1+IgG3 antibodies are also correlated with better patient outcomes. Finally, we created a website, VisualizIRR, for users to interactively explore and visualize the immune repertoires in this study.

Efficient and accurate KIR and HLA genotyping with massively parallel sequencing data

et al. 2023

Killer immunoglobulin-like receptor (KIR) genes and human leukocyte antigen (HLA) genes play important roles in innate and adaptive immunity. They are highly polymorphic and cannot be genotyped with standard variant calling pipelines. Compared with HLA genes, many KIR genes are similar to each other in sequence and may be absent from the chromosomes. Therefore, while many tools have been developed to genotype HLA genes using common sequencing data, none of them works for KIR genes. Even the specialized KIR genotypers could not resolve all the KIR genes. Here we describe T1K, a novel computational method for the efficient and accurate inference of KIR or HLA alleles from RNA-seq, whole genome sequencing or whole exome sequencing data. T1K jointly considers alleles across all genotyped genes, so it can reliably identify present genes and distinguish homologous genes, including the challenging KIR2DL5A/KIR2DL5B genes. This model also benefits HLA genotyping, where T1K achieves the highest accuracy in benchmarks. Moreover, T1K can call novel single nucleotide variants and process single-cell data. Applying T1K to tumor single-cell RNA-seq data, we found that KIR2DL4 expression was enriched in tumor-specific CD8+ T cells. T1K may open the opportunity for HLA and KIR genotyping across various sequencing applications.

CHIPS: A Snakemake pipeline for quality control and reproducible processing of chromatin profiling data

et al. 2021

Motivation: The chromatin profile measured by ATAC-seq, ChIP-seq, or DNase-seq experiments can identify genomic regions critical in regulating gene expression and provide insights on biological processes such as diseases and development. However, quality control and processing of chromatin profiling data involve many steps, and different bioinformatics tools are used at each step. It can be challenging to manage the analysis. Results: We developed a Snakemake pipeline called CHIPS (CHromatin enrIchment ProcesSor) to streamline the processing of ChIP-seq, ATAC-seq, and DNase-seq data. The pipeline supports single- and paired-end data and can start from either FASTQ or BAM files. It includes basic steps such as read trimming, mapping, and peak calling. In addition, it calculates quality control metrics such as contamination profiles, polymerase chain reaction bottleneck coefficient, the fraction of reads in peaks, percentage of peaks overlapping with the union of public DNaseI hypersensitivity sites, and conservation profile of the peaks. For downstream analysis, it carries out peak annotations, motif finding, and regulatory potential calculation for all genes. The pipeline ensures that the processing is robust and reproducible. Availability: CHIPS is available at https://github.com/liulab-dfci/CHIPS.
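
Two of the quality control metrics named in this abstract, the PCR bottleneck coefficient and the fraction of reads in peaks, can be illustrated with a minimal Python sketch. The abstract does not say how CHIPS computes them, so the code below follows the commonly used definitions (distinct positions holding exactly one read divided by all distinct positions, and reads falling inside peaks divided by all reads). The function names, the tuple-based read and peak representation, and the toy data are all hypothetical and are not taken from CHIPS.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# Hypothetical in-memory representations, chosen only for illustration.
Read = Tuple[str, int, str]   # (chromosome, 5' position, strand)
Peak = Tuple[str, int, int]   # (chromosome, start, end), half-open interval


def pcr_bottleneck_coefficient(reads: Iterable[Read]) -> float:
    """Distinct positions with exactly one read / all distinct positions."""
    counts = Counter(reads)
    if not counts:
        return 0.0
    singletons = sum(1 for n in counts.values() if n == 1)
    return singletons / len(counts)


def fraction_of_reads_in_peaks(reads: List[Read], peaks: List[Peak]) -> float:
    """Reads whose position falls inside any peak / total reads."""
    if not reads:
        return 0.0
    in_peaks = sum(
        1
        for chrom, pos, _strand in reads
        if any(c == chrom and start <= pos < end for c, start, end in peaks)
    )
    return in_peaks / len(reads)


# Toy data: five reads, two of which are duplicates at the same position.
reads = [
    ("chr1", 100, "+"),
    ("chr1", 100, "+"),
    ("chr1", 150, "+"),
    ("chr1", 900, "-"),
    ("chr2", 50, "+"),
]
peaks = [("chr1", 90, 200)]

pbc = pcr_bottleneck_coefficient(reads)
frip = fraction_of_reads_in_peaks(reads, peaks)
print(f"PBC = {pbc:.2f}, FRiP = {frip:.2f}")
```

The linear scan over peaks keeps the sketch short; a pipeline working on real BAM files would more plausibly rely on an interval index or an established tool, which is outside what the abstract describes.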

CHIPS: A Snakemake pipeline for quality control and reproducible processing of chromatin profiling data

Taing, Cousins, Bai, et al. 2021 (Preprint)

Motivation: The chromatin profile measured by ATAC-seq, ChIP-seq, or DNase-seq experiments can identify genomic regions critical in regulating gene expression and provide insights on biological processes such as diseases and development. However, quality control and processing of chromatin profiling data involve many steps, and different bioinformatics tools are used at each step. It can be challenging to manage the analysis. Results: We developed a Snakemake pipeline called CHIPS (CHromatin enrichment Processor) to streamline the processing of ChIP-seq, ATAC-seq, and DNase-seq data. The pipeline supports single- and paired-end data and can start from either FASTQ or BAM files. It includes basic steps such as read trimming, mapping, and peak calling. In addition, it calculates quality control metrics such as contamination profiles, PCR bottleneck coefficient, the fraction of reads in peaks, percentage of peaks overlapping with the union of public DNaseI hypersensitivity sites, and conservation profile of the peaks. For downstream analysis, it carries out peak annotations, motif finding, and regulatory potential calculation for all genes. The pipeline ensures that the processing is robust and reproducible. Availability: CHIPS is available at https://bitbucket.org/plumbers/cidc_chips/src/master/. Contact: mtang@ds.dfci.harvard.edu; henry_long@dfci.harvard.edu
