Tanxiao Huang scite author profile

Background: Applications that require highly accurate sequencing data, clinical applications in particular, must contend with unavoidable sequencing errors. Several tools have been proposed to profile sequencing quality, but few of them can quantify or correct sequencing errors. This unmet requirement motivated us to develop AfterQC, a tool that profiles sequencing errors and corrects most of them, and that also provides highly automated quality control and data filtering. Unlike most tools, AfterQC analyses the overlap of paired reads in paired-end sequencing data. Based on this overlap analysis, AfterQC can detect and cut adapters, and it offers a novel function to correct wrong bases in the overlapping regions. Another new feature is the detection and visualisation of sequencing bubbles, which are commonly found on flowcell lanes and may cause sequencing errors. Besides the usual per-cycle quality and base content plots, AfterQC also provides polyX filtering (polyX being a long sub-sequence of the same base X), automatic trimming and k-mer based strand bias profiling. Results: For each single FastQ file or pair of FastQ files, AfterQC filters out bad reads, detects and eliminates the sequencer’s bubble effects, trims reads at the front and tail, detects sequencing errors and corrects part of them, and finally outputs clean data and generates HTML reports with interactive figures. AfterQC can run in batch mode with multiprocess support. It accepts a single FastQ file, a single pair of FastQ files (for paired-end sequencing), or a folder whose FastQ files are all processed automatically. Based on the overlap analysis, AfterQC can estimate the sequencing error rate and profile the error transform distribution. The results of our error profiling tests show that the error distribution is highly platform dependent. Conclusion: Much more than just another quality control (QC) tool, AfterQC performs quality control, data filtering, error profiling and base correction automatically. Experimental results show that AfterQC can help to eliminate sequencing errors in paired-end sequencing data and so provide much cleaner outputs, which in turn helps to reduce false-positive variants, especially for low-frequency somatic mutations. While it provides rich configurable options, AfterQC can detect and set all of them automatically and requires no arguments in most cases.
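
The overlap-based base correction described in this abstract can be illustrated with a minimal Python sketch. It is not AfterQC's implementation: the function names, the `min_overlap`, `max_mismatch` and `min_diff` thresholds, and the rule of trusting the higher-quality base are all assumptions made for illustration, and only the case where read 2 starts inside read 1 is handled.

```python
# Illustrative sketch only; names and thresholds are assumptions, not AfterQC's API.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_overlap(r1, r2rc, min_overlap=10, max_mismatch=2):
    """Return the smallest offset where r1[offset:] matches the start of r2rc.

    A match tolerates up to max_mismatch differing bases and must span at
    least min_overlap bases. Returns None when no such offset exists.
    """
    for offset in range(len(r1) - min_overlap + 1):
        length = min(len(r1) - offset, len(r2rc))
        if length < min_overlap:
            continue
        mismatches = sum(
            1 for a, b in zip(r1[offset:offset + length], r2rc[:length]) if a != b
        )
        if mismatches <= max_mismatch:
            return offset
    return None


def correct_pair(r1, q1, r2, q2, min_diff=10):
    """Correct mismatched bases in the overlap of a read pair.

    q1 and q2 are lists of integer Phred scores. A mismatch is corrected
    only when one base's quality exceeds the other's by at least min_diff.
    Returns (corrected_r1, corrected_r2, number_of_corrections).
    """
    r2rc = reverse_complement(r2)
    q2rc = q2[::-1]
    offset = find_overlap(r1, r2rc)
    if offset is None:
        return r1, r2, 0
    fixed1, fixed2rc = list(r1), list(r2rc)
    corrected = 0
    length = min(len(r1) - offset, len(r2rc))
    for i in range(length):
        a, b = r1[offset + i], r2rc[i]
        if a == b:
            continue
        qa, qb = q1[offset + i], q2rc[i]
        if qa - qb >= min_diff:
            fixed2rc[i] = a  # trust read 1
            corrected += 1
        elif qb - qa >= min_diff:
            fixed1[offset + i] = b  # trust read 2
            corrected += 1
    return "".join(fixed1), reverse_complement("".join(fixed2rc)), corrected
```

In this sketch a mismatch with similar qualities on both reads is left untouched, since neither base can be preferred.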

Local and abscopal responses in advanced intrahepatic cholangiocarcinoma with low TMB, MSS, pMMR and negative PD-L1 expression following combined therapy of SBRT with PD-1 blockade

Liu¹, Yao², Song³ et al. 2019, J. Immunotherapy Cancer

Background: Patients with late-stage or recurrent intrahepatic cholangiocarcinoma (ICC) have a poor prognosis due to limited sensitivity to chemotherapy or radiotherapy and the coexistence of multiple lesions. Programmed cell death protein 1 (PD-1) blockade provides a therapeutic opportunity for patients with high tumor mutation burden (TMB), high microsatellite instability (MSI-H), deficient mismatch repair (dMMR) and/or positive programmed cell death ligand 1 (PD-L1) expression. However, it is currently believed that patients with low TMB, microsatellite stable (MSS) status, proficient mismatch repair (pMMR) or negative PD-L1 expression are less likely to benefit from PD-1 blockade. Case presentation: Here we provide the first report on the therapeutic responses of ICC patients treated with PD-1 blockade combined with stereotactic body radiotherapy (SBRT) (Cyberknife) in the background of low TMB, MSS, pMMR and negative PD-L1 expression. One stage IVA ICC patient and two postsurgical recurrent ICC patients were involved in this study, and the responses of both the locally irradiated tumor(s) and the abscopal tumors or metastases to the combined therapy were assessed by magnetic resonance imaging (MRI) and positron emission tomography-computed tomography (PET-CT). The stage IVA ICC patient (Patient A) exhibited a TMB of 1.2 muts/Mb with MSS, pMMR and <1% PD-L1 expression. Both the intrahepatic lesion and the lymph node metastases were well controlled for 7 months, and partial response (PR) was achieved, with the sum of lesion diameters decreased by 40.9%. One of the postsurgical recurrent ICC patients (Patient B) exhibited a TMB of 3.8 muts/Mb with MSS, pMMR and <1% PD-L1 expression. Both the recurrent intrahepatic lesion and the lymph node metastases were well controlled by the combined therapy, and the sum of lesion diameters decreased by 86.3% (PR). The other postsurgical recurrent patient (Patient C) exhibited a TMB of 0.98 muts/Mb with MSS, pMMR and <1% PD-L1 expression, and achieved a complete response (CR) that was maintained for 11 months. Abscopal effects were observed in all three patients. Conclusions: This study provided the first set of evidence for the effectiveness of combined SBRT and PD-1 blockade therapy in late-stage or recurrent ICC patients with low TMB, MSS, pMMR and negative PD-L1 expression, and potentially expanded the indications of the combined therapy to patients who were previously considered unsuitable for immunotherapy.

Gencore: an efficient tool to generate consensus reads for error suppressing and duplicate removing of NGS data

et al. 2019

Background: Removing duplicates might be considered a well-resolved problem in next-generation sequencing (NGS) data processing. However, as NGS technology gains more recognition in clinical applications, researchers have started to pay more attention to its sequencing errors and prefer to remove these errors while performing deduplication. Recently, a new technology called the unique molecular identifier (UMI) has been developed to better identify sequencing reads derived from different DNA fragments. Most existing duplicate removing tools cannot handle UMI-integrated data. Some modern tools can work with UMIs, but are usually slow and use too much memory. Furthermore, existing tools rarely report rich statistical results, which are very important for quality control and downstream analysis. These unmet requirements drove us to develop an ultra-fast, simple, lightweight but powerful tool for duplicate removing and sequence error suppressing, with features for handling UMIs and reporting informative results. Results: This paper presents gencore, an efficient tool for duplicate removing and sequence error suppressing of NGS data. This tool clusters the mapped sequencing reads and merges the reads in each cluster to generate one single consensus read. While the consensus read is generated, the random errors introduced by library construction and sequencing can be removed. This error-suppressing feature makes gencore very suitable for detecting ultra-low frequency mutations from deep sequencing data. When UMI technology is applied, gencore can use the UMIs to identify the reads derived from the same original DNA fragment. Gencore reports statistical results in both HTML and JSON formats. The HTML report contains many interactive figures plotting statistical coverage and duplication information. The JSON report contains all the statistical results and can be read by downstream programs. Conclusions: Compared to conventional tools like Picard and SAMtools, gencore greatly reduces the mapping mismatches in the output data, which are mostly caused by errors. Compared to some new tools like UMI-Reducer and UMI-tools, gencore runs much faster, uses less memory, generates better consensus reads and provides simpler interfaces. To the best of our knowledge, gencore is the only duplicate removing tool that generates both informative HTML and JSON reports. This tool is available at: https://github.com/OpenGene/gencore
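
The clustering and consensus step described in this abstract can be sketched as follows. This is a simplified illustration rather than gencore's algorithm: the read tuple layout, the grouping key (chromosome, start position, UMI) and the strict-majority vote are assumptions, and base qualities and paired-end information are ignored.

```python
# Illustrative sketch only; the data layout and voting rule are assumptions.
from collections import Counter, defaultdict


def cluster_reads(reads):
    """Group reads that share chromosome, mapped start position and UMI.

    Each read is a hypothetical (chrom, start, umi, sequence) tuple.
    """
    clusters = defaultdict(list)
    for chrom, start, umi, seq in reads:
        clusters[(chrom, start, umi)].append(seq)
    return clusters


def consensus(seqs):
    """Build one consensus read from a cluster by per-position majority vote.

    A position becomes 'N' unless a single base is carried by more than
    half of the reads. The consensus is truncated to the shortest read.
    """
    length = min(len(s) for s in seqs)
    out = []
    for i in range(length):
        base, count = Counter(s[i] for s in seqs).most_common(1)[0]
        out.append(base if count * 2 > len(seqs) else "N")
    return "".join(out)
```

With three reads in a cluster, a random error present in only one of them is outvoted, which is the intuition behind using consensus reads for error suppression.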

Comprehensive analysis of POLE and POLD1 Gene Variations identifies cancer patients potentially benefit from immunotherapy in Chinese population

Yao¹, Yang, Zhao et al. 2019, Sci Rep

POLE/POLD1 gene variants have been suggested as potential markers for immunotherapy due to their significant association with the tumor mutational burden (TMB), an effective indicator for response prediction in immunotherapy. However, the correlation of POLE/POLD1 variants with MSI, MMR, TMB, MMR-related and key driver gene mutations needs to be defined to support patient recruitment and therapeutic effect assessment in immunotherapy. A total of 1,392 Chinese cancer patients were recruited, and the correlation of POLE/POLD1 variants with existing immunotherapeutic markers and cancer pathways was investigated. A next-generation sequencing panel including 605 cancer-related genes was used for variant sequencing. The frequency of POLE variants was not statistically different from that in the COSMIC database, while the frequency of POLD1 variants was significantly higher in lung cancer. c.857C>G and c.2091dupC were potential high-frequency variants in Chinese cancer patients. Patients carrying POLE damaging variants were significantly younger than POLE/POLD1 WT patients. Patients carrying POLE/POLD1 damaging variants exhibited significantly higher TMB and a higher frequency of MMR gene variants than POLE/POLD1 WT patients. Patients with POLE damaging variants also exhibited a significantly higher frequency of driver gene variants than POLE/POLD1 WT patients. Further analysis showed that POLE damaging variants may affect cancer development through the MMR, TGFβ and RTK/RAS/RAF signaling pathways, and POLD1 damaging variants through MMR pathways. In conclusion, this study identified key characteristics and regions of the POLE/POLD1 genes that correlate with TMB, MMR gene mutations and key driver gene mutations, and provided a theoretical and practical basis for patient selection based on POLE/POLD1 gene status in immunotherapy.

The contribution of hereditary cancer-related germline mutations to lung cancer susceptibility

Liu¹, Liu², Suo³ et al. 2020, Transl Lung Cancer Res

Background: Germline variations may contribute to lung cancer susceptibility besides environmental factors. The influence of germline mutations on lung cancer susceptibility and their correlation with somatic mutations has not been systematically investigated. Methods: In this study, germline mutations from 1,026 non-small cell lung cancer (NSCLC) patients were analyzed with a 58-gene next-generation sequencing (NGS) panel containing known hereditary cancer-related genes, and were categorized by pathogenicity according to American College of Medical Genetics and Genomics (ACMG) guidelines. The corresponding somatic mutations were analyzed using a 605-gene NGS panel containing known cancer-related genes. Results: Plausible genetic susceptibility was found in 4.7% of lung cancer patients, among whom 14 patients with pathogenic mutations (P group) and 34 patients with likely-pathogenic mutations (LP group) were identified. The ratio of first-degree relatives with a history of lung cancer was significantly higher in the P group than in the Non-P group (P=0.009). The ratio of lung cancer patients with a history of other cancers was higher in the P (P=0.0007) or LP (P=0.017) group than in the Non-P group. Pathogenic mutations fell most commonly in BRCA2, followed by CHEK2 and ATM. Likely-pathogenic mutations fell most commonly in NTRK1 and EXT2, followed by BRIP1 and PALB2. These genes are involved in DNA repair, cell cycle regulation and tumor suppression. By comparing the germline mutation frequency from this study with that of the whole population or the East Asian population (gnomAD database), we found that the overall odds ratio (OR) for the P or LP group was 17.93 and 15.86, respectively, when compared with the whole population, and 2.88 and 3.80, respectively, when compared with the East Asian population, suggesting that the germline mutations of the P and LP groups were risk factors for lung cancer. Somatic mutation analysis revealed no significant difference in tumor mutation burden (TMB) among the groups, although a trend of lower TMB in the pathogenic group was found. The SNV/INDEL mutation frequency of TP53 in the P group was significantly lower than in the other two groups, and the copy number variation (CNV) mutation frequency of PIK3CA and MET was significantly higher than in the Non-P group. Pathway enrichment analysis found no significant difference in aberrant pathways among the three groups. Conclusions: Germline variants carried by 4.7% of patients may be linked to increased susceptibility to lung cancer. Patients with pathogenic germline mutations exhibited stronger family history and higher lung cancer risk.
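
The abstract does not state how its odds ratios were calculated. As a point of reference, the standard 2x2 odds ratio with a Woolf (log-based) 95% confidence interval can be sketched as below; the function and parameter names are hypothetical and no counts from the study are used.

```python
# Illustrative sketch only; the study's actual method is not given in the abstract.
import math


def odds_ratio(case_carriers, case_noncarriers, ref_carriers, ref_noncarriers):
    """Return (odds ratio, lower 95% bound, upper 95% bound) for a 2x2 table.

    The interval uses the Woolf approximation on the log odds ratio, so all
    four counts must be positive.
    """
    a, b, c, d = case_carriers, case_noncarriers, ref_carriers, ref_noncarriers
    if min(a, b, c, d) <= 0:
        raise ValueError("all four counts must be positive")
    estimate = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - 1.96 * se)
    upper = math.exp(math.log(estimate) + 1.96 * se)
    return estimate, lower, upper
```

An odds ratio above 1 whose lower bound also exceeds 1 suggests that carriers are over-represented among cases relative to the reference population.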

Tanxiao Huang

AfterQC: automatic filtering, trimming, error removing and quality control for fastq data
