Persistent pathogen infection is a known cause of malignancy, but it has rarely been evaluated systematically across tumor types. We present a comprehensive landscape of 1060 infectious pathogens across 239 whole exomes and 1168 transcriptomes of breast, lung, gallbladder, cervical, colorectal, and head and neck tumors. We identify known cancer-associated pathogens consistent with the literature. In addition, we identify a significant prevalence of Fusobacterium in head and neck tumors, comparable to that in colorectal tumors. The Fusobacterium-high subgroup of head and neck tumors is mutually exclusive with human papillomavirus and is characterized by overexpression of miRNAs associated with inflammation, an elevated innate immune cell fraction, and nodal metastases. We validate the association of Fusobacterium with the inflammatory markers IL1B, IL6 and IL8, the miRNAs hsa-mir-451a, hsa-mir-675 and hsa-mir-486-1, and MMP10 in tongue tumor samples. A higher burden of Fusobacterium is also associated with poor survival, nodal metastases and extracapsular spread in tongue tumors, defining a distinct subgroup of head and neck cancer.
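The abstract above does not say which statistical test supports the mutual exclusivity of the Fusobacterium-high subgroup and human papillomavirus. A common choice for this kind of claim is a one-sided Fisher's exact test on a 2x2 contingency table, and the sketch below shows that test only as an assumption about how such a result could be checked. The function name and the sample counts are hypothetical and are not taken from the study.

```python
from math import comb


def mutual_exclusivity_pvalue(both: int, only_a: int, only_b: int, neither: int) -> float:
    """One-sided (lower-tail) Fisher's exact test for a 2x2 table.

    Returns the probability of seeing `both` or fewer co-occurrences of
    features A and B if the two features were independent. A small value
    suggests that A and B tend to be mutually exclusive.
    """
    total = both + only_a + only_b + neither
    with_a = both + only_a  # samples carrying feature A
    with_b = both + only_b  # samples carrying feature B
    # Smallest overlap that is possible given the table margins.
    lowest = max(0, with_a + with_b - total)
    denominator = comb(total, with_a)
    # Sum hypergeometric probabilities for overlaps from `lowest` up to `both`.
    numerator = sum(
        comb(with_b, k) * comb(total - with_b, with_a - k)
        for k in range(lowest, both + 1)
    )
    return numerator / denominator


# Hypothetical counts, for illustration only (not data from the study):
# 1 tumor with both features, 30 Fusobacterium-high only, 25 HPV-positive only,
# and 44 with neither.
p_value = mutual_exclusivity_pvalue(both=1, only_a=30, only_b=25, neither=44)
```

The lower tail is used because mutual exclusivity corresponds to fewer co-occurrences than expected by chance; the upper tail would test co-occurrence instead.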
The analysis of SARS-CoV-2 genome datasets has significantly advanced our understanding of the biology and genomic adaptability of the virus. However, the plurality of advanced sequencing datasets (such as short and long reads) presents a formidable computational challenge to uniformly performing quantitative, variant or phylogenetic analysis, limiting the application of such analysis in public health laboratories engaged in studying epidemic outbreaks. We present a computational tool, Infectious Pathogen Detector (IPD), to perform integrated analysis of diverse genomic datasets, with a customized analytical module for the SARS-CoV-2 virus. The IPD pipeline quantitates individual occurrences of 1060 pathogens and performs mutation and phylogenetic analysis from heterogeneous sequencing datasets. Using IPD, we demonstrate a varying burden (5.055–999655.7 fragments per million) of SARS-CoV-2 transcripts across 1500 short- and long-read sequencing SARS-CoV-2 datasets and identify 4634 SARS-CoV-2 variants (~3.05 variants per sample), including 449 novel variants, across the genome, with distinct hotspot mutations in the ORF1ab and S genes along with their phylogenetic relationships, establishing the utility of IPD in tracing genome isolates from the genomic data (as accessed on 11 June 2020). IPD predicts the occurrence and dynamics of variability among infectious pathogens, with potential for direct utility in the COVID-19 pandemic and beyond to help automate sequencing-based pathogen analysis and to respond effectively to public health threats. A graphical user interface (GUI)-enabled desktop application is freely available for download for academic users at http://www.actrec.gov.in/pi-webpages/AmitDutt/IPD/IPD.html and for web-based processing at http://ipd.actrec.gov.in/ipdweb/ to generate an automated report without any prior computational know-how.
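The abstract above reports pathogen burden in fragments per million but does not spell out the formula. A minimal sketch follows, assuming the conventional definition (fragments assigned to a pathogen, scaled to one million sequenced fragments). The function names and the counts are hypothetical; this is not IPD's actual code.

```python
def fragments_per_million(pathogen_fragments: int, total_fragments: int) -> float:
    """Scale a pathogen's fragment count to one million sequenced fragments.

    Assumed definition: FPM = pathogen_fragments * 1e6 / total_fragments.
    """
    if total_fragments <= 0:
        raise ValueError("total_fragments must be positive")
    if not 0 <= pathogen_fragments <= total_fragments:
        raise ValueError("pathogen_fragments must lie between 0 and total_fragments")
    # Multiply before dividing so that exact ratios stay exact in floating point.
    return pathogen_fragments * 1_000_000 / total_fragments


def rank_by_burden(counts: dict, total_fragments: int) -> list:
    """Return (pathogen, FPM) pairs sorted from highest to lowest burden."""
    burdens = {
        name: fragments_per_million(count, total_fragments)
        for name, count in counts.items()
    }
    return sorted(burdens.items(), key=lambda item: item[1], reverse=True)


# Hypothetical fragment counts for one sample, for illustration only.
sample_counts = {"SARS-CoV-2": 250, "Fusobacterium": 40}
ranking = rank_by_burden(sample_counts, total_fragments=2_000_000)
```

Normalizing to a fixed library size is what makes burdens comparable across datasets of very different sequencing depth, such as the short- and long-read runs mentioned above.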
We present an updated version of our automated computational pipeline, Infectious Pathogen Detector (IPD) 2.0, with a SARS-CoV-2 module, to perform genomic analysis to understand the pathogenesis and virulence of the virus. Analysing the 208911 SARS-CoV-2 genome sequences available as of 28 Dec 2020, we generate an extensive database of sample-wise variants and clade annotation, which forms the core of the SARS-CoV-2 analysis module of the pipeline. A comparative account of lineage-specific mutations in the newer SARS-CoV-2 strains emerging in the UK, South Africa and Brazil, along with data reported from India, identifies overlapping and lineage-specific acquired mutations, suggesting repetitive convergent and adaptive evolution. Thus, the persistence of the pandemic may lead to the emergence of newer regional strains with improved fitness. IPD 2.0 also adopts the recent dynamic clade nomenclature and improves on its predecessor in accuracy of clade assignment, processing time and portability, and thus could be a vital tool to help facilitate genomic surveillance in a population to identify variants involved in breakthrough infections.
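The comparison of overlapping and lineage-specific mutations described above can be pictured as set operations over per-lineage mutation lists. The sketch below is one simple way to do this and is not IPD's implementation. The function name is hypothetical, and the input is a small subset of widely reported spike substitutions for the lineages first identified in the UK (B.1.1.7), South Africa (B.1.351) and Brazil (P.1), included for illustration only and not drawn from the IPD database.

```python
from collections import Counter


def partition_mutations(lineage_mutations: dict) -> tuple:
    """Split mutations into those shared by two or more lineages and
    those found in exactly one lineage.

    Returns (shared, specific), where `shared` is a set of mutations and
    `specific` maps each lineage to the mutations unique to it.
    """
    # Count in how many lineages each mutation appears.
    occurrences = Counter(
        mutation
        for mutations in lineage_mutations.values()
        for mutation in mutations
    )
    shared = {mutation for mutation, count in occurrences.items() if count >= 2}
    specific = {
        lineage: set(mutations) - shared
        for lineage, mutations in lineage_mutations.items()
    }
    return shared, specific


# Illustrative subset of spike substitutions (an assumption, not study data).
example = {
    "B.1.1.7": {"N501Y", "P681H", "A570D"},
    "B.1.351": {"N501Y", "E484K", "K417N"},
    "P.1": {"N501Y", "E484K", "K417T"},
}
shared, specific = partition_mutations(example)
```

Mutations that land in `shared` despite arising in independent lineages are the kind of recurrence that the abstract interprets as convergent evolution.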
Occult lymph-node metastasis is a crucial predictor of tongue cancer mortality, and there is an unmet need to understand the underlying mechanism. Our immunohistochemical and real-time PCR analysis of 208 tongue tumors shows overexpression of the matrix metalloproteinase MMP10 in 86% of node-positive tongue tumors (n = 79; p < 0.00001). Additionally, global profiling for non-coding RNAs associated with node-positive tumors reveals that, of the 11 significantly de-regulated miRNAs, miR-944 negatively regulates MMP10 by targeting its 3’-UTR. We demonstrate that proliferation, migration, and invasion of tongue cancer cells are suppressed by MMP10 knockdown or miR-944 overexpression. Further, we show that depletion of MMP10 prevents nodal metastases in an orthotopic tongue cancer mouse model. In contrast, overexpression of MMP10 has the opposite effect, upregulating epithelial-mesenchymal transition, mediated by a tyrosine kinase gene, AXL, to promote nodal and distant metastasis in vivo. Strikingly, AXL expression is essential and sufficient to mediate the functional consequence of MMP10 overexpression. Consistent with our findings, TCGA-HNSC data suggest that overexpression of MMP10 or AXL correlates with poor patient survival. In conclusion, our results establish that the miR-944/MMP10/AXL axis underlies lymph node metastases, with potential for therapeutic intervention and for predicting nodal metastases in tongue cancer patients.
Background: Rapid analysis of SARS-CoV-2 genomic data plays a crucial role in surveillance and in adopting measures to control the spread of COVID-19. Fast, inclusive and adaptive methods are required for the heterogeneous SARS-CoV-2 sequence data generated at an unprecedented rate. Results: We present an updated version of the SARS-CoV-2 analysis module of our automated computational pipeline, Infectious Pathogen Detector (IPD) 2.0, to perform genomic analysis to understand the variability and dynamics of the virus. It adopts the recent clade nomenclature and demonstrates a clade prediction accuracy of 92.8%. IPD 2.0 also contains a SARS-CoV-2 updater module, allowing automatic upgrading of the variant database using genome sequences from GISAID. As a proof of principle, analyzing 208,911 SARS-CoV-2 genome sequences, we generate an extensive database of 2.58 million sample-wise variants. A comparative account of lineage-specific mutations in the newer SARS-CoV-2 strains emerging in the UK, South Africa and Brazil, together with data reported from India, identifies overlapping and lineage-specific acquired mutations, suggesting repetitive convergent and adaptive evolution. Conclusions: The novel and dynamic features of the SARS-CoV-2 module of IPD 2.0 make it a contemporary tool for analyzing the diverse and growing genomic strains of the virus, and a vital tool to help facilitate rapid genomic surveillance in a population to identify variants involved in breakthrough infections. IPD 2.0 is freely available from http://www.actrec.gov.in/pi-webpages/AmitDutt/IPD/IPD.html and the web application is available at http://ipd.actrec.gov.in/ipdweb/.
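The abstract above reports a clade prediction accuracy of 92.8% without describing how it was measured. A minimal sketch follows, assuming accuracy is the fraction of genomes whose predicted clade matches a reference annotation. The function name and the clade labels are hypothetical placeholders, not values from the study.

```python
def clade_accuracy(predicted: list, reference: list) -> float:
    """Fraction of samples whose predicted clade equals the reference clade."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not predicted:
        raise ValueError("at least one sample is required")
    # Count exact label matches, sample by sample.
    matches = sum(1 for p, r in zip(predicted, reference) if p == r)
    return matches / len(predicted)


# Placeholder clade labels for four samples, for illustration only.
predicted_clades = ["cladeA", "cladeB", "cladeB", "cladeC"]
reference_clades = ["cladeA", "cladeB", "cladeC", "cladeC"]
accuracy = clade_accuracy(predicted_clades, reference_clades)
```

Under this definition, a reported accuracy of 92.8% would mean that roughly 928 of every 1000 evaluated genomes received the same clade label as the reference annotation.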