The concept of a pan-genome, the collection of all genomes from a population, has shown great potential in genomic studies, especially in crop science. The rice pan-genome constructed from second-generation sequencing (SGS) data is about 270 Mb larger than Nipponbare, the rice reference genome (NipRG), but it still suffers from incompleteness and loss of genomic context. Third-generation sequencing (TGS) with long reads can help to construct better pan-genomes. In this paper, we report a method for constructing a high-quality rice pan-genome that introduces a series of new steps for handling long-read data, including unmapped sequence block filtering, redundancy removal, and sequence block elongation. The long-read-based pan-genome constructed from 105 rice accessions contains 604 Mb of novel sequences relative to NipRG and is much more comprehensive than the one constructed from ~3000 rice genomes sequenced with short reads. Repetitive sequences are the main component of the novel sequences, which partially explains the differences between the TGS-based and SGS-based pan-genomes. With the addition of 6 wild rice accessions, the rice pan-genome contains about 879 Mb of novel sequences and 19,000 novel genes in total. In addition, we have created high-quality reference genomes for all representative rice populations, including 5 gapless reference genomes. This study significantly advances our understanding of the rice pan-genome, and the pan-genome construction method for long-read data can be applied to accelerate a broad range of genomics studies.
Pan-genomic studies might improve the completeness of the human reference genome (GRCh38) and promote precision medicine. Here, we use an automated pipeline for human pan-genomic analysis to build a gastric cancer pan-genome from deep sequencing data of 185 sample pairs (370 samples), and we characterize gene presence-absence variations (PAVs) at the whole-genome level. The genes ACOT1, GSTM1, SIGLEC14 and UGT2B17 are identified as highly absent in the gastric cancer population. A set of genes is predicted from sequences that do not align to GRCh38. We successfully locate one of the predicted genes, GC0643, on chromosome 9q34.2. Overexpression of GC0643 significantly inhibits cell growth, cell migration and invasion, and cell cycle progression, and induces apoptosis in cancer cells. These tumor suppressor functions can be reversed by shGC0643 knockdown. GC0643 has been accepted by the NCBI database (GenBank: MW194843.1). Collectively, the robust pan-genome strategy provides a deeper understanding of gene PAVs in the human cancer genome.
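As an illustration of what "highly absent" can mean in a gene PAV analysis, the sketch below computes the fraction of samples in which each gene is absent and keeps genes above a cutoff. It is a minimal sketch under stated assumptions, not the pipeline described in the abstract: the function names, the 0/1 matrix layout, the 0.5 cutoff, and the toy gene labels are all hypothetical.

```python
def absence_frequency(pav):
    """Return the fraction of samples in which each gene is absent.

    pav maps a gene name to a list of presence calls, one per sample
    (1 = present, 0 = absent). This layout is an assumption made for
    illustration only.
    """
    return {
        gene: calls.count(0) / len(calls)
        for gene, calls in pav.items()
        if calls  # skip genes with no calls to avoid division by zero
    }


def highly_absent(pav, threshold=0.5):
    """List genes whose absence frequency is at least `threshold`.

    The default threshold is an arbitrary illustrative value.
    """
    freq = absence_frequency(pav)
    return sorted(gene for gene, f in freq.items() if f >= threshold)


# Toy, made-up presence/absence calls for four samples.
toy_pav = {
    "geneA": [1, 1, 1, 1],
    "geneB": [0, 0, 1, 0],
    "geneC": [0, 1, 1, 1],
}

print(absence_frequency(toy_pav))
print(highly_absent(toy_pav))
```

In practice the cutoff would depend on cohort size and on how presence is called from read coverage, neither of which is specified in the abstract.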
Background: In cancer cells, fusion genes can produce linear and chimeric fusion-circular RNAs (f-circRNAs), which are functional in gene expression regulation and implicated in malignant transformation, cancer progression, and therapeutic resistance. For specific cancers, proteins encoded by fusion transcripts have been identified as innovative therapeutic targets (e.g., EML4-ALK). Even though RNA sequencing (RNA-Seq) technologies combined with existing bioinformatics approaches have enabled researchers to systematically identify fusion transcripts, specifically detecting f-circRNAs remains challenging, owing both to their general sparsity and low abundance in cancer cells and to imperfect computational methods. Results: We developed the Python-based workflow “Fcirc” to identify linear fusion RNAs and f-circRNAs from RNA-Seq data with high specificity. We applied Fcirc to 3 different types of RNA-Seq data: (i) actual synthetic spike-in RNA-Seq data, (ii) simulated RNA-Seq data, and (iii) actual cancer cell–derived RNA-Seq data. Fcirc showed significant advantages over existing methods in both detection accuracy (i.e., precision, recall, F-measure) and computing performance (i.e., lower runtimes). Conclusion: Fcirc is a powerful and comprehensive Python-based pipeline to identify linear and circular RNA transcripts from known fusion events in RNA-Seq datasets with higher accuracy and shorter computing times than previously published algorithms. Fcirc empowers the research community to study the biology of fusion RNAs in cancer more effectively.
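The accuracy measures named in this abstract (precision, recall, F-measure) follow standard definitions, which can be sketched as below. This is a generic illustration, not code from Fcirc: the function name, the representation of detections as sets of identifiers, and the toy labels are assumptions.

```python
def detection_metrics(predicted, truth):
    """Compute precision, recall, and F-measure (F1) for a detection task.

    predicted: iterable of identifiers reported by a tool
    truth:     iterable of identifiers known to be real
    Both are treated as sets; identifiers here are hypothetical labels.
    """
    predicted, truth = set(predicted), set(truth)
    tp = len(predicted & truth)   # reported and real
    fp = len(predicted - truth)   # reported but not real
    fn = len(truth - predicted)   # real but missed

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    return precision, recall, f_measure


# Toy example with made-up identifiers: 2 true positives,
# 1 false positive, 2 false negatives.
p, r, f = detection_metrics({"A", "B", "C"}, {"A", "B", "D", "E"})
print(f"precision={p:.3f} recall={r:.3f} F={f:.3f}")
```

Reporting the F-measure alongside precision and recall is useful because a tool can trade one for the other; the harmonic mean penalizes a large gap between the two.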
A high serine content in body fluid was identified in a portion of patients with gastric cancer, but its biological significance was not clear. Here, we investigated the biological effect of serine on gastric cancer cells. Serine was added to the culture medium of MGC803 and HGC27 cancer cells, and its influence on multiple biological functions, such as cell growth, migration and invasion, and drug resistance, was analyzed. We examined the global transcriptomic profiles of these cells cultured with high serine content. Both the MGC803 and HGC27 cell lines originated from male patients; however, their basal gene expression patterns were very different. The finding of the cell differentiation-associated genes ALPI, KRT18, TM4SF1, KRT81, A2M, MT1E, MUC16, BASP1, TUSC3, and PRSS21 in MGC803 cells suggested that this cell line was more poorly differentiated than the HGC27 cell line. When the serine concentration in the medium was increased to 150 mg/ml, the two gastric cancer cell lines responded differently, particularly in cell growth, cell migration and invasion, and 5-FU resistance. In an animal experiment, administration of a high concentration of serine promoted cancer cell metastasis to local lymph nodes. Taken together, we characterized the basal gene expression profiles of MGC803 and HGC27. The HGC27 cells were more differentiated than the MGC803 cells, and the MGC803 cells were more sensitive to changes in serine content. Our results suggest that the responsiveness of cancer cells to microenvironmental change is associated with their genetic background.