This article presents an efficient algorithm for DNA sequence compression, which achieves the best compression ratios reported over a test set commonly used for evaluating DNA compression programs. The algorithm introduces many refinements to a compression method that combines three components: (1) encoding by a simple normalized maximum likelihood (NML) model for discrete regression, through reference to preceding approximate matching blocks, (2) encoding by first-order context coding, and (3) representing strings in the clear. Together these make efficient use of the redundancy sources in DNA data while keeping execution times low. One of the main algorithmic features is the constraint that matching blocks include reasonably long contiguous matches, which not only significantly reduces the search time but can also be used to modify the NML model so that it exploits the constraint to obtain smaller code lengths. The algorithm handles the changing statistics of DNA data adaptively, and by predictively encoding the matching pointers it succeeds in compressing long approximate matches. Apart from a comparison with previous DNA encoding methods, we present compression results for the recently published human genome data.
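To make components (2) and (3) above concrete, the following is a minimal sketch, not the authors' implementation. It estimates the ideal code length of a DNA string under an adaptive first-order context model and compares it with the 2 bits per base needed to send the string in the clear. The function names, the add-one (Laplace) initial counts, and the choice to send the first base in the clear are all assumptions made for illustration; the NML component and the matching-block search are not modelled here.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"  # assumption: input contains only these four symbols


def order1_code_length(seq: str) -> float:
    """Ideal adaptive code length, in bits, under a first-order context model.

    Each base is coded with probability count(prev, cur) / count(prev, *),
    where counts start at 1 (Laplace smoothing, an illustrative choice)
    and are updated after every symbol.
    """
    counts = defaultdict(lambda: {b: 1 for b in ALPHABET})
    bits = 2.0 if seq else 0.0  # first base has no context: sent in the clear
    for prev, cur in zip(seq, seq[1:]):
        ctx = counts[prev]
        total = sum(ctx.values())
        bits += -math.log2(ctx[cur] / total)
        ctx[cur] += 1  # adapt to the statistics seen so far
    return bits


def clear_code_length(seq: str) -> float:
    """Cost of writing every base with a fixed 2-bit code."""
    return 2.0 * len(seq)


if __name__ == "__main__":
    sample = "ACGT" * 50  # highly regular toy input, not real genome data
    print(order1_code_length(sample), clear_code_length(sample))
```

On a highly regular toy string the context model needs far fewer bits than the fixed 2-bit code, whereas on a string with no first-order structure it can cost slightly more. That trade-off is one reason a compressor might choose between several encodings block by block.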
Leiomyosarcoma is one of the most common mesenchymal tumors. Proteomics profiling analysis by reverse-phase protein lysate array surprisingly revealed that expression of the epithelial marker E-cadherin (encoded by CDH1) was significantly elevated in a subset of leiomyosarcomas. In contrast, E-cadherin was rarely expressed in the gastrointestinal stromal tumors, another major mesenchymal tumor type. We further sought to 1) validate this finding, 2) determine whether there is a mesenchymal to epithelial reverting transition (MErT) in leiomyosarcoma, and if so 3) elucidate the regulatory mechanism responsible for this MErT. Our data showed that the epithelial cell markers E-cadherin, epithelial membrane antigen, cytokeratin AE1/AE3, and pan-cytokeratin were often detected immunohistochemically in leiomyosarcoma tumor cells on tissue microarray. Interestingly, the E-cadherin protein expression was correlated with better survival in leiomyosarcoma patients. Whole genome microarray was used for transcriptomics analysis, and the epithelial gene expression signature was also associated with better survival. Bioinformatics analysis of transcriptome data showed an inverse correlation between E-cadherin and E-cadherin repressor Slug (SNAI2) expression in leiomyosarcoma, and this inverse correlation was vali-

The adhesion protein E-cadherin, encoded by CDH1, plays a central part in the process of epithelial morphogenesis. The down-regulation of E-cadherin is associated with a process called epithelial to mesenchymal transition (EMT) that accounts for increased invasion and metastasis during tumor progression in multiple carcinomas of epithelial origin (1-5). Altered expression of E-cadherin has been shown to be regulated through several transcriptional factors such as Snail, SIP1, Twist, and Slug (encoded by SNAI2) and non-coding RNA such as miR-200 and let-7 (3-9).
EMT of epithelial cancer cells is characterized by acquisition of fibroblast-like properties with reduced intercellular adhesion and increased motility in vitro as well as metastasis (2, 5, 10). Recently, a similar but reverse process called mesenchymal to epithelial reverting transition (MErT) has been observed and reported (11). E-cadherin is also a key indicator of MErT during the metastatic seeding of disseminated carcinomas (11). In synovial sarcoma, the fusion protein SYT-SSX1 has been shown to induce MErT through the regulation of SNAI1 and Slug (12). These investigations suggest that MErT might be an important biological process for tumors of mesenchymal origin, although E-cadherin expression is infrequent in most sarcomas examined thus far (13).

Leiomyosarcomas and gastrointestinal stromal tumors (GISTs), two of the most common mesenchymal tumors, share remarkably similar phenotypic features but are molec
The progression of gliomas has been extensively studied at the genomic level using cDNA microarrays. However, systematic examinations at the protein translational and post-translational levels are far more limited. We constructed a glioma protein lysate array from 82 different primary glioma tissues, and surveyed the expression and phosphorylation of 46 different proteins involved in signaling pathways of cell proliferation, cell survival, apoptosis, angiogenesis, and cell invasion. An analysis algorithm was employed to robustly estimate the protein expressions in these samples. When ranked by their discriminating power to separate 37 glioblastomas (high-grade gliomas) from 45 lower-grade gliomas, the following 12 proteins were identified as the most powerful discriminators: IκBα, EGFRpTyr845, AKTpThr308, phosphatidylinositol 3-kinase (PI3K), BadpSer136, insulin-like growth factor binding protein (IGFBP) 2, IGFBP5, matrix metalloproteinase 9 (MMP9), vascular endothelial growth factor (VEGF), phosphorylated retinoblastoma protein (pRB), Bcl-2, and c-Abl. Clustering analysis showed a close link between PI3K and AKTpThr308, IGFBP5 and IGFBP2, and IκBα and EGFRpTyr845. Another cluster includes MMP9, Bcl-2, VEGF, and pRB. These clustering patterns may suggest functional relationships, which warrant further investigation. The marked association of phosphorylation of AKT at Thr308, but not Ser473, with glioblastoma suggests a specific event of PI3K pathway activation in glioma progression.
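The abstract does not say which statistic was used to rank proteins by discriminating power. As one plausible illustration only, the sketch below ranks markers by the absolute Welch t-statistic between two groups. The function names, the choice of statistic, and the toy numbers are all assumptions and are not taken from the study.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for two samples with possibly unequal variances.

    Raises ZeroDivisionError if both samples have zero variance.
    """
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def rank_by_discrimination(high, low):
    """Return marker names sorted from most to least discriminating.

    `high` and `low` map a marker name to its measured values in the
    high-grade and lower-grade groups (hypothetical input layout).
    """
    scores = {name: abs(welch_t(high[name], low[name])) for name in high}
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    # Synthetic values for two made-up markers, for illustration only.
    high = {"marker_a": [5.0, 5.2, 4.9, 5.1], "marker_b": [1.0, 2.0, 1.5, 1.2]}
    low = {"marker_a": [1.0, 1.1, 0.9, 1.2], "marker_b": [1.1, 1.9, 1.4, 1.3]}
    print(rank_by_discrimination(high, low))
```

A rank-based or permutation statistic could be substituted without changing the structure of the ranking step.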