Cancer genomic, transcriptomic, and proteomic profiling has generated extensive data that necessitate the development of tools for their analysis and dissemination. We developed UALCAN to provide a portal for easily exploring, analyzing, and visualizing these data, allowing users to integrate the data to better understand the genes, proteins, and pathways perturbed in cancer and to make discoveries. The UALCAN web portal enables the analysis and delivery of cancer transcriptome, proteomics, and patient survival data to the cancer research community. With data obtained from The Cancer Genome Atlas (TCGA) project, UALCAN has enabled users to evaluate protein-coding gene expression and its impact on patient survival across 33 types of cancer. The web portal has been used extensively since its release, by cancer researchers in more than 100 countries. The present manuscript describes the work we have undertaken and the updates we have made to UALCAN since its release in 2017. Extensive user feedback motivated us to expand the resource by including data on a) microRNAs (miRNAs), long non-coding RNAs (lncRNAs), and promoter DNA methylation from TCGA and b) mass spectrometry-based proteomics from the Clinical Proteomic Tumor Analysis Consortium (CPTAC). UALCAN provides easy access to pre-computed, tumor subgroup-based gene/protein expression, promoter DNA methylation status, and Kaplan-Meier survival analyses. It also provides new visualization features that help users comprehend and integrate observations and generate hypotheses for testing. UALCAN is accessible at
http://ualcan.path.uab.edu
Abstract: The IDX data format provides efficient, cache-oblivious, and progressive access to large-scale scientific datasets by storing the data in a hierarchical Z (HZ) order. Data stored in IDX format can be visualized in an interactive environment, allowing for meaningful exploration with minimal resources. This technology enables real-time, interactive visualization and analysis of large datasets on a variety of systems, ranging from desktop and laptop computers to portable devices such as iPhones/iPads, and over the web. While the existing ViSUS API for writing IDX data is serial, there are obvious advantages to applying the IDX format to the output of large-scale scientific simulations. We have therefore developed PIDX, a parallel API for writing data in the IDX format. With PIDX it is now possible to generate IDX datasets directly from large-scale scientific simulations, with the added advantage of real-time monitoring and visualization of the generated data. In this paper, we provide an overview of the IDX file format and how it is generated using PIDX. We then present a data model description and a novel aggregation strategy to enhance the scalability of the PIDX library. The S3D combustion application is used as an example to demonstrate the efficacy of PIDX for a real-world scientific simulation. S3D is used for fundamental studies of turbulent combustion requiring exceptionally high-fidelity simulations. PIDX achieves up to 18 GiB/s I/O throughput at 8,192 processes when S3D writes its data in the IDX format. This allows for interactive analysis and visualization of S3D data, thus enabling in situ analysis of the S3D simulation.
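To make the hierarchical Z (HZ) order mentioned above more concrete, the following is a minimal sketch of one commonly described construction: grid coordinates are first interleaved into a Z-order (Morton) index, and that index is then remapped so that coarser resolution levels come first. The function names `morton2d` and `hz_index` are hypothetical and chosen for illustration; this is not the PIDX or ViSUS API, and the actual IDX layout may differ in details such as bit ordering across dimensions.

```python
def morton2d(x: int, y: int, bits: int) -> int:
    """Interleave the low `bits` bits of x and y into a Z-order index.

    Bit i of x goes to bit 2*i, bit i of y goes to bit 2*i + 1.
    (Assumed convention; other interleaving orders are possible.)
    """
    z = 0
    for i in range(bits):
        z |= ((x >> i) & 1) << (2 * i)
        z |= ((y >> i) & 1) << (2 * i + 1)
    return z


def hz_index(z: int, total_bits: int) -> int:
    """Map a Z-order index to a hierarchical Z index (illustrative sketch).

    A sentinel 1 bit is placed just above the index, then the value is
    shifted right until its lowest set bit has been shifted out. Indices
    with more trailing zeros (coarser samples) get smaller HZ indices.
    """
    z |= 1 << total_bits                  # sentinel bit above the index
    trailing = (z & -z).bit_length() - 1  # number of trailing zero bits
    return z >> (trailing + 1)


if __name__ == "__main__":
    # 1D example with 3 index bits: print each Z index with its HZ index.
    for z in range(8):
        print(z, hz_index(z, 3))
```

Under this construction, reading a prefix of the HZ-ordered data yields a coarse version of the whole domain, which is the property that makes progressive, multi-resolution access possible.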