Originally motivated by the need for research reproducibility and data reuse, large-scale, open-access information repositories have become key resources for training and testing advanced machine learning applications in biomedical and clinical research. To be of value, such repositories must provide large, high-quality data sets, where quality is defined as minimising variance due to data collection protocols and data misrepresentations. Curation is the key to quality. We have constructed a large public-access image repository, The Cancer Imaging Archive, dedicated to the promotion of open science to advance the global effort to diagnose and treat cancer. Drawing on this experience and on our experience in applying machine learning techniques to the analysis of radiology and pathology image data, we review the requirements that state-of-the-art machine learning applications place on such information repositories and how these requirements can be met.
PURPOSE Precision medicine requires an understanding of individual variability, which can only be acquired from large data collections such as those supported by The Cancer Imaging Archive (TCIA). We have undertaken a program to extend the types of data TCIA can support. This, in turn, will enable TCIA to play a key role in precision medicine research by collecting and disseminating high-quality, state-of-the-art, quantitative imaging data that meet the evolving needs of the cancer research community. METHODS A modular technology platform is presented that would allow existing data resources, such as TCIA, to evolve into a comprehensive data resource that meets the needs of users engaged in translational research for imaging-based precision medicine. This Platform for Imaging in Precision Medicine (PRISM) helps streamline deployment and improve TCIA’s efficiency and sustainability. More importantly, its inherently modular architecture facilitates piecemeal adoption by other data repositories. RESULTS PRISM includes services for managing radiology and pathology images, image features, and associated clinical data. A semantic layer is being built to help users explore diverse collections and pool data sets to create specialized cohorts. PRISM includes tools for image curation and de-identification, as well as tools for image visualization and feature exploration. The entire platform is distributed as a series of containerized microservices with representational state transfer interfaces. CONCLUSION PRISM is helping modernize, scale, and sustain the technology stack that powers TCIA. Repositories can take advantage of individual PRISM services such as de-identification and quality control. PRISM is helping scale image informatics for cancer research at a time when the size, complexity, and demands to integrate image data with other precision medicine data-intensive commons are mounting.
Cloud computing research involves a very large number of entities, such as users, applications, and virtual machines. Because access to such resources is limited and their availability often varies, researchers test their prototypes in simulation environments rather than in real cloud environments. Existing cloud simulation environments such as CloudSim and EmuSim execute sequentially; a more advanced cloud simulation tool could be built by extending them, leveraging recent technologies as well as the multi-core computers and clusters available in research laboratories. This research seeks to develop Cloud2Sim, a concurrent and distributed cloud simulator that extends CloudSim and uses features provided by Hazelcast, Infinispan, and Hibernate Search to distribute the storage and execution of the simulation.
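As a rough illustration of the idea of distributing a simulation's execution, the sketch below partitions a set of simulated workloads across concurrent workers and merges the partial results. It is not Cloud2Sim's code or API: the names `Cloudlet`, `partition`, `simulate_partition`, and `run_distributed` are hypothetical, the execution-time model (length divided by VM speed) is a simplifying assumption, and threads in a single process stand in for the cluster nodes that a real deployment would reach through a data grid such as Hazelcast or Infinispan.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Cloudlet:
    """Hypothetical workload unit: an id and a length in million instructions."""
    cloudlet_id: int
    length_mi: int


def partition(items, n_parts):
    """Split items into n_parts round-robin slices, one per simulated worker."""
    return [items[i::n_parts] for i in range(n_parts)]


def simulate_partition(cloudlets, vm_mips):
    """Execution time of each cloudlet, assuming it runs alone on one VM."""
    return {c.cloudlet_id: c.length_mi / vm_mips for c in cloudlets}


def run_distributed(cloudlets, n_workers, vm_mips=1000):
    """Simulate every partition concurrently and merge the partial results."""
    parts = partition(cloudlets, n_workers)
    merged = {}
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Each worker handles one partition; threads stand in for cluster nodes.
        for partial in pool.map(lambda p: simulate_partition(p, vm_mips), parts):
            merged.update(partial)
    return merged


cloudlets = [Cloudlet(i, 1000 * (i + 1)) for i in range(8)]
results = run_distributed(cloudlets, n_workers=4)
```

Because the partitions share no state, the merged result does not depend on the number of workers; a sequential run (`n_workers=1`) gives the same mapping, which is a convenient correctness check for this kind of decomposition.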