SUMMARY
Data mining is increasingly used in biology. Biologists are adopting prototyping languages, such as R and Matlab, to apply data mining algorithms to their data. As a result, their scripts are becoming increasingly complex and require frequent updates. Application to large datasets becomes impractical, and the time-to-paper increases. Furthermore, although various systems can process large datasets efficiently, for example using Cloud and High Performance Computing, they usually require procedures to be translated into specific languages or adapted to a particular computing platform. Such modifications can speed up processing, but translation is not automatic, especially in complex cases, and can require substantial programming effort and careful validation. In this paper, we propose an approach to parallelize data mining procedures, in the form of compiled software or R scripts, developed by biology communities of practice. Our approach requires minimal alteration of the original code, and in many cases no modification at all. Furthermore, it allows for fast updating when a new version is ready. We clarify the constraints and benefits of our method and report a practical use case that demonstrates these benefits compared with standard execution. Our approach relies on a distributed network of web services and ultimately exposes the algorithms as-a-Service, to be invoked by remote thin clients.
Cloud Computing is a new computing paradigm. Among the many challenges in this field, two are considered especially relevant: SLA management and security management. The level of trust in this context is very hard to define and is strictly related to the problem of managing SLAs across cloud applications and providers. In this paper, we aim to show how a cloud-oriented API derived from the mOSAIC project can be used to build an SLA-oriented cloud application that enables the management of security features related to user authentication and authorization to an Infrastructure as a Service (IaaS) Cloud Provider. As the Cloud Provider, we adopt the perf-Cloud solution, which uses GRID-based solutions for security management and service delivery. The proposed solution can therefore be used to easily build an SLA-based interface for any GRID system.