This paper performs de-duplication to improve storage optimization by exploiting similarity in mutual information. Its first contribution is a hybrid fingerprint extraction scheme that uses the SH and HC algorithms. Second, the data is clustered using a recent technique called SOMI-GO to extract the metadata. The extracted metadata is stored in a metadata server, which provides better storage optimization and de-duplication. SOMI-GO is adopted because it provides maximum second-order mutual information based on the similarity index. The proposed SOMI-GO technique is compared with existing methods such as K-means, K-mode, ED-PSO, ED-GA and ED-GWO in terms of accuracy, true positive rate (TPR), true negative rate (TNR) and performance time, and the significance of the SOMI-GO method is described.
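To make the role of fingerprints and a metadata index concrete, the following is a minimal sketch of exact-match de-duplication. It is not the paper's method: the abstract does not spell out the SH and HC algorithms or the SOMI-GO clustering step, so SHA-256 is swapped in as the fingerprint function and a plain in-memory dictionary stands in for the metadata server. The names `fingerprint`, `deduplicate`, `store` and `recipe` are hypothetical.

```python
import hashlib
from typing import Dict, List, Tuple


def fingerprint(chunk: bytes) -> str:
    """Return a fingerprint for a chunk (assumption: SHA-256 as a stand-in)."""
    return hashlib.sha256(chunk).hexdigest()


def deduplicate(chunks: List[bytes]) -> Tuple[Dict[str, bytes], List[str]]:
    """Store each distinct chunk once and record how to rebuild the input.

    store  : fingerprint -> chunk, holding one copy per distinct chunk
    recipe : fingerprints in original order (the metadata kept per file)
    """
    store: Dict[str, bytes] = {}
    recipe: List[str] = []
    for chunk in chunks:
        fp = fingerprint(chunk)
        store.setdefault(fp, chunk)  # keep only the first copy of each chunk
        recipe.append(fp)
    return store, recipe


chunks = [b"alpha", b"beta", b"alpha"]
store, recipe = deduplicate(chunks)
restored = b"".join(store[fp] for fp in recipe)
print(len(store), len(recipe), restored == b"".join(chunks))
```

An exact-match index like this only detects identical chunks. A similarity-based approach such as the one the abstract describes would instead group chunks that are close under some similarity measure.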
Scientific workflows perform computations that exceed a single workstation's capabilities. When such data-intensive workflows run in a cloud distributed across several physical locations, the execution time and the resource utilization efficiency depend heavily on the initial placement and distribution of the input datasets across the multiple virtual machines in the cloud. An ideal data placement scheme optimizes the execution of data-intensive scientific workflows in the cloud by assigning tasks to execution sites in such a way that file transfers and their associated cost are reduced. Several data placement strategies for cloud-based scientific workflows are reviewed. A data placement scheme that uses big data to improve performance and reduce data movement cost is studied. BDAP (Big Data Placement strategy) improves workflow performance by minimizing data movement across multiple virtual machines.
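The idea of assigning tasks so that file transfers are reduced can be illustrated with a small greedy sketch. This is not BDAP or any of the reviewed strategies. It assumes that each dataset already resides on exactly one virtual machine, that the cost of moving a dataset equals its size, and that VM capacity and load can be ignored. All names and numbers below are hypothetical.

```python
from typing import Dict, List


def data_movement(inputs: List[str], vm: str,
                  placement: Dict[str, str], sizes: Dict[str, int]) -> int:
    """Total size of the inputs that must be transferred to `vm` for one task."""
    return sum(sizes[d] for d in inputs if placement[d] != vm)


def assign_tasks(tasks: Dict[str, List[str]], vms: List[str],
                 placement: Dict[str, str], sizes: Dict[str, int]) -> Dict[str, str]:
    """Greedily run each task on the VM that needs the least data moved to it."""
    return {
        name: min(vms, key=lambda vm: data_movement(inputs, vm, placement, sizes))
        for name, inputs in tasks.items()
    }


def total_movement(assignment: Dict[str, str], tasks: Dict[str, List[str]],
                   placement: Dict[str, str], sizes: Dict[str, int]) -> int:
    """Sum of the data moved over all tasks under a given assignment."""
    return sum(data_movement(tasks[t], vm, placement, sizes)
               for t, vm in assignment.items())


sizes = {"a": 10, "b": 4, "c": 6}                    # dataset sizes (arbitrary units)
placement = {"a": "vm1", "b": "vm2", "c": "vm2"}     # where each dataset lives
tasks = {"t1": ["a", "b"], "t2": ["b", "c"]}         # inputs required by each task
vms = ["vm1", "vm2"]

assignment = assign_tasks(tasks, vms, placement, sizes)
print(assignment, total_movement(assignment, tasks, placement, sizes))
```

Here `t1` runs on `vm1` because moving `b` (size 4) is cheaper than moving `a` (size 10), and `t2` runs on `vm2`, where both of its inputs already reside. A realistic strategy would also have to decide the initial dataset placement itself and respect storage and compute limits.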