The explosion in data quantities, reflected in the scaling of volumes, numbers, and types, has driven the development of new techniques for locating and accessing data. The latest steps in this evolution have given rise to new technologies: cloud computing and big data. The new requirements and the difficulties encountered in managing data classified as “big data” have in turn given rise to NoSQL and NewSQL systems. This paper presents a comparative study of the performance of six NoSQL solutions used by major companies in the IT sector: MongoDB, Cassandra, HBase, Redis, Couchbase, and OrientDB. To compare the performance of these NoSQL systems, the authors use the Yahoo! Cloud Serving Benchmark (YCSB). The contribution is to provide some guidance for choosing the appropriate NoSQL system for the type of data used and the type of processing performed on that data.
NoSQL databases are new architectures developed to remedy the weaknesses that relational databases exhibit in highly distributed systems such as cloud computing, social networks, and electronic commerce. Several companies that have relied on traditional relational SQL databases for decades are seeking to switch to the new “NoSQL” databases to meet the new requirements related to the change of scale in data volume, increased load, the diversity of data types handled, and geographic distribution. This paper presents a comparative study in which the authors evaluate the performance of two widely used databases: MySQL as a relational database and MongoDB as a NoSQL database. To carry out this comparison, the research uses the Yahoo! Cloud Serving Benchmark (YCSB). The contribution is to provide some guidance for choosing the appropriate database management system for the type of data used and the type of processing performed on that data.
The technological revolution, which integrates multiple information sources and extends computer science into different sectors, has led to an explosion in data quantities, reflected in the scaling of volumes, numbers, and types. These massive increases have driven the development of new techniques for locating and accessing data. The latest steps in this evolution have given rise to new technologies: Cloud and Big Data. The reference implementation for Cloud and Big Data storage is incontestably the Hadoop Distributed File System (HDFS). HDFS is based on the separation of metadata from data, which consists of centralizing the metadata and isolating it from the storage servers. In this paper, the authors propose an approach to improve the metadata service for Hadoop, maintaining consistency without greatly compromising the performance and scalability of metadata, by suggesting a mixed solution between centralization and distribution of metadata.