Big Data is now widely used in many organizations. Hive is an open-source data warehouse system for managing large data sets. It provides a SQL-like interface to Hadoop on top of the MapReduce framework. Big Data solutions are beginning to adopt HiveQL to improve the execution time of queries over relational data. In this paper, we investigate query execution time by comparing two compression algorithms for ORC files: ZLIB and SNAPPY. The results show that ZLIB compresses data by up to 87% compared with uncompressed (NONE) data, which is better than SNAPPY, whose space saving is 79%. However, the key to reducing execution time is MapReduce: query execution time was lower when the number of mappers equaled the number of data nodes. For example, all query suites on 6 nodes (ZLIB/SNAPPY) with a 250-million-row table had execution times quite similar to those on 9 nodes (ZLIB/SNAPPY) with a 350-million-row table.
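The space-saving figures quoted above can be read as one minus the ratio of compressed to uncompressed size. The following is a minimal sketch of that arithmetic, assuming this is the definition the authors use; the function name `space_saving` and the table sizes are hypothetical and chosen only to reproduce the 87% and 79% figures, not taken from the paper's data.

```python
def space_saving(uncompressed_size, compressed_size):
    """Return the fraction of storage saved by compression.

    Assumed definition: 1 - compressed / uncompressed.
    """
    if uncompressed_size <= 0:
        raise ValueError("uncompressed_size must be positive")
    if compressed_size < 0:
        raise ValueError("compressed_size must be non-negative")
    return 1.0 - compressed_size / uncompressed_size


# Hypothetical sizes in GB, picked only to illustrate the reported percentages.
none_size = 1000    # ORC table stored with compression NONE
zlib_size = 130     # same table stored with ZLIB
snappy_size = 210   # same table stored with SNAPPY

print(f"ZLIB space saving:   {space_saving(none_size, zlib_size):.0%}")
print(f"SNAPPY space saving: {space_saving(none_size, snappy_size):.0%}")
```

Under this definition, a higher value means a smaller table on disk, which is why ZLIB ranks above SNAPPY on storage even though the abstract attributes execution-time gains mainly to the mapper and data node configuration.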
Docker Engine is an extremely powerful tool for PaaS cloud computing platforms, and it benefits large-scale Internet services. Web service is a basic service for everyone who needs Internet access, so web infrastructure must be scalable, with load-balanced web servers behind a reverse proxy. A large-scale web deployment requires multiple web servers working together over high-speed bandwidth. Moreover, multiple clusters can reside in the same data center, and each cluster's service must be assigned a priority and a quality level. We investigate load balancing over link aggregation with network QoS, using the pipework script and a traffic control tool on the frontend reverse proxy server of each cluster. Our research evaluates network QoS ratios of 50/50, 60/40, 70/30, and 80/20, and we compare network bandwidth between the two web reverse proxy clusters. The results show that our design and implementation can control network QoS on each web reverse proxy cluster in all load-balancing link aggregation modes (round-robin, XOR, and ALB), and that the clusters can access multiple network interfaces. In the experiment, average network bandwidth in all QoS cases is around 200 MB per second for link aggregation of two gigabit interfaces.
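To make the QoS ratios concrete, the sketch below splits an aggregate bandwidth between two clusters according to each ratio. It assumes the ratio is applied to the roughly 200 MB per second aggregate reported above; the function name `split_bandwidth` is hypothetical, and the per-cluster numbers it prints are simple arithmetic, not measurements from the experiment.

```python
def split_bandwidth(total_mb_per_s, ratio):
    """Split an aggregate bandwidth between two clusters by a QoS ratio.

    ratio is a pair such as (70, 30); the result is the bandwidth share
    of each cluster in the same unit as total_mb_per_s.
    """
    first, second = ratio
    if first < 0 or second < 0 or first + second == 0:
        raise ValueError("ratio parts must be non-negative and not both zero")
    whole = first + second
    return (total_mb_per_s * first / whole, total_mb_per_s * second / whole)


# Aggregate taken from the abstract's reported average (about 200 MB/s).
AGGREGATE_MB_PER_S = 200

for ratio in [(50, 50), (60, 40), (70, 30), (80, 20)]:
    cluster_a, cluster_b = split_bandwidth(AGGREGATE_MB_PER_S, ratio)
    print(f"{ratio[0]}/{ratio[1]}: cluster A {cluster_a:.0f} MB/s, "
          f"cluster B {cluster_b:.0f} MB/s")
```

In a real deployment these target rates would be enforced by the traffic control tool on each frontend reverse proxy; the sketch only shows how a ratio maps to per-cluster targets.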