This paper presents Tebaldi, a distributed key-value store that explores new ways to harness the performance opportunity of combining different specialized concurrency control mechanisms (CCs) within the same database. Tebaldi partitions conflicts at a fine granularity and matches them to specialized CCs within a hierarchical framework that is modular, extensible, and able to support a wide variety of concurrency control techniques, from single-version to multiversion and from lock-based to timestamp-based. When running the TPC-C benchmark, Tebaldi yields more than 20× the throughput of the basic two-phase locking protocol, and over 3.7× the throughput of Callas, a recent system that, like Tebaldi, aims to combine different CCs.
To minimize user-perceived latency while ensuring high data availability, cloud applications need to select servers from one of multiple data centers (i.e., server clusters) in different geographical locations that can provide the desired services with low latency and low cost. This paper presents CloudGPS, a new server selection scheme for the cloud computing environment that achieves high scalability and ISP-friendliness. CloudGPS proposes a configurable global performance function that allows Internet service providers (ISPs) and cloud service providers (CSPs) to balance the cost in terms of inter-domain transit traffic against the quality of service in terms of network latency. CloudGPS bounds the overall burden to be linear in the number of end users. Moreover, compared with traditional approaches, CloudGPS significantly reduces the network distance measurement cost (i.e., from O(N) to O(1) for each end user in an application using N data centers). Furthermore, CloudGPS achieves ISP-friendliness by significantly decreasing inter-domain transit traffic.
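As a rough illustration of what a configurable performance function could look like, the sketch below scores each data center as a weighted sum of latency and transit cost and picks the lowest score. The weighted-sum form, the weight `alpha`, the function name `select_data_center`, and the assumption that both inputs are already normalized to [0, 1] are illustrative choices only; the abstract does not specify the actual function used by CloudGPS.

```python
def select_data_center(candidates, alpha):
    """Pick the data center with the lowest combined score.

    candidates: dict mapping a data center name to a tuple
                (normalized_latency, normalized_transit_cost), both in [0, 1].
    alpha:      hypothetical weight in [0, 1]; 1.0 considers latency only,
                0.0 considers inter-domain transit cost only.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")

    def score(name):
        latency, transit_cost = candidates[name]
        return alpha * latency + (1.0 - alpha) * transit_cost

    # min() returns the name whose weighted score is smallest.
    return min(candidates, key=score)


# Hypothetical, already-normalized measurements for two data centers.
example_candidates = {
    "dc_a": (0.2, 0.9),  # close to the user, but expensive transit
    "dc_b": (0.5, 0.1),  # farther away, but cheap transit
}
latency_first = select_data_center(example_candidates, alpha=1.0)
cost_first = select_data_center(example_candidates, alpha=0.0)
```

Moving `alpha` between 0 and 1 shifts the choice from the data center that is cheapest for the ISP toward the one that is closest to the user, which is the kind of trade-off the abstract describes.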
We propose a scheme to schedule the transmission of data center traffic to guarantee a transmission rate for long flows without affecting the rapid transmission required by short flows. We call the proposed scheme Deadline-Aware Queue (DAQ). The traffic of a data center can be broadly classified into long and short flows, where the terms long and short refer to the amount of data to be transmitted. In a data center, long flows require modest transmission rates to sustain maintenance, data updates, and functional operation. Short flows require either fast service or service within a tight deadline. The bandwidth demands of both classes must be satisfied. DAQ uses per-class queues at supporting switches, keeps minimal flow state information, and uses simple but effective flow control. The credit-based flow control, employed between switch and data sources, ensures lossless transmission. We study the performance of DAQ and compare it to that of other existing schemes. The results show that the proposed scheme improves the achievable throughput for long flows by up to 37% and the application throughput for short flows by up to 33% when compared to other schemes. DAQ guarantees a minimum throughput for long flows despite the presence of heavy loads of short flows.
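The general idea of credit-based flow control between a source and a switch can be sketched as follows. This is a generic, single-queue illustration and not DAQ's actual mechanism: the class name `CreditedSwitchQueue`, its methods, and the one-credit-per-packet accounting are assumptions made for the example. The source may send only while it holds a credit, and the switch returns a credit each time it forwards a packet, so the buffer can never overflow.

```python
from collections import deque


class CreditedSwitchQueue:
    """Toy model of one switch queue fed by one credit-limited source."""

    def __init__(self, capacity):
        self.capacity = capacity   # buffer slots at the switch
        self.buffer = deque()      # packets waiting at the switch
        self.credits = capacity    # credits currently held by the source

    def source_send(self, packet):
        """Source side: transmit only if a credit is available."""
        if self.credits == 0:
            return False           # source must wait; nothing is dropped
        self.credits -= 1
        self.buffer.append(packet)
        return True

    def switch_forward(self):
        """Switch side: forward one packet and return a credit to the source."""
        if not self.buffer:
            return None
        packet = self.buffer.popleft()
        self.credits += 1
        return packet


queue = CreditedSwitchQueue(capacity=2)
sent = [queue.source_send(p) for p in ("a", "b", "c")]  # third send is held back
forwarded = queue.switch_forward()                      # frees one slot
retry = queue.source_send("c")                          # now succeeds
```

Because a packet is admitted only when a buffer slot is known to be free, losslessness follows from the invariant that credits plus buffered packets always equal the capacity.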
To explore the status of the field of educational big data, 476 related articles from the Web of Science Core Collection were analyzed with CiteSpace V. The results show that: (1) research on educational big data started in 2012 and shows an upward trend year by year from 2014 to 2018, with the number of publications reaching its historical peak in 2018. Williamson and Huda, the USA and China, and Educational Sciences: Theory & Practice and Agro Food Industry Hi-Tech top the lists of contributing authors, countries, and publications, respectively. (2) “eHealth research”, “current state”, “corporate use”, “data entry”, “analytics discipline”, and “psychological language” are the six largest clusters in the field of educational big data from 2012 to 2019.
Computing the shortest-path distances between nodes is a key problem in analyzing social graphs. Traditional methods like breadth-first search (BFS) do not scale well with graph size. Recently, a Graph Coordinate System, called Orion, has been proposed to estimate shortest-path distances in a scalable way. Orion uses a landmark-based approach, which does not take account of the shortest-path distances between non-landmark nodes in coordinate calculation. Such biased input for the coordinate system cannot characterize the graph structure well. In this paper, we propose Pomelo, which calculates the graph coordinates in a decentralized manner. Every node in Pomelo computes its shortest-path distances to both nearby neighbors and some random distant neighbors. By introducing the novel partial BFS, the computational overhead of Pomelo is tunable. Our experimental results from different representative social graphs show that Pomelo greatly outperforms Orion in estimation accuracy while maintaining the same computational overhead.
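One plausible reading of a "partial BFS" is a breadth-first search that stops expanding once a hop limit is reached, so the cost per node can be tuned through that limit. The sketch below follows this reading; the function name `partial_bfs`, the `max_hops` parameter, and the adjacency-dict graph representation are assumptions for illustration, and the abstract does not state how Pomelo actually truncates its search.

```python
from collections import deque


def partial_bfs(adjacency, source, max_hops):
    """Return shortest-path hop counts from source to nodes within max_hops.

    adjacency: dict mapping each node to an iterable of its neighbors
               (unweighted graph).
    max_hops:  hypothetical tuning knob; larger values explore more of the
               graph at higher computational cost.
    """
    distances = {source: 0}
    frontier = deque([source])
    while frontier:
        node = frontier.popleft()
        if distances[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in adjacency.get(node, ()):
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                frontier.append(neighbor)
    return distances


# A small path graph 0 - 1 - 2 - 3 - 4 used as a toy social graph.
path_graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
nearby = partial_bfs(path_graph, source=0, max_hops=2)
```

With `max_hops=2`, only nodes 0, 1, and 2 are reached from node 0; distances to more distant nodes would have to come from another source, such as the random distant neighbors the abstract mentions.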