Chandra and Toueg introduced the concept of unreliable failure detectors. They showed how, by adding these detectors to an asynchronous system, it is possible to solve the Consensus problem. In this paper, we propose a new implementation of a failure detector. This implementation is a variant of the heartbeat failure detector which is adaptable and can support scalable applications. In this implementation we dissociate two aspects: a basic estimation of the expected arrival date to provide a short detection time, and an adaptation of the quality of service according to application needs. The latter is based on two principles: an adaptation layer and a heuristic to adapt the sending period of "I am alive" messages.
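As a rough illustration of the two aspects described above, the sketch below estimates the expected arrival time of the next "I am alive" message and adds a per-application safety margin. It is not the authors' implementation. The class name `HeartbeatDetector`, the averaging estimator, the `window` size, and the `margin` parameter are all assumptions chosen for illustration.

```python
from collections import deque


class HeartbeatDetector:
    """Suspects a monitored process when its next heartbeat is overdue.

    Hypothetical sketch: the estimator (mean of recent intervals) and the
    margin handling are assumptions, not the method from the abstract.
    """

    def __init__(self, window=100, margin=0.5):
        self.arrivals = deque(maxlen=window)  # recent arrival timestamps, in seconds
        self.margin = margin                  # default safety margin, in seconds

    def heartbeat(self, now):
        """Record the arrival time of an "I am alive" message."""
        self.arrivals.append(now)

    def expected_arrival(self):
        """Estimate the next arrival as last arrival + mean observed interval."""
        if len(self.arrivals) < 2:
            return None  # not enough history to estimate
        times = list(self.arrivals)
        mean_interval = (times[-1] - times[0]) / (len(times) - 1)
        return times[-1] + mean_interval

    def suspects(self, now, margin=None):
        """Return True once `now` passes the estimate plus the margin.

        Passing a different `margin` per application loosely mimics an
        adaptation layer over one shared basic estimate.
        """
        expected = self.expected_arrival()
        if expected is None:
            return False
        if margin is None:
            margin = self.margin
        return now > expected + margin


detector = HeartbeatDetector(window=10, margin=0.5)
for t in (0.0, 1.0, 2.0, 3.0):
    detector.heartbeat(t)
```

With these four heartbeats the estimate for the next arrival is 4.0, so an application using the default margin starts suspecting the process after 4.5, while one passing a larger margin trades detection time for fewer false suspicions.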
We present a new failure detector implementation. This implementation, a variant of the heartbeat failure detector, is both adaptable and designed for scalability. Its first distinguishing feature is that it is designed as a service shared among several applications by way of an adaptation layer. This layer adapts the quality of service according to application needs. The second distinguishing feature is the hierarchical organization of the detection service, which reduces the number of messages and the processor load. Through an experimental evaluation, we show that our implementation adapts to the characteristics of the environment and is usable with large-scale applications.
We study the problem of shortest-path geographic routing in a static sensor network. Existing algorithms often make routing decisions based on node information in local neighborhoods. However, it is shown by Kuhn et al. that such a design constraint results in a highly undesirable lower bound for routing performance: if a best route has length c, then in the worst case a route produced by any localized algorithm has length Ω(c²), which can be arbitrarily worse than the optimal. We present VIGOR, a VIsibility-Graph-based rOuting pRotocol that produces routes of length Θ(c). Our design is based on the construction of a much reduced visibility graph, which guides nodes to find near-optimal paths. The per-node protocol overheads in terms of state information and message transmission depend only on the complexity of the field's large topological features, rather than on the network size. Simulation results show that our protocol dramatically outperforms localized protocols such as GPSR and GOAFR+ in both average and worst cases, with reasonable extra overheads.
This paper presents P3Q, a fully decentralized gossip-based protocol to personalize query processing in social tagging systems. P3Q dynamically associates each user with social acquaintances sharing similar tagging behaviours. Queries are gossiped among such acquaintances, computed on the fly in a collaborative, yet partitioned manner, and results are iteratively refined and returned to the querier. Analytical and experimental evaluations demonstrate the scalability of P3Q for top-k query processing. More specifically, we show that on a 10,000-user Delicious trace, with little storage at each user, the queries are accurately computed within reasonable time and bandwidth consumption. We also report on the inherent ability of P3Q to cope with users updating profiles and departing.
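To make the idea of iteratively refined top-k results concrete, here is a minimal sketch of how a querier might merge partial score lists returned by acquaintances. It is not P3Q's actual algorithm: the function name `merge_partial_results`, the dictionary format, and the assumption that partial scores are simply additive are all hypothetical.

```python
import heapq
from collections import defaultdict


def merge_partial_results(partials, k):
    """Combine partial item scores and return the k best (item, score) pairs.

    Assumption: each acquaintance reports a dict mapping item -> partial
    score, and an item's total score is the sum of its partial scores.
    """
    totals = defaultdict(float)
    for partial in partials:
        for item, score in partial.items():
            totals[item] += score
    # Ties on score are broken by item name so the result is deterministic.
    return heapq.nlargest(k, totals.items(), key=lambda pair: (pair[1], pair[0]))


# Two hypothetical partial answers arriving from different acquaintances.
first_round = [{"a": 1.0, "b": 2.0}]
second_round = first_round + [{"a": 2.5, "c": 0.5}]
top_after_first = merge_partial_results(first_round, 2)
top_after_second = merge_partial_results(second_round, 2)
```

In this toy run the ranking changes as the second partial answer arrives, which is the kind of refinement the abstract refers to; how P3Q actually partitions and scores the computation is not specified here.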
Virtual coordinate geographic routing is an appealing geographic routing approach for its ability to work without physical location information. We examine two representative protocols of this kind, namely NoGeo and BVR, and show through experiments and theoretical analysis their limitation in adapting to complex field topologies, in particular fields with concave holes. Based on the new insights, we propose a distributed convex partition protocol that divides the field into subareas with convex shapes, using only connectivity information. A new geographic routing protocol, called CONVEX, that builds upon the partitioning protocol is then described. Simulations demonstrate significant performance improvement of the new routing protocol over NoGeo and BVR, in terms of transmission stretch and maintenance overheads.
This paper presents DARX, our framework for building applications that provide adaptive fault tolerance. It relies on the fact that multi-agent platforms constitute a very strong basis for decentralized software that is both flexible and scalable, and makes the assumption that the relative importance of each agent varies during the course of the computation. DARX brings together solutions which facilitate the creation of multi-agent applications in a large-scale context.
Its most important feature is adaptive replication: replication strategies are applied on a per-agent basis with respect to transient environment characteristics such as the importance of the agent for the computation, the network load or the mean time between failures. Firstly, the interwoven concerns of multi-agent systems and fault-tolerant solutions are put forward. An overview of the DARX architecture follows, as well as an evaluation of its performance. We conclude, after outlining the promising outcomes, by presenting prospective work.