The dynamics of peer participation, or churn, are an inherent property of Peer-to-Peer (P2P) systems and are critical to their design and evaluation. Accurately characterizing churn requires precise and unbiased information about the arrival and departure of peers, which is challenging to acquire. Prior studies show that peer participation is highly dynamic but report conflicting characteristics. Therefore, churn remains poorly understood, despite its significance. In this paper, we identify several common pitfalls that lead to measurement error. We carefully address these difficulties and present a detailed study using three widely deployed P2P systems: an unstructured file-sharing system (Gnutella), a content-distribution system (BitTorrent), and a Distributed Hash Table (Kad). Our analysis reveals several properties of churn: (i) overall dynamics are surprisingly similar across different systems, (ii) session lengths are not exponential, (iii) a large portion of active peers are highly stable while the remaining peers turn over quickly, and (iv) peer session lengths across consecutive appearances are correlated. In summary, this paper advances our understanding of churn by improving accuracy, comparing different P2P file sharing/distribution systems, and exploring new aspects of churn.
In recent years, peer-to-peer (P2P) file-sharing systems have evolved to accommodate growing numbers of participating peers. In particular, new features have changed the properties of the unstructured overlay topologies formed by these peers. Little is known about the characteristics of these topologies and their dynamics in modern file-sharing applications, despite their importance. This paper presents a detailed characterization of P2P overlay topologies and their dynamics, focusing on the modern Gnutella network. We present Cruiser, a fast and accurate P2P crawler, which can capture a complete snapshot of the Gnutella network of more than one million peers in just a few minutes, and show how inaccuracy in snapshots can lead to erroneous conclusions, such as a power-law degree distribution. Leveraging recent overlay snapshots captured with Cruiser, we characterize the graph-related properties of individual overlay snapshots and overlay dynamics across slices of back-to-back snapshots. Our results reveal that while the Gnutella network has dramatically grown and changed in many ways, it still exhibits the clustering and short path lengths of a small-world network. Furthermore, its overlay topology is highly resilient to random peer departure and even systematic attacks. More interestingly, overlay dynamics lead to an "onion-like" biased connectivity among peers, where each peer is more likely to be connected to peers with higher uptime. Therefore, long-lived peers form a stable core that ensures reachability among peers despite overlay dynamics.
This paper presents Respondent-Driven Sampling (RDS) as a promising technique to derive unbiased estimates of node properties in unstructured overlay networks such as Gnutella. Using RDS and a previously proposed technique, namely Metropolized Random Walk (MRW) sampling, we examine the efficiency of estimating node properties in unstructured overlays and identify some of the key factors that determine the accuracy of sampling techniques. We evaluate the RDS and MRW techniques using simulation over a wide range of static and dynamic graphs as well as experiments over a widely deployed Gnutella network. Our study sheds light on how the connectivity structure among nodes and its dynamics affect the accuracy and efficiency of the two sampling techniques. Both techniques exhibit similar performance over a wide range of scenarios. However, RDS significantly outperforms MRW when the overlay structure exhibits a combination of highly skewed node degrees and highly skewed (local) clustering coefficients.
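The abstract above names the two sampling techniques without describing their mechanics. As a rough illustration only, the following Python sketch shows the commonly described forms of each on a static graph: MRW as a Metropolis-Hastings walk that accepts a move from x to y with probability min(1, deg(x)/deg(y)), and RDS as a plain random walk whose visits are reweighted by 1/degree. The function names, the toy graph, and the walk parameters are all hypothetical and chosen for illustration; the paper's actual implementation and its handling of dynamic overlays may differ.

```python
import random

# Hypothetical toy overlay: adjacency lists of an undirected, connected graph.
TOY_GRAPH = {
    0: [1, 2, 3, 4, 5],
    1: [0, 2],
    2: [0, 1, 3],
    3: [0, 2, 4],
    4: [0, 3, 5],
    5: [0, 4],
}

def mrw_step(graph, node, rng):
    """One Metropolis-Hastings step whose stationary distribution is uniform."""
    neighbor = rng.choice(graph[node])
    # Accept with probability min(1, deg(node) / deg(neighbor)); else stay put.
    if rng.random() < len(graph[node]) / len(graph[neighbor]):
        return neighbor
    return node

def mrw_samples(graph, start, n_samples, walk_len, rng):
    """Collect one sample per independent walk of walk_len steps."""
    samples = []
    for _ in range(n_samples):
        node = start
        for _ in range(walk_len):
            node = mrw_step(graph, node, rng)
        samples.append(node)
    return samples

def rds_estimate(graph, start, n_steps, prop, rng):
    """Estimate the mean of prop(node) from a plain random walk.

    A plain walk visits nodes in proportion to their degree, so each
    visit is weighted by 1/degree to offset that bias.
    """
    node = start
    weighted_sum = 0.0
    weight_total = 0.0
    for _ in range(n_steps):
        node = rng.choice(graph[node])
        weight = 1.0 / len(graph[node])
        weighted_sum += weight * prop(node)
        weight_total += weight
    return weighted_sum / weight_total
```

On the toy graph, both approaches should approximate uniform-sampling quantities such as the mean node degree (3 here), even though a plain random walk visits the high-degree hub far more often than the other nodes.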
During recent years, Distributed Hash Tables (DHTs) have been extensively studied through simulation and analysis. However, due to their limited deployment, it has not been possible to observe the behavior of a widely deployed DHT in practice. Recently, the popular eMule file-sharing software incorporated a Kademlia-based DHT, called Kad, which currently has around one million simultaneous users. In this paper, we empirically study the performance of the key DHT operation, lookup, over Kad. First, we analytically derive the benefits of different ways to increase the richness of routing tables in Kademlia-based DHTs. Second, we empirically characterize two aspects of the accuracy of routing tables in Kad, namely completeness and freshness, and characterize their impact on Kad's lookup performance. Finally, we investigate how the efficiency and consistency of lookup in Kad can be improved by performing parallel lookup and maintaining multiple replicas, respectively. Our results pinpoint the best operating point for the degree of lookup parallelism and the degree of replication for Kad.