Abstract: Peer-to-peer (P2P) file sharing systems generate a major portion of Internet traffic, and this portion is expected to increase in the future. We explore the potential of deploying proxy caches in different Autonomous Systems (ASes) with the goal of reducing the cost incurred by Internet service providers and alleviating the load on the Internet backbone. We conduct an eight-month measurement study to analyze the P2P traffic characteristics that are relevant to caching, such as object popularity, popularity dynamics, and object size. Our study shows that the popularity of P2P objects can be modeled by a Mandelbrot-Zipf distribution, and that several workloads exist in P2P traffic. Guided by our findings, we develop a novel caching algorithm for P2P traffic that is based on object segmentation, and on proportional partial admission and eviction of objects. Our trace-based simulations show that with a relatively small cache size, our algorithm can achieve a byte hit rate of up to 35%, which is close to the byte hit rate achieved by an off-line optimal algorithm with complete knowledge of future requests. Our results also show that our algorithm achieves a byte hit rate that is at least 40% higher than, and at most triple, the byte hit rate of common web caching algorithms. Furthermore, our algorithm is robust in the face of aborted downloads, which are common in P2P systems.
Peer-to-peer (P2P) file sharing systems generate a major portion of Internet traffic, and this portion is expected to increase in the future. We explore the potential of deploying caches in different Autonomous Systems (ASes) to reduce the cost imposed by P2P traffic on Internet service providers and to alleviate the load on the Internet backbone. We conduct a measurement study to analyze P2P characteristics that are relevant to caching, such as object popularity and size distributions. Our study shows that the popularity of P2P objects in different ASes can be modeled by a Mandelbrot-Zipf distribution, and that several workloads exist in P2P systems, each of which could be modeled by a Beta distribution. Guided by our findings, we develop a novel caching algorithm for P2P traffic that is based on object segmentation and on partial admission and eviction of objects. Our trace-based simulations show that with a relatively small cache size, our algorithm can achieve a byte hit rate of up to 35%, which is a few percent lower than that of the off-line optimal algorithm. Our results also show that our algorithm achieves a byte hit rate that is at least 40% higher than, and at most triple, the byte hit rate of common web caching algorithms. Furthermore, our algorithm is robust in the face of aborted downloads, a common case in P2P systems, and achieves a byte hit rate that is even higher than in the case of full downloads.
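As a rough illustration of the Mandelbrot-Zipf popularity model mentioned above, the following sketch assumes the commonly used form in which the probability of the object at popularity rank i is proportional to 1 / (i + q)^alpha. The function name, the parameter names `alpha` (skewness) and `q` (plateau factor), and the numeric values are illustrative assumptions, not values reported by the study.

```python
def mandelbrot_zipf_pmf(n, alpha, q):
    """Return probabilities for ranks 1..n under an assumed Mandelbrot-Zipf form.

    p(i) is proportional to 1 / (i + q) ** alpha; q = 0 reduces to plain Zipf.
    """
    # Unnormalized weight of each rank (rank 1 is the most popular object).
    weights = [1.0 / (rank + q) ** alpha for rank in range(1, n + 1)]
    total = sum(weights)
    # Normalize so that the probabilities sum to 1.
    return [w / total for w in weights]


# Hypothetical parameter values, chosen only to show the effect of q.
zipf = mandelbrot_zipf_pmf(1000, alpha=0.8, q=0.0)
flattened = mandelbrot_zipf_pmf(1000, alpha=0.8, q=20.0)

# Ratio of the top object's probability to that of the 10th-ranked object.
print(zipf[0] / zipf[9], flattened[0] / flattened[9])
```

With a larger `q`, the most popular objects stand out less from the rest, so the second printed ratio is smaller than the first. Under this assumed model, a flatter head means that caching only the few most popular objects captures a smaller share of requests than a plain Zipf model would predict.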