Abstract. The population protocol model provides theoretical foundations for analyzing the properties that emerge from simple pairwise interactions among a very large number n of anonymous agents. This paper tackles the following problem: is there an efficient population protocol that exactly counts the difference κ between the number of agents that initially and independently set their state to A and the number that initially set it to B, assuming that each agent only uses a finite set of states? We propose a solution which guarantees, with high probability, that after O(log n) interactions every agent outputs the exact value of κ. Simulation results illustrate our theoretical analysis.
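To build intuition for this kind of protocol, here is a minimal toy simulation of a cancel-and-aggregate scheme under a uniform random scheduler: A-agents start at +1, B-agents at -1, opposite tokens annihilate when they meet, and same-sign tokens consolidate onto one agent, subject to a bounded state range. This is only an illustrative sketch, not the paper's protocol (in particular, the phase in which the surviving value is broadcast so that every agent can output κ is omitted).

```python
import random

def simulate(num_a: int, num_b: int, cap: int = 1 << 10) -> int:
    """Toy pairwise-interaction simulation: A-agents start at +1,
    B-agents at -1.  When two agents meet, one absorbs as much of the
    other's value as the bounded state range [-cap, cap] allows, so
    opposite tokens cancel and same-sign tokens aggregate."""
    states = [1] * num_a + [-1] * num_b
    n = len(states)
    while sum(1 for s in states if s != 0) > 1:
        i, j = random.sample(range(n), 2)          # uniform random pair
        total = states[i] + states[j]
        states[i] = max(-cap, min(cap, total))     # first agent absorbs
        states[j] = total - states[i]              # remainder stays behind
    # at most one nonzero agent survives; it holds the difference kappa
    return next((s for s in states if s != 0), 0)

print(simulate(60, 42))   # -> 18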
Abstract. We study in this paper a generalized coupon collector problem, which consists of determining the distribution and the moments of the time needed to collect a given number of distinct coupons drawn from a set of coupons with an arbitrary probability distribution. We suppose that a special coupon, called the null coupon, can be drawn but never belongs to any collection. In this context, we obtain expressions for the distribution and the moments of this time. We also prove that the almost-uniform distribution, in which all the non-null coupons have the same drawing probability, is the distribution that minimizes the expected time to get a fixed subset of distinct coupons. This optimization result is extended to the complementary distribution of that time when the full collection is considered, thereby proving a well-known conjecture. Finally, we propose a new conjecture stating that the almost-uniform distribution should also minimize the complementary distribution of the time needed to get any fixed number of distinct coupons.
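The quantity studied here is easy to estimate empirically. The following sketch (my own illustration, not taken from the paper) draws coupons i.i.d. from an arbitrary distribution with a null coupon and estimates the expected time to collect a target number of distinct coupons by Monte Carlo, which can be used to compare the almost-uniform distribution against skewed ones.

```python
import random

def time_to_collect(probs, target, null_prob=0.0, rng=random.Random(0)):
    """Draw coupons i.i.d. from `probs` (plus a null coupon drawn with
    probability `null_prob` that never joins the collection) and return
    the number of draws needed to hold `target` distinct coupons."""
    population = list(range(len(probs))) + [None]          # None = null coupon
    weights = [p * (1 - null_prob) for p in probs] + [null_prob]
    collected, draws = set(), 0
    while len(collected) < target:
        draws += 1
        c = rng.choices(population, weights)[0]
        if c is not None:
            collected.add(c)
    return draws

# Almost-uniform distribution on 10 non-null coupons, null coupon prob 0.2:
probs = [0.1] * 10
trials = [time_to_collect(probs, target=10, null_prob=0.2) for _ in range(10_000)]
print(sum(trials) / len(trials))   # Monte Carlo estimate of E[T]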
Abstract. Downlink data rates can vary significantly in cellular networks, with a potentially non-negligible effect on the user experience. Content providers address this problem by using different representations (e.g., picture resolution, video resolution and rate) of the same content and switch among them based on measurements collected during the connection. If the achievable data rate could be known before the connection is established, content providers could choose the most appropriate representation from the very beginning. We have conducted a measurement campaign involving 60 users connected to a production network in France to determine whether the achievable data rate can be predicted from measurements collected, before establishing the connection to the content provider, on the operator's network and on the mobile node. We show that it is indeed possible to exploit these measurements to predict the achievable data rate with reasonable accuracy.
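A prediction pipeline of this kind can be sketched as a standard supervised-regression problem. The feature names and synthetic data below are purely hypothetical (the abstract does not list the measurements actually used); the sketch only shows the shape of the approach: features available before connection establishment in, predicted data rate out.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Hypothetical feature set: radio measurements available before the
# connection is established (names are illustrative, not the paper's).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))           # e.g. [RSRP, RSRQ, cell_load]
y = 5 + 2 * X[:, 0] - 1.5 * X[:, 2] + rng.normal(scale=0.5, size=1000)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestRegressor(random_state=0).fit(X_tr, y_tr)
print("R^2 on held-out data:", model.score(X_te, y_te))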
Key grouping is a technique used by stream processing frameworks to simplify the development of parallel stateful operators. Through key grouping, a stream of tuples is partitioned into several disjoint sub-streams depending on the values contained in the tuples themselves. Each operator instance that is the target of a sub-stream is guaranteed to receive all the tuples containing a specific key value. A common way to implement key grouping is through hash functions, which, however, are known to cause load imbalance on the target operator instances when the input data stream has a skewed value distribution. In this paper we present DKG, a novel approach to key grouping that provides near-optimal load distribution for input streams with skewed value distributions. DKG starts from the simple observation that, with such inputs, load balance is strongly driven by the most frequent values; it identifies these values and explicitly maps them to sub-streams, together with groups of less frequent items, to achieve a near-optimal load balance. We provide theoretical approximation bounds on the quality of the mapping derived by DKG and show, through both simulations and a running prototype, its impact on stream processing applications.
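The core idea, heavy keys placed explicitly and the tail hashed, can be sketched as follows. This is a simplified illustration under my own assumptions (exact frequency counts over a sample, a frequency threshold theta, greedy least-loaded placement), not the paper's algorithm or its approximation-bounded mapping.

```python
import zlib
from collections import Counter

def build_mapping(sample, k, theta=0.05):
    """DKG-style mapping sketch (simplified): keys whose sample frequency
    exceeds `theta` are placed greedily on the least-loaded of the `k`
    target instances; all remaining keys fall back to a hash function."""
    freq = Counter(sample)
    total = sum(freq.values())
    heavy = {key: c / total for key, c in freq.items() if c / total >= theta}
    load, explicit = [0.0] * k, {}
    for key, f in sorted(heavy.items(), key=lambda kv: -kv[1]):
        tgt = min(range(k), key=load.__getitem__)   # least-loaded instance
        explicit[key] = tgt
        load[tgt] += f
    return explicit

def route(key, explicit, k):
    """Frequent keys use the explicit map; the tail is hashed."""
    return explicit.get(key, zlib.crc32(key.encode()) % k)

mapping = build_mapping(["a"] * 50 + ["b"] * 30 + list("cdefgh"), k=4)
print(mapping, route("a", mapping, 4), route("z", mapping, 4))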
The Bitcoin system (or simply Bitcoin) is a peer-to-peer, decentralized payment system that uses a cryptocurrency named bitcoin (BTC) and was released as open-source software in 2009. Unlike fiat currencies, Bitcoin has no centralized authority and no statutory recognition, backing, or regulation. All transactions are validated by a network of volunteer nodes (miners) and, after collective agreement, recorded in a distributed ledger, the blockchain. The Bitcoin platform has attracted both social and anti-social elements. On the one hand, it is social in that it enables the exchange of value and maintains trust in a cooperative, community-driven manner without the need for a trusted third party. At the same time, it is anti-social in that its anonymity and privacy create hurdles for law enforcement when tracing suspicious transactions. To understand how the social and anti-social tendencies in Bitcoin's user base affect its evolution, the Bitcoin system needs to be analyzed as a network. This paper explores the local topology and geometry of the Bitcoin network during its first decade of existence.
Shuffle grouping is a technique used by stream processing frameworks to share input load among parallel instances of stateless operators. With shuffle grouping, each tuple of a stream can be assigned to any available operator instance, independently of any previous assignment. A common approach to implement shuffle grouping is a Round-Robin policy, a simple solution that fares well as long as the execution time is roughly the same for all tuples. However, this assumption rarely holds in real cases, where execution time strongly depends on tuple content. As a consequence, parallel stateless operators within stream processing applications may experience unpredictable load imbalance that ultimately causes an undesirable increase in tuple completion times. In this paper we propose Online Shuffle Grouping (OSG), a novel approach to shuffle grouping aimed at reducing the overall tuple completion time. OSG estimates the execution time of each tuple, enabling proactive, online scheduling of input load to the target operator instances. Sketches are used to efficiently store the otherwise large amount of information required to schedule incoming load. We provide a probabilistic analysis and illustrate, through both simulations and a running prototype, its impact on stream processing applications.
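The scheduling idea can be sketched in a few lines: maintain a per-tuple-type execution-time estimate and send each incoming tuple to the instance with the least estimated pending work. The sketch below is my own simplification: it uses a plain dictionary and an exponential moving average where the paper uses sketches for bounded memory, and a full implementation would also decrement backlogs as tuples complete.

```python
from collections import defaultdict

class Scheduler:
    """Simplified OSG-style scheduler: estimate each tuple type's
    execution time online and route every incoming tuple to the
    instance with the least estimated pending work."""
    def __init__(self, k):
        self.pending = [0.0] * k                 # estimated backlog per instance
        self.est = defaultdict(lambda: 1.0)      # per-key execution-time estimate

    def assign(self, key):
        tgt = min(range(len(self.pending)), key=self.pending.__getitem__)
        self.pending[tgt] += self.est[key]
        return tgt

    def feedback(self, key, measured, alpha=0.2):
        # exponential moving average of observed execution times
        self.est[key] = (1 - alpha) * self.est[key] + alpha * measured

s = Scheduler(k=3)
print(s.assign("video"))       # -> 0 (all backlogs equal, picks first)
s.feedback("video", 5.0)
print(s.assign("video"))       # -> 1 (instance 0 now has estimated backlog)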
We consider the problem of uniform sampling in large-scale open systems. Uniform sampling is a fundamental schema that guarantees that every individual in a population has the same probability of being selected as a sample. An important issue that seriously hampers the feasibility of uniform sampling in open, large-scale systems is the inevitable presence of malicious nodes. In this paper we show that restricting the number of requests that malicious nodes can issue, together with full knowledge of the composition of the system, is a necessary and sufficient condition to guarantee uniform and ergodic sampling. In a nutshell, uniform and ergodic sampling guarantees that every node in the system is equally likely to appear as a sample at any non-malicious node, and that, infinitely often, every node has a non-null probability of appearing as a sample at any honest node.
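For the uniformity property alone (every identifier seen so far equally likely to be the current sample), the textbook single-slot reservoir sampler is a useful reference point. It illustrates only what "uniform" means here; it does not implement the paper's Byzantine-resilient sampler.

```python
import random

def reservoir_sample(stream, rng=random.Random(0)):
    """Classic single-slot reservoir sampling: after t items, each item
    seen so far occupies the slot with probability exactly 1/t."""
    sample = None
    for t, item in enumerate(stream, start=1):
        if rng.random() < 1.0 / t:   # replace with probability 1/t
            sample = item
    return sample

print(reservoir_sample(f"node-{i}" for i in range(1000)))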
In this paper, we consider the setting of large-scale distributed systems, in which each node needs to quickly process a huge amount of data received in the form of a stream that may have been tampered with by an adversary. In this situation, a fundamental problem is how to detect and quantify the amount of work performed by the adversary. To address this issue, we propose a novel algorithm, AnKLe, for estimating the Kullback-Leibler divergence of an observed stream with respect to the expected one. AnKLe combines sampling techniques and information-theoretic methods. It is very efficient, both in terms of space and time complexity, and requires only a single pass over the data stream. We show that AnKLe is an (ε, δ)-approximation algorithm with a space complexity of Õ(1/ε + 1/ε^2) bits in "most" cases, and Õ(1/ε + (n−ε−1)/ε^2) otherwise, where n is the number of distinct data items in the stream. Moreover, we propose a distributed version of AnKLe that requires at most O(rl(log n + 1)) bits of communication between the l participating nodes, where r is the number of rounds of the algorithm. Experimental results show that the estimation provided by AnKLe remains accurate even in adversarial settings for which the quality of other methods dramatically decreases.
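The target quantity, though not AnKLe's one-pass sketch-based estimator, can be pinned down with a direct empirical computation. The sketch below computes the exact KL divergence between a stream's item frequencies and an expected distribution, which is what AnKLe approximates within (ε, δ) guarantees.

```python
import math
from collections import Counter

def empirical_kl(stream, expected):
    """Exact empirical KL divergence D(observed || expected) between the
    item frequencies of a stream and an expected distribution.  AnKLe
    approximates this quantity in a single pass using sketches; this
    direct computation only fixes the target quantity."""
    freq = Counter(stream)
    m = sum(freq.values())
    return sum((c / m) * math.log((c / m) / expected[x])
               for x, c in freq.items())

stream = ["a"] * 70 + ["b"] * 20 + ["c"] * 10     # skewed observation
uniform = {x: 1 / 3 for x in "abc"}               # expected distribution
print(empirical_kl(stream, uniform))              # large divergence flags tampering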