Abstract. As the size of modern data sets exceeds the disk and memory capacities of a single computer, machine learning practitioners have resorted to parallel and distributed computing. Given that optimization is one of the pillars of machine learning and predictive modeling, distributed optimization methods have recently garnered ample attention in the literature. Although previous research has mostly focused on settings where either the observations or the features of the problem at hand are stored in distributed fashion, the situation where both are partitioned across the nodes of a computer cluster (doubly distributed) has barely been studied. In this work we propose two doubly distributed optimization algorithms. The first falls under the umbrella of distributed dual coordinate ascent methods, while the second belongs to the class of stochastic gradient/coordinate descent hybrid methods. We conduct numerical experiments in Spark using real-world and simulated data sets and study the scaling properties of our methods. Our empirical evaluation shows that the proposed algorithms outperform a block distributed ADMM method, which, to the best of our knowledge, is the only other existing doubly distributed optimization algorithm.
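To make the doubly distributed setting concrete, the following minimal sketch splits a small data matrix into P row (observation) blocks and Q column (feature) blocks and computes the product Xw from per-block partial products alone. It illustrates the data layout only, not either of the proposed algorithms; the helper names (`split_indices`, `partition_doubly`, `blockwise_predict`) and the toy matrix are assumptions made for illustration, and plain Python lists stand in for Spark partitions.

```python
def split_indices(n, parts):
    """Split range(n) into `parts` contiguous, near-equal chunks."""
    base, extra = divmod(n, parts)
    chunks, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        chunks.append(range(start, start + size))
        start += size
    return chunks


def partition_doubly(X, P, Q):
    """Return ({(p, q): block}, col_chunks) for a P x Q grid of blocks.

    Block (p, q) holds the rows of observation chunk p restricted to the
    columns of feature chunk q, mimicking one worker's local data.
    """
    rows = split_indices(len(X), P)
    cols = split_indices(len(X[0]), Q)
    blocks = {
        (p, q): [[X[i][j] for j in cols[q]] for i in rows[p]]
        for p in range(P)
        for q in range(Q)
    }
    return blocks, cols


def blockwise_predict(blocks, cols, w, P, Q):
    """Compute X @ w using only per-block partial products.

    Each block multiplies its local columns by the matching slice of w;
    partial results are summed across the Q feature chunks of each row chunk.
    """
    out = []
    for p in range(P):
        partial = [0.0] * len(blocks[(p, 0)])
        for q in range(Q):
            w_q = [w[j] for j in cols[q]]
            for r, row in enumerate(blocks[(p, q)]):
                partial[r] += sum(a * b for a, b in zip(row, w_q))
        out.extend(partial)
    return out


# Hypothetical toy data: 4 observations, 3 features, on a 2 x 2 grid.
X = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
w = [1, 0, -1]
blocks, cols = partition_doubly(X, 2, 2)
print(blockwise_predict(blocks, cols, w, 2, 2))
```

The point of the sketch is that no single block sees a full observation or a full feature column, so any quantity that depends on Xw has to be assembled by summing across feature chunks.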
In this work, we investigate the structure and evolution of a peer-to-peer (P2P) payment application. A unique aspect of the network under consideration is that the edges among nodes represent financial transactions among individuals who shared an offline social interaction. Our dataset comes from Venmo, the most popular P2P mobile payment service. We present a series of static and dynamic measurements that summarize the key aspects of any social network, namely the degree distribution, density, and connectivity. We find that the degree distributions do not follow a power-law distribution, consistent with previous studies showing that real-world social networks are rarely scale-free. The giant component of Venmo eventually comprises 99.9% of all nodes, and its clustering coefficient reaches 0.2. Last, we examine the topological version of the small-world hypothesis and find that Venmo users are separated by a mean of 5.9 steps and a median of 6 steps.
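The measurements named above can be sketched on a toy undirected graph as follows. This is a minimal illustration, not the authors' pipeline: the edge list is hypothetical, node labels are assumed to be mutually comparable (integers here), and the clustering coefficient is taken to be the average local coefficient, which may differ from the definition used in the study.

```python
from collections import Counter, deque
from statistics import mean, median


def build_graph(edges):
    """Build an undirected adjacency map, ignoring self-loops."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_distribution(adj):
    """Map each degree to the number of nodes having that degree."""
    return Counter(len(nbrs) for nbrs in adj.values())


def bfs_distances(adj, source):
    """Hop distances from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def giant_component(adj):
    """Node set of the largest connected component."""
    seen, best = set(), set()
    for node in adj:
        if node not in seen:
            comp = set(bfs_distances(adj, node))
            seen |= comp
            if len(comp) > len(best):
                best = comp
    return best


def average_clustering(adj):
    """Average local clustering coefficient (0 for nodes of degree < 2)."""
    coeffs = []
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            coeffs.append(0.0)
            continue
        links = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
        coeffs.append(2.0 * links / (k * (k - 1)))
    return mean(coeffs)


def separation_stats(adj, nodes):
    """Mean and median shortest-path length over unordered pairs in `nodes`."""
    lengths = []
    for s in nodes:
        dist = bfs_distances(adj, s)
        lengths.extend(d for t, d in dist.items() if t in nodes and s < t)
    return mean(lengths), median(lengths)


# Hypothetical toy transaction graph: a triangle with a tail, plus one isolated pair.
edges = [(1, 2), (2, 3), (1, 3), (3, 4), (5, 6)]
adj = build_graph(edges)
giant = giant_component(adj)
print(degree_distribution(adj))
print(len(giant) / len(adj), average_clustering(adj))
print(separation_stats(adj, giant))
```

All-pairs breadth-first search costs time proportional to nodes times edges, so on a network of realistic size the separation statistics would typically be estimated from a sample of source nodes rather than computed exactly.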