Federated learning (FL) is a machine learning setting where many clients (e.g. mobile devices or whole organizations) collaboratively train a model under the orchestration of a central server (e.g. service provider), while keeping the training data decentralized. FL embodies the principles of focused data collection and minimization, and can mitigate many of the systemic privacy risks and costs resulting from traditional, centralized machine learning and data science approaches. Motivated by the explosive growth in FL research, this paper discusses recent advances and presents an extensive collection of open problems and challenges.
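To make the setting concrete, the sketch below shows one round of federated averaging (FedAvg), a widely used FL baseline rather than anything specific to this survey; the linear local objective, learning rate, and the `client_datasets` structure are illustrative assumptions.

```python
import numpy as np

def fedavg_round(global_w, client_datasets, lr=0.1, local_steps=5):
    """One round of federated averaging (a common FL instantiation):
    each client refines the global model on its own data locally, so
    raw data never leaves the device; the server only averages the
    resulting models, weighted by local dataset size."""
    updates, weights = [], []
    for X, y in client_datasets:          # (features, labels) per client
        w = global_w.copy()
        for _ in range(local_steps):      # local SGD on a least-squares objective
            grad = X.T @ (X @ w - y) / len(y)
            w -= lr * grad
        updates.append(w)
        weights.append(len(y))
    return np.average(updates, axis=0, weights=weights)
```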
Approximation of non-linear kernels using random feature mapping has been successfully employed in large-scale data analysis applications, accelerating the training of kernel machines. While previous random feature mappings run in O(ndD) time for n training samples in d-dimensional space and D random feature maps, we propose a novel randomized tensor product technique, called Tensor Sketching, for approximating any polynomial kernel in O(n(d + D log D)) time. Also, we introduce both absolute and relative error bounds for our approximation to guarantee the reliability of our estimation algorithm. Empirically, Tensor Sketching achieves higher accuracy and often runs orders of magnitude faster than the state-of-the-art approach for large-scale real-world datasets.
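The sketch below is a minimal NumPy rendition of the Tensor Sketching idea for the polynomial kernel (x·y)^p: p independent Count Sketches of each input are combined by FFT-based circular convolution, giving O(p(d + D log D)) work per sample. Function and variable names are ours, and the per-coordinate scatter loop favors clarity over speed.

```python
import numpy as np

def tensor_sketch(X, D, p, seed=0):
    """Approximate feature map for the polynomial kernel (x . y)^p.
    Combines p independent Count Sketches of each row of X via
    FFT-based circular convolution. X: (n, d) -> output: (n, D)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    prod = np.ones((n, D), dtype=complex)
    for _ in range(p):
        h = rng.integers(0, D, size=d)          # random bucket per coordinate
        s = rng.choice([-1.0, 1.0], size=d)     # random sign per coordinate
        cs = np.zeros((n, D))
        for j in range(d):                      # Count Sketch: signed scatter-add
            cs[:, h[j]] += s[j] * X[:, j]
        prod *= np.fft.fft(cs, axis=1)          # convolution theorem
    return np.real(np.fft.ifft(prod, axis=1))

# Sanity check: inner products of sketches estimate the kernel, e.g. for p = 2:
# Z = tensor_sketch(X, D=1024, p=2); Z @ Z.T  is approximately  (X @ X.T) ** 2
```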
We settle the question of tight thresholds for offline cuckoo hashing. The problem can be stated as follows: we have n keys to be hashed into m buckets, each capable of holding a single key. Each key has k ≥ 3 (distinct) associated buckets chosen uniformly at random and independently of the choices of other keys. A hash table can be constructed successfully if each key can be placed into one of its buckets. We seek thresholds c_k such that, as n goes to infinity, if n/m ≤ c for some c < c_k then a hash table can be constructed successfully with high probability, and if n/m ≥ c for some c > c_k a hash table cannot be constructed successfully with high probability. Here we are considering the offline version of the problem, where all keys and hash values are given, so the problem is equivalent to previous models of multiple-choice hashing. We find the thresholds for all values of k > 2 by showing that they are in fact the same as the previously known thresholds for the random k-XORSAT problem. We then extend these results to the setting where keys can have differing numbers of choices, and provide evidence, in the form of an algorithm, for a conjecture extending this result to cuckoo hash tables that store multiple keys in a bucket.
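Since the offline problem asks whether every key can be matched to one of its k buckets, the threshold can be probed empirically with a bipartite-matching test. The sketch below is our own illustration using Kuhn's augmenting-path algorithm; success flips sharply around the known threshold c_3 ≈ 0.9179 as m grows.

```python
import random
import sys

sys.setrecursionlimit(20_000)  # augmenting paths may recurse deeply

def can_place_all(n, m, k, rng=random):
    """Offline cuckoo hashing feasibility: each of n keys draws k distinct
    buckets out of m uniformly at random; the table can be built iff every
    key can be matched to one of its buckets (Kuhn's algorithm)."""
    choices = [rng.sample(range(m), k) for _ in range(n)]
    owner = [-1] * m  # owner[b] = key currently holding bucket b, or -1

    def try_place(key, visited):
        for b in choices[key]:
            if b not in visited:
                visited.add(b)
                # Take a free bucket, or displace its owner along an augmenting path.
                if owner[b] == -1 or try_place(owner[b], visited):
                    owner[b] = key
                    return True
        return False

    return all(try_place(key, set()) for key in range(n))

# For k = 3, with m = 5000:
# can_place_all(int(0.90 * 5000), 5000, 3)  -> True with high probability
# can_place_all(int(0.93 * 5000), 5000, 3)  -> False with high probability
```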
We present a simple and efficient dictionary with worst-case constant lookup time, equaling the theoretical performance of the classic dynamic perfect hashing scheme of Dietzfelbinger et al. (Dynamic perfect hashing: Upper and lower bounds. SIAM J. Comput., 23(4):738-761, 1994). The space usage is similar to that of binary search trees, i.e., three words per key on average. The practicality of the scheme is backed by extensive experiments and comparisons with known methods, showing it to be quite competitive in the average case as well.
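For reference, here is a toy rendition of the scheme (cuckoo hashing) in Python. The table sizes, kick limit, and seeding are illustrative choices; a production version would size the tables so that the load per table stays safely below 1/2.

```python
import random

class CuckooHashSet:
    """Toy cuckoo hashing sketch: two tables, one key per cell, so a
    lookup inspects exactly two cells -- worst-case constant time."""

    def __init__(self, size=16):
        self.size = size
        self.tables = [[None] * size, [None] * size]
        self.seeds = [random.getrandbits(64) for _ in range(2)]

    def _slot(self, i, key):
        return hash((self.seeds[i], key)) % self.size

    def __contains__(self, key):
        return any(self.tables[i][self._slot(i, key)] == key for i in (0, 1))

    def insert(self, key, max_kicks=32):
        if key in self:
            return
        i = 0
        for _ in range(max_kicks):
            s = self._slot(i, key)
            if self.tables[i][s] is None:
                self.tables[i][s] = key
                return
            # Evict the occupant and try to place it in the other table.
            self.tables[i][s], key = key, self.tables[i][s]
            i = 1 - i
        self._rehash(key)  # likely a cycle: grow tables, pick new functions

    def _rehash(self, pending):
        keys = [k for t in self.tables for k in t if k is not None]
        self.__init__(self.size * 2)
        for k in keys + [pending]:
            self.insert(k)
```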
Many algorithms and data structures employing hashing have been analyzed under the uniform hashing assumption, i.e., the assumption that hash functions behave like truly random functions. Starting with the discovery of universal hash functions, many researchers have studied to what extent this theoretical ideal can be realized by hash functions that do not take up too much space and can be evaluated quickly. In this paper we present an almost ideal solution to this problem: a hash function h : U → V that, on any set of n inputs, behaves like a truly random function with high probability, can be evaluated in constant time on a RAM, and can be stored in (1 + ε)n lg |V| + O(n + lg lg |U|) bits. Here ε can be chosen to be any positive constant, so this essentially matches the entropy lower bound. For many hashing schemes this is the first hash function that makes their uniform hashing analysis come true, with high probability, without incurring overhead in time or space.
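The actual construction is involved, but its flavor can be conveyed by a simpler, well-known building block (our toy example, not the paper's scheme): XOR two random tables indexed by two hash functions. On any key set whose associated bipartite multigraph is acyclic, the resulting values are independent and uniform.

```python
import random

def make_toy_uniform_hash(table_size, bits=32, seed=0):
    """Toy building block in the spirit of such constructions (NOT the
    paper's scheme): h(x) = T1[f(x)] XOR T2[g(x)]. On any key set whose
    multigraph with one edge {f(x), g(x)} per key is acyclic, the h(x)
    values are independent and uniform over {0, 1}^bits."""
    rng = random.Random(seed)
    t1 = [rng.getrandbits(bits) for _ in range(table_size)]
    t2 = [rng.getrandbits(bits) for _ in range(table_size)]
    s1, s2 = rng.getrandbits(64), rng.getrandbits(64)

    def h(x):
        f = hash((s1, x)) % table_size  # stand-ins for the paper's
        g = hash((s2, x)) % table_size  # constant-time hash functions
        return t1[f] ^ t2[g]
    return h
```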