Locality Sensitive Hashing (LSH) is widely adopted to index similar data in high-dimensional space for approximate nearest neighbor search. Demanding applications (e.g., web search) require LSH to deliver low response times and high throughput. To achieve this, LSH-based systems tend to balance load across multiple machines. However, as the number of concurrent queries and the volume of data grow, large numbers of index messages must be exchanged between machines, and the network becomes a key bottleneck. To address this bottleneck, we propose NetSHa, which exploits the computational capacity of programmable switches. Specifically, we introduce a heuristic sort-reduce approach that drops potentially poor candidate answers while preserving search quality. NetSHa then aggregates good candidate answers from different index messages while they are in transit, which reduces the network communication cost. Furthermore, we introduce a best-effort replacement mechanism to improve concurrency. We implement NetSHa on a Barefoot Tofino programmable switch and evaluate it using 7 real-world datasets. The experimental results show that NetSHa reduces the packet volume by 4 to 10 times and improves the search efficiency by at least 3× in comparison with typical LSH-based distributed search frameworks.
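The sort-reduce and aggregation idea can be illustrated with a minimal software sketch. This is not the NetSHa switch implementation, which the abstract does not detail; it only assumes that each index message carries a list of (distance, item id) candidates and that a "poor" candidate is one that falls outside the k closest overall. The function name `sort_reduce`, the candidate layout, and the parameter `k` are all hypothetical.

```python
import heapq
from typing import Dict, Iterable, List, Tuple

# Hypothetical candidate layout: (distance to the query, item id).
Candidate = Tuple[float, int]


def sort_reduce(messages: Iterable[List[Candidate]], k: int) -> List[Candidate]:
    """Merge candidates from several index messages and keep the k closest.

    Assumption: smaller distance means a better candidate, and the same
    item id may appear in more than one message.
    """
    best: Dict[int, float] = {}
    for message in messages:
        for dist, item_id in message:
            # Aggregate duplicates by keeping the smallest distance seen.
            if item_id not in best or dist < best[item_id]:
                best[item_id] = dist
    # Drop everything outside the k closest candidates.
    return heapq.nsmallest(k, ((d, i) for i, d in best.items()))


# Two index messages collapse into one short candidate list.
msgs = [[(0.9, 1), (0.2, 2)], [(0.5, 3), (0.2, 2), (0.1, 4)]]
print(sort_reduce(msgs, 2))  # [(0.1, 4), (0.2, 2)]
```

In this sketch, five candidates spread over two messages shrink to two candidates in a single result, which is the kind of reduction in transmitted data the abstract attributes to in-network aggregation.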