Learned Indexes use a model to restrict the search of a sorted table to a smaller interval. Typically, the final search within that interval is a binary search performed with the lower_bound routine of the C++ Standard Library. Recent studies have shown that, on current processors, other search approaches (such as k-ary search) can be more efficient in some applications. Using the SOSD learned indexing benchmarking software, we extend these results to show that k-ary search is indeed a better choice when using learned indexes. We highlight how this choice may depend on the computer architecture used, for example, Intel i7 or Apple M1, and provide guidelines for selecting the search routine within the learned indexing framework.
KEYWORDS
algorithms with prediction, binary search variants, learned index structures, search on sorted data platform
INTRODUCTION
Learned Static Indexes, introduced by Kraska et al. 1 (but see also Reference 2), with follow-up work in References 3-9, are a recent approach to searching in a sorted table. They are quite effective compared with existing procedures and data structures, for example, B-trees, 10 used in important application domains such as Databases 11 and Search Engines. 12 At their core is a model that, as described in Section 2, may be as simple as a straight line or more complex, with a tree-like structure, such as the ones mentioned in Section 3.2.2. The model is used to predict where a query element may be in the sorted table. Then, the search is limited to the interval so identified and performed via standard binary search. The use of this routine is a natural choice rather than a requirement. In practice, the lower_bound routine from the standard C++ library is almost exclusively used.
In order to place our contributions on the proper ground, it is useful to recall that two major studies 12,13 have recently investigated which binary search routines or variants are better suited to take advantage of modern computer