Learned Sorted Table Search and Static Indexes in Small Model Space

Amato, Domenico; Bosco, Giosuè Lo; Giancarlo, Raffaele

doi:10.1007/978-3-031-08421-8_32

Cited by 6 publications

(16 citation statements)

References 17 publications


“…In particular, here we concentrate on the study of how different kinds of Binary and k-ary Searches can affect the performance of learned indexes. Moreover, following Amato et al, 3,4 we use datasets of varying sizes in order to understand how the data structures perform on the different levels of the internal memory hierarchy. In addition to that, we also use datasets generated as in Reference 12, in order to establish that the binary search routines we use behave consistently with the findings in the mentioned paper.…”

Section: Experimental Methodology (mentioning)

confidence: 99%

“…The second kind of datasets have origin from the carefully chosen ones in Reference 19 (and therein referred to as amzn32, amzn64, face, osm, wiki). They have been derived from them in References 3 and 4, in order to fit well each level of the main memory hierarchy with respect to the Intel I7 architecture. The essential point of the derivation is that, for each of the generated datasets, the CDF of the corresponding original dataset is well approximated.…”

Section: Experimental Methodology (mentioning)

confidence: 99%

“…Here we focus on average query time, although space may be of importance also. 3,4 For a given dataset, this amounts to up to thirty different Models to choose from. The routine used for the final search stage is, by default, the lower_bound routine.…”

Section: Experiments: Searching Using Learned Indexes With or Without... (mentioning)

confidence: 99%

“…It may be a small percentage or a really large one. The interested reader can find a study in References 3 and 4. The scenario we consider is the one in which one can only use constant additional space with respect to the input table.…”

Section: Experiments: Searching In Constant Additional Space With or ... (mentioning)

confidence: 99%

“…Learned Static Indexes, introduced by Kraska et al 1 (but see also Reference 2), with follow‐up in References 3‐9, are a recent approach to search in a sorted table, quite effective with respect to existing procedures and data structures, for example, B‐trees, 10 used in important application domains such as Databases 11 and Search Engines 12 . As described in Section 2, such a model may be as simple as a straight line or more complex, with a tree‐like structure, as the ones mentioned in Section 3.2.2.…”

Section: Introduction (mentioning)

confidence: 99%


Standard versus uniform binary search and their variants in learned static indexing: The case of the searching on sorted data benchmarking software platform

2022

Self Cite


Learned Indexes use a model to restrict the search of a sorted table to a smaller interval. Typically, a final binary search is done using the lower_bound routine of the Standard C++ library. Recent studies have shown that on current processors other search approaches (such as k-ary search) can be more efficient in some applications. Using the SOSD learned indexing benchmarking software, we extend these results to show that k-ary search is indeed a better choice when using learned indexes. We highlight how such a choice may be dependent on the computer architecture used, for example, Intel I7 or Apple M1, and provide guidelines for the selection of the Search routine within the learned indexing framework.

KEYWORDS: algorithms with prediction, binary search variants, learned index structures, search on sorted data platform

INTRODUCTION

Learned Static Indexes, introduced by Kraska et al. 1 (but see also Reference 2), with follow-up in References 3-9, are a recent approach to search in a sorted table, quite effective with respect to existing procedures and data structures, for example, B-trees, 10 used in important application domains such as Databases 11 and Search Engines. 12 As described in Section 2, such a model may be as simple as a straight line or more complex, with a tree-like structure, as the ones mentioned in Section 3.2.2. It is used to make a prediction regarding where a query element may be in the sorted table. Then, the search is limited to the interval so identified and performed via standard binary search. The use of this routine is more of a natural choice rather than a requirement. In fact, the lower_bound routine from the standard C++ library is almost exclusively used.

In order to place our contributions on the proper ground, it is useful to recall that two major studies 12,13 have recently investigated which binary search routines or variants are better suited to take advantage of modern computer
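The two-stage lookup described in this abstract (a model predicts a position, then a final search runs inside a small interval around it) can be illustrated with a minimal C++ sketch. Everything named here is hypothetical and is not taken from SOSD or from the cited papers: `LinearModel`, `fit_linear`, `search_window`, and both lookup functions are illustrative, the straight line through the first and last keys is only a stand-in for the models the papers evaluate, and the default `k = 3` is an arbitrary choice. The first lookup finishes with `std::lower_bound`; the second swaps in a simple k-ary search over the same interval.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using Table = std::vector<std::uint64_t>;

// Hypothetical model: position ~ slope * key + intercept.
// max_err is the largest |predicted - actual| position on the table, rounded up.
struct LinearModel {
    double slope = 0.0;
    double intercept = 0.0;
    double max_err = 0.0;
};

inline double predict(const LinearModel& m, std::uint64_t key) {
    return m.slope * static_cast<double>(key) + m.intercept;
}

// Assumption: a line through the first and last keys (not a fitted regression).
LinearModel fit_linear(const Table& t) {
    assert(!t.empty() && std::is_sorted(t.begin(), t.end()));
    LinearModel m;
    const double span = static_cast<double>(t.back()) - static_cast<double>(t.front());
    if (span > 0.0) {
        m.slope = static_cast<double>(t.size() - 1) / span;
        m.intercept = -m.slope * static_cast<double>(t.front());
    }
    for (std::size_t i = 0; i < t.size(); ++i) {
        const double err = std::ceil(std::fabs(predict(m, t[i]) - static_cast<double>(i)));
        m.max_err = std::max(m.max_err, err);
    }
    return m;
}

// Interval [lo, hi] that must contain the lower-bound position of q,
// because the model is non-decreasing and its error is bounded by max_err.
std::pair<std::size_t, std::size_t> search_window(const Table& t, const LinearModel& m,
                                                  std::uint64_t q) {
    const double n = static_cast<double>(t.size());
    const double p = predict(m, q);
    const double lo = std::max(0.0, std::min(n, std::floor(p) - m.max_err));
    const double hi = std::max(0.0, std::min(n, std::ceil(p) + m.max_err + 1.0));
    return {static_cast<std::size_t>(lo), static_cast<std::size_t>(hi)};
}

// Final stage via std::lower_bound, restricted to the predicted window.
std::size_t learned_lower_bound(const Table& t, const LinearModel& m, std::uint64_t q) {
    const auto w = search_window(t, m, q);
    const auto first = t.begin() + static_cast<std::ptrdiff_t>(w.first);
    const auto last = t.begin() + static_cast<std::ptrdiff_t>(w.second);
    return static_cast<std::size_t>(std::lower_bound(first, last, q) - t.begin());
}

// Final stage via a simple k-ary search: probe up to k-1 separators per round.
std::size_t learned_kary_lower_bound(const Table& t, const LinearModel& m,
                                     std::uint64_t q, std::size_t k = 3) {
    assert(k >= 2);
    const auto w = search_window(t, m, q);
    std::size_t lo = w.first, hi = w.second;  // invariant: answer lies in [lo, hi]
    while (lo < hi) {
        const std::size_t step = (hi - lo) / k;
        if (step == 0) {  // fewer than k elements left: finish with a linear scan
            while (lo < hi && t[lo] < q) ++lo;
            return lo;
        }
        std::size_t new_lo = lo, new_hi = hi;
        for (std::size_t j = 1; j < k; ++j) {
            const std::size_t mid = lo + j * step;  // always < hi
            if (t[mid] < q) {
                new_lo = mid + 1;
            } else {
                new_hi = mid;
                break;
            }
        }
        lo = new_lo;
        hi = new_hi;
    }
    return lo;
}
```

Both functions return the same index that `std::lower_bound` would return on the whole table; only the work done inside the window differs. Whether the k-ary variant is actually faster depends on the architecture and on the window size, which is the question the paper investigates experimentally.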

