Order preserving minimal perfect hash functions and information retrieval

Fox, Edward A.; Chen, Q. F.; Daoud, Amjad M.; Heath, Lenwood S.

doi:10.1145/96749.98233

Cited by 11 publications

(6 citation statements)

References 9 publications


“…we propose an enhanced hybrid index, called Clustered IUR-tree (CIUR-tree) incorporating textual clusters and two optimization algorithms based on CIUR-tree (Section 6). 4. Results of empirical studies with implementations of the proposed techniques demonstrate the scalability and efficiency of proposed indexes and algorithms (Section 7).…”

Section: Introduction (mentioning)

confidence: 80%


Reverse spatial and textual k nearest neighbor search

Lü; Cong. 2011. Proceedings of the 2011 ACM SIGMOD International Conference on Management of Data.

Geographic objects associated with descriptive texts are becoming prevalent. This gives prominence to spatial keyword queries that take into account both the locations and textual descriptions of content. Specifically, the relevance of an object to a query is measured by spatial-textual similarity that is based on both spatial proximity and textual similarity. In this paper, we define Reverse Spatial Textual k Nearest Neighbor (RSTkNN) query, i.e., finding objects that take the query object as one of their k most spatial-textual similar objects. Existing works on reverse kNN queries focus solely on spatial locations but ignore text relevance. To answer RSTkNN queries efficiently, we propose a hybrid index tree called IUR-tree (Intersection-Union R-Tree) that effectively combines location proximity with textual similarity. Based on the IUR-tree, we design a branch-and-bound search algorithm. To further accelerate the query processing, we propose an enhanced variant of the IUR-tree called clustered IUR-tree and two corresponding optimization algorithms. Empirical studies show that the proposed algorithms offer scalability and are capable of excellent performance.


“…Similarly, ϕt and ψt are the minimum and maximum textual similarity of pairs of distinct objects in the dataset, respectively. Specifically, EJ (p1.vct, p2.vct) is the Extended Jaccard [21], which is widely used in textual similarity computing, as shown in Eqn (4).…”

Section: Problem Definition (mentioning)

confidence: 99%
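
The citation statement above refers to the Extended Jaccard measure but its Eqn (4) is not reproduced on this page. As a rough illustration only, the sketch below assumes the commonly used definition (dot product divided by the sum of squared norms minus the dot product, also known as the Tanimoto coefficient). The function name, the sparse-dictionary representation, and the sample vectors are hypothetical and are not taken from the cited paper.

```python
def extended_jaccard(u: dict[str, float], v: dict[str, float]) -> float:
    """Extended Jaccard similarity of two sparse term-weight vectors.

    Assumed definition: (u . v) / (|u|^2 + |v|^2 - u . v).
    Returns 0.0 when both vectors are empty (a convention chosen here).
    """
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = sum(w * w for w in u.values())
    norm_v = sum(w * w for w in v.values())
    denom = norm_u + norm_v - dot
    return dot / denom if denom else 0.0


# Hypothetical term-weight vectors for two objects.
p1_vct = {"coffee": 1.0, "shop": 1.0}
p2_vct = {"coffee": 1.0, "bar": 1.0}

print(extended_jaccard(p1_vct, p2_vct))  # one shared term out of three distinct: 1/3
print(extended_jaccard(p1_vct, p1_vct))  # identical vectors: 1.0
```

With binary weights the measure reduces to the ordinary set Jaccard coefficient, which is why the first call yields one third.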


“…The order preserving hash functions discussed in [16] map a set of input values into a set of hash values for fast information retrieval, with the hash values preserving the order of input values. These hash functions are not designed for protecting security.…”

Section: Related Work (mentioning)

confidence: 99%

“…These hash functions are not designed for protecting security. For example, the hash functions [16] usually describe some algorithmic procedures used by all users. That is, there is no concept of secret values (such as encryption keys) that prevents the recovery of input values from hash values.…”

Section: Related Work (mentioning)

confidence: 99%
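
The two Related Work statements above describe the behaviour of an order preserving hash function: distinct inputs map to distinct hash values, the values follow the order of the inputs, and nothing secret stops anyone from mapping a value back to its input. The sketch below illustrates only that behaviour. It assumes a plain lookup table in place of the compact construction of the cited paper, and every name in it is hypothetical.

```python
from typing import Callable, Sequence


def build_order_preserving_mph(keys: Sequence[str]) -> Callable[[str], int]:
    """Return h with h(keys[i]) == i for a static list of distinct keys.

    Assumption: a dict stands in for the compact structure a real
    construction would use; only input/output behaviour is shown.
    """
    table = {key: rank for rank, key in enumerate(keys)}
    if len(table) != len(keys):
        raise ValueError("keys must be distinct")
    return lambda key: table[key]


def is_order_preserving_minimal_perfect(h: Callable[[str], int],
                                        keys: Sequence[str]) -> bool:
    """True when h sends the i-th key to exactly i.

    That single check covers all three properties: no collisions
    (perfect), range 0..n-1 (minimal), and key order kept.
    """
    return [h(key) for key in keys] == list(range(len(keys)))


keys = ["apple", "banana", "cherry", "date"]
h = build_order_preserving_mph(keys)

print([h(k) for k in keys])                          # [0, 1, 2, 3]
print(is_order_preserving_minimal_perfect(h, keys))  # True

# No secret is involved: anyone holding the key list can invert h.
print(keys[h("cherry")])                             # cherry
```

The last line is the point the citing authors make: the mapping is a shared algorithmic procedure, so it supports fast retrieval but offers no confidentiality.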

Nonlinear order preserving index for encrypted database query in service cloud environments

Liu; Wang. 2013. Concurrency and Computation.

The database services on cloud are appearing as an attractive way of outsourcing databases. When a database is deployed on a cloud database service, data security and privacy become a big concern for users. A straightforward way to address this concern is to encrypt the database. However, after encryption, the database cannot be easily queried. In this paper, we propose a nonlinear order preserving scheme for indexing encrypted data, which facilitates range queries over encrypted databases. The scheme is secure even if there are a large number of duplicates in plaintexts. Moreover, our scheme allows the programmability of basic indexing expressions and thus provides the capability of hiding the distribution of plaintexts from the distribution of indexes. This scheme is suitable for long-standing databases because its use does not need any assumption on the characteristics of database data, such as their distribution, range and number, which may change dramatically over time.


“…However, the simple MPHF is not sufficient to keep the alphabetical ordering of string sets that is the prerequisite for the efficient compression of the links part. For that purpose an order preserving minimal perfect hash function (OPMPHF) is required, and the lower bound for the size of an OPMPHF is n log₂ n bits [30,34]. In this case that would be five times the size of the enumerated LZ tries.…”

Section: Enumerated LZ Trie (mentioning)

confidence: 99%
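
The n log₂ n figure quoted above can be motivated by a counting argument. This is a sketch under the assumption that the order to be preserved is arbitrary, so the function must be able to distinguish all n! orderings of the keys. It is not the derivation given in the cited references, and the sample value of n is chosen here only for illustration.

```latex
\[
  \log_2(n!) \;=\; n\log_2 n \;-\; n\log_2 e \;+\; O(\log n)
  \quad\Longrightarrow\quad
  \text{space} \;=\; \Omega(n \log_2 n)\ \text{bits}.
\]
\[
  n = 2^{20}:\qquad
  n\log_2 n \;=\; 20 \cdot 2^{20} \;=\; 20{,}971{,}520\ \text{bits} \;=\; 2.5\ \text{MiB}.
\]
```

For about a million keys the leading term alone already amounts to 2.5 MiB, which is why the statement treats the OPMPHF as costly next to the compressed trie.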

LZ trie and dictionary compression

Ristov. 2005. Softw: Pract. Exper.

An efficient algorithm for trie compression has already been described. Here we present its practical value and demonstrate its superiority in terms of space savings to other methods of lexicon compression. Apart from simple lexicons, a compressed trie can, with some additional processing, be used as a component in the compact representation of simple static databases. We present the potential of the algorithm in compressing natural language dictionaries.
