This article examines a new problem of k-anonymity with respect to a reference dataset in privacy-aware location data publishing: given a user dataset and a sensitive event dataset, we want to generalize the user dataset such that, when it is joined with the event dataset through location, each event is covered by at least k users. Existing k-anonymity algorithms generalize every k user locations to the same vague value, regardless of the events. They therefore tend to overprotect against privacy compromise and make the published data less useful. In this article, we propose a new generalization paradigm called local enlargement, as opposed to conventional hierarchy- or partition-based generalization. Local enlargement guarantees that user locations are enlarged just enough to cover all events k times, and thus maximizes the usefulness of the published data. We develop an O(H_n)-approximate algorithm under the local enlargement paradigm, where n is the maximum number of events a user could possibly cover and H_n is the n-th harmonic number. With strong pruning techniques and mathematical analysis, we show that it runs efficiently and that the generalized user locations are up to several orders of magnitude smaller than those produced by the existing algorithms. In addition, it is robust enough to protect against various privacy attacks.
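The coverage requirement stated above can be illustrated with a minimal sketch. It only checks the condition and does not implement the article's local enlargement algorithm. The rectangular regions, the `Rect` class, and all function names are assumptions made for illustration; the abstract does not specify the shape of a generalized location.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Hypothetical generalized user location: an axis-aligned rectangle."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def covers(self, point):
        x, y = point
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max


def coverage_counts(user_regions, events):
    """For each event, count the user regions that contain it."""
    return [sum(1 for r in user_regions if r.covers(e)) for e in events]


def is_k_anonymous(user_regions, events, k):
    """True if every event is covered by at least k user regions."""
    return all(c >= k for c in coverage_counts(user_regions, events))


def harmonic(n):
    """n-th harmonic number H_n = 1 + 1/2 + ... + 1/n (the approximation factor)."""
    return sum(1.0 / i for i in range(1, n + 1))


regions = [Rect(0, 0, 2, 2), Rect(1, 1, 3, 3), Rect(0, 1, 2, 3)]
events = [(1.5, 1.5), (2.5, 2.5)]
print(coverage_counts(regions, events))    # [3, 1]
print(is_k_anonymous(regions, events, 2))  # False: the second event has one cover
```

In this toy input the second event is covered by only one region, so at least one more region would need to be enlarged to reach k = 2. Deciding which regions to enlarge, and by how little, is the optimization problem the article addresses.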
Owing to recent advances in semiconductor technologies, flash disks have become a competitive alternative to traditional magnetic disks as external storage media. In this paper, we study how transaction recovery can be efficiently supported in database management systems (DBMSs) running on SLC flash disks. Inspired by the classical shadow-paging approach, we propose a new commit scheme, called flagcommit, to exploit the unique characteristics of flash disks such as fast random read access, out-place updating, and partial page programming. To minimize the need for writing log records, we embed the transaction status into flash pages through a chain of commit flags. Based on flagcommit, we develop two recovery protocols, namely commit-based flag commit (CFC) and abort-based flag commit (AFC), to meet different performance needs. They are flexible enough to support no-force buffer management and fine-grained concurrency control. Our performance evaluation based on the TPC-C benchmark shows that both CFC and AFC outperform the state-of-the-art recovery protocols.
With the rapidly increasing capacity of flash memory, flash-aware indexing techniques are highly desirable for flash devices. The unique features of flash memory, such as the erase-before-write constraint and the asymmetric read/write cost, severely degrade the performance of the traditional B+-tree algorithm. In this paper, we propose an optimized indexing method, called lazy-update B+-tree, to overcome the limitations of flash memory. The basic idea is to defer the committing of update requests to the B+-tree by buffering them in a segment of main memory. They are later committed in groups so that the cost of each write operation can be amortized over a batch of update requests. We identify a victim selection problem for the lazy-update B+-tree and develop two heuristic-based commit policies to address this problem. Simulation results show that the proposed lazy-update method, along with a well-designed commit policy, greatly improves the update performance of the traditional B+-tree while preserving the query efficiency.

... to reduce the update cost of B+-tree by logging data changes on flash pages. In this paper, we suggest a different approach that buffers data updates in a segment of main memory (called the lazy-update pool). An optimized indexing method, called lazy-update B+-tree, is then proposed. Consider an update sequence {q1, q2, q3, q4}, where q1 and q3 will insert keys into leaf node 1, while q2 and q4 will insert keys into leaf node 2. Under the traditional method, both nodes will be updated twice. In the lazy-update B+-tree, these update requests will be temporarily stored in the lazy-update pool. The benefit is twofold. First, the buffered update requests can later be committed to the B+-tree in batch, thereby sharing some of the reading cost of locating the leaf nodes to update. Second, the update sequence can be re-ordered into groups, i.e., {q1, q3} into one group, and {q2, q4} into another group.
Then, by group-based commitment, both nodes 1 and 2 are updated only once. That is, half of the write operations can be saved. Moreover, the proposed lazy-update B+-tree method is complementary to the aforementioned log-based indexing methods: they can be preceded by our method to group update requests so as to further improve their performance. However, the lazy-update B+-tree does not come without cost. A query now has to search the lazy-update pool in addition to the B+-tree. Nonetheless, by striking a good trade-off between the saving from group updates and the overhead from increased query complexity, our approach improves the overall performance.

For the lazy-update B+-tree, when a new update request arrives and the lazy-update pool is full, a commit policy should be adopted to select a group of update requests for commitment to make room for the new request. An efficient commit policy is important for the lazy-update B+-tree method, as it has a great impact on the effect of group updates. Ideally, an optimal commit policy should always select those groups which do not ...
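The {q1, q2, q3, q4} example can be reproduced with a small sketch that counts leaf-node writes with and without grouping. It models only the grouping step, not the pool capacity or the commit policies, and the `(request, leaf)` tuple representation and function names are assumptions made for illustration.

```python
from collections import defaultdict


def writes_without_grouping(requests):
    """Traditional method: every update request rewrites its leaf node."""
    return len(requests)


def group_by_leaf(requests):
    """Re-order buffered requests into one group per target leaf node."""
    groups = defaultdict(list)
    for name, leaf in requests:
        groups[leaf].append(name)
    return dict(groups)


def writes_with_grouping(requests):
    """Group-based commitment: each leaf node is written once per group."""
    return len(group_by_leaf(requests))


# Hypothetical pool contents: (request name, target leaf node).
pool = [("q1", 1), ("q2", 2), ("q3", 1), ("q4", 2)]

print(group_by_leaf(pool))            # {1: ['q1', 'q3'], 2: ['q2', 'q4']}
print(writes_without_grouping(pool))  # 4
print(writes_with_grouping(pool))     # 2
```

The four requests collapse into two leaf writes, which matches the saving of half the write operations described in the text.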
Abstract: With the rapidly increasing capacity of flash chips, flash-aware indexing techniques are highly desirable for flash devices. The unique features of flash memory, such as the erase-before-write constraint and the asymmetric read/write cost, severely degrade the performance of the traditional B+-tree algorithm. In this paper, we propose a new indexing method, called lazy-update B+-tree, to overcome the limitations of flash memory. The basic idea is to defer the committing of update requests to the B+-tree by buffering them in a segment of main memory. They are later committed in groups so that the cost of each write operation can be amortized over a batch of update requests. We identify a victim selection problem for the lazy-update B+-tree and develop two heuristic-based commit policies to address the problem. Simulation results show that the proposed lazy-update method, along with a well-designed commit policy, greatly improves the update performance of the traditional B+-tree while preserving the query efficiency.
Abstract: Flash disks are an emerging secondary storage medium. In particular, there are portable devices, multimedia players and laptop computers that are configured with flash disks and no magnetic disks. It is envisioned that some RDBMSs will operate on flash disks in the near future. However, the I/O characteristics of flash disks are different from those of magnetic disks. Thus, in this paper, we study the core of query processing in RDBMSs, join processing, on flash disks. Specifically, we propose a new join method, called DigestJoin, to exploit the fast random reads of flash disks. DigestJoin consists of two phases: (1) projecting the join attributes followed by a join on the projected attributes; and (2) fetching the full tuples that satisfy the join to produce the final join results. While the problem of tuple/page fetching with minimum I/O cost (in the second phase) is intractable, we propose three heuristic fetching strategies. We have implemented DigestJoin on a real flash disk for performance evaluation. Experiments on TPC-H datasets show that DigestJoin clearly outperforms the traditional sort-merge join under various system configurations.
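The two phases of DigestJoin can be sketched in memory as follows. This is an illustrative outline under several assumptions, not the paper's implementation: tables are lists of dictionaries, tuple positions stand in for tuple identifiers, the phase 1 join is done with a hash table, and phase 2 fetches tuples naively in match order, whereas the paper proposes three heuristic fetching strategies that the abstract does not describe.

```python
def digest_join(r_table, s_table, r_key, s_key):
    """Two-phase join sketch: join small digests first, fetch full tuples last."""
    # Phase 1a: project each table to a digest of (tuple id, join attribute).
    r_digest = [(tid, row[r_key]) for tid, row in enumerate(r_table)]
    s_digest = [(tid, row[s_key]) for tid, row in enumerate(s_table)]

    # Phase 1b: join the digests (a hash join is assumed here).
    s_index = {}
    for tid, key in s_digest:
        s_index.setdefault(key, []).append(tid)
    matches = [
        (r_tid, s_tid)
        for r_tid, key in r_digest
        for s_tid in s_index.get(key, [])
    ]

    # Phase 2: fetch the full tuples only for matching ids. On a flash disk
    # these would be random page reads; here they are plain list lookups.
    return [{**r_table[r_tid], **s_table[s_tid]} for r_tid, s_tid in matches]


# Hypothetical sample data.
orders = [
    {"o_id": 1, "cust": 10},
    {"o_id": 2, "cust": 20},
    {"o_id": 3, "cust": 10},
]
customers = [
    {"cust": 10, "name": "Ann"},
    {"cust": 30, "name": "Bob"},
]

result = digest_join(orders, customers, "cust", "cust")
print(result)
```

Only the two orders with `cust` equal to 10 find a partner, so phase 2 touches those full tuples and never reads the non-matching ones.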