Christian Lemke scite author profile

Peer Data Management Systems (PDMS) are a natural extension of heterogeneous database systems. One of the main tasks in such systems is efficient query processing. Insisting on complete answers, however, leads to asking almost every peer in the network. Relaxing these completeness requirements by applying approximate query answering techniques can significantly reduce costs. Since most users are not interested in the exact answers to their queries, rank-aware query operators like top-k or skyline play an important role in query processing. In this paper, we present the novel concept of relaxed skylines that combines the advantages of both rank-aware query operators and approximate query processing techniques. Furthermore, we propose a strategy for processing relaxed skylines in distributed environments that allows for giving guarantees for the completeness of the result using distributed data summaries as routing indexes.
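As a rough illustration of the skyline operator the abstract builds on, the following minimal sketch computes a plain (non-relaxed, centralized) skyline over a small list of tuples. The names `dominates`, `skyline`, and `hotels`, the sample data, and the "smaller is better" convention are assumptions made for illustration; this is not the relaxed, distributed strategy proposed in the paper.

```python
from typing import List, Sequence

Point = Sequence[float]


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b in every dimension and strictly
    better in at least one (assumption: smaller values are better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(points: List[Point]) -> List[Point]:
    """Return all points that no other point dominates (naive nested loop)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical data: (price, distance to the beach) per hotel.
hotels = [(50, 8), (80, 2), (60, 5), (90, 9), (70, 5)]
result = skyline(hotels)
print(result)
```

Here (90, 9) is dominated by (50, 8), and (70, 5) by (60, 5), so neither appears in the result.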

Speeding Up Queries in Column Stores

Lemke, Sattler, Faerber, et al. 2010

SAP HANA adoption of non-volatile memory

et al. 2017

Non-Volatile RAM (NVRAM) is a novel class of hardware technology which is an interesting blend of two storage paradigms: byte-addressable DRAM and block-addressable storage (e.g. HDD/SSD). Most of the existing enterprise relational data management systems such as SAP HANA have their internal architecture based on the inherent assumption that memory is volatile and base their persistence on explicit handling of block-oriented storage devices. In this paper, we present the early adoption of Non-Volatile Memory within the SAP HANA Database, from the architectural and technical angles. We discuss our architectural choices, dive deeper into a few challenges of the NVRAM integration and their solutions, and share our experimental results. As we present our solutions for the NVRAM integration, we also give, as a basis, a detailed description of the relevant HANA internals.

A Relaxed But Not Necessarily Constrained Way from the Top to the Sky

Hose, Lemke, Sattler, et al.

Optical materials for astronomy from SCHOTT: the quality of large components

Jedamzik, Hengst, Elsmann, et al. 2008

Native store extension for SAP HANA

et al. 2019

We present an overview of SAP HANA's Native Store Extension (NSE). This extension substantially increases database capacity, allowing it to scale far beyond available system memory. NSE is based on a hybrid in-memory and paged column store architecture composed of data access primitives. These primitives enable the processing of hybrid columns using the same algorithms optimized for HANA's traditional in-memory columns. Using only three key primitives, we fabricated byte-compatible counterparts for complex memory-resident data structures (e.g. dictionary and hash index), compression schemes (e.g. sparse and run-length encoding), and exotic data types (e.g. geo-spatial). We developed a new buffer cache which optimizes the management of paged resources by smart strategies sensitive to page type and access patterns. The buffer cache integrates with HANA's new execution engine that issues pipelined prefetch requests to improve disk access patterns. A novel load unit configuration, along with a unified persistence format, allows the hybrid column store to dynamically switch between in-memory and paged data access to balance performance and storage economy according to application demands while reducing Total Cost of Ownership (TCO). A new partitioning scheme supports load unit specification at table, partition, and column level. Finally, a new advisor recommends optimal load unit configurations. Our experiments illustrate the performance and memory footprint improvements on typical customer scenarios.

Maintenance strategies for routing indexes

Hose, Lemke, Sattler. 2009. Distrib Parallel Databases

Query processing in large-scale unstructured P2P networks is a crucial part of operating such systems. To avoid expensive flooding of the network during query processing, so-called routing indexes are used. Each peer maintains such an index for its neighbors. It provides a compact representation (data summary) of the data accessible via each neighboring peer. An important problem in this context is keeping these data summaries up to date without paying high maintenance costs. In this paper, we investigate the problem of maintaining distributed data summaries in P2P-based environments without global knowledge and central instances. Based on a classification of update propagation strategies, we discuss several approaches to reduce maintenance costs and present results from an experimental evaluation.
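To make the idea of a routing index concrete, here is a minimal sketch under simplifying assumptions: the network is a small acyclic graph, each peer holds integers, and the data summary kept per neighbor is just a (min, max) range of the values reachable through that neighbor. The `Peer` class, the helper functions, and the sample values are all hypothetical; the sketch only shows how summaries prune query forwarding and does not model the maintenance strategies the paper studies.

```python
from typing import Dict, List, Optional, Tuple


class Peer:
    def __init__(self, name: str, values: List[int]) -> None:
        self.name = name
        self.values = values
        self.neighbors: List["Peer"] = []
        # Routing index: neighbor name -> (min, max) of data reachable via it.
        self.routing_index: Dict[str, Tuple[int, int]] = {}


def connect(a: Peer, b: Peer) -> None:
    a.neighbors.append(b)
    b.neighbors.append(a)


def reachable(peer: Peer, came_from: Optional[Peer] = None) -> List[int]:
    """All values at `peer` and beyond it (assumes an acyclic network)."""
    vals = list(peer.values)
    for n in peer.neighbors:
        if n is not came_from:
            vals.extend(reachable(n, peer))
    return vals


def build_index(peer: Peer) -> None:
    """Summarize, per neighbor, the data reachable through that neighbor."""
    for n in peer.neighbors:
        vals = reachable(n, came_from=peer)
        if vals:
            peer.routing_index[n.name] = (min(vals), max(vals))


def range_query(peer: Peer, lo: int, hi: int, visited: List[str],
                came_from: Optional[Peer] = None) -> List[int]:
    """Answer lo <= v <= hi, forwarding only where the summary overlaps."""
    visited.append(peer.name)
    hits = [v for v in peer.values if lo <= v <= hi]
    for n in peer.neighbors:
        if n is came_from:
            continue
        summary = peer.routing_index.get(n.name)
        if summary is not None and summary[0] <= hi and lo <= summary[1]:
            hits.extend(range_query(n, lo, hi, visited, came_from=peer))
    return hits


a, b, c, d = Peer("a", [1, 2]), Peer("b", [10, 12]), Peer("c", [50]), Peer("d", [100, 105])
connect(a, b)
connect(a, c)
connect(c, d)
for p in (a, b, c, d):
    build_index(p)

visited: List[str] = []
hits = range_query(a, 9, 11, visited)
print(hits, visited)
```

In this example the query for values between 9 and 11 is forwarded to peer b only, because the summary for the branch through c covers 50 to 105. A (min, max) summary can cause unnecessary forwarding but never misses data, as long as it is kept current, which is exactly where maintenance cost arises.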

How to juggle columns

Paradies, Lemke, Plattner, et al. 2010

Many relational databases exhibit complex dependencies between data attributes, caused either by the nature of the underlying data or by explicitly denormalized schemas. In data warehouse scenarios, calculated key figures may be materialized or hierarchy levels may be held within a single dimension table. Such column correlations and the resulting data redundancy may result in additional storage requirements. They may also result in bad query performance if inappropriate independence assumptions are made during query compilation. In this paper, we tackle the specific problem of detecting functional dependencies between columns to improve the compression rate for column-based database systems, which both reduces main memory consumption and improves query performance. Although a huge variety of algorithms have been proposed for detecting column dependencies in databases, we maintain that increased data volumes and recent developments in hardware architectures demand novel algorithms with much lower runtime overhead and smaller memory footprint. Our novel approach is based on entropy estimations and exploits a combination of sampling and multiple heuristics to render it applicable for a wide range of use cases. We demonstrate the quality of our approach by means of an implementation within the SAP NetWeaver Business Warehouse Accelerator. Our experiments indicate that our approach scales well with the number of columns and produces reliable dependence structure information. This both reduces memory consumption and improves performance for nontrivial queries.
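The link between entropy and functional dependencies can be sketched as follows: column A functionally determines column B exactly when the conditional entropy H(B | A) = H(A, B) - H(A) is zero. The code below is a minimal illustration that computes exact entropies over full columns; the function names, the tolerance, and the sample columns are assumptions, and the sampling and heuristics described in the abstract are not reproduced here.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def entropy(values: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of the empirical value distribution."""
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in Counter(values).values())


def conditional_entropy(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """H(B | A) = H(A, B) - H(A)."""
    return entropy(list(zip(a, b))) - entropy(a)


def determines(a: Sequence[Hashable], b: Sequence[Hashable], tol: float = 1e-9) -> bool:
    """True if every value of column a maps to exactly one value of column b."""
    return conditional_entropy(a, b) <= tol


# Hypothetical columns of a denormalized dimension table.
city = ["Berlin", "Paris", "Berlin", "Lyon", "Paris"]
country = ["DE", "FR", "DE", "FR", "FR"]

print(determines(city, country))   # city -> country holds
print(determines(country, city))   # country -> city does not
```

A dependency such as city -> country means the country column carries no extra information once city is known, which is the kind of redundancy a column store can exploit for compression.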

Processing relaxed skylines in PDMS using distributed data summaries
