The Design and Implementation of Modern Column-Oriented Database Systems

Abadi, Daniel J.; Boncz, Peter; Harizopoulos, Stavros

doi:10.1561/1900000024

Cited by 149 publications

(131 citation statements)

References 75 publications

Mentioning: 129


“…(Unless this is the case of a nested query, where the result of subquery can remain in a summarized form, like we did with rough queries in Kowalski et al 2013.) This phase could be referred to as materialization, though it should not be confused with a standard meaning of materialization in columnar databases (Abadi et al 2013). Alternatively, if the knowledge capture layer is regarded as responsible for aforementioned information granulation (Zadeh 1997), then translation of query result summaries into final approximate results can be treated as information degranulation.…”

Section: Generating Final Query Results (mentioning)

confidence: 99%

“…In that earlier framework, packrows were described by simple summaries accessible independently from the underlying data. It combined the ideas taken from other database technologies (Abadi et al 2013) and the theory of rough sets (Pawlak and Skowron 2007), by means of using summaries to classify data packs as relevant, irrelevant and partially relevant for particular SELECT statements - by analogy to deriving rough set positive, negative and boundary regions of the considered concepts, respectively. Such higher-level classifications were useful to limit the amounts of compressed data packs required to access to finish calculations.…”

Section: Historical Background (mentioning)

confidence: 99%
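
The pack classification described in the statement above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each data pack carries a min/max summary of one numeric column and that the query filters on a closed range; the quoted text says only "simple summaries", so the names `PackSummary`, `classify_pack`, and `packs_to_decompress` are hypothetical and do not come from the cited engine.

```python
from dataclasses import dataclass
from enum import Enum


class Relevance(Enum):
    RELEVANT = "relevant"              # every row satisfies the predicate
    IRRELEVANT = "irrelevant"          # no row can satisfy the predicate
    PARTIAL = "partially relevant"     # undecidable from the summary alone


@dataclass(frozen=True)
class PackSummary:
    # Assumed summary: min and max of one numeric column within the pack.
    min_value: int
    max_value: int


def classify_pack(summary: PackSummary, low: int, high: int) -> Relevance:
    """Classify a pack for the predicate low <= x <= high using only its summary."""
    if summary.max_value < low or summary.min_value > high:
        return Relevance.IRRELEVANT
    if low <= summary.min_value and summary.max_value <= high:
        return Relevance.RELEVANT
    return Relevance.PARTIAL


def packs_to_decompress(summaries: list[PackSummary], low: int, high: int) -> list[int]:
    """Indices of packs that must be opened to decide which rows match."""
    return [
        i for i, s in enumerate(summaries)
        if classify_pack(s, low, high) is Relevance.PARTIAL
    ]


summaries = [
    PackSummary(0, 9),
    PackSummary(10, 19),
    PackSummary(15, 30),
    PackSummary(40, 50),
]
labels = [classify_pack(s, 10, 20) for s in summaries]
to_open = packs_to_decompress(summaries, 10, 20)
```

Under these assumptions, the relevant, irrelevant, and partially relevant packs play the roles of the rough set positive, negative, and boundary regions, and only the boundary packs need to be decompressed to decide row membership.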

“…For the considered engine, it would be a step toward the paradigms of multi-level granular data analytics (Yao 2016). Yet another aspect refers to vertical data organization (Abadi et al 2013), which - in our case - means an independent access to collections of histograms, special values, gaps, co-occurrence ratios, etc. This way, for every query, we can grasp these components of stored data summaries that are required to execute particular operations.…”

Section: Data Types and Derived Columns (mentioning)

confidence: 99%


A new approximate query engine based on intelligent capture and fast transformations of granulated data summaries

Ślęzak, Glick, Betliński, et al. 2017. J Intell Inf Syst


We outline the processes of intelligent creation and utilization of granulated data summaries in the engine aimed at fast approximate execution of analytical SQL statements. We discuss how to use the introduced engine for the purposes of ad-hoc data exploration over large and quickly increasing data collected in a heterogeneous or distributed fashion. We focus on mechanisms that transform input data summaries into result sets representing query outcomes. We also illustrate how our computational principles can be put together with other paradigms of scaling and harnessing data analytics.


“…This trend has been called the not only SQL or NoSQL and was one of the outcomes of a rise of interactive, especially social, web services within the web 2.0 movement. [6] The most significant developments in the area of columnar data stores are the C-Store [2] [7] and MonetDB [7].…”

Section: Column-oriented DBMS (mentioning)

confidence: 99%

The Column-oriented Data Store Performance Considerations

Nowosielski, Kowalski, Kulczycki. 2016. Annals of Computer Science and Information Systems


Abstract: The massive amounts of data processed by information systems raise the importance of detailed database performance analysis. Column-oriented data stores are becoming increasingly popular in big data appliances. This paper identifies database performance factors on the basis of empirical studies on a custom implementation. To summarize the research, a simple mathematical performance model has been created.


“…Nevertheless, it needs to be stated explicitly that both models are not related. Modern column-oriented data stores have been covered in [2]. Implementation details of the Apache Cassandra column-family store have been described in [3].…”

Section: B Column-oriented Databases (mentioning)

confidence: 99%

The column-oriented database partitioning optimization based on the natural computing algorithms

Nowosielski, Kowalski, Kulczycki. 2015. Annals of Computer Science and Information Systems


Abstract: This paper describes the basic components of a research project aimed at the application of natural computing metaheuristics to optimize the horizontal scaling of databases. Column-oriented databases were selected for the project because of their unique properties. A mathematical model has been created in order to align the problem of horizontal scalability to the general optimization methods, such as natural computing algorithms.
