Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data 2013
DOI: 10.1145/2463676.2465298

On brewing fresh espresso

Abstract: Espresso is a document-oriented distributed data serving platform that has been built to address LinkedIn's requirements for a scalable, performant, source-of-truth primary store. It provides a hierarchical document model, transactional support for modifications to related documents, realtime secondary indexing, on-the-fly schema evolution and provides a timeline consistent change capture stream. This paper describes the motivation and design principles involved in building Espresso, the data model and capabil…

Cited by 42 publications (3 citation statements)
References 8 publications (4 reference statements)
“…A workload with 20% of writes reduces the speedup from 6× for YCSB-B to 3.2×, and YCSB-A (50% writes) reduces it to 1.7×. However, workloads with atypically high fraction of writes are rare [7,10,15,62]. We observe a difference below 6% between the model and the platform at 1ms SLO.…”
Section: Validation Of the Queuing Model (mentioning)
confidence: 69%
“…Recent work has demonstrated the scalability benefits of the CREW model on Xeon-class servers [44,45]. As most workloads are read dominated [7,10,15,62], CREW offers a sweet spot in terms of scalable performance by keeping synchronization requirements to a minimum.…”
Section: Concurrency Model (mentioning)
confidence: 99%
“…At LinkedIn, Samza is commonly deployed with Databus inputs: Databus is a change data capture technology that records the log of writes to a database and makes this log available for applications to consume (Das et al. 2012; Qiao et al. 2013). Processing the stream of writes to a database enables jobs to maintain external indexes or materialized views onto data in a database and is especially relevant in conjunction with Samza's support for local state (see section "Fault-Tolerant Local State") (Fig.…”
[Figure caption carried in the excerpt: Apache Samza, Fig. 1. The two operators of a streaming word-frequency counter using Samza's StreamTask API (image source: Kleppmann and Kreps 2015, © 2015 IEEE, reused with permission).]
Section: Partitioned Log Processing (mentioning)
confidence: 99%
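
The last statement above describes jobs that consume a log of database writes to maintain external indexes or materialized views. A minimal sketch of that idea follows, assuming a simplified in-memory log. `ChangeEvent` and `apply_changes` are hypothetical names introduced for illustration; they are not part of the Databus, Espresso, or Samza APIs, and the event layout is not the Databus wire format.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """One entry in a hypothetical change log (illustrative only)."""
    sequence: int          # position in the log; increases with commit order
    key: str               # primary key of the changed row
    value: Optional[str]   # new row value, or None for a delete


def apply_changes(
    view: Dict[str, str],
    events: Iterable[ChangeEvent],
    last_applied: int = -1,
) -> int:
    """Apply events in log order and return the last sequence number applied."""
    for event in events:
        if event.sequence <= last_applied:
            continue  # already applied; skipping makes a replay harmless
        if event.value is None:
            view.pop(event.key, None)      # delete removes the key from the view
        else:
            view[event.key] = event.value  # insert or update overwrites the key
        last_applied = event.sequence
    return last_applied


log = [
    ChangeEvent(0, "member:1", "Ada"),
    ChangeEvent(1, "member:2", "Grace"),
    ChangeEvent(2, "member:1", "Ada L."),
    ChangeEvent(3, "member:2", None),
]

view: Dict[str, str] = {}
checkpoint = apply_changes(view, log)
# Replaying the same log from the stored checkpoint leaves the view unchanged.
checkpoint_after_replay = apply_changes(view, log, checkpoint)
```

Tracking the last applied sequence number is one simple way to make a consumer restartable; a real deployment would persist that checkpoint alongside the view.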