2017
DOI: 10.14778/3137765.3137770

Samza

Abstract: Distributed stream processing systems need to support stateful processing, recover quickly from failures to resume such processing, and reprocess an entire data stream quickly. We present Apache Samza, a distributed system for stateful and fault-tolerant stream processing. Samza utilizes a partitioned local state along with a low-overhead background changelog mechanism, allowing it to scale to massive state sizes (hundreds of TB) per application. Recovery from fa…

Citations: cited by 204 publications (18 citation statements)
References: 27 publications
“…We believe that the use of the RAM³S framework can effectively help researchers in scaling out to distributed computing scenarios analysis techniques that were initially conceived for a centralized system. Among the future research directions that we are interested in pursuing, we put forward the application of RAM³S in other contexts (like automated industry, smart mobility, and public health) and expanding it to other, recently introduced, big data platforms, like Apache Samza (https://samza.apache.org) [24].…”
Section: Discussion; citation type: mentioning
confidence: 99%
“…Apache Samza is an open source distributed processing framework developed by LinkedIn. This processing framework was created to solve various kinds of stream processing requirements such as efficient use of resources and at scale, handle failures gracefully, and scalability [58]. It provides at-least-once processing semantics and once-at-a-time processing model [57].…”
Section: Apache Samza; citation type: mentioning
confidence: 99%
“…Apache Samza, an open source stream processing framework, can be used for any of the above applications (Kleppmann and Kreps 2015; Noghabi et al 2017). It was originally developed at LinkedIn, then donated to the Apache Software Foundation in 2013, and became a top-level Apache project in 2015.…”
Section: Overview; citation type: mentioning
confidence: 99%
“…Samza is designed for usage scenarios that require very high throughput: in some production settings, it processes millions of messages per second or trillions of events per day (Feng 2015; Paramasivam 2016; Noghabi et al 2017). Consequently, the design of Samza prioritizes scalability and operational robustness above most other concerns.…”
Section: Overview; citation type: mentioning
confidence: 99%