Shahzad Khan scite author profile

A Survey of Distributed Data Stream Processing Frameworks

Big data processing systems are evolving to be more stream-oriented, with each data record processed as it arrives by distributed, low-latency computational frameworks on a continuous basis. As stream processing technology matures and more organizations invest in digital transformation, new applications of stream analytics will be identified and implemented across a wide spectrum of industries. One of the challenges in developing a streaming analytics infrastructure is selecting the right stream processing framework for each use case. To address this issue, in this paper we present a taxonomy, a comparative study of distributed data stream processing and analytics frameworks, and a critical review of representative open-source (Storm, Spark Streaming, Flink, Kafka Streams) and commercial (IBM Streams) distributed data stream processing frameworks. The paper also reports our ongoing work on a multilevel streaming analytics architecture that can serve as a guide for organizations and individuals planning to implement a real-time data stream processing and analytics framework.

Index terms: dataflow architectures, data stream architectures, distributed processing systems comparison, survey, taxonomy.

VoIP: State of art for global connectivity—A critical review

Singh, Singh, et al. 2014

Journal of Network and Computer Applications

Artificial intelligence framework for smart city microgrids: State of the art, challenges, and opportunities

Khan, Paul, Momtahan, et al. 2018

Performance improvement in wireless sensor and actor networks based on actor repositioning

Khan, 2015

A Scalable Framework for Multilevel Streaming Data Analytics using Deep Learning

Isah, Zulkernine, et al. 2019

The rapid growth of data in velocity, volume, value, variety, and veracity has enabled exciting new opportunities and presented big challenges for businesses of all types. Recently, there has been considerable interest in developing systems for processing continuous data streams, driven by the increasing need for real-time analytics for decision support in business, healthcare, manufacturing, and security. The analytics of streaming data usually relies on the output of offline analytics on static or archived data. However, businesses and organizations such as our industry partner Gnowit strive to provide their customers with real-time market information and continuously look for a unified analytics framework that can seamlessly integrate streaming and offline analytics to extract knowledge from large volumes of hybrid streaming data. We present our study on designing a multilevel streaming text data analytics framework by comparing leading-edge, scalable, open-source, distributed, and in-memory technologies. We demonstrate the functionality of the framework for a use case of multilevel text analytics using deep learning for language understanding and sentiment analysis, including data indexing and query processing. Our framework combines Spark Streaming for real-time text processing, the Long Short-Term Memory (LSTM) deep learning model for higher-level sentiment analysis, and other tools for SQL-based analytical processing to provide a scalable solution for multilevel streaming text analytics.

Shahzad Khan

A Survey of Distributed Data Stream Processing Frameworks