Lambros Odysseos scite author profile

Lambros Odysseos

5 Publications

7 Citation Statements Received

25 Citation Statements Given

Affiliations

Cyprus University of Technology

Publications

Automatic Performance Tuning for Distributed Data Stream Processing Systems

Herodotou

Odysseos

Chen

et al. 2022

Distributed data stream processing systems (DSPSs) such as Storm, Flink, and Spark Streaming are now routinely used to process continuous data streams in (near) real-time. However, achieving the low latency and high throughput demanded by today's streaming applications can be a daunting task, especially since the performance of DSPSs highly depends on a large number of system parameters that control load balancing, degree of parallelism, buffer sizes, and various other aspects of system execution. This tutorial offers a comprehensive review of the state-of-the-art automatic performance tuning approaches that have been proposed in recent years. The approaches are organized into five main categories based on their methodologies and features: cost modeling, simulation-based, experiment-driven, machine learning, and adaptive tuning. The categories of approaches will be analyzed in depth and compared to each other, exposing their various strengths and weaknesses. Finally, we will identify several open research problems and challenges related to automatic performance tuning for DSPSs.

Migration of Software Components to Microservices: Matching and Synthesis

Christoforou

Odysseos

Andreou

2019

An Intelligent Framework for Vessel Traffic Monitoring Using AIS Data

Evmides

Odysseos

Michaelides

et al. 2022

DITIS: A Distributed Tiered Storage Simulator

Filho

Odysseos

Yang

et al. 2022

Infocommunications journal

This paper presents DITIS, a simulator for distributed and tiered file-based storage systems. In particular, DITIS can model a distributed storage system with up to three levels of storage tiers and up to three additional levels of caches. Each tier and cache can be configured with a different number and type of storage media devices (e.g., HDD, SSD, NVRAM, DRAM), each with their own performance characteristics. The simulator utilizes the provided characteristics in fine-grained performance cost models (which are distinct for each device type) in order to compute the duration of each I/O request processed on each tier. At the same time, DITIS simulates the overall flow of requests through the different layers and storage nodes of the system using numerous pluggable policies that control every aspect of execution, ranging from request routing and data redundancy to cache and tiering strategies. For performing the simulation, DITIS adopts an extended version of the Actor Model, in which key components of the system exchange asynchronous messages with each other, much like a real distributed multi-threaded system. The ability to simulate the execution of a workload in such an accurate and realistic way brings multiple benefits for its users, since DITIS can be used to better understand the behavior of the underlying file system as well as evaluate different storage setups and policies.
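As a rough illustration of what a per-device performance cost model might look like, the sketch below estimates the duration of an I/O request as a fixed access latency plus a transfer time. It is not taken from DITIS: the `DeviceModel` class, its fields, and the numeric characteristics are all hypothetical, and it simplifies the abstract's "distinct" per-device models into one linear formula with different parameters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceModel:
    """Hypothetical performance characteristics of one storage device type."""
    name: str
    access_latency_s: float        # fixed per-request cost, in seconds
    bandwidth_bytes_per_s: float   # sustained transfer rate, in bytes/second

    def io_duration(self, request_bytes: int) -> float:
        """Estimated duration (seconds) of one I/O request on this device."""
        return self.access_latency_s + request_bytes / self.bandwidth_bytes_per_s


# Illustrative numbers only; not measured values and not DITIS defaults.
HDD = DeviceModel("HDD", access_latency_s=0.008, bandwidth_bytes_per_s=150e6)
SSD = DeviceModel("SSD", access_latency_s=0.0001, bandwidth_bytes_per_s=500e6)

if __name__ == "__main__":
    for device in (HDD, SSD):
        print(device.name, device.io_duration(4096))
```

A simulator in this style would call `io_duration` for each request routed to a tier and use the result as the simulated service time; real models would likely also distinguish reads from writes and sequential from random access.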

On combining system and machine learning performance tuning for distributed data stream applications

Odysseos

Herodotou

2023

Distrib Parallel Databases

The growing need to identify patterns in data and automate decisions based on them in near-real time has stimulated the development of new machine learning (ML) applications processing continuous data streams. However, the deployment of ML applications over distributed stream processing engines (DSPEs) such as Apache Spark Streaming is a complex procedure that requires extensive tuning along two dimensions. First, DSPEs have a plethora of system configuration parameters, like degree of parallelism, memory buffer sizes, etc., that have a direct impact on application throughput and/or latency, and need to be optimized. Second, ML models have their own set of hyperparameters that require tuning as they can affect the overall prediction accuracy of the trained model significantly. These two forms of tuning have been studied extensively in the literature but only in isolation from each other. This manuscript presents a comprehensive experimental study that combines system configuration and hyperparameter tuning of ML applications over DSPEs. The experimental results reveal unexpected and complex interactions between the choices of system configurations and hyperparameters, and their impact on both application and model performance. These insights motivate the need for new combined system and ML model tuning approaches, and open up new research directions in the field of self-managing distributed stream processing systems.
