TY - GEN
T1 - A systematic mapping of performance in distributed stream processing systems
AU - Vogel, Adriano
AU - Henning, Sören
AU - Ertl, Otmar
AU - Rabiser, Rick
PY - 2023
Y1 - 2023
N2 - Several software systems are built upon stream processing architectures to process large amounts of data in near real-time. Today’s distributed stream processing systems (DSPSs) spread the processing among multiple machines to provide scalable performance. However, high performance and Quality of Service (QoS) in distributed stream processing are challenging to predict, achieve, and maintain. While many studies focus on evaluating or improving the performance of stream processing, getting a comprehensive view of the current state of DSPSs and their performance in real-world deployments is challenging. In this paper, we present a systematic mapping study of the literature on DSPSs’ performance. We discuss existing challenges, the most used DSPSs, achieved performance, and future trends. Our results demonstrate that performance is still one of the major concerns in stream processing, with several solutions available and different outcomes regarding the metrics, execution environments, and use cases considered. Moreover, there is a need for better benchmarks and workloads as well as for performance improvements by increasing efficiency and utilizing modern hardware. Our study intends to help software engineering practitioners and researchers to understand how to choose the most suitable DSPS to build efficient data-intensive architectures.
AB - Several software systems are built upon stream processing architectures to process large amounts of data in near real-time. Today’s distributed stream processing systems (DSPSs) spread the processing among multiple machines to provide scalable performance. However, high performance and Quality of Service (QoS) in distributed stream processing are challenging to predict, achieve, and maintain. While many studies focus on evaluating or improving the performance of stream processing, getting a comprehensive view of the current state of DSPSs and their performance in real-world deployments is challenging. In this paper, we present a systematic mapping study of the literature on DSPSs’ performance. We discuss existing challenges, the most used DSPSs, achieved performance, and future trends. Our results demonstrate that performance is still one of the major concerns in stream processing, with several solutions available and different outcomes regarding the metrics, execution environments, and use cases considered. Moreover, there is a need for better benchmarks and workloads as well as for performance improvements by increasing efficiency and utilizing modern hardware. Our study intends to help software engineering practitioners and researchers to understand how to choose the most suitable DSPS to build efficient data-intensive architectures.
UR - https://www.scopus.com/pages/publications/85175210645
U2 - 10.1109/SEAA60479.2023.00052
DO - 10.1109/SEAA60479.2023.00052
M3 - Conference proceedings
SN - 979-8-3503-4235-2
T3 - Proceedings - 2023 49th Euromicro Conference on Software Engineering and Advanced Applications, SEAA 2023
SP - 293
EP - 300
BT - Proceedings of the 49th Euromicro Conference on Software Engineering and Advanced Applications (SEAA)
PB - IEEE
CY - New York City, United States
ER -