End to End Zero Shot Model for Parallelism Prediction

Supervisor: Pratyush Agnihotri
KOM-ID: KOM-M-0778
Student: Jeffrey Resnik
Link to the thesis announcement

To cope with rapidly changing workloads, Distributed Stream Processing Systems rely on parallelism, running multiple instances of an operator to process large volumes of data. Several heuristic-based and learning-based approaches have recently been proposed to tune parallelism, but they are limited to specific workloads and do not generalize to unseen workloads, which can result in degraded performance and inefficient resource utilization. In this thesis, we investigate how an end-to-end learning model can be used for parallelism prediction in distributed systems. The proposed model should accurately predict optimal parallelism levels under a variety of scenarios, thus improving data processing efficiency and resource utilization in distributed systems.