Accurate Synthetic Tasks for Scientific Workflow Benchmarks
Key: WKMS26
Author: Tobias Wies, Sami Kharma, Tobias Meuser, Björn Scheuermann
Date: May 2026
Kind: In proceedings
Abstract: Scientific workflows are fundamental for automating complex data processing tasks. Research on workflow scheduling and execution strategies often relies on real-world workflows for testing and validation. Setting up such workflows is time-consuming and error-prone because they depend on large datasets, specialized software, or specific hardware. Synthetic workflows have been proposed as a solution, offering realistic benchmarking scenarios. Existing workload generators, however, primarily model tasks based on aggregated resource consumption metrics, such as total CPU time or peak memory usage, which do not capture the dynamic resource usage patterns exhibited by real-world tasks. This limits the expressiveness of experiments performed on synthetic workflows. In this work, we introduce a novel approach for generating synthetic workflow tasks that capture dynamic resource usage patterns. In addition, we derive realistic task models by monitoring real workflow task executions and applying segmented linear regression to the measurement data. These models serve as the basis for generating synthetic tasks. Our method allows researchers to create synthetic workflows that more accurately reflect the behavior of real-world scientific workflows, while remaining efficient and requiring no external datasets.
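
The abstract names segmented linear regression as the way measurement data is turned into task models, but it does not say how breakpoints are chosen. The following is a minimal sketch of one possible variant, not the authors' implementation: it assumes a single resource trace sampled over time and a fixed, user-chosen number of segments, and it picks the breakpoints that minimise the total squared error by dynamic programming. The names `fit_line`, `segmented_fit`, `n_segments`, and `min_points` are hypothetical.

```python
import numpy as np


def fit_line(t, y):
    """Least-squares line through (t, y); returns (slope, intercept, sse)."""
    slope, intercept = np.polyfit(t, y, 1)
    residuals = y - (slope * t + intercept)
    return float(slope), float(intercept), float(np.sum(residuals ** 2))


def segmented_fit(t, y, n_segments, min_points=3):
    """Split a trace into n_segments contiguous pieces, one line per piece.

    Assumption (not from the paper): the number of segments is fixed in
    advance and breakpoints minimise the summed squared error.
    Returns a list of (start, end_exclusive, slope, intercept).
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(t)
    if n < n_segments * min_points:
        raise ValueError("not enough samples for the requested segments")

    inf = float("inf")
    # best[k][j]: lowest error covering the first j samples with k segments
    best = [[inf] * (n + 1) for _ in range(n_segments + 1)]
    prev = [[-1] * (n + 1) for _ in range(n_segments + 1)]
    best[0][0] = 0.0

    for k in range(1, n_segments + 1):
        for j in range(k * min_points, n + 1):
            for i in range((k - 1) * min_points, j - min_points + 1):
                if best[k - 1][i] == inf:
                    continue
                cost = best[k - 1][i] + fit_line(t[i:j], y[i:j])[2]
                if cost < best[k][j]:
                    best[k][j] = cost
                    prev[k][j] = i

    # Walk the stored breakpoints backwards to recover the segments.
    segments = []
    j = n
    for k in range(n_segments, 0, -1):
        i = prev[k][j]
        slope, intercept, _ = fit_line(t[i:j], y[i:j])
        segments.append((i, j, slope, intercept))
        j = i
    return segments[::-1]


# Hypothetical memory trace: a linear ramp followed by a plateau.
t = np.arange(20)
y = np.where(t < 10, 2.0 * t, 50.0)
for start, end, slope, intercept in segmented_fit(t, y, n_segments=2):
    print(f"samples {start}..{end - 1}: slope={slope:.2f}, intercept={intercept:.2f}")
```

Each resulting (slope, intercept) pair describes how the resource evolves within one phase of the task, which is the kind of per-phase description a synthetic task could replay. The exhaustive search is cubic-ish in the trace length and is meant for illustration only; longer traces would need a cheaper breakpoint search.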
