Communication Networks
I/O of Scientific Workflows Monitored in Detail
| Key: | WLBSS24 |
| Author: | Joel Witzke, Ansgar Lößer, Vasilis Bountris, Florian Schintke, Björn Scheuermann |
| Date: | September 2024 |
| Kind: | In proceedings |
| Publisher: | IEEE |
| Abstract: | Correlating detailed local resource utilization data with the high-level concepts of the distributed scientific workflow systems that ultimately cause it is challenging. When running a large-scale scientific data analysis workflow across a distributed execution environment, we want to analyze its I/O behaviour to identify potential bottlenecks. Since tasks may be assigned to any available node, local resource usage on a node does not directly show which tasks are causing it. We acquire resource usage profiles of the involved nodes and link them to the individual workflow tasks. This is done by associating low-level trace metadata with high-level task information from log files and job management systems like Kubernetes. This information helps identify the areas of the workflow, on a logical task level, where improvements can have the biggest impact. |
The documents distributed by this server have been provided by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a non-commercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.