Cost-effective Job-splitting in Parallelized Systems

September 05, 2017

Parallel systems (also known as Fork-Join systems) are ubiquitous in this age of high-performance computing, which makes them a topic of great research interest. A few popular implementations of such systems are Apache Spark, Hadoop, and Multipath TCP. In a Fork-Join system, each job is split into tasks as it arrives, and each task is assigned to one of the parallel servers. A job leaves the system when all of its constituent tasks are finished. A common performance metric for such systems is the time a job has to wait in the system. This quantity is random and depends on the dynamics of the job arrivals, i.e., how jobs enter the system, and on the dynamics of the server capacities, i.e., how fast workers process their tasks. In this thesis, however, we intend to focus on cost-effective ways of splitting incoming jobs while taking the waiting time into account. Given some assumptions on the cost structure per server usage (e.g., the cost model of Amazon EC2 virtual machines), we seek to explore splitting strategies under cost constraints. The first step would be to find a scheduling strategy assuming a linear cost per server along with simple models for job arrivals and worker service times. Thereafter, we may look for further generalizations. There will be an opportunity to collaborate with Prof. P. Kuehn (Universitaet Stuttgart) while working on this thesis.
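
To make the first step more concrete, the following is a minimal simulation sketch, not the method to be developed in the thesis. It assumes Poisson job arrivals, an even split of each job into one task per server, exponentially distributed task service times, FIFO servers, and a cost that grows linearly with the number of servers. The function names, parameters, and the even-split rule are all illustrative assumptions.

```python
import random


def mean_sojourn_time(num_servers, arrival_rate, job_rate, num_jobs=20000, seed=1):
    """Estimate the mean time a job spends in a simulated fork-join system.

    Assumptions (illustrative only): Poisson arrivals with rate `arrival_rate`,
    every job is split evenly into one task per server, and each task takes an
    exponential time with rate num_servers * job_rate on a FIFO server.
    """
    rng = random.Random(seed)
    task_rate = num_servers * job_rate      # each task is 1/k of a job
    server_free_at = [0.0] * num_servers    # time at which each server becomes idle
    now = 0.0
    total_sojourn = 0.0
    for _ in range(num_jobs):
        now += rng.expovariate(arrival_rate)        # next job arrival
        job_done = now
        for s in range(num_servers):
            start = max(now, server_free_at[s])     # wait behind earlier tasks
            server_free_at[s] = start + rng.expovariate(task_rate)
            job_done = max(job_done, server_free_at[s])
        total_sojourn += job_done - now             # job leaves with its last task
    return total_sojourn / num_jobs


def cheapest_server_count(target_time, cost_per_server, arrival_rate, job_rate,
                          max_servers=16):
    """Return (servers, cost) for the smallest split meeting the time target.

    With a linear cost per server, the cheapest feasible option is the
    smallest number of servers whose simulated mean sojourn time is within
    `target_time`. Returns None if no count up to `max_servers` qualifies.
    """
    for k in range(1, max_servers + 1):
        if mean_sojourn_time(k, arrival_rate, job_rate) <= target_time:
            return k, k * cost_per_server
    return None
```

With a single server the model reduces to an M/M/1 queue, so the estimate can be checked against the known mean sojourn time 1/(job_rate - arrival_rate). Calling `cheapest_server_count(1.5, 3.0, 0.5, 1.0)` then searches for the smallest split that keeps the simulated mean time in the system below 1.5 time units at a hypothetical price of 3.0 per server.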


Keywords: Queueing Theory

Tutor: Sounak Kar
