SC20 Virtual Platform
Waiting Game: Optimally Provisioning Fixed Resources for Cloud-Enabled Schedulers
Event Type
Paper
Tags
Cloud and Distributed Computing
Containers
Machine Learning, Deep Learning and Artificial Intelligence
Resource Management and Scheduling
Award Finalists
Best Paper Finalist
Best Student Paper Finalist
Registration Categories
TP
Time
Wednesday, 18 November 2020, 3pm - 3:30pm EST
Location
Track 2
Description
While cloud platforms enable users to rent computing resources on demand to execute their jobs, buying fixed resources is still much cheaper than renting if utilization is high. Optimizing cloud costs requires users to determine how many fixed resources to buy versus rent based on their workload. In this paper, we introduce the concept of a waiting policy for cloud-enabled schedulers, the dual of a scheduling policy, and show that the optimal cost depends on this policy. We define multiple waiting policies and develop simple analytical models to reveal the trade-off among resource provisioning, cost, and job waiting time. We evaluate the impact of these waiting policies on a year-long production batch workload consisting of 14 million jobs run on a 14.3k-core cluster, and show that a compound waiting policy decreases the cost (by 5%) and the mean job waiting time (by 7x) compared to the current cluster.
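The buy-versus-rent trade-off in the description can be illustrated with a toy calculation. The sketch below is not the paper's model: it assumes an hourly demand trace in cores, a single per-core-hour price for fixed and for rented capacity, and a no-waiting policy in which any demand above the fixed pool is rented immediately. The function names, prices, and trace are all hypothetical.

```python
from typing import Sequence


def provisioning_cost(demand: Sequence[int], fixed_cores: int,
                      fixed_price: float, rent_price: float) -> float:
    """Total cost of serving a demand trace (cores needed per hour).

    Assumption: fixed cores are paid for every hour whether used or not,
    and overflow demand is rented immediately (jobs never wait).
    """
    fixed_cost = fixed_cores * fixed_price * len(demand)
    rented_core_hours = sum(max(0, d - fixed_cores) for d in demand)
    return fixed_cost + rented_core_hours * rent_price


def best_fixed_cores(demand: Sequence[int],
                     fixed_price: float, rent_price: float) -> int:
    """Brute-force the fixed pool size that minimizes total cost."""
    candidates = range(0, max(demand) + 1)
    return min(candidates,
               key=lambda n: provisioning_cost(demand, n, fixed_price, rent_price))


if __name__ == "__main__":
    # Hypothetical six-hour trace with one short burst.
    demand = [2, 4, 4, 10, 3, 4]
    n = best_fixed_cores(demand, fixed_price=1.0, rent_price=3.0)
    print(n, provisioning_cost(demand, n, fixed_price=1.0, rent_price=3.0))
```

With these made-up numbers, a fixed core pays off only if it is busy more than one third of the time (the ratio of the fixed to the rented price), so the burst to 10 cores is rented rather than bought. A waiting policy would change this calculation by letting some overflow jobs queue for fixed cores instead of renting right away, which this sketch does not model.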