SC20 Proceedings

The International Conference for High Performance Computing, Networking, Storage, and Analysis

Alita: Comprehensive Performance Isolation through Bias Resource Management for Public Clouds

Authors: Quan Chen, Shuai Xue, and Shang Zhao (Shanghai Jiao Tong University, Alibaba Cloud); Shanpei Chen, Yihao Wu, Yu Xu, Zhuo Song, Tao Ma, and Yong Yang (Alibaba Cloud); and Minyi Guo (Shanghai Jiao Tong University)

Abstract: Tenants of public clouds share hardware resources on the same node, which creates the potential for performance interference or malicious attacks. A tenant can significantly degrade the performance of its neighbors on the same node by overusing the shared memory bus, the last-level cache (LLC) and memory bandwidth, or power.

To eliminate such unfairness, we propose Alita, a runtime system consisting of an online interference identifier and an adaptive interference eliminator. The interference identifier monitors hardware and system-level event statistics to identify resource polluters. The eliminator improves the performance of normal applications by throttling the resource usage of polluters only. Specifically, Alita adopts bus lock sparsification, bias LLC/bandwidth isolation, and selective power throttling to throttle the resource usage of polluters. Results on an experimental platform and an in-production cloud demonstrate that Alita, relying on system-level knowledge, significantly improves the performance of co-located virtual machines in the presence of resource polluters.
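
The identify-then-throttle flow described above can be illustrated with a minimal sketch. This is not Alita's implementation: the counter names, the threshold values, and the simple threshold test are all assumptions made for illustration, and the abstract does not say how polluters are actually detected. The sketch only shows the idea of mapping each overused resource to the matching technique and leaving well-behaved VMs untouched.

```python
from dataclasses import dataclass


@dataclass
class VmStats:
    """Per-VM counters sampled over one interval (hypothetical fields)."""
    name: str
    bus_locks_per_sec: float
    llc_misses_per_sec: float
    power_watts: float


# Hypothetical thresholds; real values would need calibration per platform.
THRESHOLDS = {
    "bus_locks_per_sec": 1000.0,
    "llc_misses_per_sec": 5e7,
    "power_watts": 90.0,
}

# Technique named in the abstract for each overused resource.
ACTIONS = {
    "bus_locks_per_sec": "bus lock sparsification",
    "llc_misses_per_sec": "bias LLC/bandwidth isolation",
    "power_watts": "selective power throttling",
}


def plan_throttling(stats):
    """Return {vm_name: [techniques]} for VMs that exceed any threshold.

    VMs that stay under every threshold do not appear in the result,
    so only polluters are throttled.
    """
    plan = {}
    for vm in stats:
        actions = [
            ACTIONS[counter]
            for counter, limit in THRESHOLDS.items()
            if getattr(vm, counter) > limit
        ]
        if actions:
            plan[vm.name] = actions
    return plan
```

In this sketch a VM that overuses several resources receives several techniques, while a VM within all limits is never listed and therefore never throttled.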
