μConAdapter: Reinforcement Learning-based Fast Concurrency Adaptation for Microservices in Cloud

Abstract

Modern web-facing applications such as e-commerce sites comprise tens or hundreds of distributed, loosely coupled microservices, an architecture intended to facilitate high scalability. While hardware resource scaling approaches [28] have been proposed to address response time fluctuations in critical microservices, little attention has been given to scaling soft resources (e.g., threads or database connections), which control the concurrency with which hardware resources are used. This paper demonstrates that the soft resource allocation of critical microservices significantly impacts overall system performance, particularly response time. This finding suggests the need for fast and intelligent runtime reallocation of soft resources as part of microservices scaling management. We introduce μConAdapter, an intelligent and efficient framework for managing concurrency adaptation. It quickly identifies optimal soft resource allocations for critical microservices and adjusts them to mitigate violations of service-level objectives (SLOs). To do so, μConAdapter combines fine-grained online monitoring metrics from both the system and application levels with a Deep Q-Network (DQN) that adaptively selects concurrency settings for critical microservices. Using six realistic bursty workload traces and two representative microservices-based benchmarks (SockShop and SocialNetwork), our experimental results show that μConAdapter effectively mitigates large response time fluctuations and reduces 99th-percentile tail latency by 3× on average compared to hardware-only scaling strategies such as Kubernetes Autoscaling and FIRM [28], and by 1.6× compared to ConScale [21], a state-of-the-art concurrency-aware scaling strategy.
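
To make the idea of reinforcement learning over soft resources concrete, the following is a minimal sketch and not the framework's actual implementation. It swaps the DQN for plain tabular Q-learning and replaces real monitoring metrics with a toy latency model; the candidate pool sizes, the latency function, the reward (negative latency), and all hyperparameters are assumptions made purely for illustration.

```python
import random

# Hypothetical candidate thread-pool sizes for one critical microservice.
POOL_SIZES = [2, 4, 8, 16, 32, 64]
# Actions: step down, hold, or step up by one position in POOL_SIZES.
ACTIONS = [-1, 0, +1]


def simulated_latency_ms(pool_size, load=16):
    """Toy latency model (an assumption, not measured data): too few
    threads cause queueing, too many cause contention."""
    queueing = 100.0 * max(0, load - pool_size) / load
    contention = 2.0 * max(0, pool_size - load)
    return 20.0 + queueing + contention


def step(state, action_idx):
    """Apply an action, clamp to the valid range, return (next_state, reward)."""
    next_state = min(max(state + ACTIONS[action_idx], 0), len(POOL_SIZES) - 1)
    reward = -simulated_latency_ms(POOL_SIZES[next_state])
    return next_state, reward


def train(episodes=2000, steps=25, alpha=0.1, gamma=0.9, epsilon=0.3, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in POOL_SIZES]
    for _ in range(episodes):
        state = rng.randrange(len(POOL_SIZES))
        for _ in range(steps):
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))  # explore
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[state][i])  # exploit
            nxt, r = step(state, a)
            # Standard Q-learning update toward the bootstrapped target.
            q[state][a] += alpha * (r + gamma * max(q[nxt]) - q[state][a])
            state = nxt
    return q


def greedy_rollout(q, state, steps=10):
    """Follow the learned greedy policy and return the final pool size."""
    for _ in range(steps):
        a = max(range(len(ACTIONS)), key=lambda i: q[state][i])
        state, _ = step(state, a)
    return POOL_SIZES[state]


if __name__ == "__main__":
    q_table = train()
    for start in range(len(POOL_SIZES)):
        print(POOL_SIZES[start], "->", greedy_rollout(q_table, start))
```

In this toy setting the learned policy moves the pool size toward the point where queueing and contention balance. A DQN would replace the table with a neural network so that richer, continuous system-level and application-level metrics can serve as the state.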

Publication
In Proceedings of the 14th ACM Symposium on Cloud Computing (SoCC’23)