Chasing the "Tail at Scale": Toward Cloud-Native Architectures

Talk
Jovan Stojkovic
Time: 02.18.2025 11:00 to 12:00

To democratize access to cloud computing systems, cloud providers have introduced new cloud-native computing paradigms. These emerging paradigms, including microservices and serverless computing, offer significantly simpler programming models alongside cost-efficient billing models. However, cloud-native services differ fundamentally from traditional monolithic applications: they exhibit short execution times, frequent context switching, bursty request patterns, and strict tail latency requirements. Hence, when such workloads run on conventional hardware and software systems, they suffer substantial performance, energy, and resource inefficiencies. In this talk, I will present my research efforts to tackle these challenges by designing hardware platforms and software stacks that deliver orders-of-magnitude improvements in the efficiency of cloud-native workloads. First, I will introduce μManycore, a processor architecture tailored to minimize the tail latency of cloud services. I will then enhance the microarchitecture with Mosaic, which uses fine-grained partitioning of hardware resources to retain service state across frequent context switches. Next, I will integrate many domain-specific accelerators with the processor and introduce AccelFlow, an efficient mechanism for orchestrating these accelerators. Finally, I will showcase EcoFaaS, an energy management framework designed to significantly reduce the energy consumption of cloud-native services on these architectures while maintaining high performance. Together, these techniques enable a major advance in the efficiency of cloud-native workloads.