Reading Group. Faster and Cheaper Serverless Computing on Harvested Resources

The 83rd paper in the reading group continues with another SOSP’21 paper: “Faster and Cheaper Serverless Computing on Harvested Resources” by Yanqi Zhang, Íñigo Goiri, Gohar Irfan Chaudhry, Rodrigo Fonseca, Sameh Elnikety, Christina Delimitrou, Ricardo Bianchini. This paper is the second one in a series of harvested resources papers, with the first one appearing in OSDI’20.

As a brief background, harvested resources in the context of the paper are temporary free resources that are otherwise left unused on some machine. The first paper described a new type of VM that can grow and contract to absorb all unused resources of a physical machine. An alternative mechanism for using underutilized machines is creating spot instances on them. The spot instance approach, however, comes with start-up and shut-down overheads. As a result, having one VM that runs all the time but changes its size depending on what is available can reduce these overheads. Of course, using such unpredictable VMs creates several challenges. What types of use cases can tolerate such dynamic resource availability? The harvest VMs paper tested the approach on in-house data-processing tasks with a modified Hadoop.

The SOSP’21 serverless computing paper appears to present a commercial use case for harvest VMs. It makes sense to schedule serverless functions, which have rather limited runtimes, onto one of these dynamic harvest VMs. See, if a function runs for 30 seconds, then we only need the harvest VM to not shrink for those 30 seconds to support the function’s runtime. Of course, the reality is a bit more nuanced than this — serverless functions suffer from cold starts when a function is placed on a new machine, so, ideally, we want to have enough resources on the VM to last us through many invocations. The paper spends significant time studying various aspects of harvest VM performance and resource allocation. Luckily, around 70% of harvest VMs do not change their CPU allocations for at least 10 minutes, allowing plenty of time for a typical function to be invoked multiple times. Moreover, not all of these CPU allocation changes shrink the harvest VM, and adding more resources to the VM will not negatively impact the functions it already runs.

There are two major problems with running serverless workloads in the harvest VM environment: VM eviction, and resource shrinkage. Both of these problems impact running functions and create additional scheduling issues. 

The VM eviction is not unique to harvest VMs and can also occur with spot VMs. According to the original harvest VM paper, harvest VMs should get evicted far less frequently — only when the full capacity of the physical server is needed. Moreover, an eviction has a negative impact only when the VM is running a function, and since VMs get an eviction warning, they most often have enough time to finish executions that have already started. As a result, a serverless workload running in harvest VMs still has a 99.9985% success rate in the worst-case situation, when a data center has limited harvested resources and undergoes many changes. Nevertheless, the paper considers a few other strategies to minimize evictions and their impact. For instance, one strategy is to use regular VMs for longer-running functions to prevent them from being evicted “mid-flight,” while using harvest VMs for shorter jobs.
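To make this strategy concrete, below is a minimal sketch of such a routing rule. Everything here is my own illustrative assumption rather than the paper's actual policy: the class and pool names, and especially the fixed 30-second cutoff, since a real system would derive the threshold from observed function runtimes and the eviction-warning window.

```java
// Hypothetical sketch: send invocations expected to outlive a typical
// eviction-warning window to regular VMs, and everything else to cheaper
// harvest VMs. The threshold and names are illustrative, not from the paper.
import java.time.Duration;

class RuntimeAwareRouter {
    // Assumed cutoff; a real system would tune this from runtime histories.
    private static final Duration HARVEST_SAFE_LIMIT = Duration.ofSeconds(30);

    enum Pool { REGULAR_VM, HARVEST_VM }

    Pool route(Duration expectedRuntime) {
        // Long-running functions risk being evicted "mid-flight,"
        // so they go to dedicated (regular) VMs.
        return expectedRuntime.compareTo(HARVEST_SAFE_LIMIT) > 0
                ? Pool.REGULAR_VM
                : Pool.HARVEST_VM;
    }
}
```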

The resource shrinkage problem, on the other hand, is largely specific to harvest VMs. Even though most harvest VMs undergo resource reallocation relatively infrequently, a VM shrinkage can have severe implications. One or more running functions may need to stop due to resource shrinkage, and rescheduling these jobs may also incur cold starts. As a result, the paper presents a scheduling policy, called Min-Worker-Set (MWS), that minimizes cold starts for a function. The idea behind MWS is to place each function onto as few servers as possible while ensuring adequate compute capacity to serve the workload.
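The paper describes MWS at the policy level, so the following is only my simplified sketch of its core rule, assuming each worker can report its spare CPU. The scheduler keeps a per-function worker set, prefers members that already host the function (warm containers), and grows the set by exactly one worker only when no current member has capacity.

```java
// A minimal sketch of the Min-Worker-Set idea: keep each function on as few
// workers as possible, expanding the set only when current workers run out
// of capacity. The real policy must also cope with harvest VMs changing size.
import java.util.*;

interface Worker {
    int spareCpu(); // spare capacity the worker currently advertises
}

class MinWorkerSetScheduler {
    private final Map<String, List<Worker>> workerSets = new HashMap<>();
    private final List<Worker> allWorkers;

    MinWorkerSetScheduler(List<Worker> allWorkers) {
        this.allWorkers = allWorkers;
    }

    Worker schedule(String function, int cpuDemand) {
        List<Worker> set =
                workerSets.computeIfAbsent(function, f -> new ArrayList<>());
        // Prefer a worker that already hosts this function: its container
        // is warm, so the invocation avoids a cold start.
        for (Worker w : set) {
            if (w.spareCpu() >= cpuDemand) {
                return w;
            }
        }
        // Only if no current member has capacity, grow the set by a single
        // new worker, picking the one with the most spare CPU.
        Optional<Worker> candidate = allWorkers.stream()
                .filter(w -> !set.contains(w))
                .filter(w -> w.spareCpu() >= cpuDemand)
                .max(Comparator.comparingInt(Worker::spareCpu));
        candidate.ifPresent(set::add);
        return candidate.orElse(null); // null: no capacity anywhere, queue it
    }
}
```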

The authors implemented the prototype on the OpenWhisk platform. The paper provides extensive testing and evaluation of both performance and cost. Each figure in the paper has tons of information! That being said, I am including the performance-under-a-fixed-budget figure below to show how much cheaper running serverless on harvest VMs can be. The blue line is some workload with rather heavy, longer-running functions on dedicated VMs under a fixed budget. The other lines show the latency vs. throughput of the harvest VM solution under different levels of harvest VM availability. The “lowest” (red triangle) line is when few harvest resources are available, making harvest VMs most expensive (who decides the price of a harvest VM, and how?).

As usual, we had the presentation in the reading group. Bocheng Cui presented this paper, and we have the video on YouTube:

Discussion.

1) Commercializing Harvest VMs. This paper sounds like an attempt to sell otherwise unused resources in the data center. The previous harvest VM paper aimed at internal use cases (and there are lots of them, ranging from data processing to build-and-test systems). I think this is a great idea, and hopefully, it can make the cloud both cheaper and greener to operate with less resource waste. At the same time, it seems like the current prototype is just that — a prototype based on an open-source platform, and I wonder whether this is feasible (or can be done even more efficiently) at the scale of the entire Azure cloud.

2) Evaluation. Most of the evaluation is done in an environment that simulates harvest VMs based on real Azure traces. I suppose this is good enough, and the evaluation is rather extensive. It also includes a small section with the “real thing” running in actual harvest VMs. But again, I wonder about the scale.

Reading Group

Our reading group takes place over Zoom every Wednesday at 2:00 pm EST. We have a Slack group where we post papers, hold discussions, and, most importantly, manage Zoom invites to paper discussions. Please join the Slack group to get involved!

One Page Summary. Photons: Lambdas on a diet

Recently, to prepare for a class I teach this semester, I went through the “Photons: Lambdas on a diet” SoCC’20 paper by Vojislav Dukic, Rodrigo Bruno, Ankit Singla, Gustavo Alonso. This is a very well-written paper with a ton of educational value for people like me who are only vaguely familiar with the serverless space!

The authors try to solve some deficiencies in serverless runtimes. See, each function invocation is isolated from all other functions by running in a separate container. Such an isolated approach has lots of benefits, ranging from security to resource isolation. But there are downsides as well — each invocation of the same function needs its own container to run, so if there are no available idle containers from previous runs, a new one needs to be created, leading to cold starts. A “cold” function takes more time to run, as it needs to be deployed in a new container and compiled/loaded into the execution environment. In the case of many popular JIT-compiled languages (e.g., Java), this also means that it initially runs from bytecode in an interpreted mode without a bunch of clever compile-time optimizations. The paper points out another problem with having every function invocation require its own container — resource wastage. In particular, if we run many invocations of the same function concurrently, we are likely to waste memory on loading all the dependencies/libraries separately in each container. The authors also mention that some function invocations, for instance, functions for some ML or data processing workloads, may operate on the same initial dataset. Obviously, this dataset must be loaded into each function container separately.

The paper proposes a solution, called Photons, to address the cold-start issues and resource wastage in workloads with many concurrent invocations of the same functions. The concept of a Photon describes a function executed in a container shared with other function invocations of the same type. The premise here is to forgo some isolation and allow multiple invocations of the same function to share the container. As a result, the number of cold starts can be reduced, and resources, like RAM, can be better utilized by having libraries and common data loaded only once for many concurrent function calls.
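As a toy illustration of why this sharing helps (this is not the Photons API, just a hypothetical sketch), consider invocations of one function running as threads inside a single warm JVM: the libraries, the JIT-compiled code, and any read-only dataset are loaded once and reused across invocations instead of being paid for per container.

```java
// Toy sketch of the shared-container idea: many invocations of the same
// function run as tasks in one JVM, sharing loaded code and read-only data.
import java.util.concurrent.*;
import java.util.function.Function;

class SharedFunctionContainer<I, O> {
    private final Function<I, O> handler; // loaded and JIT-warmed once
    private final ExecutorService pool = Executors.newCachedThreadPool();

    SharedFunctionContainer(Function<I, O> handler) {
        this.handler = handler;
    }

    // Each invocation is just a new task in the warm JVM, so there is no
    // cold start and no duplicate copy of the function's dependencies.
    Future<O> invoke(I input) {
        return pool.submit(() -> handler.apply(input));
    }
}
```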

The lack of isolation between function invocations, however, creates a few problems. The paper solves some of them, but it also passes quite a few off to the function developers. One big problem passed to the developers is ensuring that different invocations of the same function consume a predictable amount of resources. This is needed for a couple of reasons. First, there are scheduling needs, as the system must predict machine and container resource usage to determine which containers can be expanded to take on more function calls or which machines can handle new containers. Second, the lack of container-level isolation means that a misbehaving function instance can grab too many resources and starve other concurrent functions in the same container.

Another big problem is memory isolation and memory sharing between concurrent function invocations. A significant part of the Photons platform deals with these problems. On memory isolation, things are easy when we only work with instance variables. Each object created from the class will have a separate copy. However, some class variables may be static, and, what is even worse, they can be mutable, allowing concurrent executions of the code to modify these static fields. Photons address this problem by transforming static variables into a map, where the key is a photonID (a unique ID of a function invocation). Naturally, all reads and writes to a static variable change to map gets and puts, as in the sketch below. There are a few more nuances with statics, and the paper covers them in greater detail. For sharing state, the Photons runtime maintains a KV-store exposed to all Photons/function invocations in the container. One major downside of having a KV-store is managing concurrent access to it, and the system leaves it up to the developers to coordinate access to this shared store. Aside from the shared KV-store, functions also share the file system for temporary storage.
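To illustrate the static-variable rewrite, here is a hand-written version of what (conceptually) the automated transformation produces. The photonID plumbing through a thread-local is my assumption for the sketch; the actual Photons runtime performs the rewrite itself and supplies the invocation ID.

```java
// Sketch: a mutable static field rewritten into a per-photon map, so that
// concurrent invocations sharing one JVM no longer see each other's writes.
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class Counters {
    // Original code:  static int requestCount = 0;
    // Rewritten: one entry per photonID (unique function-invocation ID).
    private static final Map<Long, Integer> requestCount = new ConcurrentHashMap<>();

    // Assumed mechanism for knowing "which invocation am I" inside shared code.
    static final ThreadLocal<Long> PHOTON_ID = new ThreadLocal<>();

    // A read of the static becomes a map get...
    static int getRequestCount() {
        return requestCount.getOrDefault(PHOTON_ID.get(), 0);
    }

    // ...and a write becomes a map put, keyed by the current photonID.
    static void incrementRequestCount() {
        requestCount.merge(PHOTON_ID.get(), 1, Integer::sum);
    }
}
```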