Paper #191: Occam’s Razor for Distributed Protocols

We have been running a Zoom-based distributed systems paper reading group for five years and have covered around 190 papers. This semester, we should reach the milestone of 200 papers. Over the years, my commitment to the group has varied — for a while, I wrote paper reviews, but more recently I have had less time for that. However, I will try to recommit to these reviews in some short form… And so, our 191st paper was “Occam’s Razor for Distributed Protocols” from SoCC’24.

This paper presents a reasoning framework for optimizing distributed systems and protocols in one specific way — finding opportunities for greater parallelism. The authors argue that many distributed protocols have largely unnecessary sequential steps. Consider a protocol that sends a message MA from some node A to node B. Node B receives MA and sends a message MB to C: A -> B -> C. The paper looks for such sequential steps and checks whether they can be turned into parallel ones: A -> C and A -> B.
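
To make the transformation concrete, here is a minimal Python sketch (all names are hypothetical) contrasting the sequential chain with its parallelized form; derive_mb stands in for whatever computation B performs on MA:

```python
# Minimal sketch (hypothetical names) of the sequential chain
# A -> B -> C versus the parallelized A -> C and A -> B.
# derive_mb stands in for B's computation on MA; crucially, it
# depends only on MA, which is what makes the rewrite possible.

def derive_mb(ma: str) -> str:
    return f"processed({ma})"

def sequential(ma: str) -> str:
    mb = derive_mb(ma)          # step at B: consume MA, produce MB
    return f"C received {mb}"   # step at C: consume MB

def parallelized(ma: str) -> str:
    derive_mb(ma)               # B still runs its step on MA
    mb_at_c = derive_mb(ma)     # C recreates MB locally from MA
    return f"C received {mb_at_c}"

# Both shapes deliver the same MB to C, but in the parallelized form
# B and C can process MA concurrently instead of back-to-back.
assert sequential("MA") == parallelized("MA")
```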

Such parallelization may be possible if node B is a weak dependency of C. A weak dependency must satisfy two rules. First, for B to be a weak dependency of C, the message MB must not depend on any state of B (but it may depend on the state of A). If MB is independent of the state of B, then C can essentially recreate MB from MA. The second rule is a bit more nuanced — if we parallelize the computation, the new version of the protocol can never “forget” to run B. In the sequential version, we get to C through B, even if we do not use B’s state. Thus, we know that B ran if C ran, and the parallelized version must preserve this behavior of always running B whenever C runs. I oversimplified things a bit here; the paper has a much more complete formalism of the idea. There are also some additional restrictions on the new parallelized versions — for example, if C sends a message, it may need to buffer it until B is also done.
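
As a rough illustration of the second rule and the buffering restriction, here is a hypothetical sketch (NodeC, the message shapes, and the callbacks are all made up for this example) in which C recreates MB from MA but holds its own outgoing messages until it learns that B has also executed:

```python
import queue

def derive_mb(ma: str) -> str:      # same stand-in computation as above
    return f"processed({ma})"

class NodeC:
    """Hypothetical node C in the parallelized protocol."""

    def __init__(self) -> None:
        self.outbox = queue.Queue() # outgoing messages held until B runs
        self.b_done = False

    def on_ma(self, ma: str) -> None:
        mb = derive_mb(ma)          # rule one: MB is derivable from MA alone
        self.outbox.put(f"C-ack({mb})")
        self._flush()

    def on_b_done(self) -> None:
        self.b_done = True          # rule two: B is guaranteed to have run
        self._flush()

    def _flush(self) -> None:
        # Release buffered output only after B has executed, so the rest
        # of the system never observes C acting "ahead of" B.
        while self.b_done and not self.outbox.empty():
            print("send:", self.outbox.get())
```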

The authors look at several classical protocols like Paxos and 2PC and optimize several steps in them. For instance, looking at the leader-follower interaction in the replication phase (phase-2) of MultiPaxos, we see that the followers must send acks (i.e., votes) to the leader; the leader counts the votes and sends back the commit notification. We have that A -> B -> C chain, where A is a set of followers, MA are the vote messages each follower sends, B is the leader, MB is the commit message, and C is again the set of followers who receive the commit notification. Note that MB really only depends on the collection of votes and not on some intrinsic state of B. In this scenario, we can parallelize the protocol and make A send votes directly to C and B (i.e., followers send votes to each other and to the leader). Upon receiving the votes, the followers can count them and, upon seeing a majority, arrive at the same commit conclusion that the leader would have sent in the sequential version.
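
Here is a hypothetical sketch of that decentralized commit decision: each follower tallies the votes it receives directly from its peers and commits locally once a majority is seen (the Follower class, cluster size, and message shapes are all assumptions for illustration):

```python
from collections import defaultdict

class Follower:
    """Hypothetical follower in the parallelized MultiPaxos phase 2."""

    def __init__(self, cluster_size: int) -> None:
        self.majority = cluster_size // 2 + 1
        self.votes = defaultdict(set)   # slot -> ids of voters seen so far
        self.committed = set()

    def on_vote(self, slot: int, voter_id: int) -> None:
        self.votes[slot].add(voter_id)
        # The same majority count the leader would perform; because the
        # commit decision depends only on the collected votes (not on any
        # leader-private state), every follower reaches the same verdict.
        if slot not in self.committed and len(self.votes[slot]) >= self.majority:
            self.committed.add(slot)
            print(f"slot {slot} committed locally")

# Example: in a 5-node cluster, seeing 3 votes for slot 7 commits it.
f = Follower(cluster_size=5)
for voter in (1, 2, 3):
    f.on_vote(slot=7, voter_id=voter)
```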

This is an interesting result! In fact, this parallelization of the protocol is precisely how Lamport described Paxos with learner nodes in Paxos Made Simple: “The obvious algorithm is to have each acceptor, whenever it accepts a proposal, respond to all learners, sending them the proposal. This allows learners to find out about a chosen value as soon as possible…”

Anyway, while the paper and its reasoning framework may have found a very well-known optimization (the learner pattern), the authors mention a few other examples and optimizations they were able to find. As such, I think the proposed reasoning framework is a good tool to have in my toolbox. However, it is not a one-size-fits-all solution. Some protocols and systems can be optimized in the exact opposite way, by increasing the number of sequential steps. Pipelined parallelism is a well-known approach; it is how our CPUs work. In distributed systems, Compartmentalized Paxos and PigPaxos are examples of building longer pipelines for better throughput at the expense of some latency.