# Reading Group. DistAI: Data-Driven Automated Invariant Learning for Distributed Protocols

In the 71st DistSys reading group meeting, we have discussed “DistAI: Data-Driven Automated Invariant Learning for Distributed Protocols” OSDI’21 paper. Despite the misleading title, this paper has nothing to do with AI or Machine Learning. Instead, it focuses on the automated search for invariants in distributed algorithms. I will be brief and a bit hand-wavy in this summary, mainly because I am too busy with other things to take care of (thanks, the beginning of the semester!), and this paper requires rather careful attention and a deep read to fully grasp it.

Before going into the paper itself, I want to talk about invariants. We use invariants a lot when thinking about the safety of algorithms and protocols. So an invariant is a property that must hold throughout the execution of an algorithm. It holds initially and after every discrete step of the protocols. For people like me, finding a good safety property is usually enough when checking some algorithm written in a spec language like TLA+. A TLC model checker runs an exhaustive search of the state space and ensures the safety property after every action. If the invariant condition is violated, TLC will produce a counterexample. There is a bit of a problem with this approach — for big algorithms, checking the model can take hours, days, or even weeks. TLA+/TLC is a bruteforce solution.

IVy, the system used as a component in today’s paper, takes a different approach. Instead of brute-forcing and checking all possible executions (and of course, the execution may be infinite too!), it tries to be clever and prove an algorithm correct by induction. The idea is simple on the surface — take an inductive invariant I prescribing the safety of an algorithm, and show that I holds initially and after all possible transitions starting from any state that satisfied the invariant I. In other words, we need to check that every possible transition from state S satisfying invariant results in a state S’ that still satisfies I. Unfortunately, in practice, this is where things get complicated. The problem is that our safety property I from the model-checking world may not be inductive. More specifically, if we are in some non-initial state S that satisfies I, there is still a possibility that after at action S->S’S’ will not satisfy I. This situation may arise not because the safety property is incorrect but because the state is an impossible or unreachable state that still somehow satisfies I. The solution is then to strengthen the invariant to some new invariant I’ that prevents the unreachable starting state for our possible transitions. IVy system provides tools to do this strengthening semi-automatically with some user help.

The DistAI paper builds on IVy and improves and optimizes the invariant search process to make it faster, more automatic, and more accurate. The process starts similarly to IVy’s, as a user provides the IVy spec with all possible actions, safety properties that must hold, and relations between variables. These relations between variables are important since they specify the properties of the algorithm. For example, a system may have a relation specifying whether a node holds a lock or whether the two nodes have communicated with each other. The system will use these relations to find the inductive invariants. The inductive invariant will be used in conjunction with safety property to finally check the protocol.

The rest of the DistAI’s invariant search process, however, is largely automatic. The process is iterative, with each iteration starting with some input to produce invariant candidates that involve up to some number of variables and no longer than some number of disjunctive terms. For instance, if all variable relations involve up to 2 variables, it may make sense to, initially, restrict the invariant search to just two variables.

Interestingly, at this step, DistAI brings back some of the model-checking magic into the process. It produces a few sample trace executions of the protocol and uses these traces to create a list of all observed variable relations after taking each action. In practice, this list of relations can be large if the simulation trace has many variables, so the system will randomly narrow down its focus to only look at a subsample of the execution trace with just a few variables at a time. For example, if all relations have no more than two variables, we may look at how some combination of two variables changed during the execution trace. This two-variable sample will follow a template with two quantified variables: $$\{\forall V_1, V_2\}$$. The DistAI system produces a handful of these subsamples for each quantified template.

After having these subsamples/sub-traces ready, DistAI enumerates the relations from subsamples. It can easily see which relations were holding throughout the entire execution trace. These relations can be generalized from having specific variable values in the samples to quantified variable placeholders from the sample’s template. For example, if a relation some_property holds in all subsamples, then we can generalize based on the template to $$\forall V_1, V_2: some\_property(V_1, V_2)$$. More than one relation may hold true in the subsamples, and so DistAI will produce multiple of these generalized relations. For example, some other relation $$some\_other\_property(V_1)$$ may have been true throughout all the subsamples. We will use all these relations to produce all different disjunctive invariant candidates, such as
(1): $$\forall V_1, V_2: some\_property(V_1, V_2)$$,
(2): $$\forall V_1, V_2: some\_property(V_1, V_2) \lor some\_other\_property(V_1)$$
and other permutations. Also, note that candidate (1) implies candidate (2), i.e., whenever (1) is true, (2) is also always true. In reality, the DistAI candidate enumeration process is a bit more complicated, as different templates may result in duplicate/overlapped invariant candidates.

At this point, DistAI will use IVy to check the candidate invariants. So since we have obtained the candidates from an execution trace, we already know that the candidates hold in at least some executions, and so we hope that we have had filtered most of the incorrect invariants. DistAI will start the check with the shortest candidate first. In the hypothetical example, candidate (1) will be checked first, since if it is inductive, then there is no need to check candidate (2) that is always true when (1) is satisfied. If the shorter invariant candidate fails IVy check, then DistAI will try to refine it by iterating to a longer version or a version with more variables. In this case, candidate (2) is a refined candidate that has one extra disjunctive term. If DistAI cannot find a successful candidate in the list created during the enumeration process, then it can start from the beginning by running new sample traces with more variables and increased the maximum length of the disjunctive invariant formula.

Phew, this is all very complicated. In short, the idea is to get some invariant candidates from the simulation trace and then check these candidates in IVy, starting with the shortest/simplest candidate and gradually working the way up to more complex invariants. Each of the more complicated candidates you check adds a bit of disjunctive stuff to exclude the issues the previous less-refined candidate stumbled upon.

As always, we had a video:

## Discussion.

Quite frankly, I no longer remember the full extent of the discussion we had. I remember it was quite a lot of talking.

1) Inductive invariants. We spent a bit of time on the difference between inductive invariants and just an invariant. See, we have TLA people in the group, but not that many experts in formal methods. As I mentioned in the beginning, TLA is happy with a short non-inductive safety invariant because it is a brute-force tool that can only check finite protocols.

2) Candidate refinement. It seems like most of the refinement is based on going through the list of candidates from the enumeration step and trying them from simple to more complicated once. Most of the refinement is done by adding more disjunctive predicates until the maximum number of disjunctive terms have been reached (at which point we can try with more terms?).

3) Limitations. The templates for sampling require universal quantifiers. The authors claim that they could not check Paxos protocol because it requires existential quantifiers. However, it was pointed out that another related work (SWISS) managed to find invariants for Paxos, so it seems their approach does not have this limitation.

Our reading groups takes place over Zoom every Wednesday at 2:00 pm EST. We have a slack group where we post papers, hold discussions and most importantly manage Zoom invites to the papers. Please join the slack group to get involved!

# Looking at State and Operational Consistency

Recently I rediscovered the “The many faces of consistency” paper by Marcos Aguilera and Doug Terry. When I first read the paper two years ago, I largely dismissed it as trivial, and, oh boy, now I realized how wrong I was at that time.  It is easy to read for sure, and may appear as some summary of various consistency models at first, but it is thought provoking and really makes you ask more questions and draw interesting parallels after giving it some quality time.

Murat gave a good summary of this paper recently in relation to his sabbatical. The questions he asks after the summary, however, provoke even more thoughts about consistency and how we classify, categorize and view it.

In a nutshell, the paper talks about consistency from different perspectives, namely state consistency as observed by a system itself and operational consistency that clients see. State consistency involves enforcing a system-state to hold some invariants. State consistency promotes invariant-based reasoning. The operational consistency is different, as it looks at the system from the client point of view. Outside clients do not directly observe the state of the system, instead they perform operations against it and can only see the results of these operations. So in short, state consistency is invariant-based and concerns with the internal state of the systems. Operational consistency deals with what clients observe from the outside of the system. These operational consistencies include various types sequential equivalence, such as linearizability and serializability, and other client-centric guarantees, like read-your-write or bounded-staleness.

For more details, read the original paper, Murat’s summary or one from the morning paper.

## “Strength” of state consistency.

What strikes me right away is how different the two types of consistency are described. Operational consistency gives us the framework for reasoning about systems without having too many details about the internals. We can gauge the relative “strength” of different consistency models and put them in perspective against each other. We know the serializability is weaker then session serializability. Or that linearizability is stronger than sequential equivalence.

But what about state consistency? There is no such reasoning framework. And in fact, it is not easy to even classify state consistency, yet along reason about the “strength” of different state consistency classes. The paper mentions a few state consistency models, such referential integrity for databases, or mutual consistency in primary-backup systems, or error bounds. But these are not generally applicable across the board. These examples of state consistency operate within the constraints of their specific domains or applications.

However, state consistency still comes at different “strength” levels. When reasoning about state consistency, we use invariant-based approach. The invariants on the state we need to enforce for an eventually consistent data-store and a strongly-consistent one are different. In the former case, we can be more relaxed, since we only need to make sure that at least one future state will have different nodes of the store to have the same data. The latter case is more complicated, as we need to hold stricter invariants (i.e. no two alive nodes have different committed value for the same slot in the log, and there can be only one active leader, and the leader must process the commands in the receive order, etc.).

It is the “tightness” of the invariants that makes state consistency strong or weak. But deciding which invariant is tighter or stricter is hard. We cannot simply compare various invariants on the merits of when they should hold, or how many parameters they cover or how many nodes they span, as all these (and other) metrics mean something only in the context of their systems/problems. Invariant that must hold at every state is not necessarily tighter than the eventual one, as it may simply be an invariant against some irrelevant or trivial parameter that holds all the time anyway and has no impact on the system.

And this is why we often translate these invariants and state consistency they represent into the operational consistency. The operational side of things allows us to observe the impact of the invariants on the system, albeit indirectly. Operational consistency levels the playing field and enables the comparison from the external point of view. It allows us to gauge how otherwise hard-to-compare invariants at the state-consistency level affect the outcome of operations.

## Smart systems or smart clients

Does this mean that a system providing stronger operational consistency has stronger state consistency? Well, it would have been too simple if that was the case. It often happens, and more so recently, that systems have “misalignments” between their internal state consistency and the operational one exposed on the client side.

A system that provides stronger operational semantics may do so because it has a strong state that makes it easy to expose the strong operational consistency. These smart systems preserve strong state-consistency at all costs. They may need to run complicated algorithms (i.e. Paxos) to achieve that, but doing so makes the clients lean and simple with minimal or no state at all.

On the other hand, simple systems may forgo the complicated protocols needed to enforce a strong state. Instead they aim to run as lean as possible at the core system level and shift as much burden to the smart clients. These clients need to have more complex state and protocols if they are to provide stronger operational guarantees. There are systems that do exactly that.

Both “smart system – lean client” and “lean system – smart client” approaches have their advantages and drawbacks. Designing and maintaining a smart system may actually be simpler: all the things engineers need are readily available at the system level. Invariant-based reasoning and tools like TLA easily apply in this setting. Debugging is simpler too, since lean and stateless clients are likely not the cause of a problem, and internal logging can help collect all the necessary information. On the other hand, having a lean system may improve the performance by reducing the bottlenecks, spreading load more evenly across the nodes and even sharing the load with smart clients. But it comes at some engineering costs. Protocols now involve more state and state is even more distributed: both at the lean system nodes and smart clients. Modeling this state with TLA is still possible, but it will likely take more time to check and require a more complicated models that include clients and client interactions with the system nodes. Debugging may be slowed down due to the lack of necessary data, since many issues (especially on production) may happen at smart clients outside of the engineers’ reach.