Welcome to the DistSys Reading Group! Every week we present and discuss one distributed systems paper. We try to focus on relatively new papers, although we occasionally break this rule for important older publications. The main objective of this group is to share knowledge through discussion. Our participants come from academia and industry and often bring a unique perspective and expertise on the subject matter.
We start each meeting with a short presentation of the paper by one of the group members. We record the presentation and later upload it to YouTube for the general audience. After the presentation, we move into a group discussion of the paper. This part is off the record so that we can speak freely about the topic and the paper. However, I write a moderated discussion summary for each meeting and post it here. All the summaries are available via the “Summary” link next to the paper title. For the archive of the summaries, navigate to the “Past Meetings” section below.
Current Schedule (Papers #71-80)
Past Special Sessions
- Building Distributed Systems With Stateright – March 30th @ 1pm EST – Jon Nadal.
- Distributed Transactions in YugabyteDB – May 11th @ 12pm EST – Karthik Ranganathan.
In the 75th reading group session, we discussed transaction locality and dynamic data partitioning through the eyes of a recent OSDI’21 paper – “Don’t Look Back, Look into the Future: Prescient Data Partitioning and Migration for Deterministic Database Systems.” This interesting paper solves the transaction locality problem in distributed, sharded deterministic databases. The deterministic […]
Our 74th paper was a foundational one: we looked at the Viewstamped Replication protocol through the lens of the “Viewstamped Replication Revisited” paper. Joran Dirk Greef presented the protocol along with bits of his engineering experience using the protocol in practice. Viewstamped Replication (VR) solves the problem of state machine replication in a crash fault […]
Our 72nd paper was on avoiding coordination as much as possible. We looked at the “Meerkat: Multicore-Scalable Replicated Transactions Following the Zero-Coordination Principle” EuroSys’20 paper by Adriana Szekeres, Michael Whittaker, Jialin Li, Naveen Kr. Sharma, Arvind Krishnamurthy, Dan R. K. Ports, and Irene Zhang. As the name suggests, this paper discusses coordination-free distributed transaction execution. In […]
In the 71st DistSys reading group meeting, we discussed the “DistAI: Data-Driven Automated Invariant Learning for Distributed Protocols” OSDI’21 paper. Despite the misleading title, this paper has nothing to do with AI or Machine Learning. Instead, it focuses on the automated search for invariants in distributed algorithms. I will be brief and a bit hand-wavy […]
Our 70th meeting covered the “In Reference to RPC: It’s Time to Add Distributed Memory” paper by Stephanie Wang, Benjamin Hindman, and Ion Stoica. This paper proposes some improvements to remote procedure call (RPC) frameworks. In current RPC implementations, the frameworks pass parameters to functions by value. The same happens to the function return values. […]
In the last reading group meeting, we discussed MongoDB’s replication protocol, as described in the “Fault-Tolerant Replication with Pull-Based Consensus in MongoDB” NSDI’21 paper. Our reading group has a few regular members from MongoDB, and this time around, Siyuan Zhou, one of the paper authors, attended the discussion, so we had a perfect opportunity to […]
In the 68th reading group session, we discussed scheduling in dataflow-like systems with Cameo. The paper, titled “Move Fast and Meet Deadlines: Fine-grained Real-time Stream Processing with Cameo,” appeared at NSDI’21. This paper discusses some scheduling issues in data processing pipelines. When a system answers a query, it breaks the query into several steps or […]
We will start the fall semester with a new set of reading group papers. As before, we have ten papers in total. Nine of them are new papers from top venues, and one is a foundational paper on Viewstamped Replication. Instead of the original VR paper, we will look at its more modern counterpart/rewrite/update — […]
I am very behind on the reading group summaries, so this summary will be short and less detailed. In the 67th reading group meeting, we discussed the “When Cloud Storage Meets RDMA” paper from Alibaba. This paper is largely an experience report on using RDMA in practical storage systems. Large-scale RDMA deployments are rather difficult […]
Our 66th paper was a recent HotOS piece about faulty CPUs: “Cores that don’t count.” This paper from Google describes a fairly common (at Google datacenter scale) issue with CPUs that may miscompute or silently fail under some conditions. This is a big deal, as we expect CPUs to be deterministic and always provide correct […]