Reading Group

Welcome to the DistSys Reading Group! Every week we present and discuss one distributed systems paper. We try to focus on relatively new papers, although we occasionally break this rule for important older publications. The main objective of this group is to share knowledge through discussion. Our participants come from academia and industry and often bring unique perspectives and expertise on the subject matter.


We start each meeting with a short presentation of the paper by one of the group members. We record the presentation and later upload it to YouTube for the general audience. After the presentation, we move into a group discussion of the paper. This part is not recorded, so that we can speak freely about the topic and the paper. However, I write a moderated discussion summary for each meeting and post it here. All the summaries are available via the “Summary” link next to the paper title. For the archive of the summaries, navigate to the “Past Meetings” section below.

Meeting Info

Current Schedule (Papers #51-60)

  1. Distributed Snapshots: Determining Global States of a Distributed System – April 7th – Classical/Foundation Paper [YouTube][Summary]
  2. Facebook’s Tectonic Filesystem: Efficiency from Exascale – April 14th [YouTube][Summary]
  3. New Directions in Cloud Programming – April 21st [YouTube][Summary]
  4. Paxos vs Raft: Have we reached consensus on distributed consensus? – April 28th
  5. Protocol-Aware Recovery for Consensus-Based Storage – May 5th
  6. chainifyDB: How to get rid of your Blockchain and use your DBMS instead – May 12th
  7. XFT: practical fault tolerance beyond crashes – May 19th
  8. Cerebro: A Layered Data Platform for Scalable Deep Learning – May 26th
  9. Multitenancy for Fast and Programmable Networks in the Cloud – June 2nd
  10. Exploiting Symbolic Execution to Accelerate Deterministic Databases – June 9th

Past Meetings

Past Special Sessions

  1. Building Distributed Systems With Stateright – March 30th @ 1pm EST – Jon Nadal. [YouTube]
  2. Distributed Transactions in YugabyteDB – May 11th @ 12pm EST – Karthik Ranganathan. [YouTube]

Recent Reading Group Meetings

Reading Group. chainifyDB: How to get rid of your Blockchain and use your DBMS instead

Our recent meeting focused on blockchains, as we discussed the “chainifyDB: How to get rid of your Blockchain and use your DBMS instead” CIDR’21 paper, presented by Karolis Petrauskas. The paper argues for using existing and proven technology to implement a permissioned blockchain-like system. The core idea is to leverage relational SQL-99 […]

Reading Group. Protocol-Aware Recovery for Consensus-Based Storage

Our last reading group meeting was about storage faults in state machine replication. We looked at the “Protocol-Aware Recovery for Consensus-Based Storage” paper from FAST’18. The paper explores an interesting omission in most state machine replication (SMR) protocols. These protocols, such as (multi)-Paxos and Raft, are specified with the assumption of having a […]

Reading Group Special Session: Distributed Transactions in YugabyteDB

When: May 11th at 12:00 pm EST
Who: Karthik Ranganathan.

Karthik Ranganathan is a founder and CTO of YugabyteDB, a globally distributed, strongly consistent database. Prior to Yugabyte, Karthik was at Facebook, where he built the Cassandra database. In this talk, Karthik will discuss Yugabyte’s use of time synchronization and the Raft protocol along with some […]

Reading Group. Facebook’s Tectonic Filesystem: Efficiency from Exascale

This time around, our reading group discussed a distributed filesystem paper. We looked at a FAST’21 paper from Facebook: “Facebook’s Tectonic Filesystem: Efficiency from Exascale.” We had a nice presentation by Akash Mishra. The paper talks about a unified filesystem across many services and use cases at Facebook. Historically, Facebook had multiple specialized storage infrastructures: one […]

Reading Group. Protean: VM Allocation Service at Scale

The last paper in our reading group was “Protean: VM Allocation Service at Scale.” This paper from Microsoft is full of technical insights into how they operate their datacenters/regions at scale. In particular, the paper discusses one of the fundamental components of any cloud provider — the VM service. The system, called Protean, is an […]

Reading Group. Sundial: Fault-tolerant Clock Synchronization for Datacenters

In our 48th reading group meeting, we talked about time synchronization in distributed systems. More specifically, we discussed the poor state of time sync, the reasons for it, and most importantly, the solutions, as outlined in the “Sundial: Fault-tolerant Clock Synchronization for Datacenters” OSDI’20 paper. We had a comprehensive presentation by Murat Demirbas. Murat’s talk […]