Reading Group

Reading Group. Protocol-Aware Recovery for Consensus-Based Storage

Reading Group

Aleksey Charapko

·

May 9, 2021

Our last reading group meeting was about storage faults in state machine replication. We looked at the “Protocol-Aware Recovery for Consensus-Based Storage” paper from FAST’18. The paper explores an interesting omission in most state machine replication (SMR) protocols. These protocols, such as (multi)-Paxos and Raft, are specified with the assumption of having a…
Reading Group Special Session: Distributed Transactions in YugabyteDB

RG Special Session

Aleksey Charapko

·

May 3, 2021

When: May 11th at 12:00 pm EST. Who: Karthik Ranganathan. Karthik Ranganathan is a founder and CTO of YugabyteDB, a globally distributed, strongly consistent database. Prior to Yugabyte, Karthik was at Facebook, where he built the Cassandra database. In this talk, Karthik will discuss Yugabyte’s use of time synchronization and the Raft protocol along with some…
Reading Group. Paxos vs Raft: Have we reached consensus on distributed consensus?

Reading Group

Aleksey Charapko

·

May 1, 2021

In our 54th reading group meeting, we were looking for an answer to an important question in the distributed systems community: “What about Raft?” We looked at the “Paxos vs Raft: Have we reached consensus on distributed consensus?” paper to try to find the answer. As always, we had an excellent presentation, this time by…
Reading Group. New Directions in Cloud Programming

Reading Group

Aleksey Charapko

·

Apr 25, 2021

Recently we discussed a CIDR’21 paper: “New Directions in Cloud Programming.” Murat Demirbas did the presentation. Quite honestly, I don’t like to write summaries for this kind of paper. Here, the authors propose a vision for the future of cloud applications, and I feel that summarizing a vision often results in the misinterpretation of…
Reading Group. Facebook’s Tectonic Filesystem: Efficiency from Exascale

Reading Group

Aleksey Charapko

·

Apr 18, 2021

This time around, our reading group discussed a distributed filesystem paper. We looked at a FAST’21 paper from Facebook: “Facebook’s Tectonic Filesystem: Efficiency from Exascale.” We had a nice presentation by Akash Mishra. The paper talks about a unified filesystem across many services and use cases at Facebook. Historically, Facebook had multiple specialized storage infrastructures: one…
Reading Group. Distributed Snapshots: Determining Global States of Distributed Systems

Reading Group

Aleksey Charapko

·

Apr 10, 2021

On Wednesday we kicked off a new set of papers in the reading group. We have started with one of the classical foundational papers in distributed systems and looked at the Chandy-Lamport token-based distributed snapshot algorithm. The basic idea here is to capture the state of distributed processes and channels by “flushing” the messages out…
Reading Group. Aragog: Scalable Runtime Verification of Shardable Networked Systems

Reading Group

Aleksey Charapko

·

Apr 2, 2021

We have covered 50 papers in the reading group so far! This week we looked at “Aragog: Scalable Runtime Verification of Shardable Networked Systems” from OSDI’20. This paper discusses the problem of verifying network functions (NFs), such as NAT gateways or firewalls, at runtime. The problem is quite challenging due to its…
Reading Group. Protean: VM Allocation Service at Scale

Reading Group

Aleksey Charapko

·

Mar 26, 2021

The last paper in our reading group was “Protean: VM Allocation Service at Scale.” This paper from Microsoft is full of technical insights into how they operate their datacenters/regions at scale. In particular, the paper discusses one of the fundamental components of any cloud provider — the VM service. The system, called Protean, is an…
Reading Group. Sundial: Fault-tolerant Clock Synchronization for Datacenters

Reading Group

Aleksey Charapko

·

Mar 19, 2021

In our 48th reading group meeting, we talked about time synchronization in distributed systems. More specifically, we discussed the poor state of time sync, the reasons for it, and most importantly, the solutions, as outlined in the “Sundial: Fault-tolerant Clock Synchronization for Datacenters” OSDI’20 paper. We had a comprehensive presentation by Murat Demirbas. Murat’s talk…
Reading Group Special Session: Building Distributed Systems With Stateright

RG Special Session

Aleksey Charapko

·

Mar 17, 2021

This talk is part of the Distributed Systems Reading Group. Stateright is a software framework for analyzing and systematically verifying distributed systems. Its name refers to its goal of verifying that a system’s collective state always satisfies a correctness specification, such as “operations invoked against the system are always linearizable.” Cloud service providers like AWS…