paxos

Paper #191: Occam’s Razor for Distributed Protocols

Reading Group

Aleksey Charapko

·

Feb 20, 2025

We have been doing a Zoom distributed systems paper reading group for 5 years and have covered around 190 papers. This semester, we should reach the milestone of 200 papers. Over the years, my commitment to the group has varied — at some point, I was writing paper reviews, and more recently, I’ve had less…
Read More
Pile of Eternal Rejections: The Cost of Garbage Collection for State Machine Replication

Pile of Eternal Rejections

Aleksey Charapko

·

May 28, 2024

I have a “pile” of papers that continuously get rejected from any conference. All these papers, according to the reviews, “lack novelty,” and therefore are deemed “not interesting” by the reviewing experts. There are some things in common in these papers — they are either observational or rely on old and proven techniques to solve a problem or improve a system/algorithm. Jokingly, I call this set of papers the “pile of…
Read More
Reading Group. Amazon DynamoDB: A Scalable, Predictably Performant, and Fully Managed NoSQL Database Service

Reading Group

Aleksey Charapko

·

Jan 21, 2023

In the 120th DistSys meeting, we talked about “Amazon DynamoDB: A Scalable, Predictably Performant, and Fully Managed NoSQL Database Service” ATC’22 paper by Mostafa Elhemali, Niall Gallagher, Nicholas Gordon, Joseph Idziorek, Richard Krog, Colin Lazier, Erben Mo, Akhilesh Mritunjai, Somu Perianayagam, Tim Rath, Swami Sivasubramanian, James Christopher Sorenson III, Sroaj Sosothikul, Doug Terry, Akshat Vig.…
Read More
Reading Group. Rabia: Simplifying State-Machine Replication Through Randomization

Reading Group

Aleksey Charapko

·

Jan 30, 2022

We covered yet another state machine replication (SMR) paper in our reading group: “Rabia: Simplifying State-Machine Replication Through Randomization” by Haochen Pan, Jesse Tuglu, Neo Zhou, Tianshu Wang, Yicheng Shen, Xiong Zheng, Joseph Tassarotti, Lewis Tseng, Roberto Palmieri. This paper appeared at SOSP’21. A traditional SMR approach, based on Raft or Multi-Paxos protocols, involves a…
Read More
Reading Group. Exploiting Nil-Externality for Fast Replicated Storage

Reading Group

Aleksey Charapko

·

Jan 21, 2022

85th DistSys reading group meeting discussed “Exploiting Nil-Externality for Fast Replicated Storage” SOSP’21 paper by Aishwarya Ganesan, Ramnatthan Alagappan, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. The paper uses an old trick of delaying the execution of some operations to improve the throughput while maintaining strong consistency. Consistency is an externally-observable property, and simple strategies,…
Read More
Scalable but Wasteful or Why Fast Replication Protocols are Actually Slow

Other Thoughts

Aleksey Charapko

·

Jul 12, 2021

In the last decade or so, quite a few new state machine replication protocols emerged in the literature and the internet. I am “guilty” of this myself, with the PigPaxos appearing in this year’s SIGMOD and the PQR paper at HotStorage’19. There are better-known examples as well — EPaxos inspired a lot of development in…
Read More
Reading Group. XFT: Practical Fault Tolerance beyond Crashes

Reading Group

Aleksey Charapko

·

May 25, 2021

In the 57th reading group meeting, we continued looking at byzantine fault tolerance. In particular, we looked at “XFT: Practical Fault Tolerance beyond Crashes” OSDI’16 paper. Today’s summary & discussion will be short, as I am doing it way past my regular time. The paper talks about a fault tolerance model that is stronger than…
Read More
Reading Group. Protocol-Aware Recovery for Consensus-Based Storage

Reading Group

Aleksey Charapko

·

May 9, 2021

Our last reading group meeting was about storage faults in state machine replications. We looked at the “Protocol-Aware Recovery for Consensus-Based Storage” paper from FAST’18. The paper explores an interesting omission in most of the state machine replication (SMR) protocols. These protocols, such as (multi)-Paxos and Raft, are specified with the assumption of having a…
Read More
Reading Group. Paxos vs Raft: Have we reached consensus on distributed consensus?

Reading Group

Aleksey Charapko

·

May 1, 2021

In our 54th reading group meeting, we were looking for an answer to an important question in the distributed systems community: “What about Raft?” We looked at the “Paxos vs Raft: Have we reached consensus on distributed consensus?” paper to try to find the answer. As always, we had an excellent presentation, this time by…
Read More
Reading Group. Microsecond Consensus for Microsecond Applications

Reading Group

Aleksey Charapko

·

Feb 12, 2021

Our 43rd reading group paper was about an extremely low-latency consensus using RDMA: “Microsecond Consensus for Microsecond Applications.” The motivation is pretty compelling — if you have a fast application, then you need fast replication to make your app reliable without holding it back. How fast are we talking here? Authors go for ~1 microsecond…
Read More

Aleksey Charapko

paxos

Paper #191: Occam’s Razor for Distributed Protocols

Pile of Eternal Rejections: The Cost of Garbage Collection for State Machine Replication

Reading Group. Amazon DynamoDB: A Scalable, Predictably Performant, and Fully Managed NoSQL Database Service

Reading Group. Rabia: Simplifying State-Machine Replication Through Randomization

Reading Group. Exploiting Nil-Externality for Fast Replicated Storage

Scalable but Wasteful or Why Fast Replication Protocols are Actually Slow

Reading Group. XFT: Practical Fault Tolerance beyond Crashes

Reading Group. Protocol-Aware Recovery for Consensus-Based Storage

Reading Group. Paxos vs Raft: Have we reached consensus on distributed consensus?

Reading Group. Microsecond Consensus for Microsecond Applications

Search

Recent Posts

Categories