With just four more papers to go in the DistSys Reading Group’s current batch, it is time to get the next set going. This round, we will have 10 papers that should last until the end of the spring semester. Our last batch was all about OSDI’20 papers, and this time around we will mix things up in terms of both venues and paper recency. We will also start the batch with one foundational paper taken from Murat’s recent list of must-read classical papers in distributed systems. Without further ado, here is the list:
- Distributed Snapshots: Determining Global States of a Distributed System – April 7th
- Facebook’s Tectonic Filesystem: Efficiency from Exascale – April 14th
- New Directions in Cloud Programming – April 21st
- Paxos vs Raft: Have we reached consensus on distributed consensus? – April 28th
- Protocol-Aware Recovery for Consensus-Based Storage – May 5th
- chainifyDB: How to get rid of your Blockchain and use your DBMS instead – May 12th
- XFT: practical fault tolerance beyond crashes – May 19th
- Cerebro: A Layered Data Platform for Scalable Deep Learning – May 26th
- Multitenancy for Fast and Programmable Networks in the Cloud – June 2nd
- Exploiting Symbolic Execution to Accelerate Deterministic Databases – June 9th
Our reading group takes place over Zoom every Wednesday at 2:00 pm EST. We have a Slack group where we post papers, hold discussions, and most importantly manage Zoom invites to paper discussions. Please join the Slack group to get involved!
This spring semester I am teaching an exciting seminar class: “Planetary-Scale Systems.” I will start the seminar with a four-lecture crash course to get my students on the same page, but the bulk of the class will be paper presentations and discussions. The format is similar to the Zoom reading group I am running.
The class meets twice a week, on Tuesdays and Thursdays. On day one, we will have a paper presentation, followed by a class discussion. On day two, we take the discussion up a notch and dive deeper into 2-3 select topics/questions from day one. The gap between the two days should also give students time to prepare for the in-depth discussion.
Speaking of preparing, all students should read the paper and ask questions before day one to help drive the discussion. The same applies to the in-depth discussion, as each student is expected to contribute to it.
Overall, we will cover 11 papers, roughly broken down into 3 groups. The papers are as follows:
Planetary Scale Storage
Planetary-Scale Analytics & ML
This week we are on the 35th paper in our reading group. We will be discussing “Autoscaling Tiered Cloud Storage in Anna” from VLDB 2019. It is exciting that the reading group has managed to go through this many papers since we started in April!
Today Murat and I sat down and picked the next few papers from OSDI 2020 to discuss. Originally we planned to select just 10 OSDI papers for the next batch, but all the papers are so interesting that it was hard to narrow the list down. After some debating, we settled on 14 papers instead. Below is the list in the order the papers will appear in the reading group:
- Fault-tolerant and transactional stateful serverless workflows – December 23rd
- Aragog: Scalable Runtime Verification of Shardable Networked Systems – December 30th
- Toward a Generic Fault Tolerance Technique for Partial Network Partitioning – January 6th
- hXDP: Efficient Software Packet Processing on FPGA NICs – January 13th
- Virtual Consensus in Delos – January 20th
- A large scale analysis of hundreds of in-memory cache clusters at Twitter – January 27th
- Cobra: Making Transactional Key-Value Stores Verifiably Serializable – February 3rd
- Microsecond Consensus for Microsecond Applications – February 10th
- Performance-Optimal Read-Only Transactions – February 17th
- Pegasus: Tolerating Skewed Workloads in Distributed Storage with In-Network Coherence Directories – February 24th
- FlightTracker: Consistency across Read-Optimized Online Stores at Facebook – March 3rd
- Heterogeneity-Aware Cluster Scheduling Policies for Deep Learning Workloads – March 10th
- Sundial: Fault-tolerant Clock Synchronization for Datacenters – March 17th
- Protean: VM Allocation Service at Scale – March 24th
This list will start on December 23rd and will run us all the way to paper #50! Please join the reading group on Slack for the schedule, discussion, and Zoom links. The reading group meets on Zoom every Wednesday at 3:30 pm EST.