Welcome to the DistSys Reading Group! Every week we present and discuss one distributed systems paper. We try to focus on relatively new papers, although we occasionally break this rule for some important older publications. The main objective of this group is to share knowledge through discussion. Our participants come from academia and industry and often carry a unique perspective and expertise on the subject matter.
Format
We start each meeting with a short presentation of the paper by one of the group members. We record the presentation and later upload it to YouTube for the general audience. After the presentation, we move into a group discussion of the paper. This part is not recorded, so that we can speak freely about the topic and the paper. However, I write a moderated discussion summary for each meeting and post it here. All the summaries are available via the “Summary” link next to the paper title. To see the archive of past meetings, scroll down to the “Past Meetings” section below.
Meeting Info
- Meeting Time: Thursdays at 1:00 PM EST (10:00 AM PST)
- Duration: ~1 hour
- Slack Channel – Join our Slack for Zoom information
- YouTube Channel
- Google Calendar with our schedule.
Current Schedule (Papers ##161-170)
Below is the list of papers for the spring term of the distributed systems reading group.
- A Cloud-Scale Characterization of Remote Procedure Calls [SOSP’23]
- What: Study and characterization of 10,000 different RPC methods across many Google services.
- Authors: Korakit Seemakhupt, Brent E. Stephens, Samira Khan, Sihang Liu, Hassan Wassel, Soheil Hassas Yeganeh, Alex C. Snoeren, Arvind Krishnamurthy, David E. Culler, Henry M. Levy
- When: March 21st
- Chablis: Fast and General Transactions in Geo-Distributed Systems [CIDR’24]
- What: Scalable, geo-distributed, multi-versioned transactional key-value store that relies on regional locality.
- Authors: Tamer Eldeeb, Philip A. Bernstein, Asaf Cidon, Junfeng Yang
- When: March 28th
- Load is not what you should balance: Introducing Prequal [NSDI’24]
- What: Load balancer for multi-tenant systems that aims to minimize tail latency.
- Authors: Bartek Wydrowski, Robert Kleinberg, Stephen M. Rumble, Aaron Archer
- When: April 4th
- Timestamp as a Service, not an Oracle [VLDB]
- What: Logical timestamping service without a single point of failure, designed for distributed transactional systems.
- Authors: Yishuai Li, Yunfeng Zhu, Chao Shi, Guanhua Zhang, Jianzhong Wang, Xiaolu Zhang
- When: April 11th
- Database Kernels: Seamless Integration of Database Systems and Fast Storage via CXL [CIDR’24]
- What: SSD over CXL with database-specific hardware offloads.
- Authors: Sangjin Lee, Alberto Lerner, Philippe Bonnet, Philippe Cudré-Mauroux
- When: April 18th
- In-Memory Key-Value Store Live Migration with NetMigrate [FAST’24]
- What: Cross-partition data migration with zero service interruptions, using in-network hardware offload to redirect requests to the data’s new “home”.
- Authors: Zeying Zhu, Yibo Zhao, Zaoxing Liu
- When: April 25th
- What’s the Story in EBS Glory: Evolutions and Lessons in Building Cloud Block Store [FAST’24]
- What: Evolution and lessons from building Alibaba Cloud Elastic Block Storage (EBS).
- Authors: Weidong Zhang, Erci Xu, Qiuping Wang, Xiaolu Zhang, Yuesheng Gu, Zhenwei Lu, Tao Ouyang, Guanqun Dai, Wenwen Peng, Zhe Xu, Shuo Zhang, Dong Wu, Yilei Peng, Tianyun Wang, Haoran Zhang, Jiasheng Wang, Wenyuan Yan, Yuanyuan Dong, Wenhui Yao, Zhongjie Wu, Lingjun Zhu, Chao Shi, Yinhu Wang, Rong Liu, Junping Wu, Jiaji Zhu, Jiesheng Wu
- When: May 2nd
- Morty: Scaling Concurrency Control with Re-Execution [EuroSys’23]
- What: Better performance in serializable transactional systems through transaction re-execution.
- Authors: Matthew Burke, Florian Suri-Payer, Jeffrey Helt, Lorenzo Alvisi, Natacha Crooks
- When: May 9th
- GEMINI: Fast Failure Recovery in Distributed Training with In-Memory Checkpoints [SOSP’23]
- What: New checkpoint mechanism for ML training systems.
- Authors: Zhuang Wang, Zhen Jia, Shuai Zheng, Zhen Zhang, Xinwei Fu, T. S. Eugene Ng, Yida Wang
- When: May 16th
- Ditto: An Elastic and Adaptive Memory-Disaggregated Caching System [SOSP’23]
- What: In-memory cache on top of a disaggregated-memory system.
- Authors: Jiacheng Shen, Pengfei Zuo, Xuchuan Luo, Yuxin Su, Jiazhen Gu, Hao Feng, Yangfan Zhou, Michael R. Lyu
- When: May 23rd
Past Meetings
- Papers ##37-50
- Papers ##51-60
- Papers ##61-70
- Papers ##71-80
- Papers ##81-90
- Papers ##91-100
- Papers ##101-110
- Papers ##111-120
- Papers ##121-130
- Papers ##131-140
- Papers ##141-150
- Papers ##151-160
Past Special Sessions
- Building Distributed Systems With Stateright – March 30th @ 1 pm EST – Jon Nadal
- Distributed Transactions in YugabyteDB – May 11th @ 12 pm EST – Karthik Ranganathan
- Fast General Purpose Transactions in Apache Cassandra – February 9th @ 2 pm EST – Benedict Elliott Smith
- Scalability and Fault Tolerance in YDB – August 10th @ 2 pm EST – Andrey Fomichev