Fall 2024 Reading Group Papers (Papers #181-190)

Without further ado, here is the list:

Starburst: A Cost-aware Scheduler for Hybrid Cloud [ATC’24]
- Authors: Michael Luo, Siyuan Zhuang, Suryaprakash Vengadesan, Romil Bhardwaj, Justin Chang, Eric Friedman, Scott Shenker, Ion Stoica
- What: A batched workload scheduler that spans private and public cloud and reduces public cloud cost while ensuring timely job completion.
- When: October 24th
If At First You Don’t Succeed, Try, Try, Again…? [SOSP’24]
- Authors: Bogdan A. Stoica, Utsav Sethi, Yiming Su, Cyrus Zhou, Shan Lu, Jonathan Mace, Madanlal Musuvathi, Suman Nath
- What: Retries, retry bugs, and a bit of LLM use to analyze them.
- When: October 31st
ServiceLab: Preventing Tiny Performance Regressions at Hyperscale through Pre-Production Testing [OSDI’24]
- Authors: Mike Chow, Yang Wang, William Wang, Ayichew Hailu, Rohan Bopardikar, Bin Zhang, Jialiang Qu, David Meisner, Santosh Sonawane, Yunqi Zhang, Rodrigo Paim, Mack Ward, Ivor Huang, Matt McNally, Daniel Hodges, Zoltan Farkas, Caner Gocmen, Elvis Huang, Chunqiang Tang
- What: Performance testing platform for detecting performance regressions in large systems deployed in noisy (e.g., cloud) environments.
- When: November 7th
Massively Parallel Multi-Versioned Transaction Processing [OSDI’24]
- Authors: Shujian Qian, Ashvin Goel
- What: Multi-versioned OLTP store with GPU acceleration for massively parallel execution of transactions.
- When: November 14th
Resource Management in Aurora Serverless [VLDB]
- Authors: Bradley Barnhart, Marc Brooker, Daniil Chinenkov, Tony Hooper, Jihoun Im, Prakash Chandra Jha, Tim Kraska, Ashok Kurakula, Alexey Kuznetsov, Grant McAlister, Arjun Muthukrishnan, Aravinthan Narayanan, Douglas Terry, Bhuvan Urgaonkar, Jiaming Yan
- What: Serverless add-on for AWS Aurora that abstracts the resource/capacity usage into Aurora Capacity Units and allows a pay-for-usage model via automatic scaling up/down based on demand.
- When: November 21st
Beaver: Practical Partial Snapshots for Distributed Cloud Services [OSDI’24]
- Authors: Liangcheng Yu, Xiao Zhang, Haoran Zhang, John Sonchack, Dan Ports, Vincent Liu
- What: “Practical partial snapshot protocol that ensures causal consistency.”
- When: December 5th
SwiftPaxos: Fast Geo-Replicated State Machines [NSDI’24]
- Authors: Fedor Ryabinin, Alexey Gotsman, Pierre Sutra
- What: Partly multi-writer, geo-distributed Paxos SMR with lower latency than MultiPaxos in the optimal case.
- When: December 12th
Anvil: Verifying Liveness of Cluster Management Controllers [OSDI’24]
- Authors: Xudong Sun, Wenjie Ma, Jiawei Tyler Gu, Zicheng Ma, Tej Chajed, Jon Howell, Andrea Lattuada, Oded Padon, Lalith Suresh, Adriana Szekeres, Tianyin Xu
- What: Formal verification of cloud management controllers.
- When: December 19th
GaussDB: A Cloud-Native Multi-Primary Database with Compute-Memory-Storage Disaggregation [VLDB]
- Authors: Guoliang Li, Wengang Tian, Jinyu Zhang, Ronen Grosman, Zongchao Liu, Sihao Li
- What: A cloud-native, multi-writer database service with 3-way resource disaggregation: compute for TX processing, disaggregated memory for buffers and locks, and disaggregated storage for persistence/durability.
- When: January 9th
SWARM: Replicating Shared Disaggregated-Memory Data in No Time [SOSP’24]
- Authors: Antoine Murat, Clément Burgelin, Athanasios Xygkis, Igor Zablotchi, Marcos K. Aguilera, Rachid Guerraoui
- What: Replication for data held in disaggregated memory: how to ensure objects in shared disaggregated memory survive failures inside the shared memory subsystem.
- When: January 16th