debugging
-
Reading Group. How to fight production incidents?: an empirical study on a large-scale cloud service
In the 125th reading group meeting, we looked at the reliability of cloud services. In particular, we read the “How to fight production incidents?: an empirical study on a large-scale cloud service” SoCC’22 paper by Supriyo Ghosh, Manish Shetty, Chetan Bansal, and Suman Nath. This paper looks at 152 severe production incidents in the Microsoft…
-
Reading Group. Distributed Snapshots: Determining Global States of Distributed Systems
On Wednesday, we kicked off a new set of papers in the reading group. We started with one of the classic foundational papers in distributed systems and looked at the Chandy-Lamport marker-based distributed snapshot algorithm. The basic idea here is to capture the state of distributed processes and channels by “flushing” the messages out…