• Metastable Failures in the Wild

    Metastable failures in distributed systems are failures that “feed” and strengthen their own “failed” condition. The main characteristic of a metastable failure is a positive feedback loop that keeps the system in a degraded/failed state. These failures are hard to spot, as they always start with some other distraction — some trigger event that nudges…

  • Reading Group. Darwin: Scale-In Stream Processing

    In the 99th reading group meeting, we discussed stream processing. The paper we read, “Darwin: Scale-In Stream Processing” by Lawrence Benson and Tilmann Rabl, argues that many stream processing systems are relatively inefficient in utilizing the hardware. These inefficiencies stem from sources ranging from the need to ingest large volumes of data to the requirement of durably storing…

  • Reading Group. Achieving High Throughput and Elasticity in a Larger-than-Memory Store

    The paper “Achieving High Throughput and Elasticity in a Larger-than-Memory Store” by Chinmay Kulkarni, Badrish Chandramouli, and Ryan Stutsman discusses elastic, scalable distributed storage. The paper proposes Shadowfax, an extension to the FASTER single-node KV-store. The particular use case targeted by Shadowfax is the ingestion of large volumes of (streaming) data. The system does not appear…

  • Reading Group. Shard Manager: A Generic Shard Management Framework for Geo-distributed Applications

    The 97th paper in the reading group was “Shard Manager: A Generic Shard Management Framework for Geo-distributed Applications.” This paper from Facebook talks about a sharding framework used in many of Facebook’s internal systems and applications. Sharding is a standard way to provide horizontal scalability — systems can break down their data into (semi-) independent…

  • Reading Group. Solar Superstorms: Planning for an Internet Apocalypse

    Our 96th reading group paper was very different from the topics we usually discuss. We talked about the “Solar Superstorms: Planning for an Internet Apocalypse” SIGCOMM’21 paper by Sangeetha Abdu Jyothi. Now (May 2022), we are slowly approaching the peak of solar cycle 25 (still due in a few years?) as the number of observable…

  • New Reading List: Papers #101-110

    Summer term papers are here! The list is below. Also, here is a Google Calendar. Graham: Synchronizing Clocks by Leveraging Local Clock Properties – NSDI’22. Authors: Ali Najafi, Michael Wei. What: Better clock sync under failures. When: May 18th, 2022. Understanding, Detecting and Localizing Partial Failures in Large System Software – NSDI’20. Authors: Chang Lou,…

  • Reading Group. ByShard: Sharding in a Byzantine Environment

    Our 93rd paper in the reading group was “ByShard: Sharding in a Byzantine Environment” by Jelle Hellings and Mohammad Sadoghi. This VLDB’21 paper talks about sharded Byzantine systems and proposes an approach that can implement 18 different multi-shard transaction algorithms. More specifically, the paper discusses two-phase commit (2PC) and two-phase locking (2PL) in a Byzantine environment.…

  • Reading Group. CompuCache: Remote Computable Caching using Spot VMs

    In the 92nd reading group meeting, we covered the CIDR’22 paper “CompuCache: Remote Computable Caching using Spot VMs” by Qizhen Zhang, Philip A. Bernstein, Daniel S. Berger, Badrish Chandramouli, Vincent Liu, and Boon Thau Loo. Cloud efficiency seems to be a popular topic recently. A handful of solutions try to improve the efficiency of the…

  • Reading Group. Using Lightweight Formal Methods to Validate a Key-Value Storage Node in Amazon S3

    For the 90th reading group paper, we did “Using Lightweight Formal Methods to Validate a Key-Value Storage Node in Amazon S3” by James Bornholt, Rajeev Joshi, Vytautas Astrauskas, Brendan Cully, Bernhard Kragl, Seth Markle, Kyle Sauri, Drew Schleit, Grant Slatton, Serdar Tasiran, Jacob Van Geffen, Andrew Warfield. As usual, we have a video: Andrey Satarin…

  • Reading Group. Basil: Breaking up BFT with ACID (transactions)

    Our 89th paper in the reading group was “Basil: Breaking up BFT with ACID (transactions)” from SOSP’21 by Florian Suri-Payer, Matthew Burke, Zheng Wang, Yunhao Zhang, Lorenzo Alvisi, and Natacha Crooks. I will make this summary short. We had a quick and improvised presentation as well. Unfortunately, this time around, it was not recorded. The…

Aleksey Charapko
I am an assistant professor of computer science at the University of New Hampshire. My research interests lie in distributed systems, distributed consensus, fault tolerance, reliability, and scalability.
X (Twitter): @AlekseyCharapko
Email: aleksey.charapko@unh.edu
