• Fall Term Reading Group Papers: #111-120

    Below is a list of papers for the fall term of the distributed systems reading group. The list is also on the reading group’s Google Calendar. Metastable Failures in the Wild [OSDI’22] Authors: Lexiang Huang, Matthew Magnusson, Abishek Bangalore Muralikrishna, Salman Estyak, Rebecca Isaacs, Abutalib Aghayev, Timothy Zhu, Aleksey Charapko What: An exploration of many…

  • Reading Group Special Session: Scalability and Fault Tolerance in YDB

    YDB is an open-source Distributed SQL Database. YDB is used as an OLTP Database for mission-critical user-facing applications. It provides strong consistency and serializable transaction isolation for the end user. One of the main characteristics of YDB is scalability to very large clusters together with multitenancy, i.e. ability to provide an isolated user environment for…

  • Metastable Failures in the Wild

    Metastable failures in distributed systems are failures that “feed” and strengthen their own “failed” condition. The main characteristic of a metastable failure is a positive feedback loop that keeps the system in a degraded/failed state. These failures are hard to spot, as they always start with some other distraction — some trigger event that nudges…

  • Reading Group. Darwin: Scale-In Stream Processing

    In the 99th reading group meeting, we discussed stream processing. The paper we read, “Darwin: Scale-In Stream Processing” by Lawrence Benson and Tilmann Rabl, argues that many stream processing systems are relatively inefficient in utilizing the hardware. These inefficiencies stem from the need to ingest large volumes of data to the requirement of durably storing…

  • Reading Group. Achieving High Throughput and Elasticity in a Larger-than-Memory Store

    The paper “Achieving High Throughput and Elasticity in a Larger-than-Memory Store” by Chinmay Kulkarni, Badrish Chandramouli, and Ryan Stutsman discusses elastic, scalable distributed storage. The paper proposes Shadowfax, an extension to the FASTER single-node KV-store. The particular use case targeted by Shadowfax is the ingestion of large volumes of (streaming) data. The system does not appear…

  • Reading Group. Shard Manager: A Generic Shard Management Framework for Geo-distributed Applications

    The 97th paper in the reading group was “Shard Manager: A Generic Shard Management Framework for Geo-distributed Applications.” This paper from Facebook talks about a sharding framework used in many of Facebook’s internal systems and applications. Sharding is a standard way to provide horizontal scalability — systems can break down their data into (semi-) independent…

  • Reading Group. Solar Superstorms: Planning for an Internet Apocalypse

    Our 96th reading group paper was very different from the topics we usually discuss. We talked about the “Solar Superstorms: Planning for an Internet Apocalypse” SIGCOMM’21 paper by Sangeetha Abdu Jyothi. Now (May 2022), we are slowly approaching the peak of solar cycle 25 (still due in a few years?) as the number of observable…

  • New Reading List: Papers #101-110

    Summer term papers are here! The list is below. Also, here is a Google Calendar. Graham: Synchronizing Clocks by Leveraging Local Clock Properties – NSDI’22 Authors: Ali Najafi, Michael Wei What: Better clock sync under failures When: May 18th, 2022 Understanding, Detecting and Localizing Partial Failures in Large System Software – NSDI’20 Authors: Chang Lou,…

  • Reading Group. ByShard: Sharding in a Byzantine Environment

    Our 93rd paper in the reading group was “ByShard: Sharding in a Byzantine Environment” by Jelle Hellings and Mohammad Sadoghi. This VLDB’21 paper talks about sharded Byzantine systems and proposes an approach that can implement 18 different multi-shard transaction algorithms. More specifically, the paper discusses two-phase commit (2PC) and two-phase locking (2PL) in a Byzantine environment.…

  • Reading Group. CompuCache: Remote Computable Caching using Spot VMs

    In the 92nd reading group meeting, we covered the “CompuCache: Remote Computable Caching using Spot VMs” CIDR’22 paper by Qizhen Zhang, Philip A. Bernstein, Daniel S. Berger, Badrish Chandramouli, Vincent Liu, and Boon Thau Loo. Cloud efficiency seems to be a popular topic recently. A handful of solutions try to improve the efficiency of the…


Aleksey Charapko
I am an assistant professor of computer science at the University of New Hampshire. My research interests lie in distributed systems, distributed consensus, fault tolerance, reliability, and scalability.
X (Twitter): @AlekseyCharapko
Email: aleksey.charapko@unh.edu
