• Modeling Paxos Performance – Part 2

    ·

    Placeholder Icon

    In the previous posts I started to explore node-scalability of paxos-style protocols. In this post I will look at processing overheads that I estimate with the help of a queue or a processing pipeline. I show how these overheads cap the performance and affect the latency at different cluster loads. I look at the scalability for…

    Read More

  • Paxos Performance Modeling – Part 1.5

    ·

    Placeholder Icon

    This post is a quick update/conclusion to the part 1. So, does the network variations make any impact at all? In the earlier simulation I showed some small performance degradation going from 3 to 5 nodes. The reality is that for paxos, network behavior makes very little difference on scalability, and in some cases no difference at…

    Read More

  • Do not Blame (only) Network for Your Paxos Scalability Issues. (PPM Part 1)

    ·

    Placeholder Icon

    In the past few months our lab has been doing a lot of work with different flavors of paxos consensus algorithm. Paxos and its numerous flavors are widely used in today’s cloud infrastructure. Distributed systems rely on it for many different tasks to ensure safe operation. For instance, coordination services use some consensus protocol flavor…

    Read More

  • Trace Synchronization with HLC

    ·

    Placeholder Icon

    Event logging or tracing is one of the most common techniques for collecting data about the software execution. For simple application running on the same machine, a trace of events timestamped with the machine’s hardware clock is typically sufficient. When the system grows and becomes distributed over multiple nodes, each node is going to produce…

    Read More

  • One Page Summary: Flease – Lease Coordination without a Lock Server

    ·

    Placeholder Icon

    This paper talks about a decentralized lease management solution. In the past, many lock/lease services have been centralized, placing a single authority to manage all locks in the system. Google’s Chubby, Apache ZooKeeper, etcd, and others rely on a centralized approach and backed by some flavor of a consensus algorithm for fault-tolerance. According to Flease authors,…

    Read More

  • One Page Summary: “milliScope: a Fine-Grained Monitoring Framework for Performance Debugging of n-Tier Web Services”

    ·

    Placeholder Icon

    Authors of the ICDCS2017 milliScope paper attack an interesting monitoring problem for distributed systems: detecting and determining a cause of short-lived events in the system. In particular, they address the issue of identifying very short bottlenecks (VSBs) in distributed web services. VSBs manifest themselves as performance degradation of a small number of requests, however they…

    Read More

  • Sonification of Distributed Systems with RQL

    ·

    Placeholder Icon

    In the past, I have discussed sonification as a mean of representing monitoring data. Aside from some silly and toy examples, sonifications can be used for serious applications. In many monitoring cases, the presence of some phenomena is more important than the details about it. In such situations, simple sonification is a perfect way to…

    Read More

  • Retroscoping Zookeeper Staleness

    ·

    Placeholder Icon

    ZooKeeper is a popular coordination service used as part of many large scale distributed systems. ZooKeeper provides a file-system inspired abstraction to the users on top of its replicated key-value store. Like other Paxos-inspired protocols, ZooKeeper is typically deployed on at least 3 nodes, and can tolerate F node failure for a cluster of size…

    Read More

  • Why Government IT is Expensive and Archaic

    ·

    Placeholder Icon

    Disclaimer: I do not work for the government, and my rant below is based on my very limited exposure to how IT works at the US government setting. Why Government IT is Expensive and Archaic? I think, this can be a very long discussion, but I do have a quick answer:  standards imposed by government…

    Read More

  • The First Datastore-driven Vehicle

    ·

    Placeholder Icon

    It is not a secret that procrastination is the favorite activity of most PhD students. I have been procrastinating today, even though my advisor probably wants me to keep writing.  In the midst of my procrastination, I thought: “Why are there self-driving vehicles, but no database-driven vehicles?” As absurd as it sounds, I gave it…

    Read More

Aleksey CharapkoI am an assistant professor of computer science at the University of New Hampshire. My research interests lie in distributed systems, distributed consensus, fault tolerance, reliability, and scalability.
X (twitter)@AlekseyCharapko
emailaleksey.charapko@unh.edu

Search