• Review – Pivot Tracing: Dynamic Causal Monitoring for Distributed Systems

    Debugging can be a nightmare for software engineers, and even more so in distributed systems that span many machines, potentially in more than one datacenter. Unfortunately, many of the debugging and monitoring techniques for such large systems do not differ much from the methods used to debug and monitor simple single-machine software. Logs…

  • Review: Implementing Linearizability at Large Scale and Low Latency

    In this post I will talk about the SOSP 2015 paper “Implementing Linearizability at Large Scale and Low Latency”. Linearizability, the strongest form of consistency, can be very important in large-scale data storage systems, although many such systems either do not implement linearizability or do not fully expose serializable operations to the clients. The latter type…

  • A Few Words about Inconsistent Replication (IR)

    Recently I was reading the “Building Consistent Transactions with Inconsistent Replication” paper. In this paper, the authors use an inconsistently replicated state machine, yet they are able to build a consistent transaction system. So what is Inconsistent Replication (IR)? In previous posts I summarized Raft and EPaxos. These two algorithms are used to achieve consensus…

  • EPaxos: Consensus with no leader

    In my previous post I talked about the Raft consensus algorithm. Raft has a strong leader, which may present problems under certain circumstances, for example in case of leader failure or when deployed over a wide area network (WAN). Egalitarian Paxos, or EPaxos, discards the notion of a leader and allows each node to be…

  • Consensus with Raft Algorithm

    When we talk about consensus in a distributed system, we talk about a system of multiple machines that acts as one state machine yet is capable of surviving failures of some of its nodes. Consensus algorithms are designed to ensure that all distributed nodes have the same state so that the distributed system can tolerate…

  • About Google’s Dataflow Model

    In this post I am trying to understand Google’s Dataflow Model, a data management and manipulation framework for dealing with unbounded and unordered datasets. A lot of data is constantly being produced today and has no “maximum size”; in other words, the amount of such data is constantly increasing, and therefore modern…

  • Understanding ZooKeeper

    ZooKeeper is not new; it has been around for quite some time now, yet I feel that not many of the people who use it in one way or another understand what it is. ZooKeeper is used by so many distributed systems at the moment that it has become a crucial part of distributed computing, and…

  • Graph Processing at Facebook Scale

    I will start with a short note on large-scale graph processing, as described in the paper “One Trillion Edges: Graph Processing at Facebook Scale”. Graph processing tasks are very common in analyzing various kinds of data, such as network topology or the interconnection of people. Social media poses challenges for such systems and algorithms due…

  • New Blog

    My name is Aleksey Charapko, and I am a computer science student at the University at Buffalo. In this blog I will try to write mostly on technical topics that interest me.

Aleksey Charapko
I am an assistant professor of computer science at the University of New Hampshire. My research interests lie in distributed systems, distributed consensus, fault tolerance, reliability, and scalability.
X (Twitter): @AlekseyCharapko
Email: aleksey.charapko@unh.edu