Aleksey Charapko

Is Java Fast Enough for Distributed Applications?

Paper Review and Summary

Aleksey Charapko · Feb 9, 2017

Many modern distributed systems are built with the Java programming language and consequently use the Java Virtual Machine (JVM) as their execution environment. The list of such systems is long: Hadoop, Spark, HBase, Cassandra, Voldemort, ZooKeeper, BookKeeper, Kafka, and many more. But is the JVM fast enough for these systems? Anyone who…
Globally Consistent Distributed Snapshots with Retroscope

Playing Around

Aleksey Charapko · Feb 8, 2017

Taking a consistent snapshot of a distributed system is no trivial task because of the asynchrony between the nodes in the system. As the state of each machine changes in response to incoming external messages or internal events, each node may produce a log of such state changes. With the log abstraction, the problem…
Gorilla – Facebook’s Cache for Time Series Data

Paper Review and Summary

Aleksey Charapko · Jan 11, 2017

Facebook operates a huge infrastructure that needs to be constantly monitored for performance and stability. Such monitoring collects large amounts of data that must be easily accessible to various diagnosis and anomaly detection tools in order to quickly identify and react to possible issues. Many such parameters can be represented as real-valued time series.…
The Light of Voldemort

Playing Around

Aleksey Charapko · Dec 19, 2016

A few months ago I showcased how a single server of the Voldemort key-value store sounds. Sonification is a valid way to monitor systems and has been used a lot in real applications. The Geiger counter is one of the best-known examples of a sonified application. In some cases sonification may be the preferred form of…
Pivot Tracing Part 2

Paper Review and Summary

Aleksey Charapko · Oct 16, 2016

After looking more at the Pivot Tracing tool described in my earlier post, I asked myself about the limitations of such a monitoring approach. Pivot Tracing is not a universal tool, so it appears that there are a few problems it does not address well enough. The basic idea of Pivot Tracing is to collect the information…
The Sound of Voldemort

Playing Around

Aleksey Charapko · Sep 4, 2016

Recently at our lab we discussed a fun little project of making distributed systems “play” music. The idea of sonifying a distributed application can be of some benefit for debugging and maintenance, since people have a natural ability to recognize patterns. Of course, developers or systems administrators can analyze the logs of their systems and study the…
Review – Pivot Tracing: Dynamic Causal Monitoring for Distributed Systems

Paper Review and Summary

Aleksey Charapko · Jul 10, 2016

Debugging can be a nightmare for software engineers, and it is even more so in distributed systems that span many machines in potentially more than one datacenter. Unfortunately, many of the debugging and monitoring techniques for such large systems do not differ much from the methods used to debug and monitor simple single-machine software. Logs…
Review: Implementing Linearizability at Large Scale and Low Latency

Paper Review and Summary

Aleksey Charapko · Feb 21, 2016

In this post I will talk about the SOSP 2015 paper Implementing Linearizability at Large Scale and Low Latency. Linearizability, the strongest form of consistency, can be very important in large-scale data storage systems, although many such systems either do not implement linearizability or do not fully expose serializable operations to the clients. The latter type…
A Few Words about Inconsistent Replication (IR)

Paper Review and Summary

Aleksey Charapko · Nov 11, 2015

Recently I was reading the “Building Consistent Transactions with Inconsistent Replication” paper. In this paper the authors use an inconsistently replicated state machine, yet they are capable of creating a consistent transaction system. So what is Inconsistent Replication (IR)? In the previous posts I summarized Raft and EPaxos. These two algorithms are used to achieve consensus…
EPaxos: Consensus with no leader

Paper Review and Summary

Aleksey Charapko · Oct 26, 2015

In my previous post I talked about the Raft consensus algorithm. Raft has a strong leader, which may present some problems under certain circumstances, for example in the case of leader failure or when deployed over a wide area network (WAN). Egalitarian Paxos, or EPaxos, discards the notion of a leader and allows each node to be…

I am an assistant professor of computer science at the University of New Hampshire. My research interests lie in distributed systems, distributed consensus, fault tolerance, reliability, and scalability.

@AlekseyCharapko

aleksey.charapko@unh.edu
