Aleksey Charapko

Sonification of Distributed Systems with RQL

Playing Around

Aleksey Charapko · Jun 12, 2017

In the past, I have discussed sonification as a means of representing monitoring data. Aside from some silly toy examples, sonifications can be used for serious applications. In many monitoring cases, the presence of some phenomenon is more important than the details about it. In such situations, simple sonification is a perfect way to…
Retroscoping Zookeeper Staleness

Other Thoughts

Aleksey Charapko · Apr 24, 2017

ZooKeeper is a popular coordination service used as part of many large-scale distributed systems. ZooKeeper provides a file-system-inspired abstraction to users on top of its replicated key-value store. Like other Paxos-inspired protocols, ZooKeeper is typically deployed on at least 3 nodes and can tolerate F node failures for a cluster of size…
Why Government IT is Expensive and Archaic

Other Thoughts

Aleksey Charapko · Apr 6, 2017

Disclaimer: I do not work for the government, and my rant below is based on my very limited exposure to how IT works in the US government setting. Why is government IT expensive and archaic? I think this can be a very long discussion, but I do have a quick answer: standards imposed by government…
The First Datastore-driven Vehicle

Playing Around

Aleksey Charapko · Mar 25, 2017

It is not a secret that procrastination is the favorite activity of most PhD students. I have been procrastinating today, even though my advisor probably wants me to keep writing. In the midst of my procrastination, I thought: “Why are there self-driving vehicles, but no database-driven vehicles?” As absurd as it sounds, I gave it…
One Page Summary: “Musketeer: all for one, one for all in data processing systems”.

One Page Summary

Aleksey Charapko · Mar 12, 2017

Many distributed computation platforms and programming frameworks exist today, and new ones are constantly popping up in industry and academia. Some platforms are domain-specific, such as TensorFlow for machine learning. Others, like Hadoop and Naiad, are more general, and this generality allows for sophisticated and specialized programming abstractions to be built on top. So…
One Page Summary: “Slicer: Auto-Sharding for Datacenter Applications”

One Page Summary

Aleksey Charapko · Mar 8, 2017

One of the questions engineers of large distributed systems must answer is “where to compute”. This is a big and important question, as we do not want to send a request originating in the US to some server in Australia. It simply makes no sense to incur the communication overhead if there are resources available…
Monitoring with Retroscope: Detecting Invariant Violations

Playing Around

Aleksey Charapko · Feb 24, 2017

Earlier, I briefly mentioned Retroscope, our distributed snapshot library that makes it possible to take non-blocking, unplanned, consistent global distributed snapshots. However, these snapshots are only good if we know how to use them well. Of course, the most obvious use case is just a data backup, and despite it being an important application for snapshots, I…
One Page Summary: Incremental, Iterative Processing with Timely Dataflow

One Page Summary

Aleksey Charapko · Feb 13, 2017

This paper describes the Naiad distributed computation system. Naiad uses a dataflow model to represent computations, but it aims to be a general dataflow framework, in contrast to more specialized approaches such as TensorFlow. As in other dataflow systems, the computations are represented as graphs, where vertices represent data and operations and edges carry the data…
Is Java Fast Enough for Distributed Applications?

Paper Review and Summary

Aleksey Charapko · Feb 9, 2017

Lots of modern distributed systems are built with the Java programming language and consequently use the Java Virtual Machine (JVM) as their execution environment. The list of such systems is rather large: Hadoop, Spark, HBase, Cassandra, Voldemort, ZooKeeper, BookKeeper, Kafka, and the list goes on and on. But is the JVM fast enough for these systems? Anyone who…
Globally Consistent Distributed Snapshots with Retroscope

Playing Around

Aleksey Charapko · Feb 8, 2017

Taking a consistent snapshot of a distributed system is no trivial task because of the asynchrony between the nodes in the system. As the state of each machine changes in response to incoming external messages or internal events, each node may produce a log of such state changes. With the log abstraction, the problem…

I am an assistant professor of computer science at the University of New Hampshire. My research interests lie in distributed systems, distributed consensus, fault tolerance, reliability, and scalability.

@AlekseyCharapko

aleksey.charapko@unh.edu
