Few month ago I showcased how a single server of Voldemort key-value store sounds. Sonification is a valid way to monitor systems, and has been used a lot in real applications. Geiger counter would be one of the most well-known examples of a sonified application. In some cases sonification may be the preferred form of representing information, as other forms surprisingly may not work as well for human perception. Even visualization of information is often not as good as sonification. Take the same Geiger counter example; research has shown that visual radiation level monitors do not perform quite as well as the sound ones as people tend to be distracted from the display to perform other tasks. Even visual and audio hybrids do not alert users of high radiation levels as good as simple audio counter.
As an example of a hybrid audio-visual system for Voldemort, I have built a small “traffic-light monitor” that changes colors and beeps differently depending on what action is performed by the server. The rig is built with Arduino and simply plugs to the USB of the machine running Voldemort server. Below is a short video of how it operates:
The green light lights up when server handles a “get” operation. Yellow light is for “put” requests and red light is for “get version” commands. As can be see, writing to Voldemort requires two operations, “get version” and “put” and they happen so quickly that Arduino is barely capable to light up the LEDs.
P.S. “Traffic light monitor” is not to be taken seriously, it is rather a silly example to show that there are plenty of ways to monitor a system or represent system’s logs.
Recently at our lab we discussed a fun little project of making distributed systems “play” music. The idea of sonifying a distributed application can be of some benefit for debugging and maintenance, since people have natural ability in recognizing patterns. Of course developer or systems administrators can analyze the logs of their systems and study the patterns that way, but listing to patterns and hearing the changes in such patterns is something we can do in the background, probably without taxing our entire attention span.
So how does a distributed system sound? And what can we learn by listening to it? Here is a 4.5 minute clip of a single Voldemort server playing its song.
Each message request type coming to the server was assigned a different pitch, with the note duration roughly corresponding to the time it took to fulfill the request. Of course, the recording was slowed down compared to the original execution of the node, with the entire 4 minutes and 37 seconds of audio representing just a coupe of seconds of real-time operation.
The audio has been recorded under a static workload of read and write operations, but there are few things that we can definitely hear about Voldemort’s operation even without the workload variations. The most obvious one is that the first half a minute of the audio is mostly silence. This is something I observed from the logs earlier as well, as Voldemort takes some time to get to its paces. As the execution progresses, we can definitely hear different operations happening at somewhat constant rate. In the second half of the audio, we can hear a few “hick-ups” as well as some louder and more forceful sounds for the requests that took longer to process. This, however was normal operation of Voldemort node, and introducing some problems into the system, such as network congestion or some machine failure will most definitely impact the sounds of this node.
What about making the sound of the entire distributed system? This becomes a trickier problem, as now we need to play multiple streams for all the distributed components in our system at once. Such components can be located on different physical servers and different racks and even different datacenters. However, for us to play the “true” sound of the system while preserving the causality of events, we need to be able to precisely synchronize and align the streams from various servers, accounting for any time skew and clock imprecisions.
Additionally, with multiple servers “playing” at once it may be more difficult for people to comprehend the patterns and the changes to such patterns.