Distributed Systems

  • Reading Group. Log-structured Protocols in Delos

    For the 87th DistSys paper, we looked at “Log-structured Protocols in Delos” by Mahesh Balakrishnan, Chen Shen, Ahmed Jafri, Suyog Mapara, David Geraghty, Jason Flinn, Vidhya Venkat, Ivailo Nedelchev, Santosh Ghosh, Mihir Dharamshi, Jingming Liu, Filip Gruszczynski, Jun Li, Rounak Tibrewal, Ali Zaveri, Rajeev Nagar, Ahmed Yossef, Francois Richard, and Yee Jiun Song. The paper appeared…

  • Reading Group. Conflict-free Replicated Data Types

    We kicked off a new set of papers in the reading group with some fundamental reading: “Conflict-free Replicated Data Types.” Although the paper is not very old (and not the first to suggest something similar to CRDTs), it presents a proper definition of Conflict-free Replicated Data Types (CRDTs) and the consistency framework around…

  • Planetary-Scale Systems Seminar Spring 2021

    This spring semester I am teaching an exciting seminar class: “Planetary-Scale Systems.” I will start the seminar with a four-lecture crash course to get my students on the same page, but the bulk of the class will be paper presentations and discussions. The format is similar to the Zoom reading group I am…

  • One Page Summary: “Musketeer: all for one, one for all in data processing systems”.

    Many distributed computation platforms and programming frameworks exist today, and new ones are constantly popping up in industry and academia. Some platforms are domain-specific, such as TensorFlow for machine learning. Others, like Hadoop and Naiad, are more general, and this generality allows sophisticated and specialized programming abstractions to be built on top. So…

  • One Page Summary: Incremental, Iterative Processing with Timely Dataflow

    This paper describes the Naiad distributed computation system. Naiad uses a dataflow model to represent computations, but it aims to be a general dataflow framework, in contrast to more specialized approaches such as TensorFlow. As in other dataflow systems, the computations are represented as graphs, where vertices represent data and operations and edges carry the data…

  • Is Java Fast Enough for Distributed Applications?

    Many modern distributed systems are built with the Java programming language and consequently use the Java Virtual Machine (JVM) as their execution environment. The list of such systems is rather large: Hadoop, Spark, HBase, Cassandra, Voldemort, ZooKeeper, BookKeeper, Kafka, and the list goes on and on. But is the JVM fast enough for these systems? Anyone who…
