RG Special Session

Reading Group Special Session: Scalability and Fault Tolerance in YDB

YDB is an open-source Distributed SQL Database. YDB is used as an OLTP Database for mission-critical user-facing applications. It provides strong consistency and serializable transaction isolation for the end user. One of the main characteristics of YDB is scalability to very large clusters together with multitenancy, i.e. ability to provide an isolated user environment for users in a single cluster. In this talk, we will cover two layers of YDB – Tablet and BlobStorage layers that together provide fault tolerance, scalability, and user isolation.

Tablet is a very lightweight component that implements distributed consensus. It is a building block for user data storage, schema manager, tablet manager, system views, and many other components. A single YDB cluster is known to have more than a million tablets, a single YDB node can serve up to 10K tablets. Tablet implements distributed consensus protocol over a shared log provided by the BlobStorage layer.

The BlobStorage layer is a fault-tolerant key-value append-only database. It implements a specific API for logging, blobs reading and writing, garbage collection, and quorum management. We believe that the protocol between Tablet and BlobStorage is quite simple and understandable by listeners compared to Paxos or Raft. We also illustrate the flexibility of YDB in real-life use cases like migrating some parts of the cluster from one availability zone to another without a maintenance window or storing some data blobs on fast media like SSD and another blob on HDD.

Speaker: Andrey is the head of the YDB project (https://ydb.tech). He works in Yandex as a Head of Department. He also has a long story in Yandex as a member of the Web Crawler Team, a member of the Infrastructure team, one of the creators of Yandex Cloud, and now the part of CloudIL Team. Andrey holds a Ph.D. from Moscow State University.

When: Wednesday, August 10th at 2:00 pm EST (6 pm UTC)

Where: Zoom

Reading Group Special Session: Fast General Purpose Transactions in Apache Cassandra

Modern distributed databases employ leader-based consensus protocols to achieve consistency, entailing certain trade-offs: typically either a scalability bottleneck or weak isolation. Leaderless protocols have been proposed to address these and other shortcomings of leader-based techniques, but these have not yet materialized into production systems. 

This paper outlines compromises entailed by existing leaderless protocols versus leader-based protocols and proposes general techniques for addressing them. A new protocol, called Accord, is proposed with optimal failure tolerance that, under reasonable assumptions, achieves optimal latency of two message delays for transactions initiated by coordinators in any region, under any level of competing transactions and maximal tolerated process failures.

Benedict Elliott Smith will present the Accord protocol in our DistSys Reading Group. Benedict is an Apache Cassandra contributor with an interest in performance, correctness, and algorithm design.

When: February 9th 2022 at 2 PM EST (Check your time zone)

Reading Group Special Session: Distributed Transactions in YugabyteDB

When: May 11th at 12:00 pm EST

Who: Karthik Ranganathan.

Karthik Ranganathan is a founder and CTO of YugabyteDB, a globally distributed, strongly consistent database. Prior to Yugabyte, Karthik was at Facebook, where he built the Cassandra database. In this talk, Karthik will discuss Yugabyte’s use of time synchronization and Raft protocol along with some optimizations that enable high-performance distributed transactions.

Abstract

ACID transactions are a fundamental building block when developing business-critical, user-facing applications. They simplify the complex task of ensuring data integrity while supporting highly concurrent operations. While they are taken for granted in monolithic SQL databases, most distributed DBs would forsake them completely.

Fortunately, this is no longer the case. The trend started with Google Spanner, which offered distributed transactions using GPS based atomic clocks – unheard of in the database world before. Now, distributed transactions – without requiring atomic clocks – are offered by distributed SQL databases. One such example of a fully open source database offering this is YugabyteDB. Using the example of YugabyteDB, this talk will explain how distributed ACID transactions can be achieved without atomic clocks – without compromising on performance.

Reading Group Special Session: Building Distributed Systems With Stateright

This talk is part of the Distributed Systems Reading Group.

Stateright is a software framework for analyzing and systematically verifying distributed systems. Its name refers to its goal of verifying that a system’s collective state always satisfies a correctness specification, such as “operations invoked against the system are always linearizable.” Cloud service providers like AWS and Azure leverage verification software such as the TLA+ model checker to achieve the same goal, but whereas that solution can verify a high level system design, Stateright is able to systematically verify the underlying system implementation in addition to the design.

In this talk, Stateright’s author Jonathan Nadal will introduce the project, demo its usage, and discuss upcoming improvements. Jonathan has been in the cloud space since 2012 when he joined Amazon Web Services to work on S3. There he met Chris Newcombe, who introduced TLA+ to the company, and was immediately hooked by that technology. Jonathan later joined Oracle to help build Oracle Cloud Infrastructure, where he is currently a director of engineering and a proponent of model checking techniques within and outside his own teams. Stateright is a personal project not affiliated with Oracle or Amazon.

Video Recording

Reading Group

Our reading group takes place over Zoom every Wednesday at 2:00 pm EST. We have a slack group where we post papers, hold discussions, and most importantly manage Zoom invites to paper discussions. Please join the slack group to get involved!