The construction, process and usage of databases has evolved a lot over the last few decades. Traditional relational databases were enough to work with the data present at that time, but with the innate reliance on the Internet, the progression of cloud-native architecture, and the advancement of how businesses utilise and analyse data science, relational databases are not cutting it. What happens if a node fails in a traditional single machine of a relational database? Your database would go down along with any applications that depend on it.
Over time as NoSQL databases were introduced—which are capable of handling a large amount of data in real-time—the risk of apps failing began to decrease but the risk of data inconsistencies increased. So, there has been a growing need for a better storage solution for data to cope with today’s dynamic cloud-native architecture. CockroachDB was specifically designed to solve and meet this need.
What Is CockroachDB?
CockroachDB is a globally distributed SQL database constructed on top of a transactional and consistent key-value store that you can use everywhere. The database tool is optimized for the cloud to deliver guaranteed transactions for local and globally distributed workloads and it allows you to build global, scalable and resilient cloud services. When using CockroachDB, you will typically run into two main terms: nodes and clusters. Nodes are individual machines running CockroachDB, and when we join these nodes together, they form a cluster which is the start of an entire functioning CockroachDB system. It’s advisable to run CockroachDB as a multi-node cluster to leverage the full scope of it’s cloud-native database design.
CockroachDB Architecture
CockroachDB is implemented as a distributed key-value store over a monolithic sorted map, to make it easy for large tables and indexes to function. While CockroachDB is a distributed SQL database, developers treat it as a relational database because it uses the same SQL syntax. But on an architecture level, CockroachDB’s architecture is different from a relational database architecture. In CockroachDB, every table is ordered lexicographically by key. So, when we store the data on the database, we are leveraging the key value store.
Since CockroachDB has a distributed architecture, we just need to spin a node up of cockroach Database, point it at a cluster, and the database participates in that cluster. CockroachDB then coordinates with the nodes to gain consensus for all queries and transactions. When we spin up a node and point at the cluster, data is balanced out based on what you optimally want to do with that data. The whole cluster has just one type of node as a single composable unit and every node is a single consistent gateway to the entirety of the database. So, we could have a database with clusters and nodes worldwide, which will look like one logical database to the application that is accessing it in whichever region.
CockroachDB architecture offers high availability and consistency. In the CockroachDB system, if a node dies, your application continues running by leveraging the other nodes in the cluster, and when you bring the node back online, the node reads are immediately consistent with the other nodes.
Synchronizing CockroachDB and Kubernetes
If you have spent a lot of time around people in the container or the orchestration community, you may have heard the opinion on occasion that running databases on Kubernetes is a bad idea. While some of these complaints were valid a couple of years ago, today there are multiple resilient databases that are capable of hiding virtualizations, failovers, container workloads, etc. Which is what makes CockroachDB is a great fit for Kubernetes clusters.
While Kubernetes makes it easy to deploy, scale and manage applications, managing and retaining the state of a cluster is a challenge on the orchestration platform. Kubernetes needs a storage system that can replicate data across the database nodes to survive any kind of failure, and that is what CockroachDB thrives at. Combining CockroachDB with Kubernetes allows us to orchestrate containers without sacrificing high availability and helps in maintaining the correctness of stateful databases.
Deploying CockroachDB on a Kubernetes Cluster
The ideal prerequisite for following the below steps is to have a Kubernetes cluster up and running for deploying and using CockroachDB on.
There are multiple ways how we can deploy CockroachDB on Kubernetes, for this article though, we will deploy through the CockroachDB Kubernetes operator. Firstly, apply the CockroachDB Operator using CustomResourceDefinition (CRD).