A Definitive Guide to Kubernetes Operators: The Unfolding!

Episode 1

Sanjit Mohanty
Aug 10, 2022

Let me tell you a story. “Story, huh??” — Well, just stay with me & I promise there’s a lot in store for you!

The Prelude

This story is about a firm, Sanjit Ecom Pvt. Ltd. The firm is in the e-commerce business and is doing extremely well, booking an astounding year-over-year (YoY) profit.

Consequently, more and more customers are being onboarded onto its platform, which is driving the firm to draw up short-term and long-term plans for improving the performance, scalability and availability of its application platform. It’s clear that if this is not addressed, the firm’s future growth is in question!

As a mitigation plan, the firm hired an external consultant to help address these pressing needs. The consultant suggested the following remediation actions:

  • Moving from their traditional development architecture to a cloud-native architecture based on microservices for their workloads.
  • Building the services as containers.
  • Deploying, orchestrating and automating them on the Kubernetes platform.
  • Embracing DevOps practices, which meant bringing the software development (dev) team and the IT operations (ops) team together to conceive, build and deliver secure software at top speed through close collaboration.

The consultant promised that, if done right, these remediations would shorten the systems development life cycle and provide continuous delivery with high software quality, while also meeting the firm’s short-term and long-term goals.

The firm was convinced by this proposal, and all the stakeholders sprang into action.

Within a year, the company had moved its workloads to a microservices-based architecture deployed on the Kubernetes platform. They were quite happy with the changes, as they saw immediate results: their application platform had become more resilient, scalable and available than ever before.

The Tussle

Over time, as more and more features, and consequently new assets, got rolled out to production, the ops team surprisingly started having a hard time. They came under tremendous pressure and began to burn out.

But why? Wasn’t the adoption of Kubernetes supposed to make life simpler? What happened to all the tall promises?

The Retrospection

A retrospection within the firm revealed some very interesting facts.

As more and more assets were pushed to prod, the ops team needed to up their game and learn the details of configuring and managing these new assets, such as installing a new database cluster of a declared software version and spinning up the desired number of members. On top of that, they also needed to monitor these assets as they ran and, either manually or through automation, back up data, recover from failures, and upgrade these assets over time.

The ops team was having a hard time managing these complex workloads on the Kubernetes cluster. This needed to be addressed at any cost for the survival of the product and the firm!

The Exciting Bit

Well, the firm once again hired the consultant it had worked with earlier, and after due diligence the consultant proposed that the firm start embracing Kubernetes Operators.

“But what is it? What does it promise? And how does it do what it promises?” These were a few of the questions the firm and the ops engineers had when the proposal was put to them.

The consultant then answered these questions in sequence:

What is a Kubernetes Operator?

A Kubernetes Operator is an abstraction for deploying non-trivial applications on top of Kubernetes, behind Kubernetes APIs.

Sounds too geeky? Well, in simple terms, the idea behind Operators is that running an application such as a Postgres or Cassandra database, or any other complex application for that matter, takes quite a bit of domain-specific knowledge! A Kubernetes Operator attempts to wrap the logic for deploying and operating such a complex, non-trivial application in Kubernetes constructs, thus making life easier for the ops engineering team.

Let’s understand this through an example: Cass Operator.

DataStax Kubernetes Operator for Apache Cassandra, or Cass Operator for short, automates the process of deploying and managing open-source Apache Cassandra or DataStax Enterprise (DSE) in a Kubernetes cluster. Cass Operator distills the user-supplied information down to the number of nodes and the cluster name, and uses it to manage the lifecycle of individual Kubernetes resources. Additional options are available, but for starters, that’s essentially all you’ll need to specify. Now the process of managing the distributed Cassandra or DSE data platform is much easier!
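To make that a bit more concrete, here is a minimal sketch in Python of the kind of information a user might hand to Cass Operator. Treat it as an illustration only: the apiVersion, kind and field names are written from memory of the Cass Operator documentation and are assumptions to verify against the version you install. The names `ecom-cluster` and `dc1` and the version number are made up, and a real resource also needs more settings (storage, for instance) that are left out here.

```python
import json


def cassandra_datacenter(cluster_name, size, name="dc1", server_version="4.0.1"):
    """Build a minimal resource shaped like a Cass Operator datacenter.

    Field names are assumptions based on the Cass Operator docs; the
    defaults for name and server_version are made-up example values.
    """
    if size < 1:
        raise ValueError("size must be at least 1")
    return {
        "apiVersion": "cassandra.datastax.com/v1beta1",
        "kind": "CassandraDatacenter",
        "metadata": {"name": name},
        "spec": {
            "clusterName": cluster_name,   # the cluster name
            "size": size,                  # the number of nodes
            "serverType": "cassandra",
            "serverVersion": server_version,
        },
    }


manifest = cassandra_datacenter("ecom-cluster", size=3)
# kubectl accepts JSON as well as YAML, so this output could be saved to a file.
print(json.dumps(manifest, indent=2))
```

The point of the sketch is how little the user has to state: a cluster name and a node count. Everything else about running Cassandra is the Operator’s job.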

What does a Kubernetes Operator promise?

Well, to summarise what a Kubernetes Operator promises, the consultant quoted Sebastian Pahl:

“When it comes to really fully automating an application, and by fully automating, I mean handling updates from one version to another without waking people up, handling failure recovery if it’s needed, scaling the application up and down depending on some scenarios — all of that should be handled automatically. Humans should not be involved in this kind of operation because it kind of breaks the promise that containers gave us. Containers’ promise was ‘Hey, you package it once, it runs everywhere.’ Well, that’s true for developers, but in production.”

Kubernetes Operators aim to fulfil exactly this promise! Think of an Operator as a fully automated ops engineering team that is tuned to manage, for example, a cluster of database servers.

It knows the details of configuring and managing its application, and can install a database cluster of a declared software version and number of members. It continues to monitor its application as it runs, and can back up data, recover from failures, and upgrade the application over time, automatically.

How does a Kubernetes Operator do what it promises?

Well, let’s first understand what Custom Resources are in Kubernetes. Custom Resources (CRs) are the way to extend the Kubernetes API beyond its core resources. The schema for such a custom resource is defined first, in what is called a Custom Resource Definition (CRD).

Unlike members of the official API, a given CRD doesn’t exist on every Kubernetes cluster. It needs to be installed first, for example with the kubectl utility. After that, a cluster administrator can interact with the CRs just as they would with any core Kubernetes resource.
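Here is a rough sketch, in Python, of what a CRD looks like and how it constrains the CRs of its kind. The `ShopDatabase` kind, the `ecom.example.com` group and the two spec fields are hypothetical names invented for this story. The outer layout follows the `apiextensions.k8s.io/v1` CustomResourceDefinition format, while the `validate` function is only a toy stand-in for the schema checking that the Kubernetes API server does for real.

```python
# A CRD for a hypothetical "ShopDatabase" resource, written as a Python dict.
CRD = {
    "apiVersion": "apiextensions.k8s.io/v1",
    "kind": "CustomResourceDefinition",
    "metadata": {"name": "shopdatabases.ecom.example.com"},
    "spec": {
        "group": "ecom.example.com",
        "scope": "Namespaced",
        "names": {"kind": "ShopDatabase", "plural": "shopdatabases",
                  "singular": "shopdatabase"},
        "versions": [{
            "name": "v1alpha1",
            "served": True,
            "storage": True,
            "schema": {"openAPIV3Schema": {
                "type": "object",
                "properties": {"spec": {
                    "type": "object",
                    "required": ["version", "members"],
                    "properties": {
                        "version": {"type": "string"},
                        "members": {"type": "integer", "minimum": 1},
                    },
                }},
            }},
        }],
    },
}

PY_TYPES = {"string": str, "integer": int}


def validate(cr, crd=CRD):
    """Toy check of a CR against the CRD; returns a list of problems."""
    problems = []
    version = crd["spec"]["versions"][0]
    expected_api = crd["spec"]["group"] + "/" + version["name"]
    if cr.get("apiVersion") != expected_api:
        problems.append("apiVersion should be " + expected_api)
    if cr.get("kind") != crd["spec"]["names"]["kind"]:
        problems.append("kind should be " + crd["spec"]["names"]["kind"])
    schema = version["schema"]["openAPIV3Schema"]["properties"]["spec"]
    spec = cr.get("spec", {})
    for field in schema["required"]:
        if field not in spec:
            problems.append("spec." + field + " is required")
    for field, rule in schema["properties"].items():
        if field not in spec:
            continue
        value = spec[field]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, PY_TYPES[rule["type"]]):
            problems.append("spec." + field + " should be of type " + rule["type"])
        elif "minimum" in rule and value < rule["minimum"]:
            problems.append("spec." + field + " is below the minimum")
    return problems


good_cr = {"apiVersion": "ecom.example.com/v1alpha1", "kind": "ShopDatabase",
           "metadata": {"name": "orders-db"},
           "spec": {"version": "4.0", "members": 3}}
bad_cr = {"apiVersion": "ecom.example.com/v1alpha1", "kind": "ShopDatabase",
          "metadata": {"name": "broken-db"},
          "spec": {"members": 0}}

print(validate(good_cr))
print(validate(bad_cr))
```

Once a definition like this is installed on a cluster, resources of the new kind can be created, listed and deleted with the same tooling used for built-in resources.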

A Kubernetes Operator adds a new custom resource (CR) type to the Kubernetes cluster and, along with it, introduces a new component that continuously monitors and maintains resources of this new type.

So, in summary, making an Operator actually means:

  • Creating a CRD, and
  • Providing a program that runs in a loop, watching the CRs of the kind that CRD defines.

What the Operator does in response to changes in the CR is specific to the application the Operator manages.
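The "program that runs in a loop" part can be sketched in a few lines of Python. This is a simulation under loud assumptions: `FakeCluster` is a hypothetical in-memory stand-in for the Kubernetes API, the pod names and version strings are invented, and a real Operator would watch the API server rather than a dict. What the sketch does show is the core idea: compare the desired state from the CR with the observed state, and take one corrective step at a time until the two match.

```python
from dataclasses import dataclass, field


@dataclass
class FakeCluster:
    """Hypothetical in-memory stand-in for the Kubernetes API."""
    desired: dict                              # what the CR's spec asks for
    pods: dict = field(default_factory=dict)   # pod name -> running version


def reconcile(cluster):
    """One pass of the loop: take at most one step towards the desired state.

    Returns a description of the action taken, or None if nothing is left to do.
    """
    want_members = cluster.desired["members"]
    want_version = cluster.desired["version"]

    if len(cluster.pods) < want_members:       # too few members: scale up
        name = "db-" + str(len(cluster.pods))
        cluster.pods[name] = want_version
        return "created " + name
    if len(cluster.pods) > want_members:       # too many members: scale down
        name = "db-" + str(len(cluster.pods) - 1)
        del cluster.pods[name]
        return "deleted " + name
    for name, version in sorted(cluster.pods.items()):
        if version != want_version:            # wrong version: upgrade one pod
            cluster.pods[name] = want_version
            return "upgraded " + name + " to " + want_version
    return None


def run_until_settled(cluster, max_passes=100):
    """Call reconcile repeatedly until observed state matches desired state."""
    actions = []
    for _ in range(max_passes):
        action = reconcile(cluster)
        if action is None:
            break
        actions.append(action)
    return actions


cluster = FakeCluster(desired={"members": 3, "version": "4.0"})
first_run = run_until_settled(cluster)
print(first_run)

# Someone edits the CR: fewer members, newer version.
cluster.desired = {"members": 2, "version": "4.1"}
second_run = run_until_settled(cluster)
print(second_run)
```

Notice that the user never tells the loop what to do, only what the end result should look like. Deciding which steps to take, and in what order, is exactly the domain knowledge an Operator packages up.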

For example, Cass Operator’s logic is specific to the Apache Cassandra domain. The actions Cass Operator may decide to take could include things like horizontally scaling Cassandra pods or upgrading Cassandra to a new version.

Cass Operator scaling up reference — https://raw.githubusercontent.com/k8ssandra/cass-operator/master/docs/developer/diagrams/scale-up-diagram.svg

The firm and the ops engineers found this quite useful, were convinced once again, and decided to start embracing Kubernetes Operators!

What Next?

Well, I hope that with this story I was able to explain some of the key concepts around Kubernetes Operators, how they work, and how they can help address some of today’s gaps.

My intention in this episode was to explain the key concepts without getting too geeky.

In the next episode, I promise no more stories; instead, we will dig deep and get our hands dirty with some examples, experiencing the complete lifecycle of a Kubernetes Operator hands-on. So, until then, stay tuned!

Update: Episode 2 is live now. Do check it out! 🙏


