Kubernetes is one of the most powerful platforms ever built. It is also one of the most operationally expensive. Whether it is the right tool for your team depends on a small number of specific questions — not on what your favorite tech podcast is recommending this month.
This post is a clear-eyed framework for deciding whether Kubernetes is right for you, and how to operate it well if you’re already on it.
When Kubernetes is the right call
Kubernetes earns its operational cost when you have:
If you check most of those boxes, Kubernetes is likely the right answer.
When Kubernetes is the wrong call
Kubernetes is overkill when:
If you’re already on Kubernetes and it’s painful
This is the most common situation we see. A team adopted Kubernetes 12–18 months ago, the engineer who set it up has since left, and now the cluster is a black box that nobody wants to touch.
The fix is not to rip Kubernetes out. The fix is to make it operable.
Step 1: Get the cluster into IaC
If your cluster was created via the console or one-off scripts, the first move is bringing it under Terraform (or Pulumi). This is foundational — nothing else matters if you can’t reproduce the cluster.
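One way to confirm the cluster really is reproducible is to check regularly for drift between the Terraform configuration and what is actually running. The following is a minimal sketch, assuming Terraform is installed and the cluster's configuration lives in a directory we call `infra/cluster` (a hypothetical path). It relies on `terraform plan -detailed-exitcode`, which exits with 0 when there is nothing to change, 1 on error, and 2 when the plan contains pending changes. The function names are ours, chosen for illustration.

```python
import subprocess

# Exit codes documented for `terraform plan -detailed-exitcode`.
PLAN_STATUS = {
    0: "in sync",  # plan succeeded and found nothing to change
    1: "error",    # plan itself failed
    2: "drift",    # plan succeeded and found pending changes
}


def classify_plan(exit_code: int) -> str:
    """Map a `terraform plan -detailed-exitcode` exit code to a status label."""
    return PLAN_STATUS.get(exit_code, "unknown")


def check_drift(workdir: str) -> str:
    """Run a plan in `workdir` (nothing is applied) and report the drift status."""
    result = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false"],
        cwd=workdir,
        capture_output=True,
        text=True,
    )
    return classify_plan(result.returncode)
```

Calling `check_drift("infra/cluster")` from a scheduled CI job and alerting on anything other than `"in sync"` is one way to notice console edits before they pile up.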
Step 2: Set up GitOps
Adopt Argo CD or Flux, and stop deploying via 'kubectl apply' from someone’s laptop. Every change to the cluster goes through Git. This single change usually eliminates 80% of the operational pain.
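To make "every change goes through Git" concrete, here is a minimal sketch of an Argo CD `Application` built as a Python dictionary and rendered as JSON (Kubernetes accepts manifests in JSON as well as YAML). It assumes Argo CD is installed in the `argocd` namespace and manages the cluster it runs in. The repository URL, path, and service name are hypothetical placeholders, and the automated sync policy is one possible choice rather than a requirement.

```python
import json


def argo_application(name: str, repo_url: str, path: str, namespace: str) -> dict:
    """Build an Argo CD Application that keeps `namespace` in sync with a Git path."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        # Assumption: Argo CD runs in the `argocd` namespace.
        "metadata": {"name": name, "namespace": "argocd"},
        "spec": {
            "project": "default",
            "source": {
                "repoURL": repo_url,
                "targetRevision": "main",
                "path": path,
            },
            "destination": {
                # In-cluster API server address, i.e. the cluster Argo CD runs in.
                "server": "https://kubernetes.default.svc",
                "namespace": namespace,
            },
            # Revert manual edits and delete resources that were removed from Git.
            "syncPolicy": {"automated": {"prune": True, "selfHeal": True}},
        },
    }


# Hypothetical service and repository, for illustration only.
manifest = argo_application(
    name="payments",
    repo_url="https://git.example.com/platform/deployments.git",
    path="apps/payments",
    namespace="payments",
)
print(json.dumps(manifest, indent=2))
```

With `selfHeal` enabled, a change made by hand on the cluster is reverted to whatever Git says, which is what takes the laptop out of the deploy path.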
Step 3: Standardize the platform layer
Pick your defaults: ingress controller, certificate manager, secrets management, observability stack, network policies. Document them. Make new services adopt them.
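Documented defaults are easier to enforce when they are also machine-readable. The sketch below shows one way to do that, assuming each service declares its platform choices in a small dictionary (for example, loaded from a file in its repository). The tool names in `PLATFORM_DEFAULTS` are illustrative picks, not recommendations, and the `deviations` helper and `legacy_api` service are hypothetical.

```python
# Example platform defaults. The specific tools are illustrative, not recommendations.
PLATFORM_DEFAULTS = {
    "ingress": "ingress-nginx",
    "certificates": "cert-manager",
    "secrets": "external-secrets",
    "observability": "prometheus-grafana",
    "network_policy": "default-deny",
}


def deviations(service: dict) -> dict:
    """Return {component: (expected, actual)} for every default the service does not follow."""
    return {
        component: (expected, service.get(component))
        for component, expected in PLATFORM_DEFAULTS.items()
        if service.get(component) != expected
    }


# A hypothetical service that predates the platform defaults.
legacy_api = {
    "ingress": "traefik",
    "certificates": "cert-manager",
    "secrets": "external-secrets",
    "observability": "prometheus-grafana",
}

print(deviations(legacy_api))
# → {'ingress': ('ingress-nginx', 'traefik'), 'network_policy': ('default-deny', None)}
```

A check like this can run in CI for each service, so that "make new services adopt them" is enforced by a failing build rather than by a reviewer's memory.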
Step 4: Bootstrap real observability inside the cluster
Prometheus + Grafana for metrics. Loki or your existing log platform for logs. OpenTelemetry collector for traces. “SSH and run kubectl logs” is not observability.
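As a small illustration of the difference, the sketch below asks Prometheus which scrape targets are down instead of checking pods one by one. It uses Prometheus's instant-query HTTP endpoint (`/api/v1/query`) and the built-in `up` metric, which is 1 when the last scrape succeeded and 0 when it failed. The base URL, the helper names, and the sample response are assumptions made for illustration.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def instant_query_url(base_url: str, promql: str) -> str:
    """Build the URL for Prometheus's instant-query endpoint."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def down_targets(response: dict) -> list:
    """List the `instance` label of every series whose sample value is 0."""
    return [
        series["metric"].get("instance", "<unknown>")
        for series in response["data"]["result"]
        if float(series["value"][1]) == 0.0
    ]


def fetch_down_targets(base_url: str) -> list:
    """Run the `up` query against a live Prometheus and return the targets that are down."""
    with urlopen(instant_query_url(base_url, "up"), timeout=10) as reply:
        return down_targets(json.load(reply))


# A made-up response in the shape the endpoint returns for the query `up`.
sample = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"job": "api", "instance": "10.0.1.12:8080"}, "value": [1700000000, "1"]},
            {"metric": {"job": "api", "instance": "10.0.1.13:8080"}, "value": [1700000000, "0"]},
        ],
    },
}

print(down_targets(sample))
# → ['10.0.1.13:8080']
```

In practice the same question would usually live in an alerting rule rather than a script. The point is that the answer comes from a query over all targets, not from someone reading logs by hand.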
Step 5: Create a cluster operations runbook
Upgrades, scaling, troubleshooting, incident response. Written down. Shared. Tested. The runbook is what turns a black-box cluster into an operable platform.
What about EKS vs GKE vs AKS?
For most teams, the right answer is the managed service on whichever cloud you’re already on. They are all production-ready. EKS has the largest ecosystem. GKE has the best out-of-the-box defaults. AKS is the natural fit on Azure, and its Free tier has no cluster management fee.
Pick the one that matches your existing cloud presence. The differences between them matter far less than the differences between operating Kubernetes well versus poorly.
What this looks like as an engagement
If you’re standing up a production-ready Kubernetes platform from scratch, our [Kubernetes Platform Setup](/services#kubernetes-platform) engagement gets you there in 4 weeks: cluster bootstrapping, GitOps, networking, security, observability, and a runbook your team can actually follow.
If you’re stabilizing an existing cluster, that’s usually a custom engagement — the work depends on what state the cluster is in and what your operability gap looks like.
The takeaway
Kubernetes is a power tool. It rewards teams that genuinely need its features and have the engineering capacity to operate it. It punishes teams that adopted it for the wrong reasons. If you’re on it and struggling, the answer is almost never to remove it — it’s to invest in the platform layer that makes it operable.