Bare-metal Kubernetes for a SaaS platform
A growing SaaS company needed cloud-grade elasticity on hardware it owned, both to control costs and to meet data-residency requirements.
The constraint
Public-cloud pricing scaled faster than the business did, and a subset of the customer base had hard data-residency requirements that ruled out shared regions outright. The team had capable hardware in their own racks but no way to treat it like a cloud: every node was hand-built, scaling meant a ticket and a weekend, and nothing was reproducible.
Architecture
We treated the physical fleet as a programmable substrate rather than a set of pets:
- Infrastructure as code. Terraform describes the desired cluster topology; Ansible turns bare hosts into hypervisors and configures the libvirt/KVM layer. A cluster is a committed, reviewable artifact — not tribal knowledge.
- Virtualized node pools. Kubernetes nodes run as KVM guests over libvirt, so capacity can be partitioned, recreated, and rebalanced without touching the physical layout.
- Custom autoscaler. Off-the-shelf autoscalers assume a cloud provider API. We wrote a controller that watches pending workloads and provisions or reclaims KVM-backed nodes against the on-prem pool — the elasticity story, on hardware they own.
Everything is GitOps-driven: a change to the IaC repo is the only way the fleet changes, which makes drift visible and rollbacks boring.
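The decision loop at the core of the custom autoscaler described above, as an added example; this is not the client's controller or any code from the engagement. It assumes the controller's inputs are the resource requests of pending (unschedulable) pods and each KVM guest's free allocatable capacity, and it introduces a hypothetical `PoolLimits` type that caps the pool at the physical budget. The decision is kept pure, so it can be exercised without a hypervisor; the cluster state in `main` is constructed for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// Resources is a CPU/memory pair in millicores and MiB.
type Resources struct {
	CPUMilli int64
	MemMiB   int64
}

func (r Resources) fits(in Resources) bool {
	return r.CPUMilli <= in.CPUMilli && r.MemMiB <= in.MemMiB
}

func (r Resources) sub(o Resources) Resources {
	return Resources{r.CPUMilli - o.CPUMilli, r.MemMiB - o.MemMiB}
}

// PendingPod is a pod the scheduler could not place (what the controller watches).
type PendingPod struct {
	Name string
	Req  Resources
}

// Node is one KVM-backed worker currently registered in the cluster.
type Node struct {
	Name     string
	Free     Resources // allocatable minus the requests already bound to it
	PodCount int       // non-DaemonSet pods; 0 means it can be reclaimed without eviction
}

// PoolLimits (hypothetical) bounds what the controller may do to the physical pool.
type PoolLimits struct {
	MinNodes     int
	MaxNodes     int
	NodeTemplate Resources // allocatable capacity of one freshly provisioned guest
}

// Plan is one reconcile loop's decision; a separate step applies it via libvirt.
type Plan struct {
	AddNodes    int
	RemoveNodes []string // idle guests to destroy
	Unplaceable []string // pods no amount of scaling can fit; surface as an event, do not loop on them
}

// Reconcile decides how many guests to add or which idle ones to reclaim.
// It never mutates its inputs, so it can be unit-tested without a hypervisor.
func Reconcile(pending []PendingPod, nodes []Node, lim PoolLimits) (Plan, error) {
	if lim.MinNodes < 0 || lim.MaxNodes < lim.MinNodes {
		return Plan{}, errors.New("invalid pool limits")
	}
	if lim.NodeTemplate.CPUMilli <= 0 || lim.NodeTemplate.MemMiB <= 0 {
		return Plan{}, errors.New("node template must have positive capacity")
	}

	// Simulate on a copy of free capacity so caller state is untouched.
	free := make([]Resources, len(nodes))
	for i, n := range nodes {
		free[i] = n.Free
	}
	// Largest requests first: first-fit-decreasing packs tighter than arrival order.
	pods := append([]PendingPod(nil), pending...)
	sort.Slice(pods, func(i, j int) bool { return pods[i].Req.CPUMilli > pods[j].Req.CPUMilli })

	var plan Plan
	var newNodes []Resources // remaining capacity of guests we would provision
	for _, p := range pods {
		if !p.Req.fits(lim.NodeTemplate) {
			plan.Unplaceable = append(plan.Unplaceable, p.Name)
			continue
		}
		if placeFirstFit(p.Req, free) || placeFirstFit(p.Req, newNodes) {
			continue
		}
		newNodes = append(newNodes, lim.NodeTemplate.sub(p.Req))
	}

	// Hard ceiling: a flood of pending pods cannot push the pool past the physical budget.
	room := lim.MaxNodes - len(nodes)
	if room < 0 {
		room = 0
	}
	plan.AddNodes = len(newNodes)
	if plan.AddNodes > room {
		plan.AddNodes = room
	}

	// Reclaim only when nothing is waiting and we are not growing; never go below MinNodes.
	if len(pending) == 0 && plan.AddNodes == 0 {
		for _, n := range nodes {
			if n.PodCount != 0 {
				continue
			}
			if len(nodes)-len(plan.RemoveNodes) <= lim.MinNodes {
				break
			}
			plan.RemoveNodes = append(plan.RemoveNodes, n.Name)
		}
	}
	return plan, nil
}

// placeFirstFit books req against the first slot with room, updating that slot in place.
func placeFirstFit(req Resources, free []Resources) bool {
	for i := range free {
		if req.fits(free[i]) {
			free[i] = free[i].sub(req)
			return true
		}
	}
	return false
}

func main() {
	// Constructed sample state, not taken from any real cluster.
	nodes := []Node{
		{Name: "kvm-worker-1", Free: Resources{500, 1024}, PodCount: 7},
		{Name: "kvm-worker-2", Free: Resources{1000, 2048}, PodCount: 3},
		{Name: "kvm-worker-3", Free: Resources{4000, 8192}, PodCount: 0},
	}
	lim := PoolLimits{MinNodes: 2, MaxNodes: 4, NodeTemplate: Resources{4000, 8192}}
	pending := []PendingPod{
		{"api-7c9d", Resources{2000, 4096}},
		{"api-7c9e", Resources{2000, 4096}},
		{"worker-a1", Resources{3000, 6144}},
		{"batch-big", Resources{8000, 16384}},
	}
	plan, err := Reconcile(pending, nodes, lim)
	fmt.Printf("plan=%+v err=%v\n", plan, err)
}
```

The bounds live in the controller on purpose: anything that can create a pod can make it pending, so without `MaxNodes` any tenant with pod-create rights would indirectly control physical provisioning. Keeping the decision separate from the libvirt call also means the credential that can create and destroy guests is only ever exercised on a plan that has passed these checks.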
Outcome
The result is reproducible clusters, automatic scale-to-load on bare metal, and, critically, infrastructure the client's own team can read, extend, and operate without us in the loop. Costs track the hardware they already own, and residency is satisfied by construction: every workload runs on nodes in racks the client controls.