The Concept
Kubernetes scheduling is often misunderstood because testing it requires scale. Spinning up 50 nodes in the cloud is expensive and slow. I wanted to create a "Flight Simulator" for DevOps engineers—a risk-free environment to crash-test scheduling logic without the AWS bill.
The Stack
- Engine: KWOK (Kubernetes WithOut Kubelet) to simulate thousands of nodes with a tiny memory footprint.
- Visualization: A custom Python-based TUI (Text User Interface) to render the cluster state live in the terminal.
- Orchestration: Standard Kubernetes Manifests to define complex scenarios (Affinity, Taints, Preemption).
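To make the engine piece concrete, here is a minimal sketch of how simulated nodes could be described for KWOK. It is not the project's actual generation script. It assumes the `kwok.x-k8s.io/node: fake` annotation and `type: kwok` label shown in KWOK's documentation; the helper names (`make_fake_node`, `make_cluster`), the zone names, and the resource sizes are hypothetical.

```python
import json


def make_fake_node(name: str, labels: dict, cpu: str = "32", memory: str = "256Gi") -> dict:
    """Build a Node manifest intended to be managed by KWOK rather than a real kubelet."""
    resources = {"cpu": cpu, "memory": memory, "pods": "110"}
    return {
        "apiVersion": "v1",
        "kind": "Node",
        "metadata": {
            "name": name,
            # Annotation used in KWOK's documentation to mark a node as simulated.
            "annotations": {"kwok.x-k8s.io/node": "fake"},
            "labels": {"type": "kwok", "kubernetes.io/hostname": name, **labels},
        },
        # Advertised capacity is arbitrary: no real machine backs these numbers.
        "status": {"allocatable": dict(resources), "capacity": dict(resources)},
    }


def make_cluster(zones: list, nodes_per_zone: int) -> list:
    """Return one fake node per (zone, index) pair, labelled with its zone."""
    nodes = []
    for zone in zones:
        for i in range(1, nodes_per_zone + 1):
            # Hypothetical naming scheme: the zone is part of the node name.
            name = f"{zone}-node-{i}"
            nodes.append(make_fake_node(name, {"topology.kubernetes.io/zone": zone}))
    return nodes


if __name__ == "__main__":
    cluster = make_cluster(["zone-a", "zone-b", "zone-c"], nodes_per_zone=10)
    # A v1 List in JSON form can be piped to `kubectl apply -f -`.
    print(json.dumps({"apiVersion": "v1", "kind": "List", "items": cluster}, indent=2))
```

Because the nodes exist only as API objects, the cost of a larger cluster is mostly the control plane's memory, which is why the node count can grow without real machines.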
The "AI-First" Workflow
This project was built in a rapid "Lead Architect + AI Team" sprint.
- Curriculum Design: I defined the pedagogical path, breaking scheduling down into 5 modules: Affinity, Taints, Topology, Preemption, and Manual Binding.
- Tool Generation: I tasked the AI agent with writing the complex Python logic for the "Scheduler Watch" tool. It parses JSON streams from the cluster and renders a visual dashboard with semantic icons (🧠 for AI apps, ⛔ for Taints, 🧱 for Batch jobs).
- Infrastructure Logic: We iterated on the node generation scripts to create "Semantic Nodes" (gpu-node-1, zone-a-node-1) instead of generic names, significantly improving the learning experience.
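The core idea behind a watch tool like this can be sketched in a few lines: split a stream of concatenated JSON documents into events, then turn each event into one dashboard row. This is an illustration and not the actual "Scheduler Watch" code. It assumes watch events shaped like `{"type": ..., "object": <Pod>}`, and the `workload` label used to choose an icon is a hypothetical convention. Taint rendering is left out for brevity.

```python
import json

# Hypothetical mapping from an assumed `workload` pod label to a dashboard icon.
ICONS = {"ai": "🧠", "batch": "🧱"}


def iter_json_objects(text: str):
    """Yield each document from a string of back-to-back JSON objects."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


def render_event(event: dict) -> str:
    """Format one watch event as a single dashboard row."""
    pod = event["object"]
    meta = pod["metadata"]
    workload = meta.get("labels", {}).get("workload", "")
    icon = ICONS.get(workload, "·")
    # spec.nodeName stays empty until the scheduler binds the pod to a node.
    node = pod.get("spec", {}).get("nodeName") or "<pending>"
    return f"{event['type']:<9} {icon} {meta['name']} -> {node}"


if __name__ == "__main__":
    sample = """
    {"type": "ADDED", "object": {"metadata": {"name": "trainer",
      "labels": {"workload": "ai"}}, "spec": {}}}
    {"type": "MODIFIED", "object": {"metadata": {"name": "trainer",
      "labels": {"workload": "ai"}}, "spec": {"nodeName": "gpu-node-1"}}}
    """
    for evt in iter_json_objects(sample):
        print(render_event(evt))
```

In a live setup the text would come from the cluster rather than a string literal, for example from a command such as `kubectl get pods --watch --output-watch-events -o json`, read incrementally instead of all at once.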
The Result
A complete, open-source workshop kit that runs a 30-node cluster using less than 1 GB of RAM. It lets engineers watch the Kubernetes scheduler make its normally "hidden" decisions, such as evicting low-priority pods or spreading pods across zones, in real time.
