
Speed up your CI, decrease your cost: the preemptible node


We run GitLab, self-hosted, in Google Kubernetes Engine (GKE), and we use GitLab Runner for our CI. I have to say, this has worked beyond my expectations.

Now, a bit of a puzzle hit our happy landscape about 6 months ago: one large project didn’t fit economically into the model. I tried a few things, finally settling on running 2 runners (each in a separate Kubernetes cluster). The one in GKE was labelled ‘small’ and the other ‘big’. The ‘big’ one runs in my basement on a 72-thread / 256GB machine, which would be uneconomical to leave running in GKE.

Enter the ‘pre-emptible’ VM. Pricing is here. As you can see, it’s quite a bit less. In return, the VM is terminated at least once every 24 hours. Also, if the neighbours get ‘noisy’ (that is, Google needs the capacity for other workloads), you can be preempted at any time. This is probably acceptable for a CI pipeline.

I added this nodeSelector to the gitlab-runner:

nodeSelector:
  cloud.google.com/gke-preemptible: "true"

I then added a ‘taint’ (no, really, that is what it is called) to prevent this node pool from attracting Pods that don’t explicitly tolerate it:

kubectl taint nodes [NODE_NAME] cloud.google.com/gke-preemptible="true":NoSchedule
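
A NoSchedule taint keeps out every Pod that lacks a matching toleration, and that includes the runner's own Pods: the nodeSelector alone does not get them past it. Below is a minimal sketch of a matching toleration for a Pod spec, assuming the same key, value, and effect as the taint above. Where this fragment goes depends on how the runner is deployed (chart values, a raw Deployment, or the runner's Kubernetes executor settings). None of those are shown in this post, so treat the placement as an assumption to check against your own setup.

```yaml
# Sketch only: a Pod-spec toleration matching the taint applied above.
# Key, value, and effect must match the taint exactly for the Pod to be schedulable.
tolerations:
  - key: cloud.google.com/gke-preemptible
    operator: Equal
    value: "true"
    effect: NoSchedule
```

The toleration only permits scheduling onto the tainted nodes. The nodeSelector is still what steers the Pods there.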

And boom, we have a faster ‘small’ CI, which costs less than what it replaced. I am still going to keep the beast of the basement online for a bit.