
How to safely remove a worker node from TKGM clusters

Environment Details

Tanzu Cluster Details

tanzu cluster list
  NAME          NAMESPACE  STATUS   CONTROLPLANE  WORKERS  KUBERNETES        ROLES   PLAN
  wph-wld-rp01  default    running  1/1           3/3      v1.21.2+vmware.1  <none>  dev

kubectl config use-context wph-wld-rp01-admin@wph-wld-rp01
Switched to context "wph-wld-rp01-admin@wph-wld-rp01".

kubectl get nodes
NAME                                 STATUS   ROLES                  AGE     VERSION
wph-wld-rp01-control-plane-c7mxm     Ready    control-plane,master   17d     v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-7zhpw   Ready    <none>                 3m20s   v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-hvqsb   Ready    <none>                 17d     v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-mvbj5   Ready    <none>                 3m40s   v1.21.2+vmware.1

# From Management cluster context

kubectl config use-context ph-mgmt-rp01-admin@ph-mgmt-rp01
Switched to context "ph-mgmt-rp01-admin@ph-mgmt-rp01".

kubectl get machines
NAME                                 PROVIDERID                                       PHASE     VERSION
wph-wld-rp01-control-plane-c7mxm     vsphere://423c40ed-a5fe-669d-0bb7-92432f23b36b   Running   v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-7zhpw   vsphere://423cb203-15e6-e024-fb5c-fa62e555defa   Running   v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-hvqsb   vsphere://423ce2cf-ce49-fca2-f10c-4d7996b5fc74   Running   v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-mvbj5   vsphere://423c4715-64ac-decc-7634-522888597e45   Running   v1.21.2+vmware.1

Safely Removing a Worker Node

  • Select the worker node to delete. In this example, wph-wld-rp01-md-0-64fc56fb95-mvbj5 is used.

Switch to the workload cluster context

kubectl config use-context wph-wld-rp01-admin@wph-wld-rp01
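
To confirm the switch took effect, you can optionally check the current context:

kubectl config current-context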

Drain the node

  • Drain the node using kubectl drain.
  • Depending on the workloads running on this node, additional options may be needed; an illustrative command follows this list. The two other frequently used options are:
  • --delete-local-data (renamed to --delete-emptydir-data in newer kubectl releases) - continue even if there are pods using emptyDir (local data that will be deleted when the node is drained).
  • --force - continue even if there are pods that do not declare a controller.
  • See kubectl drain --help for other drain options.
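
If the node runs pods that use emptyDir volumes or pods that are not managed by a controller, the drain might look like the following instead (illustrative only; add the extra flags only if your workload requires them):

kubectl drain wph-wld-rp01-md-0-64fc56fb95-mvbj5 --ignore-daemonsets --delete-local-data --force

In this example, only --ignore-daemonsets is needed: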
kubectl drain wph-wld-rp01-md-0-64fc56fb95-mvbj5 --ignore-daemonsets

node/wph-wld-rp01-md-0-64fc56fb95-mvbj5 already cordoned
Warning: ignoring DaemonSet-managed Pods: kube-system/calico-node-r5sfv, kube-system/kube-proxy-52pdn, kube-system/vsphere-csi-node-nq4xd
node/wph-wld-rp01-md-0-64fc56fb95-mvbj5 drained

Make sure scheduling is disabled

kubectl get nodes

NAME                                 STATUS                     ROLES                  AGE   VERSION
wph-wld-rp01-control-plane-c7mxm     Ready                      control-plane,master   17d   v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-7zhpw   Ready                      <none>                 12m   v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-hvqsb   Ready                      <none>                 17d   v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-mvbj5   Ready,SchedulingDisabled   <none>                 12m   v1.21.2+vmware.1
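
Optionally, before deleting the node, confirm that only DaemonSet-managed pods remain on it. Field selectors on spec.nodeName are a standard kubectl feature; the node name below is the one drained above:

kubectl get pods --all-namespaces --field-selector spec.nodeName=wph-wld-rp01-md-0-64fc56fb95-mvbj5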

Delete the node

kubectl delete node wph-wld-rp01-md-0-64fc56fb95-mvbj5

node "wph-wld-rp01-md-0-64fc56fb95-mvbj5" deleted

Observing changes in the management cluster

  • Running kubectl delete node against the workload cluster triggers deletion of the corresponding machine object in the management cluster as well.
  • As seen in the output below, the old machine object has been deleted and a new machine, wph-wld-rp01-md-0-64fc56fb95-wqt7d, is being provisioned.
kubectl config use-context ph-mgmt-rp01-admin@ph-mgmt-rp01
Switched to context "ph-mgmt-rp01-admin@ph-mgmt-rp01".

kubectl get machines

NAME                                 PROVIDERID                                       PHASE          VERSION
wph-wld-rp01-control-plane-c7mxm     vsphere://423c40ed-a5fe-669d-0bb7-92432f23b36b   Running        v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-7zhpw   vsphere://423cb203-15e6-e024-fb5c-fa62e555defa   Running        v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-hvqsb   vsphere://423ce2cf-ce49-fca2-f10c-4d7996b5fc74   Running        v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-wqt7d                                                    Provisioning   v1.21.2+vmware.1

# Provisioning Complete

kubectl get machines
NAME                                 PROVIDERID                                       PHASE     VERSION
wph-wld-rp01-control-plane-c7mxm     vsphere://423c40ed-a5fe-669d-0bb7-92432f23b36b   Running   v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-7zhpw   vsphere://423cb203-15e6-e024-fb5c-fa62e555defa   Running   v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-hvqsb   vsphere://423ce2cf-ce49-fca2-f10c-4d7996b5fc74   Running   v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-wqt7d   vsphere://423cf20b-d028-993d-f452-f9c303e98ce5   Running   v1.21.2+vmware.1
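
Optionally, you can also check the MachineDeployment in the management cluster to confirm the desired replica count is restored. The output should include the worker MachineDeployment (wph-wld-rp01-md-0, inferred from the machine names above) with all replicas ready:

kubectl get machinedeployments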

Verify that the new worker node has been added

  • The new node wph-wld-rp01-md-0-64fc56fb95-wqt7d has joined the cluster, matching the newly provisioned machine of the same name in the output above.
kubectl config use-context wph-wld-rp01-admin@wph-wld-rp01

kubectl get nodes

NAME                                 STATUS   ROLES                  AGE     VERSION
wph-wld-rp01-control-plane-c7mxm     Ready    control-plane,master   17d     v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-7zhpw   Ready    <none>                 19m     v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-hvqsb   Ready    <none>                 17d     v1.21.2+vmware.1
wph-wld-rp01-md-0-64fc56fb95-wqt7d   Ready    <none>                 4m53s   v1.21.2+vmware.1
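
Optionally, inspect the new node to confirm it registered correctly and that pods are being scheduled onto it:

kubectl describe node wph-wld-rp01-md-0-64fc56fb95-wqt7d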

Alternative Approach