Why do I see the error Cannot complete cluster master upgrade because there is a migration in progress?
Virtual Private Cloud Classic infrastructure
You see the following error message during a master upgrade.
Cannot complete cluster master upgrade because there is a migration in progress
You tried to upgrade the cluster master while some resources were still being migrated from a previous update.
For example, if this was a master update from IBM Cloud Kubernetes Service version 1.29 to 1.30, the Tigera Operator had not yet completed its migration from the previous IBM Cloud Kubernetes Service version 1.28 to 1.29 update.
To resolve the issue, first wait longer. Larger clusters take longer to complete the migration: after the master is successfully updated, the migration takes approximately 100 seconds per node. During that time, the Tigera Operator performs several actions, which can involve starting and stopping pods across nodes.
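As a rough illustration of the 100 seconds per node figure, the following sketch estimates the minimum wait for a given number of worker nodes. The function name estimate_migration_seconds and the node count of 30 are hypothetical and used only for this example; the result is an approximation, not a guaranteed completion time.

```shell
# Rough wait estimate: about 100 seconds per worker node (figure from the text above).
# estimate_migration_seconds is a hypothetical helper name used only for this sketch.
estimate_migration_seconds() {
  node_count="$1"
  echo $(( node_count * 100 ))
}

# Example with an assumed cluster of 30 worker nodes.
# On a live cluster, the node count could come from:
#   kubectl get nodes --no-headers | wc -l
estimate_migration_seconds 30
```

For the assumed 30 nodes, this suggests waiting on the order of 3000 seconds (50 minutes) before treating the migration as stuck.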
However, it's possible that the migration is stuck. Check whether the calico-typha and calico-node pods were removed from the kube-system namespace and created in the calico-system namespace.
If those resources have not been moved, there might be an issue with one or more worker nodes.
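One possible way to check where the pods currently run is to count the calico-typha and calico-node pods in each namespace. The sketch below is a minimal helper, assuming that pod names begin with calico-node- or calico-typha-; the function name count_calico_pods is hypothetical.

```shell
# Count calico-node and calico-typha pods in `kubectl get pods` output read from stdin.
# Assumption: pod names start with "calico-node-" or "calico-typha-".
# count_calico_pods is a hypothetical helper name used only for this sketch.
count_calico_pods() {
  # grep -c prints the number of matching lines; "|| true" keeps a zero count
  # from being reported as a failed command.
  grep -cE '^calico-(node|typha)-' || true
}

# On a live cluster (commented out here because it needs cluster access):
#   kubectl get pods -n kube-system --no-headers | count_calico_pods    # expect 0 after the migration
#   kubectl get pods -n calico-system --no-headers | count_calico_pods  # expect more than 0
```

A nonzero count in kube-system together with a low or zero count in calico-system suggests that the pods have not yet been moved.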
To troubleshoot the migration:
- Check the status of the worker nodes.
  kubectl get nodes
  If you see a worker node that is not in the Ready state, such as in the NotReady or SchedulingDisabled state, the migration might be stuck.
  NotReady example:
  NAME            STATUS     ROLES    AGE    VERSION
  10.177.112.32   NotReady   <none>   2d2h   v1.30.0+IKS
  10.177.112.50   Ready      <none>   2d2h   v1.30.0+IKS
  10.177.112.52   Ready      <none>   2d2h   v1.30.0+IKS
  SchedulingDisabled example:
  NAME            STATUS                     ROLES    AGE   VERSION
  10.177.112.32   Ready,SchedulingDisabled   <none>   95m   v1.30.0+IKS
  10.177.112.50   Ready                      <none>   95m   v1.30.0+IKS
  10.177.112.52   Ready                      <none>   95m   v1.30.0+IKS
Result:
When the worker nodes are healthy, the calico-typha and calico-node pods can resume scaling down in the kube-system namespace and scaling up in the calico-system namespace.
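On a cluster with many worker nodes, it can be easier to filter the node list than to read it by eye. The following sketch prints only the nodes whose STATUS is anything other than exactly Ready, which covers both the NotReady and the Ready,SchedulingDisabled cases. The function name list_unhealthy_nodes is hypothetical, and the sample input is the NotReady example from the troubleshooting step above.

```shell
# Print the names of nodes whose STATUS column is anything other than exactly "Ready".
# Reads `kubectl get nodes` output (including the header row) from stdin.
# list_unhealthy_nodes is a hypothetical helper name used only for this sketch.
list_unhealthy_nodes() {
  awk 'NR > 1 && $2 != "Ready" { print $1 }'
}

# Sample input copied from the NotReady example; on a live cluster, run:
#   kubectl get nodes | list_unhealthy_nodes
list_unhealthy_nodes <<'EOF'
NAME            STATUS     ROLES    AGE    VERSION
10.177.112.32   NotReady   <none>   2d2h   v1.30.0+IKS
10.177.112.50   Ready      <none>   2d2h   v1.30.0+IKS
10.177.112.52   Ready      <none>   2d2h   v1.30.0+IKS
EOF
```

With the sample input, the only node listed is 10.177.112.32. An empty result means that every node reports Ready.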
To confirm the migration is complete:
- Verify that the calico-typha deployment no longer exists in the kube-system namespace.
  kubectl get deployment calico-typha -n kube-system
  Result:
  Error from server (NotFound): deployments.apps "calico-typha" not found
- Verify that no nodes have the projectcalico.org/operator-node-migration label.
  kubectl get nodes -l projectcalico.org/operator-node-migration
  Result:
  No resources found
- If the migration is still stuck, replace or remove the problematic nodes. For more information, see Debugging worker nodes.
When you have confirmed that the migration is complete, proceed with the master update to IBM Cloud Kubernetes Service version 1.30.