diff --git a/content/en/blog/_posts/2025-02-14-cloud-controller-manager-chicken-egg-problem/ccm-chicken-egg-problem-sequence-diagram.svg b/content/en/blog/_posts/2025-02-14-cloud-controller-manager-chicken-egg-problem/ccm-chicken-egg-problem-sequence-diagram.svg
new file mode 100644
index 0000000000..ad01afaee2
--- /dev/null
+++ b/content/en/blog/_posts/2025-02-14-cloud-controller-manager-chicken-egg-problem/ccm-chicken-egg-problem-sequence-diagram.svg
@@ -0,0 +1,3 @@
+
+
+[SVG sequence diagram. Participants: Kubelet, Kube-apiserver, Cloud-controller-manager. Messages: 1 Create Node, 2 Node Created, 3 Watch: New Node Created, 4 Update Node. Notes: Taint: node.cloudprovider.kubernetes.io; Node is Not Ready (Tainted, Missing Node Addresses*, ...); Send Updates; Initialize Node (Cloud Provider Labels, Node Addresses, ...); Node is Ready.]
\ No newline at end of file
diff --git a/content/en/blog/_posts/2025-02-14-cloud-controller-manager-chicken-egg-problem.md b/content/en/blog/_posts/2025-02-14-cloud-controller-manager-chicken-egg-problem/index.md
similarity index 92%
rename from content/en/blog/_posts/2025-02-14-cloud-controller-manager-chicken-egg-problem.md
rename to content/en/blog/_posts/2025-02-14-cloud-controller-manager-chicken-egg-problem/index.md
index 99662b9041..0a1e601bfd 100644
--- a/content/en/blog/_posts/2025-02-14-cloud-controller-manager-chicken-egg-problem.md
+++ b/content/en/blog/_posts/2025-02-14-cloud-controller-manager-chicken-egg-problem/index.md
@@ -25,34 +25,27 @@ The [cloud controller manager is part of the control plane][ccm]. It is a critic
 that replaces some functionality that existed previously in the kube-controller-manager and the
 kubelet.
 
-![Components of Kubernetes](https://kubernetes.io/images/docs/components-of-kubernetes.svg)
+{{< figure
+    src="/images/docs/components-of-kubernetes.svg"
+    alt="Components of Kubernetes"
+    caption="Components of Kubernetes"
+>}}
 
 One of the most critical functionalities of the cloud controller manager is the node controller,
 which is responsible for the initialization of the nodes.
 
-As you can see in the following diagram, when the **kubelet** starts, it registers the `Node`
+As you can see in the following diagram, when the **kubelet** starts, it registers the Node
 object with the apiserver, Tainting the node so it can be processed first by the
-cloud-controller-manager. The initial `Node` is missing the cloud-provider specific information,
+cloud-controller-manager. The initial Node is missing the cloud-provider specific information,
 like the Node Addresses and the Labels with the cloud provider specific information like the
 Node, Region and Instance type information.
 
-```mermaid
-sequenceDiagram
-    autonumber
-    rect rgb(191, 223, 255)
-    Kubelet->>+Kube-apiserver: Create Node
-    Note over Kubelet: Taint:<br/>node.cloudprovider.kubernetes.io
-    Kube-apiserver->>-Kubelet: Node Created
-    end
-    Note over Kube-apiserver: Node is Not Ready<br/>Tainted, Missing Node Addresses*, ...
-    Note over Kube-apiserver: Send Updates
-    rect rgb(200, 150, 255)
-    Kube-apiserver->>+Cloud-controller-manager: Watch: New Node Created
-    Note over Cloud-controller-manager: Initialize Node:<br/>Cloud Provider Labels, Node Addresses, ...
-    Cloud-controller-manager->>-Kube-apiserver: Update Node
-    end
-    Note over Kube-apiserver: Node is Ready
-```
+{{< figure
+    src="ccm-chicken-egg-problem-sequence-diagram.svg"
+    alt="Chicken and egg problem sequence diagram"
+    caption="Chicken and egg problem sequence diagram"
+    class="diagram-large"
+>}}
 
 This new initialization process adds some latency to the node readiness. Previously, the kubelet
 was able to initialize the node at the same time it created the node. Since the logic has moved
@@ -100,9 +93,9 @@ The [Kubernetes documentation describes][kubedocs1] the `node.kubernetes.io/not-
 
 > "The Node controller detects whether a Node is ready by monitoring its health and adds or removes this taint accordingly."
 
-One of the conditions that can lead to a `Node` resource having this taint is when the container
+One of the conditions that can lead to a Node resource having this taint is when the container
 network has not yet been initialized on that node. As the cloud-controller-manager is responsible
-for adding the IP addresses to a `Node` resource, and the IP addresses are needed by the container
+for adding the IP addresses to a Node resource, and the IP addresses are needed by the container
 network controllers to properly configure the container network, it is possible in some
 circumstances for a node to become stuck as not ready and uninitialized permanently.
 