---
title: Troubleshooting CNI plugin-related errors
content_type: task
reviewers:
- mikebrow
- divya-mohan0209
weight: 10
---
<!-- overview -->

To avoid CNI plugin-related errors, verify that you are using or upgrading to a
container runtime that has been tested to work correctly with your version of
Kubernetes.

For example, the following container runtimes are being prepared, or have
already been prepared, for Kubernetes v1.24:

* containerd v1.6.4 and later, v1.5.11 and later
* CRI-O v1.24.0 and later
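
As a quick check, you can confirm which runtime and version each node is
running. The following is a minimal sketch that assumes `kubectl` access to the
cluster and, for the per-node commands, shell access on the node itself:

```bash
# Lists each node's container runtime and version in the CONTAINER-RUNTIME column
kubectl get nodes -o wide

# On the node itself, query the runtime directly:
containerd --version   # if the node runs containerd
crio --version         # if the node runs CRI-O
```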

## About the "Incompatible CNI versions" and "Failed to destroy network for sandbox" errors

Service issues exist for pod CNI network setup and teardown in containerd
v1.6.0-v1.6.3 when the CNI plugins have not been upgraded and/or the CNI config
version is not declared in the CNI config files. The containerd team reports,
"these issues are resolved in containerd v1.6.4."

With containerd v1.6.0-v1.6.3, if you do not upgrade the CNI plugins and/or
declare the CNI config version, you might encounter the following "Incompatible
CNI versions" or "Failed to destroy network for sandbox" error conditions.

### Incompatible CNI versions error

If the version of your CNI plugin does not correctly match the plugin version in
the config (because the config version is later than the plugin version), the
containerd log will likely show an error message similar to the following when
a pod starts:
```
incompatible CNI versions; config is \"1.0.0\", plugin supports [\"0.1.0\" \"0.2.0\" \"0.3.0\" \"0.3.1\" \"0.4.0\"]"
```
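
To find this message, you can search the containerd logs. The command below is
a sketch that assumes containerd runs as a systemd service logging to journald;
adjust it for your logging setup:

```bash
# Search recent containerd logs for the incompatible-versions message
journalctl -u containerd --since "1 hour ago" | grep -i "incompatible CNI versions"
```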
To fix this issue, [update your CNI plugins and CNI config files](#updating-your-cni-plugins-and-cni-config-files).

### Failed to destroy network for sandbox error

If the version of the plugin is missing in the CNI plugin config, the pod may
run. However, stopping the pod generates an error similar to:
```
ERRO[2022-04-26T00:43:24.518165483Z] StopPodSandbox for "b" failed
error="failed to destroy network for sandbox \"bbc85f891eaf060c5a879e27bba9b6b06450210161dfdecfbb2732959fb6500a\": invalid version \"\": the version is empty"
```
This error leaves the pod in the not-ready state with a network namespace still
attached. To recover from this problem, [edit the CNI config file](#updating-your-cni-plugins-and-cni-config-files) to add
the missing version information. The next attempt to stop the pod should
be successful.
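
To confirm that a missing version declaration is the cause, you can inspect the
CNI config files on the node. This sketch assumes the conventional config
directory, `/etc/cni/net.d/`:

```bash
# Print the cniVersion declared in each CNI config file; a file with no
# cniVersion entry (or an empty value) produces the "invalid version" error.
grep -H "cniVersion" /etc/cni/net.d/*.conf /etc/cni/net.d/*.conflist 2>/dev/null
```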

### Updating your CNI plugins and CNI config files

If you're using containerd v1.6.0-v1.6.3 and encountered "Incompatible CNI
versions" or "Failed to destroy network for sandbox" errors, consider updating
your CNI plugins and editing the CNI config files.

Here's an overview of the typical steps for each node; a command sketch of the
same sequence follows the list:

1. [Safely drain and cordon the
   node](/docs/tasks/administer-cluster/safely-drain-node/).

2. After stopping your container runtime and kubelet services, perform the
   following upgrade operations:

   - If you're running CNI plugins, upgrade them to the latest version.
   - If you're using non-CNI plugins, replace them with CNI plugins. Use the
     latest version of the plugins.
   - Update the plugin configuration file to specify or match a version of the
     CNI specification that the plugin supports, as shown in the following
     ["An example containerd configuration file"](#an-example-containerd-configuration-file)
     section.
   - For `containerd`, ensure that you have installed the latest version
     (v1.0.0 or later) of the CNI loopback plugin.
   - Upgrade node components (for example, the kubelet) to Kubernetes v1.24.
   - Upgrade to or install the most current version of the container runtime.

3. Bring the node back into your cluster by restarting your container runtime
   and kubelet. Uncordon the node (`kubectl uncordon <nodename>`).
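
The following is a minimal sketch of that sequence. It assumes a
systemd-managed node, an amd64 Linux host, and a placeholder node name; the CNI
plugins release shown is only an example, so check the
[containernetworking/plugins releases page](https://github.com/containernetworking/plugins/releases)
for the current version:

```bash
# From a machine with cluster-admin access: cordon and drain the node.
kubectl drain <node-name> --ignore-daemonsets

# On the node itself: stop the kubelet and the container runtime.
sudo systemctl stop kubelet containerd

# Upgrade the CNI plugins (example version; verify the latest release first).
CNI_PLUGINS_VERSION="v1.1.1"
curl -L "https://github.com/containernetworking/plugins/releases/download/${CNI_PLUGINS_VERSION}/cni-plugins-linux-amd64-${CNI_PLUGINS_VERSION}.tgz" \
  | sudo tar -C /opt/cni/bin -xz

# Upgrade containerd and the kubelet here, using your distribution's packages.

# Restart the services and return the node to service.
sudo systemctl start containerd kubelet
kubectl uncordon <node-name>
```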

## An example containerd configuration file

The following example shows a configuration for `containerd` runtime v1.6.x,
which supports a recent version of the CNI specification (v1.0.0).

Please see the documentation from your plugin and networking provider for
further instructions on configuring your system.

On Kubernetes, the containerd runtime adds a loopback interface, `lo`, to pods
as a default behavior. The containerd runtime configures the loopback interface
via a CNI plugin, `loopback`. The `loopback` plugin is distributed as part of
the `containerd` release packages that have the `cni` designation. `containerd`
v1.6.0 and later includes a CNI v1.0.0-compatible loopback plugin as well as
other default CNI plugins. The configuration for the loopback plugin is done
internally by containerd, and is set to use CNI v1.0.0. This also means that the
version of the `loopback` plugin must be v1.0.0 or later when this newer version
of `containerd` is started.
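
To see which CNI spec versions a plugin binary supports, you can invoke it with
the `VERSION` command. This works for plugins built with the CNI reference
skeleton and assumes the binaries live in the default `/opt/cni/bin` directory:

```bash
# Prints a JSON document listing the spec versions the plugin supports,
# for example: {"cniVersion":"1.0.0","supportedVersions":["0.1.0", ...]}
CNI_COMMAND=VERSION /opt/cni/bin/loopback
```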

The following bash command generates an example CNI config. Here, the `1.0.0`
value for the config version is assigned to the `cniVersion` field for use when
`containerd` invokes the CNI bridge plugin.
```bash
cat << EOF | tee /etc/cni/net.d/10-containerd-net.conflist
{
  "cniVersion": "1.0.0",
  "name": "containerd-net",
  "plugins": [
    {
      "type": "bridge",
      "bridge": "cni0",
      "isGateway": true,
      "ipMasq": true,
      "promiscMode": true,
      "ipam": {
        "type": "host-local",
        "ranges": [
          [{
            "subnet": "10.88.0.0/16"
          }],
          [{
            "subnet": "2001:db8:4860::/64"
          }]
        ],
        "routes": [
          { "dst": "0.0.0.0/0" },
          { "dst": "::/0" }
        ]
      }
    },
    {
      "type": "portmap",
      "capabilities": {"portMappings": true}
    }
  ]
}
EOF
```
Update the IP address ranges in the preceding example with ones that are based
on your use case and network addressing plan.
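
After editing the file, you can verify that it parses and declares a version,
then restart containerd to pick up the change. This sketch assumes `jq` is
installed and that containerd runs under systemd:

```bash
# Confirm the config file is valid JSON and declares a CNI version.
jq '.cniVersion' /etc/cni/net.d/10-containerd-net.conflist

# Restart containerd so it reloads the CNI configuration.
sudo systemctl restart containerd
```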