---
title: Running Kubernetes Node Components as a Non-root User
content_type: task
min-kubernetes-server-version: 1.22
weight: 300
---
<!-- overview -->
{{< feature-state for_k8s_version="v1.22" state="alpha" >}}
This document describes how to run Kubernetes Node components such as kubelet, CRI, OCI, and CNI
without root privileges, by using a {{< glossary_tooltip text="user namespace" term_id="userns" >}}.
This technique is also known as _rootless mode_.
{{< note >}}
This document describes how to run Kubernetes Node components (and hence pods) as a non-root user.
If you are just looking for how to run a pod as a non-root user, see [SecurityContext](/docs/tasks/configure-pod-container/security-context/).
{{< /note >}}
## {{% heading "prerequisites" %}}
{{% version-check %}}
* [Enable Cgroup v2](https://rootlesscontaine.rs/getting-started/common/cgroup2/)
* [Enable systemd with user session](https://rootlesscontaine.rs/getting-started/common/login/)
* [Configure several sysctl values, depending on host Linux distribution](https://rootlesscontaine.rs/getting-started/common/sysctl/)
* [Ensure that your unprivileged user is listed in `/etc/subuid` and `/etc/subgid`](https://rootlesscontaine.rs/getting-started/common/subuid/)
* Enable the `KubeletInUserNamespace` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
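
As a rough sketch of how you might verify two of these prerequisites before continuing,
the following shell functions check for a cgroup v2 hierarchy and for a subordinate ID range.
The function names are hypothetical, and the default path `/sys/fs/cgroup` is an assumption
about where cgroup v2 is mounted on your host.

```shell
#!/bin/sh
# Hypothetical helpers; they are not part of Kubernetes or of any rootless tool.

# Succeeds if the directory $1 (default: /sys/fs/cgroup) looks like the root of
# a cgroup v2 unified hierarchy, which exposes a cgroup.controllers file.
is_cgroup_v2() {
  [ -f "${1:-/sys/fs/cgroup}/cgroup.controllers" ]
}

# Succeeds if user $1 has at least one "name:start:count" line in the
# subordinate ID file $2 (for example /etc/subuid or /etc/subgid).
has_subid_entry() {
  awk -F: -v u="$1" '$1 == u && NF == 3 { found = 1 } END { exit !found }' "$2"
}

# Example usage (commented out so that sourcing this file has no side effects):
#   is_cgroup_v2 || echo "cgroup v2 does not appear to be enabled"
#   has_subid_entry "$(id -un)" /etc/subuid || echo "no /etc/subuid entry found"
#   has_subid_entry "$(id -un)" /etc/subgid || echo "no /etc/subgid entry found"
```

These checks only cover the presence of the files; they do not validate the sysctl values
or the systemd user session.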
<!-- steps -->
## Running Kubernetes inside Rootless Docker/Podman
### kind
[kind](https://kind.sigs.k8s.io/) supports running Kubernetes inside Rootless Docker or Rootless Podman.
See [Running kind with Rootless Docker](https://kind.sigs.k8s.io/docs/user/rootless/).
### minikube
[minikube](https://minikube.sigs.k8s.io/) also supports running Kubernetes inside Rootless Docker or Rootless Podman.
See the Minikube documentation:
* [Rootless Docker](https://minikube.sigs.k8s.io/docs/drivers/docker/)
* [Rootless Podman](https://minikube.sigs.k8s.io/docs/drivers/podman/)
## Running Kubernetes inside Unprivileged Containers
{{% thirdparty-content %}}
### sysbox
[Sysbox](https://github.com/nestybox/sysbox) is an open-source container runtime
(similar to "runc") that supports running system-level workloads such as Docker
and Kubernetes inside unprivileged containers isolated with the Linux user
namespace.
See [Sysbox Quick Start Guide: Kubernetes-in-Docker](https://github.com/nestybox/sysbox/blob/master/docs/quickstart/kind.md) for more info.
Sysbox supports running Kubernetes inside unprivileged containers without
requiring Cgroup v2 and without the `KubeletInUserNamespace` feature gate. It
does this by exposing specially crafted `/proc` and `/sys` filesystems inside
the container plus several other advanced OS virtualization techniques.
## Running Rootless Kubernetes directly on a host
{{% thirdparty-content %}}
### K3s
[K3s](https://k3s.io/) experimentally supports rootless mode.
See [Running K3s with Rootless mode](https://rancher.com/docs/k3s/latest/en/advanced/#running-k3s-with-rootless-mode-experimental) for usage.
### Usernetes
[Usernetes](https://github.com/rootless-containers/usernetes) is a reference distribution of Kubernetes that can be installed under the `$HOME` directory without root privileges.
Usernetes supports both containerd and CRI-O as CRI runtimes.
Usernetes supports multi-node clusters using Flannel (VXLAN).
See [the Usernetes repo](https://github.com/rootless-containers/usernetes) for usage.
## Manually deploy a node that runs the kubelet in a user namespace {#userns-the-hard-way}
This section provides hints for running Kubernetes in a user namespace manually.
{{< note >}}
This section is intended to be read by developers of Kubernetes distributions, not by end users.
{{< /note >}}
### Creating a user namespace
The first step is to create a {{< glossary_tooltip text="user namespace" term_id="userns" >}}.

If you are trying to run Kubernetes in a user-namespaced container such as
Rootless Docker/Podman or LXC/LXD, you are all set, and you can go to the next subsection.

Otherwise you have to create a user namespace yourself, by calling `unshare(2)` with `CLONE_NEWUSER`.

A user namespace can also be unshared by using command line tools such as:

- [`unshare(1)`](https://man7.org/linux/man-pages/man1/unshare.1.html)
- [RootlessKit](https://github.com/rootless-containers/rootlesskit)
- [become-root](https://github.com/giuseppe/become-root)

After unsharing the user namespace, you will also have to unshare other namespaces such as the mount namespace.

You do *not* need to call `chroot()` or `pivot_root()` after unsharing the mount namespace.
However, you have to mount writable filesystems on several directories *in* the namespace.

At least the following directories need to be writable *in* the namespace (not *outside* the namespace):

- `/etc`
- `/run`
- `/var/log`
- `/var/lib/kubelet`
- `/var/lib/cni`
- `/var/lib/containerd` (for containerd)
- `/var/lib/containers` (for CRI-O)
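
One way to confirm that the directories are writable from inside the namespace is a small check
like the following sketch. The function name `unwritable_dirs` and the file name `check-dirs.sh`
are hypothetical. The commented invocation uses `unshare(1)` with `--map-root-user`, which maps
only a single UID; a real deployment would more likely use a tool such as RootlessKit that also
maps the subordinate ID ranges.

```shell
#!/bin/sh
# Hypothetical helper: print every directory given as an argument that is
# missing or not writable by the current (possibly namespaced) user.
# Prints nothing when all directories are present and writable.
unwritable_dirs() {
  for d in "$@"; do
    if [ ! -d "$d" ] || [ ! -w "$d" ]; then
      printf '%s\n' "$d"
    fi
  done
}

# Example (not executed here): enter new user and mount namespaces, then run
# the check from inside. Assumes this file is saved as ./check-dirs.sh.
#   unshare --user --map-root-user --mount \
#     sh -c '. ./check-dirs.sh; unwritable_dirs /etc /run /var/lib/kubelet'
```

Any path that the function prints still needs a writable filesystem mounted on it inside the namespace.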
### Creating a delegated cgroup tree
In addition to the user namespace, you also need to have a writable cgroup tree with cgroup v2.
{{< note >}}
Kubernetes support for running Node components in user namespaces requires cgroup v2.
Cgroup v1 is not supported.
{{< /note >}}
If you are trying to run Kubernetes in Rootless Docker/Podman or LXC/LXD on a systemd-based host, you are all set.

Otherwise you have to create a systemd unit with the `Delegate=yes` property to delegate a writable cgroup tree.

On your node, systemd must already be configured to allow delegation; for more details, see
[cgroup v2](https://rootlesscontaine.rs/getting-started/common/cgroup2/) in the Rootless
Containers documentation.
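
To see which controllers were actually delegated to the current process, you could inspect the
`cgroup.controllers` file of its cgroup. The sketch below assumes a pure cgroup v2 host with the
hierarchy mounted at `/sys/fs/cgroup`; the function names are hypothetical, and the controller
names in the usage comment are only an example of what a distribution might require.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting cgroup v2 delegation.

# Print the cgroup v2 directory of the current process.
# On cgroup v2, /proc/self/cgroup contains a line of the form "0::<path>".
current_cgroup_dir() {
  path=$(sed -n 's/^0:://p' /proc/self/cgroup)
  printf '/sys/fs/cgroup%s\n' "$path"
}

# Print each wanted controller ($2, $3, ...) that is absent from the
# space-separated cgroup.controllers file given as $1.
missing_controllers() {
  file=$1
  shift
  available=" $(cat "$file") "
  for c in "$@"; do
    case "$available" in
      *" $c "*) ;;                   # controller is available
      *) printf '%s\n' "$c" ;;       # controller was not delegated
    esac
  done
}

# Example usage (commented out so that sourcing this file has no side effects):
#   missing_controllers "$(current_cgroup_dir)/cgroup.controllers" cpu memory pids
```

An empty result means every controller you asked about is available in the delegated tree.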
### Configuring network
{{% thirdparty-content %}}
The network namespace of the Node components has to have a non-loopback interface, which can be configured with, for example,
[slirp4netns](https://github.com/rootless-containers/slirp4netns),
[VPNKit](https://github.com/moby/vpnkit), or
[lxc-user-nic(1)](https://www.man7.org/linux/man-pages/man1/lxc-user-nic.1.html).

The network namespaces of the Pods can be configured with regular CNI plugins.
For multi-node networking, Flannel (VXLAN, 8472/UDP) is known to work.

Ports such as the kubelet port (10250/TCP) and `NodePort` service ports have to be exposed from the Node network namespace to
the host with an external port forwarder, such as RootlessKit, slirp4netns, or
[socat(1)](https://linux.die.net/man/1/socat).

You can use the port forwarder from K3s.
See [Running K3s in Rootless Mode](https://rancher.com/docs/k3s/latest/en/advanced/#known-issues-with-rootless-mode)
for more details.
The implementation can be found in [the `pkg/rootlessports` package](https://github.com/k3s-io/k3s/blob/v1.22.3+k3s1/pkg/rootlessports/controller.go) of K3s.
### Configuring CRI
The kubelet relies on a container runtime. You should deploy a container runtime such as
containerd or CRI-O and ensure that it is running within the user namespace before the kubelet starts.
{{< tabs name="cri" >}}
{{% tab name="containerd" %}}
Running the CRI plugin of containerd in a user namespace has been supported since containerd 1.4.
Running containerd within a user namespace requires the following configuration:
```toml
version = 2

[plugins."io.containerd.grpc.v1.cri"]
# Disable AppArmor
disable_apparmor = true
# Ignore an error during setting oom_score_adj
restrict_oom_score_adj = true
# Disable hugetlb cgroup v2 controller (because systemd does not support delegating hugetlb controller)
disable_hugetlb_controller = true

[plugins."io.containerd.grpc.v1.cri".containerd]
# Using non-fuse overlayfs is also possible for kernel >= 5.11, but requires SELinux to be disabled
snapshotter = "fuse-overlayfs"

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
# We use cgroupfs that is delegated by systemd, so we do not use SystemdCgroup driver
# (unless you run another systemd in the namespace)
SystemdCgroup = false
```
The default path of the configuration file is `/etc/containerd/config.toml`.
The path can be specified with `containerd -c /path/to/containerd/config.toml`.
{{% /tab %}}
{{% tab name="CRI-O" %}}
Running CRI-O in a user namespace has been supported since CRI-O 1.22.
CRI-O requires the environment variable `_CRIO_ROOTLESS=1` to be set.
The following configuration is also recommended:
```toml
[crio]
storage_driver = "overlay"
# Using non-fuse overlayfs is also possible for kernel >= 5.11, but requires SELinux to be disabled
storage_option = ["overlay.mount_program=/usr/local/bin/fuse-overlayfs"]

[crio.runtime]
# We use cgroupfs that is delegated by systemd, so we do not use "systemd" driver
# (unless you run another systemd in the namespace)
cgroup_manager = "cgroupfs"
```
The default path of the configuration file is `/etc/crio/crio.conf`.
The path can be specified with `crio --config /path/to/crio/crio.conf`.
{{% /tab %}}
{{< /tabs >}}
### Configuring kubelet
Running kubelet in a user namespace requires the following configuration:
```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  KubeletInUserNamespace: true
# We use cgroupfs that is delegated by systemd, so we do not use "systemd" driver
# (unless you run another systemd in the namespace)
cgroupDriver: "cgroupfs"
```
When the `KubeletInUserNamespace` feature gate is enabled, the kubelet ignores errors
that may occur while setting the following sysctl values on the node:
- `vm.overcommit_memory`
- `vm.panic_on_oom`
- `kernel.panic`
- `kernel.panic_on_oops`
- `kernel.keys.root_maxkeys`
- `kernel.keys.root_maxbytes`

Within a user namespace, the kubelet also ignores any error raised from trying to open `/dev/kmsg`.
This feature gate also allows kube-proxy to ignore an error while setting `RLIMIT_NOFILE`.
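
Because errors for these sysctls are ignored rather than fixed, it can be useful to see what
values the node actually has. The following is a minimal sketch that reads them from
`/proc/sys`; the function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting sysctl values without the sysctl(8) tool.

# Convert a sysctl name such as "vm.overcommit_memory" to its /proc/sys path.
sysctl_path() {
  printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr . /)"
}

# Print "name = value" for each sysctl name given as an argument,
# or a note when the corresponding file cannot be read.
show_sysctls() {
  for name in "$@"; do
    p=$(sysctl_path "$name")
    if [ -r "$p" ]; then
      printf '%s = %s\n' "$name" "$(cat "$p")"
    else
      printf '%s is not readable\n' "$name"
    fi
  done
}

# Example usage (commented out so that sourcing this file has no side effects):
#   show_sysctls vm.overcommit_memory vm.panic_on_oom kernel.panic \
#     kernel.panic_on_oops kernel.keys.root_maxkeys kernel.keys.root_maxbytes
```

If a value differs from what your workloads expect, it has to be changed by an administrator on the host, outside the user namespace.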

The `KubeletInUserNamespace` feature gate was introduced in Kubernetes v1.22 with "alpha" status.

Running the kubelet in a user namespace without using this feature gate is also possible
by mounting a specially crafted proc filesystem (as done by [Sysbox](https://github.com/nestybox/sysbox)), but this is not officially supported.
### Configuring kube-proxy
Running kube-proxy in a user namespace requires the following configuration:
```yaml
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "iptables" # or "userspace"
conntrack:
  # Skip setting sysctl value "net.netfilter.nf_conntrack_max"
  maxPerCore: 0
  # Skip setting "net.netfilter.nf_conntrack_tcp_timeout_established"
  tcpEstablishedTimeout: 0s
  # Skip setting "net.netfilter.nf_conntrack_tcp_timeout_close_wait"
  tcpCloseWaitTimeout: 0s
```
## Caveats
- Most "non-local" volume drivers such as `nfs` and `iscsi` do not work.
  Local volumes like `local`, `hostPath`, `emptyDir`, `configMap`, `secret`, and `downwardAPI` are known to work.
- Some CNI plugins may not work. Flannel (VXLAN) is known to work.

For more on this, see the [Caveats and Future work](https://rootlesscontaine.rs/caveats/) page
on the rootlesscontaine.rs website.
## {{% heading "seealso" %}}
- [rootlesscontaine.rs](https://rootlesscontaine.rs/)
- [Rootless Containers 2020 (KubeCon NA 2020)](https://www.slideshare.net/AkihiroSuda/kubecon-na-2020-containerd-rootless-containers-2020)
- [Running kind with Rootless Docker](https://kind.sigs.k8s.io/docs/user/rootless/)
- [Usernetes](https://github.com/rootless-containers/usernetes)
- [Running K3s with rootless mode](https://rancher.com/docs/k3s/latest/en/advanced/#running-k3s-with-rootless-mode-experimental)
- [KEP-2033: Kubelet-in-UserNS (aka Rootless mode)](https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/2033-kubelet-in-userns-aka-rootless)