parent
e9c79c6ff8
commit
60b5600157
|
@ -43,7 +43,6 @@ toc:
|
||||||
section:
|
section:
|
||||||
- docs/concepts/cluster-administration/network-plugins.md
|
- docs/concepts/cluster-administration/network-plugins.md
|
||||||
- docs/concepts/cluster-administration/device-plugins.md
|
- docs/concepts/cluster-administration/device-plugins.md
|
||||||
- docs/concepts/cluster-administration/sysctl-cluster.md
|
|
||||||
- docs/concepts/service-catalog/index.md
|
- docs/concepts/service-catalog/index.md
|
||||||
|
|
||||||
- title: Containers
|
- title: Containers
|
||||||
|
|
|
@ -143,6 +143,7 @@ toc:
|
||||||
- docs/tasks/administer-cluster/access-cluster-api.md
|
- docs/tasks/administer-cluster/access-cluster-api.md
|
||||||
- docs/tasks/administer-cluster/access-cluster-services.md
|
- docs/tasks/administer-cluster/access-cluster-services.md
|
||||||
- docs/tasks/administer-cluster/securing-a-cluster.md
|
- docs/tasks/administer-cluster/securing-a-cluster.md
|
||||||
|
- docs/tasks/administer-cluster/sysctl-cluster.md
|
||||||
- docs/tasks/administer-cluster/encrypt-data.md
|
- docs/tasks/administer-cluster/encrypt-data.md
|
||||||
- docs/tasks/administer-cluster/configure-upgrade-etcd.md
|
- docs/tasks/administer-cluster/configure-upgrade-etcd.md
|
||||||
- docs/tasks/administer-cluster/static-pod.md
|
- docs/tasks/administer-cluster/static-pod.md
|
||||||
|
|
|
@ -50,7 +50,7 @@
|
||||||
/docs/admin/resourcequota/limitstorageconsumption/ /docs/tasks/administer-cluster/limit-storage-consumption/ 301
|
/docs/admin/resourcequota/limitstorageconsumption/ /docs/tasks/administer-cluster/limit-storage-consumption/ 301
|
||||||
/docs/admin/resourcequota/walkthrough/ /docs/tasks/administer-cluster/quota-api-object/ 301
|
/docs/admin/resourcequota/walkthrough/ /docs/tasks/administer-cluster/quota-api-object/ 301
|
||||||
/docs/admin/static-pods/ /docs/tasks/administer-cluster/static-pod/ 301
|
/docs/admin/static-pods/ /docs/tasks/administer-cluster/static-pod/ 301
|
||||||
/docs/admin/sysctls/ /docs/concepts/cluster-administration/sysctl-cluster/ 301
|
/docs/admin/sysctls/ /docs/tasks/administer-cluster/sysctl-cluster/ 301
|
||||||
/docs/admin/upgrade-1-6/ /docs/tasks/administer-cluster/upgrade-1-6/ 301
|
/docs/admin/upgrade-1-6/ /docs/tasks/administer-cluster/upgrade-1-6/ 301
|
||||||
/docs/admin/resource-quota/ /docs/concepts/policy/resource-quotas/ 301
|
/docs/admin/resource-quota/ /docs/concepts/policy/resource-quotas/ 301
|
||||||
|
|
||||||
|
@ -97,6 +97,7 @@
|
||||||
/docs/concepts/cluster-administration/multiple-clusters/ /docs/concepts/cluster-administration/federation/ 301
|
/docs/concepts/cluster-administration/multiple-clusters/ /docs/concepts/cluster-administration/federation/ 301
|
||||||
/docs/concepts/cluster-administration/out-of-resource/ /docs/tasks/administer-cluster/out-of-resource/ 301
|
/docs/concepts/cluster-administration/out-of-resource/ /docs/tasks/administer-cluster/out-of-resource/ 301
|
||||||
/docs/concepts/cluster-administration/resource-usage-monitoring /docs/tasks/debug-application-cluster/resource-usage-monitoring/ 301
|
/docs/concepts/cluster-administration/resource-usage-monitoring /docs/tasks/debug-application-cluster/resource-usage-monitoring/ 301
|
||||||
|
/docs/concepts/cluster-administration/sysctl-cluster/ /docs/tasks/administer-cluster/sysctl-cluster/ 301
|
||||||
/docs/concepts/cluster-administration/static-pod/ /docs/tasks/administer-cluster/static-pod/ 301
|
/docs/concepts/cluster-administration/static-pod/ /docs/tasks/administer-cluster/static-pod/ 301
|
||||||
/docs/concepts/clusters/logging/ /docs/concepts/cluster-administration/logging/ 301
|
/docs/concepts/clusters/logging/ /docs/concepts/cluster-administration/logging/ 301
|
||||||
/docs/concepts/configuration/container-command-arg/ /docs/tasks/inject-data-application/define-command-argument-container/ 301
|
/docs/concepts/configuration/container-command-arg/ /docs/tasks/inject-data-application/define-command-argument-container/ 301
|
||||||
|
|
|
@ -1,15 +1,24 @@
|
||||||
---
|
---
|
||||||
|
title: Using Sysctls in a Kubernetes Cluster
|
||||||
reviewers:
|
reviewers:
|
||||||
- sttts
|
- sttts
|
||||||
title: Using Sysctls in a Kubernetes Cluster
|
|
||||||
---
|
---
|
||||||
|
|
||||||
* TOC
|
{% capture overview %}
|
||||||
{:toc}
|
|
||||||
|
|
||||||
This document describes how sysctls are used within a Kubernetes cluster.
|
This document describes how sysctls are used within a Kubernetes cluster.
|
||||||
|
|
||||||
## What is a Sysctl?
|
{% endcapture %}
|
||||||
|
|
||||||
|
{% capture prerequisites %}
|
||||||
|
|
||||||
|
{% include task-tutorial-prereqs.md %}
|
||||||
|
|
||||||
|
{% endcapture %}
|
||||||
|
|
||||||
|
{% capture steps %}
|
||||||
|
|
||||||
|
## Listing all Sysctl Parameters
|
||||||
|
|
||||||
In Linux, the sysctl interface allows an administrator to modify kernel
|
In Linux, the sysctl interface allows an administrator to modify kernel
|
||||||
parameters at runtime. Parameters are available via the `/proc/sys/` virtual
|
parameters at runtime. Parameters are available via the `/proc/sys/` virtual
|
||||||
|
@ -23,11 +32,59 @@ process file system. The parameters cover various subsystems such as:
|
||||||
|
|
||||||
To get a list of all parameters, you can run
|
To get a list of all parameters, you can run
|
||||||
|
|
||||||
```
|
```shell
|
||||||
$ sudo sysctl -a
|
$ sudo sysctl -a
|
||||||
```
|
```
|
||||||
|
|
||||||
## Namespaced vs. Node-Level Sysctls
|
## Enabling Unsafe Sysctls
|
||||||
|
|
||||||
|
Sysctls are grouped into _safe_ and _unsafe_ sysctls. In addition to proper
|
||||||
|
namespacing a _safe_ sysctl must be properly _isolated_ between pods on the same
|
||||||
|
node. This means that setting a _safe_ sysctl for one pod
|
||||||
|
|
||||||
|
- must not have any influence on any other pod on the node
|
||||||
|
- must not allow to harm the node's health
|
||||||
|
- must not allow to gain CPU or memory resources outside of the resource limits
|
||||||
|
of a pod.
|
||||||
|
|
||||||
|
By far, most of the _namespaced_ sysctls are not necessarily considered _safe_.
|
||||||
|
The following sysctls are supported in the _safe_ set:
|
||||||
|
|
||||||
|
- `kernel.shm_rmid_forced`,
|
||||||
|
- `net.ipv4.ip_local_port_range`,
|
||||||
|
- `net.ipv4.tcp_syncookies`.
|
||||||
|
|
||||||
|
**Note**: The example `net.ipv4.tcp_syncookies` is not namespaced on Linux kernel version 4.4 or lower.
|
||||||
|
{: .note}
|
||||||
|
|
||||||
|
This list will be extended in future Kubernetes versions when the kubelet
|
||||||
|
supports better isolation mechanisms.
|
||||||
|
|
||||||
|
All _safe_ sysctls are enabled by default.
|
||||||
|
|
||||||
|
All _unsafe_ sysctls are disabled by default and must be allowed manually by the
|
||||||
|
cluster admin on a per-node basis. Pods with disabled unsafe sysctls will be
|
||||||
|
scheduled, but will fail to launch.
|
||||||
|
|
||||||
|
With the warning above in mind, the cluster admin can allow certain _unsafe_
|
||||||
|
sysctls for very special situations like e.g. high-performance or real-time
|
||||||
|
application tuning. _Unsafe_ sysctls are enabled on a node-by-node basis with a
|
||||||
|
flag of the kubelet, e.g.:
|
||||||
|
|
||||||
|
```shell
|
||||||
|
$ kubelet --experimental-allowed-unsafe-sysctls \
|
||||||
|
'kernel.msg*,net.ipv4.route.min_pmtu' ...
|
||||||
|
```
|
||||||
|
|
||||||
|
For minikube, this can be done via the `extra-config` flag:
|
||||||
|
|
||||||
|
```shell
|
||||||
|
$ minikube start --extra-config="kubelet.AllowedUnsafeSysctls=kernel.msg*,net.ipv4.route.min_pmtu"...
|
||||||
|
```
|
||||||
|
|
||||||
|
Only _namespaced_ sysctls can be enabled this way.
|
||||||
|
|
||||||
|
## Setting Sysctls for a Pod
|
||||||
|
|
||||||
A number of sysctls are _namespaced_ in today's Linux kernels. This means that
|
A number of sysctls are _namespaced_ in today's Linux kernels. This means that
|
||||||
they can be set independently for each pod on a node. Being namespaced is a
|
they can be set independently for each pod on a node. Being namespaced is a
|
||||||
|
@ -46,67 +103,8 @@ manually by the cluster admin, either by means of the underlying Linux
|
||||||
distribution of the nodes (e.g. via `/etc/sysctls.conf`) or using a DaemonSet
|
distribution of the nodes (e.g. via `/etc/sysctls.conf`) or using a DaemonSet
|
||||||
with privileged containers.
|
with privileged containers.
|
||||||
|
|
||||||
**Note**: it is good practice to consider nodes with special sysctl settings as
|
The sysctl feature is an alpha API. Therefore, sysctls are set using annotations
|
||||||
_tainted_ within a cluster, and only schedule pods onto them which need those
|
on pods. They apply to all containers in the same pod.
|
||||||
sysctl settings. It is suggested to use the Kubernetes [_taints and toleration_
|
|
||||||
feature](/docs/user-guide/kubectl/{{page.version}}/#taint) to implement this.
|
|
||||||
|
|
||||||
## Safe vs. Unsafe Sysctls
|
|
||||||
|
|
||||||
Sysctls are grouped into _safe_ and _unsafe_ sysctls. In addition to proper
|
|
||||||
namespacing a _safe_ sysctl must be properly _isolated_ between pods on the same
|
|
||||||
node. This means that setting a _safe_ sysctl for one pod
|
|
||||||
|
|
||||||
- must not have any influence on any other pod on the node
|
|
||||||
- must not allow to harm the node's health
|
|
||||||
- must not allow to gain CPU or memory resources outside of the resource limits
|
|
||||||
of a pod.
|
|
||||||
|
|
||||||
By far, most of the _namespaced_ sysctls are not necessarily considered _safe_.
|
|
||||||
|
|
||||||
For Kubernetes 1.4, the following sysctls are supported in the _safe_ set:
|
|
||||||
|
|
||||||
- `kernel.shm_rmid_forced`,
|
|
||||||
- `net.ipv4.ip_local_port_range`,
|
|
||||||
- `net.ipv4.tcp_syncookies`.
|
|
||||||
|
|
||||||
**Note**: The example `net.ipv4.tcp_syncookies` is not namespaced on Linux kernel version 4.4 or lower.
|
|
||||||
{: .note}
|
|
||||||
|
|
||||||
This list will be extended in future Kubernetes versions when the kubelet
|
|
||||||
supports better isolation mechanisms.
|
|
||||||
|
|
||||||
All _safe_ sysctls are enabled by default.
|
|
||||||
|
|
||||||
All _unsafe_ sysctls are disabled by default and must be allowed manually by the
|
|
||||||
cluster admin on a per-node basis. Pods with disabled unsafe sysctls will be
|
|
||||||
scheduled, but will fail to launch.
|
|
||||||
|
|
||||||
**Warning**: Due to their nature of being _unsafe_, the use of _unsafe_ sysctls
|
|
||||||
is at-your-own-risk and can lead to severe problems like wrong behavior of
|
|
||||||
containers, resource shortage or complete breakage of a node.
|
|
||||||
|
|
||||||
## Enabling Unsafe Sysctls
|
|
||||||
|
|
||||||
With the warning above in mind, the cluster admin can allow certain _unsafe_
|
|
||||||
sysctls for very special situations like e.g. high-performance or real-time
|
|
||||||
application tuning. _Unsafe_ sysctls are enabled on a node-by-node basis with a
|
|
||||||
flag of the kubelet, e.g.:
|
|
||||||
|
|
||||||
```shell
|
|
||||||
$ kubelet --experimental-allowed-unsafe-sysctls 'kernel.msg*,net.ipv4.route.min_pmtu' ...
|
|
||||||
```
|
|
||||||
For minikube, this can be done via the `extra-config` flag:
|
|
||||||
|
|
||||||
```shell
|
|
||||||
$ minikube start --extra-config="kubelet.AllowedUnsafeSysctls=kernel.msg*,net.ipv4.route.min_pmtu"...
|
|
||||||
```
|
|
||||||
Only _namespaced_ sysctls can be enabled this way.
|
|
||||||
|
|
||||||
## Setting Sysctls for a Pod
|
|
||||||
|
|
||||||
The sysctl feature is an alpha API in Kubernetes 1.4. Therefore, sysctls are set
|
|
||||||
using annotations on pods. They apply to all containers in the same pod.
|
|
||||||
|
|
||||||
Here is an example, with different annotations for _safe_ and _unsafe_ sysctls:
|
Here is an example, with different annotations for _safe_ and _unsafe_ sysctls:
|
||||||
|
|
||||||
|
@ -121,11 +119,25 @@ metadata:
|
||||||
spec:
|
spec:
|
||||||
...
|
...
|
||||||
```
|
```
|
||||||
|
{% endcapture %}
|
||||||
|
|
||||||
**Note**: a pod with the _unsafe_ sysctls specified above will fail to launch on
|
{% capture discussion %}
|
||||||
any node which has not enabled those two _unsafe_ sysctls explicitly. As with
|
|
||||||
_node-level_ sysctls it is recommended to use [_taints and toleration_
|
**Warning**: Due to their nature of being _unsafe_, the use of _unsafe_ sysctls
|
||||||
feature](/docs/user-guide/kubectl/{{page.version}}/#taint) or [taints on nodes](/docs/concepts/configuration/taint-and-toleration/)
|
is at-your-own-risk and can lead to severe problems like wrong behavior of
|
||||||
|
containers, resource shortage or complete breakage of a node.
|
||||||
|
{: .warning}
|
||||||
|
|
||||||
|
It is good practice to consider nodes with special sysctl settings as
|
||||||
|
_tainted_ within a cluster, and only schedule pods onto them which need those
|
||||||
|
sysctl settings. It is suggested to use the Kubernetes [_taints and toleration_
|
||||||
|
feature](/docs/user-guide/kubectl/{{page.version}}/#taint) to implement this.
|
||||||
|
|
||||||
|
A pod with the _unsafe_ sysctls will fail to launch on any node which has not
|
||||||
|
enabled those two _unsafe_ sysctls explicitly. As with _node-level_ sysctls it
|
||||||
|
is recommended to use
|
||||||
|
[_taints and toleration_ feature](/docs/user-guide/kubectl/{{page.version}}/#taint) or
|
||||||
|
[taints on nodes](/docs/concepts/configuration/taint-and-toleration/)
|
||||||
to schedule those pods onto the right nodes.
|
to schedule those pods onto the right nodes.
|
||||||
|
|
||||||
## PodSecurityPolicy Annotations
|
## PodSecurityPolicy Annotations
|
||||||
|
@ -148,3 +160,7 @@ metadata:
|
||||||
spec:
|
spec:
|
||||||
...
|
...
|
||||||
```
|
```
|
||||||
|
|
||||||
|
{% endcapture %}
|
||||||
|
|
||||||
|
{% include templates/task.md %}
|