parent
e9c79c6ff8
commit
60b5600157
|
@@ -43,7 +43,6 @@ toc:
|
|||
section:
|
||||
- docs/concepts/cluster-administration/network-plugins.md
|
||||
- docs/concepts/cluster-administration/device-plugins.md
|
||||
- docs/concepts/cluster-administration/sysctl-cluster.md
|
||||
- docs/concepts/service-catalog/index.md
|
||||
|
||||
- title: Containers
|
||||
|
|
|
@@ -143,6 +143,7 @@ toc:
|
|||
- docs/tasks/administer-cluster/access-cluster-api.md
|
||||
- docs/tasks/administer-cluster/access-cluster-services.md
|
||||
- docs/tasks/administer-cluster/securing-a-cluster.md
|
||||
- docs/tasks/administer-cluster/sysctl-cluster.md
|
||||
- docs/tasks/administer-cluster/encrypt-data.md
|
||||
- docs/tasks/administer-cluster/configure-upgrade-etcd.md
|
||||
- docs/tasks/administer-cluster/static-pod.md
|
||||
|
|
|
@@ -50,7 +50,7 @@
|
|||
/docs/admin/resourcequota/limitstorageconsumption/ /docs/tasks/administer-cluster/limit-storage-consumption/ 301
|
||||
/docs/admin/resourcequota/walkthrough/ /docs/tasks/administer-cluster/quota-api-object/ 301
|
||||
/docs/admin/static-pods/ /docs/tasks/administer-cluster/static-pod/ 301
|
||||
/docs/admin/sysctls/ /docs/concepts/cluster-administration/sysctl-cluster/ 301
|
||||
/docs/admin/sysctls/ /docs/tasks/administer-cluster/sysctl-cluster/ 301
|
||||
/docs/admin/upgrade-1-6/ /docs/tasks/administer-cluster/upgrade-1-6/ 301
|
||||
/docs/admin/resource-quota/ /docs/concepts/policy/resource-quotas/ 301
|
||||
|
||||
|
@@ -97,6 +97,7 @@
|
|||
/docs/concepts/cluster-administration/multiple-clusters/ /docs/concepts/cluster-administration/federation/ 301
|
||||
/docs/concepts/cluster-administration/out-of-resource/ /docs/tasks/administer-cluster/out-of-resource/ 301
|
||||
/docs/concepts/cluster-administration/resource-usage-monitoring /docs/tasks/debug-application-cluster/resource-usage-monitoring/ 301
|
||||
/docs/concepts/cluster-administration/sysctl-cluster/ /docs/tasks/administer-cluster/sysctl-cluster/ 301
|
||||
/docs/concepts/cluster-administration/static-pod/ /docs/tasks/administer-cluster/static-pod/ 301
|
||||
/docs/concepts/clusters/logging/ /docs/concepts/cluster-administration/logging/ 301
|
||||
/docs/concepts/configuration/container-command-arg/ /docs/tasks/inject-data-application/define-command-argument-container/ 301
|
||||
|
|
|
@@ -1,15 +1,24 @@
|
|||
---
|
||||
title: Using Sysctls in a Kubernetes Cluster
|
||||
reviewers:
|
||||
- sttts
|
||||
title: Using Sysctls in a Kubernetes Cluster
|
||||
---
|
||||
|
||||
* TOC
|
||||
{:toc}
|
||||
{% capture overview %}
|
||||
|
||||
This document describes how sysctls are used within a Kubernetes cluster.
|
||||
|
||||
## What is a Sysctl?
|
||||
{% endcapture %}
|
||||
|
||||
{% capture prerequisites %}
|
||||
|
||||
{% include task-tutorial-prereqs.md %}
|
||||
|
||||
{% endcapture %}
|
||||
|
||||
{% capture steps %}
|
||||
|
||||
## Listing all Sysctl Parameters
|
||||
|
||||
In Linux, the sysctl interface allows an administrator to modify kernel
|
||||
parameters at runtime. Parameters are available via the `/proc/sys/` virtual
|
||||
|
@@ -23,11 +32,59 @@ process file system. The parameters cover various subsystems such as:
|
|||
|
||||
To get a list of all parameters, you can run
|
||||
|
||||
```
|
||||
```shell
|
||||
$ sudo sysctl -a
|
||||
```
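
As a rough illustration of how a dotted sysctl name relates to the `/proc/sys/` tree, the sketch below converts a name into its file path. The helper name `sysctl_path` is hypothetical and not part of any tool. It assumes the common case where every dot in the name corresponds to a directory separator.

```shell
# Hypothetical helper: map a dotted sysctl name to its /proc/sys path.
# Assumes every dot is a path separator (the common case).
sysctl_path() {
  printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr '.' '/')"
}

sysctl_path kernel.shm_rmid_forced
# prints /proc/sys/kernel/shm_rmid_forced
```

On a Linux host, reading the resulting file with `cat` should show the same value that `sysctl -n` reports for that name.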
|
||||
|
||||
## Namespaced vs. Node-Level Sysctls
|
||||
## Enabling Unsafe Sysctls
|
||||
|
||||
Sysctls are grouped into _safe_ and _unsafe_ sysctls. In addition to proper
|
||||
namespacing a _safe_ sysctl must be properly _isolated_ between pods on the same
|
||||
node. This means that setting a _safe_ sysctl for one pod
|
||||
|
||||
- must not have any influence on any other pod on the node
|
||||
- must not make it possible to harm the node's health
|
||||
- must not make it possible to gain CPU or memory resources outside of the resource limits
|
||||
of a pod.
|
||||
|
||||
By far, most of the _namespaced_ sysctls are not necessarily considered _safe_.
|
||||
The following sysctls are supported in the _safe_ set:
|
||||
|
||||
- `kernel.shm_rmid_forced`,
|
||||
- `net.ipv4.ip_local_port_range`,
|
||||
- `net.ipv4.tcp_syncookies`.
|
||||
|
||||
**Note**: The example `net.ipv4.tcp_syncookies` is not namespaced on Linux kernel version 4.4 or lower.
|
||||
{: .note}
|
||||
|
||||
This list will be extended in future Kubernetes versions when the kubelet
|
||||
supports better isolation mechanisms.
|
||||
|
||||
All _safe_ sysctls are enabled by default.
|
||||
|
||||
All _unsafe_ sysctls are disabled by default and must be allowed manually by the
|
||||
cluster admin on a per-node basis. Pods with disabled unsafe sysctls will be
|
||||
scheduled, but will fail to launch.
|
||||
|
||||
With the warning above in mind, the cluster admin can allow certain _unsafe_
|
||||
sysctls for very special situations such as high-performance or real-time
|
||||
application tuning. _Unsafe_ sysctls are enabled on a node-by-node basis with a
|
||||
flag of the kubelet, e.g.:
|
||||
|
||||
```shell
|
||||
$ kubelet --experimental-allowed-unsafe-sysctls \
|
||||
'kernel.msg*,net.ipv4.route.min_pmtu' ...
|
||||
```
|
||||
|
||||
For minikube, this can be done via the `extra-config` flag:
|
||||
|
||||
```shell
|
||||
$ minikube start --extra-config="kubelet.AllowedUnsafeSysctls=kernel.msg*,net.ipv4.route.min_pmtu"...
|
||||
```
|
||||
|
||||
Only _namespaced_ sysctls can be enabled this way.
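
The value passed to the flag is a comma-separated list in which an entry such as `kernel.msg*` covers a whole group of sysctls. The following is a minimal sketch of that kind of matching, assuming a trailing `*` acts as a prefix wildcard. `sysctl_allowed` is a hypothetical helper written for illustration, not kubelet code, and the kubelet's actual matching rules may differ.

```shell
# Hypothetical helper: succeed (exit status 0) if the sysctl name in $1
# matches any entry of the comma-separated allowlist in $2.
# Entries are matched as shell glob patterns, which is more permissive
# than a strict "trailing * means prefix" rule.
sysctl_allowed() {
  name=$1
  old_ifs=$IFS
  IFS=,
  set -f                      # no filename expansion while splitting the list
  for pattern in $2; do
    case $name in
      $pattern) IFS=$old_ifs; set +f; return 0 ;;
    esac
  done
  IFS=$old_ifs
  set +f
  return 1
}

allowlist='kernel.msg*,net.ipv4.route.min_pmtu'
sysctl_allowed kernel.msgmax "$allowlist" && echo "kernel.msgmax: allowed"
sysctl_allowed kernel.shmmax "$allowlist" || echo "kernel.shmmax: not allowed"
```

With the allowlist from the kubelet example, `kernel.msgmax` matches the `kernel.msg*` entry while `kernel.shmmax` matches nothing.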
|
||||
|
||||
## Setting Sysctls for a Pod
|
||||
|
||||
A number of sysctls are _namespaced_ in today's Linux kernels. This means that
|
||||
they can be set independently for each pod on a node. Being namespaced is a
|
||||
|
@@ -46,67 +103,8 @@ manually by the cluster admin, either by means of the underlying Linux
|
|||
distribution of the nodes (e.g. via `/etc/sysctl.conf`) or using a DaemonSet
|
||||
with privileged containers.
|
||||
|
||||
**Note**: it is good practice to consider nodes with special sysctl settings as
|
||||
_tainted_ within a cluster, and only schedule pods onto them which need those
|
||||
sysctl settings. It is suggested to use the Kubernetes [_taints and toleration_
|
||||
feature](/docs/user-guide/kubectl/{{page.version}}/#taint) to implement this.
|
||||
|
||||
## Safe vs. Unsafe Sysctls
|
||||
|
||||
Sysctls are grouped into _safe_ and _unsafe_ sysctls. In addition to proper
|
||||
namespacing a _safe_ sysctl must be properly _isolated_ between pods on the same
|
||||
node. This means that setting a _safe_ sysctl for one pod
|
||||
|
||||
- must not have any influence on any other pod on the node
|
||||
- must not make it possible to harm the node's health
|
||||
- must not make it possible to gain CPU or memory resources outside of the resource limits
|
||||
of a pod.
|
||||
|
||||
By far, most of the _namespaced_ sysctls are not necessarily considered _safe_.
|
||||
|
||||
For Kubernetes 1.4, the following sysctls are supported in the _safe_ set:
|
||||
|
||||
- `kernel.shm_rmid_forced`,
|
||||
- `net.ipv4.ip_local_port_range`,
|
||||
- `net.ipv4.tcp_syncookies`.
|
||||
|
||||
**Note**: The example `net.ipv4.tcp_syncookies` is not namespaced on Linux kernel version 4.4 or lower.
|
||||
{: .note}
|
||||
|
||||
This list will be extended in future Kubernetes versions when the kubelet
|
||||
supports better isolation mechanisms.
|
||||
|
||||
All _safe_ sysctls are enabled by default.
|
||||
|
||||
All _unsafe_ sysctls are disabled by default and must be allowed manually by the
|
||||
cluster admin on a per-node basis. Pods with disabled unsafe sysctls will be
|
||||
scheduled, but will fail to launch.
|
||||
|
||||
**Warning**: Due to their nature of being _unsafe_, the use of _unsafe_ sysctls
|
||||
is at your own risk and can lead to severe problems such as incorrect behavior of
|
||||
containers, resource shortage or complete breakage of a node.
|
||||
|
||||
## Enabling Unsafe Sysctls
|
||||
|
||||
With the warning above in mind, the cluster admin can allow certain _unsafe_
|
||||
sysctls for very special situations such as high-performance or real-time
|
||||
application tuning. _Unsafe_ sysctls are enabled on a node-by-node basis with a
|
||||
flag of the kubelet, e.g.:
|
||||
|
||||
```shell
|
||||
$ kubelet --experimental-allowed-unsafe-sysctls 'kernel.msg*,net.ipv4.route.min_pmtu' ...
|
||||
```
|
||||
For minikube, this can be done via the `extra-config` flag:
|
||||
|
||||
```shell
|
||||
$ minikube start --extra-config="kubelet.AllowedUnsafeSysctls=kernel.msg*,net.ipv4.route.min_pmtu"...
|
||||
```
|
||||
Only _namespaced_ sysctls can be enabled this way.
|
||||
|
||||
## Setting Sysctls for a Pod
|
||||
|
||||
The sysctl feature is an alpha API in Kubernetes 1.4. Therefore, sysctls are set
|
||||
using annotations on pods. They apply to all containers in the same pod.
|
||||
The sysctl feature is an alpha API. Therefore, sysctls are set using annotations
|
||||
on pods. They apply to all containers in the same pod.
|
||||
|
||||
Here is an example, with different annotations for _safe_ and _unsafe_ sysctls:
|
||||
|
||||
|
@@ -121,11 +119,25 @@ metadata:
|
|||
spec:
|
||||
...
|
||||
```
|
||||
{% endcapture %}
|
||||
|
||||
**Note**: a pod with the _unsafe_ sysctls specified above will fail to launch on
|
||||
any node which has not enabled those two _unsafe_ sysctls explicitly. As with
|
||||
_node-level_ sysctls it is recommended to use [_taints and toleration_
|
||||
feature](/docs/user-guide/kubectl/{{page.version}}/#taint) or [taints on nodes](/docs/concepts/configuration/taint-and-toleration/)
|
||||
{% capture discussion %}
|
||||
|
||||
**Warning**: Due to their nature of being _unsafe_, the use of _unsafe_ sysctls
|
||||
is at your own risk and can lead to severe problems such as incorrect behavior of
|
||||
containers, resource shortage or complete breakage of a node.
|
||||
{: .warning}
|
||||
|
||||
It is good practice to consider nodes with special sysctl settings as
|
||||
_tainted_ within a cluster, and only schedule pods onto them which need those
|
||||
sysctl settings. It is suggested to use the Kubernetes [_taints and toleration_
|
||||
feature](/docs/user-guide/kubectl/{{page.version}}/#taint) to implement this.
|
||||
|
||||
A pod with the _unsafe_ sysctls will fail to launch on any node which has not
|
||||
enabled those two _unsafe_ sysctls explicitly. As with _node-level_ sysctls it
|
||||
is recommended to use
|
||||
[_taints and toleration_ feature](/docs/user-guide/kubectl/{{page.version}}/#taint) or
|
||||
[taints on nodes](/docs/concepts/configuration/taint-and-toleration/)
|
||||
to schedule those pods onto the right nodes.
|
||||
|
||||
## PodSecurityPolicy Annotations
|
||||
|
@@ -148,3 +160,7 @@ metadata:
|
|||
spec:
|
||||
...
|
||||
```
|
||||
|
||||
{% endcapture %}
|
||||
|
||||
{% include templates/task.md %}
|