From 701ed987c5328737f44fe42310e0ead93ef3eb91 Mon Sep 17 00:00:00 2001
From: PiotrProkop
Date: Tue, 8 Nov 2022 10:35:37 +0100
Subject: [PATCH] topologymanager: document topology manager policy options

Signed-off-by: PiotrProkop
---
 .../feature-gates.md                          | 12 ++++++++++
 .../administer-cluster/topology-manager.md    | 23 ++++++++++++++++++-
 2 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates.md b/content/en/docs/reference/command-line-tools-reference/feature-gates.md
index 495bb4dcca..69a39f7a3f 100644
--- a/content/en/docs/reference/command-line-tools-reference/feature-gates.md
+++ b/content/en/docs/reference/command-line-tools-reference/feature-gates.md
@@ -196,6 +196,9 @@ For a reference to old feature gates that are removed, please refer to
 | `TopologyAwareHints` | `true` | Beta | 1.24 | |
 | `TopologyManager` | `false` | Alpha | 1.16 | 1.17 |
 | `TopologyManager` | `true` | Beta | 1.18 | |
+| `TopologyManagerPolicyAlphaOptions` | `false` | Alpha | 1.26 | |
+| `TopologyManagerPolicyBetaOptions` | `false` | Beta | 1.26 | |
+| `TopologyManagerPolicyOptions` | `false` | Alpha | 1.26 | |
 | `UserNamespacesStatelessPodsSupport` | `false` | Alpha | 1.25 | |
 | `VolumeCapacityPriority` | `false` | Alpha | 1.21 | - |
 | `WinDSR` | `false` | Alpha | 1.14 | |
@@ -727,6 +730,15 @@ Each feature gate is designed for enabling/disabling a specific feature:
 - `TopologyManager`: Enable a mechanism to coordinate fine-grained hardware resource assignments
   for different components in Kubernetes. See
   [Control Topology Management Policies on a node](/docs/tasks/administer-cluster/topology-manager/).
+- `TopologyManagerPolicyAlphaOptions`: Allow fine-tuning of topology manager policies,
+  experimental, Alpha-quality options.
+  This feature gate guards *a group* of topology manager options whose quality level is alpha.
+  This feature gate will never graduate to beta or stable.
+- `TopologyManagerPolicyBetaOptions`: Allow fine-tuning of topology manager policies,
+  experimental, Beta-quality options.
+  This feature gate guards *a group* of topology manager options whose quality level is beta.
+  This feature gate will never graduate to stable.
+- `TopologyManagerPolicyOptions`: Allow fine-tuning of topology manager policies.
 - `UserNamespacesStatelessPodsSupport`: Enable user namespace support for stateless Pods.
 - `VolumeCapacityPriority`: Enable support for prioritizing nodes in different
   topologies based on available PV capacity.
diff --git a/content/en/docs/tasks/administer-cluster/topology-manager.md b/content/en/docs/tasks/administer-cluster/topology-manager.md
index a238f132a1..b02b2531b6 100644
--- a/content/en/docs/tasks/administer-cluster/topology-manager.md
+++ b/content/en/docs/tasks/administer-cluster/topology-manager.md
@@ -213,6 +213,28 @@ reschedule the pod. It is recommended to use a Deployment with replicas to trigg
 the Pod.An external control loop could be also implemented to trigger a redeployment of pods
 that have the `Topology Affinity` error.
 
+### Topology manager policy options
+
+Support for the Topology Manager policy options requires the `TopologyManagerPolicyOptions`
+[feature gate](/docs/reference/command-line-tools-reference/feature-gates/) to be enabled.
+
+You can toggle groups of options on and off based upon their maturity level using the following feature gates:
+* `TopologyManagerPolicyBetaOptions` default disabled. Enable to show beta-level options. Currently there are no beta-level options.
+* `TopologyManagerPolicyAlphaOptions` default disabled. Enable to show alpha-level options. You will still have to enable each option using the `TopologyManagerPolicyOptions` kubelet option.
+
+The following policy options exist:
+* `prefer-closest-numa-nodes` (alpha, invisible by default, `TopologyManagerPolicyOptions` and `TopologyManagerPolicyAlphaOptions` feature gates have to be enabled) (1.26 or higher)
+
+If the `prefer-closest-numa-nodes` policy option is specified, the `best-effort` and `restricted`
+policies will favor sets of NUMA nodes with shorter distance between them when making admission decisions.
+You can enable this option by adding `prefer-closest-numa-nodes=true` to the Topology Manager policy options.
+By default, without this option, Topology Manager aligns resources on either a single NUMA node or
+the minimum number of NUMA nodes (in cases where more than one NUMA node is required). However,
+the `TopologyManager` is not aware of NUMA distances and does not take them into account when making admission decisions.
+This limitation surfaces in multi-socket, as well as single-socket multi-NUMA systems,
+and can cause significant performance degradation in latency-critical execution and high-throughput applications if the
+Topology Manager decides to align resources on non-adjacent NUMA nodes.
+
 ### Pod Interactions with Topology Manager Policies
 
 Consider the containers in the following pod specs:
@@ -330,4 +352,3 @@ assignments.
 
 2. The scheduler is not topology-aware, so it is possible to be scheduled on a node and then
    fail on the node due to the Topology Manager.
-
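
Note on `prefer-closest-numa-nodes` (illustration only, not part of the patch): the text above describes the option as favoring sets of NUMA nodes with a shorter distance between them, after the usual preference for the minimum number of NUMA nodes. The Python sketch below shows one way that selection could work. It is not the kubelet's implementation: the distance matrix (in the style of `numactl --hardware`), the helper names `average_distance` and `pick_closest`, and the use of average pairwise distance as the tie-breaker are all assumptions made for the example.

```python
from itertools import combinations

# Hypothetical NUMA distance matrix (10 = local, larger = farther).
# The values are made up for illustration; real ones come from the hardware.
DISTANCES = [
    [10, 12, 20, 20],
    [12, 10, 20, 20],
    [20, 20, 10, 12],
    [20, 20, 12, 10],
]


def average_distance(nodes, distances):
    """Mean distance over all unordered pairs of distinct nodes in the set.

    A set with fewer than two nodes has no pairs, so it scores 0.0.
    """
    pairs = list(combinations(sorted(nodes), 2))
    if not pairs:
        return 0.0
    return sum(distances[a][b] for a, b in pairs) / len(pairs)


def pick_closest(candidates, distances):
    """Pick a candidate NUMA node set.

    First keep only the sets with the fewest NUMA nodes (the default
    behavior described in the docs), then break ties by the smallest
    average pairwise distance (the assumed effect of the option).
    """
    narrowest = min(len(c) for c in candidates)
    same_width = [c for c in candidates if len(c) == narrowest]
    return min(same_width, key=lambda c: average_distance(c, distances))


if __name__ == "__main__":
    candidates = [(0, 2), (0, 1), (1, 3)]
    print(pick_closest(candidates, DISTANCES))  # prints (0, 1)
```

With this matrix, nodes 0 and 1 are adjacent (distance 12), so the pair `(0, 1)` is preferred over `(0, 2)` and `(1, 3)`, which both span the more distant nodes (distance 20).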