diff --git a/content/en/docs/concepts/scheduling-eviction/api-eviction.md b/content/en/docs/concepts/scheduling-eviction/api-eviction.md index 5da823d566..b1aea442e8 100644 --- a/content/en/docs/concepts/scheduling-eviction/api-eviction.md +++ b/content/en/docs/concepts/scheduling-eviction/api-eviction.md @@ -11,11 +11,11 @@ using a client of the {{}}. You may be able to attempt the eviction again later. You might also see this - response because of API rate limiting. + response because of API rate limiting. * `500 Internal Server Error`: the eviction is not allowed because there is a misconfiguration, like if multiple PodDisruptionBudgets reference the same Pod. If the Pod you want to evict isn't part of a workload that has a PodDisruptionBudget, the API server always returns `200 OK` and allows the -eviction. +eviction. If the API server allows the eviction, the Pod is deleted as follows: @@ -103,12 +103,12 @@ If the API server allows the eviction, the Pod is deleted as follows: ## Troubleshooting stuck evictions In some cases, your applications may enter a broken state, where the Eviction -API will only return `429` or `500` responses until you intervene. This can -happen if, for example, a ReplicaSet creates pods for your application but new +API will only return `429` or `500` responses until you intervene. This can +happen if, for example, a ReplicaSet creates pods for your application but new pods do not enter a `Ready` state. You may also notice this behavior in cases where the last evicted Pod had a long termination grace period. -If you notice stuck evictions, try one of the following solutions: +If you notice stuck evictions, try one of the following solutions: * Abort or pause the automated operation causing the issue. Investigate the stuck application before you restart the operation. diff --git a/content/en/docs/concepts/scheduling-eviction/assign-pod-node.md b/content/en/docs/concepts/scheduling-eviction/assign-pod-node.md index 3aeb05d8ec..bb6ce8d416 100644 --- a/content/en/docs/concepts/scheduling-eviction/assign-pod-node.md +++ b/content/en/docs/concepts/scheduling-eviction/assign-pod-node.md @@ -96,7 +96,7 @@ define. Some of the benefits of affinity and anti-affinity include: The affinity feature consists of two types of affinity: - *Node affinity* functions like the `nodeSelector` field but is more expressive and - allows you to specify soft rules. + allows you to specify soft rules. - *Inter-pod affinity/anti-affinity* allows you to constrain Pods against labels on other Pods. @@ -305,22 +305,22 @@ Pod affinity rule uses the "hard" `requiredDuringSchedulingIgnoredDuringExecution`, while the anti-affinity rule uses the "soft" `preferredDuringSchedulingIgnoredDuringExecution`. -The affinity rule specifies that the scheduler is allowed to place the example Pod +The affinity rule specifies that the scheduler is allowed to place the example Pod on a node only if that node belongs to a specific [zone](/docs/concepts/scheduling-eviction/topology-spread-constraints/) -where other Pods have been labeled with `security=S1`. -For instance, if we have a cluster with a designated zone, let's call it "Zone V," -consisting of nodes labeled with `topology.kubernetes.io/zone=V`, the scheduler can -assign the Pod to any node within Zone V, as long as there is at least one Pod within -Zone V already labeled with `security=S1`. Conversely, if there are no Pods with `security=S1` +where other Pods have been labeled with `security=S1`. 
+For instance, if we have a cluster with a designated zone, let's call it "Zone V," +consisting of nodes labeled with `topology.kubernetes.io/zone=V`, the scheduler can +assign the Pod to any node within Zone V, as long as there is at least one Pod within +Zone V already labeled with `security=S1`. Conversely, if there are no Pods with `security=S1` labels in Zone V, the scheduler will not assign the example Pod to any node in that zone. -The anti-affinity rule specifies that the scheduler should try to avoid scheduling the Pod +The anti-affinity rule specifies that the scheduler should try to avoid scheduling the Pod on a node if that node belongs to a specific [zone](/docs/concepts/scheduling-eviction/topology-spread-constraints/) -where other Pods have been labeled with `security=S2`. -For instance, if we have a cluster with a designated zone, let's call it "Zone R," -consisting of nodes labeled with `topology.kubernetes.io/zone=R`, the scheduler should avoid -assigning the Pod to any node within Zone R, as long as there is at least one Pod within -Zone R already labeled with `security=S2`. Conversely, the anti-affinity rule does not impact +where other Pods have been labeled with `security=S2`. +For instance, if we have a cluster with a designated zone, let's call it "Zone R," +consisting of nodes labeled with `topology.kubernetes.io/zone=R`, the scheduler should avoid +assigning the Pod to any node within Zone R, as long as there is at least one Pod within +Zone R already labeled with `security=S2`. Conversely, the anti-affinity rule does not impact scheduling into Zone R if there are no Pods with `security=S2` labels. To get yourself more familiar with the examples of Pod affinity and anti-affinity, @@ -371,12 +371,12 @@ When you want to use it, you have to enable it via the {{< /note >}} Kubernetes includes an optional `matchLabelKeys` field for Pod affinity -or anti-affinity. The field specifies keys for the labels that should match with the incoming Pod's labels, +or anti-affinity. The field specifies keys for the labels that should match with the incoming Pod's labels, when satisfying the Pod (anti)affinity. The keys are used to look up values from the pod labels; those key-value labels are combined (using `AND`) with the match restrictions defined using the `labelSelector` field. The combined -filtering selects the set of existing pods that will be taken into Pod (anti)affinity calculation. +filtering selects the set of existing pods that will be taken into Pod (anti)affinity calculation. A common use case is to use `matchLabelKeys` with `pod-template-hash` (set on Pods managed as part of a Deployment, where the value is unique for each revision). @@ -405,7 +405,7 @@ spec: # Only Pods from a given rollout are taken into consideration when calculating pod affinity. # If you update the Deployment, the replacement Pods follow their own affinity rules # (if there are any defined in the new Pod template) - matchLabelKeys: + matchLabelKeys: - pod-template-hash ``` @@ -422,7 +422,7 @@ When you want to use it, you have to enable it via the {{< /note >}} Kubernetes includes an optional `mismatchLabelKeys` field for Pod affinity -or anti-affinity. The field specifies keys for the labels that should **not** match with the incoming Pod's labels, +or anti-affinity. The field specifies keys for the labels that should **not** match with the incoming Pod's labels, when satisfying the Pod (anti)affinity. 
One example use case is to ensure Pods go to the topology domain (node, zone, etc) where only Pods from the same tenant or team are scheduled in. @@ -438,22 +438,22 @@ metadata: ... spec: affinity: - podAffinity: + podAffinity: requiredDuringSchedulingIgnoredDuringExecution: # ensure that pods associated with this tenant land on the correct node pool - matchLabelKeys: - tenant topologyKey: node-pool - podAntiAffinity: + podAntiAffinity: requiredDuringSchedulingIgnoredDuringExecution: # ensure that pods associated with this tenant can't schedule to nodes used for another tenant - mismatchLabelKeys: - - tenant # whatever the value of the "tenant" label for this Pod, prevent + - tenant # whatever the value of the "tenant" label for this Pod, prevent # scheduling to nodes in any pool where any Pod from a different # tenant is running. labelSelector: # We have to have the labelSelector which selects only Pods with the tenant label, - # otherwise this Pod would hate Pods from daemonsets as well, for example, + # otherwise this Pod would hate Pods from daemonsets as well, for example, # which aren't supposed to have the tenant label. matchExpressions: - key: tenant @@ -633,13 +633,13 @@ The following operators can only be used with `nodeAffinity`. | Operator | Behaviour | | :------------: | :-------------: | -| `Gt` | The supplied value will be parsed as an integer, and that integer is less than the integer that results from parsing the value of a label named by this selector | -| `Lt` | The supplied value will be parsed as an integer, and that integer is greater than the integer that results from parsing the value of a label named by this selector | +| `Gt` | The supplied value will be parsed as an integer, and that integer is less than the integer that results from parsing the value of a label named by this selector | +| `Lt` | The supplied value will be parsed as an integer, and that integer is greater than the integer that results from parsing the value of a label named by this selector | {{}} -`Gt` and `Lt` operators will not work with non-integer values. If the given value -doesn't parse as an integer, the pod will fail to get scheduled. Also, `Gt` and `Lt` +`Gt` and `Lt` operators will not work with non-integer values. If the given value +doesn't parse as an integer, the pod will fail to get scheduled. Also, `Gt` and `Lt` are not available for `podAffinity`. {{}} diff --git a/content/en/docs/concepts/scheduling-eviction/pod-priority-preemption.md b/content/en/docs/concepts/scheduling-eviction/pod-priority-preemption.md index c312611564..c6b8da1838 100644 --- a/content/en/docs/concepts/scheduling-eviction/pod-priority-preemption.md +++ b/content/en/docs/concepts/scheduling-eviction/pod-priority-preemption.md @@ -64,7 +64,7 @@ and it cannot be prefixed with `system-`. A PriorityClass object can have any 32-bit integer value smaller than or equal to 1 billion. This means that the range of values for a PriorityClass object is -from -2147483648 to 1000000000 inclusive. Larger numbers are reserved for +from -2147483648 to 1000000000 inclusive. Larger numbers are reserved for built-in PriorityClasses that represent critical system Pods. A cluster admin should create one PriorityClass object for each such mapping that they want. @@ -256,9 +256,9 @@ the Node is not considered for preemption. 
If a pending Pod has inter-pod {{< glossary_tooltip text="affinity" term_id="affinity" >}} to one or more of the lower-priority Pods on the Node, the inter-Pod affinity -rule cannot be satisfied in the absence of those lower-priority Pods. In this case, +rule cannot be satisfied in the absence of those lower-priority Pods. In this case, the scheduler does not preempt any Pods on the Node. Instead, it looks for another -Node. The scheduler might find a suitable Node or it might not. There is no +Node. The scheduler might find a suitable Node or it might not. There is no guarantee that the pending Pod can be scheduled. Our recommended solution for this problem is to create inter-Pod affinity only @@ -361,7 +361,7 @@ to get evicted. The kubelet ranks pods for eviction based on the following facto 1. Whether the starved resource usage exceeds requests 1. Pod Priority - 1. Amount of resource usage relative to requests + 1. Amount of resource usage relative to requests See [Pod selection for kubelet eviction](/docs/concepts/scheduling-eviction/node-pressure-eviction/#pod-selection-for-kubelet-eviction) for more details. diff --git a/content/en/docs/concepts/scheduling-eviction/pod-scheduling-readiness.md b/content/en/docs/concepts/scheduling-eviction/pod-scheduling-readiness.md index 9b1df2851f..e895ffd5fb 100644 --- a/content/en/docs/concepts/scheduling-eviction/pod-scheduling-readiness.md +++ b/content/en/docs/concepts/scheduling-eviction/pod-scheduling-readiness.md @@ -9,7 +9,7 @@ weight: 40 {{< feature-state for_k8s_version="v1.27" state="beta" >}} Pods were considered ready for scheduling once created. Kubernetes scheduler -does its due diligence to find nodes to place all pending Pods. However, in a +does its due diligence to find nodes to place all pending Pods. However, in a real-world case, some Pods may stay in a "miss-essential-resources" state for a long period. These Pods actually churn the scheduler (and downstream integrators like Cluster AutoScaler) in an unnecessary manner. @@ -79,7 +79,7 @@ Given the test-pod doesn't request any CPU/memory resources, it's expected that transited from previous `SchedulingGated` to `Running`: ```none -NAME READY STATUS RESTARTS AGE IP NODE +NAME READY STATUS RESTARTS AGE IP NODE test-pod 1/1 Running 0 15s 10.0.0.4 node-2 ``` @@ -94,8 +94,8 @@ scheduling. You can use `scheduler_pending_pods{queue="gated"}` to check the met {{< feature-state for_k8s_version="v1.27" state="beta" >}} You can mutate scheduling directives of Pods while they have scheduling gates, with certain constraints. -At a high level, you can only tighten the scheduling directives of a Pod. In other words, the updated -directives would cause the Pods to only be able to be scheduled on a subset of the nodes that it would +At a high level, you can only tighten the scheduling directives of a Pod. In other words, the updated +directives would cause the Pods to only be able to be scheduled on a subset of the nodes that it would previously match. More concretely, the rules for updating a Pod's scheduling directives are as follows: 1. For `.spec.nodeSelector`, only additions are allowed. If absent, it will be allowed to be set. @@ -107,8 +107,8 @@ previously match. More concretely, the rules for updating a Pod's scheduling dir or `fieldExpressions` are allowed, and no changes to existing `matchExpressions` and `fieldExpressions` will be allowed. 
This is because the terms in `.requiredDuringSchedulingIgnoredDuringExecution.NodeSelectorTerms`, are ORed - while the expressions in `nodeSelectorTerms[].matchExpressions` and - `nodeSelectorTerms[].fieldExpressions` are ANDed. + while the expressions in `nodeSelectorTerms[].matchExpressions` and + `nodeSelectorTerms[].fieldExpressions` are ANDed. 4. For `.preferredDuringSchedulingIgnoredDuringExecution`, all updates are allowed. This is because preferred terms are not authoritative, and so policy controllers diff --git a/content/en/docs/concepts/scheduling-eviction/resource-bin-packing.md b/content/en/docs/concepts/scheduling-eviction/resource-bin-packing.md index 49432b6210..46930cc062 100644 --- a/content/en/docs/concepts/scheduling-eviction/resource-bin-packing.md +++ b/content/en/docs/concepts/scheduling-eviction/resource-bin-packing.md @@ -57,8 +57,8 @@ the `NodeResourcesFit` score function can be controlled by the Within the `scoringStrategy` field, you can configure two parameters: `requestedToCapacityRatio` and `resources`. The `shape` in the `requestedToCapacityRatio` parameter allows the user to tune the function as least requested or most -requested based on `utilization` and `score` values. The `resources` parameter -comprises both the `name` of the resource to be considered during scoring and +requested based on `utilization` and `score` values. The `resources` parameter +comprises both the `name` of the resource to be considered during scoring and its corresponding `weight`, which specifies the weight of each resource. Below is an example configuration that sets diff --git a/content/en/docs/concepts/scheduling-eviction/scheduling-framework.md b/content/en/docs/concepts/scheduling-eviction/scheduling-framework.md index d68548f68e..63a8c7d3e6 100644 --- a/content/en/docs/concepts/scheduling-eviction/scheduling-framework.md +++ b/content/en/docs/concepts/scheduling-eviction/scheduling-framework.md @@ -83,7 +83,7 @@ the Pod is put into the active queue or the backoff queue so that the scheduler will retry the scheduling of the Pod. {{< note >}} -QueueingHint evaluation during scheduling is a beta-level feature. +QueueingHint evaluation during scheduling is a beta-level feature. The v1.28 release series initially enabled the associated feature gate; however, after the discovery of an excessive memory footprint, the Kubernetes project set that feature gate to be disabled by default. In Kubernetes {{< skew currentVersion >}}, this feature gate is diff --git a/content/en/docs/concepts/scheduling-eviction/topology-spread-constraints.md b/content/en/docs/concepts/scheduling-eviction/topology-spread-constraints.md index 2a8eb78d5b..6ebddabd8a 100644 --- a/content/en/docs/concepts/scheduling-eviction/topology-spread-constraints.md +++ b/content/en/docs/concepts/scheduling-eviction/topology-spread-constraints.md @@ -99,7 +99,7 @@ your cluster. Those fields are: {{< note >}} The `MinDomainsInPodTopologySpread` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) enables `minDomains` for pod topology spread. Starting from v1.28, - the `MinDomainsInPodTopologySpread` gate + the `MinDomainsInPodTopologySpread` gate is enabled by default. In older Kubernetes clusters it might be explicitly disabled or the field might not be available. {{< /note >}}
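
The hunks above touch several scheduling features whose described behavior maps directly onto short manifests. The sketches below are reference material for reviewers only; they are not part of this patch, and any names, images, and resource names in them are illustrative.

The `assign-pod-node.md` hunk explains an example that pairs a hard (`requiredDuringSchedulingIgnoredDuringExecution`) Pod affinity rule on `security=S1` with a soft (`preferredDuringSchedulingIgnoredDuringExecution`) anti-affinity rule on `security=S2`, both keyed on the zone topology. A minimal sketch of such a Pod:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: with-pod-affinity   # illustrative name
spec:
  affinity:
    podAffinity:
      # hard rule: only schedule into a zone that already runs a Pod labeled security=S1
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: security
            operator: In
            values:
            - S1
        topologyKey: topology.kubernetes.io/zone
    podAntiAffinity:
      # soft rule: prefer to avoid zones that already run a Pod labeled security=S2
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: security
              operator: In
              values:
              - S2
          topologyKey: topology.kubernetes.io/zone
  containers:
  - name: with-pod-affinity
    image: registry.k8s.io/pause:3.8
```

The `pod-scheduling-readiness.md` hunks cover Pods held back by scheduling gates, whose scheduling directives may only be tightened while they are gated. A sketch of a gated Pod, assuming a hypothetical gate name `example.com/foo`; while the gate is present you could, for example, add a `spec.nodeSelector` entry, but not remove one:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  schedulingGates:
  - name: example.com/foo   # the Pod stays SchedulingGated until this gate is removed
  containers:
  - name: pause
    image: registry.k8s.io/pause:3.8
```

The `resource-bin-packing.md` hunk describes tuning the `NodeResourcesFit` score through `scoringStrategy`, where `requestedToCapacityRatio.shape` sets `utilization`/`score` points and `resources` lists each resource `name` with its `weight`. A scheduler configuration sketch along those lines, assuming an illustrative extended resource `example.com/device`:

```yaml
apiVersion: kubescheduler.config.k8s.io/v1
kind: KubeSchedulerConfiguration
profiles:
- schedulerName: default-scheduler
  pluginConfig:
  - name: NodeResourcesFit
    args:
      scoringStrategy:
        type: RequestedToCapacityRatio
        # score rises with utilization, so the scheduler packs these resources
        requestedToCapacityRatio:
          shape:
          - utilization: 0
            score: 0
          - utilization: 100
            score: 10
        resources:
        - name: example.com/device
          weight: 3
        - name: memory
          weight: 1
```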