Merge pull request #45986 from my-git9/patch-12867

[zh-cn] sync assign-pod-node node-pressure-eviction pod-priority-preemption pod-scheduling-readiness
pull/46355/head
Kubernetes Prow Robot 2024-04-24 00:17:35 -07:00 committed by GitHub
commit 15715569ac
4 changed files with 51 additions and 52 deletions

View File

@@ -473,7 +473,7 @@ the node label that the system uses to denote the domain. For examples, see
{{< note >}}
<!--
Inter-pod affinity and anti-affinity require substantial amounts of
processing which can slow down scheduling in large clusters significantly. We do
not recommend using them in clusters larger than several hundred nodes.
-->
@@ -483,7 +483,7 @@ Pod 间亲和性和反亲和性都需要相当的计算量,因此会在大规
{{< note >}}
<!--
Pod anti-affinity requires nodes to be consistently labeled, in other words,
every node in the cluster must have an appropriate label matching `topologyKey`.
If some or all nodes are missing the specified `topologyKey` label, it can lead
to unintended behavior.
@@ -567,13 +567,13 @@ uses the "soft" `preferredDuringSchedulingIgnoredDuringExecution`.
`preferredDuringSchedulingIgnoredDuringExecution`
<!--
The affinity rule specifies that the scheduler is allowed to place the example Pod
on a node only if that node belongs to a specific [zone](/docs/concepts/scheduling-eviction/topology-spread-constraints/)
where other Pods have been labeled with `security=S1`.
For instance, if we have a cluster with a designated zone, let's call it "Zone V,"
consisting of nodes labeled with `topology.kubernetes.io/zone=V`, the scheduler can
assign the Pod to any node within Zone V, as long as there is at least one Pod within
Zone V already labeled with `security=S1`. Conversely, if there are no Pods with `security=S1`
labels in Zone V, the scheduler will not assign the example Pod to any node in that zone.
-->
亲和性规则规定,只有节点属于特定的[区域](/zh-cn/docs/concepts/scheduling-eviction/topology-spread-constraints/)
@@ -584,13 +584,13 @@ labels in Zone V, the scheduler will not assign the example Pod to any node in t
则调度器不会将示例 Pod 调度给该区域中的任何节点。
<!--
The anti-affinity rule specifies that the scheduler should try to avoid scheduling the Pod
on a node if that node belongs to a specific [zone](/docs/concepts/scheduling-eviction/topology-spread-constraints/)
where other Pods have been labeled with `security=S2`.
For instance, if we have a cluster with a designated zone, let's call it "Zone R,"
consisting of nodes labeled with `topology.kubernetes.io/zone=R`, the scheduler should avoid
assigning the Pod to any node within Zone R, as long as there is at least one Pod within
Zone R already labeled with `security=S2`. Conversely, the anti-affinity rule does not impact
scheduling into Zone R if there are no Pods with `security=S2` labels.
-->
反亲和性规则规定,如果节点属于特定的[区域](/zh-cn/docs/concepts/scheduling-eviction/topology-spread-constraints/)
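
For reference, a minimal manifest sketch that expresses both rules described above: a hard Pod-affinity term on `security=S1` and a soft anti-affinity term on `security=S2`, both keyed on `topology.kubernetes.io/zone`. The Pod name, weight, and container image are illustrative, not taken from the original example.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: with-pod-affinity
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: security
            operator: In
            values:
            - S1
        topologyKey: topology.kubernetes.io/zone
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: security
              operator: In
              values:
              - S2
          topologyKey: topology.kubernetes.io/zone
  containers:
  - name: with-pod-affinity
    image: registry.k8s.io/pause:3.8
```
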
@@ -681,7 +681,7 @@ null `namespaceSelector` matches the namespace of the Pod where the rule is defi
{{< note >}}
<!-- UPDATE THIS WHEN PROMOTING TO BETA -->
<!--
The `matchLabelKeys` field is an alpha-level field and is disabled by default in
Kubernetes {{< skew currentVersion >}}.
When you want to use it, you have to enable it via the
`MatchLabelKeysInPodAffinity` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/).
@@ -693,7 +693,7 @@ When you want to use it, you have to enable it via the
<!--
Kubernetes includes an optional `matchLabelKeys` field for Pod affinity
or anti-affinity. The field specifies keys for the labels that should match with the incoming Pod's labels,
when satisfying the Pod (anti)affinity.
The keys are used to look up values from the pod labels; those key-value labels are combined
@@ -755,7 +755,7 @@ spec:
{{< note >}}
<!-- UPDATE THIS WHEN PROMOTING TO BETA -->
<!--
The `mismatchLabelKeys` field is an alpha-level field and is disabled by default in
Kubernetes {{< skew currentVersion >}}.
When you want to use it, you have to enable it via the
`MatchLabelKeysInPodAffinity` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/).
@@ -767,7 +767,7 @@ When you want to use it, you have to enable it via the
<!--
Kubernetes includes an optional `mismatchLabelKeys` field for Pod affinity
or anti-affinity. The field specifies keys for the labels that should **not** match with the incoming Pod's labels,
when satisfying the Pod (anti)affinity.
One example use case is to ensure Pods go to the topology domain (node, zone, etc) where only Pods from the same tenant or team are scheduled in.
@@ -790,22 +790,22 @@ metadata:
...
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      # ensure that pods associated with this tenant land on the correct node pool
      - matchLabelKeys:
        - tenant
        topologyKey: node-pool
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      # ensure that pods associated with this tenant can't schedule to nodes used for another tenant
      - mismatchLabelKeys:
        - tenant # whatever the value of the "tenant" label for this Pod, prevent
                 # scheduling to nodes in any pool where any Pod from a different
                 # tenant is running.
        labelSelector:
          # We have to have the labelSelector which selects only Pods with the tenant label,
          # otherwise this Pod would hate Pods from daemonsets as well, for example,
          # which aren't supposed to have the tenant label.
          matchExpressions:
          - key: tenant
@@ -823,13 +823,13 @@ metadata:
...
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      # 确保与此租户关联的 Pod 落在正确的节点池上
      - matchLabelKeys:
        - tenant
        topologyKey: node-pool
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      # 确保与此租户关联的 Pod 不能调度到用于其他租户的节点上
      - mismatchLabelKeys:
@@ -974,7 +974,7 @@ where each web server is co-located with a cache, on three separate nodes.
| *cache-1* | *cache-2* | *cache-3* |
<!--
The overall effect is that each cache instance is likely to be accessed by a single client that
is running on the same node. This approach aims to minimize both skew (imbalanced load) and latency.
-->
总体效果是每个缓存实例都非常可能被在同一个节点上运行的某个客户端访问,
@@ -1024,18 +1024,18 @@ Some of the limitations of using `nodeName` to select nodes are:
而其失败原因中会给出是否因为内存或 CPU 不足而造成无法运行。
- 在云环境中的节点名称并不总是可预测的,也不总是稳定的。
{{< warning >}}
<!--
`nodeName` is intended for use by custom schedulers or advanced use cases where
you need to bypass any configured schedulers. Bypassing the schedulers might lead to
failed Pods if the assigned Nodes get oversubscribed. You can use [node affinity](#node-affinity)
or the [`nodeselector` field](#nodeselector) to assign a Pod to a specific Node without bypassing the schedulers.
-->
`nodeName` 旨在供自定义调度器或需要绕过任何已配置调度器的高级场景使用。
如果已分配的 Node 负载过重,绕过调度器可能会导致 Pod 失败。
你可以使用[节点亲和性](#node-affinity)或 [`nodeselector` 字段](#nodeselector)将
Pod 分配给特定 Node,而无需绕过调度器。
{{</ warning >}}
<!--
Here is an example of a Pod spec using the `nodeName` field:
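
A minimal sketch of such a spec; the node name `kube-01` is an assumption, substitute a node that actually exists in your cluster:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: nginx
  # Schedule this Pod directly onto the named node, bypassing the scheduler.
  nodeName: kube-01
```
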
@@ -1113,7 +1113,7 @@ The following operators can only be used with `nodeAffinity`.
<!--
| Operator | Behaviour |
| :------------: | :-------------: |
| `Gt` | The supplied value will be parsed as an integer, and that integer is less than the integer that results from parsing the value of a label named by this selector |
| `Lt` | The supplied value will be parsed as an integer, and that integer is greater than the integer that results from parsing the value of a label named by this selector |
-->
| 操作符 | 行为 |
@@ -1123,8 +1123,8 @@ The following operators can only be used with `nodeAffinity`.
{{<note>}}
<!--
`Gt` and `Lt` operators will not work with non-integer values. If the given value
doesn't parse as an integer, the pod will fail to get scheduled. Also, `Gt` and `Lt`
are not available for `podAffinity`.
-->
`Gt` 和 `Lt` 操作符不能与非整数值一起使用。
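
As an illustrative sketch (the `example.com/cpu-cores` label key and the threshold are assumptions, not standard Kubernetes labels), a node affinity term that only matches nodes whose label value parses to an integer greater than 4:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gt-operator-demo
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: example.com/cpu-cores
            operator: Gt
            # The node's label value must parse as an integer greater than 4.
            values:
            - "4"
  containers:
  - name: pause
    image: registry.k8s.io/pause:3.8
```
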
@@ -1144,9 +1144,8 @@ are not available for `podAffinity`.
- Learn how to use [affinity and anti-affinity](/docs/tasks/configure-pod-container/assign-pods-nodes-using-node-affinity/).
-->
- 进一步阅读[污点与容忍度](/zh-cn/docs/concepts/scheduling-eviction/taint-and-toleration/)文档。
- 阅读[节点亲和性](https://git.k8s.io/design-proposals-archive/scheduling/nodeaffinity.md)和
  [Pod 间亲和性与反亲和性](https://git.k8s.io/design-proposals-archive/scheduling/podaffinity.md)的设计文档。
- 了解[拓扑管理器](/zh-cn/docs/tasks/administer-cluster/topology-manager/)如何参与节点层面资源分配决定。
- 了解如何使用 [nodeSelector](/zh-cn/docs/tasks/configure-pod-container/assign-pods-nodes/)。
- 了解如何使用[亲和性和反亲和性](/zh-cn/docs/tasks/configure-pod-container/assign-pods-nodes-using-node-affinity/)。

View File

@@ -324,6 +324,7 @@ The kubelet has the following default hard eviction thresholds:
- `nodefs.available<10%`
- `imagefs.available<15%`
- `nodefs.inodesFree<5%` (Linux nodes)
- `imagefs.inodesFree<5%` (Linux nodes)
-->
kubelet 具有以下默认硬驱逐条件:
@@ -331,6 +332,7 @@ kubelet 具有以下默认硬驱逐条件:
- `nodefs.available<10%`
- `imagefs.available<15%`
- `nodefs.inodesFree<5%`(Linux 节点)
- `imagefs.inodesFree<5%`(Linux 节点)
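
If you want to state these thresholds explicitly rather than rely on the defaults, they map to the `evictionHard` field of the kubelet configuration file. A sketch that mirrors only the values listed above:

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# Setting any custom hard eviction threshold replaces the built-in defaults,
# so list every signal you still want enforced.
evictionHard:
  nodefs.available: "10%"
  imagefs.available: "15%"
  nodefs.inodesFree: "5%"
  imagefs.inodesFree: "5%"
```
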
<!--
These default values of hard eviction thresholds will only be set if none

View File

@@ -293,8 +293,8 @@ When Pod priority is enabled, the scheduler orders pending Pods by
their priority and a pending Pod is placed ahead of other pending Pods
with lower priority in the scheduling queue. As a result, the higher
priority Pod may be scheduled sooner than Pods with lower priority if
its scheduling requirements are met. If such a Pod cannot be scheduled, the
scheduler will continue and try to schedule other lower priority Pods.
-->
### Pod 优先级对调度顺序的影响 {#effect-of-pod-priority-on-scheduling-order}
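
As an illustrative sketch of how priority enters the queue ordering (the class name and value are assumptions): a PriorityClass and a Pod that references it. While this Pod is pending, it is placed ahead of pending Pods whose priority value is lower.

```yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority
value: 1000000          # larger value means higher priority
globalDefault: false
description: "For workloads that should be scheduled ahead of others."
---
apiVersion: v1
kind: Pod
metadata:
  name: important-app
spec:
  priorityClassName: high-priority
  containers:
  - name: app
    image: registry.k8s.io/pause:3.8
```
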
@@ -329,7 +329,7 @@ Pod 被创建后会进入队列等待调度。
### User exposed information
When Pod P preempts one or more Pods on Node N, the `nominatedNodeName` field of Pod
P's status is set to the name of Node N. This field helps the scheduler track
resources reserved for Pod P and also gives users information about preemptions
in their clusters.
@@ -339,8 +339,8 @@ After victim Pods are preempted, they get their graceful termination period. If
another node becomes available while the scheduler is waiting for the victim Pods to
terminate, the scheduler may use the other node to schedule Pod P. As a result
`nominatedNodeName` and `nodeName` of the Pod spec are not always the same. Also, if
the scheduler preempts Pods on Node N, but then a higher priority Pod than Pod P
arrives, the scheduler may give Node N to the new higher priority Pod. In such a
case, the scheduler clears `nominatedNodeName` of Pod P. By doing this, the scheduler
makes Pod P eligible to preempt Pods on another Node.
-->
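
For illustration only (the Pod name, node name, and priority class are assumptions), this is roughly the shape of the preemptor's status in `kubectl get pod -o yaml` output while the victim Pods are still terminating:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-p                    # hypothetical preemptor Pod
spec:
  priorityClassName: high-priority
  containers:
  - name: app
    image: registry.k8s.io/pause:3.8
status:
  phase: Pending
  # Set by the scheduler after it preempted Pods on Node N; the Pod may still
  # end up bound to a different node, so nominatedNodeName and nodeName
  # are not always the same.
  nominatedNodeName: node-n
```
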
@@ -502,7 +502,7 @@ enough demand and if we find an algorithm with reasonable performance.
<!--
## Troubleshooting
Pod priority and preemption can have unwanted side effects. Here are some
examples of potential problems and ways to deal with them.
-->
## 故障排除 {#troubleshooting}

View File

@@ -11,11 +11,11 @@ weight: 40
<!-- overview -->
{{< feature-state for_k8s_version="v1.27" state="beta" >}}
{{< feature-state for_k8s_version="v1.30" state="stable" >}}
<!--
Pods were considered ready for scheduling once created. Kubernetes scheduler
does its due diligence to find nodes to place all pending Pods. However, in a
real-world case, some Pods may stay in a "miss-essential-resources" state for a long period.
These Pods actually churn the scheduler (and downstream integrators like Cluster AutoScaler)
in an unnecessary manner.
@@ -98,7 +98,7 @@ The output is:
<!--
To inform the scheduler this Pod is ready for scheduling, you can remove its `schedulingGates` entirely
by reapplying a modified manifest:
-->
要通知调度程序此 Pod 已准备好进行调度,你可以通过重新应用修改后的清单来完全删除其 `schedulingGates`
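
A sketch of what that looks like in practice; the gate name is illustrative, and `test-pod` matches the output shown below. Removing the whole `schedulingGates` block and reapplying the manifest releases the Pod to the scheduler.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  # While this field is present the Pod stays in SchedulingGated.
  # Delete the schedulingGates block and re-apply to mark the Pod ready for scheduling.
  schedulingGates:
  - name: example.com/foo
  containers:
  - name: pause
    image: registry.k8s.io/pause:3.6
```
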
@@ -130,7 +130,7 @@ transited from previous `SchedulingGated` to `Running`:
`SchedulingGated` 转变为 `Running`
```none
NAME       READY   STATUS    RESTARTS   AGE   IP         NODE
test-pod   1/1     Running   0          15s   10.0.0.4   node-2
```
@@ -148,16 +148,14 @@ scheduling. You can use `scheduler_pending_pods{queue="gated"}` to check the met
你可以使用 `scheduler_pending_pods{queue="gated"}` 来检查指标结果。
<!--
## Mutable Pod scheduling directives
-->
## 可变 Pod 调度指令 {#mutable-pod-scheduling-directives}
{{< feature-state for_k8s_version="v1.27" state="beta" >}}
<!--
You can mutate scheduling directives of Pods while they have scheduling gates, with certain constraints.
At a high level, you can only tighten the scheduling directives of a Pod. In other words, the updated
directives would cause the Pod to only be able to be scheduled on a subset of the nodes that it would
previously match. More concretely, the rules for updating a Pod's scheduling directives are as follows:
-->
当 Pod 具有调度门控时,你可以在某些约束条件下改变 Pod 的调度指令。
@@ -180,7 +178,7 @@ Pod 只能被调度到它之前匹配的节点子集上。
or `fieldExpressions` are allowed, and no changes to existing `matchExpressions`
and `fieldExpressions` will be allowed. This is because the terms in
`.requiredDuringSchedulingIgnoredDuringExecution.NodeSelectorTerms` are ORed
while the expressions in `nodeSelectorTerms[].matchExpressions` and
`nodeSelectorTerms[].fieldExpressions` are ANDed.
-->
3. 如果 `NodeSelectorTerms` 之前为空,则允许设置该字段。
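
A rough sketch of the kind of update these rules permit (the Pod name, gate name, and labels are assumptions): while the Pod still has scheduling gates, adding a term to a previously empty `nodeSelectorTerms` only narrows the set of nodes the Pod can match, so the update is accepted.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gated-pod
spec:
  schedulingGates:
  - name: example.com/foo       # the Pod must still be gated for the update to be allowed
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        # This list was previously empty; adding a term tightens the directives,
        # so the update is permitted.
        - matchExpressions:
          - key: topology.kubernetes.io/zone
            operator: In
            values:
            - antarctica-east1
  containers:
  - name: pause
    image: registry.k8s.io/pause:3.6
```
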