Merge pull request from mengjiao-liu/sync-scheduling-1.22

[zh] Concept files to sync for 1.22 - (9) Scheduling
pull/29431/head
Kubernetes Prow Robot 2021-08-16 20:20:01 -07:00 committed by GitHub
commit 4c047a7495
8 changed files with 111 additions and 117 deletions

View File

@@ -579,7 +579,7 @@ must be satisfied for the pod to be scheduled onto a node.
-->
#### 名字空间选择算符
{{< feature-state for_k8s_version="v1.21" state="alpha" >}}
{{< feature-state for_k8s_version="v1.22" state="beta" >}}
<!--
Users can also select matching namespaces using `namespaceSelector`, which is a label query over the set of namespaces.
@@ -595,14 +595,14 @@ null `namespaceSelector` means "this pod's namespace".
`namespaces` 列表以及 null 值 `namespaceSelector` 意味着“当前 Pod 的名字空间”。
<!--
This feature is alpha and disabled by default. You can enable it by setting the
This feature is beta and enabled by default. You can disable it via the
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
`PodAffinityNamespaceSelector` in both kube-apiserver and kube-scheduler.
-->
此功能特性是 Alpha 版本的,默认是被禁用的。你可以通过针对 kube-apiserver 和
此功能特性是 Beta 版本的,默认是被启用的。你可以通过针对 kube-apiserver 和
kube-scheduler 设置
[特性门控](/zh/docs/reference/command-line-tools-reference/feature-gates/)
`PodAffinityNamespaceSelector` 来启用此特性。
`PodAffinityNamespaceSelector` 来禁用此特性。
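For illustration only (the `team: ops` namespace label and `app: web` Pod label below are hypothetical, not taken from this page), a minimal sketch of a Pod that uses `namespaceSelector` in a pod-affinity term might look like this:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: with-namespace-selector    # hypothetical name
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: web               # assumed label on the target Pods
        namespaceSelector:         # label query over the set of namespaces
          matchLabels:
            team: ops              # assumed label on the matching namespaces
        topologyKey: topology.kubernetes.io/zone
  containers:
  - name: pause
    image: k8s.gcr.io/pause:2.0
```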
<!--
#### More Practical Use-cases

View File

@@ -1,47 +0,0 @@
---
title: 驱逐策略
content_type: concept
weight: 60
---
<!--
title: Eviction Policy
content_type: concept
weight: 60
-->
<!-- overview -->
<!--
This page is an overview of Kubernetes' policy for eviction.
-->
本页提供 Kubernetes 驱逐策略的概览。
<!-- body -->
<!--
## Eviction Policy
The {{< glossary_tooltip text="kubelet" term_id="kubelet" >}} proactively monitors for
and prevents total starvation of a compute resource. In those cases, the `kubelet` can reclaim
the starved resource by failing one or more Pods. When the `kubelet` fails
a Pod, it terminates all of its containers and transitions its `PodPhase` to `Failed`.
If the evicted Pod is managed by a Deployment, the Deployment creates another Pod
to be scheduled by Kubernetes.
-->
## 驱逐策略 {#eviction-policy}
{{< glossary_tooltip text="Kubelet" term_id="kubelet" >}} 主动监测和防止
计算资源的全面短缺。在资源短缺时,`kubelet` 可以主动地结束一个或多个 Pod
以回收短缺的资源。
`kubelet` 结束一个 Pod 时,它将终止 Pod 中的所有容器,而 Pod 的 `Phase`
将变为 `Failed`。
如果被驱逐的 Pod 由 Deployment 管理,这个 Deployment 会创建另一个 Pod 给
Kubernetes 来调度。
## {{% heading "whatsnext" %}}
<!--
- Learn how to [configure out of resource handling](/docs/tasks/administer-cluster/out-of-resource/) with eviction signals and thresholds.
-->
- 阅读[配置资源不足的处理](/zh/docs/tasks/administer-cluster/out-of-resource/)
进一步了解驱逐信号和阈值。

View File

@@ -95,7 +95,7 @@ the API server about this decision in a process called _binding_.
kube-apiserver,这个过程叫做 _绑定_。
<!--
Factors that need taken into account for scheduling decisions include
Factors that need to be taken into account for scheduling decisions include
individual and collective resource requirements, hardware / software /
policy constraints, affinity and anti-affinity specifications, data
locality, inter-workload interference, and so on.
@@ -173,7 +173,7 @@ of the scheduler:
* Read about [scheduler performance tuning](/docs/concepts/scheduling-eviction/scheduler-perf-tuning/)
* Read about [Pod topology spread constraints](/docs/concepts/workloads/pods/pod-topology-spread-constraints/)
* Read the [reference documentation](/docs/reference/command-line-tools-reference/kube-scheduler/) for kube-scheduler
* Read the [kube-scheduler config (v1beta1)](/docs/reference/config-api/kube-scheduler-config.v1beta1/) reference
* Read the [kube-scheduler config (v1beta2)](/docs/reference/config-api/kube-scheduler-config.v1beta2/) reference
* Learn about [configuring multiple schedulers](/docs/tasks/extend-kubernetes/configure-multiple-schedulers/)
* Learn about [topology management policies](/docs/tasks/administer-cluster/topology-manager/)
* Learn about [Pod Overhead](/docs/concepts/scheduling-eviction/pod-overhead/)
@@ -181,7 +181,7 @@ of the scheduler:
* 阅读关于 [调度器性能调优](/zh/docs/concepts/scheduling-eviction/scheduler-perf-tuning/)
* 阅读关于 [Pod 拓扑分布约束](/zh/docs/concepts/workloads/pods/pod-topology-spread-constraints/)
* 阅读关于 kube-scheduler 的 [参考文档](/zh/docs/reference/command-line-tools-reference/kube-scheduler/)
* 阅读 [kube-scheduler 配置参考 (v1beta1)](/zh/docs/reference/config-api/kube-scheduler-config.v1beta1/)
* 阅读 [kube-scheduler 配置参考 (v1beta2)](/zh/docs/reference/config-api/kube-scheduler-config.v1beta2/)
* 了解关于 [配置多个调度器](/zh/docs/tasks/extend-kubernetes/configure-multiple-schedulers/) 的方式
* 了解关于 [拓扑结构管理策略](/zh/docs/tasks/administer-cluster/topology-manager/)
* 了解关于 [Pod 额外开销](/zh/docs/concepts/scheduling-eviction/pod-overhead/)

View File

@@ -432,17 +432,17 @@ the Node is not considered for preemption.
{{< /note >}}
<!--
If a pending Pod has inter-pod affinity to one or more of the lower-priority
Pods on the Node, the inter-Pod affinity rule cannot be satisfied in the absence
of those lower-priority Pods. In this case, the scheduler does not preempt any
Pods on the Node. Instead, it looks for another Node. The scheduler might find a
suitable Node or it might not. There is no guarantee that the pending Pod can be
scheduled.
If a pending Pod has inter-pod {{< glossary_tooltip text="affinity" term_id="affinity" >}}
to one or more of the lower-priority Pods on the Node, the inter-Pod affinity
rule cannot be satisfied in the absence of those lower-priority Pods. In this case,
the scheduler does not preempt any Pods on the Node. Instead, it looks for another
Node. The scheduler might find a suitable Node or it might not. There is no
guarantee that the pending Pod can be scheduled.
Our recommended solution for this problem is to create inter-Pod affinity only
towards equal or higher priority Pods.
-->
如果悬决 Pod 与节点上的一个或多个较低优先级 Pod 具有 Pod 间亲和性,
如果悬决 Pod 与节点上的一个或多个较低优先级 Pod 具有 Pod 间{{< glossary_tooltip text="亲和性" term_id="affinity" >}}
则在没有这些较低优先级 Pod 的情况下,无法满足 Pod 间亲和性规则。
在这种情况下,调度程序不会抢占节点上的任何 Pod。
相反,它寻找另一个节点。调度程序可能会找到合适的节点,
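A rough sketch of that recommendation (the `high-priority` PriorityClass and the `tier: critical` label are hypothetical, not from this page): the pending Pod declares affinity only toward Pods that are known to run at equal or higher priority, so the rule can never require preempting lower-priority Pods.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pending-pod                 # hypothetical name
spec:
  priorityClassName: high-priority  # assumed PriorityClass
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            tier: critical          # assumed label carried only by equal- or higher-priority Pods
        topologyKey: kubernetes.io/hostname
  containers:
  - name: app
    image: k8s.gcr.io/pause:2.0
```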
@@ -620,7 +620,7 @@ Pod 优先级和 {{<glossary_tooltip text="QoS 类" term_id="qos-class" >}}
或者最低优先级的 Pod 受 PodDisruptionBudget 保护时,才会考虑优先级较高的 Pod。
<!--
The kubelet uses Priority to determine pod order for [out-of-resource eviction](/docs/tasks/administer-cluster/out-of-resource/).
The kubelet uses Priority to determine pod order for [node-pressure eviction](/docs/concepts/scheduling-eviction/node-pressure-eviction/).
You can use the QoS class to estimate the order in which pods are most likely
to get evicted. The kubelet ranks pods for eviction based on the following factors:
@@ -628,25 +628,26 @@ to get evicted. The kubelet ranks pods for eviction based on the following facto
1. Pod Priority
1. Amount of resource usage relative to requests
See [evicting end-user pods](/docs/tasks/administer-cluster/out-of-resource/#evicting-end-user-pods)
See [evicting end-user pods](/docs/concepts/scheduling-eviction/node-pressure-eviction/#pod-selection-for-kubelet-eviction)
for more details.
kubelet out-of-resource eviction does not evict Pods when their
kubelet node-pressure eviction does not evict Pods when their
usage does not exceed their requests. If a Pod with lower priority is not
exceeding its requests, it won't be evicted. Another Pod with higher priority
that exceeds its requests may be evicted.
-->
kubelet 使用优先级来确定
[资源不足时驱逐](/zh/docs/tasks/administer-cluster/out-of-resource/) Pod 的顺序。
[节点压力驱逐](/zh/docs/concepts/scheduling-eviction/node-pressure-eviction/) Pod 的顺序。
你可以使用 QoS 类来估计 Pod 最有可能被驱逐的顺序。kubelet 根据以下因素对 Pod 进行驱逐排名:
1. 对紧俏资源的使用是否超过请求值
1. Pod 优先级
1. 相对于请求的资源使用量
有关更多详细信息,请参阅[驱逐最终用户的 Pod](/zh/docs/tasks/administer-cluster/out-of-resource/#evicting-end-user-pods)。
有关更多详细信息,请参阅
[kubelet 驱逐时 Pod 的选择](/zh/docs/concepts/scheduling-eviction/node-pressure-eviction/#pod-selection-for-kubelet-eviction)。
当某 Pod 的资源用量未超过其请求时,kubelet 资源不足驱逐不会驱逐该 Pod。
当某 Pod 的资源用量未超过其请求时,kubelet 节点压力驱逐不会驱逐该 Pod。
如果优先级较低的 Pod 没有超过其请求,则不会被驱逐。
而另一个优先级较高但资源用量超过其请求的 Pod 则可能会被驱逐。

View File

@@ -32,60 +32,70 @@ The kube-scheduler can be configured to enable bin packing of resources along wi
<!--
## Enabling Bin Packing using RequestedToCapacityRatioResourceAllocation
Before Kubernetes 1.15, Kube-scheduler used to allow scoring nodes based on the request to capacity ratio of primary resources like CPU and Memory. Kubernetes 1.16 added a new parameter to the priority function that allows the users to specify the resources along with weights for each resource to score nodes based on the request to capacity ratio. This allows users to bin pack extended resources by using appropriate parameters and improves the utilization of scarce resources in large clusters. The behavior of the `RequestedToCapacityRatioResourceAllocation` priority function can be controlled by a configuration option called `requestedToCapacityRatioArguments`. This argument consists of two parameters `shape` and `resources`. Shape allows the user to tune the function as least requested or most requested based on `utilization` and `score` values. Resources
consists of `name` which specifies the resource to be considered during scoring and `weight` specify the weight of each resource.
Kubernetes allows the users to specify the resources along with weights for
each resource to score nodes based on the request to capacity ratio. This
allows users to bin pack extended resources by using appropriate parameters
and improves the utilization of scarce resources in large clusters. The
behavior of the `RequestedToCapacityRatioResourceAllocation` priority function
can be controlled by a configuration option called `RequestedToCapacityRatioArgs`.
This argument consists of two parameters `shape` and `resources`. The `shape`
parameter allows the user to tune the function as least requested or most
requested based on `utilization` and `score` values. The `resources` parameter
consists of the `name` of the resource to be considered during scoring and `weight`,
which specifies the weight of each resource.
-->
## 使用 RequestedToCapacityRatioResourceAllocation 启用装箱
在 Kubernetes 1.15 之前,Kube-scheduler 通常允许根据对主要资源(如 CPU 和内存)
的请求数量和可用容量之比率对节点评分。
Kubernetes 1.16 在优先级函数中添加了一个新参数,该参数允许用户指定资源以及每类资源的权重,
Kubernetes 允许用户指定资源以及每类资源的权重,
以便根据请求数量与可用容量之比率为节点评分。
这就使得用户可以通过使用适当的参数来对扩展资源执行装箱操作,从而提高了大型集群中稀缺资源的利用率。
`RequestedToCapacityRatioResourceAllocation` 优先级函数的行为可以通过名为
`requestedToCapacityRatioArguments` 的配置选项进行控制。
`RequestedToCapacityRatioArgs` 的配置选项进行控制。
该标志由两个参数 `shape``resources` 组成。
`shape` 允许用户根据 `utilization` 和 `score` 值将函数调整为最少请求
(least requested)或
最多请求(most requested)计算。
`shape` 允许用户根据 `utilization` 和 `score` 值将函数调整为
最少请求(least requested)或最多请求(most requested)计算。
`resources` 由 `name` 和 `weight` 组成,`name` 指定评分时要考虑的资源,
`weight` 指定每种资源的权重。
<!--
Below is an example configuration that sets `requestedToCapacityRatioArguments` to bin packing behavior for extended resources `intel.com/foo` and `intel.com/bar`
Below is an example configuration that sets
`requestedToCapacityRatioArguments` to bin packing behavior for extended
resources `intel.com/foo` and `intel.com/bar`.
-->
以下是一个配置示例,该配置将 `requestedToCapacityRatioArguments` 设置为对扩展资源
`intel.com/foo``intel.com/bar` 的装箱行为
```json
{
  "kind": "Policy",
  "apiVersion": "v1",
  ...
  "priorities": [
    ...
    {
      "name": "RequestedToCapacityRatioPriority",
      "weight": 2,
      "argument": {
        "requestedToCapacityRatioArguments": {
          "shape": [
            {"utilization": 0, "score": 0},
            {"utilization": 100, "score": 10}
          ],
          "resources": [
            {"name": "intel.com/foo", "weight": 3},
            {"name": "intel.com/bar", "weight": 5}
          ]
        }
      }
    }
  ],
}
```
```yaml
apiVersion: kubescheduler.config.k8s.io/v1beta1
kind: KubeSchedulerConfiguration
profiles:
# ...
  pluginConfig:
  - name: RequestedToCapacityRatio
    args:
      shape:
      - utilization: 0
        score: 10
      - utilization: 100
        score: 0
      resources:
      - name: intel.com/foo
        weight: 3
      - name: intel.com/bar
        weight: 5
```
<!--
Referencing the `KubeSchedulerConfiguration` file with the kube-scheduler
flag `--config=/path/to/config/file` will pass the configuration to the
scheduler.
-->
使用 kube-scheduler 标志 `--config=/path/to/config/file`
引用 `KubeSchedulerConfiguration` 文件将配置传递给调度器。
<!--
**This feature is disabled by default**
-->

View File

@@ -81,11 +81,11 @@ kube-scheduler 的表现等价于设置值为 100。
<!--
To change the value, edit the
[kube-scheduler configuration file](/docs/reference/config-api/kube-scheduler-config.v1beta1/)
[kube-scheduler configuration file](/docs/reference/config-api/kube-scheduler-config.v1beta2/)
and then restart the scheduler.
In many cases, the configuration file can be found at `/etc/kubernetes/config/kube-scheduler.yaml`
-->
要修改这个值,先编辑 [kube-scheduler 的配置文件](/zh/docs/reference/config-api/kube-scheduler-config.v1beta1/)
要修改这个值,先编辑 [kube-scheduler 的配置文件](/zh/docs/reference/config-api/kube-scheduler-config.v1beta2/)
然后重启调度器。
大多数情况下,这个配置文件是 `/etc/kubernetes/config/kube-scheduler.yaml`
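For example, a minimal sketch of such a configuration file (the value 50 is only illustrative) could look like:

```yaml
apiVersion: kubescheduler.config.k8s.io/v1beta2
kind: KubeSchedulerConfiguration
# ... other fields omitted ...
# Score only half of the feasible nodes found for each Pod.
percentageOfNodesToScore: 50
```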
@@ -298,6 +298,6 @@ After going over all the Nodes, it goes back to Node 1.
## {{% heading "whatsnext" %}}
<!-- * Check the [kube-scheduler configuration reference (v1beta1)](/docs/reference/config-api/kube-scheduler-config.v1beta1/) -->
<!-- * Check the [kube-scheduler configuration reference (v1beta2)](/docs/reference/config-api/kube-scheduler-config.v1beta2/) -->
* 参见 [kube-scheduler 配置参考 (v1beta1)](/zh/docs/reference/config-api/kube-scheduler-config.v1beta1/)
* 参见 [kube-scheduler 配置参考 (v1beta2)](/zh/docs/reference/config-api/kube-scheduler-config.v1beta2/)

View File

@@ -16,7 +16,7 @@ weight: 90
<!-- overview -->
{{< feature-state for_k8s_version="1.15" state="alpha" >}}
{{< feature-state for_k8s_version="1.19" state="stable" >}}
<!--
The scheduling framework is a pluggable architecture for the Kubernetes scheduler.

View File

@@ -476,10 +476,43 @@ This ensures that DaemonSet pods are never evicted due to these problems.
## 基于节点状态添加污点
<!--
The node lifecycle controller automatically creates taints corresponding to
Node conditions with `NoSchedule` effect.
Similarly the scheduler does not check Node conditions; instead the scheduler checks taints. This assures that Node conditions don't affect what's scheduled onto the Node. The user can choose to ignore some of the Node's problems (represented as Node conditions) by adding appropriate Pod tolerations.
The control plane, using the node {{<glossary_tooltip text="controller" term_id="controller">}},
automatically creates taints with a `NoSchedule` effect for [node conditions](/docs/concepts/scheduling-eviction/node-pressure-eviction/#node-conditions).
The scheduler checks taints, not node conditions, when it makes scheduling
decisions. This ensures that node conditions don't directly affect scheduling.
For example, if the `DiskPressure` node condition is active, the control plane
adds the `node.kubernetes.io/disk-pressure` taint and does not schedule new pods
onto the affected node. If the `MemoryPressure` node condition is active, the
control plane adds the `node.kubernetes.io/memory-pressure` taint.
-->
控制平面使用节点{{<glossary_tooltip text="控制器" term_id="controller">}}自动创建
与[节点状况](/zh/docs/concepts/scheduling-eviction/node-pressure-eviction/#node-conditions)对应的带有 `NoSchedule` 效应的污点。
调度器在进行调度时检查污点,而不是检查节点状况。这确保节点状况不会直接影响调度。
例如,如果 `DiskPressure` 节点状况处于活跃状态,则控制平面
添加 `node.kubernetes.io/disk-pressure` 污点并且不会调度新的 pod
到受影响的节点。如果 `MemoryPressure` 节点状况处于活跃状态,则
控制平面添加 `node.kubernetes.io/memory-pressure` 污点。
<!--
You can ignore node conditions for newly created pods by adding the corresponding
Pod tolerations. The control plane also adds the `node.kubernetes.io/memory-pressure`
toleration on pods that have a {{< glossary_tooltip text="QoS class" term_id="qos-class" >}}
other than `BestEffort`. This is because Kubernetes treats pods in the `Guaranteed`
or `Burstable` QoS classes (even pods with no memory request set) as if they are
able to cope with memory pressure, while new `BestEffort` pods are not scheduled
onto the affected node.
-->
对于新创建的 Pod可以通过添加相应的 Pod 容忍度来忽略节点状况。
控制平面还在具有除 `BestEffort` 之外的 {{<glossary_tooltip text="QoS 类" term_id="qos-class" >}}的 pod 上
添加 `node.kubernetes.io/memory-pressure` 容忍度。
这是因为 Kubernetes 将 `Guaranteed` 或 `Burstable` QoS 类中的 Pod(甚至没有设置内存请求的 Pod)
视为能够应对内存压力,而新创建的 `BestEffort` Pod 不会被调度到受影响的节点上。
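As a minimal sketch (the Pod name is hypothetical), a Pod that should remain schedulable on a node reporting the `DiskPressure` condition could carry a matching toleration:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: tolerates-disk-pressure   # hypothetical name
spec:
  tolerations:
  - key: node.kubernetes.io/disk-pressure  # taint added by the control plane for DiskPressure
    operator: Exists
    effect: NoSchedule
  containers:
  - name: app
    image: k8s.gcr.io/pause:2.0
```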
<!--
The DaemonSet controller automatically adds the
following `NoSchedule` tolerations to all daemons, to prevent DaemonSets from
breaking.
@@ -490,9 +523,6 @@ breaking.
* `node.kubernetes.io/unschedulable` (1.10 or later)
* `node.kubernetes.io/network-unavailable` (*host network only*)
-->
Node 生命周期控制器会自动创建与 Node 条件相对应的带有 `NoSchedule` 效应的污点。
同样,调度器不检查节点条件,而是检查节点污点。这确保了节点条件不会影响调度到节点上的内容。
用户可以通过添加适当的 Pod 容忍度来选择忽略某些 Node 的问题(表示为 Node 的调度条件)。
DaemonSet 控制器自动为所有守护进程添加如下 `NoSchedule` 容忍度以防 DaemonSet 崩溃:
@@ -512,8 +542,8 @@ arbitrary tolerations to DaemonSets.
## {{% heading "whatsnext" %}}
<!--
* Read about [out of resource handling](/docs/tasks/administer-cluster/out-of-resource/) and how you can configure it
* Read about [pod priority](/docs/concepts/configuration/pod-priority-preemption/)
* Read about [Node-pressure Eviction](/docs/concepts/scheduling-eviction/node-pressure-eviction/) and how you can configure it
* Read about [Pod Priority](/docs/concepts/scheduling-eviction/pod-priority-preemption/)
-->
* 阅读[资源耗尽的处理](/zh/docs/tasks/administer-cluster/out-of-resource/),以及如何配置其行为
* 阅读 [Pod 优先级](/zh/docs/concepts/configuration/pod-priority-preemption/)
* 阅读[节点压力驱逐](/zh/docs/concepts/scheduling-eviction/node-pressure-eviction/),以及如何配置其行为
* 阅读 [Pod 优先级](/zh/docs/concepts/scheduling-eviction/pod-priority-preemption/)