Merge pull request #28176 from chenrui333/zh/resync-scheduling-eviction-files
zh: resync scheduling files (pull/28300/head)
commit
04ea603058
@ -1,5 +1,53 @@
---
title: Scheduling and Eviction
title: Scheduling, Preemption and Eviction
weight: 90
description: In Kubernetes, scheduling refers to making sure that Pods are matched to suitable Nodes so that the kubelet can run them. Eviction is the process of proactively terminating one or more Pods on resource-starved Nodes.
content_type: concept
description: >
  In Kubernetes, scheduling refers to making sure that Pods are matched to suitable Nodes
  so that the kubelet can run them. Preemption is the process of terminating Pods with
  lower priority so that Pods with higher priority can be scheduled. Eviction is the
  process of proactively terminating one or more Pods on resource-starved Nodes.
---

<!--
---
title: "Scheduling, Preemption and Eviction"
weight: 90
content_type: concept
description: >
  In Kubernetes, scheduling refers to making sure that Pods are matched to Nodes
  so that the kubelet can run them. Preemption is the process of terminating
  Pods with lower Priority so that Pods with higher Priority can schedule on
  Nodes. Eviction is the process of proactively terminating one or more Pods on
  resource-starved Nodes.
no_list: true
---
-->

<!--
In Kubernetes, scheduling refers to making sure that {{<glossary_tooltip text="Pods" term_id="pod">}}
are matched to {{<glossary_tooltip text="Nodes" term_id="node">}} so that the
{{<glossary_tooltip text="kubelet" term_id="kubelet">}} can run them. Preemption
is the process of terminating Pods with lower {{<glossary_tooltip text="Priority" term_id="pod-priority">}}
so that Pods with higher Priority can schedule on Nodes. Eviction is the process
of terminating one or more Pods on Nodes.
-->

<!-- ## Scheduling -->

## Scheduling

* [Kubernetes Scheduler](/zh/docs/concepts/scheduling-eviction/kube-scheduler/)
* [Assigning Pods to Nodes](/zh/docs/concepts/scheduling-eviction/assign-pod-node/)
* [Pod Overhead](/zh/docs/concepts/scheduling-eviction/pod-overhead/)
* [Taints and Tolerations](/zh/docs/concepts/scheduling-eviction/taint-and-toleration/)
* [Scheduling Framework](/zh/docs/concepts/scheduling-eviction/scheduling-framework)
* [Scheduler Performance Tuning](/zh/docs/concepts/scheduling-eviction/scheduler-perf-tuning/)
* [Resource Bin Packing for Extended Resources](/zh/docs/concepts/scheduling-eviction/resource-bin-packing/)

<!-- ## Pod Disruption -->

## Pod Disruption

* [Pod Priority and Preemption](/zh/docs/concepts/scheduling-eviction/pod-priority-preemption/)
* [Node-pressure Eviction](/zh/docs/concepts/scheduling-eviction/node-pressure-eviction/)
* [API-initiated Eviction](/zh/docs/concepts/scheduling-eviction/api-eviction/)

@ -75,9 +75,9 @@ Run `kubectl get nodes` to get the names of your cluster's nodes. Pick out the o

Run `kubectl get nodes` to get the names of your cluster's nodes.
Pick out the node that you want to add a label to, and then run
`kubectl label nodes <node-name> <label-key>=<label-value>`
to add a label to the node you have chosen.
For example, if your node name is 'kubernetes-foo-node-1.c.a-robinson.internal'
and your desired label is 'disktype=ssd', you can run
`kubectl label nodes kubernetes-foo-node-1.c.a-robinson.internal disktype=ssd`.

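As a minimal sketch of how that label is then consumed, a Pod can request nodes carrying `disktype=ssd` through `nodeSelector`; the Pod name and container image below are only placeholders:

```yaml
# Sketch: a Pod constrained to nodes labeled disktype=ssd.
# metadata.name and the image are placeholders, not taken from the diff above.
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    env: test
spec:
  containers:
  - name: nginx
    image: nginx
  # nodeSelector matches the label added with `kubectl label nodes ... disktype=ssd`
  nodeSelector:
    disktype: ssd
```
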
@ -136,8 +136,18 @@ with a standard set of labels. See [Well-Known Labels, Annotations and Taints](/
-->
## Interlude: built-in node labels {#built-in-node-labels}

In addition to the labels you [attach](#attach-labels-to-node), nodes come pre-populated with a standard set of labels.
See [Well-Known Labels, Annotations and Taints](/zh/docs/reference/labels-annotations-taints/).
In addition to the labels you [attach](#step-one-attach-label-to-the-node), nodes come pre-populated with a standard set of labels.
See these [well-known labels, annotations and taints](/zh/docs/reference/labels-annotations-taints/):

* [`kubernetes.io/hostname`](/zh/docs/reference/kubernetes-api/labels-annotations-taints/#kubernetes-io-hostname)
* [`failure-domain.beta.kubernetes.io/zone`](/zh/docs/reference/kubernetes-api/labels-annotations-taints/#failure-domainbetakubernetesiozone)
* [`failure-domain.beta.kubernetes.io/region`](/zh/docs/reference/kubernetes-api/labels-annotations-taints/#failure-domainbetakubernetesioregion)
* [`topology.kubernetes.io/zone`](/zh/docs/reference/kubernetes-api/labels-annotations-taints/#topologykubernetesiozone)
* [`topology.kubernetes.io/region`](/zh/docs/reference/kubernetes-api/labels-annotations-taints/#topologykubernetesioregion)
* [`beta.kubernetes.io/instance-type`](/zh/docs/reference/kubernetes-api/labels-annotations-taints/#beta-kubernetes-io-instance-type)
* [`node.kubernetes.io/instance-type`](/zh/docs/reference/kubernetes-api/labels-annotations-taints/#nodekubernetesioinstance-type)
* [`kubernetes.io/os`](/zh/docs/reference/kubernetes-api/labels-annotations-taints/#kubernetes-io-os)
* [`kubernetes.io/arch`](/zh/docs/reference/kubernetes-api/labels-annotations-taints/#kubernetes-io-arch)

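For orientation only, here is a sketch of how such labels can appear on a Node object; the values shown are invented and vary by cloud provider and machine type:

```yaml
# Illustrative sketch only: label values depend on your provider and node setup.
apiVersion: v1
kind: Node
metadata:
  name: kubernetes-foo-node-1.c.a-robinson.internal
  labels:
    kubernetes.io/hostname: kubernetes-foo-node-1.c.a-robinson.internal
    kubernetes.io/os: linux
    kubernetes.io/arch: amd64
    node.kubernetes.io/instance-type: n1-standard-2   # example instance type
    topology.kubernetes.io/region: us-central1        # example region
    topology.kubernetes.io/zone: us-central1-a        # example zone
```
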
{{< note >}}
<!--
@ -181,7 +191,7 @@ To make use of that label prefix for node isolation:

For example, `example.com.node-restriction.kubernetes.io/fips=true` or `example.com.node-restriction.kubernetes.io/pci-dss=true`.
-->
1. Check that you are using Kubernetes v1.11+ so that NodeRestriction is available.
2. Ensure that you are using the [Node authorizer](/zh/docs/reference/access-authn-authz/node/) and have _enabled_ the
   [NodeRestriction admission plugin](/zh/docs/reference/access-authn-authz/admission-controllers/#noderestriction).
3. Add labels under the `node-restriction.kubernetes.io/` prefix to your Node objects,
   and use those labels in your node selectors (see the sketch below).

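For illustration only, the last step could look like the following sketch; the label key reuses the example prefix quoted above, while the Node and Pod names and the container image are placeholders:

```yaml
# Sketch: a Node labeled under the node-restriction prefix, and a Pod that selects it.
apiVersion: v1
kind: Node
metadata:
  name: example-node                                  # placeholder name
  labels:
    example.com.node-restriction.kubernetes.io/fips: "true"
---
apiVersion: v1
kind: Pod
metadata:
  name: fips-only-app                                 # placeholder name
spec:
  nodeSelector:
    example.com.node-restriction.kubernetes.io/fips: "true"
  containers:
  - name: app
    image: nginx
```
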
@ -574,7 +584,7 @@ must be satisfied for the pod to be scheduled onto a node.

<!--
Users can also select matching namespaces using `namespaceSelector`, which is a label query over the set of namespaces.
The affinity term is applied to the union of the namespaces selected by `namespaceSelector` and the ones listed in the `namespaces` field.
Note that an empty `namespaceSelector` ({}) matches all namespaces, while a null or empty `namespaces` list and
null `namespaceSelector` means "this pod's namespace".
-->
Users can also select matching namespaces using `namespaceSelector`, which is a label query over the set of namespaces.

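A hedged sketch of how `namespaceSelector` might appear inside a Pod affinity term follows (field layout per the PodAffinityTerm API; all labels, names, and the image are invented):

```yaml
# Sketch: require co-location (same zone) with "app: web" Pods, but only with Pods
# in namespaces labeled team=payments. All labels and names here are illustrative.
apiVersion: v1
kind: Pod
metadata:
  name: with-namespace-selector
spec:
  affinity:
    podAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - topologyKey: topology.kubernetes.io/zone
        labelSelector:
          matchLabels:
            app: web
        namespaceSelector:
          matchLabels:
            team: payments
  containers:
  - name: app
    image: nginx
```
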
@ -828,4 +838,3 @@ resource allocation decisions.

Once a Pod is assigned to a Node, the kubelet runs the Pod and allocates node-local resources.
The [topology manager](/zh/docs/tasks/administer-cluster/topology-manager/)
can take part in node-level resource allocation decisions.

@ -1,9 +1,21 @@
---
title: Pod Overhead
content_type: concept
weight: 20
weight: 30
---

<!--
---
reviewers:
- dchen1107
- egernst
- tallclair
title: Pod Overhead
content_type: concept
weight: 30
---
-->

<!-- overview -->

{{< feature-state for_k8s_version="v1.18" state="beta" >}}

@ -58,7 +70,7 @@ across your cluster, and a `RuntimeClass` is utilized which defines the `overhea

You need to make sure that the `PodOverhead` [feature gate](/zh/docs/reference/command-line-tools-reference/feature-gates/)
is enabled across your cluster (it is on by default as of 1.18), and that a `RuntimeClass` defining the `overhead` field is used.

<!--
## Usage example
-->
## Usage example

@ -85,7 +97,7 @@ overhead:
    cpu: "250m"
```

<!--
Workloads which are created which specify the `kata-fc` RuntimeClass handler will take the memory and
cpu overheads into account for resource quota calculations, node scheduling, as well as Pod cgroup sizing.

@ -119,7 +131,7 @@ spec:
        memory: 100Mi
```

<!--
At admission time the RuntimeClass [admission controller](/docs/reference/access-authn-authz/admission-controllers/)
updates the workload's PodSpec to include the `overhead` as described in the RuntimeClass. If the PodSpec already has this field defined,
the Pod will be rejected. In the given example, since only the RuntimeClass name is specified, the admission controller mutates the Pod

@ -129,7 +141,7 @@ to include an `overhead`.
the `overhead` defined in the RuntimeClass. If this field is already defined in the PodSpec, the Pod will be rejected.
In this example, since only the RuntimeClass name is specified, the admission controller mutates the Pod to include an `overhead`.

<!--
After the RuntimeClass admission controller, you can check the updated PodSpec:
-->
After the RuntimeClass admission controller, you can check the updated PodSpec:

@ -138,7 +150,7 @@ After the RuntimeClass admission controller, you can check the updated PodSpec:
kubectl get pod test-pod -o jsonpath='{.spec.overhead}'
```

<!--
The output is:
-->
The output is:

@ -146,25 +158,25 @@ The output is:
map[cpu:250m memory:120Mi]
```

<!--
If a ResourceQuota is defined, the sum of container requests as well as the
`overhead` field are counted.
-->
If a ResourceQuota is defined, the sum of the container requests as well as the `overhead` field are counted.

<!--
When the kube-scheduler is deciding which node should run a new Pod, the scheduler considers that Pod's
`overhead` as well as the sum of container requests for that Pod. For this example, the scheduler adds the
requests and the overhead, then looks for a node that has 2.25 CPU and 320 MiB of memory available.
-->
When the kube-scheduler is deciding which node should run a new Pod, the scheduler considers that Pod's
`overhead` as well as the sum of container requests for that Pod. For this example, the scheduler adds the
requests and the overhead, then looks for a node that has 2.25 CPU and 320 MiB of memory available.

<!--
Once a Pod is scheduled to a node, the kubelet on that node creates a new {{< glossary_tooltip text="cgroup" term_id="cgroup" >}}
for the Pod. It is within this pod that the underlying container runtime will create containers. -->
Once a Pod is scheduled to a node, the kubelet on that node creates a new {{< glossary_tooltip text="cgroup" term_id="cgroup" >}}
for the Pod. It is within this pod cgroup that the underlying container runtime will create containers.

<!--
If the resource has a limit defined for each container (Guaranteed QoS or Burstable QoS with limits defined),
the kubelet will set an upper limit for the pod cgroup associated with that resource (cpu.cfs_quota_us for CPU
and memory.limit_in_bytes for memory). This upper limit is based on the sum of the container limits plus the `overhead`

@ -179,7 +191,7 @@ requests plus the `overhead` defined in the PodSpec.
-->
For CPU, if the Pod's QoS is Guaranteed or Burstable, the kubelet sets `cpu.shares` based on the sum of the
container requests plus the `overhead` defined in the PodSpec.

<!--
Looking at our example, verify the container requests for the workload:
-->
Looking at our example, verify the container requests for the workload:

@ -187,7 +199,7 @@ Looking at our example, verify the container requests for the workload:
kubectl get pod test-pod -o jsonpath='{.spec.containers[*].resources.limits}'
```

<!--
The total container requests are 2000m CPU and 200MiB of memory:
-->
The total container requests are 2000m CPU and 200MiB of memory:

@ -195,7 +207,7 @@ The total container requests are 2000m CPU and 200MiB of memory:
map[cpu: 500m memory:100Mi] map[cpu:1500m memory:100Mi]
```

<!--
Check this against what is observed by the node:
-->
Check this against what is observed by the node:

@ -203,7 +215,7 @@ Check this against what is observed by the node:
kubectl describe node | grep test-pod -B2
```

<!--
The output shows 2250m CPU and 320MiB of memory are requested, which includes PodOverhead:
-->
The output shows that 2250m CPU and 320MiB of memory are requested, which includes PodOverhead:

@ -226,8 +238,9 @@ cgroups directly on the node.

First, on the particular node, determine the Pod identifier:
-->
On the node where the workload is running, check the Pod's memory cgroups. In the following example,
[`crictl`](https://github.com/kubernetes-sigs/cri-tools/blob/master/docs/crictl.md), a CLI for
CRI-compatible container runtimes, is used on the node.
This is an advanced example to show PodOverhead behavior; users are not expected to need to check cgroups directly on the node.

First, on the particular node, determine the Pod identifier:

@ -240,7 +253,7 @@ First, on the particular node, determine the Pod identifier:
POD_ID="$(sudo crictl pods --name test-pod -q)"
```

<!--
From this, you can determine the cgroup path for the Pod:
-->
From this, you can determine the cgroup path for the Pod:

@ -254,7 +267,7 @@ From this, you can determine the cgroup path for the Pod:
sudo crictl inspectp -o=json $POD_ID | grep cgroupsPath
```

<!--
The resulting cgroup path includes the Pod's `pause` container. The Pod level cgroup is one directory above.
-->
The resulting cgroup path includes the Pod's `pause` container. The Pod-level cgroup is one directory above it.

@ -262,7 +275,7 @@ The resulting cgroup path includes the Pod's `pause` container. The Pod level cg
"cgroupsPath": "/kubepods/podd7f4b509-cf94-4951-9417-d1087c92a5b2/7ccf55aee35dd16aca4189c952d83487297f3cd760f1bbf09620e206e7d0c27a"
```

<!--
In this specific case, the pod cgroup path is `kubepods/podd7f4b509-cf94-4951-9417-d1087c92a5b2`. Verify the Pod level cgroup setting for memory:
-->
In this specific case, the Pod's cgroup path is `kubepods/podd7f4b509-cf94-4951-9417-d1087c92a5b2`. Verify the Pod-level cgroup setting for memory:

@ -278,7 +291,7 @@ In this specific case, the pod cgroup path is `kubepods/podd7f4b509-cf94-4951-94
cat /sys/fs/cgroup/memory/kubepods/podd7f4b509-cf94-4951-9417-d1087c92a5b2/memory.limit_in_bytes
```

<!--
This is 320 MiB, as expected:
-->
This is 320 MiB, as expected:

@ -286,7 +299,7 @@ This is 320 MiB, as expected:
335544320
```

<!--
### Observability
-->
### Observability

@ -298,8 +311,11 @@ running with a defined Overhead. This functionality is not available in the 1.9
kube-state-metrics, but is expected in a following release. Users will need to build kube-state-metrics
from source in the meantime.
-->
In [kube-state-metrics](https://github.com/kubernetes/kube-state-metrics), the `kube_pod_overhead` metric
helps identify when PodOverhead is being used and helps observe the stability of workloads
running with a defined overhead.
This functionality is not available in the 1.9 release of kube-state-metrics, but is expected in a
following release. Until then, users need to build kube-state-metrics from source.

## {{% heading "whatsnext" %}}

@ -310,4 +326,3 @@ from source in the meantime.

* [RuntimeClass](/zh/docs/concepts/containers/runtime-class/)
* [PodOverhead Design](https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/688-pod-overhead)

@ -1,16 +1,18 @@
---
title: Resource Bin Packing for Extended Resources
content_type: concept
weight: 30
weight: 80
---
<!--
---
reviewers:
- bsalamat
- k82cn
- ahg-g
title: Resource Bin Packing for Extended Resources
content_type: concept
weight: 30
weight: 80
---
-->

<!-- overview -->

@ -18,7 +20,7 @@
{{< feature-state for_k8s_version="1.16" state="alpha" >}}

<!--
The kube-scheduler can be configured to enable bin packing of resources along with extended resources using `RequestedToCapacityRatioResourceAllocation` priority function. Priority functions can be used to fine-tune the kube-scheduler as per custom needs.
-->

Using the `RequestedToCapacityRatioResourceAllocation` priority function, the kube-scheduler can be configured to enable bin packing of resources along with extended resources.

@ -48,7 +50,7 @@ Kubernetes 1.16 adds a new parameter to the priority function that allows the user to specify the resources along with
(least requested) or
most requested scoring.
`resources` consists of `name` and `weight`: `name` specifies the resource to consider during scoring,
and `weight` specifies the weight of each resource.

<!--
Below is an example configuration that sets `requestedToCapacityRatioArguments` to bin packing behavior for extended resources `intel.com/foo` and `intel.com/bar`

@ -130,7 +132,7 @@ The above arguments give the node a score of 0 if utilization is 0% and 10 for u
```

<!--
It can be used to add extended resources as follows:
-->
It can be used to add extended resources as follows:

@ -249,4 +251,3 @@ CPU = resourceScoringFunction((2+6),8)
NodeScore = ((5 * 5) + (7 * 1) + (10 * 3)) / (5 + 1 + 3)
          = 7
```

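Restating the weighted average computed in the tail of this example in general terms (no new behavior, just the formula the numbers above instantiate):

```latex
\text{NodeScore} \;=\;
\frac{\sum_{i} w_i \cdot \text{resourceScoringFunction}\big(\text{requested}_i,\ \text{capacity}_i\big)}
     {\sum_{i} w_i}
```
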
@ -1,14 +1,16 @@
---
title: Scheduler Performance Tuning
content_type: concept
weight: 80
weight: 100
---
<!--
---
reviewers:
- bsalamat
title: Scheduler Performance Tuning
content_type: concept
weight: 80
weight: 100
---
-->

<!-- overview -->

@ -45,7 +47,7 @@ large Kubernetes clusters.

<!-- body -->

<!--
In large clusters, you can tune the scheduler's behaviour balancing
scheduling outcomes between latency (new Pods are placed quickly) and
accuracy (the scheduler rarely makes poor placement decisions).

@ -60,12 +62,12 @@ a threshold for scheduling nodes in your cluster.
You configure this tuning setting via the kube-scheduler setting `percentageOfNodesToScore`.
This KubeSchedulerConfiguration setting determines a threshold for scheduling nodes in your cluster.

<!--
### Setting the threshold
-->
### Setting the threshold

<!--
The `percentageOfNodesToScore` option accepts whole numeric values between 0
and 100. The value 0 is a special number which indicates that the kube-scheduler
should use its compiled-in default.
@ -77,17 +79,17 @@ had set a value of 100.
If you set `percentageOfNodesToScore` above 100, the kube-scheduler
acts as if you had set a value of 100.

<!--
To change the value, edit the
[kube-scheduler configuration file](/docs/reference/config-api/kube-scheduler-config.v1beta1/)
and then restart the scheduler.
In many cases, the configuration file can be found at `/etc/kubernetes/config/kube-scheduler.yaml`.
-->
To change the value, edit the [kube-scheduler configuration file](/zh/docs/reference/config-api/kube-scheduler-config.v1beta1/)
and then restart the scheduler.
In many cases, the configuration file can be found at `/etc/kubernetes/config/kube-scheduler.yaml`.

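For orientation, a minimal sketch of such a file might look like this (assuming the `kubescheduler.config.k8s.io/v1beta1` API that this page links to; the kubeconfig path is illustrative):

```yaml
# Minimal sketch of /etc/kubernetes/config/kube-scheduler.yaml.
apiVersion: kubescheduler.config.k8s.io/v1beta1
kind: KubeSchedulerConfiguration
clientConnection:
  kubeconfig: /etc/kubernetes/scheduler.conf   # illustrative path
percentageOfNodesToScore: 50
```
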
<!--
After you have made this change, you can run
-->
After you have made this change, you can run

@ -96,17 +98,17 @@ After you have made this change, you can run
kubectl get pods -n kube-system | grep kube-scheduler
```

<!--
to verify that the kube-scheduler component is healthy.
-->
to verify that the kube-scheduler component is healthy.

<!--
## Node scoring threshold {#percentage-of-nodes-to-score}
-->
## Node scoring threshold {#percentage-of-nodes-to-score}

<!--
To improve scheduling performance, the kube-scheduler can stop looking for
feasible nodes once it has found enough of them. In large clusters, this saves
time compared to a naive approach that would consider every node.

@ -114,7 +116,7 @@ time compared to a naive approach that would consider every node.
To improve scheduling performance, the kube-scheduler can stop looking for feasible nodes once it has found enough of them.
In large clusters, this saves time compared to a naive approach that would consider every node.

<!--
You specify a threshold for how many nodes are enough, as a whole number percentage
of all the nodes in your cluster. The kube-scheduler converts this into an
integer number of nodes. During scheduling, if the kube-scheduler has identified

@ -122,24 +124,24 @@ enough feasible nodes to exceed the configured percentage, the kube-scheduler
stops searching for more feasible nodes and moves on to the
[scoring phase](/docs/concepts/scheduling-eviction/kube-scheduler/#kube-scheduler-implementation).
-->
You specify a threshold for how many nodes are enough, as a whole number percentage of all the nodes in your cluster.
The kube-scheduler converts this into an integer number of nodes. During scheduling, if the
kube-scheduler has identified enough feasible nodes to exceed the configured percentage, it
stops searching for more feasible nodes and moves on to the
[scoring phase](/zh/docs/concepts/scheduling-eviction/kube-scheduler/#kube-scheduler-implementation).

<!--
[How the scheduler iterates over Nodes](#how-the-scheduler-iterates-over-nodes)
describes the process in detail.
-->
[How the scheduler iterates over Nodes](#how-the-scheduler-iterates-over-nodes) describes the process in detail.

<!--
### Default threshold
-->
### Default threshold

<!--
If you don't specify a threshold, Kubernetes calculates a figure using a
linear formula that yields 50% for a 100-node cluster and yields 10%
for a 5000-node cluster. The lower bound for the automatic value is 5%.
@ -147,20 +149,20 @@ for a 5000-node cluster. The lower bound for the automatic value is 5%.
If you don't specify a threshold, Kubernetes calculates a figure using a linear formula that yields 50%
for a 100-node cluster and 10% for a 5000-node cluster. The lower bound for the automatic value is 5%.

<!--
This means that, the kube-scheduler always scores at least 5% of your cluster no
matter how large the cluster is, unless you have explicitly set
`percentageOfNodesToScore` to be smaller than 5.
-->
This means that the kube-scheduler always scores at least 5% of your cluster, no matter how large
the cluster is, unless you have explicitly set `percentageOfNodesToScore` to be smaller than 5.

<!--
If you want the scheduler to score all nodes in your cluster, set
`percentageOfNodesToScore` to 100.
-->
If you want the scheduler to score all nodes in your cluster, set `percentageOfNodesToScore` to 100.

<!--
## Example
-->
## Example

@ -189,15 +191,15 @@ percentageOfNodesToScore: 50
<!--
`percentageOfNodesToScore` must be a value between 1 and 100 with the default
value being calculated based on the cluster size. There is also a hardcoded
minimum value of 50 nodes.
-->
`percentageOfNodesToScore` must be a value between 1 and 100, and its default value is calculated based on the cluster size.
There is also a hardcoded minimum value of 50 nodes.

<!--
In clusters with less than 50 feasible nodes, the scheduler still
{{< note >}} In clusters with less than 50 feasible nodes, the scheduler still
checks all the nodes because there are not enough feasible nodes to stop
the scheduler's search early.

In a small cluster, if you set a low value for `percentageOfNodesToScore`, your
change will have no or little effect, for a similar reason.

@ -294,8 +296,8 @@ After going over all the Nodes, it goes back to Node 1.
-->
After going over all the Nodes, it goes back to Node 1 and starts over.

## {{% heading "whatsnext" %}}

* See the [kube-scheduler configuration reference (v1beta1)](/zh/docs/reference/config-api/kube-scheduler-config.v1beta1/)
<!-- * Check the [kube-scheduler configuration reference (v1beta1)](/docs/reference/config-api/kube-scheduler-config.v1beta1/) -->

@ -1,15 +1,17 @@
---
title: Scheduling Framework
content_type: concept
weight: 70
weight: 90
---

<!--
---
reviewers:
- ahg-g
title: Scheduling Framework
content_type: concept
weight: 70
weight: 90
---
-->

<!-- overview -->

@ -17,17 +19,15 @@ weight: 70
{{< feature-state for_k8s_version="1.15" state="alpha" >}}

<!--
The scheduling framework is a plugable architecture for the Kubernetes Scheduler.
It adds a new set of "plugin" APIs to the existing scheduler. Plugins are compiled
into the scheduler. The APIs allow most scheduling features to be implemented as
plugins, while keeping the
scheduling "core" simple and maintainable. Refer to the [design proposal of the
The scheduling framework is a pluggable architecture for the Kubernetes scheduler.
It adds a new set of "plugin" APIs to the existing scheduler. Plugins are compiled into the scheduler. The APIs allow most scheduling features to be implemented as plugins, while keeping the
scheduling "core" lightweight and maintainable. Refer to the [design proposal of the
scheduling framework][kep] for more technical information on the design of the
framework.
-->

The scheduling framework is a pluggable architecture for the Kubernetes scheduler;
it adds a new set of "plugin" APIs to the existing scheduler. Plugins are compiled into the scheduler.
These APIs allow most scheduling features to be implemented as plugins, while keeping the scheduling "core" simple and maintainable.
Refer to the [design proposal of the scheduling framework](https://github.com/kubernetes/enhancements/blob/master/keps/sig-scheduling/624-scheduling-framework/README.md)
for more technical information on the design of the framework.

@ -98,7 +98,7 @@ stateful tasks.
-->
A plugin may register at multiple extension points to perform more complex or stateful tasks.

<!--
{{< figure src="/images/docs/scheduling-framework-extensions.png" title="scheduling framework extension points" >}}
-->
{{< figure src="/images/docs/scheduling-framework-extensions.png" title="scheduling framework extension points" >}}

@ -163,12 +163,12 @@ tries to make the pod schedulable by preempting other Pods.
the rest of the plugins are not called. A typical PostFilter implementation is preemption, which
tries to make the Pod schedulable by preempting the resources of other Pods.

<!--
### PreScore {#pre-score}
-->
### PreScore {#pre-score}

<!--
These plugins are used to perform "pre-scoring" work, which generates a sharable
state for Score plugins to use. If a PreScore plugin returns an error, the
scheduling cycle is aborted.

@ -176,7 +176,7 @@ scheduling cycle is aborted.
PreScore plugins are used to perform "pre-scoring" work: they generate a sharable state for the Score plugins to use.
If a PreScore plugin returns an error, the scheduling cycle is aborted.

<!--
### Score {#scoring}
-->
### Score {#scoring}
@ -325,17 +325,17 @@ _Permit_ plugins are invoked at the end of the scheduling cycle for each Pod, to prevent or delay
is returned to the scheduling queue, which triggers the [Unreserve](#unreserve) plugins.

<!--
While any plugin can access the list of "waiting" Pods and approve them
(see [`FrameworkHandle`](https://git.k8s.io/enhancements/keps/sig-scheduling/624-scheduling-framework#frameworkhandle)), we expect only the permit
plugins to approve binding of reserved Pods that are in "waiting" state. Once a Pod
is approved, it is sent to the [PreBind](#pre-bind) phase.
-->
{{< note >}}
While any plugin can access the list of "waiting" Pods and approve them
(see [`FrameworkHandle`](https://git.k8s.io/enhancements/keps/sig-scheduling/624-scheduling-framework#frameworkhandle)),
we expect only the permit plugins to approve the binding of reserved Pods that are in the "waiting" state.
Once a Pod is approved, it is sent to the [PreBind](#pre-bind) phase.
{{< /note >}}

@ -444,7 +444,7 @@ type PreFilterPlugin interface {
-->
# Plugin Configuration

<!--
You can enable or disable plugins in the scheduler configuration. If you are using
Kubernetes v1.18 or later, most scheduling
[plugins](/docs/reference/scheduling/config/#scheduling-plugins) are in use and

@ -455,7 +455,7 @@ enabled by default.
[plugins](/zh/docs/reference/scheduling/config/#scheduling-plugins)
are in use and enabled by default.

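As an illustrative sketch only (the profile layout follows the v1beta1 configuration API; the plugin names are examples from this release line, not recommendations), enabling and disabling plugins might look like:

```yaml
# Sketch: disable one score plugin and enable another for the default profile.
apiVersion: kubescheduler.config.k8s.io/v1beta1
kind: KubeSchedulerConfiguration
profiles:
  - schedulerName: default-scheduler
    plugins:
      score:
        disabled:
          - name: NodeResourcesLeastAllocated   # example plugin name
        enabled:
          - name: NodeResourcesMostAllocated    # example plugin name
```
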
<!--
In addition to default plugins, you can also implement your own scheduling
plugins and get them configured along with default plugins. You can visit
[scheduler-plugins](https://github.com/kubernetes-sigs/scheduler-plugins) for more details.

@ -464,7 +464,7 @@ plugins and get them configured along with default plugins. You can visit
You can visit [scheduler-plugins](https://github.com/kubernetes-sigs/scheduler-plugins)
for more details.

<!--
If you are using Kubernetes v1.18 or later, you can configure a set of plugins as
a scheduler profile and then define multiple profiles to fit various kinds of workload.
Learn more at [multiple profiles](/docs/reference/scheduling/config/#multiple-profiles).

@ -472,4 +472,3 @@ Learn more at [multiple profiles](/docs/reference/scheduling/config/#multiple-pr
If you are using Kubernetes v1.18 or later, you can configure a set of plugins as
a scheduler profile and then define multiple profiles to fit various kinds of workload.
Learn more at [multiple profiles](/zh/docs/reference/scheduling/config/#multiple-profiles).

@ -357,9 +357,9 @@ true. The following taints are built in:
  the NodeCondition `Ready` being "`False`".
* `node.kubernetes.io/unreachable`: Node is unreachable from the node
  controller. This corresponds to the NodeCondition `Ready` being "`Unknown`".
* `node.kubernetes.io/out-of-disk`: Node becomes out of disk.
* `node.kubernetes.io/memory-pressure`: Node has memory pressure.
* `node.kubernetes.io/disk-pressure`: Node has disk pressure.
* `node.kubernetes.io/pid-pressure`: Node has PID pressure.
* `node.kubernetes.io/network-unavailable`: Node's network is unavailable.
* `node.kubernetes.io/unschedulable`: Node is unschedulable.
* `node.cloudprovider.kubernetes.io/uninitialized`: When the kubelet is started

@ -371,9 +371,9 @@ true. The following taints are built in:

* `node.kubernetes.io/not-ready`: Node is not ready. This corresponds to the NodeCondition `Ready` being "`False`".
* `node.kubernetes.io/unreachable`: Node is unreachable from the node controller. This corresponds to the NodeCondition `Ready` being "`Unknown`".
* `node.kubernetes.io/out-of-disk`: Node runs out of disk.
* `node.kubernetes.io/memory-pressure`: Node has memory pressure.
* `node.kubernetes.io/disk-pressure`: Node has disk pressure.
* `node.kubernetes.io/pid-pressure`: Node has PID pressure.
* `node.kubernetes.io/network-unavailable`: Node's network is unavailable.
* `node.kubernetes.io/unschedulable`: Node is unschedulable.
* `node.cloudprovider.kubernetes.io/uninitialized`: When the kubelet is started with an "external" cloud provider,
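To connect this list to the Pod side, a toleration for one of these built-in taints could look like the following sketch of a Pod spec fragment; the `tolerationSeconds` value is arbitrary:

```yaml
# Sketch (fragment of a Pod spec): tolerate the built-in unreachable taint
# for a limited time before the Pod is evicted.
tolerations:
- key: "node.kubernetes.io/unreachable"
  operator: "Exists"
  effect: "NoExecute"
  tolerationSeconds: 6000   # illustrative value
```
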
@ -486,7 +486,7 @@ breaking.

* `node.kubernetes.io/memory-pressure`
* `node.kubernetes.io/disk-pressure`
* `node.kubernetes.io/out-of-disk` (*only for critical pods*)
* `node.kubernetes.io/pid-pressure` (1.14 or later)
* `node.kubernetes.io/unschedulable` (1.10 or later)
* `node.kubernetes.io/network-unavailable` (*host network only*)
-->

@ -498,7 +498,7 @@ The DaemonSet controller automatically adds the following `NoSchedule` tolerations to all daemons

* `node.kubernetes.io/memory-pressure`
* `node.kubernetes.io/disk-pressure`
* `node.kubernetes.io/out-of-disk` (*only for critical pods*)
* `node.kubernetes.io/pid-pressure` (1.14 or later)
* `node.kubernetes.io/unschedulable` (1.10 or later)
* `node.kubernetes.io/network-unavailable` (*host network only*)
