Merge pull request #49098 from my-git9/pp-2457

[zh-cn]sync dynamic-resource-allocation.md
pull/49127/head
Kubernetes Prow Robot 2024-12-17 02:56:52 +01:00 committed by GitHub
commit 90c85798be
1 changed files with 228 additions and 155 deletions


@ -2,6 +2,15 @@
title: 动态资源分配
content_type: concept
weight: 65
api_metadata:
- apiVersion: "resource.k8s.io/v1beta1"
  kind: "ResourceClaim"
- apiVersion: "resource.k8s.io/v1beta1"
  kind: "ResourceClaimTemplate"
- apiVersion: "resource.k8s.io/v1beta1"
  kind: "DeviceClass"
- apiVersion: "resource.k8s.io/v1beta1"
  kind: "ResourceSlice"
---
<!--
reviewers:
@ -10,22 +19,21 @@ reviewers:
title: Dynamic Resource Allocation
content_type: concept
weight: 65
api_metadata:
- apiVersion: "resource.k8s.io/v1beta1"
  kind: "ResourceClaim"
- apiVersion: "resource.k8s.io/v1beta1"
  kind: "ResourceClaimTemplate"
- apiVersion: "resource.k8s.io/v1beta1"
  kind: "DeviceClass"
- apiVersion: "resource.k8s.io/v1beta1"
  kind: "ResourceSlice"
-->
<!-- overview -->
<!--
Core Dynamic Resource Allocation with structured parameters:
-->
使用结构化参数进行核心动态资源分配:
{{< feature-state feature_gate_name="DynamicResourceAllocation" >}}
<!--
Dynamic Resource Allocation with control plane controller:
-->
使用控制平面控制器进行动态资源分配:
{{< feature-state feature_gate_name="DRAControlPlaneController" >}}
<!--
@ -39,9 +47,6 @@ responsible for tracking and preparing resources, with allocation of
resources handled by Kubernetes via _structured parameters_ (introduced in Kubernetes 1.30).
Different kinds of resources support arbitrary parameters for defining requirements and
initialization.
When a driver provides a _control plane controller_, the driver itself
handles allocation in cooperation with the Kubernetes scheduler.
-->
动态资源分配是一个用于在 Pod 之间和 Pod 内部容器之间请求和共享资源的 API。
它是持久卷 API 针对一般资源的泛化。通常这些资源是 GPU 这类设备。
@ -50,7 +55,15 @@ handles allocation in cooperation with the Kubernetes scheduler.
Kubernetes 通过**结构化参数**(在 Kubernetes 1.30 中引入)处理资源的分配。
不同类别的资源支持任意参数来定义要求和初始化。
当驱动程序提供**控制平面控制器**时,驱动程序本身与 Kubernetes 调度器合作一起处理资源分配。
<!--
Kubernetes v1.26 through to 1.31 included an (alpha) implementation of _classic DRA_,
which is no longer supported. This documentation, which is for Kubernetes
v{{< skew currentVersion >}}, explains the current approach to dynamic resource
allocation within Kubernetes.
-->
Kubernetes v1.26 至 1.31 包含了**经典 DRA** 的 Alpha 实现,该实现已不再支持。
本文档适用于 Kubernetes v{{< skew currentVersion >}},解释了 Kubernetes
中当前的动态资源分配方法。
## {{% heading "prerequisites" %}}
@ -72,10 +85,10 @@ Kubernetes v{{< skew currentVersion >}} 包含用于动态资源分配的集群
## API
<!--
The `resource.k8s.io/v1alpha3`
The `resource.k8s.io/v1beta1`
{{< glossary_tooltip text="API group" term_id="api-group" >}} provides these types:
-->
`resource.k8s.io/v1alpha3`
`resource.k8s.io/v1beta1`
{{< glossary_tooltip text="API 组" term_id="api-group" >}}
提供了以下类型:
@ -117,43 +130,21 @@ DeviceClass
对 ResourceClaim 中某个设备的每个分配请求都必须准确引用一个 DeviceClass。
<!--
PodSchedulingContext
: Used internally by the control plane and resource drivers
to coordinate pod scheduling when ResourceClaims need to be allocated
for a Pod and those ResourceClaims use a control plane controller.
ResourceSlice
: Used with structured parameters to publish information about resources
that are available in the cluster.
-->
PodSchedulingContext
: 供控制平面和资源驱动程序内部使用,
在需要为 Pod 分配 ResourceClaim 且这些 ResourceClaim 使用控制平面控制器时协调 Pod 调度。
ResourceSlice
: 与结构化参数一起使用,以发布有关集群中可用资源的信息。
: 用于 DRA 驱动程序发布关于集群中可用资源的信息。
<!--
The developer of a resource driver decides whether they want to handle
allocation themselves with a control plane controller or instead rely on allocation
through Kubernetes with structured parameters. A
custom controller provides more flexibility, but cluster autoscaling is not
going to work reliably for node-local resources. Structured parameters enable
cluster autoscaling, but might not satisfy all use-cases.
All parameters that select devices are defined in the ResourceClaim and
DeviceClass with in-tree types. Configuration parameters can be embedded there.
Which configuration parameters are valid depends on the DRA driver -- Kubernetes
only passes them through without interpreting them.
-->
资源驱动程序的开发者决定他们是要使用控制平面控制器自己处理资源分配,
还是依赖 Kubernetes 使用结构化参数来处理资源分配。
自定义控制器提供更多的灵活性,但对于节点本地资源,集群自动扩缩可能无法可靠工作。
结构化参数使集群自动扩缩成为可能,但可能无法满足所有使用场景。
<!--
When a driver uses structured parameters, all parameters that select devices
are defined in the ResourceClaim and DeviceClass with in-tree types. Configuration
parameters can be embedded there as arbitrary JSON objects.
-->
当驱动程序使用结构化参数时,所有选择设备的参数都在
ResourceClaim 和 DeviceClass 中以树内类型被定义。
配置参数可以作为任意 JSON 对象嵌入其中。
所有选择设备的参数都在 ResourceClaim 和 DeviceClass 中使用内置类型定义。
其中可以嵌入配置参数。哪些配置参数有效取决于 DRA 驱动程序 —— Kubernetes 只是将它们传递下去而不进行解释。
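<!--
As a minimal sketch of the paragraph above (not a definitive manifest), the following
ResourceClaim selects a device with a CEL expression and embeds driver-specific
configuration. The claim name, the `color` attribute, and everything under `parameters`
are hypothetical; only the DRA driver defines which parameters are valid.
-->
下面是对上述说明的一个最简示意(并非权威清单):这个 ResourceClaim 通过 CEL 表达式选择设备,
并嵌入特定于驱动程序的配置。其中的申领名称、`color` 属性以及 `parameters`
之下的所有内容均为假设;哪些参数有效仅由 DRA 驱动程序决定。

```yaml
apiVersion: resource.k8s.io/v1beta1
kind: ResourceClaim
metadata:
  name: example-claim            # 假设的名称
spec:
  devices:
    requests:
    - name: req-0
      deviceClassName: resource.example.com
      selectors:
      - cel:
          # 选择参数:使用内置类型定义,由 Kubernetes 负责求值
          expression: device.attributes["resource-driver.example.com"].color == "black"
    config:
    - requests: ["req-0"]
      opaque:
        driver: resource-driver.example.com
        # 配置参数:Kubernetes 不做解释,原样传递给驱动程序
        parameters:
          apiVersion: resource-driver.example.com/v1   # 假设的类型
          kind: ExampleDeviceConfig                    # 假设的类型
          mode: shared                                 # 假设的参数
```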
<!--
The `core/v1` `PodSpec` defines ResourceClaims that are needed for a Pod in a
@ -184,7 +175,7 @@ will get created for this Pod and each container gets access to one of them.
该示例将为此 Pod 创建两个 ResourceClaim 对象,每个容器都可以访问其中一个。
```yaml
apiVersion: resource.k8s.io/v1alpha3
apiVersion: resource.k8s.io/v1beta1
kind: DeviceClass
name: resource.example.com
spec:
@ -192,7 +183,7 @@ spec:
- cel:
expression: device.driver == "resource-driver.example.com"
---
apiVersion: resource.k8s.io/v1alpha2
apiVersion: resource.k8s.io/v1beta1
kind: ResourceClaimTemplate
metadata:
name: large-black-cat-claim-template
@ -238,93 +229,19 @@ spec:
-->
## 调度 {#scheduling}
<!--
### With control plane controller
-->
### 使用控制平面控制器 {#with-control-plane-controller}
<!--
In contrast to native resources (CPU, RAM) and extended resources (managed by a
device plugin, advertised by kubelet), without structured parameters
the scheduler has no knowledge of what
dynamic resources are available in a cluster or how they could be split up to
satisfy the requirements of a specific ResourceClaim. Resource drivers are
responsible for that. They mark ResourceClaims as "allocated" once resources
for it are reserved. This also then tells the scheduler where in the cluster a
ResourceClaim is available.
-->
与原生资源(CPU、RAM)和扩展资源(由设备插件管理并由 kubelet 公布)不同,
如果没有结构化参数,调度器无法知道集群中有哪些动态资源,
也不知道如何将它们拆分以满足特定 ResourceClaim 的要求。
资源驱动程序负责这些任务。
资源驱动程序在为 ResourceClaim 保留资源后将其标记为“已分配(Allocated)”。
然后告诉调度器集群中可用的 ResourceClaim 的位置。
<!--
When a pod gets scheduled, the scheduler checks all ResourceClaims needed by a Pod and
creates a PodScheduling object where it informs the resource drivers
responsible for those ResourceClaims about nodes that the scheduler considers
suitable for the Pod. The resource drivers respond by excluding nodes that
don't have enough of the driver's resources left. Once the scheduler has that
information, it selects one node and stores that choice in the PodScheduling
object. The resource drivers then allocate their ResourceClaims so that the
resources will be available on that node. Once that is complete, the Pod
gets scheduled.
-->
当 Pod 被调度时,调度器检查 Pod 所需的所有 ResourceClaim,并创建一个 PodScheduling 对象,
通知负责这些 ResourceClaim 的资源驱动程序,告知它们调度器认为适合该 Pod 的节点。
资源驱动程序通过排除没有足够剩余资源的节点来响应调度器。
一旦调度器有了这些信息,它就会选择一个节点,并将该选择存储在 PodScheduling 对象中。
然后,资源驱动程序为其分配 ResourceClaim,以便资源可用于该节点。
完成后,Pod 就会被调度。
<!--
As part of this process, ResourceClaims also get reserved for the
Pod. Currently ResourceClaims can either be used exclusively by a single Pod or
an unlimited number of Pods.
-->
作为此过程的一部分,ResourceClaim 会为 Pod 保留。
目前,ResourceClaim 可以由单个 Pod 独占使用或不限数量的多个 Pod 使用。
<!--
One key feature is that Pods do not get scheduled to a node unless all of
their resources are allocated and reserved. This avoids the scenario where a Pod
gets scheduled onto one node and then cannot run there, which is bad because
such a pending Pod also blocks all other resources like RAM or CPU that were
set aside for it.
-->
除非 Pod 的所有资源都已分配和保留,否则 Pod 不会被调度到节点,这是一个重要特性。
这避免了 Pod 被调度到一个节点但无法在那里运行的情况,
这种情况很糟糕,因为被挂起 Pod 也会阻塞为其保留的其他资源,如 RAM 或 CPU。
{{< note >}}
<!--
Scheduling of pods which use ResourceClaims is going to be slower because of
the additional communication that is required. Beware that this may also impact
pods that don't use ResourceClaims because only one pod at a time gets
scheduled, blocking API calls are made while handling a pod with
ResourceClaims, and thus scheduling the next pod gets delayed.
-->
由于需要额外的通信,使用 ResourceClaim 的 Pod 的调度将会变慢。
请注意,这也可能会影响不使用 ResourceClaim 的 Pod,因为一次仅调度一个
Pod,在使用 ResourceClaim 处理 Pod 时会进行阻塞 API 调用,
从而推迟调度下一个 Pod。
{{< /note >}}
<!--
### With structured parameters
-->
### 使用结构化参数 {#with-structured-parameters}
<!--
When a driver uses structured parameters, the scheduler takes over the
responsibility of allocating resources to a ResourceClaim whenever a pod needs
The scheduler is responsible for allocating resources to a ResourceClaim whenever a pod needs
them. It does so by retrieving the full list of available resources from
ResourceSlice objects, tracking which of those resources have already been
allocated to existing ResourceClaims, and then selecting from those resources
that remain.
-->
当驱动程序使用结构化参数时,调度器负责在 Pod 需要资源时为 ResourceClaim 分配资源。
调度器负责在 Pod 需要资源时为 ResourceClaim 分配资源。
通过从 ResourceSlice 对象中检索可用资源的完整列表,
跟踪已分配给现有 ResourceClaim 的资源,然后从剩余的资源中进行选择。
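<!--
To illustrate what the scheduler reads during allocation, here is a hedged sketch of a
ResourceSlice that a DRA driver might publish for one node. The object name, node name,
pool, device name, attribute, and capacity are all hypothetical values chosen for
illustration.
-->
为了说明调度器在分配时读取的内容,下面给出 DRA 驱动程序可能为某个节点发布的
ResourceSlice 的示意。其中的对象名称、节点名称、资源池、设备名称、属性和容量均为假设的示例值。

```yaml
apiVersion: resource.k8s.io/v1beta1
kind: ResourceSlice
metadata:
  name: example-node-slice       # 假设的名称
spec:
  driver: resource-driver.example.com
  nodeName: example-node         # 假设的节点
  pool:
    name: example-node
    generation: 1
    resourceSliceCount: 1
  devices:
  - name: device-0               # 假设的设备
    basic:
      attributes:
        color:
          string: black
      capacity:
        memory:
          value: 8Gi
```

<!--
A CEL selector in a ResourceClaim or DeviceClass is evaluated against device entries
like `device-0`; devices already allocated to other ResourceClaims are skipped.
-->
ResourceClaim 或 DeviceClass 中的 CEL 选择算符会针对类似 `device-0`
的设备条目进行求值;已分配给其他 ResourceClaim 的设备会被跳过。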
@ -392,18 +309,13 @@ later.
Such a situation can also arise when support for dynamic resource allocation
was not enabled in the scheduler at the time when the Pod got scheduled
(version skew, configuration, feature gate, etc.). kube-controller-manager
detects this and tries to make the Pod runnable by triggering allocation and/or
reserving the required ResourceClaims.
detects this and tries to make the Pod runnable by reserving the required
ResourceClaims. However, this only works if those were allocated by
the scheduler for some other pod.
-->
这种情况也可能发生在 Pod 被调度时调度器中未启用动态资源分配支持的时候(原因可能是版本偏差、配置、特性门控等)。
kube-controller-manager 能够检测到这一点,并尝试通过触发分配和/或预留所需的 ResourceClaim 来使 Pod 可运行。
{{< note >}}
<!--
This only works with resource drivers that don't use structured parameters.
-->
这仅适用于不使用结构化参数的资源驱动程序。
{{< /note >}}
kube-controller-manager 能够检测到这一点,并尝试通过预留所需的一些 ResourceClaim 来使 Pod 可运行。
然而,这只有在这些 ResourceClaim 已经被调度器为其他 Pod 分配的情况下才有效。
<!--
It is better to avoid bypassing the scheduler because a Pod that is assigned to a node
@ -433,34 +345,180 @@ the `.spec.nodeName` field and to use a node selector instead.
-->
你还可以在准入时变更传入的 Pod,取消设置 `.spec.nodeName` 字段,并改为使用节点选择算符。
<!--
## Admin access
-->
## 管理性质的访问 {#admin-access}
{{< feature-state feature_gate_name="DRAAdminAccess" >}}
<!--
You can mark a request in a ResourceClaim or ResourceClaimTemplate as having privileged features.
A request with admin access grants access to devices which are in use and
may enable additional permissions when making the device available in a
container:
-->
你可以在 ResourceClaim 或 ResourceClaimTemplate 中标记一个请求为具有特权特性。
具有管理员访问权限的请求可以允许用户访问使用中的设备,并且在将设备提供给容器时可能授权一些额外的访问权限:
```yaml
apiVersion: resource.k8s.io/v1beta1
kind: ResourceClaimTemplate
metadata:
  name: large-black-cat-claim-template
spec:
  spec:
    devices:
      requests:
      - name: req-0
        deviceClassName: resource.example.com
        adminAccess: true
```
<!--
If this feature is disabled, the `adminAccess` field will be removed
automatically when creating such a ResourceClaim.
Admin access is a privileged mode which should not be made available to normal
users in a multi-tenant cluster. Cluster administrators can restrict usage of
this feature by installing a validating admission policy similar to the following
example. Cluster administrators need to adapt at least the names and replace
"dra.example.com".
-->
如果此特性被禁用,创建此类 ResourceClaim 时将自动移除 `adminAccess` 字段。
管理性质访问是一种特权模式,在多租户集群中不应该对普通用户开放。
集群管理员可以通过安装类似于以下示例的验证准入策略来限制哪些负载能够使用此特性。
集群管理员至少需要调整 name 属性并将 "dra.example.com" 替换为有意义的值。
<!--
# Permission to use admin access is granted only in namespaces which have the
# "admin-access.dra.example.com" label. Other ways of making that decision are
# also possible.
-->
```yaml
# 仅将管理性质访问权限授予具有 "admin-access.dra.example.com" 标签的命名空间。
# 也可以采用其他方式做出此类决定。
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: resourceclaim-policy.dra.example.com
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
    - apiGroups: ["resource.k8s.io"]
      apiVersions: ["v1alpha3", "v1beta1"]
      operations: ["CREATE", "UPDATE"]
      resources: ["resourceclaims"]
  validations:
  - expression: '! object.spec.devices.requests.exists(e, has(e.adminAccess) && e.adminAccess)'
    reason: Forbidden
    messageExpression: '"admin access to devices not enabled"'
---
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: resourceclaim-binding.dra.example.com
spec:
  policyName: resourceclaim-policy.dra.example.com
  validationActions: [Deny]
  matchResources:
    namespaceSelector:
      matchExpressions:
      - key: admin-access.dra.example.com
        operator: DoesNotExist
---
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: resourceclaimtemplate-policy.dra.example.com
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
    - apiGroups: ["resource.k8s.io"]
      apiVersions: ["v1alpha3", "v1beta1"]
      operations: ["CREATE", "UPDATE"]
      resources: ["resourceclaimtemplates"]
  validations:
  - expression: '! object.spec.spec.devices.requests.exists(e, has(e.adminAccess) && e.adminAccess)'
    reason: Forbidden
    messageExpression: '"admin access to devices not enabled"'
---
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicyBinding
metadata:
  name: resourceclaimtemplate-binding.dra.example.com
spec:
  policyName: resourceclaimtemplate-policy.dra.example.com
  validationActions: [Deny]
  matchResources:
    namespaceSelector:
      matchExpressions:
      - key: admin-access.dra.example.com
        operator: DoesNotExist
```
<!--
## ResourceClaim Device Status
-->
## ResourceClaim 设备状态 {#resourceclaim-device-status}
{{< feature-state feature_gate_name="DRAResourceClaimDeviceStatus" >}}
<!--
The drivers can report driver-specific device status data for each allocated device
in a resource claim. For example, IPs assigned to a network interface device can be
reported in the ResourceClaim status.
Because the drivers set the status, the accuracy of the information depends on the implementation
of those DRA drivers. Therefore, the reported status of the device may not always reflect
real-time changes in the state of the device.
When the feature is disabled, that field automatically gets cleared when storing the ResourceClaim.
A ResourceClaim device status is supported when it is possible, from a DRA driver, to update an
existing ResourceClaim where the `status.devices` field is set.
-->
驱动程序可以报告资源申领中各个已分配设备的、特定于驱动程序的设备状态。
例如,可以在 ResourceClaim 状态中报告分配给网络接口设备的 IP。
驱动程序设置状态,信息的准确性取决于 DRA 驱动程序的具体实现。因此,所报告的设备状态可能并不总是反映设备状态的实时变化。
当此特性被禁用时,该字段会在存储 ResourceClaim 时自动清除。
针对一个已经设置了 `status.devices` 字段的现有 ResourceClaim 而言,如果 DRA
驱动能够更新该 ResourceClaim,则有可能支持 ResourceClaim 设备状态这一特性。
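<!--
As a rough sketch only, the status of a ResourceClaim whose driver reports the IPs of a
network interface device might look like the fragment below. The pool, device,
interface name, and addresses are hypothetical; what a driver actually reports depends
on its implementation.
-->
仅作为粗略示意,如果驱动程序报告了网络接口设备的 IP,ResourceClaim 的状态可能类似于下面的片段。
其中的资源池、设备、接口名称和地址均为假设;驱动程序实际报告的内容取决于其具体实现。

```yaml
status:
  devices:
  - driver: resource-driver.example.com
    pool: example-node           # 假设的资源池
    device: device-0             # 假设的设备
    networkData:
      interfaceName: eth1        # 假设的接口名称
      ips:
      - 10.9.8.7/24              # 假设的地址
      hardwareAddress: "ea:9f:cb:40:b1:7b"
```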
<!--
## Enabling dynamic resource allocation
-->
## 启用动态资源分配 {#enabling-dynamic-resource-allocation}
<!--
Dynamic resource allocation is an *alpha feature* and only enabled when the
<!--
Dynamic resource allocation is a *beta feature* which is off by default and only enabled when the
`DynamicResourceAllocation` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
and the `resource.k8s.io/v1alpha3` {{< glossary_tooltip text="API group" term_id="api-group" >}}
and the `resource.k8s.io/v1beta1` {{< glossary_tooltip text="API group" term_id="api-group" >}}
are enabled. For details on that, see the `--feature-gates` and `--runtime-config`
[kube-apiserver parameters](/docs/reference/command-line-tools-reference/kube-apiserver/).
kube-scheduler, kube-controller-manager and kubelet also need the feature gate.
-->
动态资源分配是一个 **Alpha 特性**,只有在启用 `DynamicResourceAllocation`
动态资源分配是一个 **Beta 特性**,默认关闭,只有在启用 `DynamicResourceAllocation`
[特性门控](/zh-cn/docs/reference/command-line-tools-reference/feature-gates/)
和 `resource.k8s.io/v1alpha3`
和 `resource.k8s.io/v1beta1`
{{< glossary_tooltip text="API 组" term_id="api-group" >}} 时才启用。
有关详细信息,参阅 `--feature-gates` 和 `--runtime-config`
[kube-apiserver 参数](/zh-cn/docs/reference/command-line-tools-reference/kube-apiserver/)。
kube-scheduler、kube-controller-manager 和 kubelet 也需要设置该特性门控。
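<!--
A minimal sketch, assuming kube-apiserver runs as a static Pod whose manifest you can
edit (the file layout is an assumption and depends on how the cluster was deployed).
Only the two flags are taken from the text above; all other fields of the manifest are
omitted.
-->
下面是一个最简示意,假设 kube-apiserver 以静态 Pod 方式运行且你可以编辑其清单
(文件布局是假设的,取决于集群的部署方式)。只有这两个参数来自上文,清单中的其他字段均已省略。

```yaml
# kube-apiserver 静态 Pod 清单片段(仅示意)
spec:
  containers:
  - name: kube-apiserver
    command:
    - kube-apiserver
    # 启用特性门控
    - --feature-gates=DynamicResourceAllocation=true
    # 启用 resource.k8s.io/v1beta1 API 组
    - --runtime-config=resource.k8s.io/v1beta1=true
```

<!--
kube-scheduler, kube-controller-manager and kubelet need the same feature gate through
their own configuration.
-->
kube-scheduler、kube-controller-manager 和 kubelet 需要通过各自的配置设置同一特性门控。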
<!--
When a resource driver uses a control plane controller, then the
`DRAControlPlaneController` feature gate has to be enabled in addition to
When a resource driver reports the status of the devices, then the
`DRAResourceClaimDeviceStatus` feature gate has to be enabled in addition to
`DynamicResourceAllocation`.
-->
当资源驱动程序使用控制平面控制器时,除了需要启用 `DynamicResourceAllocation` 外,
还必须启用 `DRAControlPlaneController` 特性门控。
当资源驱动程序报告设备状态时,除了需要启用 `DynamicResourceAllocation` 外,
还必须启用 `DRAResourceClaimDeviceStatus` 特性门控。
<!--
A quick check whether a Kubernetes cluster supports the feature is to list
@ -491,15 +549,6 @@ If not supported, this error is printed instead:
error: the server doesn't have a resource type "deviceclasses"
```
<!--
A control plane controller is supported when it is possible to create a
ResourceClaim where the `spec.controller` field is set. When the
`DRAControlPlaneController` feature is disabled, that field automatically
gets cleared when storing the ResourceClaim.
-->
当可以创建设置了 `spec.controller` 字段的 ResourceClaim 时,控制平面控制器是受支持的。
当 `DRAControlPlaneController` 特性被禁用时,存储 ResourceClaim 时该字段会自动被清除。
<!--
The default configuration of kube-scheduler enables the "DynamicResources"
plugin if and only if the feature gate is enabled and when using
@ -516,14 +565,38 @@ be installed. Please refer to the driver's documentation for details.
除了在集群中启用该功能外,还必须安装资源驱动程序。
欲了解详细信息,请参阅驱动程序的文档。
<!--
### Enabling admin access
[Admin access](#admin-access) is an *alpha feature* and only enabled when the
`DRAAdminAccess` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
is enabled in the kube-apiserver and kube-scheduler.
-->
### 启用管理性质访问 {#enabling-admin-access}
[管理性质访问](#admin-access) 是一个 **Alpha 级别特性**,仅在 kube-apiserver 和 kube-scheduler
中启用了 `DRAAdminAccess` [特性门控](/zh-cn/docs/reference/command-line-tools-reference/feature-gates/)时才生效。
<!--
### Enabling Device Status
[ResourceClaim Device Status](#resourceclaim-device-status) is an *alpha feature*
and only enabled when the `DRAResourceClaimDeviceStatus`
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
is enabled in the kube-apiserver.
-->
### 启用设备状态 {#enabling-device-status}
[ResourceClaim 设备状态](#resourceclaim-device-status) 是一个 **Alpha 级别特性**,
仅在 kube-apiserver 中启用了 `DRAResourceClaimDeviceStatus`
[特性门控](/zh-cn/docs/reference/command-line-tools-reference/feature-gates/)时才生效。
## {{% heading "whatsnext" %}}
<!--
- For more information on the design, see the
[Dynamic Resource Allocation with Structured Parameters](https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/4381-dra-structured-parameters)
and the
[Dynamic Resource Allocation with Control Plane Controller](https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/3063-dynamic-resource-allocation/README.md) KEPs.
KEP.
-->
- 了解更多该设计的信息,
参阅[使用结构化参数的动态资源分配 KEP](https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/4381-dra-structured-parameters)
和[使用控制平面控制器的动态资源分配 KEP](https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/3063-dynamic-resource-allocation/README.md)。
参阅[使用结构化参数的动态资源分配 KEP](https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/4381-dra-structured-parameters)。