---
approvers:
title: Device Plugins
description: Use the Kubernetes device plugin framework to implement plugins for GPUs, NICs, FPGAs, InfiniBand, and similar resources that require vendor-specific setup.
---

{% include feature-state-alpha.md %}

{% capture overview %}
Starting in version 1.8, Kubernetes provides a
[device plugin framework](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/resource-management/device-plugin.md)
for vendors to advertise their resources to the kubelet without changing Kubernetes core code.
Instead of writing custom Kubernetes code, vendors can implement a device plugin that can
be deployed manually or as a DaemonSet. The targeted devices include GPUs,
high-performance NICs, FPGAs, InfiniBand adapters, and other similar computing resources
that may require vendor-specific initialization and setup.
{% endcapture %}

{% capture body %}

## Device plugin registration

The device plugins feature is gated by the `DevicePlugins` feature gate and is disabled by default.
When the device plugins feature is enabled, the kubelet exports a `Registration` gRPC service:

```gRPC
service Registration {
  rpc Register(RegisterRequest) returns (Empty) {}
}
```
A device plugin can register itself with the kubelet through this gRPC service.
During registration, the device plugin needs to send:

* The name of its Unix socket.
* The Device Plugin API version against which it was built.
* The `ResourceName` it wants to advertise. Here `ResourceName` needs to follow the
  [extended resource naming scheme](https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#extended-resources)
  as `vendor-domain/resource`.
  For example, an NVIDIA GPU is advertised as `nvidia.com/gpu`.

Following a successful registration, the device plugin sends the kubelet the
list of devices it manages, and the kubelet is then in charge of advertising those
resources to the API server as part of the kubelet node status update.
For example, after a device plugin registers `vendor-domain/foo` with the kubelet
and reports two healthy devices on a node, the node status is updated
to advertise 2 `vendor-domain/foo`.

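
To illustrate the node status update described above, the following is a sketch of the relevant part of a Node object after the plugin reports two healthy devices. It assumes the hypothetical resource name `vendor-domain/foo` from the example. The `cpu` and `memory` values are made-up placeholders, and a real Node object contains many more fields.

```yaml
# Excerpt of a Node object, for example from `kubectl get node <node-name> -o yaml`.
# Only vendor-domain/foo comes from the example above; cpu and memory are placeholders.
status:
  capacity:
    cpu: "4"
    memory: 16Gi
    vendor-domain/foo: "2"
  allocatable:
    cpu: "4"
    memory: 16Gi
    vendor-domain/foo: "2"
```
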
Then, users can request devices in a
[Container](/docs/api-reference/{{page.version}}/#container-v1-core)
specification as they request other types of resources, with the following limitations:

* Extended resources are only supported as integer resources and cannot be overcommitted.
* Devices cannot be shared among Containers.

Suppose a Kubernetes cluster is running a device plugin that advertises the resource `vendor-domain/resource`
on certain nodes. Here is an example of a user Pod requesting this resource:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: demo-pod
spec:
  containers:
    -
      name: demo-container-1
      image: gcr.io/google_containers/pause:2.0
      resources:
        limits:
          vendor-domain/resource: 2 # requesting 2 vendor-domain/resource
```

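
Because devices cannot be shared among Containers, a Pod whose containers each need a device requests one per container. The following is a minimal sketch under that assumption. The Pod and container names are hypothetical, and the image is reused from the example above.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: demo-pod-two-containers # hypothetical name
spec:
  containers:
    - name: demo-container-a # hypothetical name
      image: gcr.io/google_containers/pause:2.0
      resources:
        limits:
          vendor-domain/resource: 1 # must be a whole number
    - name: demo-container-b # hypothetical name
      image: gcr.io/google_containers/pause:2.0
      resources:
        limits:
          vendor-domain/resource: 1 # a separate device from demo-container-a
```

A Pod like this can only be scheduled onto a node that still has at least two unallocated `vendor-domain/resource` devices.
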
## Device plugin implementation

The general workflow of a device plugin includes the following steps:

* Initialization. During this phase, the device plugin performs vendor-specific
  initialization and setup to make sure the devices are in a ready state.

* The plugin starts a gRPC service, with a Unix socket under host path
  `/var/lib/kubelet/device-plugins/`, that implements the following interfaces:

  ```gRPC
  service DevicePlugin {
    // ListAndWatch returns a stream of lists of Devices.
    // Whenever a Device's state changes or a Device disappears, ListAndWatch
    // returns the new list.
    rpc ListAndWatch(Empty) returns (stream ListAndWatchResponse) {}

    // Allocate is called during container creation so that the Device
    // Plugin can run device-specific operations and instruct the kubelet
    // on the steps needed to make the Device available in the container.
    rpc Allocate(AllocateRequest) returns (AllocateResponse) {}
  }
  ```

* The plugin registers itself with the kubelet through the Unix socket at host
  path `/var/lib/kubelet/device-plugins/kubelet.sock`.

* After successfully registering itself, the device plugin runs in serving mode, during which it keeps
  monitoring device health and reports back to the kubelet upon any device state changes.
  It is also responsible for serving `Allocate` gRPC requests. During `Allocate`, the device plugin may
  do device-specific preparation; for example, GPU cleanup or QRNG initialization.
  If the operations succeed, the device plugin returns an `AllocateResponse` that contains container
  runtime configurations for accessing the allocated devices. The kubelet passes this information
  to the container runtime.

A device plugin is expected to detect kubelet restarts and re-register itself with the new
kubelet instance. In the current implementation, a new kubelet instance deletes all the existing Unix sockets
under `/var/lib/kubelet/device-plugins` when it starts. A device plugin can monitor the deletion
of its Unix socket and re-register itself upon such an event.

## Device plugin deployment

A device plugin can be deployed manually or as a DaemonSet. Deploying it as a DaemonSet has
the benefit that Kubernetes can restart the device plugin if it fails.
Otherwise, an extra mechanism is needed to recover from device plugin failures.
The canonical directory `/var/lib/kubelet/device-plugins` requires privileged access,
so a device plugin must run in a privileged security context.
If a device plugin is running as a DaemonSet, `/var/lib/kubelet/device-plugins`
must be mounted as a
[Volume](/docs/api-reference/{{page.version}}/#volume-v1-core)
in the plugin's
[PodSpec](/docs/api-reference/{{page.version}}/#podspec-v1-core).

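
The deployment requirements above can be sketched as a DaemonSet manifest. This is an illustration, not a manifest from any vendor: the names, labels, and image are hypothetical, and `apps/v1` assumes a cluster recent enough to serve that API version (older clusters expose DaemonSet under a different API version).

```yaml
apiVersion: apps/v1 # assumption: adjust to the DaemonSet API version your cluster serves
kind: DaemonSet
metadata:
  name: example-device-plugin # hypothetical name
  namespace: kube-system
spec:
  selector:
    matchLabels:
      name: example-device-plugin
  template:
    metadata:
      labels:
        name: example-device-plugin
    spec:
      containers:
        - name: example-device-plugin
          image: example.com/vendor/device-plugin:latest # hypothetical image
          securityContext:
            privileged: true # the device plugin directory requires privileged access
          volumeMounts:
            - name: device-plugins
              mountPath: /var/lib/kubelet/device-plugins
      volumes:
        - name: device-plugins
          hostPath:
            path: /var/lib/kubelet/device-plugins
```

Mounting the host directory lets the plugin create its own Unix socket there and reach `kubelet.sock` for registration.
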
Kubernetes device plugin support is still in alpha. As development continues, its API version can
change in incompatible ways. We recommend that device plugin developers do the following:

* Watch for changes in future releases.
* Support multiple versions of the device plugin API for backward/forward compatibility.

If you enable the `DevicePlugins` feature and run device plugins on nodes that need to be upgraded to
a Kubernetes release with a newer device plugin API version, upgrade your device plugins
to support both versions before upgrading these nodes. This ensures that device allocations
continue to function during the upgrade.

## Examples

For examples of device plugin implementations, see:

* The official [NVIDIA GPU device plugin](https://github.com/NVIDIA/k8s-device-plugin).
    * It requires [nvidia-docker 2.0](https://github.com/NVIDIA/nvidia-docker), which allows you to run GPU-enabled Docker containers.
* The [NVIDIA GPU device plugin for COS base OS](https://github.com/GoogleCloudPlatform/container-engine-accelerators/tree/master/cmd/nvidia_gpu).

{% endcapture %}

{% include templates/concept.md %}