doc: capture device-plugin stricter workflow ordering explicitly
Based on the kubelet device manager refactoring done in the 1.25 release, there are stricter ordering requirements: the device plugin MUST start its gRPC service before registering itself with the kubelet. If this ordering is not followed, plugin registration fails.

Signed-off-by: Swati Sehgal <swsehgal@redhat.com>

pull/36940/head
parent d05d5df9ad
commit 88ecb0501c
@@ -87,60 +87,65 @@ spec:

 The general workflow of a device plugin includes the following steps:

-* Initialization. During this phase, the device plugin performs vendor specific
+1. Initialization. During this phase, the device plugin performs vendor-specific
 initialization and setup to make sure the devices are in a ready state.

-* The plugin starts a gRPC service, with a Unix socket under host path
+1. The plugin starts a gRPC service, with a Unix socket under the host path
 `/var/lib/kubelet/device-plugins/`, that implements the following interfaces:

 ```gRPC
 service DevicePlugin {
 // GetDevicePluginOptions returns options to be communicated with Device Manager.
 rpc GetDevicePluginOptions(Empty) returns (DevicePluginOptions) {}

 // ListAndWatch returns a stream of List of Devices
 // Whenever a Device state change or a Device disappears, ListAndWatch
 // returns the new list
 rpc ListAndWatch(Empty) returns (stream ListAndWatchResponse) {}

 // Allocate is called during container creation so that the Device
 // Plugin can run device specific operations and instruct Kubelet
 // of the steps to make the Device available in the container
 rpc Allocate(AllocateRequest) returns (AllocateResponse) {}

 // GetPreferredAllocation returns a preferred set of devices to allocate
 // from a list of available ones. The resulting preferred allocation is not
 // guaranteed to be the allocation ultimately performed by the
 // devicemanager. It is only designed to help the devicemanager make a more
 // informed allocation decision when possible.
 rpc GetPreferredAllocation(PreferredAllocationRequest) returns (PreferredAllocationResponse) {}

 // PreStartContainer is called, if indicated by Device Plugin during registeration phase,
 // before each container start. Device plugin can run device specific operations
 // such as resetting the device before making devices available to the container.
 rpc PreStartContainer(PreStartContainerRequest) returns (PreStartContainerResponse) {}
 }
 ```

 {{< note >}}
 Plugins are not required to provide useful implementations for
-`GetPreferredAllocation()` or `PreStartContainer()`. Flags indicating which
-(if any) of these calls are available should be set in the `DevicePluginOptions`
+`GetPreferredAllocation()` or `PreStartContainer()`. Flags indicating
+the availability of these calls, if any, should be set in the `DevicePluginOptions`
 message sent back by a call to `GetDevicePluginOptions()`. The `kubelet` will
 always call `GetDevicePluginOptions()` to see which optional functions are
 available, before calling any of them directly.
 {{< /note >}}

-* The plugin registers itself with the kubelet through the Unix socket at host
+1. The plugin registers itself with the kubelet through the Unix socket at host
 path `/var/lib/kubelet/device-plugins/kubelet.sock`.

+{{< note >}}
+The ordering of the workflow is important. A plugin MUST start serving gRPC
+service before registering itself with kubelet for successful registration.
+{{< /note >}}
+
-* After successfully registering itself, the device plugin runs in serving mode, during which it keeps
+1. After successfully registering itself, the device plugin runs in serving mode, during which it keeps
 monitoring device health and reports back to the kubelet upon any device state changes.
 It is also responsible for serving `Allocate` gRPC requests. During `Allocate`, the device plugin may
 do device-specific preparation; for example, GPU cleanup or QRNG initialization.
 If the operations succeed, the device plugin returns an `AllocateResponse` that contains container
 runtime configurations for accessing the allocated devices. The kubelet passes this information
 to the container runtime.

 ### Handling kubelet restarts
