doc: capture device-plugin stricter workflow ordering explicitly
Since the kubelet device manager refactoring in the 1.25 release, there is a stricter ordering requirement: a device plugin MUST start its gRPC service before registering itself with the kubelet. If this ordering is not followed, plugin registration fails.

Signed-off-by: Swati Sehgal <swsehgal@redhat.com>

pull/36940/head
parent
d05d5df9ad
commit
88ecb0501c
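The ordering requirement in the commit message can be illustrated with a minimal sketch. It is not the real device plugin API: a plain Unix socket listener stands in for the plugin's gRPC service, and the hypothetical `register` function assumes that the kubelet connects back to the plugin's socket while handling registration. All names here (`startPluginServer`, `register`, `runWorkflow`, `example.sock`) are made up for illustration.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"path/filepath"
	"time"
)

// startPluginServer stands in for "the plugin starts a gRPC service":
// it listens on a Unix socket and accepts connections in the background.
func startPluginServer(socketPath string) (net.Listener, error) {
	ln, err := net.Listen("unix", socketPath)
	if err != nil {
		return nil, err
	}
	go func() {
		for {
			conn, err := ln.Accept()
			if err != nil {
				return // listener was closed
			}
			conn.Close()
		}
	}()
	return ln, nil
}

// register stands in for registration with the kubelet. Assumption: the
// kubelet dials the plugin's socket as part of registration, so the call
// fails if nothing is serving on that socket yet.
func register(socketPath string) error {
	conn, err := net.DialTimeout("unix", socketPath, time.Second)
	if err != nil {
		return fmt.Errorf("registration failed, plugin endpoint not reachable: %w", err)
	}
	return conn.Close()
}

// runWorkflow runs the two steps in a temporary directory. With
// serveFirst set, the service is started before registering; otherwise
// registration is attempted with no service running.
func runWorkflow(serveFirst bool) error {
	dir, err := os.MkdirTemp("", "dp")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)

	socketPath := filepath.Join(dir, "example.sock")
	if serveFirst {
		ln, err := startPluginServer(socketPath)
		if err != nil {
			return err
		}
		defer ln.Close()
	}
	return register(socketPath)
}

func main() {
	fmt.Println("serve, then register:", runWorkflow(true))
	fmt.Println("register without serving:", runWorkflow(false))
}
```

In this sketch, registering after the listener is up succeeds, while registering with no listener returns an error. A real plugin would serve the `DevicePlugin` gRPC service on its socket and call the kubelet's registration endpoint at `/var/lib/kubelet/device-plugins/kubelet.sock` instead.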
@@ -87,60 +87,65 @@ spec:
 
 The general workflow of a device plugin includes the following steps:
 
-* Initialization. During this phase, the device plugin performs vendor specific
+1. Initialization. During this phase, the device plugin performs vendor-specific
 initialization and setup to make sure the devices are in a ready state.
 
-* The plugin starts a gRPC service, with a Unix socket under host path
+1. The plugin starts a gRPC service, with a Unix socket under the host path
 `/var/lib/kubelet/device-plugins/`, that implements the following interfaces:
 
-```gRPC
-service DevicePlugin {
-// GetDevicePluginOptions returns options to be communicated with Device Manager.
-rpc GetDevicePluginOptions(Empty) returns (DevicePluginOptions) {}
+```gRPC
+service DevicePlugin {
+// GetDevicePluginOptions returns options to be communicated with Device Manager.
+rpc GetDevicePluginOptions(Empty) returns (DevicePluginOptions) {}
 
-// ListAndWatch returns a stream of List of Devices
-// Whenever a Device state change or a Device disappears, ListAndWatch
-// returns the new list
-rpc ListAndWatch(Empty) returns (stream ListAndWatchResponse) {}
+// ListAndWatch returns a stream of List of Devices
+// Whenever a Device state change or a Device disappears, ListAndWatch
+// returns the new list
+rpc ListAndWatch(Empty) returns (stream ListAndWatchResponse) {}
 
-// Allocate is called during container creation so that the Device
-// Plugin can run device specific operations and instruct Kubelet
-// of the steps to make the Device available in the container
-rpc Allocate(AllocateRequest) returns (AllocateResponse) {}
+// Allocate is called during container creation so that the Device
+// Plugin can run device specific operations and instruct Kubelet
+// of the steps to make the Device available in the container
+rpc Allocate(AllocateRequest) returns (AllocateResponse) {}
 
-// GetPreferredAllocation returns a preferred set of devices to allocate
-// from a list of available ones. The resulting preferred allocation is not
-// guaranteed to be the allocation ultimately performed by the
-// devicemanager. It is only designed to help the devicemanager make a more
-// informed allocation decision when possible.
-rpc GetPreferredAllocation(PreferredAllocationRequest) returns (PreferredAllocationResponse) {}
+// GetPreferredAllocation returns a preferred set of devices to allocate
+// from a list of available ones. The resulting preferred allocation is not
+// guaranteed to be the allocation ultimately performed by the
+// devicemanager. It is only designed to help the devicemanager make a more
+// informed allocation decision when possible.
+rpc GetPreferredAllocation(PreferredAllocationRequest) returns (PreferredAllocationResponse) {}
 
-// PreStartContainer is called, if indicated by Device Plugin during registeration phase,
-// before each container start. Device plugin can run device specific operations
-// such as resetting the device before making devices available to the container.
-rpc PreStartContainer(PreStartContainerRequest) returns (PreStartContainerResponse) {}
-}
-```
+// PreStartContainer is called, if indicated by Device Plugin during registeration phase,
+// before each container start. Device plugin can run device specific operations
+// such as resetting the device before making devices available to the container.
+rpc PreStartContainer(PreStartContainerRequest) returns (PreStartContainerResponse) {}
+}
+```
 
-{{< note >}}
-Plugins are not required to provide useful implementations for
-`GetPreferredAllocation()` or `PreStartContainer()`. Flags indicating which
-(if any) of these calls are available should be set in the `DevicePluginOptions`
-message sent back by a call to `GetDevicePluginOptions()`. The `kubelet` will
-always call `GetDevicePluginOptions()` to see which optional functions are
-available, before calling any of them directly.
-{{< /note >}}
+{{< note >}}
+Plugins are not required to provide useful implementations for
+`GetPreferredAllocation()` or `PreStartContainer()`. Flags indicating
+the availability of these calls, if any, should be set in the `DevicePluginOptions`
+message sent back by a call to `GetDevicePluginOptions()`. The `kubelet` will
+always call `GetDevicePluginOptions()` to see which optional functions are
+available, before calling any of them directly.
+{{< /note >}}
 
-* The plugin registers itself with the kubelet through the Unix socket at host
+1. The plugin registers itself with the kubelet through the Unix socket at host
 path `/var/lib/kubelet/device-plugins/kubelet.sock`.
 
-* After successfully registering itself, the device plugin runs in serving mode, during which it keeps
-monitoring device health and reports back to the kubelet upon any device state changes.
-It is also responsible for serving `Allocate` gRPC requests. During `Allocate`, the device plugin may
-do device-specific preparation; for example, GPU cleanup or QRNG initialization.
-If the operations succeed, the device plugin returns an `AllocateResponse` that contains container
-runtime configurations for accessing the allocated devices. The kubelet passes this information
-to the container runtime.
+{{< note >}}
+The ordering of the workflow is important. A plugin MUST start serving gRPC
+service before registering itself with kubelet for successful registration.
+{{< /note >}}
+
+1. After successfully registering itself, the device plugin runs in serving mode, during which it keeps
+monitoring device health and reports back to the kubelet upon any device state changes.
+It is also responsible for serving `Allocate` gRPC requests. During `Allocate`, the device plugin may
+do device-specific preparation; for example, GPU cleanup or QRNG initialization.
+If the operations succeed, the device plugin returns an `AllocateResponse` that contains container
+runtime configurations for accessing the allocated devices. The kubelet passes this information
+to the container runtime.
 
 ### Handling kubelet restarts
 