diff --git a/content/en/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins.md b/content/en/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins.md index 8b1747b857..8f39284a96 100644 --- a/content/en/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins.md +++ b/content/en/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins.md @@ -193,9 +193,69 @@ for these devices: // node resources consumed by pods and containers on the node service PodResourcesLister { rpc List(ListPodResourcesRequest) returns (ListPodResourcesResponse) {} + rpc GetAllocatableResources(AllocatableResourcesRequest) returns (AllocatableResourcesResponse) {} } ``` +The `List` endpoint provides information on resources of running pods, with details such as the +id of exclusively allocated CPUs, device id as it was reported by device plugins and id of +the NUMA node where these devices are allocated. + +```gRPC +// ListPodResourcesResponse is the response returned by List function +message ListPodResourcesResponse { + repeated PodResources pod_resources = 1; +} + +// PodResources contains information about the node resources assigned to a pod +message PodResources { + string name = 1; + string namespace = 2; + repeated ContainerResources containers = 3; +} + +// ContainerResources contains information about the resources assigned to a container +message ContainerResources { + string name = 1; + repeated ContainerDevices devices = 2; + repeated int64 cpu_ids = 3; +} + +// Topology describes hardware topology of the resource +message TopologyInfo { + repeated NUMANode nodes = 1; +} + +// NUMA representation of NUMA node +message NUMANode { + int64 ID = 1; +} + +// ContainerDevices contains information about the devices assigned to a container +message ContainerDevices { + string resource_name = 1; + repeated string device_ids = 2; + TopologyInfo topology = 3; +} +``` + +GetAllocatableResources provides information on resources initially available on the worker node. +It provides more information than kubelet exports to APIServer. + +```gRPC +// AllocatableResourcesResponses contains informations about all the devices known by the kubelet +message AllocatableResourcesResponse { + repeated ContainerDevices devices = 1; + repeated int64 cpu_ids = 2; +} + +``` + +`ContainerDevices` do expose the topology information declaring to which NUMA cells the device is affine. +The NUMA cells are identified using a opaque integer ID, which value is consistent to what device +plugins report [when they register themselves to the kubelet](https://kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins/#device-plugin-integration-with-the-topology-manager). + + The gRPC service is served over a unix socket at `/var/lib/kubelet/pod-resources/kubelet.sock`. Monitoring agents for device plugin resources can be deployed as a daemon, or as a DaemonSet. The canonical directory `/var/lib/kubelet/pod-resources` requires privileged access, so monitoring @@ -204,7 +264,7 @@ DaemonSet, `/var/lib/kubelet/pod-resources` must be mounted as a {{< glossary_tooltip term_id="volume" >}} in the device monitoring agent's [PodSpec](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#podspec-v1-core). -Support for the "PodResources service" requires `KubeletPodResources` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) to be enabled. +Support for the `PodResourcesLister service` requires `KubeletPodResources` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) to be enabled. It is enabled by default starting with Kubernetes 1.15 and is v1 since Kubernetes 1.20. ## Device Plugin integration with the Topology Manager @@ -256,5 +316,3 @@ Here are some examples of device plugin implementations: * Learn about [advertising extended resources](/docs/tasks/administer-cluster/extended-resource-node/) on a node * Read about using [hardware acceleration for TLS ingress](https://kubernetes.io/blog/2019/04/24/hardware-accelerated-ssl/tls-termination-in-ingress-controllers-using-kubernetes-device-plugins-and-runtimeclass/) with Kubernetes * Learn about the [Topology Manager](/docs/tasks/administer-cluster/topology-manager/) - - diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates.md b/content/en/docs/reference/command-line-tools-reference/feature-gates.md index a6411f7c83..8caf95ef1c 100644 --- a/content/en/docs/reference/command-line-tools-reference/feature-gates.md +++ b/content/en/docs/reference/command-line-tools-reference/feature-gates.md @@ -137,6 +137,7 @@ different Kubernetes components. | `LocalStorageCapacityIsolation` | `true` | Beta | 1.10 | | | `LocalStorageCapacityIsolationFSQuotaMonitoring` | `false` | Alpha | 1.15 | | | `LogarithmicScaleDown` | `false` | Alpha | 1.21 | | +| `KubeletPodResourcesGetAllocatable` | `false` | Alpha | 1.21 | | | `MixedProtocolLBService` | `false` | Alpha | 1.20 | | | `NamespaceDefaultLabelName` | `true` | Beta | 1.21 | | | `NetworkPolicyEndPort` | `false` | Alpha | 1.21 | | @@ -675,6 +676,9 @@ Each feature gate is designed for enabling/disabling a specific feature: - `KubeletPodResources`: Enable the kubelet's pod resources gRPC endpoint. See [Support Device Monitoring](https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/606-compute-device-assignment/README.md) for more details. +- `KubeletPodResourcesGetAllocatable`: Enable the kubelet's pod resources `GetAllocatableResources` functionality. + This API augments the [resource allocation reporting](https://kubernetes.io/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins/#monitoring-device-plugin-resources) + with informations about the allocatable resources, enabling clients to properly track the free compute resources on a node. - `LegacyNodeRoleBehavior`: When disabled, legacy behavior in service load balancers and node disruption will ignore the `node-role.kubernetes.io/master` label in favor of the feature-specific labels provided by `NodeDisruptionExclusion` and `ServiceNodeExclusion`.