docs: Scheduler resource tracking metrics
In 1.20 an optional metrics endpoint is added to the scheduler that allows administrators to see the requested pod resources and the imposed pod limits as metrics to better understand the capacity and usage of the system under the pod model. Describe how these metrics can be seen under cluster-administration.
pull/24910/head
parent c0eaf6940a
commit bd1a75b6c8
@ -129,6 +129,28 @@ cloudprovider_gce_api_request_duration_seconds { request = "detach_disk"}
cloudprovider_gce_api_request_duration_seconds { request = "list_disk"}
```

### kube-scheduler metrics

{{< feature-state for_k8s_version="v1.20" state="alpha" >}}

The scheduler exposes optional metrics that report the requested resources and the desired limits of all running pods. These metrics can be used to build capacity planning dashboards, assess current or historical scheduling limits, quickly identify workloads that cannot schedule due to lack of resources, and compare actual usage to the pod's request.

The kube-scheduler identifies the resource [requests and limits](/docs/concepts/configuration/manage-resources-containers/) configured for each Pod; when either a request or a limit is non-zero, the kube-scheduler reports a metrics time series. The time series is labelled by:

- namespace
- pod name
- the node where the pod is scheduled, or an empty string if it is not yet scheduled
- priority
- the assigned scheduler for that pod
- the name of the resource (for example, `cpu`)
- the unit of the resource if known (for example, `cores`)

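
As a minimal sketch of how these labels might be used for capacity planning, the Python snippet below parses a few sample lines in the Prometheus text format, sums requests per namespace, and lists pods whose node label is empty. The label names (`namespace`, `pod`, `node`, `scheduler`, `priority`, `resource`, `unit`), the sample values, and the helper names are assumptions made for illustration; they are not captured scheduler output, and the parser handles only simple labelled sample lines.

```python
import re
from collections import defaultdict

# Hypothetical sample in the Prometheus text format. Label names and all
# values are assumptions for illustration, not real scheduler output.
SAMPLE = '''\
kube_pod_resource_request{namespace="shop",pod="web-1",node="node-a",scheduler="default-scheduler",priority="0",resource="cpu",unit="cores"} 0.5
kube_pod_resource_request{namespace="shop",pod="web-2",node="",scheduler="default-scheduler",priority="0",resource="cpu",unit="cores"} 0.25
kube_pod_resource_request{namespace="batch",pod="job-1",node="node-b",scheduler="default-scheduler",priority="10",resource="memory",unit="bytes"} 1048576
'''

# name{labels} value [optional integer timestamp]
SERIES = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)\{(?P<labels>.*)\}\s+(?P<value>\S+)(?:\s+-?\d+)?$'
)
LABEL = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def parse_series(text):
    """Yield (metric name, labels dict, float value) for each labelled sample line."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SERIES.match(line)
        if match is None:
            continue  # skip anything this simple parser cannot read
        labels = dict(LABEL.findall(match.group("labels")))
        yield match.group("name"), labels, float(match.group("value"))


def requested_by_namespace(text, resource):
    """Sum kube_pod_resource_request values for one resource, keyed by namespace."""
    totals = defaultdict(float)
    for name, labels, value in parse_series(text):
        if name == "kube_pod_resource_request" and labels.get("resource") == resource:
            totals[labels["namespace"]] += value
    return dict(totals)


def unscheduled_pods(text):
    """Return sorted (namespace, pod) pairs whose node label is the empty string."""
    return sorted(
        {(labels["namespace"], labels["pod"])
         for _, labels, _ in parse_series(text)
         if labels.get("node") == ""}
    )
```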
Once a pod reaches completion (it has a `restartPolicy` of `Never` or `OnFailure` and is in the `Succeeded` or `Failed` pod phase, or it has been deleted and all of its containers have a terminated state), the series is no longer reported, since the scheduler is now free to schedule other pods to run. The two metrics are called `kube_pod_resource_request` and `kube_pod_resource_limit`.

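
The completion rule above can be read as a simple predicate. The sketch below restates it in Python for clarity; `PodView` and `has_completed` are hypothetical names, the pod representation is deliberately simplified, and this is not the scheduler's actual implementation.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PodView:
    """Hypothetical, simplified view of a Pod; not a Kubernetes client type."""
    restart_policy: str   # "Always", "OnFailure", or "Never"
    phase: str            # "Pending", "Running", "Succeeded", "Failed", ...
    deleted: bool = False  # True once the pod has been deleted
    container_states: List[str] = field(default_factory=list)  # e.g. "running", "terminated"


def has_completed(pod):
    """Return True when, per the rule above, the series would stop being reported."""
    # Case 1: the pod will not restart and has reached a terminal phase.
    finished = (
        pod.restart_policy in ("Never", "OnFailure")
        and pod.phase in ("Succeeded", "Failed")
    )
    # Case 2: the pod was deleted and every container has terminated.
    # Note: all() is True for an empty list, so a deleted pod with no
    # recorded container states counts as completed in this sketch.
    torn_down = pod.deleted and all(s == "terminated" for s in pod.container_states)
    return finished or torn_down
```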
The metrics are exposed at the HTTP endpoint `/metrics/resources` and require the same authorization as the `/metrics` endpoint on the scheduler. You must use the `--show-hidden-metrics-for-version=1.20` flag to expose these alpha stability metrics.

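
As a rough sketch of reading the endpoint, the Python snippet below builds an authenticated request for `/metrics/resources` and fetches the response body. It assumes that the scheduler is reachable over HTTPS at a base URL you supply (for example `https://127.0.0.1:10259`), that bearer-token authentication is accepted, and that the token is authorized for the path; the function names, `ca_file` parameter, and timeout are illustrative choices, not part of the scheduler.

```python
import ssl
import urllib.request


def build_resources_request(base_url, token):
    """Build a GET request for the scheduler's /metrics/resources endpoint."""
    url = base_url.rstrip("/") + "/metrics/resources"
    # Assumption: the endpoint accepts a bearer token, as /metrics commonly does.
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def fetch_resource_metrics(base_url, token, ca_file=None):
    """Fetch the metrics as Prometheus text; ca_file is an optional CA bundle path."""
    context = ssl.create_default_context(cafile=ca_file)
    request = build_resources_request(base_url, token)
    with urllib.request.urlopen(request, context=context, timeout=10) as response:
        return response.read().decode("utf-8")
```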
## {{% heading "whatsnext" %}}
* Read about the [Prometheus text format](https://github.com/prometheus/docs/blob/master/content/docs/instrumenting/exposition_formats.md#text-based-format) for metrics