From 3ae0ef3444ae8b2119f58b06060d406fe3446367 Mon Sep 17 00:00:00 2001
From: Marcin Wielgus
Date: Thu, 28 Apr 2016 20:30:48 +0200
Subject: [PATCH] Custom metrics in HPA doc

---
 .../horizontal-pod-autoscaling/index.md       | 53 +++++++++++++++++--
 1 file changed, 50 insertions(+), 3 deletions(-)

diff --git a/docs/user-guide/horizontal-pod-autoscaling/index.md b/docs/user-guide/horizontal-pod-autoscaling/index.md
index c13be2d63d..e07a236516 100644
--- a/docs/user-guide/horizontal-pod-autoscaling/index.md
+++ b/docs/user-guide/horizontal-pod-autoscaling/index.md
@@ -3,18 +3,17 @@
 This document describes the current state of Horizontal Pod Autoscaling in Kubernetes.
 
-
 ## What is Horizontal Pod Autoscaling?
 
 With Horizontal Pod Autoscaling, Kubernetes automatically scales the number of pods
-in a replication controller, deployment or replica set based on observed CPU utilization.
+in a replication controller, deployment or replica set based on observed CPU utilization
+(or, with alpha support, on some other, application-provided metrics).
 
 The Horizontal Pod Autoscaler is implemented as a Kubernetes API resource and a controller.
 The resource determines the behavior of the controller.
 The controller periodically adjusts the number of replicas in a replication controller or deployment
 to match the observed average CPU utilization to the target specified by user.
 
-
 ## How does the Horizontal Pod Autoscaler work?
 
 ![Horizontal Pod Autoscaler diagram](/images/docs/horizontal-pod-autoscaler.svg)
 
@@ -76,6 +75,54 @@ i.e. you cannot bind a Horizontal Pod Autoscaler to a replication controller and
 The reason this doesn't work is that when rolling update creates a new replication controller,
 the Horizontal Pod Autoscaler will not be bound to the new replication controller.
 
+## Support for custom metrics
+
+Kubernetes 1.2 adds alpha support for scaling based on application-specific metrics like QPS (queries per second) or average request latency.
+
+### Prerequisites
+
+The cluster has to be started with the `ENABLE_CUSTOM_METRICS` environment variable set to `true`.
+
+### Pod configuration
+
+The pods to be scaled must have a cAdvisor-specific custom (aka application) metrics endpoint configured. The configuration format is described [here](https://github.com/google/cadvisor/blob/master/docs/application_metrics.md). Kubernetes expects the configuration to
+be placed in `definition.json`, mounted via a [config map](/docs/user-guide/horizontal-pod-autoscaling/configmap/) in `/etc/custom-metrics`. A sample config map may look like this:
+
+```yaml
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: cm-config
+data:
+  definition.json: "{\"endpoint\" : \"http://localhost:8080/metrics\"}"
+```
+
+**Warning**
+Due to the way cAdvisor currently works, `localhost` refers to the node itself, not to the running pod. Thus the appropriate container in the pod must ask for a host port. Example:
+
+```yaml
+  ports:
+    - hostPort: 8080
+      containerPort: 8080
+```
+
+### Specifying target
+
+HPA for custom metrics is configured via an annotation. The value in the annotation is interpreted as a target metric value averaged over
+all running pods. Example:
+
+```yaml
+  annotations:
+    alpha/target.custom-metrics.podautoscaler.kubernetes.io: '{"items":[{"name":"qps", "value": "10"}]}'
+```
+
+In this case, if there are 4 pods running and each of them reports the qps metric to be 15, HPA will start 2 additional pods so there will be 6 pods in total. If there are multiple metrics passed in the annotation, or CPU is configured as well, then HPA will use the biggest
+number of replicas produced by the calculations.
+
+At this moment, even if target CPU utilization is not specified, a default of 80% will be used.
+To calculate the number of desired replicas based only on custom metrics, the CPU utilization
+target should be set to a very large value (e.g. 100000%). Then the CPU-related logic
+will want only 1 replica, leaving the decision about a higher replica count to custom metrics (and min/max limits).
+
 ## Further reading
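The patch's pod-configuration pieces (the `cm-config` ConfigMap, the `/etc/custom-metrics` mount, and the host port) can be combined into a single pod manifest. This is a minimal sketch, assuming illustrative names: the pod/container name `frontend` and the image are hypothetical and not part of the patch.

```yaml
# Sketch only: "frontend" and the image name are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: frontend
  labels:
    app: frontend
spec:
  containers:
  - name: frontend
    image: example.com/frontend:1.0     # hypothetical image
    ports:
    - hostPort: 8080                    # host port required: cAdvisor resolves localhost to the node
      containerPort: 8080
    volumeMounts:
    - name: metrics-config              # places definition.json where Kubernetes expects it
      mountPath: /etc/custom-metrics
  volumes:
  - name: metrics-config
    configMap:
      name: cm-config                   # the ConfigMap from the patch
```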
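Likewise, the annotation from the patch sits on the Horizontal Pod Autoscaler object itself. A hedged sketch, assuming the `autoscaling/v1` API group and a hypothetical target replication controller named `frontend`; the very large CPU target follows the patch's advice for scaling on custom metrics alone:

```yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: frontend-scaler              # hypothetical name
  annotations:
    alpha/target.custom-metrics.podautoscaler.kubernetes.io: '{"items":[{"name":"qps", "value": "10"}]}'
spec:
  scaleTargetRef:
    kind: ReplicationController
    name: frontend                   # hypothetical scale target
  minReplicas: 1
  maxReplicas: 10
  # Very large CPU target so that only the custom metric drives scaling,
  # as recommended in the patch.
  targetCPUUtilizationPercentage: 100000
```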