Custom metrics in HPA doc

pull/135/head
Marcin Wielgus 2016-04-28 20:30:48 +02:00
parent 327aae6b65
commit 3ae0ef3444
1 changed files with 50 additions and 3 deletions

@ -3,18 +3,17 @@
This document describes the current state of Horizontal Pod Autoscaling in Kubernetes.
## What is Horizontal Pod Autoscaling?
With Horizontal Pod Autoscaling, Kubernetes automatically scales the number of pods
in a replication controller, deployment or replica set based on observed CPU utilization
(or, with alpha support, on some other, application-provided metrics).
The Horizontal Pod Autoscaler is implemented as a Kubernetes API resource and a controller.
The resource determines the behavior of the controller.
The controller periodically adjusts the number of replicas in a replication controller or deployment
to match the observed average CPU utilization to the target specified by the user.
## How does the Horizontal Pod Autoscaler work?
![Horizontal Pod Autoscaler diagram](/images/docs/horizontal-pod-autoscaler.svg)
@ -76,6 +75,54 @@ i.e. you cannot bind a Horizontal Pod Autoscaler to a replication controller and
The reason this doesn't work is that when rolling update creates a new replication controller,
the Horizontal Pod Autoscaler will not be bound to the new replication controller.
## Support for custom metrics
Kubernetes 1.2 adds alpha support for scaling based on application-specific metrics like QPS (queries per second) or average request latency.
### Prerequisites
The cluster has to be started with the `ENABLE_CUSTOM_METRICS` environment variable set to `true`.
### Pod configuration
The pods to be scaled must have a cAdvisor-specific custom (also known as application) metrics endpoint configured. The configuration format is described [here](https://github.com/google/cadvisor/blob/master/docs/application_metrics.md). Kubernetes expects the configuration to
be placed in `definition.json`, mounted via a [config map](/docs/user-guide/horizontal-pod-autoscaling/configmap/) in `/etc/custom-metrics`. A sample config map may look like this:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: cm-config
data:
  definition.json: "{\"endpoint\" : \"http://localhost:8080/metrics\"}"
```
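The escaped JSON string in `definition.json` is easy to get wrong by hand. As a minimal sketch, the value can be generated with Python's standard `json` module. The helper name `definition_json` is hypothetical, and the sketch assumes the only field needed is `endpoint`, as in the sample above.

```python
import json


def definition_json(endpoint):
    """Serialize the cAdvisor definition (hypothetical helper, endpoint only)."""
    return json.dumps({"endpoint": endpoint})


value = definition_json("http://localhost:8080/metrics")

# Serializing the string a second time adds the surrounding quotes and
# backslash escapes, which is also a valid YAML double-quoted scalar.
yaml_line = "definition.json: " + json.dumps(value)
print(yaml_line)
```

The printed line can be pasted under `data:` in the config map, indented to match.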
**Warning**
Due to the way cAdvisor currently works, `localhost` refers to the node itself, not to the running pod. The appropriate container in the pod must therefore request a host port. Example:
```yaml
ports:
- hostPort: 8080
  containerPort: 8080
```
### Specifying target
HPA for custom metrics is configured via an annotation. The value in the annotation is interpreted as a target metric value averaged over
all running pods. Example:
```yaml
annotations:
  alpha/target.custom-metrics.podautoscaler.kubernetes.io: '{"items":[{"name":"qps", "value": "10"}]}'
```
In this case, if 4 pods are running and each of them reports a qps value of 15, HPA will start 2 additional pods, for 6 pods in total. If multiple metrics are passed in the annotation, or CPU is configured as well, HPA will use the largest
number of replicas produced by the individual calculations.
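The arithmetic behind the 4-to-6 pod example can be sketched as follows. This is an illustration only: it assumes a simple proportional rule (current replicas times observed average divided by target, rounded up) and ignores any tolerance or cooldown logic the real controller may apply. The function names are hypothetical.

```python
import math


def desired_replicas(current_replicas, observed_avg, target):
    """Replicas needed so the per-pod average drops to the target (assumed rule)."""
    return math.ceil(current_replicas * observed_avg / target)


def combine(proposals):
    """With several metrics, take the largest proposed replica count."""
    return max(proposals)


# 4 pods, each reporting qps = 15, target qps = 10.
qps_replicas = desired_replicas(4, 15, 10)

# A second, hypothetical metric that proposes fewer replicas does not win.
final = combine([qps_replicas, desired_replicas(4, 5, 10)])
print(qps_replicas, final)
```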
At this moment, even if a target CPU utilization is not specified, a default of 80% will be used.
To calculate the number of desired replicas based only on custom metrics, the CPU utilization
target should be set to a very large value (e.g. 100000%). The CPU-related logic
will then want only 1 replica, leaving the decision about a higher replica count to custom metrics (and min/max limits).
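The effect of a very large CPU target can be sketched like this. The sketch assumes the same proportional rule for CPU and custom metrics, takes the maximum of all proposals, and clamps the result to the min/max limits. All names and sample numbers are hypothetical, and real controller details such as tolerances are left out.

```python
import math


def proposal(current, observed, target):
    """Proportional replica proposal, never below 1 (assumed rule)."""
    return max(1, math.ceil(current * observed / target))


def final_replicas(current, cpu_pct, cpu_target_pct, custom, min_r, max_r):
    """custom is a list of (observed_avg, target) pairs, one per metric."""
    proposals = [proposal(current, cpu_pct, cpu_target_pct)]
    proposals += [proposal(current, obs, tgt) for obs, tgt in custom]
    # Largest proposal wins, then min/max limits are applied.
    return min(max_r, max(min_r, max(proposals)))


# 4 pods at 90% CPU; qps averages 5 against a target of 10.
with_default_cpu = final_replicas(4, 90, 80, [(5, 10)], 1, 10)
custom_only = final_replicas(4, 90, 100000, [(5, 10)], 1, 10)
print(with_default_cpu, custom_only)
```

With the default 80% target the CPU proposal dominates; with the 100000% target the CPU proposal collapses to 1 and the custom metric decides.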
## Further reading