---
assignees:
- fgrzadkowski
- jszczepkowski
title: Horizontal Pod Autoscaling
---

This document describes the current state of Horizontal Pod Autoscaling in Kubernetes.

## What is Horizontal Pod Autoscaling?

With Horizontal Pod Autoscaling, Kubernetes automatically scales the number of pods
in a replication controller, deployment or replica set based on observed CPU utilization
(or, with alpha support, on some other, application-provided metrics).

The Horizontal Pod Autoscaler is implemented as a Kubernetes API resource and a controller.
The resource determines the behavior of the controller.
The controller periodically adjusts the number of replicas in a replication controller or deployment
to match the observed average CPU utilization to the target specified by the user.

## How does the Horizontal Pod Autoscaler work?

![Horizontal Pod Autoscaler diagram](/images/docs/horizontal-pod-autoscaler.svg)

The autoscaler is implemented as a control loop.
It periodically queries CPU utilization for the pods it targets.
(The period of the autoscaler is controlled by the `--horizontal-pod-autoscaler-sync-period` flag of the controller manager.
The default value is 30 seconds.)
Then, it compares the arithmetic mean of the pods' CPU utilization with the target and adjusts the number of replicas if needed.
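
The comparison step can be sketched roughly as follows. This is a simplified illustration rather than the controller's actual code: the function name `desired_replicas` and its inputs are hypothetical, and the formula (the ceiling of the summed utilization divided by the target) is assumed from the design document for the autoscaling algorithm. Any additional safeguards the real controller applies are left out.

```python
import math

def desired_replicas(pod_utilizations, target_utilization):
    """Return the replica count that would bring mean utilization to the target.

    pod_utilizations: per-pod CPU utilization in percent (hypothetical input).
    target_utilization: target CPU utilization in percent.
    """
    if not pod_utilizations:
        raise ValueError("no utilization samples available")
    # ceil(sum / target) is the same as ceil(pod_count * mean / target).
    return math.ceil(sum(pod_utilizations) / target_utilization)

# Three pods averaging 100% against an 80% target call for a fourth pod.
print(desired_replicas([90, 100, 110], 80))  # 4
```

When the mean is already at or below the target, the same expression yields the current count or fewer, which is how the loop also scales down.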

CPU utilization is the recent CPU usage of a pod divided by the sum of CPU requested by the pod's containers.
Please note that if some of the pod's containers do not have a CPU request set,
CPU utilization for the pod will not be defined and the autoscaler will not take any action.
Further details of the autoscaling algorithm are given [here](https://github.com/kubernetes/kubernetes/blob/{{page.githubbranch}}/docs/design/horizontal-pod-autoscaler.md#autoscaling-algorithm).
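
As a minimal sketch of the definition above, the snippet below computes a pod's CPU utilization and returns `None` when it is undefined. The function name, the use of millicores as the unit, and the convention that `None` marks a container without a CPU request are all assumptions made for illustration; requests that are set are assumed to be positive.

```python
def pod_cpu_utilization(usage_millicores, container_requests_millicores):
    """Return pod CPU utilization in percent, or None when it is undefined.

    usage_millicores: recent CPU usage of the whole pod.
    container_requests_millicores: one entry per container; None means the
    container has no CPU request set (illustrative convention).
    """
    if not container_requests_millicores:
        return None
    if any(request is None for request in container_requests_millicores):
        # A missing request leaves utilization undefined,
        # so the autoscaler would take no action for this pod.
        return None
    return 100.0 * usage_millicores / sum(container_requests_millicores)

print(pod_cpu_utilization(300, [200, 300]))   # 60.0
print(pod_cpu_utilization(300, [200, None]))  # None
```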

The autoscaler uses Heapster to collect CPU utilization.
Therefore, Heapster monitoring must be deployed in your cluster for autoscaling to work.

The autoscaler accesses the corresponding replication controller, deployment or replica set through the scale sub-resource.
Scale is an interface that lets you dynamically set the number of replicas and inspect their current state.
More details on the scale sub-resource can be found [here](https://github.com/kubernetes/kubernetes/blob/{{page.githubbranch}}/docs/design/horizontal-pod-autoscaler.md#scale-subresource).


## API Object

Horizontal Pod Autoscaler is a top-level resource in the Kubernetes REST API.
In Kubernetes 1.2, HPA graduated from beta to stable (more details about [API versioning](/docs/api/#api-versioning)) with compatibility between versions.
The stable version is available in the `autoscaling/v1` API group, whereas the beta version is still available in the `extensions/v1beta1` API group.
The transition plan is to deprecate the beta version of HPA in Kubernetes 1.3 and remove it completely in Kubernetes 1.4.

**Warning!** Keep in mind that all Kubernetes components still use HPA in `extensions/v1beta1` in Kubernetes 1.2.

More details about the API object can be found at
[HorizontalPodAutoscaler Object](https://github.com/kubernetes/kubernetes/blob/{{page.githubbranch}}/docs/design/horizontal-pod-autoscaler.md#horizontalpodautoscaler-object).

## Support for Horizontal Pod Autoscaler in kubectl

Horizontal Pod Autoscaler, like every API resource, is supported in a standard way by `kubectl`.
We can create a new autoscaler using the `kubectl create` command.
We can list autoscalers with `kubectl get hpa` and get a detailed description with `kubectl describe hpa`.
Finally, we can delete an autoscaler using `kubectl delete hpa`.

In addition, there is a special `kubectl autoscale` command for easy creation of a Horizontal Pod Autoscaler.
For instance, executing `kubectl autoscale rc foo --min=2 --max=5 --cpu-percent=80`
will create an autoscaler for replication controller *foo*, with target CPU utilization set to `80%`
and the number of replicas between 2 and 5.
The detailed documentation of `kubectl autoscale` can be found [here](/docs/user-guide/kubectl/kubectl_autoscale).
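
For comparison, the sketch below builds a manifest that should be roughly equivalent to the `kubectl autoscale` example, for use with `kubectl create -f`. The helper `hpa_manifest` is hypothetical, the field names reflect our understanding of the `autoscaling/v1` schema, and naming the autoscaler after its target is an assumption; check the API reference for your cluster version before relying on it.

```python
import json

def hpa_manifest(target_api_version, target_kind, target_name,
                 min_replicas, max_replicas, cpu_percent):
    """Build an autoscaling/v1 HorizontalPodAutoscaler manifest as a dict."""
    return {
        "apiVersion": "autoscaling/v1",
        "kind": "HorizontalPodAutoscaler",
        # Assumption: the autoscaler is named after the object it scales.
        "metadata": {"name": target_name},
        "spec": {
            "scaleTargetRef": {
                "apiVersion": target_api_version,
                "kind": target_kind,
                "name": target_name,
            },
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            "targetCPUUtilizationPercentage": cpu_percent,
        },
    }

# Mirrors: kubectl autoscale rc foo --min=2 --max=5 --cpu-percent=80
manifest = hpa_manifest("v1", "ReplicationController", "foo", 2, 5, 80)
print(json.dumps(manifest, indent=2))
```

Since `kubectl create -f` accepts JSON as well as YAML, the printed output can be saved to a file and created directly.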


## Autoscaling during rolling update

Currently in Kubernetes, it is possible to perform a [rolling update](/docs/tasks/run-application/rolling-update-replication-controller/) by managing replication controllers directly,
or by using the deployment object, which manages the underlying replication controllers for you.
Horizontal Pod Autoscaler only supports the latter approach: the Horizontal Pod Autoscaler is bound to the deployment object,
it sets the size for the deployment object, and the deployment is responsible for setting sizes of underlying replication controllers.

Horizontal Pod Autoscaler does not work with rolling updates that manipulate replication controllers directly,
i.e. you cannot bind a Horizontal Pod Autoscaler to a replication controller and do a rolling update (e.g. using `kubectl rolling-update`).
The reason this doesn't work is that when the rolling update creates a new replication controller,
the Horizontal Pod Autoscaler will not be bound to the new replication controller.

## Support for custom metrics

Kubernetes 1.2 adds alpha support for scaling based on application-specific metrics like QPS (queries per second) or average request latency.

### Prerequisites

The cluster has to be started with the `ENABLE_CUSTOM_METRICS` environment variable set to `true`.

### Pod configuration

The pods to be scaled must have a cAdvisor-specific custom (aka application) metrics endpoint configured. The configuration format is described [here](https://github.com/google/cadvisor/blob/master/docs/application_metrics.md). Kubernetes expects the configuration to be placed in `definition.json`, mounted via a [configMap](/docs/user-guide/configmap/) in `/etc/custom-metrics`. A sample config map may look like this:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: cm-config
data:
  definition.json: "{\"endpoint\" : \"http://localhost:8080/metrics\"}"
``` 

**Warning**
Due to the way cAdvisor currently works, `localhost` refers to the node itself, not to the running pod. Thus the appropriate container in the pod must request a host port on the node. Example:

```yaml
    ports:
    - hostPort: 8080
      containerPort: 8080
```

### Specifying target

HPA for custom metrics is configured via an annotation. The value in the annotation is interpreted as a target metric value averaged over
all running pods. Example: 

```yaml
    annotations:
      alpha/target.custom-metrics.podautoscaler.kubernetes.io: '{"items":[{"name":"qps", "value": "10"}]}'
```

In this case, if there are four pods running and each pod reports a QPS metric of 15 or higher, horizontal pod autoscaling will start two additional pods (for a total of six pods running).
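
The arithmetic behind this example can be checked with a few lines, taking the case where each pod reports exactly 15 QPS. The variable names are illustrative, and the ceiling-based formula is assumed from the design document rather than taken from the controller source.

```python
import math

target_qps = 10      # "value" from the annotation above
running_pods = 4
qps_per_pod = 15     # each pod reports exactly 15 in this example

total_qps = running_pods * qps_per_pod           # 60
needed_pods = math.ceil(total_qps / target_qps)  # 6
additional_pods = needed_pods - running_pods     # 2

print(needed_pods, additional_pods)  # 6 2
```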

If you specify multiple metrics in your annotation or if you set a target CPU utilization, horizontal pod autoscaling will scale according to the metric that requires the highest number of replicas.

If you do not specify a target for CPU utilization, Kubernetes defaults to an 80% utilization threshold for horizontal pod autoscaling.

If you want to ensure that horizontal pod autoscaling calculates the number of required replicas based only on custom metrics, you should set the CPU utilization target to a very large value (such as 100000%). As this level of CPU utilization isn't possible, horizontal pod autoscaling will calculate based only on the custom metrics (and min/max limits).
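
The interaction between CPU and custom metrics described above can be illustrated with a small sketch. The helper names, the sample values, and the min/max limits of 1 and 20 are hypothetical, and the per-metric formula is again assumed from the design document; the point is only to show why a very large CPU target leaves the decision to the custom metric.

```python
import math

def proposal(pod_values, target):
    """Replica count one metric asks for: ceil(sum of pod values / target)."""
    return math.ceil(sum(pod_values) / target)

def combined_replicas(proposals, min_replicas, max_replicas):
    """Take the metric that needs the most replicas, then apply min/max limits."""
    wanted = max(proposals)
    return max(min_replicas, min(max_replicas, wanted))

cpu_percent = [200, 200, 200, 200]  # per-pod CPU utilization (hypothetical)
qps = [15, 15, 15, 15]              # per-pod custom metric (hypothetical)

# With an 80% CPU target, CPU asks for 10 replicas and wins over QPS (6).
print(combined_replicas([proposal(cpu_percent, 80), proposal(qps, 10)], 1, 20))      # 10

# With a 100000% CPU target, CPU asks for 1 replica, so QPS decides.
print(combined_replicas([proposal(cpu_percent, 100000), proposal(qps, 10)], 1, 20))  # 6
```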


## Further reading

* Design documentation: [Horizontal Pod Autoscaling](https://github.com/kubernetes/kubernetes/blob/{{page.githubbranch}}/docs/design/horizontal-pod-autoscaler.md).
* kubectl autoscale command: [kubectl autoscale](/docs/user-guide/kubectl/kubectl_autoscale).
* Usage example of [Horizontal Pod Autoscaler](/docs/user-guide/horizontal-pod-autoscaling/walkthrough/).