---
---
Horizontal pod autoscaling automatically scales the number of pods
in a replication controller, deployment or replica set based on observed CPU utilization.
In the future, other metrics will also be supported.

In this document we explain how this feature works by walking you through an example of enabling horizontal pod autoscaling for the php-apache server.

## Prerequisites

This example requires a running Kubernetes cluster and kubectl, version 1.2 or later.
[Heapster](https://github.com/kubernetes/heapster) monitoring needs to be deployed in the cluster,
as the horizontal pod autoscaler uses it to collect metrics
(if you followed the [getting started on GCE guide](/docs/getting-started-guides/gce),
Heapster monitoring will be turned on by default).
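
You can check that Heapster is actually running by listing the pods in the `kube-system` namespace. The label selector below assumes the standard cluster add-on configuration; adjust it if your Heapster deployment is labeled differently:

```shell
$ kubectl get pods --namespace=kube-system -l k8s-app=heapster
```
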
## Step One: Run & expose php-apache server

To demonstrate the horizontal pod autoscaler we will use a custom Docker image based on the php-apache server.
The image can be found [here](/docs/user-guide/horizontal-pod-autoscaling/image).
It defines an [index.php](/docs/user-guide/horizontal-pod-autoscaling/image/index.php) page which performs some CPU-intensive computations.

First, we will start a deployment running the image and expose it as a service:

```shell
$ kubectl run php-apache --image=gcr.io/google_containers/hpa-example --requests=cpu=200m --expose --port=80
service "php-apache" created
deployment "php-apache" created
```
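
If you want to double-check that both objects were created, you can list them separately (the exact output columns may vary with your kubectl version):

```shell
$ kubectl get deployment php-apache
$ kubectl get service php-apache
```
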
## Step Two: Create horizontal pod autoscaler

Now that the server is running, we will create the autoscaler using
[kubectl autoscale](https://github.com/kubernetes/kubernetes/blob/{{page.githubbranch}}/docs/user-guide/kubectl/kubectl_autoscale.md).
The following command will create a horizontal pod autoscaler that maintains between 1 and 10 replicas of the Pods
controlled by the php-apache deployment we created in the first step of these instructions.
Roughly speaking, the horizontal autoscaler will increase and decrease the number of replicas
(via the deployment) to maintain an average CPU utilization across all Pods of 50%
(since each pod requests 200 milli-cores via [kubectl run](#kubectl-run), this means an average CPU usage of 100 milli-cores).
See [here](https://github.com/kubernetes/kubernetes/blob/{{page.githubbranch}}/docs/design/horizontal-pod-autoscaler.md#autoscaling-algorithm) for more details on the algorithm.
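
In short (and somewhat simplified), the autoscaler periodically sets the desired number of replicas to roughly `ceil(sum of the pods' current CPU utilization / target utilization)`; the design document linked above gives the exact rules.
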
```shell
$ kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10
deployment "php-apache" autoscaled
```
We can check the current status of the autoscaler by running:
```shell
$ kubectl get hpa
NAME         REFERENCE                     TARGET    CURRENT   MINPODS   MAXPODS   AGE
php-apache   Deployment/php-apache/scale   50%       0%        1         10        18s
```
Please note that the current CPU consumption is 0% as we are not sending any requests to the server
(the `CURRENT` column shows the average across all the pods controlled by the corresponding deployment).
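
For a more detailed view of the autoscaler, including recent scaling events, `kubectl describe` can be used (the exact fields shown depend on your cluster version):

```shell
$ kubectl describe hpa php-apache
```
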
## Step Three: Increase load

Now, we will see how the autoscaler reacts to increased load on the server.
We will start a container with the `busybox` image, running an infinite loop of queries to our server (please run it in a different terminal):
```shell
$ kubectl run -i --tty load-generator --image=busybox /bin/sh
Hit enter for command prompt
$ while true; do wget -q -O- http://php-apache.default.svc.cluster.local; done
```
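
While the load generator is running, you can also watch the autoscaler's status continuously from another terminal; the `-w` flag streams updates as they happen:

```shell
$ kubectl get hpa php-apache -w
```
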
We can examine how the CPU load has increased by executing (it usually takes about a minute):
```shell
$ kubectl get hpa
NAME         REFERENCE                     TARGET    CURRENT   MINPODS   MAXPODS   AGE
php-apache   Deployment/php-apache/scale   50%       305%      1         10        3m
```
In the case presented here, CPU consumption has increased to 305% of the request.
As a result, the deployment was resized to 7 replicas:
```shell
$ kubectl get deployment php-apache
NAME         DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
php-apache   7         7         7            7           19m
```
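
This replica count is consistent with the autoscaling rule sketched above: with an average utilization of 305% against a 50% target, roughly `ceil(305 / 50) = 7` replicas are needed to bring the average back toward the target.
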
**Warning!** Sometimes it may take a few steps for the number of replicas to stabilize.
Since the amount of load is not controlled in any way, the final number of replicas may
differ from this example.
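
To observe how the replica count converges, you can watch the deployment rather than polling it (press `<Ctrl> + C` to stop watching):

```shell
$ kubectl get deployment php-apache -w
```
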
## Step Four: Stop load

We will finish our example by stopping the user load.

In the terminal where we created the container with the `busybox` image, we will terminate
the infinite `while` loop by sending a `SIGINT` signal,
which can be done using the `<Ctrl> + C` key combination.
Then we will verify the resulting state:
```shell
$ kubectl get hpa
NAME         REFERENCE                     TARGET    CURRENT   MINPODS   MAXPODS   AGE
php-apache   Deployment/php-apache/scale   50%       0%        1         10        11m

$ kubectl get deployment php-apache
NAME         DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
php-apache   1         1         1            1           27m
```
As we see, in the presented case CPU utilization dropped to 0%, and the number of replicas dropped to 1.

**Warning!** Sometimes scaling the number of replicas back down may take a few steps.
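
Since `kubectl run` created the load generator as a deployment (just as it did for php-apache in Step One), its pod will be recreated if you simply exit the shell. To remove the load generator entirely, delete that deployment:

```shell
$ kubectl delete deployment load-generator
```
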
## Appendix: Other possible scenarios

### Creating the autoscaler from a .yaml file

Instead of using the `kubectl autoscale` command, we can use the [hpa-php-apache.yaml](/docs/user-guide/horizontal-pod-autoscaling/hpa-php-apache.yaml) file, which looks like this:
```yaml
apiVersion: extensions/v1beta1
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
  namespace: default
spec:
  scaleRef:
    kind: Deployment
    name: php-apache
    subresource: scale
  minReplicas: 1
  maxReplicas: 10
  cpuUtilization:
    targetPercentage: 50
```
We will create the autoscaler by executing the following command:
```shell
$ kubectl create -f docs/user-guide/horizontal-pod-autoscaling/hpa-php-apache.yaml
horizontalpodautoscaler "php-apache" created
```
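
Note that if an autoscaler named `php-apache` already exists (for example the one created with `kubectl autoscale` in Step Two), the `kubectl create` command above will refuse to create a second one with the same name; you can delete the existing autoscaler first:

```shell
$ kubectl delete hpa php-apache
```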