Merge pull request #50000 from jm-franc/configurable-tolerance-blog
Add HPA 'configurable tolerance' blog post (KEP-4951).
commit 2247c6c90d
@@ -0,0 +1,96 @@
---
layout: blog
title: "Kubernetes v1.33: HorizontalPodAutoscaler Configurable Tolerance"
slug: kubernetes-1-33-hpa-configurable-tolerance
# after the v1.33 release, set a future publication date and remove the draft marker
# the release comms team can confirm which date has been assigned
#
# PRs to remove the draft marker should be opened BEFORE release day
draft: true
math: true # for formulae
date: XXXX-XX-XX
author: "Jean-Marc François (Google)"
---

This post describes _configurable tolerance for horizontal Pod autoscaling_,
a new alpha feature first available in Kubernetes 1.33.

## What is it?

[Horizontal Pod Autoscaling](/docs/tasks/run-application/horizontal-pod-autoscale/)
is a well-known Kubernetes feature that allows your workload to
automatically resize by adding or removing replicas based on resource
utilization.

Let's say you have a web application running in a Kubernetes cluster with 50
replicas. You configure the Horizontal Pod Autoscaler (HPA) to scale based on
CPU utilization, with a target of 75% utilization. Now, imagine that the current
CPU utilization across all replicas is 90%, which is higher than the desired
75%. The HPA will calculate the required number of replicas using the formula:
```math
desiredReplicas = \left\lceil currentReplicas \times \frac{currentMetricValue}{desiredMetricValue} \right\rceil
```

In this example:
```math
50 \times (90/75) = 60
```

So, the HPA will increase the number of replicas from 50 to 60 to reduce the
load on each Pod. Similarly, if the CPU utilization were to drop below 75%, the
HPA would scale down the number of replicas accordingly. The Kubernetes
documentation provides a
[detailed description of the scaling algorithm](https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#algorithm-details).
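
The formula above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the controller's actual code; the function and parameter names are hypothetical.

```python
import math


def desired_replicas(current_replicas: int,
                     current_metric: float,
                     desired_metric: float) -> int:
    """Replica count given by the basic HPA formula (no tolerance applied).

    Hypothetical helper for illustration; not part of any Kubernetes API.
    """
    if desired_metric <= 0:
        raise ValueError("desired_metric must be positive")
    # Multiply before dividing so that whole-number inputs such as
    # 50 * 90 / 75 stay exact in floating point.
    return math.ceil(current_replicas * current_metric / desired_metric)


# The example from the text: 50 replicas at 90% CPU with a 75% target.
print(desired_replicas(50, 90, 75))  # 60
```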

In order to avoid replicas being created or deleted whenever a small metric
fluctuation occurs, Kubernetes applies a form of hysteresis: it only changes the
number of replicas when the current and desired metric values differ by more
than 10%. In the example above, since the ratio between the current and desired
metric values is \\(90/75\\), or 20% above target, exceeding the 10% tolerance,
the scale-up action will proceed.
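
As a rough sketch of this check, the Python below skips scaling when the metric ratio is within the tolerance of 1.0, and otherwise applies the replica formula. It assumes the comparison is made on the ratio between the current and desired metric values, as described above; the names are hypothetical and the real controller handles many more cases (missing metrics, unready Pods, and so on).

```python
import math

DEFAULT_TOLERANCE = 0.10  # the 10% cluster-wide default described in the text


def replicas_with_tolerance(current_replicas: int,
                            current_metric: float,
                            desired_metric: float,
                            tolerance: float = DEFAULT_TOLERANCE) -> int:
    """Hypothetical helper: apply the tolerance check, then the HPA formula."""
    ratio = current_metric / desired_metric
    # Within the tolerance band around 1.0: leave the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    # Outside the band: scale according to the formula.
    return math.ceil(current_replicas * current_metric / desired_metric)
```

With the numbers from the example, `replicas_with_tolerance(50, 90, 75)` scales to 60 because the ratio is 20% above target, whereas a reading of 80% (about 6.7% above target) would leave the workload at 50 replicas.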

This default tolerance of 10% is cluster-wide; in older Kubernetes releases, it
could not be fine-tuned. It's a suitable value for most use cases, but too coarse
for large deployments, where a 10% tolerance represents tens of Pods. As a
result, the community has long
[asked](https://github.com/kubernetes/kubernetes/issues/116984) to be able to
tune this value.
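
To get a sense of scale, consider a hypothetical workload of 500 replicas (a number chosen here purely for illustration) whose metric sits 9% above target. The replica formula would ask for:

```math
\left\lceil 500 \times 1.09 \right\rceil = 545
```

Because 9% is within the 10% tolerance, the HPA keeps the workload at 500 replicas, and the 45 additional Pods are never created.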

In Kubernetes v1.33, this is now possible.

## How do I use it?

After enabling the `HPAConfigurableTolerance`
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/) in
your Kubernetes v1.33 cluster, you can set the desired tolerance on your
HorizontalPodAutoscaler object.

Tolerances appear under the `spec.behavior.scaleDown` and
`spec.behavior.scaleUp` fields and can thus be different for scale up and scale
down. A typical usage would be to specify a small tolerance on scale up (to
react quickly to spikes), but a higher one on scale down (to avoid adding and
removing replicas too quickly in response to small metric fluctuations).

For example, an HPA with a tolerance of 5% on scale-down, and no tolerance on
scale-up, would look like the following:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app
spec:
  ...
  behavior:
    scaleDown:
      tolerance: 0.05
    scaleUp:
      tolerance: 0
```
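
To see what separate tolerances mean in practice, here is a minimal Python sketch. It assumes that scaling is skipped while the metric ratio stays between `1 - scaleDown tolerance` and `1 + scaleUp tolerance`, which is one reading of the behavior described above rather than a copy of the controller's code. All names are hypothetical.

```python
import math


def replicas_with_tolerances(current_replicas: int,
                             current_metric: float,
                             desired_metric: float,
                             scale_up_tolerance: float = 0.10,
                             scale_down_tolerance: float = 0.10) -> int:
    """Hypothetical helper: per-direction tolerance check, then the HPA formula."""
    ratio = current_metric / desired_metric
    # Assumed rule: no change while the ratio stays inside the band
    # [1 - scale_down_tolerance, 1 + scale_up_tolerance].
    if (1.0 - scale_down_tolerance) <= ratio <= (1.0 + scale_up_tolerance):
        return current_replicas
    return math.ceil(current_replicas * current_metric / desired_metric)


# Settings from the manifest above: 5% on scale-down, 0 on scale-up.
for cpu in (76, 72, 70):
    print(cpu, replicas_with_tolerances(50, cpu, 75,
                                        scale_up_tolerance=0.0,
                                        scale_down_tolerance=0.05))
```

Under these assumptions, with 50 replicas and a 75% target, any reading above 75% triggers a scale-up, while a scale-down only happens once utilization falls more than 5% below target.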

## I want all the details!

Get all the technical details by reading
[KEP-4951](https://github.com/kubernetes/enhancements/tree/master/keps/sig-autoscaling/4951-configurable-hpa-tolerance)
and follow [issue 4951](https://github.com/kubernetes/enhancements/issues/4951)
to be notified of the feature's graduation.