website/docs/admin/disruptions.md

96 lines
4.3 KiB
Markdown
Raw Normal View History

2016-09-16 17:21:21 +00:00
---
assignees:
- mml
---
This guide is for anyone wishing to specify safety constraints on pods or anyone
wishing to write software (typically automation software) that respects those
constraints.
* TOC
{:toc}
## Rationale
Various cluster management operations may voluntarily evict pods. "Voluntary"
means an eviction can be safely delayed for a reasonable period of time. The
principal examples today are draining a node for maintenance or upgrade
(`kubectl drain`), and cluster autoscaling down. In the future the
[rescheduler](https://github.com/kubernetes/kubernetes/blob/master/docs/proposals/rescheduling.md)
may also perform voluntary evictions. By contrast, something like evicting pods
because a node has become unreachable or reports `NotReady`, is not "voluntary."
2016-09-16 17:21:21 +00:00
For voluntary evictions, it can be useful for applications to be able to limit
the number of pods that are down simultaneously. For example, a quorum-based application would
2016-09-16 17:21:21 +00:00
like to ensure that the number of replicas running is never brought below the
number needed for a quorum, even temporarily. Or a web front end might want to
ensure that the number of replicas serving load never falls below a certain
percentage of the total, even briefly. `PodDisruptionBudget` is an API object
that specifies the minimum number or percentage of replicas of a collection that
must be up at a time. Components that wish to evict a pod subject to disruption
budget use the `/eviction` subresource; unlike a regular pod deletion, this
operation may be rejected by the API server if the eviction would cause a
disruption budget to be violated.
## Specifying a PodDisruptionBudget
A `PodDisruptionBudget` has two components: a label selector `selector` to specify the set of
pods to which it applies, and `minAvailable` which is a description of the number of pods from that
set that must still be available after the eviction, i.e. even in the absence
of the evicted pod. `minAvailable` can be either an absolute number or a percentage.
So for example, 100% means no voluntary evictions from the set are permitted. In
2016-09-16 17:21:21 +00:00
typical usage, a single budget would be used for a collection of pods managed by
a controller—for example, the pods in a single ReplicaSet.
Note that a disruption budget does not truly guarantee that the specified
number/percentage of pods will always be up. For example, a node that hosts a
pod from the collection may fail when the collection is at the minimum size
specified in the budget, thus bringing the number of available pods from the
collection below the specified size. The budget can only protect against
voluntary evictions, not all causes of unavailability.
## Requesting an eviction
If you are writing infrastructure software that wants to produce these voluntary
evictions, you will need to use the eviction API. The eviction subresource of a
pod can be thought of as a kind of policy-controlled DELETE operation on the pod
itself. To attempt an eviction (perhaps more REST-precisely, to attempt to
*create* an eviction), you POST an attempted operation. Here's an example:
```json
{
"apiVersion": "policy/v1alpha1",
"kind": "Eviction",
"name": "quux",
"namespace": "default"
}
```
and here is how you would attempt this with `curl`
```bash
curl -v -X POST -H 'Content-type: application/json'
http://127.0.0.1:8080/api/v1/namespaces/default/pods/quux/eviction -d
eviction.json
```
The API can respond in one of three ways.
1. If the eviction is granted, then the pod is deleted just as if you had sent
a `DELETE` request to the pod's URL and you get back `200 OK`.
2. If the current state of affairs wouldn't allow an eviction by the rules set
forth in the budget, you get back `429 Too Many Requests`. This is
typically used for generic rate limiting of *any* requests, but here we mean
that this request isn't allowed *right now* but it may be allowed later.
2016-09-16 17:21:21 +00:00
Currently, callers do not get any `Retry-After` advice, but they may in
future versions.
3. If there is some kind of misconfiguration, like multiple budgets pointing at
the same pod, you will get `500 Internal Server Error`.
For a given eviction request, there are two cases.
1. There is no budget that matches this pod. In this case, the server always
returns `200 OK`.
2. There is at least one budget. In this case, any of the three above responses may
2016-09-16 17:21:21 +00:00
apply.