---
reviewers:
- davidopp
- mml
- foxish
- kow3ns
title: Safely Drain a Node
content_type: task
min-kubernetes-server-version: 1.5
---
<!-- overview -->
This page shows how to safely drain a {{< glossary_tooltip text="node" term_id="node" >}},
optionally respecting the PodDisruptionBudget you have defined.
## {{% heading "prerequisites" %}}
{{% version-check %}}
This task also assumes that you have met the following prerequisites:
1. You do not require your applications to be highly available during the
node drain, or
1. You have read about the [PodDisruptionBudget](/docs/concepts/workloads/pods/disruptions/) concept,
and have [configured PodDisruptionBudgets](/docs/tasks/run-application/configure-pdb/) for
applications that need them.
<!-- steps -->
## (Optional) Configure a disruption budget {#configure-poddisruptionbudget}

To ensure that your workloads remain available during maintenance, you can
configure a [PodDisruptionBudget](/docs/concepts/workloads/pods/disruptions/).

If availability is important for any applications that run or could run on the node(s)
that you are draining, [configure a PodDisruptionBudget](/docs/tasks/run-application/configure-pdb/)
first and then continue following this guide.
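For example, here is a minimal sketch of creating such a budget; the `zk-pdb` name, the `app: zookeeper` selector, and `minAvailable: 2` are illustrative assumptions, not values this guide requires:

```shell
# Hypothetical PodDisruptionBudget: keep at least 2 Pods labelled app=zookeeper available.
# Adjust the name, selector, and minAvailable to match your own workload.
kubectl apply -f - <<EOF
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: zk-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: zookeeper
EOF
```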
## Use `kubectl drain` to remove a node from service
You can use `kubectl drain` to safely evict all of your pods from a
node before you perform maintenance on the node (e.g. kernel upgrade,
hardware maintenance, etc.). Safe evictions allow the pod's containers
to [gracefully terminate](/docs/concepts/workloads/pods/pod-lifecycle/#pod-termination)
and will respect the PodDisruptionBudgets you have specified.
{{< note >}}
By default, `kubectl drain` ignores certain system pods on the node
that cannot be killed; see
the [kubectl drain](/docs/reference/generated/kubectl/kubectl-commands/#drain)
documentation for more details.
{{< /note >}}
When `kubectl drain` returns successfully, that indicates that all of
the pods (except the ones excluded as described in the previous paragraph)
have been safely evicted (respecting the desired graceful termination period,
and respecting the PodDisruptionBudget you have defined). It is then safe to
bring down the node by powering down its physical machine or, if running on a
cloud platform, deleting its virtual machine.
First, identify the name of the node you wish to drain. You can list all of the nodes in your cluster with
```shell
kubectl get nodes
```
Next, tell Kubernetes to drain the node:
```shell
kubectl drain <node name>
```
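Depending on what is running on the node, you may need extra flags; for example, `kubectl drain` refuses to proceed while DaemonSet-managed Pods are present unless you pass `--ignore-daemonsets`. A minimal sketch, using a hypothetical node name:

```shell
# Drain a hypothetical node named "node-1", skipping (not evicting) DaemonSet-managed Pods.
kubectl drain node-1 --ignore-daemonsets
```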
Once it returns (without giving an error), you can power down the node
(or equivalently, if on a cloud platform, delete the virtual machine backing the node).
If you leave the node in the cluster during the maintenance operation, you need to run
```shell
kubectl uncordon <node name>
```
afterwards to tell Kubernetes that it can resume scheduling new pods onto the node.
## Draining multiple nodes in parallel
The `kubectl drain` command should only be issued to a single node at a
time. However, you can run multiple `kubectl drain` commands for
different nodes in parallel, in different terminals or in the
background. Multiple drain commands running concurrently will still
respect the PodDisruptionBudget you specify.
For example, if you have a StatefulSet with three replicas and have
set a PodDisruptionBudget for that set specifying `minAvailable: 2`,
`kubectl drain` only evicts a pod from the StatefulSet if all three
replica Pods are ready; if you then issue multiple drain commands in
parallel, Kubernetes respects the PodDisruptionBudget and ensures
that only 1 (calculated as `replicas - minAvailable`) Pod is unavailable
at any given time. Any drains that would cause the number of ready
replicas to fall below the specified budget are blocked.
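As an illustration, a minimal sketch of draining two hypothetical nodes from one terminal by running each drain in the background:

```shell
# Drain two nodes concurrently; each command still respects your PodDisruptionBudgets.
# "node-1" and "node-2" are hypothetical names; substitute your own.
kubectl drain node-1 --ignore-daemonsets &
kubectl drain node-2 --ignore-daemonsets &
wait  # block until both background drains have finished
```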
## The Eviction API {#eviction-api}
If you prefer not to use [kubectl drain](/docs/reference/generated/kubectl/kubectl-commands/#drain) (such as
to avoid calling out to an external command, or to get finer control over the pod
eviction process), you can also programmatically cause evictions using the eviction API.

You should first be familiar with using [Kubernetes language clients](/docs/tasks/administer-cluster/access-cluster-api/#programmatic-access-to-the-api) to access the API.
The eviction subresource of a
Pod can be thought of as a kind of policy-controlled DELETE operation on the Pod
itself. To attempt an eviction (more precisely: to attempt to
*create* an Eviction), you POST an attempted operation. Here's an example:
```json
{
  "apiVersion": "policy/v1beta1",
  "kind": "Eviction",
  "metadata": {
    "name": "quux",
    "namespace": "default"
  }
}
```
You can attempt an eviction using `curl`:
```bash
curl -v -H 'Content-type: application/json' https://your-cluster-api-endpoint.example/api/v1/namespaces/default/pods/quux/eviction -d @eviction.json
```
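Alternatively, if `kubectl` is already configured for your cluster, one way (a sketch of one option, not the only one) to POST the same body without handling credentials by hand is `kubectl create --raw`, assuming the Eviction JSON above is saved as `eviction.json`:

```shell
# POST the Eviction object using the credentials and API endpoint from your kubeconfig.
kubectl create -f eviction.json --raw /api/v1/namespaces/default/pods/quux/eviction
```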
The API can respond in one of three ways:
- If the eviction is granted, then the Pod is deleted as if you sent
a `DELETE` request to the Pod's URL and received back `200 OK`.
- If the current state of affairs wouldn't allow an eviction by the rules set
forth in the budget, you get back `429 Too Many Requests`. This is
typically used for generic rate limiting of *any* requests, but here we mean
that this request isn't allowed *right now* but it may be allowed later.
- If there is some kind of misconfiguration, for example multiple PodDisruptionBudgets
that refer to the same Pod, you get a `500 Internal Server Error` response.
For a given eviction request, there are two cases:
2017-06-26 20:54:25 +00:00
2017-12-09 02:32:03 +00:00
- There is no budget that matches this pod. In this case, the server always
returns `200 OK`.
- There is at least one budget. In this case, any of the three above responses may
apply.
## Stuck evictions

In some cases, an application may reach a broken state, one where unless you intervene the
eviction API will never return anything other than 429 or 500.

For example, this can happen if a ReplicaSet is creating Pods for your application but
the replacement Pods do not become `Ready`. You can also see similar symptoms if the
last Pod evicted has a very long termination grace period.
In this case, there are two potential solutions:
- Abort or pause the automated operation. Investigate the reason for the stuck application,
and restart the automation.
- After a suitably long wait, `DELETE` the Pod from your cluster's control plane, instead
of using the eviction API.
Kubernetes does not specify what the behavior should be in this case; it is up to the
application owners and cluster owners to establish an agreement on behavior in these cases.
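As an illustration of the second option, a minimal sketch that deletes the example Pod `quux` directly; unlike an eviction, a plain delete does not consult any PodDisruptionBudget:

```shell
# Delete the Pod directly, bypassing the eviction API (and therefore any PodDisruptionBudget).
kubectl delete pod quux --namespace default
```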
## {{% heading "whatsnext" %}}
* Follow steps to protect your application by [configuring a Pod Disruption Budget](/docs/tasks/run-application/configure-pdb/).