Document behavior of endpoints with the feature EndpointSliceTerminatingCondition (#36791)

* new behavior of endpoints with the feature gate EndpointSliceTerminatingCondition

* Update content/en/docs/concepts/workloads/pods/pod-lifecycle.md

Co-authored-by: Tim Bannister <tim@scalefactory.com>

* Update content/en/docs/concepts/workloads/pods/pod-lifecycle.md

Co-authored-by: Tim Bannister <tim@scalefactory.com>

* Update content/en/docs/concepts/workloads/pods/pod-lifecycle.md

Co-authored-by: Antonio Ojea <antonio.ojea.garcia@gmail.com>

* fixing feature gate versions

---------

Co-authored-by: Tim Bannister <tim@scalefactory.com>
Co-authored-by: Antonio Ojea <antonio.ojea.garcia@gmail.com>
pull/39463/head
Sergey Kanzhelev 2023-03-15 08:32:17 -07:00 committed by GitHub
parent 3971a57a19
commit ae17e46f73
3 changed files with 276 additions and 5 deletions


@ -296,7 +296,7 @@ Each probe must define exactly one of these four mechanisms:
The target should implement
[gRPC health checks](https://grpc.io/grpc/core/md_doc_health-checking.html).
The diagnostic is considered successful if the `status`
of the response is `SERVING`.
gRPC probes are an alpha feature and are only available if you
enable the `GRPCContainerProbe`
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/).
@ -465,14 +465,32 @@ An example flow:
The containers in the Pod receive the TERM signal at different times and in an arbitrary
order. If the order of shutdowns matters, consider using a `preStop` hook to synchronize.
{{< /note >}}
1. At the same time as the kubelet is starting graceful shutdown of the Pod, the control plane
evaluates whether to remove that shutting-down Pod from EndpointSlice (and Endpoints) objects,
where those objects represent
a {{< glossary_tooltip term_id="service" text="Service" >}} with a configured
{{< glossary_tooltip text="selector" term_id="selector" >}}.
{{< glossary_tooltip text="ReplicaSets" term_id="replica-set" >}} and other workload resources
no longer treat the shutting-down Pod as a valid, in-service replica. Pods that shut down slowly
should not continue to serve regular traffic and should start terminating and finish processing open connections.
Some applications need to go beyond finishing open connections and need more graceful termination,
for example: session draining and completion. Any endpoints that represent the terminating Pods
are not immediately removed from EndpointSlices,
and a status indicating [terminating state](/docs/concepts/services-networking/endpoint-slices/#conditions)
is exposed from the EndpointSlice API (and the legacy Endpoints API). Terminating
endpoints always have their `ready` status set to `false`
(for backward compatibility with versions before 1.26),
so load balancers will not use them for regular traffic.
If traffic draining on a terminating Pod is needed, the actual readiness can be checked
via the `serving` condition.
You can find more details on how to implement connection draining
in the tutorial [Pods And Endpoints Termination Flow](/docs/tutorials/services/pods-and-endpoint-termination-flow/).
{{< note >}}
If you don't have the `EndpointSliceTerminatingCondition` feature gate enabled
in your cluster (the gate is on by default from Kubernetes 1.22, and locked to its default in 1.26),
then the Kubernetes control plane removes a Pod from any relevant EndpointSlices as soon as the Pod's
termination grace period _begins_. The behavior described above applies only when the
`EndpointSliceTerminatingCondition` feature gate is enabled.
{{< /note >}}
1. When the grace period expires, the kubelet triggers forcible shutdown. The container runtime sends
`SIGKILL` to any processes still running in any container in the Pod.
The kubelet also cleans up a hidden `pause` container if that container runtime uses one.
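
For reference, an endpoint that belongs to a terminating Pod looks roughly like the following
EndpointSlice fragment. This is an illustrative sketch only; the names and the address are
example values, not output from a real cluster:

```yaml
# Illustrative EndpointSlice fragment; names and addresses are examples only.
apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
metadata:
  name: example-abc
  labels:
    kubernetes.io/service-name: example
addressType: IPv4
ports:
  - port: 80
    protocol: TCP
endpoints:
  - addresses:
      - "10.1.2.3"
    conditions:
      ready: false       # always false once termination begins
      serving: true      # still able to handle requests while draining
      terminating: true
```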


@ -0,0 +1,221 @@
---
title: Explore Termination Behavior for Pods And Their Endpoints
content_type: tutorial
weight: 60
---
<!-- overview -->
Once you have connected your application to a Service by following steps
like those outlined in [Connecting Applications with Services](/docs/tutorials/services/connect-applications-service/),
you have a continuously running, replicated application that is exposed on a network.
This tutorial helps you look at the termination flow for Pods and explore ways to implement
graceful connection draining.
<!-- body -->
## Termination process for Pods and their endpoints

There are often cases when you need to terminate a Pod, whether to upgrade it or scale it down.
To improve application availability, it may be important to implement proper draining of
active connections. This tutorial explains the flow of Pod termination in connection with the
corresponding endpoint state and removal, using a simple nginx web server to demonstrate the concept.
## Example flow with endpoint termination
The following is the example of the flow described in the
[Termination of Pods](/docs/concepts/workloads/pods/pod-lifecycle/#pod-termination)
document.
Let's say you have a Deployment containing a single `nginx` replica
(just for demonstration purposes) and a Service:
{{< codenew file="service/pod-with-graceful-termination.yaml" >}}
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      terminationGracePeriodSeconds: 120 # extra long grace period
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80
        lifecycle:
          preStop:
            exec:
              # Real-life termination may take any time up to terminationGracePeriodSeconds.
              # In this example, the container just hangs around for longer than
              # terminationGracePeriodSeconds; at 120 seconds it will be forcibly terminated.
              # Note that nginx will keep processing requests the whole time.
              command: [
                "/bin/sh", "-c", "sleep 180"
              ]
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
```
Once the Pod and Service are running, you can get the name of any associated EndpointSlices:
```shell
kubectl get endpointslice
```
The output is similar to this:
```none
NAME                  ADDRESSTYPE   PORTS   ENDPOINTS                 AGE
nginx-service-6tjbr   IPv4          80      10.12.1.199,10.12.1.201   22m
```
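
If you only want to see the endpoint conditions, a `-o jsonpath` query can narrow the output.
This is a sketch; it requires access to a cluster, and you should substitute the EndpointSlice
name from your own cluster:

```shell
kubectl get endpointslice nginx-service-6tjbr \
  -o jsonpath='{range .endpoints[*]}{.addresses[0]} {.conditions}{"\n"}{end}'
```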
You can see its status, and validate that there is one endpoint registered:
```shell
kubectl get endpointslices -o json -l kubernetes.io/service-name=nginx-service
```
The output is similar to this:
```none
{
    "addressType": "IPv4",
    "apiVersion": "discovery.k8s.io/v1",
    "endpoints": [
        {
            "addresses": [
                "10.12.1.201"
            ],
            "conditions": {
                "ready": true,
                "serving": true,
                "terminating": false
```
Now let's terminate the Pod and validate that the Pod is being terminated
respecting the graceful termination period configuration:
```shell
kubectl delete pod nginx-deployment-7768647bf9-b4b9s
```
List all the Pods:
```shell
kubectl get pods
```
The output is similar to this:
```none
NAME                                READY   STATUS        RESTARTS   AGE
nginx-deployment-7768647bf9-b4b9s   1/1     Terminating   0          4m1s
nginx-deployment-7768647bf9-rkxlw   1/1     Running       0          8s
```
You can see that the new pod got scheduled.
While the new endpoint is being created for the new Pod, the old endpoint is
still around in the terminating state:
```shell
kubectl get endpointslice -o json nginx-service-6tjbr
```
The output is similar to this:
```none
{
    "addressType": "IPv4",
    "apiVersion": "discovery.k8s.io/v1",
    "endpoints": [
        {
            "addresses": [
                "10.12.1.201"
            ],
            "conditions": {
                "ready": false,
                "serving": true,
                "terminating": true
            },
            "nodeName": "gke-main-default-pool-dca1511c-d17b",
            "targetRef": {
                "kind": "Pod",
                "name": "nginx-deployment-7768647bf9-b4b9s",
                "namespace": "default",
                "uid": "66fa831c-7eb2-407f-bd2c-f96dfe841478"
            },
            "zone": "us-central1-c"
        },
        {
            "addresses": [
                "10.12.1.202"
            ],
            "conditions": {
                "ready": true,
                "serving": true,
                "terminating": false
            },
            "nodeName": "gke-main-default-pool-dca1511c-d17b",
            "targetRef": {
                "kind": "Pod",
                "name": "nginx-deployment-7768647bf9-rkxlw",
                "namespace": "default",
                "uid": "722b1cbe-dcd7-4ed4-8928-4a4d0e2bbe35"
            },
            "zone": "us-central1-c"
```
This allows applications to communicate their state during termination,
and clients (such as load balancers) to implement connection draining functionality.
These clients may detect terminating endpoints and implement special logic for them.
In Kubernetes, endpoints that are terminating always have their `ready` status set to `false`.
This needs to happen for backward compatibility, so existing load balancers will not use
the endpoint for regular traffic.
If traffic draining on terminating Pods is needed, the actual readiness can be checked
via the `serving` condition.
When the Pod is deleted, the old endpoint will also be deleted.
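
The selection logic such a draining-aware client might apply can be sketched in Python. This is
purely illustrative and not part of any Kubernetes client library: route regular traffic only to
`ready` endpoints, and optionally fall back to terminating-but-`serving` endpoints when draining:

```python
# Sketch of client-side endpoint selection honoring EndpointSlice conditions.
# Illustrative only; not part of any Kubernetes client library.

def pick_endpoints(endpoints, drain=False):
    """Return the addresses a client should route to.

    endpoints: list of dicts shaped like EndpointSlice `endpoints` entries.
    drain: if True and no endpoint is ready, also accept endpoints that are
           still `serving` (typically terminating), so open sessions can drain.
    """
    ready = [ep["addresses"][0] for ep in endpoints
             if ep["conditions"].get("ready")]
    if ready or not drain:
        return ready
    # No ready endpoints: fall back to ones that can still serve while terminating.
    return [ep["addresses"][0] for ep in endpoints
            if ep["conditions"].get("serving")]

# Endpoints shaped like the example output above.
eps = [
    {"addresses": ["10.12.1.201"],
     "conditions": {"ready": False, "serving": True, "terminating": True}},
    {"addresses": ["10.12.1.202"],
     "conditions": {"ready": True, "serving": True, "terminating": False}},
]

regular = pick_endpoints(eps)               # only the ready endpoint
draining = pick_endpoints(eps[:1], drain=True)  # terminating endpoint still serving
```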
## {{% heading "whatsnext" %}}
* Learn how to [Connect Applications with Services](/docs/tutorials/services/connect-applications-service/)
* Learn more about [Using a Service to Access an Application in a Cluster](/docs/tasks/access-application-cluster/service-access-application-cluster/)
* Learn more about [Connecting a Front End to a Back End Using a Service](/docs/tasks/access-application-cluster/connecting-frontend-backend/)
* Learn more about [Creating an External Load Balancer](/docs/tasks/access-application-cluster/create-external-load-balancer/)


@ -0,0 +1,32 @@
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      terminationGracePeriodSeconds: 120 # extra long grace period
      containers:
      - name: nginx
        image: nginx:latest
        ports:
        - containerPort: 80
        lifecycle:
          preStop:
            exec:
              # Real-life termination may take any time up to terminationGracePeriodSeconds.
              # In this example, the container just hangs around for longer than
              # terminationGracePeriodSeconds; at 120 seconds it will be forcibly terminated.
              # Note that nginx will keep processing requests the whole time.
              command: [
                "/bin/sh", "-c", "sleep 180"
              ]