From e408383ed6e4f0039b8cfe0bccba0e7a089c2aec Mon Sep 17 00:00:00 2001
From: Andrew Sy Kim
Date: Tue, 15 Nov 2022 10:56:38 -0500
Subject: [PATCH] KEP-1669: update docs for ProxyTerminatingEndpoints

Signed-off-by: Andrew Sy Kim
---
 .../concepts/services-networking/service.md   | 38 +++++++++++--------
 .../feature-gates.md                          |  5 ++-
 2 files changed, 25 insertions(+), 18 deletions(-)

diff --git a/content/en/docs/concepts/services-networking/service.md b/content/en/docs/concepts/services-networking/service.md
index ae723282bd..8e47d534ed 100644
--- a/content/en/docs/concepts/services-networking/service.md
+++ b/content/en/docs/concepts/services-networking/service.md
@@ -516,22 +516,6 @@ Valid values are `Cluster` and `Local`. Set the field to `Cluster` to route exte
 and `Local` to only route to ready node-local endpoints. If the traffic policy is `Local` and there are
 no node-local endpoints, the kube-proxy does not forward any traffic for the relevant Service.
 
-{{< note >}}
-{{< feature-state for_k8s_version="v1.22" state="alpha" >}}
-If you enable the `ProxyTerminatingEndpoints`
-[feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
-for the kube-proxy, the kube-proxy checks if the node
-has local endpoints and whether or not all the local endpoints are marked as terminating.
-If there are local endpoints and **all** of those are terminating, then the kube-proxy ignores
-any external traffic policy of `Local`. Instead, whilst the node-local endpoints remain as all
-terminating, the kube-proxy forwards traffic for that Service to healthy endpoints elsewhere,
-as if the external traffic policy were set to `Cluster`.
-This forwarding behavior for terminating endpoints exists to allow external load balancers to
-gracefully drain connections that are backed by `NodePort` Services, even when the health check
-node port starts to fail. Otherwise, traffic can be lost between the time a node is still in the node pool of a load
-balancer and traffic is being dropped during the termination period of a pod.
-{{< /note >}}
-
 ### Internal traffic policy
 
 {{< feature-state for_k8s_version="v1.22" state="beta" >}}
@@ -541,6 +525,28 @@ Valid values are `Cluster` and `Local`. Set the field to `Cluster` to route inte
 and `Local` to only route to ready node-local endpoints. If the traffic policy is `Local` and there are
 no node-local endpoints, traffic is dropped by kube-proxy.
 
+### Traffic to terminating endpoints
+
+{{< feature-state for_k8s_version="v1.26" state="beta" >}}
+
+If the `ProxyTerminatingEndpoints`
+[feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
+is enabled in kube-proxy and the traffic policy is `Local`, that node's
+kube-proxy uses a more complicated algorithm to select endpoints for a Service.
+With the feature enabled, kube-proxy checks whether the node
+has local endpoints and whether all of the local endpoints are marked as terminating.
+If there are local endpoints and **all** of them are terminating, then kube-proxy
+will forward traffic to those terminating endpoints. Otherwise, kube-proxy will always
+prefer forwarding traffic to endpoints that are not terminating.
+
+This forwarding behavior for terminating endpoints exists to allow `NodePort` and `LoadBalancer` Services to
+gracefully drain connections when using `externalTrafficPolicy=Local`. As a Deployment goes through
+a rolling update, nodes backing a load balancer may transition from N to 0 replicas of that Deployment.
+In some cases, external load balancers can send traffic to a node with 0 replicas in between health check probes.
+Routing traffic to terminating endpoints ensures that nodes that are scaling down Pods can gracefully receive
+and drain traffic to those terminating Pods. By the time the Pod completes termination, the external load balancer
+should have seen the node's health check failing and fully removed the node from the backend pool.
+
 ## Discovering services
 
 Kubernetes supports 2 primary modes of finding a Service - environment
diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates.md b/content/en/docs/reference/command-line-tools-reference/feature-gates.md
index c1814f1143..232d212d3d 100644
--- a/content/en/docs/reference/command-line-tools-reference/feature-gates.md
+++ b/content/en/docs/reference/command-line-tools-reference/feature-gates.md
@@ -161,7 +161,8 @@ For a reference to old feature gates that are removed, please refer to
 | `ProbeTerminationGracePeriod` | `false` | Beta | 1.22 | 1.24 |
 | `ProbeTerminationGracePeriod` | `true` | Beta | 1.25 | |
 | `ProcMountType` | `false` | Alpha | 1.12 | |
-| `ProxyTerminatingEndpoints` | `false` | Alpha | 1.22 | |
+| `ProxyTerminatingEndpoints` | `false` | Alpha | 1.22 | 1.25 |
+| `ProxyTerminatingEndpoints` | `true` | Beta | 1.26 | |
 | `QOSReserved` | `false` | Alpha | 1.11 | |
 | `ReadWriteOncePod` | `false` | Alpha | 1.22 | |
 | `RecoverVolumeExpansionFailure` | `false` | Alpha | 1.23 | |
@@ -646,7 +647,7 @@ Each feature gate is designed for enabling/disabling a specific feature:
   filesystem walk for better performance and accuracy.
 - `LogarithmicScaleDown`: Enable semi-random selection of pods to evict on
   controller scaledown based on logarithmic bucketing of pod timestamps.
-- `MatchLabelKeysInPodTopologySpread`: Enable the `matchLabelKeys` field for 
+- `MatchLabelKeysInPodTopologySpread`: Enable the `matchLabelKeys` field for
   [Pod topology spread constraints](/docs/concepts/scheduling-eviction/topology-spread-constraints/).
 - `MaxUnavailableStatefulSet`: Enables setting the `maxUnavailable` field for
   the [rolling update strategy](/docs/concepts/workloads/controllers/statefulset/#rolling-updates)
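
As a quick illustration of the behavior this patch documents, the drain-friendly forwarding applies to a Service using the `Local` external traffic policy, along the lines of the sketch below. The Service name, selector, and port numbers here are hypothetical, for illustration only; they do not come from the patch.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app        # hypothetical example name
spec:
  type: LoadBalancer
  selector:
    app: my-app       # hypothetical selector
  ports:
  - port: 80
    targetPort: 8080
  # With Local, external traffic is only routed to node-local endpoints.
  # From v1.26 (ProxyTerminatingEndpoints beta), if ALL node-local
  # endpoints are terminating, kube-proxy still forwards to them so the
  # external load balancer can drain connections gracefully while its
  # health checks remove the node from the backend pool.
  externalTrafficPolicy: Local
```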