From ef6668539c37efab4338ff05728fcdd3d97c2bc1 Mon Sep 17 00:00:00 2001
From: Sergey Kanzhelev
Date: Tue, 30 Nov 2021 19:37:31 +0000
Subject: [PATCH] gRPC probes

---
 .../_posts/2018-10-01-health-checking-grpc.md |  2 +
 .../feature-gates.md                          | 10 ++--
 ...igure-liveness-readiness-startup-probes.md | 50 ++++++++++++++++++-
 .../en/examples/pods/probe/grpc-liveness.yaml | 15 ++++++
 4 files changed, 71 insertions(+), 6 deletions(-)
 create mode 100644 content/en/examples/pods/probe/grpc-liveness.yaml

diff --git a/content/en/blog/_posts/2018-10-01-health-checking-grpc.md b/content/en/blog/_posts/2018-10-01-health-checking-grpc.md
index 21eb668dc2..e6e584b274 100644
--- a/content/en/blog/_posts/2018-10-01-health-checking-grpc.md
+++ b/content/en/blog/_posts/2018-10-01-health-checking-grpc.md
@@ -4,6 +4,8 @@ title: 'Health checking gRPC servers on Kubernetes'
 date: 2018-10-01
 ---
 
+_Built-in gRPC probes were introduced in Kubernetes 1.23. To learn more, see [Configure Liveness, Readiness and Startup Probes](/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#define-a-grpc-liveness-probe)._
+
 **Author**: [Ahmet Alp Balkan](https://twitter.com/ahmetb) (Google)
 
 [gRPC](https://grpc.io) is on its way to becoming the lingua franca for
diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates.md b/content/en/docs/reference/command-line-tools-reference/feature-gates.md
index 73bf143a43..63088e2cc2 100644
--- a/content/en/docs/reference/command-line-tools-reference/feature-gates.md
+++ b/content/en/docs/reference/command-line-tools-reference/feature-gates.md
@@ -122,6 +122,7 @@ different Kubernetes components.
 | `ExperimentalHostUserNamespaceDefaulting` | `false` | Beta | 1.5 | |
 | `GracefulNodeShutdown` | `false` | Alpha | 1.20 | 1.20 |
 | `GracefulNodeShutdown` | `true` | Beta | 1.21 | |
+| `GRPCContainerProbe` | `false` | Alpha | 1.23 | |
 | `HPAContainerMetrics` | `false` | Alpha | 1.20 | |
 | `HPAScaleToZero` | `false` | Alpha | 1.16 | |
 | `IndexedJob` | `false` | Alpha | 1.21 | 1.21 |
@@ -573,10 +574,10 @@ Each feature gate is designed for enabling/disabling a specific feature:
   extended tokens by starting `kube-apiserver` with flag `--service-account-extend-token-expiration=false`.
   Check [Bound Service Account Tokens](https://github.com/kubernetes/enhancements/blob/master/keps/sig-auth/1205-bound-service-account-tokens/README.md)
   for more details.
-- `ControllerManagerLeaderMigration`: Enables Leader Migration for
-  [kube-controller-manager](/docs/tasks/administer-cluster/controller-manager-leader-migration/#initial-leader-migration-configuration) and
-  [cloud-controller-manager](/docs/tasks/administer-cluster/controller-manager-leader-migration/#deploy-cloud-controller-manager) which allows a cluster operator to live migrate
-  controllers from the kube-controller-manager into an external controller-manager
+- `ControllerManagerLeaderMigration`: Enables Leader Migration for
+  [kube-controller-manager](/docs/tasks/administer-cluster/controller-manager-leader-migration/#initial-leader-migration-configuration) and
+  [cloud-controller-manager](/docs/tasks/administer-cluster/controller-manager-leader-migration/#deploy-cloud-controller-manager) which allows a cluster operator to live migrate
+  controllers from the kube-controller-manager into an external controller-manager
   (e.g. the cloud-controller-manager) in an HA cluster without downtime.
 - `CPUManager`: Enable container level CPU affinity support, see
   [CPU Management Policies](/docs/tasks/administer-cluster/cpu-management-policies/).
@@ -782,6 +783,7 @@ Each feature gate is designed for enabling/disabling a specific feature:
   and gracefully terminate pods running on the node. See
   [Graceful Node Shutdown](/docs/concepts/architecture/nodes/#graceful-node-shutdown)
   for more details.
+- `GRPCContainerProbe`: Enables the gRPC probe method for {Liveness,Readiness,Startup}Probe. See [Configure Liveness, Readiness and Startup Probes](/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#define-a-grpc-liveness-probe).
 - `HPAContainerMetrics`: Enable the `HorizontalPodAutoscaler` to scale based on
   metrics from individual containers in target pods.
 - `HPAScaleToZero`: Enables setting `minReplicas` to 0 for `HorizontalPodAutoscaler`
diff --git a/content/en/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes.md b/content/en/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes.md
index d9ab2056da..2ef2b1368c 100644
--- a/content/en/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes.md
+++ b/content/en/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes.md
@@ -220,11 +220,57 @@ After 15 seconds, view Pod events to verify that liveness probes:
 kubectl describe pod goproxy
 ```
 
+## Define a gRPC liveness probe
+
+{{< feature-state for_k8s_version="v1.23" state="alpha" >}}
+
+If your application implements the [gRPC Health Checking Protocol](https://github.com/grpc/grpc/blob/master/doc/health-checking.md),
+the kubelet can be configured to use it for application liveness checks.
+
+{{< codenew file="pods/probe/grpc-liveness.yaml">}}
+
+To use a gRPC probe, `port` must be configured. If the health endpoint is configured
+on a non-default service, `service` must also be configured.
+
+{{< note >}}
+Unlike HTTP and TCP probes, gRPC probes cannot use named ports, and a custom host cannot be configured.
+{{< /note >}}
+
+Configuration problems (for example: an incorrect port or service, or an unimplemented health checking protocol)
+are considered a probe failure, similar to HTTP and TCP probes.
+
+Before Kubernetes 1.23, gRPC health probes were often implemented using [grpc-health-probe](https://github.com/grpc-ecosystem/grpc-health-probe/),
+as described in the blog post [Health checking gRPC servers on Kubernetes](/blog/2018/10/01/health-checking-grpc-servers-on-kubernetes/).
+The behavior of the built-in gRPC probes is similar to that of grpc-health-probe.
+When migrating from grpc-health-probe to built-in probes, remember the following differences:
+
+- Built-in probes run against the Pod IP address, unlike grpc-health-probe, which often runs against `127.0.0.1`.
+  Be sure to configure your gRPC endpoint to listen on the Pod's IP address.
+- Built-in probes do not currently support any authentication parameters (like `-tls`).
+- There are no error codes in built-in probes. All errors are treated as probe failures.
+- If the `ExecProbeTimeout` feature gate is set to `false`, grpc-health-probe does **not**
+  respect the `timeoutSeconds` setting (which defaults to 1s),
+  while the built-in probe fails on timeout.
+
+To try the gRPC liveness check, create a Pod using the command below.
+In this example, an etcd Pod is configured to use a gRPC liveness probe.
+
+
+```shell
+kubectl apply -f https://k8s.io/examples/pods/probe/grpc-liveness.yaml
+```
+
+After 15 seconds, view Pod events to verify that the liveness probe has not failed:
+
+```shell
+kubectl describe pod etcd-with-grpc
+```
+
 ## Use a named port
 
 You can use a named
 [ContainerPort](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#containerport-v1-core)
-for HTTP or TCP liveness checks:
+for HTTP and TCP probes. Note that gRPC probes do not support named ports.
 
 ```yaml
 ports:
@@ -349,7 +395,7 @@ This defect was corrected in Kubernetes v1.20. You may have been relying on the
 even without realizing it, as the default timeout is 1 second.
 As a cluster administrator, you can disable the [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) `ExecProbeTimeout` (set it to `false`)
 on each kubelet to restore the behavior from older versions, then remove that override
-once all the exec probes in the cluster have a `timeoutSeconds` value set.
+once all the exec probes in the cluster have a `timeoutSeconds` value set.
 If you have pods that are impacted from the default 1 second timeout,
 you should update their probe timeout so that you're ready for the
 eventual removal of that feature gate.
diff --git a/content/en/examples/pods/probe/grpc-liveness.yaml b/content/en/examples/pods/probe/grpc-liveness.yaml
new file mode 100644
index 0000000000..84d716df28
--- /dev/null
+++ b/content/en/examples/pods/probe/grpc-liveness.yaml
@@ -0,0 +1,15 @@
+apiVersion: v1
+kind: Pod
+metadata:
+  name: etcd-with-grpc
+spec:
+  containers:
+  - name: etcd
+    image: k8s.gcr.io/etcd:3.5.1-0
+    command: [ "/usr/local/bin/etcd", "--data-dir", "/var/lib/etcd", "--listen-client-urls", "http://0.0.0.0:2379", "--advertise-client-urls", "http://127.0.0.1:2379", "--log-level", "debug"]
+    ports:
+    - containerPort: 2379
+    livenessProbe:
+      grpc:
+        port: 2379
+      initialDelaySeconds: 10
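
Note on the `service` field: the documentation added in this patch says that `service` must be configured when the health endpoint is registered under a non-default service, but the example manifest only sets `port`. A minimal sketch of that variant is shown here. The service name `my-health-service` and the choice of a readiness probe are assumptions made for illustration and are not part of this patch.

```yaml
# Hypothetical fragment of a container spec; not part of the patch above.
readinessProbe:
  grpc:
    port: 2379                   # must be numeric; gRPC probes do not support named ports
    service: my-health-service   # assumed name, sent as the service in the health check request
  initialDelaySeconds: 10
```

If `service` is omitted, the probe checks the default (empty) service name, which is what the etcd example relies on.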