gRPC probes

pull/29740/head
Sergey Kanzhelev 2021-11-30 19:37:31 +00:00
parent 48612bee86
commit ef6668539c
4 changed files with 71 additions and 6 deletions

@@ -4,6 +4,8 @@ title: 'Health checking gRPC servers on Kubernetes'
date: 2018-10-01
---
_Built-in gRPC probes were introduced in Kubernetes 1.23. To learn more, see [Configure Liveness, Readiness and Startup Probes](/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#define-a-grpc-liveness-probe)._
**Author**: [Ahmet Alp Balkan](https://twitter.com/ahmetb) (Google)
[gRPC](https://grpc.io) is on its way to becoming the lingua franca for

@@ -122,6 +122,7 @@ different Kubernetes components.
| `ExperimentalHostUserNamespaceDefaulting` | `false` | Beta | 1.5 | |
| `GracefulNodeShutdown` | `false` | Alpha | 1.20 | 1.20 |
| `GracefulNodeShutdown` | `true` | Beta | 1.21 | |
| `GRPCContainerProbe` | `false` | Alpha | 1.23 | |
| `HPAContainerMetrics` | `false` | Alpha | 1.20 | |
| `HPAScaleToZero` | `false` | Alpha | 1.16 | |
| `IndexedJob` | `false` | Alpha | 1.21 | 1.21 |
@@ -573,10 +574,10 @@ Each feature gate is designed for enabling/disabling a specific feature:
  extended tokens by starting `kube-apiserver` with flag `--service-account-extend-token-expiration=false`.
  Check [Bound Service Account Tokens](https://github.com/kubernetes/enhancements/blob/master/keps/sig-auth/1205-bound-service-account-tokens/README.md)
  for more details.
- `ControllerManagerLeaderMigration`: Enables Leader Migration for
  [kube-controller-manager](/docs/tasks/administer-cluster/controller-manager-leader-migration/#initial-leader-migration-configuration) and
  [cloud-controller-manager](/docs/tasks/administer-cluster/controller-manager-leader-migration/#deploy-cloud-controller-manager) which allows a cluster operator to live migrate
  controllers from the kube-controller-manager into an external controller-manager
  (e.g. the cloud-controller-manager) in an HA cluster without downtime.
- `CPUManager`: Enable container level CPU affinity support, see
  [CPU Management Policies](/docs/tasks/administer-cluster/cpu-management-policies/).
@@ -782,6 +783,7 @@ Each feature gate is designed for enabling/disabling a specific feature:
  and gracefully terminate pods running on the node. See
  [Graceful Node Shutdown](/docs/concepts/architecture/nodes/#graceful-node-shutdown)
  for more details.
- `GRPCContainerProbe`: Enables the gRPC probe method for {Liveness,Readiness,Startup}Probe. See [Configure Liveness, Readiness and Startup Probes](/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#define-a-grpc-liveness-probe).
- `HPAContainerMetrics`: Enable the `HorizontalPodAutoscaler` to scale based on
  metrics from individual containers in target pods.
- `HPAScaleToZero`: Enables setting `minReplicas` to 0 for `HorizontalPodAutoscaler`
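
Because `GRPCContainerProbe` is listed above as alpha with a default of `false` in 1.23, a cluster that wants to try gRPC probes has to turn the gate on. The fragment below is a minimal sketch, assuming the kubelet is configured through a `KubeletConfiguration` file. The API server would most likely need the same gate (for example via `--feature-gates=GRPCContainerProbe=true`) so that the new probe field is accepted.

```yaml
# Sketch of a kubelet configuration file fragment that enables the alpha gate.
# Assumes the kubelet reads its settings from a KubeletConfiguration file.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  GRPCContainerProbe: true
```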

@@ -220,11 +220,57 @@ After 15 seconds, view Pod events to verify that liveness probes:
kubectl describe pod goproxy
```
## Define a gRPC liveness probe
{{< feature-state for_k8s_version="v1.23" state="alpha" >}}
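If your application implements the [gRPC Health Checking Protocol](https://github.com/grpc/grpc/blob/master/doc/health-checking.md),
the kubelet can be configured to use it for application liveness checks.
The `GRPCContainerProbe` feature gate is alpha and defaults to `false` in 1.23, so it must be enabled before gRPC probes can be used.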
{{< codenew file="pods/probe/grpc-liveness.yaml">}}
To use a gRPC probe, you must configure `port`. If the health endpoint is served
under a non-default service name, you must also configure `service`.
{{< note >}}
Unlike HTTP and TCP probes, gRPC probes cannot use named ports, and a custom host cannot be configured.
{{< /note >}}
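As a minimal sketch of the `port` and `service` fields, the fragment below sets both on a gRPC liveness probe. The service name `my-health-service` is hypothetical and chosen only for illustration; the port matches the etcd example used in this section.

```yaml
# Sketch only: the service name is a placeholder, not part of the original example.
livenessProbe:
  grpc:
    port: 2379                  # required; must be a number, named ports are not accepted
    service: my-health-service  # optional; sent as the service name in the health check request
  initialDelaySeconds: 10
```

When `service` is omitted, the probe checks the default (empty) service name.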
Configuration problems (for example, an incorrect port or service, or an unimplemented health checking protocol)
are treated as probe failures, as they are for HTTP and TCP probes.
Before Kubernetes 1.23, gRPC health probes were often implemented using [grpc-health-probe](https://github.com/grpc-ecosystem/grpc-health-probe/),
as described in the blog post [Health checking gRPC servers on Kubernetes](/blog/2018/10/01/health-checking-grpc-servers-on-kubernetes/).
The behavior of the built-in gRPC probe is similar to that of grpc-health-probe.
When migrating from grpc-health-probe to built-in probes, keep the following differences in mind:
- Built-in probes run against the pod IP address, unlike grpc-health-probe, which often runs against `127.0.0.1`.
  Be sure to configure your gRPC endpoint to listen on the pod's IP address.
- Built-in probes do not currently support any authentication parameters (like `-tls`).
- There are no error codes for built-in probes. All errors are treated as probe failures.
- If the `ExecProbeTimeout` feature gate is set to `false`, grpc-health-probe does NOT
  respect the `timeoutSeconds` setting (which defaults to 1s),
  while the built-in probe fails on timeout.
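A migration along these lines might look like the following sketch. The `exec` form follows the pattern from the 2018 blog post; the binary path `/bin/grpc_health_probe` and port `5000` are assumptions for illustration, not values from this change. The two probes are shown as separate YAML documents for comparison only, since a container has a single `livenessProbe`.

```yaml
# Before: exec probe that runs the grpc-health-probe binary inside the container.
# The binary path and port are assumptions for illustration.
livenessProbe:
  exec:
    command: ["/bin/grpc_health_probe", "-addr=:5000"]
  initialDelaySeconds: 10
---
# After: built-in gRPC probe (alpha in v1.23, behind the GRPCContainerProbe gate).
# The kubelet connects to the pod IP, so the server must not listen only on 127.0.0.1.
livenessProbe:
  grpc:
    port: 5000
  initialDelaySeconds: 10
  timeoutSeconds: 1   # the default; the built-in probe fails when this is exceeded
```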
To try the gRPC liveness check, create a Pod using the command below.
In this example, the etcd Pod is configured to use a gRPC liveness probe.
```shell
kubectl apply -f https://k8s.io/examples/pods/probe/grpc-liveness.yaml
```
After 15 seconds, view Pod events to verify that the liveness probe has not failed:
```shell
kubectl describe pod etcd-with-grpc
```
## Use a named port
You can use a named
[ContainerPort](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#containerport-v1-core)
for HTTP and TCP probes. Note that gRPC probes do not support named ports.
```yaml
ports:
@@ -349,7 +395,7 @@ This defect was corrected in Kubernetes v1.20. You may have been relying on the
even without realizing it, as the default timeout is 1 second.
As a cluster administrator, you can disable the [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) `ExecProbeTimeout` (set it to `false`)
on each kubelet to restore the behavior from older versions, then remove that override
once all the exec probes in the cluster have a `timeoutSeconds` value set.
If you have pods that are impacted from the default 1 second timeout,
you should update their probe timeout so that you're ready for the
eventual removal of that feature gate.

@@ -0,0 +1,15 @@
apiVersion: v1
kind: Pod
metadata:
  name: etcd-with-grpc
spec:
  containers:
  - name: etcd
    image: k8s.gcr.io/etcd:3.5.1-0
    command: [ "/usr/local/bin/etcd", "--data-dir", "/var/lib/etcd", "--listen-client-urls", "http://0.0.0.0:2379", "--advertise-client-urls", "http://127.0.0.1:2379", "--log-level", "debug"]
    ports:
    - containerPort: 2379
    livenessProbe:
      grpc:
        port: 2379
      initialDelaySeconds: 10