gRPC probes

pull/29740/head
Sergey Kanzhelev 2021-11-30 19:37:31 +00:00
parent 48612bee86
commit ef6668539c
4 changed files with 71 additions and 6 deletions

@@ -4,6 +4,8 @@ title: 'Health checking gRPC servers on Kubernetes'
date: 2018-10-01
---
_Built-in gRPC probes were introduced in Kubernetes 1.23. To learn more, see [Configure Liveness, Readiness and Startup Probes](/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#define-a-grpc-liveness-probe)._
**Author**: [Ahmet Alp Balkan](https://twitter.com/ahmetb) (Google)
[gRPC](https://grpc.io) is on its way to becoming the lingua franca for

@@ -122,6 +122,7 @@ different Kubernetes components.
| `ExperimentalHostUserNamespaceDefaulting` | `false` | Beta | 1.5 | |
| `GracefulNodeShutdown` | `false` | Alpha | 1.20 | 1.20 |
| `GracefulNodeShutdown` | `true` | Beta | 1.21 | |
| `GRPCContainerProbe` | `false` | Alpha | 1.23 | |
| `HPAContainerMetrics` | `false` | Alpha | 1.20 | |
| `HPAScaleToZero` | `false` | Alpha | 1.16 | |
| `IndexedJob` | `false` | Alpha | 1.21 | 1.21 |
@@ -573,10 +574,10 @@ Each feature gate is designed for enabling/disabling a specific feature:
extended tokens by starting `kube-apiserver` with flag `--service-account-extend-token-expiration=false`.
Check [Bound Service Account Tokens](https://github.com/kubernetes/enhancements/blob/master/keps/sig-auth/1205-bound-service-account-tokens/README.md)
for more details.
- `ControllerManagerLeaderMigration`: Enables Leader Migration for
[kube-controller-manager](/docs/tasks/administer-cluster/controller-manager-leader-migration/#initial-leader-migration-configuration) and
[cloud-controller-manager](/docs/tasks/administer-cluster/controller-manager-leader-migration/#deploy-cloud-controller-manager) which allows a cluster operator to live migrate
controllers from the kube-controller-manager into an external controller-manager
(e.g. the cloud-controller-manager) in an HA cluster without downtime.
- `CPUManager`: Enable container level CPU affinity support, see
[CPU Management Policies](/docs/tasks/administer-cluster/cpu-management-policies/).
@@ -782,6 +783,7 @@ Each feature gate is designed for enabling/disabling a specific feature:
and gracefully terminate pods running on the node. See
[Graceful Node Shutdown](/docs/concepts/architecture/nodes/#graceful-node-shutdown)
for more details.
- `GRPCContainerProbe`: Enables the gRPC probe method for {Liveness,Readiness,Startup}Probe. See [Configure Liveness, Readiness and Startup Probes](/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#define-a-grpc-liveness-probe).
- `HPAContainerMetrics`: Enable the `HorizontalPodAutoscaler` to scale based on
metrics from individual containers in target pods.
- `HPAScaleToZero`: Enables setting `minReplicas` to 0 for `HorizontalPodAutoscaler`

@@ -220,11 +220,57 @@ After 15 seconds, view Pod events to verify that liveness probes:
kubectl describe pod goproxy
```
## Define a gRPC liveness probe
{{< feature-state for_k8s_version="v1.23" state="alpha" >}}
If your application implements the [gRPC Health Checking Protocol](https://github.com/grpc/grpc/blob/master/doc/health-checking.md),
the kubelet can be configured to use it for application liveness checks.
{{< codenew file="pods/probe/grpc-liveness.yaml">}}
To use a gRPC probe, you must configure the `port`. If the health endpoint is configured
on a non-default service, you must also specify the `service`.
{{< note >}}
Unlike HTTP and TCP probes, you cannot use a named port, and you cannot configure a custom host.
{{< /note >}}
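For example, a liveness probe spec that checks a specific service name on the gRPC health endpoint might look like the following sketch (the service name `my-health-service` is illustrative, not part of the original example):
```yaml
livenessProbe:
  grpc:
    port: 2379
    # Optional: the service name sent in the gRPC HealthCheckRequest.
    # Defaults to "" (the server's overall health status).
    service: my-health-service   # illustrative name
  initialDelaySeconds: 10
```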
Configuration problems (for example, an incorrect port or service, or an unimplemented health checking protocol)
are considered probe failures, similar to HTTP and TCP probes.
Before Kubernetes 1.23, gRPC health probes were often implemented using [grpc-health-probe](https://github.com/grpc-ecosystem/grpc-health-probe/),
as described in the blog post [Health checking gRPC servers on Kubernetes](/blog/2018/10/01/health-checking-grpc-servers-on-kubernetes/).
The built-in gRPC probe behaves similarly to the one implemented by grpc-health-probe.
When migrating from grpc-health-probe to built-in probes, remember the following differences (a before/after sketch follows this list):
- Built-in probes run against the Pod's IP address, unlike grpc-health-probe, which is often configured to run against `127.0.0.1`.
  Be sure to configure your gRPC endpoint to listen on the Pod's IP address.
- Built-in probes do not currently support any authentication parameters (like `-tls`).
- There are no error codes in built-in probes. All errors are considered probe failures.
- If the `ExecProbeTimeout` feature gate is set to `false`, grpc-health-probe does NOT
  respect the `timeoutSeconds` setting (which defaults to 1s),
  while the built-in probe fails on timeout.
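As a rough before/after sketch of this migration (the port is illustrative, and the exec variant assumes the grpc-health-probe binary was bundled into the container image):
```yaml
# Before: exec probe shelling out to the bundled grpc-health-probe binary,
# which connects to the server from inside the container.
livenessProbe:
  exec:
    command: ["/bin/grpc_health_probe", "-addr=:2379"]

# After: built-in gRPC probe; the kubelet connects to the Pod IP directly.
livenessProbe:
  grpc:
    port: 2379
```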
To try the gRPC liveness check, create a Pod using the command below.
In the example below, the etcd Pod is configured to use a gRPC liveness probe.
```shell
kubectl apply -f https://k8s.io/examples/pods/probe/grpc-liveness.yaml
```
After 15 seconds, view Pod events to verify that the liveness probe has not failed:
```shell
kubectl describe pod etcd-with-grpc
```
## Use a named port
You can use a named
[ContainerPort](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#containerport-v1-core)
for HTTP and TCP probes. Note that gRPC probes do not support named ports.
```yaml
ports:
@@ -349,7 +395,7 @@ This defect was corrected in Kubernetes v1.20. You may have been relying on the
even without realizing it, as the default timeout is 1 second.
As a cluster administrator, you can disable the [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) `ExecProbeTimeout` (set it to `false`)
on each kubelet to restore the behavior from older versions, then remove that override
once all the exec probes in the cluster have a `timeoutSeconds` value set.
If you have pods that are impacted from the default 1 second timeout,
you should update their probe timeout so that you're ready for the
eventual removal of that feature gate.
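For instance, an exec probe with an explicit timeout might look like this (the command and value are illustrative):
```yaml
livenessProbe:
  exec:
    command: ["cat", "/tmp/healthy"]
  # Explicit timeout, so behavior no longer depends on the ExecProbeTimeout
  # feature gate or on the 1-second default.
  timeoutSeconds: 5
```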

@@ -0,0 +1,15 @@
apiVersion: v1
kind: Pod
metadata:
  name: etcd-with-grpc
spec:
  containers:
  - name: etcd
    image: k8s.gcr.io/etcd:3.5.1-0
    command: [ "/usr/local/bin/etcd", "--data-dir", "/var/lib/etcd", "--listen-client-urls", "http://0.0.0.0:2379", "--advertise-client-urls", "http://127.0.0.1:2379", "--log-level", "debug"]
    ports:
    - containerPort: 2379
    livenessProbe:
      grpc:
        port: 2379
      initialDelaySeconds: 10