Merge pull request #40873 from windsonsea/podfai

[zh] sync pod-failure-policy.md
Kubernetes Prow Robot 2023-04-28 00:52:16 -07:00 committed by GitHub
commit 39141bfb82
2 changed files with 200 additions and 0 deletions


@@ -49,6 +49,13 @@ You should already be familiar with the basic use of [Job](/docs/concepts/worklo
{{< include "task-tutorial-prereqs.md" >}} {{< version-check >}}
Ensure that the [feature gates](/docs/reference/command-line-tools-reference/feature-gates/)
`PodDisruptionConditions` and `JobPodFailurePolicy` are both enabled in your cluster.
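How you set feature gates depends on how your cluster was deployed; as a minimal
sketch (the `--feature-gates` flag is real, the invocations below are illustrative),
both gates can be passed to the control plane components:

```sh
# Illustrative only: enable both gates on the API server and on the
# controller manager (which runs the Job controller); adjust for your
# deployment method, e.g. kubeadm or a managed control plane.
kube-apiserver --feature-gates=PodDisruptionConditions=true,JobPodFailurePolicy=true ...
kube-controller-manager --feature-gates=PodDisruptionConditions=true,JobPodFailurePolicy=true ...
```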
## Using Pod failure policy to avoid unnecessary Pod retries
@@ -218,6 +225,180 @@ The cluster automatically cleans up the Pods.
The cluster automatically cleans up the Pods.
## Using Pod failure policy to avoid unnecessary Pod retries based on custom Pod conditions

With the following example, you can learn how to use Pod failure policy to
avoid unnecessary Pod restarts based on custom Pod conditions.
{{< note >}}
The example below works since Kubernetes v1.27, as it relies on deleted Pods in
the `Pending` phase transitioning to a terminal phase
(see: [Pod phase](/docs/concepts/workloads/pods/pod-lifecycle/#pod-phase)).
{{< /note >}}
1. First, create a Job based on the config:
{{< codenew file="/controllers/job-pod-failure-policy-config-issue.yaml" >}}
by running:
```sh
kubectl create -f job-pod-failure-policy-config-issue.yaml
```
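To confirm that the Job and its Pods were created (an optional check, not
part of the original steps), you can list them:

```sh
kubectl get job job-pod-failure-policy-config-issue
kubectl get pods -l job-name=job-pod-failure-policy-config-issue
```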
Note that the image is misconfigured, as it does not exist.
2. Inspect the status of the job's Pods by running:
```sh
kubectl get pods -l job-name=job-pod-failure-policy-config-issue -o yaml
```
You will see output similar to this:
```yaml
    containerStatuses:
    - image: non-existing-repo/non-existing-image:example
      ...
      state:
        waiting:
          message: Back-off pulling image "non-existing-repo/non-existing-image:example"
          reason: ImagePullBackOff
    ...
    phase: Pending
```
Note that the pod remains in the `Pending` phase as it fails to pull the
misconfigured image. This, in principle, could be a transient issue and the
image could get pulled. However, in this case, the image does not exist, so
we indicate this fact by a custom condition.
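For a quick look at just the phase of each Pod, a jsonpath query like the
following (a sketch reusing the same label selector) can be handy:

```sh
kubectl get pods -l job-name=job-pod-failure-policy-config-issue \
  -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.phase}{"\n"}{end}'
```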
3. Add the custom condition. First prepare the patch by running:
```sh
cat <<EOF > patch.yaml
status:
  conditions:
  - type: ConfigIssue
    status: "True"
    reason: "NonExistingImage"
    lastTransitionTime: "$(date -u +"%Y-%m-%dT%H:%M:%SZ")"
EOF
```
Second, select one of the pods created by the job by running:
```sh
podName=$(kubectl get pods -l job-name=job-pod-failure-policy-config-issue -o jsonpath='{.items[0].metadata.name}')
```
Then, apply the patch on one of the pods by running the following command:
```sh
kubectl patch pod $podName --subresource=status --patch-file=patch.yaml
```
If applied successfully, you will get a notification like this:
```sh
pod/job-pod-failure-policy-config-issue-k6pvp patched
```
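To double-check that the condition was recorded, you can query it directly
(a sketch; the `ConfigIssue` type comes from the patch prepared above):

```sh
kubectl get pod $podName \
  -o jsonpath='{.status.conditions[?(@.type=="ConfigIssue")]}'
```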
4. Delete the pod to transition it to the `Failed` phase, by running the command:
```sh
kubectl delete pods/$podName
```
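Because the Pod now carries the `ConfigIssue` condition matched by the
`FailJob` rule, it moves to a terminal phase instead of being retried; as an
optional check, you can observe this with:

```sh
kubectl get pods -l job-name=job-pod-failure-policy-config-issue --watch
```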
5. Inspect the status of the Job by running:
```sh
kubectl get jobs -l job-name=job-pod-failure-policy-config-issue -o yaml
```
In the Job status, see a Job `Failed` condition with the field `reason`
equal to `PodFailurePolicy`. Additionally, the `message` field contains
more detailed information about the Job termination, such as:
`Pod default/job-pod-failure-policy-config-issue-k6pvp has condition ConfigIssue matching FailJob rule at index 0`.
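The relevant part of the status might look roughly like the following sketch,
reconstructed from the fields named above:

```yaml
status:
  conditions:
  - type: Failed
    status: "True"
    reason: PodFailurePolicy
    message: Pod default/job-pod-failure-policy-config-issue-k6pvp has condition
      ConfigIssue matching FailJob rule at index 0
```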
{{< note >}}
In a production environment, steps 3 and 4 should be automated by a
user-provided controller.
{{< /note >}}
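For illustration only, the patch-and-delete flow could be approximated by a
shell loop (a crude sketch, not a real controller; it assumes the `patch.yaml`
file from step 3 and the Job's label selector):

```sh
# Hypothetical sketch: mark and delete every Pending Pod of the Job so
# the pod failure policy can match the ConfigIssue condition.
for pod in $(kubectl get pods -l job-name=job-pod-failure-policy-config-issue \
    --field-selector=status.phase=Pending -o name); do
  kubectl patch "$pod" --subresource=status --patch-file=patch.yaml
  kubectl delete "$pod"
done
```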
### Cleaning up

Delete the Job you created:
```sh
kubectl delete jobs/job-pod-failure-policy-config-issue
```
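If the `patch.yaml` file from step 3 is still in your working directory, you
can remove it as well:

```sh
rm patch.yaml
```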
The cluster automatically cleans up the Pods.
## Alternatives


@@ -0,0 +1,19 @@
apiVersion: batch/v1
kind: Job
metadata:
  name: job-pod-failure-policy-config-issue
spec:
  completions: 8
  parallelism: 2
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: main
        image: "non-existing-repo/non-existing-image:example"
  backoffLimit: 6
  podFailurePolicy:
    rules:
    - action: FailJob
      onPodConditions:
      - type: ConfigIssue