[zh] Tidy up and fix links in tasks section (2/10)
parent 664464806c
commit 1281821884

@ -495,7 +495,7 @@ For more information, please see [kubectl scale](/docs/reference/generated/kubec
Sometimes it's necessary to make narrow, non-disruptive updates to resources you've created.
-->

## 就地更新资源
## 就地更新资源 {#in-place-updates-of-resources}

有时,有必要对您所创建的资源进行小范围、无干扰地更新。

@ -170,7 +170,7 @@ On AWS, master node sizes are currently set at cluster startup time and do not c
<!--
### Addon Resources
-->

### 插件资源
### 插件资源 {#addon-resources}

<!--
To prevent memory leaks or other resource issues in [cluster addons](https://releases.k8s.io/{{< param "githubbranch" >}}/cluster/addons) from consuming all the resources available on a node, Kubernetes sets resource limits on addon containers to limit the CPU and Memory resources they can consume (See PR [#10653](http://pr.k8s.io/10653/files) and [#10778](http://pr.k8s.io/10778/files)).

@ -319,7 +319,7 @@ their own IPs. In many cases, the node IPs, pod IPs, and some service IPs on a
routable, so they will not be reachable from a machine outside the cluster,
such as your desktop machine.
-->

## 访问集群中正在运行的服务
## 访问集群中正在运行的服务 {#accessing-services-running-on-the-cluster}

上一节介绍了如何连接 Kubernetes API 服务。本节介绍如何连接到 Kubernetes 集群上运行的其他服务。
在 Kubernetes 中,[节点](/docs/admin/node),[pods](/docs/user-guide/pods) 和 [服务](/docs/user-guide/services) 都有自己的 IP。

@ -19,14 +19,10 @@ application Container runs.
-->

本文介绍在应用容器运行前,怎样利用 Init 容器初始化 Pod。

## {{% heading "prerequisites" %}}

{{< include "task-tutorial-prereqs.md" >}} {{< version-check >}}

<!-- steps -->

<!--
@ -38,8 +34,7 @@ container starts.

Here is the configuration file for the Pod:
-->

## 创建一个包含 Init 容器的 Pod
## 创建一个包含 Init 容器的 Pod {#creating-a-pod-that-has-an-init-container}

本例中您将创建一个包含一个应用容器和一个 Init 容器的 Pod。Init 容器在应用容器启动前运行完成。

@ -1,18 +1,9 @@
---
reviewers:
- bprashanth
- enisoc
- erictune
- foxish
- janetkuo
- kow3ns
- smarterclayton
title: 调试 Init 容器
content_type: task
---

<!--
---
reviewers:
- bprashanth
- enisoc
@ -23,7 +14,6 @@ reviewers:
- smarterclayton
title: Debug Init Containers
content_type: task
---
-->

<!-- overview -->

@ -34,15 +24,12 @@ Init Containers. The example command lines below refer to the Pod as
`<pod-name>` and the Init Containers as `<init-container-1>` and
`<init-container-2>`.
-->

此页显示如何核查与 init 容器执行相关的问题。
下面的示例命令行将 Pod 称为 `<pod-name>`,而 init 容器称为 `<init-container-1>` 和 `<init-container-2>`。

此页显示如何核查与 Init 容器执行相关的问题。
下面的示例命令行将 Pod 称为 `<pod-name>`,而 Init 容器称为 `<init-container-1>` 和
`<init-container-2>`。

## {{% heading "prerequisites" %}}

{{< include "task-tutorial-prereqs.md" >}} {{< version-check >}}

<!--
@ -51,10 +38,8 @@ Init Containers. The example command lines below refer to the Pod as
* You should have [Configured an Init Container](/docs/tasks/configure-pod-container/configure-pod-initialization/#creating-a-pod-that-has-an-init-container/).
-->

* 您应该熟悉 [Init 容器](/docs/concepts/abstractions/init-containers/)的基础知识。
* 您应该已经[配置好一个 Init 容器](/docs/tasks/configure-pod-container/configure-pod-initialization/#creating-a-pod-that-has-an-init-container/)。

* 你应该熟悉 [Init 容器](/zh/docs/concepts/workloads/pods/init-containers/)的基础知识。
* 你应该已经[配置好一个 Init 容器](/zh/docs/tasks/configure-pod-container/configure-pod-initialization/#creating-a-pod-that-has-an-init-container/)。

<!-- steps -->

@ -88,18 +73,14 @@ NAME READY STATUS RESTARTS AGE
See [Understanding Pod status](#understanding-pod-status) for more examples of
status values and their meanings.
-->

更多状态值及其含义请参考[了解 Pod 的状态](#understanding-pod-status)。
更多状态值及其含义请参考[理解 Pod 的状态](#understanding-pod-status)。

<!--
## Getting details about Init Containers
-->

## 获取 Init 容器详情

<!--
View more detailed information about Init Container execution:
-->
## 获取 Init 容器详情 {#getting-details-about-init-containers}

查看 Init 容器运行的更多详情:

@ -110,8 +91,7 @@ kubectl describe pod <pod-name>
<!--
For example, a Pod with two Init Containers might show the following:
-->

例如,对于包含两个 Init 容器的 Pod 应该显示如下信息:
例如,对于包含两个 Init 容器的 Pod 可能显示如下信息:

```
Init Containers:
@ -145,8 +125,7 @@ Init Containers:
You can also access the Init Container statuses programmatically by reading the
`status.initContainerStatuses` field on the Pod Spec:
-->

您还可以通过读取 Pod Spec 上的 `status.initContainerStatuses` 字段以编程方式了解 Init 容器的状态:
你还可以通过编程方式读取 Pod Spec 上的 `status.initContainerStatuses` 字段,了解 Init 容器的状态:

```shell
kubectl get pod nginx --template '{{.status.initContainerStatuses}}'
@ -155,21 +134,17 @@ kubectl get pod nginx --template '{{.status.initContainerStatuses}}'
<!--
This command will return the same information as above in raw JSON.
-->

此命令将以原始 JSON 格式返回与上面相同的信息。
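
除了上面的 Go 模板方式之外,也可以用 JSONPath 只取出某个 Init 容器的状态字段;下面是一个示意,`<pod-name>` 需替换为实际的 Pod 名称:

```shell
# 查看第一个 Init 容器的 state 字段
kubectl get pod <pod-name> -o jsonpath='{.status.initContainerStatuses[0].state}'
```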

<!--
## Accessing logs from Init Containers
-->

## 通过 Init 容器访问日志

<!--
Pass the Init Container name along with the Pod name
to access its logs.
-->
## 通过 Init 容器访问日志 {#accessing-logs-from-init-containers}

一起传递 Init 容器名称与 Pod 名称来访问它的日志。
与 Pod 名称一起传递 Init 容器名称,以访问容器的日志。

```shell
kubectl logs <pod-name> -c <init-container-2>
@ -180,25 +155,19 @@ Init Containers that run a shell script print
commands as they're executed. For example, you can do this in Bash by running
`set -x` at the beginning of the script.
-->

运行 shell 脚本打印命令的init容器,执行 shell 脚本。
例如,您可以在 Bash 中通过在脚本的开头运行 `set -x` 来实现。

运行 Shell 脚本的 Init 容器在执行 Shell 脚本时输出命令本身。
例如,你可以在 Bash 中通过在脚本的开头运行 `set -x` 来实现。
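
下面是一个极简的示意脚本,演示 `set -x` 的效果;其中等待名为 `myservice` 的服务只是一个假设的初始化步骤:

```shell
#!/bin/sh
# 打开命令回显,这样 Init 容器的日志中会记录每条被执行的命令
set -x
# 假设的初始化逻辑:等待依赖的服务可以被解析
until nslookup myservice; do
  echo "waiting for myservice"
  sleep 2
done
```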

<!-- discussion -->

<!--
## Understanding Pod status
-->

## 了解 Pod 的状态

<!--
A Pod status beginning with `Init:` summarizes the status of Init Container
execution. The table below describes some example status values that you might
see while debugging Init Containers.
-->
## 理解 Pod 的状态 {#understanding-pod-status}

以 `Init:` 开头的 Pod 状态汇总了 Init 容器执行的状态。
下表介绍调试 Init 容器时可能看到的一些状态值示例。

@ -213,13 +182,11 @@ Status | Meaning
`PodInitializing` or `Running` | The Pod has already finished executing Init Containers.
-->

状态 | 含义
状态 | 含义
------ | -------
`Init:N/M` | Pod 包含 `M` 个 Init 容器,其中 `N` 个已经运行完成。
`Init:Error` | Init 容器已执行失败。
`Init:CrashLoopBackOff` | Init 容器反复执行失败。
`Init:CrashLoopBackOff` | Init 容器执行总是失败。
`Pending` | Pod 还没有开始执行 Init 容器。
`PodInitializing` or `Running` | Pod 已经完成执行 Init 容器。
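
调试时若想持续观察这些状态值的变化,可以用 `--watch` 选项跟踪 Pod 的状态列;下面是一个示意,`<pod-name>` 需替换为实际名称:

```shell
kubectl get pod <pod-name> --watch
```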
@ -4,10 +4,8 @@ content_type: task
|
|||
---
|
||||
|
||||
<!--
|
||||
---
|
||||
title: Determine the Reason for Pod Failure
|
||||
content_type: task
|
||||
---
|
||||
-->
|
||||
|
||||
<!-- overview -->
|
||||
|
@ -16,33 +14,24 @@ content_type: task
|
|||
This page shows how to write and read a Container
|
||||
termination message.
|
||||
-->
|
||||
|
||||
本文介绍如何编写和读取容器的终止消息。
|
||||
|
||||
<!--
|
||||
Termination messages provide a way for containers to write information about
|
||||
fatal events to a location where it can be easily retrieved and surfaced by
|
||||
tools like dashboards and monitoring software. In most cases, information that
|
||||
you put in a termination message should also be written to the general
|
||||
Termination messages provide a way for containers to write information about
|
||||
fatal events to a location where it can be easily retrieved and surfaced by
|
||||
tools like dashboards and monitoring software. In most cases, information that
|
||||
you put in a termination message should also be written to the general
|
||||
[Kubernetes logs](/docs/concepts/cluster-administration/logging/).
|
||||
-->
|
||||
|
||||
终止消息为容器提供了一种方法,可以将有关致命事件的信息写入某个位置,在该位置可以通过仪表板和监控软件等工具轻松检索和显示致命事件。
|
||||
在大多数情况下,您放入终止消息中的信息也应该写入[常规 Kubernetes 日志](/docs/concepts/cluster-administration/logging/)。
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
终止消息为容器提供了一种方法,可以将有关致命事件的信息写入某个位置,
|
||||
在该位置可以通过仪表板和监控软件等工具轻松检索和显示致命事件。
|
||||
在大多数情况下,您放入终止消息中的信息也应该写入
|
||||
[常规 Kubernetes 日志](/zh/docs/concepts/cluster-administration/logging/)。
|
||||
|
||||
## {{% heading "prerequisites" %}}
|
||||
|
||||
|
||||
{{< include "task-tutorial-prereqs.md" >}} {{< version-check >}}
|
||||
|
||||
|
||||
|
||||
|
||||
<!-- steps -->
|
||||
|
||||
<!--
|
||||
|
@ -52,7 +41,6 @@ In this exercise, you create a Pod that runs one container.
|
|||
The configuration file specifies a command that runs when
|
||||
the container starts.
|
||||
-->
|
||||
|
||||
## 读写终止消息
|
||||
|
||||
在本练习中,您将创建运行一个容器的 Pod。
|
||||
|
@ -62,59 +50,69 @@ the container starts.
|
|||
|
||||
1. <!--Create a Pod based on the YAML configuration file:-->基于 YAML 配置文件创建 Pod:
|
||||
|
||||
kubectl create -f https://k8s.io/examples/debug/termination.yaml
|
||||
```shell
|
||||
kubectl create -f https://k8s.io/examples/debug/termination.yaml
|
||||
```
|
||||
|
||||
<!--In the YAML file, in the `cmd` and `args` fields, you can see that the
|
||||
container sleeps for 10 seconds and then writes "Sleep expired" to
|
||||
the `/dev/termination-log` file. After the container writes
|
||||
the "Sleep expired" message, it terminates.-->
|
||||
YAML 文件中,在 `cmd` 和 `args` 字段,你可以看到容器休眠 10 秒然后将 "Sleep expired" 写入 `/dev/termination-log` 文件。
|
||||
容器写完 "Sleep expired" 消息后,它就终止了。
|
||||
<!--In the YAML file, in the `cmd` and `args` fields, you can see that the
|
||||
container sleeps for 10 seconds and then writes "Sleep expired" to
|
||||
the `/dev/termination-log` file. After the container writes
|
||||
the "Sleep expired" message, it terminates.-->
|
||||
YAML 文件中,在 `cmd` 和 `args` 字段,你可以看到容器休眠 10 秒然后将 "Sleep expired"
|
||||
写入 `/dev/termination-log` 文件。
|
||||
容器写完 "Sleep expired" 消息后就终止了。
|
||||
|
||||
1. <!--Display information about the Pod:-->显示 Pod 的信息:
|
||||
|
||||
kubectl get pod termination-demo
|
||||
```shell
|
||||
kubectl get pod termination-demo
|
||||
```
|
||||
|
||||
<!--Repeat the preceding command until the Pod is no longer running.-->
|
||||
重复前面的命令直到 Pod 不再运行。
|
||||
<!--Repeat the preceding command until the Pod is no longer running.-->
|
||||
重复前面的命令直到 Pod 不再运行。
|
||||
|
||||
1. <!--Display detailed information about the Pod:-->显示 Pod 的详细信息:
|
||||
1. <!--Display detailed information about the Pod:-->
|
||||
显示 Pod 的详细信息:
|
||||
|
||||
kubectl get pod --output=yaml
|
||||
```shell
|
||||
kubectl get pod --output=yaml
|
||||
```
|
||||
|
||||
<!--The output includes the "Sleep expired" message:-->输出结果包含 "Sleep expired" 消息:
|
||||
<!--The output includes the "Sleep expired" message:-->输出结果包含 "Sleep expired" 消息:
|
||||
```yaml
|
||||
apiVersion: v1
|
||||
kind: Pod
|
||||
...
|
||||
lastState:
|
||||
terminated:
|
||||
containerID: ...
|
||||
exitCode: 0
|
||||
finishedAt: ...
|
||||
message: |
|
||||
Sleep expired
|
||||
...
|
||||
```
|
||||
|
||||
apiVersion: v1
|
||||
kind: Pod
|
||||
...
|
||||
lastState:
|
||||
terminated:
|
||||
containerID: ...
|
||||
exitCode: 0
|
||||
finishedAt: ...
|
||||
message: |
|
||||
Sleep expired
|
||||
...
|
||||
1. <!--Use a Go template to filter the output so that it includes only the termination message:-->
|
||||
使用 Go 模板过滤输出结果,使其只含有终止消息:
|
||||
|
||||
1. <!--Use a Go template to filter the output so that it includes only the termination message:-->使用 Go 模板过滤输出结果,使其只含有终止消息:
|
||||
|
||||
kubectl get pod termination-demo -o go-template="{{range .status.containerStatuses}}{{.lastState.terminated.message}}{{end}}"
|
||||
```shell
|
||||
kubectl get pod termination-demo -o go-template="{{range .status.containerStatuses}}{{.lastState.terminated.message}}{{end}}"
|
||||
```
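
   作为 Go 模板的替代方案,也可以用 JSONPath 输出同一个字段;下面是一个示意:

   ```shell
   kubectl get pod termination-demo -o jsonpath='{.status.containerStatuses[0].lastState.terminated.message}'
   ```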
|
||||
|
||||
<!--
|
||||
## Customizing the termination message
|
||||
-->
|
||||
|
||||
## 定制终止消息
|
||||
|
||||
<!--
|
||||
Kubernetes retrieves termination messages from the termination message file
|
||||
specified in the `terminationMessagePath` field of a Container, which as a default
|
||||
value of `/dev/termination-log`. By customizing this field, you can tell Kubernetes
|
||||
to use a different file. Kubernetes use the contents from the specified file to
|
||||
populate the Container's status message on both success and failure.
|
||||
-->
|
||||
## 定制终止消息
|
||||
|
||||
Kubernetes 从容器的 `terminationMessagePath` 字段中指定的终止消息文件中检索终止消息,默认值为 `/dev/termination-log`。
|
||||
Kubernetes 从容器的 `terminationMessagePath` 字段中指定的终止消息文件中检索终止消息,
|
||||
默认值为 `/dev/termination-log`。
|
||||
通过定制这个字段,您可以告诉 Kubernetes 使用不同的文件。
|
||||
Kubernetes 使用指定文件中的内容在成功和失败时填充容器的状态消息。
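
可以用下面的命令查看某个容器实际生效的 `terminationMessagePath`(这里以上文的 termination-demo Pod 为例,仅作示意):

```shell
kubectl get pod termination-demo -o jsonpath='{.spec.containers[0].terminationMessagePath}'
```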
|
||||
|
||||
|
@ -122,7 +120,6 @@ Kubernetes 使用指定文件中的内容在成功和失败时填充容器的状
|
|||
In the following example, the container writes termination messages to
|
||||
`/tmp/my-log` for Kubernetes to retrieve:
|
||||
-->
|
||||
|
||||
在下例中,容器将终止消息写入 `/tmp/my-log` 文件,供 Kubernetes 读取:
|
||||
|
||||
```yaml
|
||||
|
@ -152,11 +149,8 @@ is empty and the container exited with an error. The log output is limited to
|
|||
通过将 `terminationMessagePolicy` 设置为 "`FallbackToLogsOnError`",你就可以告诉 Kubernetes,在容器因错误退出时,如果终止消息文件为空,则使用容器日志输出的最后一块作为终止消息。
|
||||
日志输出限制为 2048 字节或 80 行,以较小者为准。
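
`terminationMessagePolicy` 是容器规约中的字段,可以用 `kubectl explain` 查看它的说明和可选值:

```shell
kubectl explain pod.spec.containers.terminationMessagePolicy
```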
|
||||
|
||||
|
||||
|
||||
## {{% heading "whatsnext" %}}
|
||||
|
||||
|
||||
<!--
|
||||
* See the `terminationMessagePath` field in
|
||||
[Container](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#container-v1-core).
|
||||
|
@ -164,11 +158,8 @@ is empty and the container exited with an error. The log output is limited to
|
|||
* Learn about [Go templates](https://golang.org/pkg/text/template/).
|
||||
-->
|
||||
|
||||
* 参考[容器](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#container-v1-core)的 `terminationMessagePath` 字段。
|
||||
* 了解[接收日志](/docs/concepts/cluster-administration/logging/)。
|
||||
* 参考 [Container](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#container-v1-core)
|
||||
资源的 `terminationMessagePath` 字段。
|
||||
* 了解[接收日志](/zh/docs/concepts/cluster-administration/logging/)。
|
||||
* 了解 [Go 模版](https://golang.org/pkg/text/template/)。
|
||||
|
||||
|
||||
|
||||
|
||||
|
||||
|
|
|
@ -1,19 +1,14 @@
|
|||
---
|
||||
reviewers:
|
||||
- piosz
|
||||
- x13n
|
||||
content_type: concept
|
||||
title: StackDriver 中的事件
|
||||
---
|
||||
|
||||
<!--
|
||||
---
|
||||
reviewers:
|
||||
- piosz
|
||||
- x13n
|
||||
content_type: concept
|
||||
title: Events in Stackdriver
|
||||
---
|
||||
-->
|
||||
|
||||
<!-- overview -->
|
||||
|
@ -27,8 +22,10 @@ for debugging your application in the [Application Introspection and Debugging
|
|||
section.
|
||||
-->
|
||||
|
||||
Kubernetes 事件是一种对象,它为用户提供了洞察集群内发生的事情的能力,例如调度程序做出了什么决定,或者为什么某些 Pod 被逐出节点。
|
||||
您可以在[应用程序自检和调试](/docs/tasks/debug-application-cluster/debug-application-introspection/)中阅读有关使用事件调试应用程序的更多信息。
|
||||
Kubernetes 事件是一种对象,它为用户提供了洞察集群内发生的事情的能力,
|
||||
例如调度程序做出了什么决定,或者为什么某些 Pod 被逐出节点。
|
||||
你可以在[应用程序自检和调试](/zh/docs/tasks/debug-application-cluster/debug-application-introspection/)
|
||||
中阅读有关使用事件调试应用程序的更多信息。
|
||||
|
||||
<!--
|
||||
Since events are API objects, they are stored in the apiserver on master. To
|
||||
|
@ -37,8 +34,7 @@ removed one hour after the last occurrence. To provide longer history
|
|||
and aggregation capabilities, a third party solution should be installed
|
||||
to capture events.
|
||||
-->
|
||||
|
||||
因为事件是 API 对象,所以它们存储在主节点上的 apiserver 中。
|
||||
因为事件是 API 对象,所以它们存储在主控节点上的 API 服务器中。
|
||||
为了避免主节点磁盘空间被填满,将强制执行保留策略:事件在最后一次发生的一小时后将会被删除。
|
||||
为了提供更长的历史记录和聚合能力,应该安装第三方解决方案来捕获事件。
|
||||
|
||||
|
@ -46,10 +42,8 @@ to capture events.
|
|||
This article describes a solution that exports Kubernetes events to
|
||||
Stackdriver Logging, where they can be processed and analyzed.
|
||||
-->
|
||||
|
||||
本文描述了一个将 Kubernetes 事件导出为 Stackdriver Logging 的解决方案,在这里可以对它们进行处理和分析。
|
||||
|
||||
{{< note >}}
|
||||
<!--
|
||||
It is not guaranteed that all events happening in a cluster will be
|
||||
exported to Stackdriver. One possible scenario when events will not be
|
||||
|
@ -58,24 +52,21 @@ upgrade). In most cases it's fine to use events for purposes like setting up
|
|||
[metrics][sdLogMetrics] and [alerts][sdAlerts], but you should be aware
|
||||
of the potential inaccuracy.
|
||||
-->
|
||||
{{< note >}}
|
||||
不能保证集群中发生的所有事件都将导出到 Stackdriver。
|
||||
事件不能导出的一种可能情况是事件导出器没有运行(例如,在重新启动或升级期间)。
|
||||
在大多数情况下,可以将事件用于设置 [metrics][sdLogMetrics] 和 [alerts][sdAlerts] 等目的,但您应该注意潜在的不准确性。
|
||||
在大多数情况下,可以将事件用于设置
|
||||
[metrics](https://cloud.google.com/logging/docs/view/logs_based_metrics) 和
|
||||
[alerts](https://cloud.google.com/logging/docs/view/logs_based_metrics#creating_an_alerting_policy)
|
||||
等目的,但你应该注意其潜在的不准确性。
|
||||
{{< /note >}}
|
||||
|
||||
[sdLogMetrics]: https://cloud.google.com/logging/docs/view/logs_based_metrics
|
||||
[sdAlerts]: https://cloud.google.com/logging/docs/view/logs_based_metrics#creating_an_alerting_policy
|
||||
|
||||
|
||||
|
||||
|
||||
<!-- body -->
|
||||
|
||||
<!--
|
||||
## Deployment
|
||||
-->
|
||||
|
||||
## 部署
|
||||
## 部署 {#deployment}
|
||||
|
||||
### Google Kubernetes Engine
|
||||
|
||||
|
@ -91,21 +82,18 @@ average, approximately 100Mb RAM and 100m CPU is needed.
|
|||
-->
|
||||
|
||||
在 Google Kubernetes Engine 中,如果启用了云日志,那么对于主控节点运行 1.7 或更高版本的集群,事件导出器是默认部署的。
|
||||
为了防止干扰您的工作负载,事件导出器没有设置资源,并且处于尽力而为的 QoS 类型中,这意味着它将在资源匮乏的情况下第一个被杀死。
|
||||
为了防止干扰你的工作负载,事件导出器没有设置资源,并且处于尽力而为的 QoS 类型中,这意味着它将在资源匮乏的情况下第一个被杀死。
|
||||
如果要导出事件,请确保有足够的资源给事件导出器 Pod 使用。
|
||||
这可能会因为工作负载的不同而有所不同,但平均而言,需要大约 100MB 的内存和 100m 的 CPU。
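
如果想确认事件导出器是否在运行,可以在 `kube-system` 命名空间中查找对应的 Pod;下面只是一个示意,Pod 的具体名称可能因集群版本而异:

```shell
kubectl get pods -n kube-system | grep event-exporter
```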
|
||||
|
||||
<!--
|
||||
### Deploying to the Existing Cluster
|
||||
-->
|
||||
|
||||
### 部署到现有集群
|
||||
|
||||
<!--
|
||||
Deploy event exporter to your cluster using the following command:
|
||||
-->
|
||||
### 部署到现有集群
|
||||
|
||||
使用下面的命令将事件导出器部署到您的集群:
|
||||
使用下面的命令将事件导出器部署到你的集群:
|
||||
|
||||
```shell
|
||||
kubectl create -f https://k8s.io/examples/debug/event-exporter.yaml
|
||||
|
@ -123,27 +111,27 @@ requests. As mentioned earlier, 100Mb RAM and 100m CPU should be enough.
|
|||
由于事件导出器访问 Kubernetes API,因此它需要权限才能访问。
|
||||
以下的部署配置为使用 RBAC 授权。
|
||||
它设置服务帐户和集群角色绑定,以允许事件导出器读取事件。
|
||||
为了确保事件导出器 Pod 不会从节点中退出,您可以另外设置资源请求。
|
||||
为了确保事件导出器 Pod 不会从节点中退出,你可以另外设置资源请求。
|
||||
如前所述,100MB 内存和 100m CPU 应该就足够了。
|
||||
|
||||
{{< codenew file="debug/event-exporter.yaml" >}}
|
||||
|
||||
<!--
|
||||
## User Guide
|
||||
-->
|
||||
|
||||
## 用户指南
|
||||
|
||||
<!--
|
||||
Events are exported to the `GKE Cluster` resource in Stackdriver Logging.
|
||||
You can find them by selecting an appropriate option from a drop-down menu
|
||||
of available resources:
|
||||
-->
|
||||
## 用户指南 {#user-guide}
|
||||
|
||||
事件在 Stackdriver Logging 中被导出到 `GKE Cluster` 资源。
|
||||
您可以通过从可用资源的下拉菜单中选择适当的选项来找到它们:
|
||||
你可以通过从可用资源的下拉菜单中选择适当的选项来找到它们:
|
||||
|
||||
<!--
|
||||
<img src="/images/docs/stackdriver-event-exporter-resource.png" alt="Events location in the Stackdriver Logging interface" width="500">
|
||||
-->
|
||||
<img src="/images/docs/stackdriver-event-exporter-resource.png" alt="Stackdriver 日志接口中事件的位置" width="500">
|
||||
|
||||
<!--
|
||||
You can filter based on the event object fields using Stackdriver Logging
|
||||
|
@ -151,8 +139,9 @@ You can filter based on the event object fields using Stackdriver Logging
|
|||
For example, the following query will show events from the scheduler
|
||||
about pods from deployment `nginx-deployment`:
|
||||
-->
|
||||
|
||||
您可以使用 Stackdriver Logging 的[过滤机制](https://cloud.google.com/logging/docs/view/advanced_filters)基于事件对象字段进行过滤。
|
||||
你可以使用 Stackdriver Logging 的
|
||||
[过滤机制](https://cloud.google.com/logging/docs/view/advanced_filters)
|
||||
基于事件对象字段进行过滤。
|
||||
例如,下面的查询将显示调度程序中有关 Deployment `nginx-deployment` 中的 Pod 的事件:
|
||||
|
||||
```
|
||||
|
@ -162,6 +151,6 @@ jsonPayload.source.component="default-scheduler"
|
|||
jsonPayload.involvedObject.name:"nginx-deployment"
|
||||
```
|
||||
|
||||
{{< figure src="/images/docs/stackdriver-event-exporter-filter.png" alt="Filtered events in the Stackdriver Logging interface" width="500" >}}
|
||||
{{< figure src="/images/docs/stackdriver-event-exporter-filter.png" alt="在 Stackdriver 接口中过滤的事件" width="500" >}}
|
||||
|
||||
|
||||
|
|
|
@ -4,10 +4,8 @@ content_type: task
|
|||
---
|
||||
|
||||
<!--
|
||||
---
|
||||
title: Developing and debugging services locally
|
||||
content_type: task
|
||||
---
|
||||
-->
|
||||
|
||||
<!-- overview -->
|
||||
|
@ -15,28 +13,25 @@ content_type: task
|
|||
<!--
|
||||
Kubernetes applications usually consist of multiple, separate services, each running in its own container. Developing and debugging these services on a remote Kubernetes cluster can be cumbersome, requiring you to [get a shell on a running container](/docs/tasks/debug-application-cluster/get-shell-running-container/) and running your tools inside the remote shell.
|
||||
-->
|
||||
|
||||
Kubernetes 应用程序通常由多个独立的服务组成,每个服务都在自己的容器中运行。
|
||||
在远端的 Kubernetes 集群上开发和调试这些服务可能很麻烦,需要[在运行的容器上打开 shell](/docs/tasks/debug-application-cluster/get-shell-running-container/),然后在远端 shell 中运行您所需的工具。
|
||||
在远端的 Kubernetes 集群上开发和调试这些服务可能很麻烦,需要
|
||||
[在运行的容器上打开 Shell](/zh/docs/tasks/debug-application-cluster/get-shell-running-container/),
|
||||
然后在远端 Shell 中运行你所需的工具。
|
||||
|
||||
<!--
|
||||
`telepresence` is a tool to ease the process of developing and debugging services locally, while proxying the service to a remote Kubernetes cluster. Using `telepresence` allows you to use custom tools, such as a debugger and IDE, for a local service and provides the service full access to ConfigMap, secrets, and the services running on the remote cluster.
|
||||
-->
|
||||
|
||||
`telepresence` 是一种工具,用于在本地轻松开发和调试服务,同时将服务代理到远程 Kubernetes 集群。
|
||||
使用 `telepresence` 可以为本地服务使用自定义工具(如调试器和 IDE),并提供对 Configmap、Secrets 和远程集群上运行的服务的完全访问。
|
||||
使用 `telepresence` 可以为本地服务使用自定义工具(如调试器和 IDE),
|
||||
并提供对 Configmap、Secret 和远程集群上运行的服务的完全访问。
|
||||
|
||||
<!--
|
||||
This document describes using `telepresence` to develop and debug services running on a remote cluster locally.
|
||||
--
|
||||
|
||||
-->
|
||||
本文档描述如何在本地使用 `telepresence` 开发和调试远程集群上运行的服务。
|
||||
|
||||
|
||||
|
||||
## {{% heading "prerequisites" %}}
|
||||
|
||||
|
||||
<!--
|
||||
* Kubernetes cluster is installed
|
||||
* `kubectl` is configured to communicate with the cluster
|
||||
|
@ -47,8 +42,6 @@ This document describes using `telepresence` to develop and debug services runni
|
|||
* 配置好 `kubectl` 与集群交互
|
||||
* [Telepresence](https://www.telepresence.io/reference/install) 安装完毕
|
||||
|
||||
|
||||
|
||||
<!-- steps -->
|
||||
|
||||
<!--
|
||||
|
@ -56,40 +49,39 @@ This document describes using `telepresence` to develop and debug services runni
|
|||
|
||||
Open a terminal and run `telepresence` with no arguments to get a `telepresence` shell. This shell runs locally, giving you full access to your local filesystem.
|
||||
-->
|
||||
|
||||
打开终端,不带参数运行 `telepresence`,以打开 `telepresence` shell。这个 shell 在本地运行,使您可以完全访问本地文件系统。
|
||||
打开终端,不带参数运行 `telepresence`,以打开 `telepresence` Shell。
|
||||
这个 Shell 在本地运行,使你可以完全访问本地文件系统。
|
||||
|
||||
<!--
|
||||
The `telepresence` shell can be used in a variety of ways. For example, write a shell script on your laptop, and run it directly from the shell in real time. You can do this on a remote shell as well, but you might not be able to use your preferred code editor, and the script is deleted when the container is terminated.
|
||||
|
||||
Enter `exit` to quit and close the shell.
|
||||
-->
|
||||
|
||||
`telepresence` shell 的使用方式多种多样。
|
||||
例如,在你的笔记本电脑上写一个 shell 脚本,然后直接在 shell 中实时运行它。
|
||||
您也可以在远端 shell 上执行此操作,但这样可能无法使用首选的代码编辑器,并且在容器终止时脚本将被删除。
|
||||
`telepresence` Shell 的使用方式多种多样。
|
||||
例如,在你的笔记本电脑上写一个 Shell 脚本,然后直接在 Shell 中实时运行它。
|
||||
你也可以在远端 Shell 上执行此操作,但这样可能无法使用首选的代码编辑器,并且在容器终止时脚本将被删除。
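
下面是一个简单的示意,其中的脚本名 `hello.sh` 只是假设的例子:

```shell
# 不带参数运行,进入 telepresence Shell
telepresence
# 在该 Shell 中可以直接运行本地脚本,并访问集群内的服务
sh ./hello.sh
# 完成后输入 exit 退出并关闭该 Shell
exit
```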
|
||||
|
||||
<!--
|
||||
## Developing or debugging an existing service
|
||||
|
||||
When developing an application on Kubernetes, you typically program or debug a single service. The service might require access to other services for testing and debugging. One option is to use the continuous deployment pipeline, but even the fastest deployment pipeline introduces a delay in the program or debug cycle.
|
||||
-->
|
||||
|
||||
## 开发和调试现有的服务
|
||||
|
||||
在 Kubernetes 上开发应用程序时,通常对单个服务进行编程或调试。
|
||||
服务可能需要访问其他服务以进行测试和调试。
|
||||
一种选择是使用连续部署管道,但即使最快的部署管道也会在程序或调试周期中引入延迟。
|
||||
一种选择是使用连续部署流水线,但即使最快的部署流水线也会在程序或调试周期中引入延迟。
|
||||
|
||||
<!--
|
||||
Use the `--swap-deployment` option to swap an existing deployment with the Telepresence proxy. Swapping allows you to run a service locally and connect to the remote Kubernetes cluster. The services in the remote cluster can now access the locally running instance.
|
||||
|
||||
To run telepresence with `--swap-deployment`, enter:
|
||||
-->
|
||||
使用 `--swap-deployment` 选项将现有部署与 Telepresence 代理交换。
|
||||
交换允许你在本地运行服务并能够连接到远端的 Kubernetes 集群。
|
||||
远端集群中的服务现在就可以访问本地运行的实例。
|
||||
|
||||
使用 `--swap-deployment` 选项将现有部署与 Telepresence 代理交换。交换允许您在本地运行服务并能够连接到远端的 Kubernetes 集群。远端的集群中的服务现在就可以访问本地运行的实例。
|
||||
|
||||
到运行 telepresence 并带有 `--swap-deployment` 选项,请输入:
|
||||
要运行 telepresence 并带有 `--swap-deployment` 选项,请输入:
|
||||
|
||||
`telepresence --swap-deployment $DEPLOYMENT_NAME`
|
||||
|
||||
|
@ -98,31 +90,25 @@ where $DEPLOYMENT_NAME is the name of your existing deployment.
|
|||
|
||||
Running this command spawns a shell. In the shell, start your service. You can then make edits to the source code locally, save, and see the changes take effect immediately. You can also run your service in a debugger, or any other local development tool.
|
||||
-->
|
||||
这里的 `$DEPLOYMENT_NAME` 是你现有的部署名称。
|
||||
|
||||
这里的 $DEPLOYMENT_NAME 是您现有的部署名称。
|
||||
|
||||
运行此命令将生成 shell。在 shell 中,启动您的服务。
|
||||
然后,您就可以在本地对源代码进行编辑、保存并能看到更改立即生效。您还可以在调试器或任何其他本地开发工具中运行服务。
|
||||
|
||||
|
||||
运行此命令将生成 Shell。在该 Shell 中,启动你的服务。
|
||||
然后,你就可以在本地对源代码进行编辑、保存并能看到更改立即生效。
|
||||
你还可以在调试器或任何其他本地开发工具中运行服务。
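
下面是一个带有假设参数的示意:假定集群中已有名为 `example-deployment` 的 Deployment,且本地服务监听 8080 端口(`--expose` 等选项以 Telepresence 文档为准):

```shell
telepresence --swap-deployment example-deployment --expose 8080
# 在生成的 Shell 中启动本地服务,例如:
python3 -m http.server 8080
```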
|
||||
|
||||
## {{% heading "whatsnext" %}}
|
||||
|
||||
|
||||
<!--
|
||||
If you're interested in a hands-on tutorial, check out [this tutorial](https://cloud.google.com/community/tutorials/developing-services-with-k8s) that walks through locally developing the Guestbook application on Google Kubernetes Engine.
|
||||
-->
|
||||
|
||||
如果您对实践教程感兴趣,请查看[本教程](https://cloud.google.com/community/tutorials/developing-services-with-k8s),其中介绍了在 Google Kubernetes Engine 上本地开发 Guestbook 应用程序。
|
||||
如果你对实践教程感兴趣,请查看[本教程](https://cloud.google.com/community/tutorials/developing-services-with-k8s),其中介绍了在 Google Kubernetes Engine 上本地开发 Guestbook 应用程序。
|
||||
|
||||
<!--
|
||||
Telepresence has [numerous proxying options](https://www.telepresence.io/reference/methods), depending on your situation.
|
||||
|
||||
For further reading, visit the [Telepresence website](https://www.telepresence.io).
|
||||
-->
|
||||
|
||||
Telepresence 有[多种代理选项](https://www.telepresence.io/reference/methods),以满足您的各种情况。
|
||||
Telepresence 有[多种代理选项](https://www.telepresence.io/reference/methods),以满足你的各种情况。
|
||||
|
||||
要了解更多信息,请访问 [Telepresence 网站](https://www.telepresence.io)。
|
||||
|
||||
|
||||
|
|
|
@ -1,19 +1,14 @@
|
|||
---
|
||||
reviewers:
|
||||
- piosz
|
||||
- x13n
|
||||
content_type: concept
|
||||
title: 使用 ElasticSearch 和 Kibana 进行日志管理
|
||||
---
|
||||
|
||||
<!--
|
||||
---
|
||||
reviewers:
|
||||
- piosz
|
||||
- x13n
|
||||
content_type: concept
|
||||
title: Logging Using Elasticsearch and Kibana
|
||||
---
|
||||
-->
|
||||
|
||||
<!-- overview -->
|
||||
|
@ -23,8 +18,10 @@ On the Google Compute Engine (GCE) platform, the default logging support targets
|
|||
[Stackdriver Logging](https://cloud.google.com/logging/), which is described in detail
|
||||
in the [Logging With Stackdriver Logging](/docs/user-guide/logging/stackdriver).
|
||||
-->
|
||||
|
||||
在 Google Compute Engine (GCE) 平台上,默认的日志管理支持目标是 [Stackdriver Logging](https://cloud.google.com/logging/),在 [使用 Stackdriver Logging 管理日志](/docs/user-guide/logging/stackdriver)中详细描述了这一点。
|
||||
在 Google Compute Engine (GCE) 平台上,默认的日志管理支持目标是
|
||||
[Stackdriver Logging](https://cloud.google.com/logging/),
|
||||
在[使用 Stackdriver Logging 管理日志](/zh/docs/tasks/debug-application-cluster/logging-stackdriver/)
|
||||
中详细描述了这一点。
|
||||
|
||||
<!--
|
||||
This article describes how to set up a cluster to ingest logs into
|
||||
|
@ -32,18 +29,19 @@ This article describes how to set up a cluster to ingest logs into
|
|||
them using [Kibana](https://www.elastic.co/products/kibana), as an alternative to
|
||||
Stackdriver Logging when running on GCE.
|
||||
-->
|
||||
本文介绍了如何设置一个集群,将日志导入
|
||||
[Elasticsearch](https://www.elastic.co/products/elasticsearch),并使用
|
||||
[Kibana](https://www.elastic.co/products/kibana) 查看日志,作为在 GCE 上
|
||||
运行应用时使用 Stackdriver Logging 管理日志的替代方案。
|
||||
|
||||
本文介绍了如何设置一个集群,将日志导入[Elasticsearch](https://www.elastic.co/products/elasticsearch),并使用 [Kibana](https://www.elastic.co/products/kibana) 查看日志,作为在 GCE 上运行应用时使用 Stackdriver Logging 管理日志的替代方案。
|
||||
|
||||
{{< note >}}
|
||||
<!--
|
||||
You cannot automatically deploy Elasticsearch and Kibana in the Kubernetes cluster hosted on Google Kubernetes Engine. You have to deploy them manually.
|
||||
-->
|
||||
您不能在 Google Kubernetes Engine 平台运行的 Kubernetes 集群上自动的部署 Elasticsearch 和 Kibana。您必须手动部署它们。
|
||||
{{< note >}}
|
||||
你不能在 Google Kubernetes Engine 平台运行的 Kubernetes 集群上自动部署
|
||||
Elasticsearch 和 Kibana。你必须手动部署它们。
|
||||
{{< /note >}}
|
||||
|
||||
|
||||
|
||||
<!-- body -->
|
||||
|
||||
<!--
|
||||
|
@ -51,8 +49,7 @@ To use Elasticsearch and Kibana for cluster logging, you should set the
|
|||
following environment variable as shown below when creating your cluster with
|
||||
kube-up.sh:
|
||||
-->
|
||||
|
||||
要使用 Elasticsearch 和 Kibana 处理集群日志,您应该在使用 kube-up.sh 脚本创建集群时设置下面所示的环境变量:
|
||||
要使用 Elasticsearch 和 Kibana 处理集群日志,你应该在使用 kube-up.sh 脚本创建集群时设置下面所示的环境变量:
|
||||
|
||||
```shell
|
||||
KUBE_LOGGING_DESTINATION=elasticsearch
|
||||
|
@ -61,18 +58,20 @@ KUBE_LOGGING_DESTINATION=elasticsearch
|
|||
<!--
|
||||
You should also ensure that `KUBE_ENABLE_NODE_LOGGING=true` (which is the default for the GCE platform).
|
||||
-->
|
||||
|
||||
您还应该确保设置了 `KUBE_ENABLE_NODE_LOGGING=true` (这是 GCE 平台的默认设置)。
|
||||
你还应该确保设置了 `KUBE_ENABLE_NODE_LOGGING=true` (这是 GCE 平台的默认设置)。
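
也就是说,在创建集群之前,可以像下面这样设置(两个变量均来自上文):

```shell
export KUBE_LOGGING_DESTINATION=elasticsearch
export KUBE_ENABLE_NODE_LOGGING=true
```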
|
||||
|
||||
<!--
|
||||
Now, when you create a cluster, a message will indicate that the Fluentd log
|
||||
collection daemons that run on each node will target Elasticsearch:
|
||||
-->
|
||||
|
||||
现在,当您创建集群时,将有一条消息将指示每个节点上运行的 Fluentd 日志收集守护进程以 ElasticSearch 为日志输出目标:
|
||||
现在,当你创建集群时,将有一条消息将指示每个节点上运行的 fluentd 日志收集守护进程
|
||||
以 ElasticSearch 为日志输出目标:
|
||||
|
||||
```shell
|
||||
$ cluster/kube-up.sh
|
||||
cluster/kube-up.sh
|
||||
```
|
||||
|
||||
```
|
||||
...
|
||||
Project: kubernetes-satnam
|
||||
Zone: us-central1-b
|
||||
|
@ -96,11 +95,14 @@ The per-node Fluentd pods, the Elasticsearch pods, and the Kibana pods should
|
|||
all be running in the kube-system namespace soon after the cluster comes to
|
||||
life.
|
||||
-->
|
||||
|
||||
每个节点的 Fluentd pod、Elasticsearch pod 和 Kibana pod 都应该在集群启动后不久运行在 kube-system 命名空间中。
|
||||
每个节点的 Fluentd Pod、Elasticsearch Pod 和 Kibana Pod 都应该在集群启动后不久运行在
|
||||
kube-system 命名空间中。
|
||||
|
||||
```shell
|
||||
$ kubectl get pods --namespace=kube-system
|
||||
kubectl get pods --namespace=kube-system
|
||||
```
|
||||
|
||||
```
|
||||
NAME READY STATUS RESTARTS AGE
|
||||
elasticsearch-logging-v1-78nog 1/1 Running 0 2h
|
||||
elasticsearch-logging-v1-nj2nb 1/1 Running 0 2h
|
||||
|
@ -122,10 +124,12 @@ Elasticsearch pods store the logs and expose them via a REST API.
|
|||
The `kibana-logging` pod provides a web UI for reading the logs stored in
|
||||
Elasticsearch, and is part of a service named `kibana-logging`.
|
||||
-->
|
||||
|
||||
`fluentd-elasticsearch` pod 从每个节点收集日志并将其发送到 `elasticsearch-logging` pods,该 pod 是名为 `elasticsearch-logging` 的[服务](/docs/concepts/services-networking/service/)的一部分。
|
||||
`fluentd-elasticsearch` Pod 从每个节点收集日志并将其发送到 `elasticsearch-logging` Pod,
|
||||
该 Pod 是名为 `elasticsearch-logging` 的
|
||||
[服务](/zh/docs/concepts/services-networking/service/)的一部分。
|
||||
这些 ElasticSearch pod 存储日志,并通过 REST API 将其公开。
|
||||
`kibana-logging` pod 提供了一个用于读取 ElasticSearch 中存储的日志的 Web UI,它是名为 `kibana-logging` 的服务的一部分。
|
||||
`kibana-logging` pod 提供了一个用于读取 ElasticSearch 中存储的日志的 Web UI,
|
||||
它是名为 `kibana-logging` 的服务的一部分。
|
||||
|
||||
<!--
|
||||
The Elasticsearch and Kibana services are both in the `kube-system` namespace
|
||||
|
@ -134,13 +138,14 @@ follow the instructions for [Accessing services running in a cluster](/docs/conc
|
|||
-->
|
||||
|
||||
Elasticsearch 和 Kibana 服务都位于 `kube-system` 命名空间中,并且没有通过可公开访问的 IP 地址直接暴露。
|
||||
要访问它们,请参照[访问集群中运行的服务](/docs/concepts/cluster-administration/access-cluster/#accessing-services-running-on-the-cluster)的说明进行操作。
|
||||
要访问它们,请参照
|
||||
[访问集群中运行的服务](/zh/docs/tasks/access-application-cluster/access-cluster/#accessing-services-running-on-the-cluster)
|
||||
的说明进行操作。
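
例如,可以先列出这两个服务,再通过 `kubectl proxy` 访问 Kibana;下面的 URL 是这种访问方式的通用形式:

```shell
kubectl get services --namespace=kube-system
kubectl proxy
# 然后在浏览器中打开:
# http://localhost:8001/api/v1/namespaces/kube-system/services/kibana-logging/proxy/
```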
|
||||
|
||||
<!--
|
||||
If you try accessing the `elasticsearch-logging` service in your browser, you'll
|
||||
see a status page that looks something like this:
|
||||
-->
|
||||
|
||||
如果你想在浏览器中访问 `elasticsearch-logging` 服务,你将看到类似下面的状态页面:
|
||||
|
||||
![Elasticsearch Status](/images/docs/es-browser.png)
|
||||
|
@ -150,7 +155,6 @@ You can now type Elasticsearch queries directly into the browser, if you'd
|
|||
like. See [Elasticsearch's documentation](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-uri-request.html)
|
||||
for more details on how to do so.
|
||||
-->
|
||||
|
||||
现在你可以直接在浏览器中输入 Elasticsearch 查询,如果你愿意的话。
|
||||
请参考 [Elasticsearch 的文档](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-uri-request.html) 以了解这样做的更多细节。
|
||||
|
||||
|
@ -165,11 +169,12 @@ You can set the refresh interval to 5 seconds to have the logs
|
|||
regularly refreshed.
|
||||
-->
|
||||
|
||||
或者,您可以使用 Kibana 查看集群的日志(再次使用[访问集群中运行的服务的说明](/docs/user-guide/accessing-the-cluster/#accessing-services-running-on-the-cluster))。
|
||||
第一次访问 Kibana URL 时,将显示一个页面,要求您配置所接收日志的视图。
|
||||
或者,你可以使用 Kibana 查看集群的日志(再次使用
|
||||
[访问集群中运行的服务的说明](/zh/docs/tasks/access-application-cluster/access-cluster/#accessing-services-running-on-the-cluster))。
|
||||
第一次访问 Kibana URL 时,将显示一个页面,要求你配置所接收日志的视图。
|
||||
选择时间序列值的选项,然后选择 `@timestamp`。
|
||||
在下面的页面中选择 `Discover` 选项卡,然后您应该能够看到所摄取的日志。
|
||||
您可以将刷新间隔设置为 5 秒,以便定期刷新日志。
|
||||
在下面的页面中选择 `Discover` 选项卡,然后你应该能够看到所摄取的日志。
|
||||
你可以将刷新间隔设置为 5 秒,以便定期刷新日志。
|
||||
|
||||
<!--
|
||||
Here is a typical view of ingested logs from the Kibana viewer:
|
||||
|
@ -179,16 +184,12 @@ Here is a typical view of ingested logs from the Kibana viewer:
|
|||
|
||||
![Kibana logs](/images/docs/kibana-logs.png)
|
||||
|
||||
|
||||
|
||||
## {{% heading "whatsnext" %}}
|
||||
|
||||
|
||||
<!--
|
||||
Kibana opens up all sorts of powerful options for exploring your logs! For some
|
||||
ideas on how to dig into it, check out [Kibana's documentation](https://www.elastic.co/guide/en/kibana/current/discover.html).
|
||||
-->
|
||||
|
||||
Kibana 为浏览您的日志提供了各种强大的选项!有关如何深入研究它的一些想法,请查看 [Kibana 的文档](https://www.elastic.co/guide/en/kibana/current/discover.html)。
|
||||
|
||||
Kibana 为浏览你的日志提供了各种强大的选项!有关如何深入研究它的一些想法,
|
||||
请查看 [Kibana 的文档](https://www.elastic.co/guide/en/kibana/current/discover.html)。
|
||||
|
||||
|
|
|
@ -3,18 +3,14 @@ content_type: task
|
|||
title: 节点健康监测
|
||||
---
|
||||
<!--
|
||||
---
|
||||
reviewers:
|
||||
- Random-Liu
|
||||
- dchen1107
|
||||
content_type: task
|
||||
title: Monitor Node Health
|
||||
---
|
||||
-->
|
||||
|
||||
<!-- overview -->
|
||||
|
||||
*节点问题探测器* 是一个 [DaemonSet](/docs/concepts/workloads/controllers/daemonset/) 用来监控节点健康。它从各种守护进程收集节点问题,并以[NodeCondition](/docs/concepts/architecture/nodes/#condition) 和 [Event](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#event-v1-core) 的形式报告给 apiserver 。
|
||||
<!--
|
||||
*Node problem detector* is a [DaemonSet](/docs/concepts/workloads/controllers/daemonset/) monitoring the
|
||||
node health. It collects node problems from various daemons and reports them
|
||||
|
@ -22,6 +18,12 @@ to the apiserver as [NodeCondition](/docs/concepts/architecture/nodes/#condition
|
|||
and [Event](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#event-v1-core).
|
||||
-->
|
||||
|
||||
*节点问题探测器* 是一个 [DaemonSet](/zh/docs/concepts/workloads/controllers/daemonset/),
|
||||
用来监控节点健康。它从各种守护进程收集节点问题,并以
|
||||
[NodeCondition](/zh/docs/concepts/architecture/nodes/#condition) 和
|
||||
[Event](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#event-v1-core)
|
||||
的形式报告给 API 服务器。
|
||||
|
||||
<!--
|
||||
It supports some known kernel issue detection now, and will detect more and
|
||||
more node problems over time.
|
||||
|
@ -33,7 +35,8 @@ Currently Kubernetes won't take any action on the node conditions and events
|
|||
generated by node problem detector. In the future, a remedy system could be
|
||||
introduced to deal with node problems.
|
||||
-->
|
||||
目前,Kubernetes 不会对节点问题检测器监测到的节点状态和事件采取任何操作。将来可能会引入一个补救系统来处理这些节点问题。
|
||||
目前,Kubernetes 不会对节点问题检测器监测到的节点状态和事件采取任何操作。
|
||||
将来可能会引入一个补救系统来处理这些节点问题。
|
||||
|
||||
<!--
|
||||
See more information
|
||||
|
@ -41,75 +44,71 @@ See more information
|
|||
-->
|
||||
更多信息请参阅 [这里](https://github.com/kubernetes/node-problem-detector)。
|
||||
|
||||
|
||||
|
||||
## {{% heading "prerequisites" %}}
|
||||
|
||||
|
||||
{{< include "task-tutorial-prereqs.md" >}} {{< version-check >}}
|
||||
|
||||
|
||||
|
||||
<!-- steps -->
|
||||
|
||||
<!--
|
||||
## Limitations
|
||||
-->
|
||||
## 局限性
|
||||
|
||||
<!--
|
||||
* The kernel issue detection of node problem detector only supports file based
|
||||
kernel log now. It doesn't support log tools like journald.
|
||||
-->
|
||||
* 节点问题检测器的内核问题检测现在只支持基于文件类型的内核日志。 它不支持像 journald 这样的命令行日志工具。
|
||||
## 局限性 {#limitations}
|
||||
|
||||
* 节点问题检测器的内核问题检测现在只支持基于文件类型的内核日志。
|
||||
它不支持像 journald 这样的命令行日志工具。
|
||||
|
||||
<!--
|
||||
* The kernel issue detection of node problem detector has assumption on kernel
|
||||
log format, and now it only works on Ubuntu and Debian. However, it is easy to extend
|
||||
it to [support other log format](/docs/tasks/debug-application-cluster/monitor-node-health/#support-other-log-format).
|
||||
log format, and now it only works on Ubuntu and Debian. However, it is easy to extend
|
||||
it to [support other log format](/docs/tasks/debug-application-cluster/monitor-node-health/#support-other-log-format).
|
||||
-->
|
||||
* 节点问题检测器的内核问题检测对内核日志格式有一定要求,现在它只适用于 Ubuntu 和 Debian。但是,将其扩展为 [支持其它日志格式](/docs/tasks/debug-application-cluster/monitor-node-health/#support-other-log-format) 也很容易。
|
||||
* 节点问题检测器的内核问题检测对内核日志格式有一定要求,现在它只适用于 Ubuntu 和 Debian。
|
||||
不过将其扩展为[支持其它日志格式](#support-other-log-format) 也很容易。
|
||||
|
||||
<!--
|
||||
## Enable/Disable in GCE cluster
|
||||
-->
|
||||
## 在 GCE 集群中启用/禁用
|
||||
|
||||
<!--
|
||||
Node problem detector is [running as a cluster addon](/docs/setup/cluster-large/#addon-resources) enabled by default in the
|
||||
gce cluster.
|
||||
-->
|
||||
节点问题检测器在 gce 集群中以[集群插件的形式](/docs/setup/cluster-large/#addon-resources)默认启用。
|
||||
## 在 GCE 集群中启用/禁用
|
||||
|
||||
节点问题检测器在 gce 集群中以
|
||||
[集群插件的形式](/zh/docs/setup/best-practices/cluster-large/#addon-resources)
|
||||
默认启用。
|
||||
|
||||
<!--
|
||||
You can enable/disable it by setting the environment variable
|
||||
`KUBE_ENABLE_NODE_PROBLEM_DETECTOR` before `kube-up.sh`.
|
||||
-->
|
||||
您可以在运行 `kube-up.sh` 之前,以设置环境变量 `KUBE_ENABLE_NODE_PROBLEM_DETECTOR` 的形式启用/禁用它。
|
||||
你可以在运行 `kube-up.sh` 之前,以设置环境变量 `KUBE_ENABLE_NODE_PROBLEM_DETECTOR` 的形式启用/禁用它。
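
例如(具体可接受的取值以 `kube-up.sh` 脚本为准,这里只是一个示意):

```shell
export KUBE_ENABLE_NODE_PROBLEM_DETECTOR=none   # 禁用节点问题检测器
cluster/kube-up.sh
```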
|
||||
|
||||
<!--
|
||||
## Use in Other Environment
|
||||
-->
|
||||
## 在其它环境中使用
|
||||
|
||||
<!--
|
||||
To enable node problem detector in other environment outside of GCE, you can use
|
||||
either `kubectl` or addon pod.
|
||||
-->
|
||||
要在 GCE 之外的其他环境中启用节点问题检测器,您可以使用 `kubectl` 或插件 pod。
|
||||
## 在其它环境中使用 {#use-in-other-environment}
|
||||
|
||||
要在 GCE 之外的其他环境中启用节点问题检测器,你可以使用 `kubectl` 或插件 pod。
|
||||
|
||||
<!--
|
||||
### Kubectl
|
||||
-->
|
||||
### Kubectl
|
||||
|
||||
<!--
|
||||
This is the recommended way to start node problem detector outside of GCE. It
|
||||
provides more flexible management, such as overwriting the default
|
||||
configuration to fit it into your environment or detect
|
||||
customized node problems.
|
||||
-->
|
||||
这是在 GCE 之外启动节点问题检测器的推荐方法。它的管理更加灵活,例如覆盖默认配置以使其适合您的环境或检测自定义节点问题。
|
||||
### Kubectl
|
||||
|
||||
这是在 GCE 之外启动节点问题检测器的推荐方法。
|
||||
它的管理更加灵活,例如覆盖默认配置以使其适合你的环境或检测自定义节点问题。
|
||||
|
||||
<!--
|
||||
* **Step 1:** `node-problem-detector.yaml`:
|
||||
|
@ -118,12 +117,11 @@ customized node problems.
|
|||
|
||||
{{< codenew file="debug/node-problem-detector.yaml" >}}
|
||||
|
||||
|
||||
<!--
|
||||
***Notice that you should make sure the system log directory is right for your
|
||||
OS distro.***
|
||||
-->
|
||||
***请注意保证您的系统日志路径与您的 OS 发行版相对应。***
|
||||
***请注意保证你的系统日志路径与你的 OS 发行版相对应。***
|
||||
|
||||
<!--
|
||||
* **Step 2:** Start node problem detector with `kubectl`:
|
||||
|
@ -136,39 +134,40 @@ OS distro.***
|
|||
|
||||
<!--
|
||||
### Addon Pod
|
||||
-->
|
||||
### 插件 Pod
|
||||
|
||||
<!--
|
||||
This is for those who have their own cluster bootstrap solution, and don't need
|
||||
to overwrite the default configuration. They could leverage the addon pod to
|
||||
further automate the deployment.
|
||||
-->
|
||||
这适用于拥有自己的集群引导程序解决方案的用户,并且不需要覆盖默认配置。 他们可以利用插件 Pod 进一步自动化部署。
|
||||
### 插件 Pod {#addon-pod}
|
||||
|
||||
这适用于拥有自己的集群引导程序解决方案的用户,并且不需要覆盖默认配置。
|
||||
他们可以利用插件 Pod 进一步自动化部署。
|
||||
|
||||
<!--
|
||||
Just create `node-problem-detector.yaml`, and put it under the addon pods directory
|
||||
`/etc/kubernetes/addons/node-problem-detector` on master node.
|
||||
-->
|
||||
只需创建 `node-problem-detector.yaml`,并将其放在主节点上的插件 pod 目录 `/etc/kubernetes/addons/node-problem-detector` 下。
|
||||
|
||||
只需创建 `node-problem-detector.yaml`,并将其放在主节点上的插件 pod 目录
|
||||
`/etc/kubernetes/addons/node-problem-detector` 下。
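
下面是一个示意(在主控节点上执行,清单文件的路径按实际情况调整):

```shell
sudo mkdir -p /etc/kubernetes/addons/node-problem-detector
sudo cp node-problem-detector.yaml /etc/kubernetes/addons/node-problem-detector/
```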
|
||||
|
||||
<!--
|
||||
## Overwrite the Configuration
|
||||
-->
|
||||
## 覆盖配置文件
|
||||
|
||||
<!--
|
||||
The [default configuration](https://github.com/kubernetes/node-problem-detector/tree/v0.1/config)
|
||||
is embedded when building the docker image of node problem detector.
|
||||
-->
|
||||
构建节点问题检测器的 docker 镜像时,会嵌入[默认配置](https://github.com/kubernetes/node-problem-detector/tree/v0.1/config)。
|
||||
## 覆盖配置文件
|
||||
|
||||
构建节点问题检测器的 docker 镜像时,会嵌入
|
||||
[默认配置](https://github.com/kubernetes/node-problem-detector/tree/v0.1/config)。
|
||||
|
||||
<!--
|
||||
However, you can use [ConfigMap](/docs/tasks/configure-pod-container/configure-pod-configmap/) to overwrite it
|
||||
following the steps:
|
||||
-->
|
||||
不过,您可以像下面这样使用 [ConfigMap](/docs/tasks/configure-pod-container/configure-pod-configmap/) 将其覆盖:
|
||||
不过,你可以像下面这样使用 [ConfigMap](/zh/docs/tasks/configure-pod-container/configure-pod-configmap/)
|
||||
将其覆盖:
|
||||
|
||||
<!--
|
||||
* **Step 1:** Change the config files in `config/`.
|
||||
|
@ -182,7 +181,6 @@ node-problem-detector-config --from-file=config/`.
|
|||
|
||||
{{< codenew file="debug/node-problem-detector-configmap.yaml" >}}
|
||||
|
||||
|
||||
<!--
|
||||
* **Step 4:** Re-create the node problem detector with the new yaml file:
|
||||
-->
|
||||
|
@ -206,13 +204,12 @@ ConfigMap, configuration overwriting is not supported now.
|
|||
|
||||
<!--
|
||||
## Kernel Monitor
|
||||
-->
|
||||
## 内核监视器
|
||||
|
||||
<!--
|
||||
*Kernel Monitor* is a problem daemon in node problem detector. It monitors kernel log
|
||||
and detects known kernel issues following predefined rules.
|
||||
-->
|
||||
## 内核监视器
|
||||
|
||||
*内核监视器* 是节点问题检测器中的问题守护进程。它监视内核日志并按照预定义规则检测已知内核问题。
|
||||
|
||||
<!--
|
||||
|
@ -222,18 +219,17 @@ The rule list is extensible, and you can always extend it by overwriting the
|
|||
configuration.
|
||||
-->
|
||||
内核监视器根据 [`config/kernel-monitor.json`](https://github.com/kubernetes/node-problem-detector/blob/v0.1/config/kernel-monitor.json) 中的一组预定义规则列表匹配内核问题。
|
||||
规则列表是可扩展的,您始终可以通过覆盖配置来扩展它。
|
||||
规则列表是可扩展的,你始终可以通过覆盖配置来扩展它。
|
||||
|
||||
<!--
|
||||
### Add New NodeConditions
|
||||
-->
|
||||
### 添加新的 NodeCondition
|
||||
|
||||
<!--
|
||||
To support new node conditions, you can extend the `conditions` field in
|
||||
`config/kernel-monitor.json` with new condition definition:
|
||||
-->
|
||||
您可以使用新的状态描述来扩展 `config/kernel-monitor.json` 中的 `conditions` 字段以支持新的节点状态。
|
||||
### 添加新的 NodeCondition
|
||||
|
||||
你可以使用新的状态描述来扩展 `config/kernel-monitor.json` 中的 `conditions` 字段以支持新的节点状态。
|
||||
|
||||
```json
|
||||
{
|
||||
|
@ -245,14 +241,13 @@ To support new node conditions, you can extend the `conditions` field in
|
|||
|
||||
<!--
|
||||
### Detect New Problems
|
||||
-->
|
||||
### 检测新的问题
|
||||
|
||||
<!--
|
||||
To detect new problems, you can extend the `rules` field in `config/kernel-monitor.json`
|
||||
with new rule definition:
|
||||
-->
|
||||
您可以使用新的规则描述来扩展 `config/kernel-monitor.json` 中的 `rules` 字段以检测新问题。
|
||||
### 检测新的问题
|
||||
|
||||
你可以使用新的规则描述来扩展 `config/kernel-monitor.json` 中的 `rules` 字段以检测新问题。
|
||||
|
||||
```json
|
||||
{
|
||||
|
@ -265,42 +260,40 @@ with new rule definition:
|
|||
|
||||
<!--
|
||||
### Change Log Path
|
||||
-->
|
||||
### 更改日志路径
|
||||
|
||||
<!--
|
||||
Kernel log in different OS distros may locate in different path. The `log`
|
||||
field in `config/kernel-monitor.json` is the log path inside the container.
|
||||
You can always configure it to match your OS distro.
|
||||
-->
|
||||
不同操作系统发行版的内核日志的可能不同。 `config/kernel-monitor.json` 中的 `log` 字段是容器内的日志路径。您始终可以修改配置使其与您的 OS 发行版匹配。
|
||||
### 更改日志路径
|
||||
|
||||
不同操作系统发行版的内核日志的可能不同。 `config/kernel-monitor.json` 中的 `log` 字段是容器内的日志路径。你始终可以修改配置使其与你的 OS 发行版匹配。
|
||||
|
||||
<!--
|
||||
### Support Other Log Format
|
||||
-->
|
||||
### 支持其它日志格式
|
||||
|
||||
<!--
|
||||
Kernel monitor uses [`Translator`](https://github.com/kubernetes/node-problem-detector/blob/v0.1/pkg/kernelmonitor/translator/translator.go)
|
||||
plugin to translate kernel log the internal data structure. It is easy to
|
||||
implement a new translator for a new log format.
|
||||
-->
|
||||
内核监视器使用 [`Translator`] 插件将内核日志转换为内部数据结构。我们可以很容易为新的日志格式实现新的翻译器。
|
||||
### 支持其它日志格式 {#support-other-log-format}
|
||||
|
||||
内核监视器使用 [`Translator`](https://github.com/kubernetes/node-problem-detector/blob/v0.1/pkg/kernelmonitor/translator/translator.go) 插件将内核日志转换为内部数据结构。
|
||||
我们可以很容易为新的日志格式实现新的翻译器。
|
||||
|
||||
<!-- discussion -->
|
||||
|
||||
<!--
|
||||
## Caveats
|
||||
-->
|
||||
## 注意事项
|
||||
|
||||
<!--
|
||||
It is recommended to run the node problem detector in your cluster to monitor
|
||||
the node health. However, you should be aware that this will introduce extra
|
||||
resource overhead on each node. Usually this is fine, because:
|
||||
-->
|
||||
我们建议在集群中运行节点问题检测器来监视节点运行状况。但是,您应该知道这将在每个节点上引入额外的资源开销。一般情况下没有影响,因为:
|
||||
## 注意事项 {#caveats}
|
||||
|
||||
我们建议在集群中运行节点问题检测器来监视节点运行状况。
|
||||
但是,你应该知道这将在每个节点上引入额外的资源开销。一般情况下没有影响,因为:
|
||||
|
||||
<!--
|
||||
* The kernel log is generated relatively slowly.
|
||||
|
|
|
@ -1,47 +1,41 @@
|
|||
---
|
||||
reviewers:
|
||||
- fgrzadkowski
|
||||
- piosz
|
||||
title: 资源指标管道
|
||||
content_type: concept
|
||||
---
|
||||
<!--
|
||||
---
|
||||
reviewers:
|
||||
- fgrzadkowski
|
||||
- piosz
|
||||
title: Resource metrics pipeline
|
||||
content_type: concept
|
||||
---
|
||||
-->
|
||||
|
||||
<!-- overview -->
|
||||
|
||||
<!--
|
||||
Starting from Kubernetes 1.8, resource usage metrics, such as container CPU and memory usage,
|
||||
Resource usage metrics, such as container CPU and memory usage,
|
||||
are available in Kubernetes through the Metrics API. These metrics can be either accessed directly
|
||||
by user, for example by using `kubectl top` command, or used by a controller in the cluster, e.g.
|
||||
Horizontal Pod Autoscaler, to make decisions.
|
||||
-->
|
||||
从 Kubernetes 1.8开始,资源使用指标,例如容器 CPU 和内存使用率,可通过 Metrics API 在 Kubernetes 中获得。这些指标可以直接被用户访问,比如使用`kubectl top`命令行,或者这些指标由集群中的控制器使用,例如,Horizontal Pod Autoscaler,使用这些指标来做决策。
|
||||
|
||||
|
||||
|
||||
资源使用指标,例如容器 CPU 和内存使用率,可通过 Metrics API 在 Kubernetes 中获得。
|
||||
这些指标可以直接被用户访问,比如使用 `kubectl top` 命令行,或者这些指标由集群中的控制器使用,
|
||||
例如,Horizontal Pod Autoscaler,使用这些指标来做决策。
|
||||
|
||||
<!-- body -->
|
||||
|
||||
<!--
|
||||
## The Metrics API
|
||||
-->
|
||||
## Metrics API
|
||||
|
||||
<!--
|
||||
Through the Metrics API you can get the amount of resource currently used
|
||||
by a given node or a given pod. This API doesn't store the metric values,
|
||||
so it's not possible for example to get the amount of resources used by a
|
||||
given node 10 minutes ago.
|
||||
-->
|
||||
通过 Metrics API,您可以获得指定节点或 pod 当前使用的资源量。此 API 不存储指标值,因此想要获取某个指定节点10分钟前的资源使用量是不可能的。
|
||||
## Metrics API {#the-metrics-api}
|
||||
|
||||
通过 Metrics API,你可以获得指定节点或 Pod 当前使用的资源量。
|
||||
此 API 不存储指标值,因此想要获取某个指定节点 10 分钟前的资源使用量是不可能的。
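
例如,可以通过 `kubectl get --raw` 直接访问该 API,或使用更高层的 `kubectl top` 命令;下面是一个示意:

```shell
# 查询所有节点的当前资源用量
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes"
# 查询某个命名空间中所有 Pod 的当前资源用量
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/namespaces/kube-system/pods"
# 等价的高层命令
kubectl top nodes
kubectl top pods -n kube-system
```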
|
||||
|
||||
<!--
|
||||
The API is no different from any other API:
|
||||
|
@ -52,15 +46,16 @@ The API is no different from any other API:
|
|||
- it is discoverable through the same endpoint as the other Kubernetes APIs under `/apis/metrics.k8s.io/` path
|
||||
- it offers the same security, scalability and reliability guarantees
|
||||
-->
|
||||
- 此 API 和其它 Kubernetes API 一起位于同一端点(endpoint)之下,是可发现的,路径为`/apis/metrics.k8s.io/`
|
||||
- 此 API 和其它 Kubernetes API 一起位于同一端点(endpoint)之下,是可发现的,
|
||||
路径为 `/apis/metrics.k8s.io/`
|
||||
- 它提供相同的安全性、可扩展性和可靠性保证
|
||||
|
||||
<!--
|
||||
The API is defined in [k8s.io/metrics](https://github.com/kubernetes/metrics/blob/master/pkg/apis/metrics/v1beta1/types.go)
|
||||
repository. You can find more information about the API there.
|
||||
-->
|
||||
Metrics API 在[k8s.io/metrics](https://github.com/kubernetes/metrics/blob/master/pkg/apis/metrics/v1beta1/types.go)
|
||||
仓库中定义。您可以在那里找到有关 Metrics API 的更多信息。
|
||||
Metrics API 在 [k8s.io/metrics](https://github.com/kubernetes/metrics/blob/master/pkg/apis/metrics/v1beta1/types.go)
|
||||
仓库中定义。你可以在那里找到有关 Metrics API 的更多信息。
|
||||
|
||||
<!--
|
||||
The API requires metrics server to be deployed in the cluster. Otherwise it will be not available.
|
||||
|
@ -70,36 +65,66 @@ Metrics API 需要在集群中部署 Metrics Server。否则它将不可用。
|
|||
{{< /note >}}
|
||||
|
||||
<!--
|
||||
## Metrics Server
|
||||
## Measuring Resource Usage
|
||||
|
||||
### CPU
|
||||
|
||||
CPU is reported as the average usage, in [CPU cores](/docs/concepts/configuration/manage-compute-resources-container/#meaning-of-cpu), over a period of time. This value is derived by taking a rate over a cumulative CPU counter provided by the kernel (in both Linux and Windows kernels). The kubelet chooses the window for the rate calculation.
|
||||
-->
|
||||
## Metrics Server
|
||||
## 度量资源用量 {#measuring-resource-usage}
|
||||
|
||||
### CPU
|
||||
|
||||
CPU 用量按其一段时间内的平均值统计,单位为
|
||||
[CPU 核](/zh/docs/concepts/configuration/manage-resources-containers/#meaning-of-cpu)。
|
||||
此度量值是对内核(Linux 和 Windows 内核均支持)提供的累积 CPU 计数器取一段时间内的变化率得到的。
kubelet 负责选择计算该变化率所用的时间窗口。
|
||||
|
||||
### 内存 {#memory}
|
||||
|
||||
内存用量按工作集(Working Set)的大小字节数统计,其数值为收集度量值的那一刻的内存用量。
|
||||
如果一切都很理想化,“工作集” 是任务在使用的内存总量,该内存是不可以在内存压力较大
|
||||
的情况下被释放的。
|
||||
不过,具体的工作集计算方式取决于宿主 OS,有很大不同,且通常都大量使用启发式
|
||||
规则来给出一个估计值。
|
||||
其中包含所有匿名内存(即没有后端文件提供存储的内存)的使用量,因为 Kubernetes 不支持交换分区。
|
||||
度量值通常包含一些高速缓存(有后台文件提供存储)内存,因为宿主操作系统并不是总能
|
||||
回收这些页面。
|
||||
|
||||
<!--
|
||||
## Metrics Server
|
||||
|
||||
[Metrics Server](https://github.com/kubernetes-incubator/metrics-server) is a cluster-wide aggregator of resource usage data.
|
||||
Starting from Kubernetes 1.8 it's deployed by default in clusters created by `kube-up.sh` script
|
||||
as a Deployment object. If you use a different Kubernetes setup mechanism you can deploy it using the provided
|
||||
[deployment yamls](https://github.com/kubernetes-incubator/metrics-server/tree/master/deploy).
|
||||
It's supported in Kubernetes 1.7+ (see details below).
|
||||
-->
|
||||
[Metrics Server](https://github.com/kubernetes-incubator/metrics-server)是集群范围资源使用数据的聚合器。
|
||||
从 Kubernetes 1.8开始,它作为 Deployment 对象,被默认部署在由`kube-up.sh`脚本创建的集群中。
|
||||
如果您使用不同的 Kubernetes 安装方法,则可以使用提供的[deployment yamls](https://github.com/kubernetes-incubator/metrics-server/tree/master/deploy)来部署。它在 Kubernetes 1.7+中得到支持(详见下文)。
|
||||
## Metrics 服务器 {#metrics-server}
|
||||
|
||||
[Metrics 服务器](https://github.com/kubernetes-incubator/metrics-server)是集群范围资源使用数据的聚合器。
|
||||
在由 `kube-up.sh` 脚本创建的集群中默认会以 Deployment 的形式被部署。
|
||||
如果你使用其他 Kubernetes 安装方法,则可以使用提供的
|
||||
[deployment yamls](https://github.com/kubernetes-incubator/metrics-server/tree/master/deploy)
|
||||
来部署。
|
||||
|
||||
<!--
|
||||
Metric server collects metrics from the Summary API, exposed by [Kubelet](/docs/admin/kubelet/) on each node.
|
||||
-->
|
||||
Metric server 从每个节点上的 [Kubelet](/docs/admin/kubelet/) 公开的 Summary API 中采集指标信息。
|
||||
Metric server 从每个节点上的 [Kubelet](/zh/docs/reference/command-line-tools-reference/kubelet/)
|
||||
公开的 Summary API 中采集指标信息。
|
||||
|
||||
<!--
|
||||
Metrics Server registered in the main API server through
|
||||
[Kubernetes aggregator](/docs/concepts/api-extension/apiserver-aggregation/),
|
||||
which was introduced in Kubernetes 1.7.
|
||||
-->
|
||||
通过在主 API server 中注册的 Metrics Server [Kubernetes 聚合器](/docs/concepts/api-extension/apiserver-aggregation/) 来采集指标信息, 这是在 Kubernetes 1.7 中引入的。
|
||||
Metrics 服务器通过
|
||||
[Kubernetes 聚合器](/zh/docs/concepts/extend-kubernetes/api-extension/apiserver-aggregation/)
|
||||
注册到主 API 服务器。
|
||||
|
||||
<!--
|
||||
Learn more about the metrics server in [the design doc](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/instrumentation/metrics-server.md).
|
||||
-->
|
||||
在[设计文档](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/instrumentation/metrics-server.md)中可以了解到有关 Metrics Server 的更多信息。
|
||||
|
||||
在[设计文档](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/instrumentation/metrics-server.md)中可以了解到有关 Metrics 服务器的更多信息。
|
||||
|
||||
|
|
|
@ -1,16 +1,12 @@
|
|||
---
|
||||
reviewers:
|
||||
- mikedanese
|
||||
content_type: concept
|
||||
title: 资源监控工具
|
||||
---
|
||||
<!--
|
||||
---
|
||||
reviewers:
|
||||
- mikedanese
|
||||
content_type: concept
|
||||
title: Tools for Monitoring Resources
|
||||
---
|
||||
-->
|
||||
|
||||
<!-- overview -->
|
||||
|
@ -25,12 +21,12 @@ information about an application's resource usage at each of these levels.
|
|||
This information allows you to evaluate your application's performance and
|
||||
where bottlenecks can be removed to improve overall performance.
|
||||
-->
|
||||
要扩展应用程序并提供可靠的服务,您需要了解应用程序在部署时的行为。
|
||||
您可以通过检测容器检查 Kubernetes 集群中的应用程序性能,[pods](/docs/user-guide/pods), [服务](/docs/user-guide/services)和整个集群的特征。
|
||||
要扩展应用程序并提供可靠的服务,你需要了解应用程序在部署时的行为。
|
||||
你可以通过检测容器检查 Kubernetes 集群中的应用程序性能,
|
||||
[Pods](/zh/docs/concepts/workloads/pods), [服务](/zh/docs/concepts/services-networking/service/)
|
||||
和整个集群的特征。
|
||||
Kubernetes 在每个级别上提供有关应用程序资源使用情况的详细信息。
|
||||
此信息使您可以评估应用程序的性能,以及在何处可以消除瓶颈以提高整体性能。
|
||||
|
||||
|
||||
此信息使你可以评估应用程序的性能,以及在何处可以消除瓶颈以提高整体性能。
|
||||
|
||||
<!-- body -->
|
||||
|
||||
|
@ -38,22 +34,26 @@ Kubernetes 在每个级别上提供有关应用程序资源使用情况的详细
|
|||
In Kubernetes, application monitoring does not depend on a single monitoring solution. On new clusters, you can use [resource metrics](#resource-metrics-pipeline) or [full metrics](#full-metrics-pipeline) pipelines to collect monitoring statistics.
|
||||
-->
|
||||
在 Kubernetes 中,应用程序监控不依赖单个监控解决方案。
|
||||
在新集群上,您可以使用[资源度量](#资源度量管道)或[完整度量](#完整度量管道)管道来收集监视统计信息。
|
||||
在新集群上,你可以使用[资源度量](#resource-metrics-pipeline)或
|
||||
[完整度量](#full-metrics-pipeline)管道来收集监视统计信息。
|
||||
|
||||
<!--
|
||||
## Resource metrics pipeline
|
||||
-->
|
||||
## 资源度量管道
|
||||
|
||||
<!--
|
||||
The resource metrics pipeline provides a limited set of metrics related to
|
||||
cluster components such as the [Horizontal Pod Autoscaler](/docs/tasks/run-application/horizontal-pod-autoscale) controller, as well as the `kubectl top` utility.
|
||||
These metrics are collected by the lightweight, short-term, in-memory
|
||||
[metrics-server](https://github.com/kubernetes-incubator/metrics-server) and
|
||||
are exposed via the `metrics.k8s.io` API.
|
||||
-->
|
||||
资源指标管道提供了一组与集群组件,例如[Horizontal Pod Autoscaler]控制器(/docs/tasks/run-application/horizontal-pod-autoscale),以及 `kubectl top` 实用程序相关的有限度量。
|
||||
这些指标是由轻量级的、短期内存[度量服务器](https://github.com/kubernetes-incubator/metrics-server)收集的,通过 `metrics.k8s.io` 公开。
|
||||
## 资源度量管道 {#resource-metrics-pipeline}
|
||||
|
||||
资源指标管道提供一组与集群组件(例如
[Horizontal Pod Autoscaler](/zh/docs/tasks/run-application/horizontal-pod-autoscale/) 控制器)
以及 `kubectl top` 实用程序相关的有限度量。
|
||||
这些指标是由轻量级的、短期、内存存储的
|
||||
[度量服务器](https://github.com/kubernetes-incubator/metrics-server)收集的,
|
||||
通过 `metrics.k8s.io` 公开。
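下面是一个使用资源度量管道的简单示例(假设集群中已经部署了 metrics-server):

```shell
# 查看节点和 Pod 的资源使用情况
kubectl top nodes
kubectl top pods

# 也可以直接查询 metrics.k8s.io API
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes"
```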
|
||||
|
||||
<!--
|
||||
metrics-server discovers all nodes on the cluster and
|
||||
|
@ -69,7 +69,8 @@ resource usage statistics through the metrics-server Resource Metrics API.
|
|||
This API is served at `/metrics/resource/v1beta1` on the kubelet's authenticated and
|
||||
read-only ports.
|
||||
-->
|
||||
度量服务器发现集群中的所有节点,并且查询每个节点的[kubelet](/docs/reference/command-line-tools-reference/kubelet)以获取 CPU 和内存使用情况。
|
||||
度量服务器发现集群中的所有节点,并且查询每个节点的
|
||||
[kubelet](/zh/docs/reference/command-line-tools-reference/kubelet)以获取 CPU 和内存使用情况。
|
||||
Kubelet 充当 Kubernetes 主节点与节点之间的桥梁,管理机器上运行的 Pod 和容器。
|
||||
kubelet 将每个 Pod 拆解为其组成的容器,并通过容器运行时接口从容器运行时获取各个容器的使用情况统计信息。
kubelet 从集成的 cAdvisor 中获取这些信息,以支持旧式的 Docker 集成。
|
||||
|
@ -78,10 +79,7 @@ kubelet 从集成的 cAdvisor 获取此信息,以进行旧式 Docker 集成。
|
|||
|
||||
<!--
|
||||
## Full metrics pipeline
|
||||
-->
|
||||
## 完整度量管道
|
||||
|
||||
<!--
|
||||
A full metrics pipeline gives you access to richer metrics. Kubernetes can
|
||||
respond to these metrics by automatically scaling or adapting the cluster
|
||||
based on its current state, using mechanisms such as the Horizontal Pod
|
||||
|
@ -89,17 +87,18 @@ Autoscaler. The monitoring pipeline fetches metrics from the kubelet and
|
|||
then exposes them to Kubernetes via an adapter by implementing either the
|
||||
`custom.metrics.k8s.io` or `external.metrics.k8s.io` API.
|
||||
-->
|
||||
一个完整度量管道可以让您访问更丰富的度量。
|
||||
## 完整度量管道 {#full-metrics-pipeline}
|
||||
|
||||
一个完整度量管道可以让你访问更丰富的度量。
|
||||
Kubernetes 还可以使用 Pod 水平自动扩缩器等机制,根据集群的当前状态自动扩缩或调整集群,以响应这些度量。
|
||||
监控管道从 kubelet 获取度量,然后通过适配器将它们公开给 Kubernetes,方法是实现 `custom.metrics.k8s.io` 或 `external.metrics.k8s.io` API。
|
||||
监控管道从 kubelet 获取度量值,然后通过适配器将它们公开给 Kubernetes,
|
||||
方法是实现 `custom.metrics.k8s.io` 或 `external.metrics.k8s.io` API。
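下面是一个检查自定义度量 API 是否可用的示例草稿(仅当集群中安装了相应的适配器,例如 Prometheus Adapter 时才会返回数据):

```shell
# 列出 custom.metrics.k8s.io API 暴露的度量(需要已安装适配器)
kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1"
```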
|
||||
|
||||
<!--
|
||||
[Prometheus](https://prometheus.io), a CNCF project, can natively monitor Kubernetes, nodes, and Prometheus itself.
|
||||
Full metrics pipeline projects that are not part of the CNCF are outside the scope of Kubernetes documentation.
|
||||
-->
|
||||
[Prometheus](https://prometheus.io),一个 CNCF 项目,可以原生监控 Kubernetes、节点和
|
||||
[Prometheus](https://prometheus.io) 是一个 CNCF 项目,可以原生监控 Kubernetes、节点和
|
||||
Prometheus 本身。
|
||||
完整度量管道项目不属于 CNCF 的一部分,不在 Kubernetes 文档的范围之内。
|
||||
|
||||
|
||||
|
||||
|
|
|
@ -1,66 +1,76 @@
|
|||
---
|
||||
title: 删除 StatefulSet
|
||||
content_type: task
|
||||
weight: 60
|
||||
---
|
||||
|
||||
<!--
|
||||
reviewers:
|
||||
- bprashanth
|
||||
- erictune
|
||||
- foxish
|
||||
- janetkuo
|
||||
- smarterclayton
|
||||
title: 删除 StatefulSet
|
||||
title: Delete a StatefulSet
|
||||
content_type: task
|
||||
weight: 60
|
||||
---
|
||||
-->
|
||||
|
||||
<!-- overview -->
|
||||
|
||||
<!--
|
||||
This task shows you how to delete a StatefulSet.
|
||||
--->
|
||||
本文介绍如何删除 StatefulSet。
|
||||
|
||||
|
||||
本任务展示如何删除 StatefulSet。
|
||||
|
||||
## {{% heading "prerequisites" %}}
|
||||
|
||||
|
||||
<!--
|
||||
* This task assumes you have an application running on your cluster represented by a StatefulSet.
|
||||
--->
|
||||
* 本文假设在您的集群上已经运行了由 StatefulSet 创建的应用。
|
||||
|
||||
|
||||
* 本任务假设在你的集群上已经运行了由 StatefulSet 创建的应用。
|
||||
|
||||
<!-- steps -->
|
||||
|
||||
## 删除 StatefulSet
|
||||
## 删除 StatefulSet {#deleting-a-statefulset}
|
||||
|
||||
<!--
|
||||
You can delete a StatefulSet in the same way you delete other resources in Kubernetes: use the `kubectl delete` command, and specify the StatefulSet either by file or by name.
|
||||
--->
|
||||
您可以像删除 Kubernetes 中的其他资源一样删除 StatefulSet:使用 `kubectl delete` 命令,并按文件或者名字指定 StatefulSet。
|
||||
你可以像删除 Kubernetes 中的其他资源一样删除 StatefulSet:使用 `kubectl delete` 命令,并按文件或者名字指定 StatefulSet。
|
||||
|
||||
```shell
|
||||
kubectl delete -f <file.yaml>
|
||||
```
|
||||
|
||||
<!--
|
||||
```shell
|
||||
kubectl delete statefulsets <statefulset-name>
|
||||
```
|
||||
-->
|
||||
```shell
|
||||
kubectl delete statefulsets <statefulset 名称>
|
||||
```
|
||||
|
||||
<!--
|
||||
You may need to delete the associated headless service separately after the StatefulSet itself is deleted.
|
||||
--->
|
||||
删除 StatefulSet 之后,您可能需要单独删除关联的无头服务。
|
||||
|
||||
```shell
|
||||
kubectl delete service <service-name>
|
||||
```
|
||||
-->
|
||||
删除 StatefulSet 之后,你可能需要单独删除关联的无头服务。
|
||||
|
||||
```shell
|
||||
kubectl delete service <服务名称>
|
||||
```
|
||||
|
||||
<!--
|
||||
Deleting a StatefulSet through kubectl will scale it down to 0, thereby deleting all pods that are a part of it.
|
||||
If you want to delete just the StatefulSet and not the pods, use `--cascade=false`.
|
||||
--->
|
||||
通过 kubectl 删除 StatefulSet 会将其缩容为0,因此删除属于它的所有pods。
|
||||
如果您只想删除 StatefulSet 而不删除 pods,使用 `--cascade=false`。
|
||||
通过 `kubectl` 删除 StatefulSet 会先将其缩容为 0,从而删除属于它的所有 Pod。
如果你只想删除 StatefulSet 本身而不删除 Pod,请使用 `--cascade=false`。
|
||||
|
||||
```shell
|
||||
kubectl delete -f <file.yaml> --cascade=false
|
||||
|
@ -69,67 +79,72 @@ kubectl delete -f <file.yaml> --cascade=false
|
|||
<!--
|
||||
By passing `--cascade=false` to `kubectl delete`, the Pods managed by the StatefulSet are left behind even after the StatefulSet object itself is deleted. If the pods have a label `app=myapp`, you can then delete them as follows:
|
||||
--->
|
||||
通过将 `--cascade=false` 传递给 `kubectl delete`,在删除 StatefulSet 对象之后,StatefulSet 管理的 pods 会被保留下来。如果 pods 有一个标签 `app=myapp`,则可以按照如下方式删除它们:
|
||||
通过将 `--cascade=false` 传递给 `kubectl delete`,在删除 StatefulSet 对象之后,
|
||||
StatefulSet 管理的 Pod 会被保留下来。如果 Pod 具有标签 `app=myapp`,则可以按照
|
||||
如下方式删除它们:
|
||||
|
||||
```shell
|
||||
kubectl delete pods -l app=myapp
|
||||
```
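在执行上述删除之前,你可以先列出带有该标签的 Pod 以确认删除范围(标签 `app=myapp` 仅为示例,请替换为你自己的标签):

```shell
# 先确认将要删除的 Pod
kubectl get pods -l app=myapp
```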
|
||||
|
||||
|
||||
<!--
|
||||
### Persistent Volumes
|
||||
|
||||
<!--
|
||||
Deleting the Pods in a StatefulSet will not delete the associated volumes. This is to ensure that you have the chance to copy data off the volume before deleting it. Deleting the PVC after the pods have left the [terminating state](/docs/concepts/workloads/pods/pod/#termination-of-pods) might trigger deletion of the backing Persistent Volumes depending on the storage class and reclaim policy. You should never assume ability to access a volume after claim deletion.
|
||||
--->
|
||||
删除 StatefulSet 管理的 pods 并不会删除关联的卷。这是为了确保您有机会在删除卷之前从卷中复制数据。在pods离开[终止状态](/docs/concepts/workloads/pods/pod/#termination-of-pods)后删除 PVC 可能会触发删除支持的 Persistent Volumes,具体取决于存储类和回收策略。声明删除后,您永远不应该假设能够访问卷。
|
||||
-->
|
||||
### 持久卷 {#persistent-volumes}
|
||||
|
||||
删除 StatefulSet 管理的 Pod 并不会删除关联的卷。这是为了确保你有机会在删除卷之前从卷中复制数据。
|
||||
在 Pod 离开[终止状态](/zh/docs/concepts/workloads/pods/pod-lifecycle/#pod-termination)
|
||||
后删除 PVC 可能会触发其背后 PV 持久卷的删除,具体取决于存储类和回收策略。
|
||||
永远不要假定在 PVC 删除后仍然能够访问卷。
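在决定是否删除 PVC 之前,可以先列出该 StatefulSet 留下的 PVC(标签 `app=myapp` 仅为示例):

```shell
# 查看与该应用关联的 PVC
kubectl get pvc -l app=myapp
```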
|
||||
|
||||
<!--
|
||||
**Note: Use caution when deleting a PVC, as it may lead to data loss.**
|
||||
--->
|
||||
**注意:删除 PVC 时要谨慎,因为这可能会导致数据丢失。**
|
||||
Use caution when deleting a PVC, as it may lead to data loss.
|
||||
-->
|
||||
{{< note >}}
|
||||
删除 PVC 时要谨慎,因为这可能会导致数据丢失。
|
||||
{{< /note >}}
|
||||
|
||||
<!--
|
||||
### Complete deletion of a StatefulSet
|
||||
--->
|
||||
### 完全删除 StatefulSet
|
||||
|
||||
<!--
|
||||
To simply delete everything in a StatefulSet, including the associated pods, you can run a series of commands similar to the following:
|
||||
--->
|
||||
要简单地删除 StatefulSet 中的所有内容,包括关联的 pods,您可能需要运行一系列类似于以下内容的命令:
|
||||
-->
|
||||
### 完全删除 StatefulSet {#complete-deletion-of-a-statefulset}
|
||||
|
||||
要删除 StatefulSet 中的所有内容,包括与之关联的 Pod,你可以运行类似下面的一系列命令:
|
||||
|
||||
```shell
|
||||
grace=$(kubectl get pods <stateful-set-pod> --template '{{.spec.terminationGracePeriodSeconds}}')
|
||||
kubectl delete statefulset -l app=myapp
|
||||
sleep $grace
|
||||
kubectl delete pvc -l app=myapp
|
||||
|
||||
```
|
||||
|
||||
<!--
|
||||
In the example above, the Pods have the label `app=myapp`; substitute your own label as appropriate.
|
||||
--->
|
||||
在上面的例子中,pods 的标签为 `app=myapp`;适当地替换您自己的标签。
|
||||
-->
|
||||
在上面的例子中,Pod 的标签为 `app=myapp`;适当地替换你自己的标签。
|
||||
|
||||
<!--
|
||||
### Force deletion of StatefulSet pods
|
||||
--->
|
||||
### 强制删除 StatefulSet 类型的 pods
|
||||
|
||||
<!--
|
||||
If you find that some pods in your StatefulSet are stuck in the 'Terminating' or 'Unknown' states for an extended period of time, you may need to manually intervene to forcefully delete the pods from the apiserver. This is a potentially dangerous task. Refer to [Deleting StatefulSet Pods](/docs/tasks/manage-stateful-set/delete-pods/) for details.
|
||||
--->
|
||||
如果您发现 StatefulSet 中的某些 pods 长时间处于 'Terminating' 或者 'Unknown' 状态,则可能需要手动干预以强制从 apiserver 中删除 pods。这是一项潜在的危险任务。详细信息请阅读[删除 StatefulSet 类型的 Pods](/docs/tasks/manage-stateful-set/delete-pods/)。
|
||||
|
||||
-->
|
||||
### 强制删除 StatefulSet 的 Pod
|
||||
|
||||
如果你发现 StatefulSet 的某些 Pod 长时间处于 'Terminating' 或者 'Unknown' 状态,
|
||||
则可能需要手动干预以强制从 API 服务器中删除这些 Pod。
|
||||
这是一项有点危险的任务。详细信息请阅读
|
||||
[删除 StatefulSet 类型的 Pods](/zh/docs/tasks/run-application/delete-stateful-set/)。
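下面是一个强制删除单个 Pod 的示例草稿,仅在你已阅读上述文档并理解其风险后使用(`<pod 名称>` 为占位符):

```shell
# 跳过优雅终止流程,强制从 API 服务器中删除 Pod(有风险)
kubectl delete pods <pod 名称> --grace-period=0 --force
```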
|
||||
|
||||
## {{% heading "whatsnext" %}}
|
||||
|
||||
|
||||
<!--
|
||||
Learn more about [force deleting StatefulSet Pods](/docs/tasks/run-application/force-delete-stateful-set-pod/).
|
||||
--->
|
||||
了解更多有关[强制删除 StatefulSet 类型的 Pods](/docs/tasks/run-application/force-delete-stateful-set-pod/)。
|
||||
|
||||
|
||||
进一步了解[强制删除 StatefulSet 的 Pods](/zh/docs/tasks/run-application/force-delete-stateful-set-pod/)。
|
||||
|
||||
|
||||
|
|
|
@ -1,92 +1,167 @@
|
|||
---
|
||||
approvers:
|
||||
- bprashanth
|
||||
- enisoc
|
||||
- erictune
|
||||
- foxish
|
||||
- janetkuo
|
||||
- kow3ns
|
||||
- smarterclayton
|
||||
title: 弹缩StatefulSet
|
||||
title: 扩缩 StatefulSet
|
||||
content_type: task
|
||||
---
|
||||
|
||||
<!-- overview -->
|
||||
本文介绍如何弹缩StatefulSet.
|
||||
<!--
|
||||
This task shows how to scale a StatefulSet. Scaling a StatefulSet refers to increasing or decreasing the number of replicas.
|
||||
-->
|
||||
本文介绍如何扩缩 StatefulSet。StatefulSet 的扩缩指的是增加或者减少副本个数。
|
||||
|
||||
|
||||
## {{% heading "prerequisites" %}}
|
||||
|
||||
<!--
|
||||
* StatefulSets are only available in Kubernetes version 1.5 or later.
|
||||
To check your version of Kubernetes, run `kubectl version`.
|
||||
|
||||
* StatefulSets仅适用于Kubernetes1.5及以上版本.
|
||||
* **不是所有Stateful应用都适合弹缩.** 在弹缩前您的应用前. 您必须充分了解您的应用, 不适当的弹缩StatefulSet或许会造成应用自身功能的不稳定.
|
||||
* 仅当您确定该Stateful应用的集群是完全健康才可执行弹缩操作.
|
||||
* Not all stateful applications scale nicely. If you are unsure about whether to scale your StatefulSets, see [StatefulSet concepts](/docs/concepts/workloads/controllers/statefulset/) or [StatefulSet tutorial](/docs/tutorials/stateful-application/basic-stateful-set/) for further information.
|
||||
|
||||
* You should perform scaling only when you are confident that your stateful application
|
||||
cluster is completely healthy.
|
||||
-->
|
||||
* StatefulSets 仅适用于 Kubernetes 1.5 及以上版本。
|
||||
* 不是所有 Stateful 应用都能很好地执行扩缩操作。
|
||||
如果你不是很确定是否要扩缩你的 StatefulSet,可先参阅
|
||||
[StatefulSet 概念](/zh/docs/concepts/workloads/controllers/statefulset/)
|
||||
或者 [StatefulSet 教程](/zh/docs/tutorials/stateful-application/basic-stateful-set/)。
|
||||
|
||||
* 仅当你确定你的有状态应用的集群是完全健康的,才可执行扩缩操作。
|
||||
|
||||
<!-- steps -->
|
||||
|
||||
## 使用 `kubectl` 弹缩StatefulSets
|
||||
<!--
|
||||
## Scaling StatefulSets
|
||||
|
||||
弹缩请确认 `kubectl` 已经升级到Kubernetes1.5及以上版本. 如果不确定, 执行 `kubectl version` 命令并检查使用的 `Client Version`.
|
||||
### Use kubectl to scale StatefulSets
|
||||
|
||||
### `kubectl 弹缩`
|
||||
|
||||
首先, 找到您想要弹缩的StatefulSet. 记住, 您需先清楚是否能弹缩该应用.
|
||||
First, find the StatefulSet you want to scale.
|
||||
|
||||
```shell
|
||||
kubectl get statefulsets <stateful-set-name>
|
||||
```
|
||||
-->
|
||||
## 扩缩 StatefulSet {#scaling-statefulset}
|
||||
|
||||
改变StatefulSet副本数量:
|
||||
### 使用 `kubectl` 扩缩 StatefulSet
|
||||
|
||||
首先,找到你要扩缩的 StatefulSet。
|
||||
|
||||
```shell
|
||||
kubectl get statefulsets <statefulset 名称>
|
||||
```
|
||||
|
||||
<!--
|
||||
Change the number of replicas of your StatefulSet:
|
||||
|
||||
```shell
|
||||
kubectl scale statefulsets <stateful-set-name> --replicas=<new-replicas>
|
||||
```
|
||||
-->
|
||||
更改 StatefulSet 的副本个数:
|
||||
|
||||
### 可使用其他命令: `kubectl apply` / `kubectl edit` / `kubectl patch`
|
||||
```shell
|
||||
kubectl scale statefulsets <statefulset 名称> --replicas=<新的副本数>
|
||||
```
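例如,假设存在一个名为 `web` 的 StatefulSet(名称仅为示例),可以这样把它扩容到 5 个副本:

```shell
kubectl scale statefulsets web --replicas=5
```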
|
||||
|
||||
另外, 您可以 [in-place updates](/docs/concepts/cluster-administration/manage-deployment/#in-place-updates-of-resources) StatefulSets.
|
||||
<!--
|
||||
### Make in-place updates on your StatefulSets
|
||||
|
||||
如果您的StatefulSet开始由 `kubectl apply` 或 `kubectl create --save-config` 创建,更新StatefulSet manifests中的 `.spec.replicas`, 然后执行命令 `kubectl apply`:
|
||||
Alternatively, you can do [in-place updates](/docs/concepts/cluster-administration/manage-deployment/#in-place-updates-of-resources) on your StatefulSets.
|
||||
|
||||
If your StatefulSet was initially created with `kubectl apply`,
|
||||
update `.spec.replicas` of the StatefulSet manifests, and then do a `kubectl apply`:
|
||||
-->
|
||||
### 对 StatefulSet 执行就地更新
|
||||
|
||||
另外,你可以[就地更新](/zh/docs/concepts/cluster-administration/manage-deployment/#in-place-updates-of-resources) StatefulSet。
|
||||
|
||||
如果你的 StatefulSet 最初是通过 `kubectl apply` 或 `kubectl create --save-config` 创建的,
你可以更新 StatefulSet 清单中的 `.spec.replicas`,然后执行 `kubectl apply`:
|
||||
|
||||
<!--
|
||||
```shell
|
||||
kubectl apply -f <stateful-set-file-updated>
|
||||
```
|
||||
|
||||
除此之外, 可以通过命令 `kubectl edit` 编辑该字段:
|
||||
Otherwise, edit that field with `kubectl edit`:
|
||||
|
||||
```shell
|
||||
kubectl edit statefulsets <stateful-set-name>
|
||||
```
|
||||
|
||||
或使用 `kubectl patch`:
|
||||
Or use `kubectl patch`:
|
||||
|
||||
```shell
|
||||
kubectl patch statefulsets <stateful-set-name> -p '{"spec":{"replicas":<new-replicas>}}'
|
||||
```
|
||||
-->
|
||||
```shell
|
||||
kubectl apply -f <更新后的 statefulset 文件>
|
||||
```
|
||||
|
||||
## 排查故障
|
||||
否则,可以使用 `kubectl edit` 编辑副本字段:
|
||||
|
||||
### 缩容工作不正常
|
||||
```shell
|
||||
kubectl edit statefulsets <statefulset 名称>
|
||||
```
|
||||
|
||||
当Stateful管理下的任何一个Pod不健康时您不能缩容该StatefulSet. 仅当Stateful下的所有Pods都处于运行和ready状态后才可缩容.
|
||||
或者使用 `kubectl patch`:
|
||||
|
||||
当一个StatefulSet的size > 1, 如果有一个Pod不健康, 没有办法让Kubernetes知道是否是由于永久性故障还是瞬态(升级/维护/节点重启)导致. 如果该Pod不健康是由于永久性
|
||||
故障导致, 则在不纠正该故障的情况下进行缩容可能会导致一种状态, 即StatefulSet下的Pod数量低于应正常运行的副本数. 这也许会导致StatefulSet不可用.
|
||||
```shell
|
||||
kubectl patch statefulsets <statefulset 名称> -p '{"spec":{"replicas":<新的副本数>}}'
|
||||
```
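例如,假设存在一个名为 `web` 的 StatefulSet(名称仅为示例),可以这样把副本数修改为 3 并确认结果:

```shell
kubectl patch statefulsets web -p '{"spec":{"replicas":3}}'

# 确认变更已生效
kubectl get statefulsets web
```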
|
||||
|
||||
如果由于瞬态故障而导致Pod不健康,并且Pod可能再次可用,那么瞬态错误可能会干扰您对
|
||||
StatefulSet的扩容/缩容操作. 一些分布式数据库在节点加入和同时离开时存在问题. 在
|
||||
这些情况下,最好是在应用级别进行弹缩操作, 并且只有在您确保Stateful应用的集群是完全健康时才执行弹缩.
|
||||
<!--
|
||||
## Troubleshooting
|
||||
|
||||
### Scaling down does not work right
|
||||
-->
|
||||
## 故障排查 {#troubleshooting}
|
||||
|
||||
### 缩容操作无法正常工作
|
||||
|
||||
<!--
|
||||
You cannot scale down a StatefulSet when any of the stateful Pods it manages is unhealthy. Scaling down only takes place
|
||||
after those stateful Pods become running and ready.
|
||||
|
||||
If spec.replicas > 1, Kubernetes cannot determine the reason for an unhealthy Pod. It might be the result of a permanent fault or of a transient fault. A transient fault can be caused by a restart required by upgrading or maintenance.
|
||||
-->
|
||||
当 StatefulSet 所管理的任何 Pod 不健康时,你不能对该 StatefulSet 执行缩容操作。
仅当 StatefulSet 的所有 Pod 都处于 Running 且 Ready 状态后才可缩容。
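缩容之前,可以先确认所有 Pod 均处于 Running 且 Ready 状态(标签 `app=myapp` 仅为示例,请替换为你自己的标签):

```shell
# 检查 StatefulSet 管理的 Pod 状态
kubectl get pods -l app=myapp
kubectl describe statefulset <statefulset 名称>
```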
|
||||
|
||||
如果 `spec.replicas` 大于 1,Kubernetes 无法判定 Pod 不健康的原因。
|
||||
Pod 不健康可能是由永久性故障造成的,也可能是由瞬态故障造成的。
瞬态故障可能是由升级或维护所要求的节点重启引起的。
|
||||
|
||||
<!--
|
||||
If the Pod is unhealthy due to a permanent fault, scaling
|
||||
without correcting the fault may lead to a state where the StatefulSet membership
|
||||
drops below a certain minimum number of replicas that are needed to function
|
||||
correctly. This may cause your StatefulSet to become unavailable.
|
||||
-->
|
||||
如果该 Pod 不健康是由永久性故障导致的,则在不纠正该故障的情况下进行缩容,
可能会使 StatefulSet 的成员 Pod 数量低于正常运行所需的副本数。
这种状态可能会导致 StatefulSet 不可用。
|
||||
|
||||
<!--
|
||||
If the Pod is unhealthy due to a transient fault and the Pod might become available again,
|
||||
the transient error may interfere with your scale-up or scale-down operation. Some distributed
|
||||
databases have issues when nodes join and leave at the same time. It is better
|
||||
to reason about scaling operations at the application level in these cases, and
|
||||
perform scaling only when you are sure that your stateful application cluster is
|
||||
completely healthy.
|
||||
-->
|
||||
如果 Pod 不健康是由瞬态故障导致的,并且 Pod 可能再次变为可用,
那么瞬态错误可能会干扰你对 StatefulSet 的扩容或缩容操作。
一些分布式数据库在节点同时加入和离开时会出现问题。
在这些情况下,最好在应用层面来考量扩缩操作,
并且只有在确保有状态应用的集群完全健康时才执行扩缩。
|
||||
|
||||
## {{% heading "whatsnext" %}}
|
||||
|
||||
|
||||
了解更多 [deleting a StatefulSet](/docs/tasks/manage-stateful-set/deleting-a-statefulset/).
|
||||
|
||||
|
||||
|
||||
<!--
|
||||
* Learn more about [deleting a StatefulSet](/docs/tasks/run-application/delete-stateful-set/).
|
||||
-->
|
||||
* 进一步了解[删除 StatefulSet](/zh/docs/tasks/run-application/delete-stateful-set/)。
|
||||
|
||||
|
|