diff --git a/content/en/blog/_posts/2020-08-03-kubernetes-1-18-release-interview.md b/content/en/blog/_posts/2020-08-03-kubernetes-1-18-release-interview.md index a8e4e71736..f2057895c5 100644 --- a/content/en/blog/_posts/2020-08-03-kubernetes-1-18-release-interview.md +++ b/content/en/blog/_posts/2020-08-03-kubernetes-1-18-release-interview.md @@ -62,7 +62,7 @@ It was actually just making— again, startup, small company, small team, so rea **ADAM GLICK: What time frame was this?** -JORGE ALARCÓN: Three, four years ago, so definitely not 1.13. That's the best guesstimate that I can give at this point. But I wasn't able to find any good examples, any tutorials. The only book that I was able to get my hands on was the one written by Joe Beda, Kelsey Hightower, and I forget the other author. But what is it? "[Kubernetes— Up and Running](](http://shop.oreilly.com/product/0636920223788.do))"? +JORGE ALARCÓN: Three, four years ago, so definitely not 1.13. That's the best guesstimate that I can give at this point. But I wasn't able to find any good examples, any tutorials. The only book that I was able to get my hands on was the one written by Joe Beda, Kelsey Hightower, and I forget the other author. But what is it? "[Kubernetes— Up and Running](http://shop.oreilly.com/product/0636920223788.do)"? And in general, right now I use it as reference— it's really good. But as a beginner, I still was lost. They give all these amazing examples, they provide the applications, but I had no idea why someone might need a Pod, why someone might need a Deployment. So my last resort was to try and find someone who actually knew Kubernetes. diff --git a/content/en/blog/_posts/2024-03-01-sig-cloud-provider-spotlight.md b/content/en/blog/_posts/2024-03-01-sig-cloud-provider-spotlight.md new file mode 100644 index 0000000000..a0e5d829de --- /dev/null +++ b/content/en/blog/_posts/2024-03-01-sig-cloud-provider-spotlight.md @@ -0,0 +1,148 @@ +--- +layout: blog +title: "Spotlight on SIG Cloud Provider" +slug: sig-cloud-provider-spotlight-2024 +date: 2024-03-01 +canonicalUrl: https://www.k8s.dev/blog/2024/03/01/sig-cloud-provider-spotlight-2024/ +--- + +**Author**: Arujjwal Negi + +One of the most popular ways developers use Kubernetes-related services is via cloud providers, but +have you ever wondered how cloud providers can do that? How does this whole process of integration +of Kubernetes to various cloud providers happen? To answer that, let's put the spotlight on [SIG +Cloud Provider](https://github.com/kubernetes/community/blob/master/sig-cloud-provider/README.md). + +SIG Cloud Provider works to create seamless integrations between Kubernetes and various cloud +providers. Their mission? Keeping the Kubernetes ecosystem fair and open for all. By setting clear +standards and requirements, they ensure every cloud provider plays nicely with Kubernetes. It is +their responsibility to configure cluster components to enable cloud provider integrations. + +In this blog of the SIG Spotlight series, [Arujjwal Negi](https://twitter.com/arujjval) interviews +[Michael McCune](https://github.com/elmiko) (Red Hat), also known as _elmiko_, co-chair of SIG Cloud +Provider, to give us an insight into the workings of this group. + +## Introduction + +**Arujjwal**: Let's start by getting to know you. Can you give us a small intro about yourself and +how you got into Kubernetes? + +**Michael**: Hi, I’m Michael McCune, most people around the community call me by my handle, +_elmiko_. 
I’ve been a software developer for a long time now (Windows 3.1 was popular when I +started!), and I’ve been involved with open-source software for most of my career. I first got +involved with Kubernetes as a developer of machine learning and data science applications; the team +I was on at the time was creating tutorials and examples to demonstrate the use of technologies like +Apache Spark on Kubernetes. That said, I’ve been interested in distributed systems for many years +and when an opportunity arose to join a team working directly on Kubernetes, I jumped at it! + +## Functioning and working + +**Arujjwal**: Can you give us an insight into what SIG Cloud Provider does and how it functions? + +**Michael**: SIG Cloud Provider was formed to help ensure that Kubernetes provides a neutral +integration point for all infrastructure providers. Our largest task to date has been the extraction +and migration of in-tree cloud controllers to out-of-tree components. The SIG meets regularly to +discuss progress and upcoming tasks and also to answer questions and bugs that +arise. Additionally, we act as a coordination point for cloud provider subprojects such as the cloud +provider framework, specific cloud controller implementations, and the [Konnectivity proxy +project](https://kubernetes.io/docs/tasks/extend-kubernetes/setup-konnectivity/). + + +**Arujjwal:** After going through the project +[README](https://github.com/kubernetes/community/blob/master/sig-cloud-provider/README.md), I +learned that SIG Cloud Provider works with the integration of Kubernetes with cloud providers. How +does this whole process go? + +**Michael:** One of the most common ways to run Kubernetes is by deploying it to a cloud environment +(AWS, Azure, GCP, etc). Frequently, the cloud infrastructures have features that enhance the +performance of Kubernetes, for example, by providing elastic load balancing for Service objects. To +ensure that cloud-specific services can be consistently consumed by Kubernetes, the Kubernetes +community has created cloud controllers to address these integration points. Cloud providers can +create their own controllers either by using the framework maintained by the SIG or by following +the API guides defined in the Kubernetes code and documentation. One thing I would like to point out +is that SIG Cloud Provider does not deal with the lifecycle of nodes in a Kubernetes cluster; +for those types of topics, SIG Cluster Lifecycle and the Cluster API project are more appropriate +venues. + +## Important subprojects + +**Arujjwal:** There are a lot of subprojects within this SIG. Can you highlight some of the most +important ones and what job they do? + +**Michael:** I think the two most important subprojects today are the [cloud provider +framework](https://github.com/kubernetes/community/blob/master/sig-cloud-provider/README.md#kubernetes-cloud-provider) +and the [extraction/migration +project](https://github.com/kubernetes/community/blob/master/sig-cloud-provider/README.md#cloud-provider-extraction-migration). The +cloud provider framework is a common library to help infrastructure integrators build a cloud +controller for their infrastructure. This project is most frequently the starting point for new +people coming to the SIG. The extraction and migration project is the other big subproject and a +large part of why the framework exists. 
A little history might help explain further: for a long +time, Kubernetes needed some integration with the underlying infrastructure, not +necessarily to add features but to be aware of cloud events like instance termination. The cloud +provider integrations were built into the Kubernetes code tree, and thus the term "in-tree" was +created (check out this [article on the topic](https://kaslin.rocks/out-of-tree/) for more +info). The activity of maintaining provider-specific code in the main Kubernetes source tree was +considered undesirable by the community. The community’s decision inspired the creation of the +extraction and migration project to remove the "in-tree" cloud controllers in favor of +"out-of-tree" components. + + +**Arujjwal:** What makes [the cloud provider framework] a good place to start? Does it have consistent good beginner work? What +kind? + +**Michael:** I feel that the cloud provider framework is a good place to start as it encodes the +community’s preferred practices for cloud controller managers and, as such, will give a newcomer a +strong understanding of how and what the managers do. Unfortunately, there is not a consistent +stream of beginner work on this component; this is due in part to the mature nature of the framework +and that of the individual providers as well. For folks who are interested in getting more involved, +having some [Go language](https://go.dev/) knowledge is good and also having an understanding of +how at least one cloud API (e.g., AWS, Azure, GCP) works is also beneficial. In my personal opinion, +being a newcomer to SIG Cloud Provider can be challenging as most of the code around this project +deals directly with specific cloud provider interactions. My best advice to people wanting to do +more work on cloud providers is to grow your familiarity with one or two cloud APIs, then look +for open issues on the controller managers for those clouds, and always communicate with the other +contributors as much as possible. + +## Accomplishments + +**Arujjwal:** Can you share about an accomplishment(s) of the SIG that you are proud of? + +**Michael:** Since I joined the SIG, more than a year ago, we have made great progress in advancing +the extraction and migration subproject. We have moved from an alpha status on the defining +[KEP](https://github.com/kubernetes/enhancements/blob/master/keps/README.md) to a beta status and +are inching ever closer to removing the old provider code from the Kubernetes source tree. I've been +really proud to see the active engagement from our community members and to see the progress we have +made towards extraction. I have a feeling that, within the next few releases, we will see the final +removal of the in-tree cloud controllers and the completion of the subproject. + +## Advice for new contributors + +**Arujjwal:** Is there any suggestion or advice for new contributors on how they can start at SIG +Cloud Provider? + +**Michael:** This is a tricky question in my opinion. SIG Cloud Provider is focused on the code +pieces that integrate between Kubernetes and an underlying infrastructure. It is very common, but +not necessary, for members of the SIG to be representing a cloud provider in an official capacity. I +recommend that anyone interested in this part of Kubernetes should come to an SIG meeting to see how +we operate and also to study the cloud provider framework project. 
We have some interesting ideas +for future work, such as a common testing framework, that will cut across all cloud providers and +will be a great opportunity for anyone looking to expand their Kubernetes involvement. + +**Arujjwal:** Are there any specific skills you're looking for that we should highlight? To give you +an example from our own [SIG ContribEx] +(https://github.com/kubernetes/community/blob/master/sig-contributor-experience/README.md): +if you're an expert in [Hugo](https://gohugo.io/), we can always use some help with k8s.dev! + +**Michael:** The SIG is currently working through the final phases of our extraction and migration +process, but we are looking toward the future and starting to plan what will come next. One of the +big topics that the SIG has discussed is testing. Currently, we do not have a generic common set of +tests that can be exercised by each cloud provider to confirm the behaviour of their controller +manager. If you are an expert in Ginkgo and the Kubetest framework, we could probably use your help +in designing and implementing the new tests. + +--- + +This is where the conversation ends. I hope this gave you some insights about SIG Cloud Provider's +aim and working. This is just the tip of the iceberg. To know more and get involved with SIG Cloud +Provider, try attending their meetings +[here](https://github.com/kubernetes/community/blob/master/sig-cloud-provider/README.md#meetings). diff --git a/content/en/docs/tasks/access-application-cluster/service-access-application-cluster.md b/content/en/docs/tasks/access-application-cluster/service-access-application-cluster.md index 92f715b818..9effb1e2f7 100644 --- a/content/en/docs/tasks/access-application-cluster/service-access-application-cluster.md +++ b/content/en/docs/tasks/access-application-cluster/service-access-application-cluster.md @@ -87,7 +87,7 @@ Here is the configuration file for the application Deployment: Events: ``` - Make a note of the NodePort value for the service. For example, + Make a note of the NodePort value for the Service. For example, in the preceding output, the NodePort value is 31496. 1. 
List the pods that are running the Hello World application: diff --git a/content/en/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes.md b/content/en/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes.md index 4d0b744d7d..9615568d92 100644 --- a/content/en/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes.md +++ b/content/en/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes.md @@ -294,7 +294,6 @@ For example: ports: - name: liveness-port containerPort: 8080 - hostPort: 8080 livenessProbe: httpGet: @@ -318,7 +317,6 @@ So, the previous example would become: ports: - name: liveness-port containerPort: 8080 - hostPort: 8080 livenessProbe: httpGet: @@ -542,7 +540,6 @@ spec: ports: - name: liveness-port containerPort: 8080 - hostPort: 8080 livenessProbe: httpGet: diff --git a/content/es/docs/tasks/run-application/force-delete-stateful-set-pod.md b/content/es/docs/tasks/run-application/force-delete-stateful-set-pod.md new file mode 100644 index 0000000000..2864766f5c --- /dev/null +++ b/content/es/docs/tasks/run-application/force-delete-stateful-set-pod.md @@ -0,0 +1,109 @@ +--- +reviewers: +- ramrodo +title: Eliminación Forzosa de Pods de StatefulSet +content_type: task +weight: 70 +--- + + +Esta página muestra cómo eliminar Pods que son parte de un +{{< glossary_tooltip text="StatefulSet" term_id="StatefulSet" >}}, +y explica las consideraciones a tener en cuenta al hacerlo. + +## {{% heading "prerequisites" %}} + +- Esta es una tarea bastante avanzada y tiene el potencial de violar algunas de las propiedades + inherentes de StatefulSet. +- Antes de proceder, familiarízate con las consideraciones enumeradas a continuación. + + + +## Consideraciones de StatefulSet + +En la operación normal de un StatefulSet, **nunca** hay necesidad de eliminar forzosamente un Pod de StatefulSet. +El [controlador de StatefulSet](/es/docs/concepts/workloads/controllers/statefulset/) es responsable de +crear, escalar y eliminar miembros del StatefulSet. Intenta asegurar que el número especificado +de Pods, desde el ordinal 0 hasta N-1, estén vivos y listos. StatefulSet asegura que, en cualquier momento, +exista como máximo un Pod con una identidad dada, corriendo en un clúster. Esto se refiere a la semántica de +*como máximo uno* proporcionada por un StatefulSet. + +La eliminación manual forzada debe realizarse con precaución, ya que tiene el potencial de violar la +semántica de como máximo uno, inherente a StatefulSet. Los StatefulSets pueden usarse para ejecutar aplicaciones distribuidas y +agrupadas que necesitan una identidad de red estable y almacenamiento estable. +Estas aplicaciones a menudo tienen configuraciones que dependen de un conjunto de un número fijo de +miembros con identidades fijas. Tener múltiples miembros con la misma identidad puede ser desastroso +y puede llevar a pérdida de datos (por ejemplo, escenario de cerebro dividido en sistemas basados en quórum). + +## Eliminar Pods + +Puedes realizar una eliminación de Pod paulatina con el siguiente comando: + +```shell +kubectl delete pods +``` + +Para que lo anterior conduzca a una terminación paulatina, el Pod no debe especificar un +`pod.Spec.TerminationGracePeriodSeconds` de 0. La práctica de establecer un +`pod.Spec.TerminationGracePeriodSeconds` de 0 segundos es insegura y se desaconseja rotundamente +para los Pods de StatefulSet. 
La eliminación paulatina es segura y garantizará que el Pod
+se apague de [manera paulatina](/docs/concepts/workloads/pods/pod-lifecycle/#pod-termination), antes de que kubelet elimine el nombre en el apiserver.
+
+Un Pod no se elimina automáticamente cuando un nodo no es accesible.
+Los Pods que se ejecutan en un Nodo inaccesible entran en el estado 'Terminating' o 'Unknown' después de un
+[tiempo de espera](/es/docs/concepts/architecture/nodes/#estados).
+Los Pods también pueden entrar en estos estados cuando el usuario intenta la eliminación paulatina de un Pod
+en un nodo inaccesible.
+Las únicas formas en que un Pod en tal estado puede ser eliminado del apiserver son las siguientes:
+
+- El objeto Node es eliminado (ya sea por ti, o por el [Controlador de Nodo](/es/docs/concepts/architecture/nodes/#controlador-de-nodos)).
+- Kubelet, en el nodo no responsivo, comienza a responder, mata el Pod y elimina la entrada del apiserver.
+- Eliminación forzada del Pod por el usuario.
+
+La mejor práctica recomendada es usar el primer o segundo enfoque. Si un nodo está confirmado
+como muerto (por ejemplo, desconectado permanentemente de la red, apagado, etc.), entonces elimina
+el objeto Node. Si el nodo es afectado por una partición de red, entonces trata de resolver esto
+o espera a que se resuelva. Cuando la partición se solucione, kubelet completará la eliminación
+del Pod y liberará su nombre en el apiserver.
+
+Normalmente, el sistema completa la eliminación una vez que el Pod ya no se está ejecutando en un nodo, o
+el nodo es eliminado por un administrador. Puedes anular esto forzando la eliminación del Pod.
+
+### Eliminación Forzosa
+
+Las eliminaciones forzosas **no** esperan confirmación de kubelet de que el Pod ha sido terminado.
+Independientemente de si una eliminación forzosa tiene éxito en matar un Pod, inmediatamente
+liberará el nombre del apiserver. Esto permitiría que el controlador de StatefulSet cree un Pod de reemplazo
+con esa misma identidad; esto puede llevar a la duplicación de un Pod que aún está en ejecución,
+y si dicho Pod todavía puede comunicarse con los otros miembros del StatefulSet,
+violará la semántica de como máximo uno que StatefulSet está diseñado para garantizar.
+
+Cuando eliminas forzosamente un Pod de StatefulSet, estás afirmando que el Pod en cuestión nunca
+volverá a hacer contacto con otros Pods en el StatefulSet y su nombre puede ser liberado de forma segura para que
+se cree un reemplazo.
+
+Si quieres eliminar un Pod de forma forzosa usando la versión de kubectl >= 1.5, haz lo siguiente:
+
+```shell
+kubectl delete pods <pod> --grace-period=0 --force
+```
+
+Si estás usando cualquier versión de kubectl <= 1.4, deberías omitir la opción `--force` y usar:
+
+```shell
+kubectl delete pods <pod> --grace-period=0
+```
+
+Si incluso después de estos comandos el Pod está atascado en el estado `Unknown`, usa el siguiente comando para
+eliminar el Pod del clúster:
+
+```shell
+kubectl patch pod <pod> -p '{"metadata":{"finalizers":null}}'
+```
+
+Siempre realiza la eliminación forzosa de Pods de StatefulSet con cuidado y con pleno conocimiento de los riesgos involucrados.
+
+## {{% heading "whatsnext" %}}
+
+Aprende más sobre [depurar un StatefulSet](/docs/tasks/debug/debug-application/debug-statefulset/).
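A modo de ilustración, este es un bosquejo mínimo del flujo que describe la página anterior antes de recurrir a la eliminación forzosa; el nombre del Pod (`web-0`) y el marcador `<nombre-del-nodo>` son ejemplos hipotéticos que no forman parte del documento original:

```shell
# Bosquejo ilustrativo: comprobar el estado del Pod y del nodo antes de forzar la eliminación.
# "web-0" y "<nombre-del-nodo>" son marcadores de posición.
kubectl get pod web-0 -o wide        # muestra el estado (Terminating/Unknown) y el nodo asignado
kubectl get node <nombre-del-nodo>   # confirmar que el nodo está NotReady o inaccesible

# Solo cuando estés seguro de que el Pod no volverá a ejecutarse:
kubectl delete pod web-0 --grace-period=0 --force
```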
diff --git a/content/ja/releases/notes.md b/content/ja/releases/notes.md new file mode 100644 index 0000000000..e3e5d2d9ba --- /dev/null +++ b/content/ja/releases/notes.md @@ -0,0 +1,15 @@ +--- +linktitle: リリースノート +title: ノート +type: docs +description: > + Kubernetesのリリースノート +sitemap: + priority: 0.5 +--- + +リリースノートは、使用しているKubernetesのバージョンに合った[Changelog](https://github.com/kubernetes/kubernetes/tree/master/CHANGELOG)を読むことで確認できます。 +{{< skew currentVersionAddMinor 0 >}}のchangelogを見るには[GitHub](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-{{< skew currentVersionAddMinor 0 >}}.md)を参照してください。 + +またリリースノートは、[relnotes.k8s.io](https://relnotes.k8s.io)上で検索してフィルタリングすることもできます。 +{{< skew currentVersionAddMinor 0 >}}のフィルタリングされたリリースノートを見るには[relnotes.k8s.io](https://relnotes.k8s.io/?releaseVersions={{< skew currentVersionAddMinor 0 >}}.0)を参照してください。 diff --git a/content/ru/docs/tutorials/kubernetes-basics/scale/scale-intro.html b/content/ru/docs/tutorials/kubernetes-basics/scale/scale-intro.html index 4399e729c2..c1c2c2c540 100644 --- a/content/ru/docs/tutorials/kubernetes-basics/scale/scale-intro.html +++ b/content/ru/docs/tutorials/kubernetes-basics/scale/scale-intro.html @@ -41,7 +41,7 @@ description: |-
-Количество экземпляров можно указать прямо при создании деплоймента, используя параметр --replicas команды kubectl create deployment
+Количество экземпляров можно указать прямо при создании деплоймента, используя параметр --replicas команды kubectl create deployment
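Для наглядности — минимальный набросок команды, о которой говорится в строке выше; имя Deployment и образ контейнера здесь условные и в исходном документе не упоминаются:

```shell
# Условный пример: Deployment сразу создаётся с тремя репликами.
# Имя "kubernetes-bootcamp" и образ приведены только для иллюстрации.
kubectl create deployment kubernetes-bootcamp \
  --image=gcr.io/google-samples/kubernetes-bootcamp:v1 \
  --replicas=3

# Проверить число желаемых и готовых реплик.
kubectl get deployments
```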

diff --git a/content/zh-cn/docs/setup/best-practices/multiple-zones.md b/content/zh-cn/docs/setup/best-practices/multiple-zones.md index 796a2b0796..9636b38a93 100644 --- a/content/zh-cn/docs/setup/best-practices/multiple-zones.md +++ b/content/zh-cn/docs/setup/best-practices/multiple-zones.md @@ -1,6 +1,6 @@ --- title: 运行于多可用区环境 -weight: 10 +weight: 20 content_type: concept --- @@ -18,7 +18,7 @@ content_type: concept -本页描述如何跨多个区(Zone)中运行集群。 +本页描述如何跨多个区(Zone)运行集群。 @@ -35,11 +35,11 @@ APIs and services. Typical cloud architectures aim to minimize the chance that a failure in one zone also impairs services in another zone. --> -## 背景 +## 背景 {#background} Kubernetes 从设计上允许同一个 Kubernetes 集群跨多个失效区来运行, -通常这些区位于某个称作 _区域(region)_ 逻辑分组中。 -主要的云提供商都将区域定义为一组失效区的集合(也称作 _可用区(Availability Zones)_), +通常这些区位于某个称作 **区域(Region)** 逻辑分组中。 +主要的云提供商都将区域定义为一组失效区的集合(也称作 **可用区(Availability Zones**)), 能够提供一组一致的功能特性:每个区域内,各个可用区提供相同的 API 和服务。 典型的云体系结构都会尝试降低某个区中的失效影响到其他区中服务的概率。 @@ -66,10 +66,10 @@ If you are running a cloud controller manager then you should also replicate this across all the failure zones you selected. --> 当你部署集群控制面时,应将控制面组件的副本跨多个失效区来部署。 -如果可用性是一个很重要的指标,应该选择至少三个失效区,并将每个 -控制面组件(API 服务器、调度器、etcd、控制器管理器)复制多个副本, -跨至少三个失效区来部署。如果你在运行云控制器管理器,则也应该将 -该组件跨所选的三个失效区来部署。 +如果可用性是一个很重要的指标,应该选择至少三个失效区, +并将每个控制面组件(API 服务器、调度器、etcd、控制器管理器)复制多个副本, +跨至少三个失效区来部署。如果你在运行云控制器管理器, +则也应该将该组件跨所选的三个失效区来部署。 {{< note >}} ## 节点行为 {#node-behavior} -Kubernetes 自动为负载资源(如{{< glossary_tooltip text="Deployment" term_id="deployment" >}} -或 {{< glossary_tooltip text="StatefulSet" term_id="statefulset" >}})) -跨集群中不同节点来部署其 Pods。 +Kubernetes 自动为负载资源(如 {{< glossary_tooltip text="Deployment" term_id="deployment" >}} +或 {{< glossary_tooltip text="StatefulSet" term_id="statefulset" >}}) +跨集群中不同节点来部署其 Pod。 这种分布逻辑有助于降低失效带来的影响。 -节点启动时,每个节点上的 kubelet 会向 Kubernetes API 中代表该 kubelet 的 Node 对象 -添加 {{< glossary_tooltip text="标签" term_id="label" >}}。 +节点启动时,每个节点上的 kubelet 会向 Kubernetes API 中代表该 kubelet 的 Node +对象添加{{< glossary_tooltip text="标签" term_id="label" >}}。 这些标签可能包含[区信息](/zh-cn/docs/reference/labels-annotations-taints/#topologykubernetesiozone)。 如果你的集群跨了多个可用区或者地理区域,你可以使用节点标签,结合 [Pod 拓扑分布约束](/zh-cn/docs/concepts/scheduling-eviction/topology-spread-constraints/) -来控制如何在你的集群中多个失效域之间分布 Pods。这里的失效域可以是 -地理区域、可用区甚至是特定节点。 -这些提示信息使得{{< glossary_tooltip text="调度器" term_id="kube-scheduler" >}} -能够更好地分布 Pods,以实现更好的可用性,降低因为某种失效给整个工作负载 -带来的风险。 +来控制如何在你的集群中多个失效域之间分布 Pod。这里的失效域可以是地理区域、可用区甚至是特定节点。 +这些提示信息使得{{< glossary_tooltip text="调度器" term_id="kube-scheduler" >}}能够更好地调度 +Pod,以实现更好的可用性,降低因为某种失效给整个工作负载带来的风险。 -例如,你可以设置一种约束,确保某个 StatefulSet 中的三个副本都运行在 -不同的可用区中,只要其他条件允许。你可以通过声明的方式来定义这种约束, +例如,你可以设置一种约束,确保某个 StatefulSet 中的 3 个副本都运行在不同的可用区中, +只要其他条件允许。你可以通过声明的方式来定义这种约束, 而不需要显式指定每个工作负载使用哪些可用区。 -### 跨多个区分布节点 {#distributing-nodes-across-zones} +### 跨多个区分布节点 {#distributing-nodes-across-zones} -Kubernetes 的核心逻辑并不会帮你创建节点,你需要自行完成此操作,或者使用 -类似 [Cluster API](https://cluster-api.sigs.k8s.io/) 这类工具来替你管理节点。 +Kubernetes 的核心逻辑并不会帮你创建节点,你需要自行完成此操作,或者使用类似 +[Cluster API](https://cluster-api.sigs.k8s.io/) 这类工具来替你管理节点。 -使用类似 Cluster API 这类工具,你可以跨多个失效域来定义一组用做你的集群 -工作节点的机器,以及当整个区的服务出现中断时如何自动治愈集群的策略。 +使用类似 Cluster API 这类工具,你可以跨多个失效域来定义一组用做你的集群工作节点的机器, +以及当整个区的服务出现中断时如何自动治愈集群的策略。 -## 为 Pods 手动指定区 +## 为 Pod 手动指定区 {#manual-zone-assignment-for-pods} -你可以应用[节点选择算符约束](/zh-cn/docs/concepts/scheduling-eviction/assign-pod-node/#nodeselector) -到你所创建的 Pods 上,或者为 Deployment、StatefulSet 或 Job 这类工作负载资源 -中的 Pod 模板设置此类约束。 
+你可以应用[节点选择算符约束](/zh-cn/docs/concepts/scheduling-eviction/assign-pod-node/#nodeselector)到你所创建的 +Pod 上,或者为 Deployment、StatefulSet 或 Job 这类工作负载资源中的 Pod 模板设置此类约束。 -## 跨区的存储访问 +## 跨区的存储访问 {#storage-access-for-zones} -当创建持久卷时,Kubernetes 会自动向那些链接到特定区的 PersistentVolume 添加区标签。。 +当创建持久卷时,Kubernetes 会自动向那些链接到特定区的 PersistentVolume 添加区标签。 {{< glossary_tooltip text="调度器" term_id="kube-scheduler" >}}通过其 -`NoVolumeZoneConflict` 断言确保申领给定 PersistentVolume 的 Pods 只会 -被调度到该卷所在的可用区。 +`NoVolumeZoneConflict` 断言确保申领给定 PersistentVolume 的 Pod +只会被调度到该卷所在的可用区。 + + +请注意,添加区标签的方法可能取决于你的云提供商和存储制备器。 +请参阅具体的环境文档,确保配置正确。 请注意,添加区标签的方法可能会根据你所使用的云提供商和存储制备器而有所不同。 为确保配置正确,请始终参阅你的环境的特定文档。 @@ -212,10 +217,11 @@ storage in that class may use. To learn about configuring a StorageClass that is aware of failure domains or zones, see [Allowed topologies](/docs/concepts/storage/storage-classes/#allowed-topologies). --> -你可以为 PersistentVolumeClaim 指定{{< glossary_tooltip text="StorageClass" term_id="storage-class" >}} +你可以为 PersistentVolumeClaim 指定 +{{< glossary_tooltip text="StorageClass" term_id="storage-class" >}} 以设置该类中的存储可以使用的失效域(区)。 -要了解如何配置能够感知失效域或区的 StorageClass,请参阅 -[可用的拓扑逻辑](/zh-cn/docs/concepts/storage/storage-classes/#allowed-topologies)。 +要了解如何配置能够感知失效域或区的 StorageClass, +请参阅[可用的拓扑逻辑](/zh-cn/docs/concepts/storage/storage-classes/#allowed-topologies)。 -## 网络 {#networking} +## 网络 {#networking} Kubernetes 自身不提供与可用区相关的联网配置。 你可以使用[网络插件](/zh-cn/docs/concepts/extend-kubernetes/compute-storage-net/network-plugins/) 来配置集群的联网,该网络解决方案可能拥有一些与可用区相关的元素。 -例如,如果你的云提供商支持 `type=LoadBalancer` 的 Service,则负载均衡器 -可能仅会将请求流量发送到运行在负责处理给定连接的负载均衡器组件所在的区。 +例如,如果你的云提供商支持 `type=LoadBalancer` 的 Service, +则负载均衡器可能仅会将请求流量发送到运行在负责处理给定连接的负载均衡器组件所在的区。 请查阅云提供商的文档了解详细信息。 -对于自定义的或本地集群部署,也可以考虑这些因素 -{{< glossary_tooltip text="Service" term_id="service" >}} +对于自定义的或本地集群部署,也可以考虑这些因素。 +{{< glossary_tooltip text="Service" term_id="service" >}} 和 {{< glossary_tooltip text="Ingress" term_id="ingress" >}} 的行为, 包括处理不同失效区的方法,在很大程度上取决于你的集群是如何搭建的。 @@ -266,11 +272,11 @@ something to consider. --> ## 失效恢复 {#fault-recovery} -在搭建集群时,你可能需要考虑当某区域中的所有失效区都同时掉线时,是否以及如何 -恢复服务。例如,你是否要求在某个区中至少有一个节点能够运行 Pod? +在搭建集群时,你可能需要考虑当某区域中的所有失效区都同时掉线时,是否以及如何恢复服务。 +例如,你是否要求在某个区中至少有一个节点能够运行 Pod? 请确保任何对集群很关键的修复工作都不要指望集群中至少有一个健康节点。 例如:当所有节点都不健康时,你可能需要运行某个修复性的 Job, -该 Job 要设置特定的{{< glossary_tooltip text="容忍度" term_id="toleration" >}} +该 Job 要设置特定的{{< glossary_tooltip text="容忍度" term_id="toleration" >}}, 以便修复操作能够至少将一个节点恢复为可用状态。 Kubernetes 对这类问题没有现成的解决方案;不过这也是要考虑的因素之一。 @@ -281,6 +287,5 @@ Kubernetes 对这类问题没有现成的解决方案;不过这也是要考虑 To learn how the scheduler places Pods in a cluster, honoring the configured constraints, visit [Scheduling and Eviction](/docs/concepts/scheduling-eviction/). 
--> -要了解调度器如何在集群中放置 Pods 并遵从所配置的约束,可参阅 -[调度与驱逐](/zh-cn/docs/concepts/scheduling-eviction/)。 - +要了解调度器如何在集群中放置 Pod 并遵从所配置的约束, +可参阅[调度与驱逐](/zh-cn/docs/concepts/scheduling-eviction/)。 diff --git a/content/zh-cn/docs/tasks/administer-cluster/configure-upgrade-etcd.md b/content/zh-cn/docs/tasks/administer-cluster/configure-upgrade-etcd.md index 7f3d68589c..af1f786c84 100644 --- a/content/zh-cn/docs/tasks/administer-cluster/configure-upgrade-etcd.md +++ b/content/zh-cn/docs/tasks/administer-cluster/configure-upgrade-etcd.md @@ -19,17 +19,15 @@ weight: 270 ## {{% heading "prerequisites" %}} -{{< include "task-tutorial-prereqs.md" >}} {{< version-check >}} - 你需要有一个 Kubernetes 集群,并且必须配置 kubectl 命令行工具以与你的集群通信。 -建议在至少有两个不充当控制平面的节点上运行此任务。如果你还没有集群, +建议参照本指南在至少有两个不充当控制平面的节点上运行此任务。如果你还没有集群, 你可以使用 [minikube](https://minikube.sigs.k8s.io/docs/tutorials/multi_node/) 创建一个。 @@ -42,7 +40,14 @@ nodes . If you do not already have a cluster, you can create one by using * etcd is a leader-based distributed system. Ensure that the leader periodically send heartbeats on time to all followers to keep the cluster stable. +--> +## 先决条件 {#prerequisites} +* 运行的 etcd 集群个数成员为奇数。 + +* etcd 是一个基于领导者(Leader-Based)的分布式系统。确保主节点定期向所有从节点发送心跳,以保持集群稳定。 + + -## 先决条件 {#prerequisites} - -* 运行的 etcd 集群个数成员为奇数。 - -* etcd 是一个 leader-based 分布式系统。确保主节点定期向所有从节点发送心跳,以保持集群稳定。 - * 确保不发生资源不足。 集群的性能和稳定性对网络和磁盘 I/O 非常敏感。任何资源匮乏都会导致心跳超时, 从而导致集群的不稳定。不稳定的情况表明没有选出任何主节点。 在这种情况下,集群不能对其当前状态进行任何更改,这意味着不能调度新的 Pod。 + * 保持 etcd 集群的稳定对 Kubernetes 集群的稳定性至关重要。 因此,请在专用机器或隔离环境上运行 etcd 集群, 以满足[所需资源需求](https://etcd.io/docs/current/op-guide/hardware/)。 @@ -100,7 +100,7 @@ This section covers starting a single-node and multi-node etcd cluster. -配置安全通信后,限制只有 Kubernetes API 服务器可以访问 etcd 集群。使用 TLS 身份验证来完成此任务。 +配置安全通信后,使用 TLS 身份验证来限制只有 Kubernetes API 服务器可以访问 etcd 集群。 -Kubernetes 目前不支持 etcd 身份验证。 -想要了解更多信息,请参阅相关的问题[支持 etcd v2 的基本认证](https://github.com/kubernetes/kubernetes/issues/23398)。 +Kubernetes 没有为 etcd 提供身份验证的计划。 {{< /note >}} 3. 停止故障节点上的 etcd 服务器。除了 Kubernetes API 服务器之外的其他客户端可能会造成流向 etcd 的流量, 可以停止所有流量以防止写入数据目录。 4. 移除失败的成员: @@ -400,7 +397,7 @@ replace it with `member4=http://10.0.0.4`. ``` 5. 增加新成员: @@ -418,7 +415,7 @@ replace it with `member4=http://10.0.0.4`. ``` 6. 在 IP 为 `10.0.0.4` 的机器上启动新增加的成员: @@ -430,7 +427,7 @@ replace it with `member4=http://10.0.0.4`. ``` ## 备份 etcd 集群 {#backing-up-an-etcd-cluster} -所有 Kubernetes 对象都存储在 etcd 上。 +所有 Kubernetes 对象都存储在 etcd 中。 定期备份 etcd 集群数据对于在灾难场景(例如丢失所有控制平面节点)下恢复 Kubernetes 集群非常重要。 快照文件包含所有 Kubernetes 状态和关键信息。为了保证敏感的 Kubernetes 数据的安全,可以对快照文件进行加密。 @@ -482,22 +479,22 @@ snapshot and volume snapshot. 
### 内置快照 {#built-in-snapshot} -etcd 支持内置快照。快照可以从使用 `etcdctl snapshot save` 命令的活动成员中获取, +etcd 支持内置快照。快照可以从使用 `etcdctl snapshot save` 命令的活动成员中创建, 也可以通过从 etcd [数据目录](https://etcd.io/docs/current/op-guide/configuration/#--data-dir) -复制 `member/snap/db` 文件,该 etcd 数据目录目前没有被 etcd 进程使用。获取快照不会影响成员的性能。 +复制 `member/snap/db` 文件,该 etcd 数据目录目前没有被 etcd 进程使用。创建快照不会影响成员的性能。 -下面是一个示例,用于获取 `$ENDPOINT` 所提供的键空间的快照到文件 `snapshot.db`: +下面是一个示例,用于创建 `$ENDPOINT` 所提供的键空间的快照到文件 `snapshot.db`: ```shell ETCDCTL_API=3 etcdctl --endpoints $ENDPOINT snapshot save snapshot.db @@ -527,11 +524,11 @@ ETCDCTL_API=3 etcdctl --write-out=table snapshot status snapshot.db 如果 etcd 运行在支持备份的存储卷(如 Amazon Elastic Block -存储)上,则可以通过获取存储卷的快照来备份 etcd 数据。 +存储)上,则可以通过创建存储卷的快照来备份 etcd 数据。 我们还可以使用 etcdctl 提供的各种选项来制作快照。例如: @@ -548,10 +545,10 @@ ETCDCTL_API=3 etcdctl -h ``` -列出 etcdctl 可用的各种选项。例如,你可以通过指定端点、证书等来制作快照,如下所示: +列出 etcdctl 可用的各种选项。例如,你可以通过指定端点、证书和密钥来制作快照,如下所示: ```shell ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \ @@ -573,7 +570,7 @@ where `trusted-ca-file`, `cert-file` and `key-file` can be obtained from the des Scaling out etcd clusters increases availability by trading off performance. Scaling does not increase cluster performance nor capability. A general rule is not to scale out or in etcd clusters. Do not configure any auto scaling -groups for etcd clusters. It is highly recommended to always run a static +groups for etcd clusters. It is strongly recommended to always run a static five-member etcd cluster for production Kubernetes clusters at any officially supported scale. --> @@ -599,7 +596,7 @@ for information on how to add members into an existing cluster. etcd 支持从 [major.minor](http://semver.org/) 或其他不同 patch 版本的 etcd 进程中获取的快照进行恢复。 @@ -637,7 +634,7 @@ etcdctl --data-dir snapshot restore snapshot.db ``` 如果 `` 与之前的文件夹相同,请先删除此文件夹并停止 etcd 进程,再恢复集群。 否则,需要在恢复后更改 etcd 配置并重新启动 etcd 进程才能使用新的数据目录。 @@ -650,7 +647,7 @@ For more information and examples on restoring a cluster from a snapshot file, s [etcd 灾难恢复文档](https://etcd.io/docs/current/op-guide/recovery/#restoring-a-cluster)。 碎片整理是一种昂贵的操作,因此应尽可能少地执行此操作。 -另一方面,也有必要确保任何 etcd 成员都不会用尽存储配额。 +另一方面,也有必要确保任何 etcd 成员都不会超过存储配额。 Kubernetes 项目建议在执行碎片整理时, 使用诸如 [etcd-defrag](https://github.com/ahrtr/etcd-defrag) 之类的工具。 {{< /note >}} diff --git a/layouts/shortcodes/thirdparty-content.html b/layouts/shortcodes/thirdparty-content.html index a252e93c3d..09b45e72d8 100644 --- a/layouts/shortcodes/thirdparty-content.html +++ b/layouts/shortcodes/thirdparty-content.html @@ -2,7 +2,7 @@ {{- $vendor_message := .Get "vendor" | default "false" -}}