From 6d847b0a311d7e458b2562f1ec53944676e9879d Mon Sep 17 00:00:00 2001 From: xing-yang Date: Sun, 12 Feb 2023 22:25:22 +0000 Subject: [PATCH 01/22] Blog for Volume Group Snapshot --- .../2023-05-08-volume-group-snapshot-alpha.md | 268 ++++++++++++++++++ 1 file changed, 268 insertions(+) create mode 100644 content/en/blog/_posts/2023-05-08-volume-group-snapshot-alpha.md diff --git a/content/en/blog/_posts/2023-05-08-volume-group-snapshot-alpha.md b/content/en/blog/_posts/2023-05-08-volume-group-snapshot-alpha.md new file mode 100644 index 00000000000..14c396449d3 --- /dev/null +++ b/content/en/blog/_posts/2023-05-08-volume-group-snapshot-alpha.md @@ -0,0 +1,268 @@ +--- +layout: blog +title: "Introducing Volume Group Snapshot" +date: 2023-05-08T10:00:00-08:00 +slug: kubernetes-1-27-volume-group-snapshot-alpha +--- + +**Author:** Xing Yang (VMware) + +Volume group snapshot is introduced as an Alpha feature in Kubernetes v1.27. +This feature introduces a Kubernetes API that allows users to take a crash consistent +snapshot for multiple volumes together. It uses a label selector to group multiple +PersistentVolumeClaims for snapshotting. +This new feature is only supported for CSI volume drivers. + +## What is Volume Group Snapshot + +Some storage systems provide the ability to create a crash consistent snapshot of +multiple volumes. A group snapshot represents “copies” from multiple volumes that +are taken at the same point-in-time. A group snapshot can be used either to rehydrate +new volumes (pre-populated with the snapshot data) or to restore existing volumes to +a previous state (represented by the snapshots). + +## Why add Volume Group Snapshots to Kubernetes? + +The Kubernetes volume plugin system already provides a powerful abstraction that +automates the provisioning, attaching, mounting, resizing, and snapshotting of block +and file storage. 

Underpinning all these features is the Kubernetes goal of workload portability:
Kubernetes aims to create an abstraction layer between distributed applications and
underlying clusters so that applications can be agnostic to the specifics of the
cluster they run on, and application deployment requires no “cluster-specific” knowledge.

There is already a [VolumeSnapshot API](https://github.com/kubernetes/enhancements/tree/master/keps/sig-storage/177-volume-snapshot)
that provides the ability to take a snapshot of a persistent volume to protect against
data loss or data corruption. However, there are other snapshotting functionalities
not covered by the VolumeSnapshot API.

Some storage systems support consistent group snapshots that allow a snapshot to be
taken from multiple volumes at the same point-in-time to achieve write-order consistency.
This can be useful for applications that contain multiple volumes. For example,
an application may have data stored in one volume and logs stored in another volume.
If snapshots for the data volume and the logs volume are taken at different times,
the application will not be consistent and will not function properly if it is restored
from those snapshots when a disaster strikes.

It is true that we can quiesce the application first, take an individual snapshot from
each volume that is part of the application one after the other, and then unquiesce the
application after all the individual snapshots are taken. This way we would get
application-consistent snapshots.
However, quiescing an application is time-consuming, and sometimes it is not possible
to quiesce an application at all. Taking individual snapshots one after another may
also take longer than taking a consistent group snapshot. For these reasons, some
users may not want to quiesce an application very frequently.
For example, a user may
want to run weekly backups with application quiesce and nightly backups without
application quiesce but with consistent group support, which provides crash consistency
across all volumes in the group.

## Kubernetes Volume Group Snapshots API

Kubernetes Volume Group Snapshots introduce [three new API objects](https://github.com/kubernetes-csi/external-snapshotter/blob/master/client/apis/volumegroupsnapshot/v1alpha1/types.go) for managing snapshots:

`VolumeGroupSnapshot`
: Created by a Kubernetes user (or perhaps by your own automation) to request
creation of a volume group snapshot for multiple volumes.
It contains information about the volume group snapshot operation, such as the
timestamp when the volume group snapshot was taken and whether it is ready to use.
The creation and deletion of this object represent a desire to create or delete a
cluster resource (a group snapshot).

`VolumeGroupSnapshotContent`
: Created by the snapshot controller for a dynamically created VolumeGroupSnapshot.
It contains information about the volume group snapshot, including the volume group
snapshot ID.
This object represents a provisioned resource on the cluster (a group snapshot).
The VolumeGroupSnapshotContent object binds to the VolumeGroupSnapshot for which it
was created with a one-to-one mapping.

`VolumeGroupSnapshotClass`
: Created by cluster administrators to describe how volume group snapshots should be
created, including the driver information, the deletion policy, etc.

The Volume Group Snapshot objects are defined as CustomResourceDefinitions (CRDs).
These CRDs must be installed in a Kubernetes cluster for a CSI driver to support
volume group snapshots.

## How do I use Kubernetes Volume Group Snapshots?

The volume group snapshot feature is implemented in the
[external-snapshotter](https://github.com/kubernetes-csi/external-snapshotter) repository.
Implementing volume +group snapshots meant adding or changing several components: + +* Kubernetes Volume Group Snapshot CRDs +* Volume group snapshot controller logic is added to the common snapshot controller. +* Volume group snapshot validation webhook logic is added to the common snapshot validation webhook. +* Logic to make CSI calls is added to CSI Snapshotter sidecar controller. + +The volume snapshot controller, CRDs, and validation webhook are deployed once per +cluster, while the sidecar is bundled with each CSI driver. + +Therefore, it makes sense to deploy the volume snapshot controller, CRDs, and validation +webhook as a cluster addon. It is strongly recommended that Kubernetes distributors +bundle and deploy the volume snapshot controller, CRDs, and validation webhook as part +of their Kubernetes cluster management process (independent of any CSI Driver). + +### Creating a new group snapshot with Kubernetes + +Once a VolumeGroupSnapshotClass object is defined and you have volumes you want to +snapshot together, you may create a new group snapshot by creating a VolumeGroupSnapshot +object. + +The source of the group snapshot specifies whether the underlying group snapshot +should be dynamically created or if a pre-existing VolumeGroupSnapshotContent +should be used. One of the following members in the source must be set. + +* Selector - Selector is a label query over persistent volume claims that are to be grouped together for snapshotting. This labelSelector will be used to match the label added to a PVC. +* VolumeGroupSnapshotContentName - specifies the name of a pre-existing VolumeGroupSnapshotContent object representing an existing volume group snapshot. + +For dynamic provisioning, a selector must be set so that the snapshot controller can +find PVCs with the matching labels to be snapshotted together. 
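
For illustration, a PVC that should be included in the group carries the label that
the selector will match. The PVC name, size, and access mode below are hypothetical:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  # Hypothetical PVC; any PVC in this namespace labeled
  # group: myGroup would be selected for the group snapshot.
  name: data-pvc
  namespace: demo-namespace
  labels:
    group: myGroup
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
```

The VolumeGroupSnapshot below selects every PVC in its namespace that carries this label.
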

```yaml
apiVersion: groupsnapshot.storage.k8s.io/v1alpha1
kind: VolumeGroupSnapshot
metadata:
  name: new-group-snapshot-demo
  namespace: demo-namespace
spec:
  volumeGroupSnapshotClassName: csi-groupSnapclass
  source:
    selector:
      group: myGroup
```

In the VolumeGroupSnapshot spec, a user can specify the VolumeGroupSnapshotClass,
which identifies the CSI driver that should be used for creating the group snapshot.

### Importing an existing group snapshot with Kubernetes

You can always import an existing group snapshot to Kubernetes by manually creating
a VolumeGroupSnapshotContent object to represent the existing group snapshot.
Because VolumeGroupSnapshotContent is a non-namespaced API object, only a system
administrator may have permission to create it. Once a VolumeGroupSnapshotContent
object is created, the user can create a VolumeGroupSnapshot object pointing to the
VolumeGroupSnapshotContent object.

```yaml
apiVersion: groupsnapshot.storage.k8s.io/v1alpha1
kind: VolumeGroupSnapshotContent
metadata:
  name: pre-existing-group-snap-content1
spec:
  driver: com.example.csi-driver
  deletionPolicy: Delete
  source:
    volumeGroupSnapshotHandle: group-snap-id
  volumeGroupSnapshotRef:
    kind: VolumeGroupSnapshot
    name: pre-existing-group-snap1
    namespace: demo-namespace
```

A VolumeGroupSnapshot object should be created to allow a user to use the group snapshot:

```yaml
apiVersion: groupsnapshot.storage.k8s.io/v1alpha1
kind: VolumeGroupSnapshot
metadata:
  name: pre-existing-group-snap1
  namespace: demo-namespace
spec:
  source:
    volumeGroupSnapshotContentName: pre-existing-group-snap-content1
```

Once these objects are created, the snapshot controller will bind them together,
and set the field `status.ready` to `"True"` to indicate the group snapshot is ready
to use.
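
For example, a bound VolumeGroupSnapshot would report readiness in its status. The
stanza below is a sketch based on the `status.ready` field described above; the spec
is omitted for brevity:

```yaml
apiVersion: groupsnapshot.storage.k8s.io/v1alpha1
kind: VolumeGroupSnapshot
metadata:
  name: pre-existing-group-snap1
  namespace: demo-namespace
# spec omitted for brevity
status:
  # Set by the snapshot controller once binding completes
  ready: "True"
```
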

### How to use group snapshot for restore in Kubernetes

At restore time, the user can request a new PersistentVolumeClaim to be created from
a VolumeSnapshot object that is part of a VolumeGroupSnapshot. This will trigger
provisioning of a new volume that is pre-populated with data from the specified
snapshot. The user should repeat this until all volumes are created from all the
snapshots that are part of the group snapshot.

## As a storage vendor, how do I add support for group snapshots to my CSI driver?

To implement the volume group snapshot feature, a CSI driver MUST:

* Implement a new group controller service.
* Implement group controller RPCs: `CreateVolumeGroupSnapshot`, `DeleteVolumeGroupSnapshot`, and `GetVolumeGroupSnapshot`.
* Add group controller capability `CREATE_DELETE_GET_VOLUME_GROUP_SNAPSHOT`.

See the [CSI spec](https://github.com/container-storage-interface/spec/blob/master/spec.md)
and the [Kubernetes-CSI Driver Developer Guide](https://kubernetes-csi.github.io/docs/)
for more details.

Although Kubernetes is as minimally prescriptive as possible about the packaging and
deployment of a CSI volume driver, it provides a suggested mechanism for deploying a
containerized CSI driver to simplify the process.

As part of this recommended deployment process, the Kubernetes team provides a number of
sidecar (helper) containers, including the
[external-snapshotter sidecar container](https://kubernetes-csi.github.io/docs/external-snapshotter.html),
which has been updated to support volume group snapshots.

The external-snapshotter watches the Kubernetes API server for
`VolumeGroupSnapshotContent` objects and triggers `CreateVolumeGroupSnapshot` and
`DeleteVolumeGroupSnapshot` operations against a CSI endpoint.

## What are the limitations?
+ +The alpha implementation of volume group snapshots for Kubernetes has the following +limitations: + +* Does not support reverting an existing PVC to an earlier state represented by a snapshot that is part of a group snapshot (only supports provisioning a new volume from a snapshot). +* No application consistency guarantees beyond any guarantees provided by the storage system (e.g. crash consistency). + +## What’s next? + +Depending on feedback and adoption, the Kubernetes team plans to push the CSI +Group Snapshot implementation to Beta in either 1.28 or 1.29. +Some of the features we are interested in supporting include volume replication, +replication group, volume placement, application quiescing, changed block tracking, and more. + +## How can I learn more? + +The design spec for the volume group snapshot feature is [here](https://github.com/kubernetes/enhancements/tree/master/keps/sig-storage/3476-volume-group-snapshot). + +The code repository for volume group snapshot APIs and controller is [here](https://github.com/kubernetes-csi/external-snapshotter). + +Check out additional documentation on the group snapshot feature [here](https://kubernetes-csi.github.io/docs/). + +## How do I get involved? + +This project, like all of Kubernetes, is the result of hard work by many contributors +from diverse backgrounds working together. 
On behalf of SIG Storage, I would like to +offer a huge thank you to the contributors who stepped up these last few quarters +to help the project reach alpha: + +* Alex Meade ([ameade](https://github.com/ameade)) +* Ben Swartzlander ([bswartz](https://github.com/bswartz)) +* Humble Devassy Chirammal ([humblec](https://github.com/humblec)) +* James Defelice ([jdef](https://github.com/jdef)) +* Jan Šafránek ([jsafrane](https://github.com/jsafrane)) +* Jing Xu ([jingxu97](https://github.com/jingxu97)) +* Michelle Au ([msau42](https://github.com/msau42)) +* Niels de Vos ([nixpanic](https://github.com/nixpanic)) +* Rakshith R ([Rakshith-R](https://github.com/Rakshith-R)) +* Raunak Shah ([RaunakShah](https://github.com/RaunakShah)) +* Saad Ali ([saad-ali](https://github.com/saad-ali)) +* Thomas Watson ([rbo54](https://github.com/rbo54)) +* Xing Yang ([xing-yang](https://github.com/xing-yang)) +* Yati Padia ([yati1998](https://github.com/yati1998)) + +We also want to thank everyone else who has contributed to the project, including others +who helped review the [KEP](https://github.com/kubernetes/enhancements/pull/1551) +and the [CSI spec PR](https://github.com/container-storage-interface/spec/pull/519). + +For those interested in getting involved with the design and development of CSI or +any part of the Kubernetes Storage system, join the +[Kubernetes Storage Special Interest Group](https://github.com/kubernetes/community/tree/master/sig-storage) (SIG). +We always welcome new contributors. + +We also hold regular [Data Protection Working Group meetings](https://docs.google.com/document/d/15tLCV3csvjHbKb16DVk-mfUmFry_Rlwo-2uG6KNGsfw/edit#). +New attendees are welcome to join our discussions. 
From 8555b53308c9e88b9b1dab7aa7619459809ff241 Mon Sep 17 00:00:00 2001 From: Peter Schuurman Date: Tue, 29 Nov 2022 17:58:51 -0800 Subject: [PATCH 02/22] Add blog post for StatefulSet Migration using StatefulSetStartOrdinal --- .../2022-12-16-statefulset-migration.md | 180 ++++++++++++++++++ 1 file changed, 180 insertions(+) create mode 100644 content/en/blog/_posts/2022-12-16-statefulset-migration.md diff --git a/content/en/blog/_posts/2022-12-16-statefulset-migration.md b/content/en/blog/_posts/2022-12-16-statefulset-migration.md new file mode 100644 index 00000000000..c191306b3e7 --- /dev/null +++ b/content/en/blog/_posts/2022-12-16-statefulset-migration.md @@ -0,0 +1,180 @@ +--- +layout: blog +title: "Kubernetes 1.26: StatefulSet Migration" +date: 2022-12-16 +slug: statefulset-migration +--- + +**Author**: Peter Schuurman (Google) + +Kubernetes v1.26 introduces a new, alpha-level feature for +[StatefulSets](/docs/concepts/workloads/controllers/statefulset/) that controls +the ordinal numbering of Pod replicas. Ordinals can start from arbitrary +non-negative numbers. This blog post will discuss how this feature can be +used. + +## Background + +StatefulSets ordinals provide sequential identities for pod replicas. When using +[OrderedReady Pod Management](/docs/tutorials/stateful-application/basic-stateful-set/#orderedready-pod-management), +Pods are created from ordinal index `0` up to `N-1`. + +With Kubernetes today, orchestrating a StatefulSet migration across clusters is +challenging. Backup and restore solutions exist, but these require the +application to be scaled down to zero replicas prior to migration. In today's +fully connected world, planned downtime and unavailability may not allow you to +meet your business goals. 
You could use +[Cascading Delete](/docs/tutorials/stateful-application/basic-stateful-set/#cascading-delete) +or +[On Delete](/docs/tutorials/stateful-application/basic-stateful-set/#on-delete) +to migrate individual pods, however this is error prone and tedious to manage. +You lose the self-healing benefit of the StatefulSet controller when your Pods +fail or are evicted. + +This feature enables a StatefulSet to be responsible for a range of ordinals +within a logical range of `[0, N)`. With it, you can scale down a range +(`[0, k)`) in a source cluster, and scale up the complementary range (`[k, N)`) +in a destination cluster, while maintaining application availability. This +enables you to retain *at most one* semantics and +[Rolling Update](/docs/tutorials/stateful-application/basic-stateful-set/#rolling-update) +behavior when orchestrating a migration across clusters. + +### Why would I want to use this feature? + +Say you're running your StatefulSet in one cluster, and need to migrate it out +to a different cluster. There are many reasons why you would need to do this: + * **Scalability**: Your StatefulSet has scaled too large for your cluster, and + has started to disrupt the quality of service for other workloads in your + cluster. + * **Isolation**: You're running a StatefulSet in a cluster that is accessed + by multiple users, and namespace isolation isn't sufficient. + * **Cluster Configuration**: You want to move your StatefulSet to a different + cluster to use some environment that is not available on your current + cluster. + * **Control Plane Upgrades**: You want to move your StatefulSet to a cluster + running an upgraded control plane, and can't handle the risk or downtime of + in-place control plane upgrades. + +### How do I use it? + +Enable the `StatefulSetStartOrdinal` feature gate on a cluster, and create a +StatefulSet with a customized `.spec.ordinals.start`. 
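
For instance, the following StatefulSet (the name, labels, and image here are
illustrative) numbers its Pods starting at ordinal `5`, creating `web-5`, `web-6`,
and `web-7`:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: web
  replicas: 3
  # With the StatefulSetStartOrdinal feature gate enabled,
  # replicas are numbered 5, 6, 7 instead of 0, 1, 2.
  ordinals:
    start: 5
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: registry.k8s.io/nginx-slim:0.8
```

With `replicas: 3` and `ordinals.start: 5`, the controller manages ordinals in the
range `[5, 8)`.
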
+ +### Try it for yourself + +In this demo, you'll use the `StatefulSetStartOrdinal` feature to migrate a +StatefulSet from one cluster to another. For this demo, the +[redis-cluster](https://github.com/bitnami/charts/tree/main/bitnami/redis-cluster) +Bitnami Helm chart is used to install Redis. + +Tools Required: + * [yq](https://github.com/mikefarah/yq) + * [helm](https://helm.sh/docs/helm/helm_install/) + +Pre-requisites: Two clusters named `source` and `destination`. + * `StatefulSetStartOrdinal` feature gate is enabled on both clusters + * [MultiClusterServices](https://github.com/kubernetes/enhancements/tree/master/keps/sig-multicluster/1645-multi-cluster-services-api) +support is enabled + * The same default `StorageClass` is installed on both clusters. This + `StorageClass` should provision underlying storage that is accessible from + both clusters + +1. Create a demo namespace on both clusters. + +``` +kubectl create ns kep-3335 +``` + +2. Deploy a `ServiceExport` on both clusters. + +``` +kind: ServiceExport +apiVersion: multicluster.x-k8s.io/v1alpha1 +metadata: + namespace: kep-3335 + name: redis-redis-cluster-headless +``` + +3. Deploy a Redis cluster on `source`. + +``` +helm repo add bitnami https://charts.bitnami.com/bitnami +helm install redis --namespace kep-3335 \ + bitnami/redis-cluster \ + --set persistence.size=1Gi +``` + +4. On `source`, check the replication status. 
+ +``` +kubectl exec -it redis-redis-cluster-0 -- /bin/bash -c \ + "redis-cli -c -h redis-redis-cluster -a $(kubectl get secret redis-redis-cluster -o jsonpath="{.data.redis-password}" | base64 -d) CLUSTER NODES;" +``` + +``` +2ce30362c188aabc06f3eee5d92892d95b1da5c3 10.104.0.14:6379@16379 myself,master - 0 1669764411000 3 connected 10923-16383 +7743661f60b6b17b5c71d083260419588b4f2451 10.104.0.16:6379@16379 slave 2ce30362c188aabc06f3eee5d92892d95b1da5c3 0 1669764410000 3 connected +961f35e37c4eea507cfe12f96e3bfd694b9c21d4 10.104.0.18:6379@16379 slave a8765caed08f3e185cef22bd09edf409dc2bcc61 0 1669764411000 1 connected +7136e37d8864db983f334b85d2b094be47c830e5 10.104.0.15:6379@16379 slave 2cff613d763b22c180cd40668da8e452edef3fc8 0 1669764412595 2 connected +a8765caed08f3e185cef22bd09edf409dc2bcc61 10.104.0.19:6379@16379 master - 0 1669764411592 1 connected 0-5460 +2cff613d763b22c180cd40668da8e452edef3fc8 10.104.0.17:6379@16379 master - 0 1669764410000 2 connected 5461-10922 +``` + +5. On `destination`, deploy Redis with zero replicas. + +``` +helm install redis --namespace kep-3335 \ + bitnami/redis-cluster \ + --set persistence.size=1Gi \ + --set cluster.nodes=0 \ + --set redis.extraEnvVars\[0\].name=REDIS_NODES,redis.extraEnvVars\[0\].value="redis-redis-cluster-headless.kep-3335.svc.cluster.local" \ + --set existingSecret=redis-redis-cluster +``` + +6. Scale down replica `redis-redis-cluster-5` in the source cluster. + +``` +kubectl patch sts redis-redis-cluster -p '{"spec": {"replicas": 5}}' +``` + +7. Migrate dependencies from `source` to `destination`. 
+ +Source Cluster + +``` +kubectl get pvc redis-data-redis-redis-cluster-5 -o yaml | yq 'del(.metadata.uid, .metadata.resourceVersion, .metadata.annotations, .metadata.finalizers, .status)' > /tmp/pvc-redis-data-redis-redis-cluster-5.yaml +kubectl get pv $(yq '.spec.volumeName' /tmp/pvc-redis-data-redis-redis-cluster-5.yaml) -o yaml | yq 'del(.metadata.uid, .metadata.resourceVersion, .metadata.annotations, .metadata.finalizers, .spec.claimRef, .status)' > /tmp/pv-redis-data-redis-redis-cluster-5.yaml +kubectl get secret redis-redis-cluster -o yaml | yq 'del(.metadata.uid, .metadata.resourceVersion)' > /tmp/secret-redis-redis-cluster.yaml +``` + +Destination Cluster + +``` +kubectl create -f /tmp/pv-redis-data-redis-redis-cluster-5.yaml +kubectl create -f /tmp/pvc-redis-data-redis-redis-cluster-5.yaml +kubectl create -f /tmp/secret-redis-redis-cluster.yaml +``` + +8. Scale up replica `redis-redis-cluster-5` in the destination cluster. + +``` +kubectl patch sts redis-redis-cluster -p '{"spec": {"ordinals": {"start": 5}, "replicas": 1}}' +``` + +9. On the source cluster, check the replication status. + +``` +kubectl exec -it redis-redis-cluster-0 -- /bin/bash -c \ + "redis-cli -c -h redis-redis-cluster -a $(kubectl get secret redis-redis-cluster -o jsonpath="{.data.redis-password}" | base64 -d) CLUSTER NODES;" +``` + +You should see that the new replica's address has joined the Redis cluster. 
+ +``` +2cff613d763b22c180cd40668da8e452edef3fc8 10.104.0.17:6379@16379 myself,master - 0 1669766684000 2 connected 5461-10922 +7136e37d8864db983f334b85d2b094be47c830e5 10.108.0.22:6379@16379 slave 2cff613d763b22c180cd40668da8e452edef3fc8 0 1669766685609 2 connected +2ce30362c188aabc06f3eee5d92892d95b1da5c3 10.104.0.14:6379@16379 master - 0 1669766684000 3 connected 10923-16383 +961f35e37c4eea507cfe12f96e3bfd694b9c21d4 10.104.0.18:6379@16379 slave a8765caed08f3e185cef22bd09edf409dc2bcc61 0 1669766683600 1 connected +a8765caed08f3e185cef22bd09edf409dc2bcc61 10.104.0.19:6379@16379 master - 0 1669766685000 1 connected 0-5460 +7743661f60b6b17b5c71d083260419588b4f2451 10.104.0.16:6379@16379 slave 2ce30362c188aabc06f3eee5d92892d95b1da5c3 0 1669766686613 3 connected +``` From bd45ab5474c7a9170742b0cc2fc9573d0c3b34c8 Mon Sep 17 00:00:00 2001 From: Peter Schuurman Date: Tue, 29 Nov 2022 18:33:24 -0800 Subject: [PATCH 03/22] Update blog post headings and add What's Next section --- .../2022-12-16-statefulset-migration.md | 32 +++++++++++++++++-- 1 file changed, 29 insertions(+), 3 deletions(-) diff --git a/content/en/blog/_posts/2022-12-16-statefulset-migration.md b/content/en/blog/_posts/2022-12-16-statefulset-migration.md index c191306b3e7..be6f591adc1 100644 --- a/content/en/blog/_posts/2022-12-16-statefulset-migration.md +++ b/content/en/blog/_posts/2022-12-16-statefulset-migration.md @@ -39,7 +39,7 @@ enables you to retain *at most one* semantics and [Rolling Update](/docs/tutorials/stateful-application/basic-stateful-set/#rolling-update) behavior when orchestrating a migration across clusters. -### Why would I want to use this feature? +## Why would I want to use this feature? Say you're running your StatefulSet in one cluster, and need to migrate it out to a different cluster. There are many reasons why you would need to do this: @@ -55,12 +55,12 @@ to a different cluster. 
There are many reasons why you would need to do this:
   running an upgraded control plane, and can't handle the risk or downtime of
   in-place control plane upgrades.
 
-### How do I use it?
+## How do I use it?
 
 Enable the `StatefulSetStartOrdinal` feature gate on a cluster, and create a
 StatefulSet with a customized `.spec.ordinals.start`.
 
-### Try it for yourself
+## Try it for yourself
 
 In this demo, you'll use the `StatefulSetStartOrdinal` feature to migrate a
 StatefulSet from one cluster to another. For this demo, the
@@ -139,6 +139,17 @@ kubectl patch sts redis-redis-cluster -p '{"spec": {"replicas": 5}}'
 
 7. Migrate dependencies from `source` to `destination`.
 
+The following commands copy resources from `source` to `destination`. Details
+that are not relevant in the `destination` cluster are removed (e.g. `uid`,
+`resourceVersion`, `status`).
+
+Note: For the PV/PVC, this procedure only works if the underlying storage system
+      that your `StorageClass` uses can support copying existing PVs into a
+      new cluster. Storage that is associated with a specific node or topology
+      may not be supported. Additionally, some storage systems may store
+      additional metadata about volumes outside of a PV object, and may require
+      a more specialized sequence to import a volume.
+
 Source Cluster
 
 ```
@@ -178,3 +189,18 @@ You should see that the new replica's address has joined the Redis cluster.
 a8765caed08f3e185cef22bd09edf409dc2bcc61 10.104.0.19:6379@16379 master - 0 1669766685000 1 connected 0-5460
 7743661f60b6b17b5c71d083260419588b4f2451 10.104.0.16:6379@16379 slave 2ce30362c188aabc06f3eee5d92892d95b1da5c3 0 1669766686613 3 connected
 ```
+
+## What's Next?
+
+This feature provides a building block for a StatefulSet to be split up across
+clusters, but does not prescribe the mechanism for how the StatefulSet should
+be migrated. Migration requires coordination of StatefulSet replicas, along with
+orchestration of the storage and network layer.
This is dependent on the storage
and connectivity requirements of the application installed by the StatefulSet.
Additionally, many StatefulSets are controlled by Operators, which adds another
layer of complexity to migration.

If you're interested in building blocks to make these processes easier, get
involved with
[SIG Multicluster](https://github.com/kubernetes/community/blob/master/sig-multicluster)
to contribute!

From 0043f1967f69c0dc5afec5fa9981907a8394a922 Mon Sep 17 00:00:00 2001
From: Peter Schuurman
Date: Thu, 1 Dec 2022 08:52:40 -0800
Subject: [PATCH 04/22] Add a note about copying PV/PVC from source to
 destination cluster

---
 .../2022-12-16-statefulset-migration.md       | 20 ++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/content/en/blog/_posts/2022-12-16-statefulset-migration.md b/content/en/blog/_posts/2022-12-16-statefulset-migration.md
index be6f591adc1..a279b5f8cc8 100644
--- a/content/en/blog/_posts/2022-12-16-statefulset-migration.md
+++ b/content/en/blog/_posts/2022-12-16-statefulset-migration.md
@@ -143,15 +143,14 @@ The following commands copy resources from `source` to `destination`. Details
 that are not relevant in the `destination` cluster are removed (e.g. `uid`,
 `resourceVersion`, `status`).
 
-Note: For the PV/PVC, this procedure only works if the underlying storage system
-      that your `StorageClass` uses can support copying existing PVs into a
-      new cluster. Storage that is associated with a specific node or topology
-      may not be supported. Additionally, some storage systems may store
-      additional metadata about volumes outside of a PV object, and may require
-      a more specialized sequence to import a volume.
-
 Source Cluster
 
+Note: If using a `StorageClass` with `reclaimPolicy: Delete` configured, you
+      should patch the PVs in `source` with `reclaimPolicy: Retain` prior to
+      deletion to retain the underlying storage used in `destination`.
See
+      [Change the Reclaim Policy of a PersistentVolume](/docs/tasks/administer-cluster/change-pv-reclaim-policy/)
+      for more details.
+
 ```
kubectl get pvc redis-data-redis-redis-cluster-5 -o yaml | yq 'del(.metadata.uid, .metadata.resourceVersion, .metadata.annotations, .metadata.finalizers, .status)' > /tmp/pvc-redis-data-redis-redis-cluster-5.yaml
kubectl get pv $(yq '.spec.volumeName' /tmp/pvc-redis-data-redis-redis-cluster-5.yaml) -o yaml | yq 'del(.metadata.uid, .metadata.resourceVersion, .metadata.annotations, .metadata.finalizers, .spec.claimRef, .status)' > /tmp/pv-redis-data-redis-redis-cluster-5.yaml
kubectl get secret redis-redis-cluster -o yaml | yq 'del(.metadata.uid, .metadata.resourceVersion)' > /tmp/secret-redis-redis-cluster.yaml
```

Destination Cluster

+Note: For the PV/PVC, this procedure only works if the underlying storage system
+      that your PVs use can support being copied into `destination`. Storage
+      that is associated with a specific node or topology may not be supported.
+      Additionally, some storage systems may store additional metadata about
+      volumes outside of a PV object, and may require a more specialized
+      sequence to import a volume.
+ ``` kubectl create -f /tmp/pv-redis-data-redis-redis-cluster-5.yaml kubectl create -f /tmp/pvc-redis-data-redis-redis-cluster-5.yaml From bd610ae7d3a9786a1b7d9b3af4070eb1ff4cae0b Mon Sep 17 00:00:00 2001 From: Peter Schuurman Date: Sat, 17 Dec 2022 04:21:04 -0800 Subject: [PATCH 05/22] Review updates for StatefulSet StartOrdinal blog post --- .../2022-12-16-statefulset-migration.md | 182 +++++++++--------- 1 file changed, 92 insertions(+), 90 deletions(-) diff --git a/content/en/blog/_posts/2022-12-16-statefulset-migration.md b/content/en/blog/_posts/2022-12-16-statefulset-migration.md index a279b5f8cc8..d7b80aef071 100644 --- a/content/en/blog/_posts/2022-12-16-statefulset-migration.md +++ b/content/en/blog/_posts/2022-12-16-statefulset-migration.md @@ -1,7 +1,7 @@ --- layout: blog -title: "Kubernetes 1.26: StatefulSet Migration" -date: 2022-12-16 +title: "Kubernetes 1.26: StatefulSet Start Ordinal Simplifies Migration" +date: 2023-01-03 slug: statefulset-migration --- @@ -16,13 +16,13 @@ used. ## Background StatefulSets ordinals provide sequential identities for pod replicas. When using -[OrderedReady Pod Management](/docs/tutorials/stateful-application/basic-stateful-set/#orderedready-pod-management), +[`OrderedReady` Pod management](/docs/tutorials/stateful-application/basic-stateful-set/#orderedready-pod-management), Pods are created from ordinal index `0` up to `N-1`. With Kubernetes today, orchestrating a StatefulSet migration across clusters is challenging. Backup and restore solutions exist, but these require the application to be scaled down to zero replicas prior to migration. In today's -fully connected world, planned downtime and unavailability may not allow you to +fully connected world, even planned application downtime may not allow you to meet your business goals. 
You could use [Cascading Delete](/docs/tutorials/stateful-application/basic-stateful-set/#cascading-delete) or @@ -31,8 +31,9 @@ to migrate individual pods, however this is error prone and tedious to manage. You lose the self-healing benefit of the StatefulSet controller when your Pods fail or are evicted. -This feature enables a StatefulSet to be responsible for a range of ordinals -within a logical range of `[0, N)`. With it, you can scale down a range +Kubernetes v1.26 enables a StatefulSet to be responsible for a range of ordinals +within a half-open interval `[0, N)` (the ordinals 0, 1, ... N-1). +With it, you can scale down a range (`[0, k)`) in a source cluster, and scale up the complementary range (`[k, N)`) in a destination cluster, while maintaining application availability. This enables you to retain *at most one* semantics and @@ -63,7 +64,7 @@ StatefulSet with a customized `.spec.ordinals.start`. ## Try it for yourself In this demo, you'll use the `StatefulSetStartOrdinal` feature to migrate a -StatefulSet from one cluster to another. For this demo, the +StatefulSet from one Kubernetes cluster to another. For this demo, the [redis-cluster](https://github.com/bitnami/charts/tree/main/bitnami/redis-cluster) Bitnami Helm chart is used to install Redis. @@ -77,124 +78,125 @@ Pre-requisites: Two clusters named `source` and `destination`. support is enabled * The same default `StorageClass` is installed on both clusters. This `StorageClass` should provision underlying storage that is accessible from - both clusters + both clusters. 1. Create a demo namespace on both clusters. -``` -kubectl create ns kep-3335 -``` + ``` + kubectl create ns kep-3335 + ``` 2. Deploy a `ServiceExport` on both clusters. 
-``` -kind: ServiceExport -apiVersion: multicluster.x-k8s.io/v1alpha1 -metadata: - namespace: kep-3335 - name: redis-redis-cluster-headless -``` + ``` + kind: ServiceExport + apiVersion: multicluster.x-k8s.io/v1alpha1 + metadata: + namespace: kep-3335 + name: redis-redis-cluster-headless + ``` 3. Deploy a Redis cluster on `source`. -``` -helm repo add bitnami https://charts.bitnami.com/bitnami -helm install redis --namespace kep-3335 \ - bitnami/redis-cluster \ - --set persistence.size=1Gi -``` + ``` + helm repo add bitnami https://charts.bitnami.com/bitnami + helm install redis --namespace kep-3335 \ + bitnami/redis-cluster \ + --set persistence.size=1Gi + ``` 4. On `source`, check the replication status. -``` -kubectl exec -it redis-redis-cluster-0 -- /bin/bash -c \ - "redis-cli -c -h redis-redis-cluster -a $(kubectl get secret redis-redis-cluster -o jsonpath="{.data.redis-password}" | base64 -d) CLUSTER NODES;" -``` + ``` + kubectl exec -it redis-redis-cluster-0 -- /bin/bash -c \ + "redis-cli -c -h redis-redis-cluster -a $(kubectl get secret redis-redis-cluster -o jsonpath="{.data.redis-password}" | base64 -d) CLUSTER NODES;" + ``` -``` -2ce30362c188aabc06f3eee5d92892d95b1da5c3 10.104.0.14:6379@16379 myself,master - 0 1669764411000 3 connected 10923-16383 -7743661f60b6b17b5c71d083260419588b4f2451 10.104.0.16:6379@16379 slave 2ce30362c188aabc06f3eee5d92892d95b1da5c3 0 1669764410000 3 connected -961f35e37c4eea507cfe12f96e3bfd694b9c21d4 10.104.0.18:6379@16379 slave a8765caed08f3e185cef22bd09edf409dc2bcc61 0 1669764411000 1 connected -7136e37d8864db983f334b85d2b094be47c830e5 10.104.0.15:6379@16379 slave 2cff613d763b22c180cd40668da8e452edef3fc8 0 1669764412595 2 connected -a8765caed08f3e185cef22bd09edf409dc2bcc61 10.104.0.19:6379@16379 master - 0 1669764411592 1 connected 0-5460 -2cff613d763b22c180cd40668da8e452edef3fc8 10.104.0.17:6379@16379 master - 0 1669764410000 2 connected 5461-10922 -``` + ``` + 2ce30362c188aabc06f3eee5d92892d95b1da5c3 10.104.0.14:6379@16379 
myself,master - 0 1669764411000 3 connected 10923-16383
+ 7743661f60b6b17b5c71d083260419588b4f2451 10.104.0.16:6379@16379 slave 2ce30362c188aabc06f3eee5d92892d95b1da5c3 0 1669764410000 3 connected
+ 961f35e37c4eea507cfe12f96e3bfd694b9c21d4 10.104.0.18:6379@16379 slave a8765caed08f3e185cef22bd09edf409dc2bcc61 0 1669764411000 1 connected
+ 7136e37d8864db983f334b85d2b094be47c830e5 10.104.0.15:6379@16379 slave 2cff613d763b22c180cd40668da8e452edef3fc8 0 1669764412595 2 connected
+ a8765caed08f3e185cef22bd09edf409dc2bcc61 10.104.0.19:6379@16379 master - 0 1669764411592 1 connected 0-5460
+ 2cff613d763b22c180cd40668da8e452edef3fc8 10.104.0.17:6379@16379 master - 0 1669764410000 2 connected 5461-10922
+ ```

5. On `destination`, deploy Redis with zero replicas.

-```
-helm install redis --namespace kep-3335 \
- bitnami/redis-cluster \
- --set persistence.size=1Gi \
- --set cluster.nodes=0 \
- --set redis.extraEnvVars\[0\].name=REDIS_NODES,redis.extraEnvVars\[0\].value="redis-redis-cluster-headless.kep-3335.svc.cluster.local" \
- --set existingSecret=redis-redis-cluster
-```
+ ```
+ helm install redis --namespace kep-3335 \
+ bitnami/redis-cluster \
+ --set persistence.size=1Gi \
+ --set cluster.nodes=0 \
+ --set redis.extraEnvVars\[0\].name=REDIS_NODES,redis.extraEnvVars\[0\].value="redis-redis-cluster-headless.kep-3335.svc.cluster.local" \
+ --set existingSecret=redis-redis-cluster
+ ```

6. Scale down replica `redis-redis-cluster-5` in the source cluster.

-```
-kubectl patch sts redis-redis-cluster -p '{"spec": {"replicas": 5}}'
-```
+ ```
+ kubectl patch sts redis-redis-cluster -p '{"spec": {"replicas": 5}}'
+ ```

7. Migrate dependencies from `source` to `destination`.

-The following commands copy resources from `source` to `destionation`. Details
-that are not relevant in `destination` cluster are removed (eg: `uid`,
-`resourceVersion`, `status`).
+ The following commands copy resources from `source` to `destination`.
Details + that are not relevant in `destination` cluster are removed (eg: `uid`, + `resourceVersion`, `status`). -Source Cluster + #### Source Cluster -Note: If using a `StorageClass` with `reclaimPolicy: Delete` configured, you - should patch the PVs in `source` with `reclaimPolicy: Retain` prior to - deletion to retain the underlying storage used in `destination`. See - [Change the Reclaim Policy of a PersistentVolume](/docs/tasks/administer-cluster/change-pv-reclaim-policy/) - for more details. + Note: If using a `StorageClass` with `reclaimPolicy: Delete` configured, you + should patch the PVs in `source` with `reclaimPolicy: Retain` prior to + deletion to retain the underlying storage used in `destination`. See + [Change the Reclaim Policy of a PersistentVolume](/docs/tasks/administer-cluster/change-pv-reclaim-policy/) + for more details. -``` -kubectl get pvc redis-data-redis-redis-cluster-5 -o yaml | yq 'del(.metadata.uid, .metadata.resourceVersion, .metadata.annotations, .metadata.finalizers, .status)' > /tmp/pvc-redis-data-redis-redis-cluster-5.yaml -kubectl get pv $(yq '.spec.volumeName' /tmp/pvc-redis-data-redis-redis-cluster-5.yaml) -o yaml | yq 'del(.metadata.uid, .metadata.resourceVersion, .metadata.annotations, .metadata.finalizers, .spec.claimRef, .status)' > /tmp/pv-redis-data-redis-redis-cluster-5.yaml -kubectl get secret redis-redis-cluster -o yaml | yq 'del(.metadata.uid, .metadata.resourceVersion)' > /tmp/secret-redis-redis-cluster.yaml -``` + ``` + kubectl get pvc redis-data-redis-redis-cluster-5 -o yaml | yq 'del(.metadata.uid, .metadata.resourceVersion, .metadata.annotations, .metadata.finalizers, .status)' > /tmp/pvc-redis-data-redis-redis-cluster-5.yaml + kubectl get pv $(yq '.spec.volumeName' /tmp/pvc-redis-data-redis-redis-cluster-5.yaml) -o yaml | yq 'del(.metadata.uid, .metadata.resourceVersion, .metadata.annotations, .metadata.finalizers, .spec.claimRef, .status)' > /tmp/pv-redis-data-redis-redis-cluster-5.yaml + kubectl get secret 
redis-redis-cluster -o yaml | yq 'del(.metadata.uid, .metadata.resourceVersion)' > /tmp/secret-redis-redis-cluster.yaml
+ ```

-Destination Cluster
+ #### Destination Cluster

-Note: For the PV/PVC, this procedure only works if the underlying storage system
- that your PVs use can support being copied into `destination`. Storage
- that is associated with a specific node or topology may not be supported.
- Additionally, some storage systems may store addtional metadata about
- volumes outside of a PV object, and may require a more specialized
- sequence to import a volume.
+ Note: For the PV/PVC, this procedure only works if the underlying storage system
+ that your PVs use can support being copied into `destination`. Storage
+ that is associated with a specific node or topology may not be supported.
+ Additionally, some storage systems may store additional metadata about
+ volumes outside of a PV object, and may require a more specialized
+ sequence to import a volume.

-```
-kubectl create -f /tmp/pv-redis-data-redis-redis-cluster-5.yaml
-kubectl create -f /tmp/pvc-redis-data-redis-redis-cluster-5.yaml
-kubectl create -f /tmp/secret-redis-redis-cluster.yaml
-```
+ ```
+ kubectl create -f /tmp/pv-redis-data-redis-redis-cluster-5.yaml
+ kubectl create -f /tmp/pvc-redis-data-redis-redis-cluster-5.yaml
+ kubectl create -f /tmp/secret-redis-redis-cluster.yaml
+ ```

-8. Scale up replica `redis-redis-cluster-5` in the destination cluster.
+8. Scale up replica `redis-redis-cluster-5` in the destination cluster, with a
+ start ordinal of 5:

-```
-kubectl patch sts redis-redis-cluster -p '{"spec": {"ordinals": {"start": 5}, "replicas": 1}}'
-```
+ ```
+ kubectl patch sts redis-redis-cluster -p '{"spec": {"ordinals": {"start": 5}, "replicas": 1}}'
+ ```

9. On the source cluster, check the replication status.
-``` -kubectl exec -it redis-redis-cluster-0 -- /bin/bash -c \ - "redis-cli -c -h redis-redis-cluster -a $(kubectl get secret redis-redis-cluster -o jsonpath="{.data.redis-password}" | base64 -d) CLUSTER NODES;" -``` + ``` + kubectl exec -it redis-redis-cluster-0 -- /bin/bash -c \ + "redis-cli -c -h redis-redis-cluster -a $(kubectl get secret redis-redis-cluster -o jsonpath="{.data.redis-password}" | base64 -d) CLUSTER NODES;" + ``` -You should see that the new replica's address has joined the Redis cluster. + You should see that the new replica's address has joined the Redis cluster. -``` -2cff613d763b22c180cd40668da8e452edef3fc8 10.104.0.17:6379@16379 myself,master - 0 1669766684000 2 connected 5461-10922 -7136e37d8864db983f334b85d2b094be47c830e5 10.108.0.22:6379@16379 slave 2cff613d763b22c180cd40668da8e452edef3fc8 0 1669766685609 2 connected -2ce30362c188aabc06f3eee5d92892d95b1da5c3 10.104.0.14:6379@16379 master - 0 1669766684000 3 connected 10923-16383 -961f35e37c4eea507cfe12f96e3bfd694b9c21d4 10.104.0.18:6379@16379 slave a8765caed08f3e185cef22bd09edf409dc2bcc61 0 1669766683600 1 connected -a8765caed08f3e185cef22bd09edf409dc2bcc61 10.104.0.19:6379@16379 master - 0 1669766685000 1 connected 0-5460 -7743661f60b6b17b5c71d083260419588b4f2451 10.104.0.16:6379@16379 slave 2ce30362c188aabc06f3eee5d92892d95b1da5c3 0 1669766686613 3 connected -``` + ``` + 2cff613d763b22c180cd40668da8e452edef3fc8 10.104.0.17:6379@16379 myself,master - 0 1669766684000 2 connected 5461-10922 + 7136e37d8864db983f334b85d2b094be47c830e5 10.108.0.22:6379@16379 slave 2cff613d763b22c180cd40668da8e452edef3fc8 0 1669766685609 2 connected + 2ce30362c188aabc06f3eee5d92892d95b1da5c3 10.104.0.14:6379@16379 master - 0 1669766684000 3 connected 10923-16383 + 961f35e37c4eea507cfe12f96e3bfd694b9c21d4 10.104.0.18:6379@16379 slave a8765caed08f3e185cef22bd09edf409dc2bcc61 0 1669766683600 1 connected + a8765caed08f3e185cef22bd09edf409dc2bcc61 10.104.0.19:6379@16379 master - 0 1669766685000 1 connected 0-5460 
+ 7743661f60b6b17b5c71d083260419588b4f2451 10.104.0.16:6379@16379 slave 2ce30362c188aabc06f3eee5d92892d95b1da5c3 0 1669766686613 3 connected + ``` ## What's Next? From 76dae7885750c692255e439afa7d2f631856cd1f Mon Sep 17 00:00:00 2001 From: Peter Schuurman Date: Sat, 17 Dec 2022 04:29:51 -0800 Subject: [PATCH 06/22] Remove MCS references from StatefulSet start ordinal blog post --- .../2022-12-16-statefulset-migration.md | 31 +++++++------------ 1 file changed, 11 insertions(+), 20 deletions(-) diff --git a/content/en/blog/_posts/2022-12-16-statefulset-migration.md b/content/en/blog/_posts/2022-12-16-statefulset-migration.md index d7b80aef071..c9e5226cc80 100644 --- a/content/en/blog/_posts/2022-12-16-statefulset-migration.md +++ b/content/en/blog/_posts/2022-12-16-statefulset-migration.md @@ -72,13 +72,14 @@ Tools Required: * [yq](https://github.com/mikefarah/yq) * [helm](https://helm.sh/docs/helm/helm_install/) -Pre-requisites: Two clusters named `source` and `destination`. +Pre-requisites: Two Kubernetes clusters named `source` and `destination`. * `StatefulSetStartOrdinal` feature gate is enabled on both clusters - * [MultiClusterServices](https://github.com/kubernetes/enhancements/tree/master/keps/sig-multicluster/1645-multi-cluster-services-api) -support is enabled * The same default `StorageClass` is installed on both clusters. This `StorageClass` should provision underlying storage that is accessible from both clusters. + * A flat network topology that allows for pods to be accessible across both + Kubernetes clusters. If creating clusters on a cloud provider, this + configuration may be called private cloud or private network. 1. Create a demo namespace on both clusters. @@ -86,17 +87,7 @@ support is enabled kubectl create ns kep-3335 ``` -2. Deploy a `ServiceExport` on both clusters. - - ``` - kind: ServiceExport - apiVersion: multicluster.x-k8s.io/v1alpha1 - metadata: - namespace: kep-3335 - name: redis-redis-cluster-headless - ``` - -3. 
Deploy a Redis cluster on `source`. +2. Deploy a Redis cluster on `source`. ``` helm repo add bitnami https://charts.bitnami.com/bitnami @@ -105,7 +96,7 @@ support is enabled --set persistence.size=1Gi ``` -4. On `source`, check the replication status. +3. On `source`, check the replication status. ``` kubectl exec -it redis-redis-cluster-0 -- /bin/bash -c \ @@ -121,7 +112,7 @@ support is enabled 2cff613d763b22c180cd40668da8e452edef3fc8 10.104.0.17:6379@16379 master - 0 1669764410000 2 connected 5461-10922 ``` -5. On `destination`, deploy Redis with zero replicas. +4. On `destination`, deploy Redis with zero replicas. ``` helm install redis --namespace kep-3335 \ @@ -132,13 +123,13 @@ support is enabled --set existingSecret=redis-redis-cluster ``` -6. Scale down replica `redis-redis-cluster-5` in the source cluster. +5. Scale down replica `redis-redis-cluster-5` in the source cluster. ``` kubectl patch sts redis-redis-cluster -p '{"spec": {"replicas": 5}}' ``` -7. Migrate dependencies from `source` to `destination`. +6. Migrate dependencies from `source` to `destination`. The following commands copy resources from `source` to `destionation`. Details that are not relevant in `destination` cluster are removed (eg: `uid`, @@ -173,14 +164,14 @@ support is enabled kubectl create -f /tmp/secret-redis-redis-cluster.yaml ``` -8. Scale up replica `redis-redis-cluster-5` in the destination cluster, with a +7. Scale up replica `redis-redis-cluster-5` in the destination cluster, with a start ordinal of 5: ``` kubectl patch sts redis-redis-cluster -p '{"spec": {"ordinals": {"start": 5}, "replicas": 1}}' ``` -9. On the source cluster, check the replication status. +8. On the source cluster, check the replication status. 
``` kubectl exec -it redis-redis-cluster-0 -- /bin/bash -c \ From 8ca5a5d775de6f308ee3ed61a8e759d3915372c1 Mon Sep 17 00:00:00 2001 From: Peter Schuurman Date: Sat, 17 Dec 2022 04:39:16 -0800 Subject: [PATCH 07/22] Minor edits to StatefulSet start ordinal blog post --- content/en/blog/_posts/2022-12-16-statefulset-migration.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/content/en/blog/_posts/2022-12-16-statefulset-migration.md b/content/en/blog/_posts/2022-12-16-statefulset-migration.md index c9e5226cc80..e6d7f561719 100644 --- a/content/en/blog/_posts/2022-12-16-statefulset-migration.md +++ b/content/en/blog/_posts/2022-12-16-statefulset-migration.md @@ -16,7 +16,7 @@ used. ## Background StatefulSets ordinals provide sequential identities for pod replicas. When using -[`OrderedReady` Pod management](/docs/tutorials/stateful-application/basic-stateful-set/#orderedready-pod-management), +[`OrderedReady` Pod management](/docs/tutorials/stateful-application/basic-stateful-set/#orderedready-pod-management) Pods are created from ordinal index `0` up to `N-1`. With Kubernetes today, orchestrating a StatefulSet migration across clusters is @@ -135,7 +135,7 @@ Pre-requisites: Two Kubernetes clusters named `source` and `destination`. that are not relevant in `destination` cluster are removed (eg: `uid`, `resourceVersion`, `status`). - #### Source Cluster + #### Source cluster Note: If using a `StorageClass` with `reclaimPolicy: Delete` configured, you should patch the PVs in `source` with `reclaimPolicy: Retain` prior to @@ -149,7 +149,7 @@ Pre-requisites: Two Kubernetes clusters named `source` and `destination`. 
kubectl get secret redis-redis-cluster -o yaml | yq 'del(.metadata.uid, .metadata.resourceVersion)' > /tmp/secret-redis-redis-cluster.yaml ``` - #### Destination Cluster + #### Destination cluster Note: For the PV/PVC, this procedure only works if the underlying storage system that your PVs use can support being copied into `destination`. Storage From 13f1c8ab0526b65bc5af4691863c5c94d7249424 Mon Sep 17 00:00:00 2001 From: Peter Schuurman Date: Wed, 11 Jan 2023 21:39:11 -0800 Subject: [PATCH 08/22] Update formatting and wording for StatefulSet Migration Redis demo --- .../2022-12-16-statefulset-migration.md | 87 +++++++++++-------- 1 file changed, 52 insertions(+), 35 deletions(-) diff --git a/content/en/blog/_posts/2022-12-16-statefulset-migration.md b/content/en/blog/_posts/2022-12-16-statefulset-migration.md index e6d7f561719..9b1f1297f27 100644 --- a/content/en/blog/_posts/2022-12-16-statefulset-migration.md +++ b/content/en/blog/_posts/2022-12-16-statefulset-migration.md @@ -2,7 +2,7 @@ layout: blog title: "Kubernetes 1.26: StatefulSet Start Ordinal Simplifies Migration" date: 2023-01-03 -slug: statefulset-migration +slug: statefulset-start-ordinal --- **Author**: Peter Schuurman (Google) @@ -32,11 +32,12 @@ You lose the self-healing benefit of the StatefulSet controller when your Pods fail or are evicted. Kubernetes v1.26 enables a StatefulSet to be responsible for a range of ordinals -within a half-open interval `[0, N)` (the ordinals 0, 1, ... N-1). +within a range {0..N-1} (the ordinals 0, 1, ... up to N-1). With it, you can scale down a range -(`[0, k)`) in a source cluster, and scale up the complementary range (`[k, N)`) +{0..k-1} in a source cluster, and scale up the complementary range {k..N-1} in a destination cluster, while maintaining application availability. 
This -enables you to retain *at most one* semantics and +enables you to retain *at most one* semantics (meaning there is at most one Pod +with a given identity running in a StatefulSet) and [Rolling Update](/docs/tutorials/stateful-application/basic-stateful-set/#rolling-update) behavior when orchestrating a migration across clusters. @@ -61,42 +62,50 @@ to a different cluster. There are many reasons why you would need to do this: Enable the `StatefulSetStartOrdinal` feature gate on a cluster, and create a StatefulSet with a customized `.spec.ordinals.start`. -## Try it for yourself +## Try it out -In this demo, you'll use the `StatefulSetStartOrdinal` feature to migrate a -StatefulSet from one Kubernetes cluster to another. For this demo, the +In this demo, I'll use the new mechanism to migrate a +StatefulSet from one Kubernetes cluster to another. The [redis-cluster](https://github.com/bitnami/charts/tree/main/bitnami/redis-cluster) -Bitnami Helm chart is used to install Redis. +Bitnami Helm chart will be used to install Redis. Tools Required: * [yq](https://github.com/mikefarah/yq) * [helm](https://helm.sh/docs/helm/helm_install/) -Pre-requisites: Two Kubernetes clusters named `source` and `destination`. - * `StatefulSetStartOrdinal` feature gate is enabled on both clusters - * The same default `StorageClass` is installed on both clusters. This - `StorageClass` should provision underlying storage that is accessible from - both clusters. - * A flat network topology that allows for pods to be accessible across both - Kubernetes clusters. If creating clusters on a cloud provider, this - configuration may be called private cloud or private network. +### Pre-requisites {#demo-pre-requisites} -1. Create a demo namespace on both clusters. +To do this, I need two Kubernetes clusters that can both access common +networking and storage; I've named my clusters `source` and `destination`. 
+Specifically, I need: + +* The `StatefulSetStartOrdinal` feature gate enabled on both clusters. +* Client configuration for `kubectl` that lets me access both clusters as an + administrator. +* The same `StorageClass` installed on both clusters, and set as the default + StorageClass for both clusters. This `StorageClass` should provision + underlying storage that is accessible from either or both clusters. +* A flat network topology that allows for pods to send and receive packets to + and from Pods in either clusters. If you are creating clusters on a cloud + provider, this configuration may be called private cloud or private network. + +1. Create a demo namespace on both clusters: ``` kubectl create ns kep-3335 ``` -2. Deploy a Redis cluster on `source`. +2. Deploy a Redis cluster with six replicas in the source cluster: ``` helm repo add bitnami https://charts.bitnami.com/bitnami helm install redis --namespace kep-3335 \ bitnami/redis-cluster \ - --set persistence.size=1Gi + --set persistence.size=1Gi \ + --set cluster.nodes=6 ``` -3. On `source`, check the replication status. +3. Check the replication status in the source cluster: ``` kubectl exec -it redis-redis-cluster-0 -- /bin/bash -c \ @@ -112,7 +121,7 @@ Pre-requisites: Two Kubernetes clusters named `source` and `destination`. 2cff613d763b22c180cd40668da8e452edef3fc8 10.104.0.17:6379@16379 master - 0 1669764410000 2 connected 5461-10922 ``` -4. On `destination`, deploy Redis with zero replicas. +4. Deploy a Redis cluster with zero replicas in the destination cluster: ``` helm install redis --namespace kep-3335 \ @@ -123,19 +132,20 @@ Pre-requisites: Two Kubernetes clusters named `source` and `destination`. --set existingSecret=redis-redis-cluster ``` -5. Scale down replica `redis-redis-cluster-5` in the source cluster. +5. 
Scale down the `redis-redis-cluster` StatefulSet in the source cluster by 1, + to remove the replica `redis-redis-cluster-5`: ``` kubectl patch sts redis-redis-cluster -p '{"spec": {"replicas": 5}}' ``` -6. Migrate dependencies from `source` to `destination`. +6. Migrate dependencies from the source cluster to the destination cluster: The following commands copy resources from `source` to `destionation`. Details that are not relevant in `destination` cluster are removed (eg: `uid`, `resourceVersion`, `status`). - #### Source cluster + **Steps for the source cluster** Note: If using a `StorageClass` with `reclaimPolicy: Delete` configured, you should patch the PVs in `source` with `reclaimPolicy: Retain` prior to @@ -149,7 +159,7 @@ Pre-requisites: Two Kubernetes clusters named `source` and `destination`. kubectl get secret redis-redis-cluster -o yaml | yq 'del(.metadata.uid, .metadata.resourceVersion)' > /tmp/secret-redis-redis-cluster.yaml ``` - #### Destination cluster + **Steps for the destination cluster** Note: For the PV/PVC, this procedure only works if the underlying storage system that your PVs use can support being copied into `destination`. Storage @@ -164,31 +174,37 @@ Pre-requisites: Two Kubernetes clusters named `source` and `destination`. kubectl create -f /tmp/secret-redis-redis-cluster.yaml ``` -7. Scale up replica `redis-redis-cluster-5` in the destination cluster, with a - start ordinal of 5: +7. Scale up the `redis-redis-cluster` StatefulSet in the destination cluster by + 1, with a start ordinal of 5: ``` kubectl patch sts redis-redis-cluster -p '{"spec": {"ordinals": {"start": 5}, "replicas": 1}}' ``` -8. On the source cluster, check the replication status. +8. 
Check the replication status in the destination cluster: ``` - kubectl exec -it redis-redis-cluster-0 -- /bin/bash -c \ + kubectl exec -it redis-redis-cluster-5 -- /bin/bash -c \ "redis-cli -c -h redis-redis-cluster -a $(kubectl get secret redis-redis-cluster -o jsonpath="{.data.redis-password}" | base64 -d) CLUSTER NODES;" ``` - You should see that the new replica's address has joined the Redis cluster. + I should see that the new replica (labeled `myself`) has joined the Redis + cluster (the IP address belongs to a different CIDR block than the + replicas in the source cluster). ``` - 2cff613d763b22c180cd40668da8e452edef3fc8 10.104.0.17:6379@16379 myself,master - 0 1669766684000 2 connected 5461-10922 - 7136e37d8864db983f334b85d2b094be47c830e5 10.108.0.22:6379@16379 slave 2cff613d763b22c180cd40668da8e452edef3fc8 0 1669766685609 2 connected + 2cff613d763b22c180cd40668da8e452edef3fc8 10.104.0.17:6379@16379 master - 0 1669766684000 2 connected 5461-10922 + 7136e37d8864db983f334b85d2b094be47c830e5 10.108.0.22:6379@16379 myself,slave 2cff613d763b22c180cd40668da8e452edef3fc8 0 1669766685609 2 connected 2ce30362c188aabc06f3eee5d92892d95b1da5c3 10.104.0.14:6379@16379 master - 0 1669766684000 3 connected 10923-16383 961f35e37c4eea507cfe12f96e3bfd694b9c21d4 10.104.0.18:6379@16379 slave a8765caed08f3e185cef22bd09edf409dc2bcc61 0 1669766683600 1 connected a8765caed08f3e185cef22bd09edf409dc2bcc61 10.104.0.19:6379@16379 master - 0 1669766685000 1 connected 0-5460 7743661f60b6b17b5c71d083260419588b4f2451 10.104.0.16:6379@16379 slave 2ce30362c188aabc06f3eee5d92892d95b1da5c3 0 1669766686613 3 connected ``` +9. Repeat steps #5 to #7 for the remainder of the replicas, until the + Redis StatefulSet in the source cluster is scaled to 0, and the Redis + StatefulSet in the destination cluster is healthy with 6 total replicas. + ## What's Next? 
This feature provides a building block for a StatefulSet to be split up across @@ -196,10 +212,11 @@ clusters, but does not prescribe the mechanism as to how the StatefulSet should be migrated. Migration requires coordination of StatefulSet replicas, along with orchestration of the storage and network layer. This is dependent on the storage and connectivity requirements of the application installed by the StatefulSet. -Additionally, many StatefulSets are controlled by Operators, which adds another +Additionally, many StatefulSets are managed by +[operators](/docs/concepts/extend-kubernetes/operator/), which adds another layer of complexity to migration. -If you're interested in building blocks to make these processes easier, get -involved with +If you're interested in building enhancements to make these processes easier, +get involved with [SIG Multicluster](https://github.com/kubernetes/community/blob/master/sig-multicluster) to contribute! From 22101af31513194986d6406b52a94568114a01f4 Mon Sep 17 00:00:00 2001 From: Peter Schuurman Date: Mon, 13 Mar 2023 12:15:51 -0700 Subject: [PATCH 09/22] Update StatefulSetStartOrdinal blog post for beta v1.27 --- content/en/blog/_posts/2022-12-16-statefulset-migration.md | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/content/en/blog/_posts/2022-12-16-statefulset-migration.md b/content/en/blog/_posts/2022-12-16-statefulset-migration.md index 9b1f1297f27..83fa30d1cb8 100644 --- a/content/en/blog/_posts/2022-12-16-statefulset-migration.md +++ b/content/en/blog/_posts/2022-12-16-statefulset-migration.md @@ -1,15 +1,16 @@ --- layout: blog title: "Kubernetes 1.26: StatefulSet Start Ordinal Simplifies Migration" -date: 2023-01-03 +date: 2023-04-19 slug: statefulset-start-ordinal --- **Author**: Peter Schuurman (Google) -Kubernetes v1.26 introduces a new, alpha-level feature for +Kubernetes v1.26 introduced a new, alpha-level feature for [StatefulSets](/docs/concepts/workloads/controllers/statefulset/) that 
controls -the ordinal numbering of Pod replicas. Ordinals can start from arbitrary +the ordinal numbering of Pod replicas. As of Kubernetes v1.27, this feature is +now beta. Ordinals can start from arbitrary non-negative numbers. This blog post will discuss how this feature can be used. From 4f223e0d3b3081edcafa7d74d95f19aa8d8bbc0d Mon Sep 17 00:00:00 2001 From: Peter Schuurman Date: Thu, 13 Apr 2023 20:36:44 -0700 Subject: [PATCH 10/22] Add publish date for StatefulSet Migration blog --- ...efulset-migration.md => 2023-04-28-statefulset-migration.md} | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) rename content/en/blog/_posts/{2022-12-16-statefulset-migration.md => 2023-04-28-statefulset-migration.md} (99%) diff --git a/content/en/blog/_posts/2022-12-16-statefulset-migration.md b/content/en/blog/_posts/2023-04-28-statefulset-migration.md similarity index 99% rename from content/en/blog/_posts/2022-12-16-statefulset-migration.md rename to content/en/blog/_posts/2023-04-28-statefulset-migration.md index 83fa30d1cb8..b01e75a0565 100644 --- a/content/en/blog/_posts/2022-12-16-statefulset-migration.md +++ b/content/en/blog/_posts/2023-04-28-statefulset-migration.md @@ -1,7 +1,7 @@ --- layout: blog title: "Kubernetes 1.26: StatefulSet Start Ordinal Simplifies Migration" -date: 2023-04-19 +date: 2023-04-28 slug: statefulset-start-ordinal --- From 4787efefeecb9de2610b3dbcf7439e126ef9b980 Mon Sep 17 00:00:00 2001 From: Ricardo Katz Date: Sun, 23 Apr 2023 17:16:32 -0300 Subject: [PATCH 11/22] rkatz stepping down from pt approvers --- OWNERS_ALIASES | 2 -- 1 file changed, 2 deletions(-) diff --git a/OWNERS_ALIASES b/OWNERS_ALIASES index 05fe413e085..64d68b7737c 100644 --- a/OWNERS_ALIASES +++ b/OWNERS_ALIASES @@ -167,7 +167,6 @@ aliases: - edsoncelio - femrtnz - jcjesus - - rikatz - stormqueen1990 - yagonobre sig-docs-pt-reviews: # PR reviews for Portugese content @@ -176,7 +175,6 @@ aliases: - femrtnz - jcjesus - mrerlison - - rikatz - stormqueen1990 - yagonobre 
sig-docs-vi-owners: # Admins for Vietnamese content From c7f1fdf50c85197a997e99421b40ffbff26baece Mon Sep 17 00:00:00 2001 From: Benjamin Wang Date: Mon, 24 Apr 2023 07:00:50 +0800 Subject: [PATCH 12/22] update the minimum recommended etcd versions to 3.4.22+ and 3.5.6+ 3.3 is end of life. There is also a data inconsistency issue in 3.4.21 and 3.5.5, so 3.4.22+ and 3.5.6+ are the minimum recommended versions. Please read https://groups.google.com/g/etcd-dev/c/8S7u6NqW6C4. Signed-off-by: Benjamin Wang --- .../en/docs/tasks/administer-cluster/configure-upgrade-etcd.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/en/docs/tasks/administer-cluster/configure-upgrade-etcd.md b/content/en/docs/tasks/administer-cluster/configure-upgrade-etcd.md index 542a3a57c29..6df175e93c1 100644 --- a/content/en/docs/tasks/administer-cluster/configure-upgrade-etcd.md +++ b/content/en/docs/tasks/administer-cluster/configure-upgrade-etcd.md @@ -38,7 +38,7 @@ weight: 270 clusters. Therefore, run etcd clusters on dedicated machines or isolated environments for [guaranteed resource requirements](https://etcd.io/docs/current/op-guide/hardware/). -* The minimum recommended version of etcd to run in production is `3.2.10+`. +* The minimum recommended etcd versions to run in production are `3.4.22+` and `3.5.6+`. 
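The updated guidance above names concrete minimum patch releases (`3.4.22+` and `3.5.6+`), so checking a running cluster calls for a version comparison rather than an equality check. A small sketch using `sort -V`; the `have` value here is illustrative — in practice it would be parsed from `etcdctl version` or the etcd `/version` endpoint:

```shell
# Compare a reported etcd version against the 3.5-series recommended minimum.
min="3.5.6"
have="3.5.9"   # illustrative; normally parsed from `etcdctl version`

# sort -V orders version strings numerically; if the minimum sorts first
# (or the two are equal), the running version meets the recommendation.
lowest=$(printf '%s\n%s\n' "$min" "$have" | sort -V | head -n1)
if [ "$lowest" = "$min" ]; then
  echo "etcd $have meets the recommended minimum ($min)"
else
  echo "etcd $have is older than the recommended minimum ($min)"
fi
```

The same comparison works for the 3.4 series by setting `min="3.4.22"`.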
## Resource requirements From c76cbc8ffeba31673a454e1c229df161bd76e6fe Mon Sep 17 00:00:00 2001 From: Zhuzhenghao Date: Sun, 23 Apr 2023 17:44:59 +0800 Subject: [PATCH 13/22] [zh] sync 1.2 kube-apiserver --- .../kube-apiserver.md | 184 +++++++++++------- 1 file changed, 111 insertions(+), 73 deletions(-) diff --git a/content/zh-cn/docs/reference/command-line-tools-reference/kube-apiserver.md b/content/zh-cn/docs/reference/command-line-tools-reference/kube-apiserver.md index e3ac50b7ccf..355c93ce8a7 100644 --- a/content/zh-cn/docs/reference/command-line-tools-reference/kube-apiserver.md +++ b/content/zh-cn/docs/reference/command-line-tools-reference/kube-apiserver.md @@ -782,9 +782,9 @@ CIDRs opened in GCE firewall for L7 LB traffic proxy & health checks -如果启用了性能分析,则启用锁争用性能分析。 +如果启用了性能分析,则启用阻塞分析。 @@ -793,17 +793,33 @@ Enable lock contention profiling, if profiling is enabled +

CORS 允许的来源清单,以逗号分隔。 允许的来源可以是支持子域匹配的正则表达式。 如果此列表为空,则不会启用 CORS。 +请确保每个表达式与整个主机名相匹配,方法是用'^'锚定开始或包括'//'前缀,同时用'$'锚定结束或包括':'端口分隔符后缀。 +有效表达式的例子是'//example.com(:|$)'和'^https://example.com(:|$)'。 +
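The anchoring guidance in the CORS entry above (anchor the start with `^` or a `//` prefix, and the end with `$` or a `:` port separator) can be sanity-checked offline. A sketch using `grep -E` as a rough stand-in for the Go regexp engine the API server actually uses (GNU grep is assumed, since `$` appears mid-pattern inside a group):

```shell
# The help text's example pattern '//example.com(:|$)' should accept the real
# origin but reject a lookalike domain that merely begins with "example.com".
pattern='//example.com(:|$)'
printf '%s\n' 'https://example.com' 'https://example.com.evil.test' |
  grep -E "$pattern"
# prints: https://example.com
```

An unanchored pattern like `//example.com` would match both lines, which is exactly the misconfiguration the documentation change warns against.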

+ +--debug-socket-path string + + +

+ +使用位于给定路径的、未受保护的(无身份认证或鉴权的)UNIX 域套接字执行性能分析。 +

+ + --default-not-ready-toleration-seconds int     默认值:300 @@ -853,13 +869,11 @@ Number of workers spawned for DeleteCollection call. These are used to speed up

-尽管位于默认启用的插件列表中,仍须被禁用的准入插件(NamespaceLifecycle、LimitRanger、ServiceAccount、TaintNodesByCondition、PodSecurity、Priority、DefaultTolerationSeconds、DefaultStorageClass、StorageObjectInUseProtection、PersistentVolumeClaimResize、RuntimeClass、CertificateApproval、CertificateSigning、CertificateSubjectRestriction、DefaultIngressClass、MutatingAdmissionWebhook、ValidatingAdmissionPolicy、ValidatingAdmissionWebhook、ResourceQuota)。 -取值为逗号分隔的准入插件列表:AlwaysAdmit、AlwaysDeny、AlwaysPullImages、CertificateApproval、CertificateSigning、CertificateSubjectRestriction、DefaultIngressClass、DefaultStorageClass、DefaultTolerationSeconds、DenyServiceExternalIPs、EventRateLimit、ExtendedResourceToleration、ImagePolicyWebhook、LimitPodHardAntiAffinityTopology、LimitRanger、MutatingAdmissionWebhook、NamespaceAutoProvision、NamespaceExists、NamespaceLifecycle、NodeRestriction、OwnerReferencesPermissionEnforcement、PersistentVolumeClaimResize、PersistentVolumeLabel、PodNodeSelector、PodSecurity、PodTolerationRestriction、Priority、ResourceQuota、RuntimeClass、SecurityContextDeny、ServiceAccount、StorageObjectInUseProtection、TaintNodesByCondition、ValidatingAdmissionPolicy、ValidatingAdmissionWebhook。 +尽管位于默认启用的插件列表中,仍须被禁用的准入插件(NamespaceLifecycle、LimitRanger、ServiceAccount、TaintNodesByCondition、PodSecurity、Priority、DefaultTolerationSeconds、DefaultStorageClass、StorageObjectInUseProtection、PersistentVolumeClaimResize、RuntimeClass、CertificateApproval、CertificateSigning、ClusterTrustBundleAttest、CertificateSubjectRestriction、DefaultIngressClass、MutatingAdmissionWebhook、ValidatingAdmissionPolicy、ValidatingAdmissionWebhook、ResourceQuota)。 
+取值为逗号分隔的准入插件列表:AlwaysAdmit、AlwaysDeny、AlwaysPullImages、CertificateApproval、CertificateSigning、CertificateSubjectRestriction、ClusterTrustBundleAttest、DefaultIngressClass、DefaultStorageClass、DefaultTolerationSeconds、DenyServiceExternalIPs、EventRateLimit、ExtendedResourceToleration、ImagePolicyWebhook、LimitPodHardAntiAffinityTopology、LimitRanger、MutatingAdmissionWebhook、NamespaceAutoProvision、NamespaceExists、NamespaceLifecycle、NodeRestriction、OwnerReferencesPermissionEnforcement、PersistentVolumeClaimResize、PersistentVolumeLabel、PodNodeSelector、PodSecurity、PodTolerationRestriction、Priority、ResourceQuota、RuntimeClass、SecurityContextDeny、ServiceAccount、StorageObjectInUseProtection、TaintNodesByCondition、ValidatingAdmissionPolicy、ValidatingAdmissionWebhook。 该标志中插件的顺序无关紧要。

@@ -900,11 +914,11 @@ File with apiserver egress selector configuration.

-除了默认启用的插件(NamespaceLifecycle、LimitRanger、ServiceAccount、TaintNodesByCondition、PodSecurity、Priority、DefaultTolerationSeconds、DefaultStorageClass、StorageObjectInUseProtection、PersistentVolumeClaimResize、RuntimeClass、CertificateApproval、CertificateSigning、CertificateSubjectRestriction、DefaultIngressClass、MutatingAdmissionWebhook、ValidatingAdmissionPolicy、ValidatingAdmissionWebhook、ResourceQuota)之外要启用的准入插件。 -取值为逗号分隔的准入插件列表:AlwaysAdmit、AlwaysDeny、AlwaysPullImages、CertificateApproval、CertificateSigning、CertificateSubjectRestriction、DefaultIngressClass、DefaultStorageClass、DefaultTolerationSeconds、DenyServiceExternalIPs、EventRateLimit、ExtendedResourceToleration、ImagePolicyWebhook、LimitPodHardAntiAffinityTopology、LimitRanger、MutatingAdmissionWebhook、NamespaceAutoProvision、NamespaceExists、NamespaceLifecycle、NodeRestriction、OwnerReferencesPermissionEnforcement、PersistentVolumeClaimResize、PersistentVolumeLabel、PodNodeSelector、PodSecurity、PodTolerationRestriction、Priority、ResourceQuota、RuntimeClass、SecurityContextDeny、ServiceAccount、StorageObjectInUseProtection、TaintNodesByCondition、ValidatingAdmissionPolicy、ValidatingAdmissionWebhook。该标志中插件的顺序无关紧要。 +除了默认启用的插件(NamespaceLifecycle、LimitRanger、ServiceAccount、TaintNodesByCondition、PodSecurity、Priority、DefaultTolerationSeconds、DefaultStorageClass、StorageObjectInUseProtection、PersistentVolumeClaimResize、RuntimeClass、CertificateApproval、CertificateSigning、ClusterTrustBundleAttest、CertificateSubjectRestriction、DefaultIngressClass、MutatingAdmissionWebhook、ValidatingAdmissionPolicy、ValidatingAdmissionWebhook、ResourceQuota)之外要启用的准入插件。 
+取值为逗号分隔的准入插件列表:AlwaysAdmit、AlwaysDeny、AlwaysPullImages、CertificateApproval、CertificateSigning、CertificateSubjectRestriction、ClusterTrustBundleAttest、DefaultIngressClass、DefaultStorageClass、DefaultTolerationSeconds、DenyServiceExternalIPs、EventRateLimit、ExtendedResourceToleration、ImagePolicyWebhook、LimitPodHardAntiAffinityTopology、LimitRanger、MutatingAdmissionWebhook、NamespaceAutoProvision、NamespaceExists、NamespaceLifecycle、NodeRestriction、OwnerReferencesPermissionEnforcement、PersistentVolumeClaimResize、PersistentVolumeLabel、PodNodeSelector、PodSecurity、PodTolerationRestriction、Priority、ResourceQuota、RuntimeClass、SecurityContextDeny、ServiceAccount、StorageObjectInUseProtection、TaintNodesByCondition、ValidatingAdmissionPolicy、ValidatingAdmissionWebhook。该标志中插件的顺序无关紧要。
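To make the two admission-plugin flags described above concrete, a sketch with example values (NodeRestriction is not in the default-enabled list; DefaultStorageClass is):

```yaml
# Illustrative sketch only: enable one non-default admission plugin and
# disable one default plugin; order within each list does not matter.
command:
- kube-apiserver
- --enable-admission-plugins=NodeRestriction
- --disable-admission-plugins=DefaultStorageClass
```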

@@ -1185,16 +1199,16 @@ comma-separated 'key=True|False' pairs - +

-

一组 key=value 对,用来描述测试性/试验性功能的特性门控。可选项有:
APIListChunking=true|false (BETA - 默认值=true)
APIPriorityAndFairness=true|false (BETA - 默认值=true)
APIResponseCompression=true|false (BETA - 默认值=true)
-APISelfSubjectReview=true|false (ALPHA - 默认值=false)
+APISelfSubjectReview=true|false (BETA - 默认值=true)
APIServerIdentity=true|false (BETA - 默认值=true)
-APIServerTracing=true|false (ALPHA - 默认值=false)
-AggregatedDiscoveryEndpoint=true|false (ALPHA - 默认值=false)
+APIServerTracing=true|false (BETA - 默认值=true)
+AdmissionWebhookMatchConditions=true|false (ALPHA - 默认值=false)
+AggregatedDiscoveryEndpoint=true|false (BETA - 默认值=true)
AllAlpha=true|false (ALPHA - 默认值=false)
AllBeta=true|false (BETA - 默认值=false)
AnyVolumeDataSource=true|false (BETA - 默认值=true)
@@ -1314,29 +1334,31 @@ CPUManagerPolicyBetaOptions=true|false (BETA - 默认值=true)
CPUManagerPolicyOptions=true|false (BETA - 默认值=true)
CSIMigrationPortworx=true|false (BETA - 默认值=false)
CSIMigrationRBD=true|false (ALPHA - 默认值=false)
-CSINodeExpandSecret=true|false (ALPHA - 默认值=false)
+CSINodeExpandSecret=true|false (BETA - 默认值=true)
CSIVolumeHealth=true|false (ALPHA - 默认值=false)
-ComponentSLIs=true|false (ALPHA - 默认值=false)
+CloudControllerManagerWebhook=true|false (ALPHA - 默认值=false)
+CloudDualStackNodeIPs=true|false (ALPHA - 默认值=false)
+ClusterTrustBundle=true|false (ALPHA - 默认值=false)
+ComponentSLIs=true|false (BETA - 默认值=true)
ContainerCheckpoint=true|false (ALPHA - 默认值=false)
ContextualLogging=true|false (ALPHA - 默认值=false)
-CronJobTimeZone=true|false (BETA - 默认值=true)
CrossNamespaceVolumeDataSource=true|false (ALPHA - 默认值=false)
CustomCPUCFSQuotaPeriod=true|false (ALPHA - 默认值=false)
CustomResourceValidationExpressions=true|false (BETA - 默认值=true)
DisableCloudProviders=true|false (ALPHA - 默认值=false)
DisableKubeletCloudCredentialProviders=true|false (ALPHA - 默认值=false)
-DownwardAPIHugePages=true|false (BETA - 默认值=true)
DynamicResourceAllocation=true|false (ALPHA - 默认值=false)
-EventedPLEG=true|false (ALPHA - 默认值=false)
+ElasticIndexedJob=true|false (BETA - 默认值=true)
+EventedPLEG=true|false (BETA - 默认值=false)
ExpandedDNSConfig=true|false (BETA - 默认值=true)
ExperimentalHostUserNamespaceDefaulting=true|false (BETA - 默认值=false)
-GRPCContainerProbe=true|false (BETA - 默认值=true)
-GracefulNodeShutdown=true|false (BETA - 默认值=true)
+GracefulNodeShutdown=true|false (BETA - 默认值=true)
GracefulNodeShutdownBasedOnPodPriority=true|false (BETA - 默认值=true)
-HPAContainerMetrics=true|false (ALPHA - 默认值=false)
+HPAContainerMetrics=true|false (BETA - 默认值=true)
HPAScaleToZero=true|false (ALPHA - 默认值=false)
HonorPVReclaimPolicy=true|false (ALPHA - 默认值=false)
-IPTablesOwnershipCleanup=true|false (ALPHA - 默认值=false)
+IPTablesOwnershipCleanup=true|false (BETA - 默认值=true)
+InPlacePodVerticalScaling=true|false (ALPHA - 默认值=false)
InTreePluginAWSUnregister=true|false (ALPHA - 默认值=false)
InTreePluginAzureDiskUnregister=true|false (ALPHA - 默认值=false)
InTreePluginAzureFileUnregister=true|false (ALPHA - 默认值=false)
@@ -1345,63 +1367,67 @@ InTreePluginOpenStackUnregister=true|false (ALPHA - 默认值=false)
InTreePluginPortworxUnregister=true|false (ALPHA - 默认值=false)
InTreePluginRBDUnregister=true|false (ALPHA - 默认值=false)
InTreePluginvSphereUnregister=true|false (ALPHA - 默认值=false)
-JobMutableNodeSchedulingDirectives=true|false (BETA - 默认值=true)
JobPodFailurePolicy=true|false (BETA - 默认值=true)
JobReadyPods=true|false (BETA - 默认值=true)
-KMSv2=true|false (ALPHA - 默认值=false)
+KMSv2=true|false (BETA - 默认值=true)
KubeletInUserNamespace=true|false (ALPHA - 默认值=false)
KubeletPodResources=true|false (BETA - 默认值=true)
+KubeletPodResourcesDynamicResources=true|false (ALPHA - 默认值=false)
+KubeletPodResourcesGet=true|false (ALPHA - 默认值=false)
KubeletPodResourcesGetAllocatable=true|false (BETA - 默认值=true)
-KubeletTracing=true|false (ALPHA - 默认值=false)
-LegacyServiceAccountTokenTracking=true|false (ALPHA - 默认值=false)
+KubeletTracing=true|false (BETA - 默认值=true)
+LegacyServiceAccountTokenTracking=true|false (BETA - 默认值=true)
LocalStorageCapacityIsolationFSQuotaMonitoring=true|false (ALPHA - 默认值=false)
LogarithmicScaleDown=true|false (BETA - 默认值=true)
LoggingAlphaOptions=true|false (ALPHA - 默认值=false)
LoggingBetaOptions=true|false (BETA - 默认值=true)
-MatchLabelKeysInPodTopologySpread=true|false (ALPHA - 默认值=false)
+MatchLabelKeysInPodTopologySpread=true|false (BETA - 默认值=true)
MaxUnavailableStatefulSet=true|false (ALPHA - 默认值=false)
MemoryManager=true|false (BETA - 默认值=true)
MemoryQoS=true|false (ALPHA - 默认值=false)
-MinDomainsInPodTopologySpread=true|false (BETA - 默认值=false)
-MinimizeIPTablesRestore=true|false (ALPHA - 默认值=false)
+MinDomainsInPodTopologySpread=true|false (BETA - 默认值=true)
+MinimizeIPTablesRestore=true|false (BETA - 默认值=true)
MultiCIDRRangeAllocator=true|false (ALPHA - 默认值=false)
+MultiCIDRServiceAllocator=true|false (ALPHA - 默认值=false)
NetworkPolicyStatus=true|false (ALPHA - 默认值=false)
+NewVolumeManagerReconstruction=true|false (BETA - 默认值=true)
NodeInclusionPolicyInPodTopologySpread=true|false (BETA - 默认值=true)
+NodeLogQuery=true|false (ALPHA - 默认值=false)
NodeOutOfServiceVolumeDetach=true|false (BETA - 默认值=true)
NodeSwap=true|false (ALPHA - 默认值=false)
OpenAPIEnums=true|false (BETA - 默认值=true)
-OpenAPIV3=true|false (BETA - 默认值=true)
-PDBUnhealthyPodEvictionPolicy=true|false (ALPHA - 默认值=false)
+PDBUnhealthyPodEvictionPolicy=true|false (BETA - 默认值=true)
PodAndContainerStatsFromCRI=true|false (ALPHA - 默认值=false)
PodDeletionCost=true|false (BETA - 默认值=true)
PodDisruptionConditions=true|false (BETA - 默认值=true)
PodHasNetworkCondition=true|false (ALPHA - 默认值=false)
-PodSchedulingReadiness=true|false (ALPHA - 默认值=false)
+PodSchedulingReadiness=true|false (BETA - 默认值=true)
ProbeTerminationGracePeriod=true|false (BETA - 默认值=true)
ProcMountType=true|false (ALPHA - 默认值=false)
ProxyTerminatingEndpoints=true|false (BETA - 默认值=true)
QOSReserved=true|false (ALPHA - 默认值=false)
-ReadWriteOncePod=true|false (ALPHA - 默认值=false)
+ReadWriteOncePod=true|false (BETA - 默认值=true)
RecoverVolumeExpansionFailure=true|false (ALPHA - 默认值=false)
RemainingItemCount=true|false (BETA - 默认值=true)
RetroactiveDefaultStorageClass=true|false (BETA - 默认值=true)
RotateKubeletServerCertificate=true|false (BETA - 默认值=true)
-SELinuxMountReadWriteOncePod=true|false (ALPHA - 默认值=false)
-SeccompDefault=true|false (BETA - 默认值=true)
-ServerSideFieldValidation=true|false (BETA - 默认值=true)
+SELinuxMountReadWriteOncePod=true|false (BETA - 默认值=true)
+SecurityContextDeny=true|false (ALPHA - 默认值=false)
+ServiceNodePortStaticSubrange=true|false (ALPHA - 默认值=false)
SizeMemoryBackedVolumes=true|false (BETA - 默认值=true)
-StatefulSetAutoDeletePVC=true|false (ALPHA - 默认值=false)
-StatefulSetStartOrdinal=true|false (ALPHA - 默认值=false)
+StableLoadBalancerNodeSet=true|false (BETA - 默认值=true)
+StatefulSetAutoDeletePVC=true|false (BETA - 默认值=true)
+StatefulSetStartOrdinal=true|false (BETA - 默认值=true)
StorageVersionAPI=true|false (ALPHA - 默认值=false)
StorageVersionHash=true|false (BETA - 默认值=true)
TopologyAwareHints=true|false (BETA - 默认值=true)
-TopologyManager=true|false (BETA - 默认值=true)
TopologyManagerPolicyAlphaOptions=true|false (ALPHA - 默认值=false)
TopologyManagerPolicyBetaOptions=true|false (BETA - 默认值=false)
TopologyManagerPolicyOptions=true|false (ALPHA - 默认值=false)
UserNamespacesStatelessPodsSupport=true|false (ALPHA - 默认值=false)
ValidatingAdmissionPolicy=true|false (ALPHA - 默认值=false)
VolumeCapacityPriority=true|false (ALPHA - 默认值=false)
+WatchList=true|false (ALPHA - 默认值=false)
WinDSR=true|false (ALPHA - 默认值=false)
WinOverlay=true|false (BETA - 默认值=true)
WindowsHostNetwork=true|false (ALPHA - 默认值=true)
@@ -2214,6 +2240,18 @@ in addition 'Connection: close' response header is set in order to tear down the
+ +--shutdown-watch-termination-grace-period duration + + +

+ +此选项如果被设置,则表示在 API 服务器体面关闭窗口期内,等待活跃的监视(watch)请求耗尽的最长宽限期。 +
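For example, a hedged sketch (the 20s duration is an arbitrary illustration, not a recommended value):

```yaml
# Illustrative sketch only: give active watch requests up to 20 seconds to
# drain during the API server's graceful shutdown window.
command:
- kube-apiserver
- --shutdown-watch-termination-grace-period=20s
```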

+ + --storage-backend string From b672cea482f2546ccaadb13ab01ff4ae9eeeedd7 Mon Sep 17 00:00:00 2001 From: Zhuzhenghao Date: Thu, 13 Apr 2023 15:27:13 +0800 Subject: [PATCH 14/22] [zh] resync page extensible-admission-controllers --- .../extensible-admission-controllers.md | 424 ++++++++++-------- 1 file changed, 233 insertions(+), 191 deletions(-) diff --git a/content/zh-cn/docs/reference/access-authn-authz/extensible-admission-controllers.md b/content/zh-cn/docs/reference/access-authn-authz/extensible-admission-controllers.md index 7b29c76a347..7beaa90ee91 100644 --- a/content/zh-cn/docs/reference/access-authn-authz/extensible-admission-controllers.md +++ b/content/zh-cn/docs/reference/access-authn-authz/extensible-admission-controllers.md @@ -3,8 +3,14 @@ title: 动态准入控制 content_type: concept weight: 40 --- - + @@ -29,16 +36,16 @@ This page describes how to build, configure, use, and monitor admission webhooks 准入 Webhook 是一种用于接收准入请求并对其进行处理的 HTTP 回调机制。 可以定义两种类型的准入 webhook,即 [验证性质的准入 Webhook](/zh-cn/docs/reference/access-authn-authz/admission-controllers/#validatingadmissionwebhook) 和 [修改性质的准入 Webhook](/zh-cn/docs/reference/access-authn-authz/admission-controllers/#mutatingadmissionwebhook)。 -修改性质的准入 Webhook 会先被调用。它们可以更改发送到 API +修改性质的准入 Webhook 会先被调用。它们可以更改发送到 API 服务器的对象以执行自定义的设置默认值操作。 -{{< note >}} 如果准入 Webhook 需要保证它们所看到的是对象的最终状态以实施某种策略。 则应使用验证性质的准入 Webhook,因为对象被修改性质 Webhook 看到之后仍然可能被修改。 {{< /note >}} -### 尝试准入 Webhook {#experimenting-with-admission-webhooks} +## 尝试准入 Webhook {#experimenting-with-admission-webhooks} 准入 Webhook 本质上是集群控制平面的一部分。你应该非常谨慎地编写和部署它们。 如果你打算编写或者部署生产级准入 webhook,请阅读[用户指南](/zh-cn/docs/reference/access-authn-authz/extensible-admission-controllers/#write-an-admission-webhook-server)以获取相关说明。 @@ -101,19 +109,19 @@ that is validated in a Kubernetes e2e test. The webhook handles the as an `AdmissionReview` object in the same version it received. 
--> 请参阅 Kubernetes e2e 测试中的 -[admission webhook 服务器](https://github.com/kubernetes/kubernetes/blob/release-1.21/test/images/agnhost/webhook/main.go) +[Admission Webhook 服务器](https://github.com/kubernetes/kubernetes/blob/release-1.21/test/images/agnhost/webhook/main.go) 的实现。webhook 处理由 API 服务器发送的 `AdmissionReview` 请求,并且将其决定 作为 `AdmissionReview` 对象以相同版本发送回去。 -有关发送到 webhook 的数据的详细信息,请参阅 [webhook 请求](#request)。 +有关发送到 Webhook 的数据的详细信息,请参阅 [Webhook 请求](#request)。 -要获取来自 webhook 的预期数据,请参阅 [webhook 响应](#response)。 +要获取来自 Webhook 的预期数据,请参阅 [Webhook 响应](#response)。 示例准入 Webhook 服务器置 `ClientAuth` 字段为 [空](https://github.com/kubernetes/kubernetes/blob/v1.22.0/test/images/agnhost/webhook/config.go#L38-L39), -默认为 `NoClientCert` 。这意味着 webhook 服务器不会验证客户端的身份,认为其是 apiservers。 +默认为 `NoClientCert` 。这意味着 Webhook 服务器不会验证客户端的身份,认为其是 apiservers。 如果你需要双向 TLS 或其他方式来验证客户端,请参阅 如何[对 apiservers 进行身份认证](#authenticate-apiservers)。 @@ -141,18 +149,18 @@ The test also creates a [service](/docs/reference/generated/kubernetes-api/{{< p as the front-end of the webhook server. See [code](https://github.com/kubernetes/kubernetes/blob/v1.22.0/test/e2e/apimachinery/webhook.go#L748). 
--> -e2e 测试中的 webhook 服务器通过 +e2e 测试中的 Webhook 服务器通过 [deployment API](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#deployment-v1-apps) 部署在 Kubernetes 集群中。该测试还将创建一个 [service](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#service-v1-core) -作为 webhook 服务器的前端。参见 +作为 Webhook 服务器的前端。参见 [相关代码](https://github.com/kubernetes/kubernetes/blob/v1.22.0/test/e2e/apimachinery/webhook.go#L748)。 -你也可以在集群外部署 webhook。这样做需要相应地更新你的 webhook 配置。 +你也可以在集群外部署 Webhook。这样做需要相应地更新你的 Webhook 配置。 -你可以通过 +你可以通过 [ValidatingWebhookConfiguration](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#validatingwebhookconfiguration-v1-admissionregistration-k8s-io) -或者 +或者 [MutatingWebhookConfiguration](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#mutatingwebhookconfiguration-v1-admissionregistration-k8s-io) 动态配置哪些资源要被哪些准入 Webhook 处理。 + -以下是一个 `ValidatingWebhookConfiguration` 示例,mutating webhook 配置与此类似。有关每个配置字段的详细信息,请参阅 [webhook 配置](#webhook-configuration) 部分。 +以下是一个 `ValidatingWebhookConfiguration` 示例,Mutating Webhook 配置与此类似。有关每个配置字段的详细信息,请参阅 [Webhook 配置](#webhook-configuration) 部分。 ```yaml apiVersion: admissionregistration.k8s.io/v1 @@ -184,11 +193,11 @@ metadata: webhooks: - name: "pod-policy.example.com" rules: - - apiGroups: [""] + - apiGroups: [""] apiVersions: ["v1"] - operations: ["CREATE"] - resources: ["pods"] - scope: "Namespaced" + operations: ["CREATE"] + resources: ["pods"] + scope: "Namespaced" clientConfig: service: namespace: "example-namespace" @@ -198,6 +207,7 @@ webhooks: sideEffects: None timeoutSeconds: 5 ``` + {{< note >}} `scope` 字段指定是仅集群范围的资源(Cluster)还是名字空间范围的资源资源(Namespaced)将与此规则匹配。 `*` 表示没有范围限制。 +{{< note >}} -{{< note >}} 当使用 `clientConfig.service` 时,服务器证书必须对 `..svc` 有效。 {{< /note >}} +{{< note >}} -{{< note >}} -Webhook 调用的默认超时是 10 秒,你可以设置 `timeout` 并建议对 webhook 设置较短的超时时间。 -如果 webhook 调用超时,则根据 webhook 的失败策略处理请求。 +Webhook 调用的默认超时是 10 秒,你可以设置 `timeout` 并建议对 Webhook 设置较短的超时时间。 +如果 Webhook 调用超时,则根据 
Webhook 的失败策略处理请求。 {{< /note >}} 当一个 API 服务器收到与 `rules` 相匹配的请求时, -该 API 服务器将按照 `clientConfig` 中指定的方式向 webhook 发送一个 `admissionReview` 请求。 +该 API 服务器将按照 `clientConfig` 中指定的方式向 Webhook 发送一个 `admissionReview` 请求。 创建 Webhook 配置后,系统将花费几秒钟使新配置生效。 ### 对 API 服务器进行身份认证 {#authenticate-apiservers} @@ -322,71 +332,74 @@ For more information about `AdmissionConfiguration`, see the [AdmissionConfiguration (v1) reference](/docs/reference/config-api/apiserver-webhookadmission.v1/). See the [webhook configuration](#webhook-configuration) section for details about each config field. -* In the kubeConfig file, provide the credentials: +In the kubeConfig file, provide the credentials: --> 有关 `AdmissionConfiguration` 的更多信息,请参见 [AdmissionConfiguration (v1) reference](/docs/reference/config-api/apiserver-webhookadmission.v1/)。 -有关每个配置字段的详细信息,请参见 [webhook 配置](#webhook-配置)部分。 +有关每个配置字段的详细信息,请参见 [Webhook 配置](#webhook-配置)部分。 -* 在 kubeConfig 文件中,提供证书凭据: +在 kubeConfig 文件中,提供证书凭据: + +```yaml +apiVersion: v1 +kind: Config +users: +# 名称应设置为服务的 DNS 名称或配置了 Webhook 的 URL 的主机名(包括端口)。 +# 如果将非 443 端口用于服务,则在配置 1.16+ API 服务器时,该端口必须包含在名称中。 +# +# 对于配置在默认端口(443)上与服务对话的 Webhook,请指定服务的 DNS 名称: +# - name: webhook1.ns1.svc +# user: ... +# +# 对于配置在非默认端口(例如 8443)上与服务对话的 Webhook,请在 1.16+ 中指定服务的 DNS 名称和端口: +# - name: webhook1.ns1.svc:8443 +# user: ... +# 并可以选择仅使用服务的 DNS 名称来创建第二节,以与 1.15 API 服务器版本兼容: +# - name: webhook1.ns1.svc +# user: ... +# +# 对于配置为使用 URL 的 webhook,请匹配在 webhook 的 URL 中指定的主机(和端口)。 +# 带有 `url: https://www.example.com` 的 webhook: +# - name: www.example.com +# user: ... +# +# 带有 `url: https://www.example.com:443` 的 webhook: +# - name: www.example.com:443 +# user: ... +# +# 带有 `url: https://www.example.com:8443` 的 webhook: +# - name: www.example.com:8443 +# user: ... 
+# +- name: 'webhook1.ns1.svc' + user: + client-certificate-data: "" + client-key-data: "" +# `name` 支持使用 * 通配符匹配前缀段。 +- name: '*.webhook-company.org' + user: + password: "" + username: "" +# '*' 是默认匹配项。 +- name: '*' + user: + token: "" +``` - ```yaml - apiVersion: v1 - kind: Config - users: - # 名称应设置为服务的 DNS 名称或配置了 Webhook 的 URL 的主机名(包括端口)。 - # 如果将非 443 端口用于服务,则在配置 1.16+ API 服务器时,该端口必须包含在名称中。 - # - # 对于配置在默认端口(443)上与服务对话的 Webhook,请指定服务的 DNS 名称: - # - name: webhook1.ns1.svc - # user: ... - # - # 对于配置在非默认端口(例如 8443)上与服务对话的 Webhook,请在 1.16+ 中指定服务的 DNS 名称和端口: - # - name: webhook1.ns1.svc:8443 - # user: ... - # 并可以选择仅使用服务的 DNS 名称来创建第二节,以与 1.15 API 服务器版本兼容: - # - name: webhook1.ns1.svc - # user: ... - # - # 对于配置为使用 URL 的 webhook,请匹配在 webhook 的 URL 中指定的主机(和端口)。 - # 带有 `url: https://www.example.com` 的 webhook: - # - name: www.example.com - # user: ... - # - # 带有 `url: https://www.example.com:443` 的 webhook: - # - name: www.example.com:443 - # user: ... - # - # 带有 `url: https://www.example.com:8443` 的 webhook: - # - name: www.example.com:8443 - # user: ... 
- # - - name: 'webhook1.ns1.svc' - user: - client-certificate-data: "" - client-key-data: "" - # `name` 支持使用 * 通配符匹配前缀段。 - - name: '*.webhook-company.org' - user: - password: "" - username: "" - # '*' 是默认匹配项。 - - name: '*' - user: - token: "" - ``` 当然,你需要设置 Webhook 服务器来处理这些身份验证请求。 - + ## Webhook 请求与响应 {#webhook-request-and-response} -创建 webhook 配置时,`admissionReviewVersions` 是必填字段。 +创建 Webhook 配置时,`admissionReviewVersions` 是必填字段。 Webhook 必须支持至少一个当前和以前的 API 服务器都可以解析的 `AdmissionReview` 版本。 当拒绝请求时,Webhook 可以使用 `status` 字段自定义 http 响应码和返回给用户的消息。 @@ -624,7 +639,8 @@ For `patchType: JSONPatch`, the `patch` field contains a base64-encoded array of 对于 `patchType: JSONPatch`,`patch` 字段包含一个以 base64 编码的 JSON patch 操作数组。 @@ -652,18 +668,19 @@ So a webhook response to add that label would be: } ``` - 准入 Webhook 可以选择性地返回在 HTTP `Warning` 头中返回给请求客户端的警告消息,警告代码为 299。 警告可以与允许或拒绝的准入响应一起发送。 - 如果你正在实现返回一条警告的 webhook,则: @@ -674,7 +691,7 @@ If you're implementing a webhook that returns a warning: {{< caution >}} 超过 256 个字符的单条警告消息在返回给客户之前可能会被 API 服务器截断。 如果超过 4096 个字符的警告消息(来自所有来源),则额外的警告消息会被忽略。 @@ -731,37 +748,44 @@ Webhook,则应为每个 Webhook 赋予一个唯一的名称。 Each webhook must specify a list of rules used to determine if a request to the API server should be sent to the webhook. 
Each rule specifies one or more operations, apiGroups, apiVersions, and resources, and a resource scope: --> -每个 webhook 必须指定用于确定是否应将对 apiserver 的请求发送到 webhook 的规则列表。 +每个 Webhook 必须指定用于确定是否应将对 apiserver 的请求发送到 webhook 的规则列表。 每个规则都指定一个或多个 operations、apiGroups、apiVersions 和 resources 以及资源的 scope: * `operations` 列出一个或多个要匹配的操作。 可以是 `CREATE`、`UPDATE`、`DELETE`、`CONNECT` 或 `*` 以匹配所有内容。 * `apiGroups` 列出了一个或多个要匹配的 API 组。`""` 是核心 API 组。`"*"` 匹配所有 API 组。 * `apiVersions` 列出了一个或多个要匹配的 API 版本。`"*"` 匹配所有 API 版本。 * `resources` 列出了一个或多个要匹配的资源。 - * `"*"` 匹配所有资源,但不包括子资源。 - * `"*/*"` 匹配所有资源,包括子资源。 - * `"pods/*"` 匹配 pod 的所有子资源。 - * `"*/status"` 匹配所有 status 子资源。 + + * `"*"` 匹配所有资源,但不包括子资源。 + * `"*/*"` 匹配所有资源,包括子资源。 + * `"pods/*"` 匹配 pod 的所有子资源。 + * `"*/status"` 匹配所有 status 子资源。 * `scope` 指定要匹配的范围。有效值为 `"Cluster"`、`"Namespaced"` 和 `"*"`。 子资源匹配其父资源的范围。默认值为 `"*"`。 - * `"Cluster"` 表示只有集群作用域的资源才能匹配此规则(API 对象 Namespace 是集群作用域的)。 - * `"Namespaced"` 意味着仅具有名字空间的资源才符合此规则。 - * `"*"` 表示没有作用域限制。 + + * `"Cluster"` 表示只有集群作用域的资源才能匹配此规则(API 对象 Namespace 是集群作用域的)。 + * `"Namespaced"` 意味着仅具有名字空间的资源才符合此规则。 + * `"*"` 表示没有作用域限制。 -仅当选择使用 webhook 时才使用对象选择器,因为最终用户可以通过设置标签来 +仅当选择使用 Webhook 时才使用对象选择器,因为最终用户可以通过设置标签来 跳过准入 Webhook。 此示例显示了一个验证性质的 Webhook,它将匹配到对某名字空间中的任何具名字空间的资源的 `CREATE` 请求,前提是该名字空间具有值为 "prod" 或 "staging" 的 "environment" 标签: @@ -951,7 +976,7 @@ webhooks: matchExpressions: - key: environment operator: In - values: ["prod","staging"] + values: ["prod", "staging"] rules: - operations: ["CREATE"] apiGroups: ["*"] @@ -983,7 +1008,7 @@ For example, if a webhook only specified a rule for some API groups/versions and a request was made to modify the resource via another API group/version (like `extensions/v1beta1`), the request would not be sent to the webhook. 
--> -例如,如果一个 webhook 仅为某些 API 组/版本指定了规则(例如 +例如,如果一个 Webhook 仅为某些 API 组/版本指定了规则(例如 `apiGroups:["apps"], apiVersions:["v1","v1beta1"]`),而修改资源的请求是通过另一个 API 组/版本(例如 `extensions/v1beta1`)发出的,该请求将不会被发送到 Webhook。 @@ -991,25 +1016,28 @@ API 组/版本(例如 `extensions/v1beta1`)发出的,该请求将不会被 The `matchPolicy` lets a webhook define how its `rules` are used to match incoming requests. Allowed values are `Exact` or `Equivalent`. --> -`matchPolicy` 允许 webhook 定义如何使用其 `rules` 匹配传入的请求。 +`matchPolicy` 允许 Webhook 定义如何使用其 `rules` 匹配传入的请求。 允许的值为 `Exact` 或 `Equivalent`。 * `Exact` 表示仅当请求与指定规则完全匹配时才应拦截该请求。 * `Equivalent` 表示如果某个请求意在修改 `rules` 中列出的资源, 即使该请求是通过其他 API 组或版本发起,也应拦截该请求。 -在上面给出的示例中,仅为 `apps/v1` 注册的 webhook 可以使用 `matchPolicy`: +在上面给出的示例中,仅为 `apps/v1` 注册的 Webhook 可以使用 `matchPolicy`: * `matchPolicy: Exact` 表示不会将 `extensions/v1beta1` 请求发送到 Webhook -* `matchPolicy:Equivalent` 表示将 `extensions/v1beta1` 请求发送到 webhook - (将对象转换为 webhook 指定的版本:`apps/v1`) +* `matchPolicy:Equivalent` 表示将 `extensions/v1beta1` 请求发送到 Webhook + (将对象转换为 Webhook 指定的版本:`apps/v1`) 准入 Webhook 所用的 `matchPolicy` 默认为 `Equivalent`。 @@ -1144,7 +1173,7 @@ webhooks: expression: '!authorizer.group("admissionregistration.k8s.io").resource("validatingwebhookconfigurations").name("my-webhook.example.com").check("breakglass").allowed()' ``` - 匹配条件可以访问以下 CEL 变量: @@ -1217,8 +1246,8 @@ stanza of the webhook configuration. Webhooks can either be called via a URL or a service reference, and can optionally include a custom CA bundle to use to verify the TLS connection. --> -API 服务器确定请求应发送到 webhook 后,它需要知道如何调用 webhook。 -此信息在 webhook 配置的 `clientConfig` 节中指定。 +API 服务器确定请求应发送到 Webhook 后,它需要知道如何调用 webhook。 +此信息在 Webhook 配置的 `clientConfig` 节中指定。 Webhook 可以通过 URL 或服务引用来调用,并且可以选择包含自定义 CA 包,以用于验证 TLS 连接。 @@ -1231,7 +1260,7 @@ Webhook 可以通过 URL 或服务引用来调用,并且可以选择包含自 `url` gives the location of the webhook, in standard URL form (`scheme://host:port/path`). 
--> -`url` 以标准 URL 形式给出 webhook 的位置(`scheme://host:port/path`)。 +`url` 以标准 URL 形式给出 Webhook 的位置(`scheme://host:port/path`)。 请注意,将 `localhost` 或 `127.0.0.1` 用作 `host` 是有风险的, -除非你非常小心地在所有运行 apiserver 的、可能需要对此 webhook +除非你非常小心地在所有运行 apiserver 的、可能需要对此 Webhook 进行调用的主机上运行。这样的安装方式可能不具有可移植性,即很难在新集群中启用。 使用用户或基本身份验证(例如:"user:password@")是不允许的。 使用片段("#...")和查询参数("?...")也是不允许的。 @@ -1321,14 +1350,16 @@ webhooks: path: /my-path port: 1234 ``` + {{< note >}} 你必须在以上示例中将 `` 替换为一个有效的 VA 证书包, 这是一个用 PEM 编码的 CA 证书包,用于校验 Webhook 的服务器证书。 {{< /note >}} + @@ -1342,7 +1373,7 @@ Webhook 通常仅对发送给他们的 `AdmissionReview` 内容进行操作。 但是,某些 Webhook 在处理 admission 请求时会进行带外更改。 -Webhook 使用 webhook 配置中的 `sideEffects` 字段显示它们是否有副作用: +Webhook 使用 Webhook 配置中的 `sideEffects` 字段显示它们是否有副作用: -* `None`:调用 webhook 没有副作用。 -* `NoneOnDryRun`:调用 webhook 可能会有副作用,但是如果将带有 `dryRun: true` - 属性的请求发送到 webhook,则 webhook 将抑制副作用(该 webhook 可识别 `dryRun`)。 +* `None`:调用 Webhook 没有副作用。 +* `NoneOnDryRun`:调用 Webhook 可能会有副作用,但是如果将带有 `dryRun: true` + 属性的请求发送到 webhook,则 Webhook 将抑制副作用(该 Webhook 可识别 `dryRun`)。 -这是一个 validating webhook 的示例,表明它对 `dryRun: true` 请求没有副作用: +这是一个 validating Webhook 的示例,表明它对 `dryRun: true` 请求没有副作用: ```yaml apiVersion: admissionregistration.k8s.io/v1 @@ -1427,8 +1458,8 @@ webhooks: timeoutSeconds: 2 ``` - 准入 Webhook 所用的超时时间默认为 10 秒。 @@ -1464,9 +1495,9 @@ and mutating webhooks can specify a `reinvocationPolicy` to control whether they 可以将 `reinvocationPolicy` 设置为 `Never` 或 `IfNeeded`。 默认为 `Never`。 * `Never`: 在一次准入测试中,不得多次调用 Webhook。 * `IfNeeded`: 如果在最初的 Webhook 调用之后被其他对象的插件修改了被接纳的对象, @@ -1479,9 +1510,11 @@ The important elements to note are: * 不能保证附加调用的次数恰好是一。 * 如果其他调用导致对该对象的进一步修改,则不能保证再次调用 Webhook。 @@ -1490,7 +1523,8 @@ The important elements to note are: (推荐用于有副作用的 Webhook)。 这是一个修改性质的 Webhook 的示例,该 Webhook 在以后的准入插件修改对象时被重新调用: @@ -1510,7 +1544,7 @@ in an object could already exist in the user-provided object, but it is essentia 修改性质的 Webhook 必须具有[幂等](#idempotence)性,并且能够成功处理 已被接纳并可能被修改的对象的修改性质的 Webhook。 对于所有修改性质的准入 
Webhook 都是如此,因为它们可以在对象中进行的 -任何更改可能已经存在于用户提供的对象中,但是对于选择重新调用的 webhook +任何更改可能已经存在于用户提供的对象中,但是对于选择重新调用的 Webhook 来说是必不可少的。 -`failurePolicy` 定义了如何处理准入 webhook 中无法识别的错误和超时错误。允许的值为 `Ignore` 或 `Fail`。 +`failurePolicy` 定义了如何处理准入 Webhook 中无法识别的错误和超时错误。允许的值为 `Ignore` 或 `Fail`。 -* `Ignore` 表示调用 webhook 的错误将被忽略并且允许 API 请求继续。 -* `Fail` 表示调用 webhook 的错误导致准入失败并且 API 请求被拒绝。 +* `Ignore` 表示调用 Webhook 的错误将被忽略并且允许 API 请求继续。 +* `Fail` 表示调用 Webhook 的错误导致准入失败并且 API 请求被拒绝。 这是一个修改性质的 webhook,配置为在调用准入 Webhook 遇到错误时拒绝 API 请求: @@ -1542,8 +1576,8 @@ webhooks: failurePolicy: Fail ``` - 准入 Webhook 所用的默认 `failurePolicy` 是 `Fail`。 @@ -1560,14 +1594,13 @@ monitoring mechanisms help cluster admins to answer questions like: 2. What change did the mutating webhook applied to the object? -3. Which webhooks are frequently rejecting API requests? What's the reason for a - rejection? +3. Which webhooks are frequently rejecting API requests? What's the reason for a rejection? --> API 服务器提供了监视准入 Webhook 行为的方法。这些监视机制可帮助集群管理员回答以下问题: -1. 哪个修改性质的 webhook 改变了 API 请求中的对象? +1. 哪个修改性质的 Webhook 改变了 API 请求中的对象? 2. 修改性质的 Webhook 对对象做了哪些更改? -3. 哪些 webhook 经常拒绝 API 请求?是什么原因拒绝? +3. 哪些 Webhook 经常拒绝 API 请求?是什么原因拒绝? - 在 `Metadata` 或更高审计级别上,将使用 JSON 负载记录带有键名 -`mutation.webhook.admission.k8s.io/round_{round idx}_index_{order idx}` 的注解, -该注解表示针对给定请求调用了 Webhook,以及该 Webhook 是否更改了对象。 + `mutation.webhook.admission.k8s.io/round_{round idx}_index_{order idx}` 的注解, + 该注解表示针对给定请求调用了 Webhook,以及该 Webhook 是否更改了对象。 有时,了解哪些准入 Webhook 经常拒绝 API 请求以及拒绝的原因是很有用的。 @@ -1757,20 +1791,22 @@ metrics are labelled to identify the causes of webhook rejection(s): - `type`: the admission webhook type, can be one of `admit` and `validating`. - `error_type`: identifies if an error occurred during the webhook invocation that caused the rejection. Its value can be one of: - - `calling_webhook_error`: unrecognized errors or timeout errors from the admission webhook happened and the - webhook's [Failure policy](#failure-policy) is set to `Fail`. 
- - `no_error`: no error occurred. The webhook rejected the request with `allowed: false` in the admission - response. The metrics label `rejection_code` records the `.status.code` set in the admission response. - - `apiserver_internal_error`: an API server internal error happened. + + - `calling_webhook_error`: unrecognized errors or timeout errors from the admission webhook happened and the + webhook's [Failure policy](#failure-policy) is set to `Fail`. + - `no_error`: no error occurred. The webhook rejected the request with `allowed: false` in the admission + response. The metrics label `rejection_code` records the `.status.code` set in the admission response. + - `apiserver_internal_error`: an API server internal error happened. + - `rejection_code`: the HTTP status code set in the admission response when a webhook rejected a request. --> - `name`:拒绝请求 Webhook 的名称。 - `operation`:请求的操作类型可以是 `CREATE`、`UPDATE`、`DELETE` 和 `CONNECT` 其中之一。 -- `type`:Admission webhook 类型,可以是 `admit` 和 `validating` 其中之一。 -- `error_type`:标识在 webhook 调用期间是否发生了错误并且导致了拒绝。其值可以是以下之一: +- `type`:Admission Webhook 类型,可以是 `admit` 和 `validating` 其中之一。 +- `error_type`:标识在 Webhook 调用期间是否发生了错误并且导致了拒绝。其值可以是以下之一: - `calling_webhook_error`:发生了来自准入 Webhook 的无法识别的错误或超时错误, - 并且 webhook 的 [失败策略](#failure-policy) 设置为 `Fail`。 + 并且 Webhook 的 [失败策略](#failure-policy) 设置为 `Fail`。 - `no_error`:未发生错误。Webhook 在准入响应中以 `allowed: false` 值拒绝了请求。 度量标签 `rejection_code` 记录了在准入响应中设置的 `.status.code`。 - `apiserver_internal_error`:apiserver 发生内部错误。 @@ -1815,7 +1851,8 @@ the initial application. 2. For a `CREATE` pod request, if the field `.spec.containers[].resources.limits` of a container is not set, set default resource limits. -3. For a `CREATE` pod request, inject a sidecar container with name `foo-sidecar` if no container with the name `foo-sidecar` already exists. +3. For a `CREATE` pod request, inject a sidecar container with name `foo-sidecar` if no container + with the name `foo-sidecar` already exists. 
In the cases above, the webhook can be safely reinvoked, or admit an object that already has the fields set. --> @@ -1891,16 +1928,18 @@ versions. See [Matching requests: matchPolicy](#matching-requests-matchpolicy) f ### 可用性 {#availability} -建议准入 webhook 尽快完成执行(时长通常是毫秒级),因为它们会增加 API 请求的延迟。 +建议准入 Webhook 尽快完成执行(时长通常是毫秒级),因为它们会增加 API 请求的延迟。 建议对 Webhook 使用较小的超时值。有关更多详细信息,请参见[超时](#timeouts)。 建议 Admission Webhook 应该采用某种形式的负载均衡机制,以提供高可用性和高性能。 @@ -1912,9 +1951,11 @@ to leverage the load-balancing that service supports. Admission webhooks that need to guarantee they see the final state of the object in order to enforce policy should use a validating admission webhook, since objects can be modified after being seen by mutating webhooks. -For example, a mutating admission webhook is configured to inject a sidecar container with name "foo-sidecar" on every -`CREATE` pod request. If the sidecar *must* be present, a validating admisson webhook should also be configured to intercept `CREATE` pod requests, and validate -that a container with name "foo-sidecar" with the expected configuration exists in the to-be-created object. +For example, a mutating admission webhook is configured to inject a sidecar container with name +"foo-sidecar" on every `CREATE` pod request. If the sidecar *must* be present, a validating +admisson webhook should also be configured to intercept `CREATE` pod requests, and validate that a +container with name "foo-sidecar" with the expected configuration exists in the to-be-created +object. 
--> ### 确保看到对象的最终状态 {#guaranteeing-the-final-state-of-the-object-is-seen} @@ -1923,7 +1964,7 @@ that a container with name "foo-sidecar" with the expected configuration exists 则应该使用一个验证性质的 webhook, 因为可以通过 mutating Webhook 看到对象后对其进行修改。 -例如,一个修改性质的准入Webhook 被配置为在每个 `CREATE` Pod 请求中 +例如,一个修改性质的准入 Webhook 被配置为在每个 `CREATE` Pod 请求中 注入一个名称为 "foo-sidecar" 的 sidecar 容器。 如果*必须*存在边车容器,则还应配置一个验证性质的准入 Webhook 以拦截 @@ -1942,7 +1983,8 @@ When a node that runs the webhook server pods becomes unhealthy, the webhook deployment will try to reschedule the pods to another node. However the requests will get rejected by the existing webhook server since the `"env"` label is unset, and the migration cannot happen. -It is recommended to exclude the namespace where your webhook is running with a [namespaceSelector](#matching-requests-namespaceselector). +It is recommended to exclude the namespace where your webhook is running with a +[namespaceSelector](#matching-requests-namespaceselector). --> ### 避免自托管的 Webhooks 中出现死锁 {#avoiding-deadlocks-in-self-hosted-webhooks} @@ -1971,7 +2013,7 @@ set to `NoneOnDryRun`. See [Side effects](#side-effects) for more detail. 
--> ### 副作用 {#side-effects} -建议准入 Webhook 应尽可能避免副作用,这意味着该准入 webhook 仅对发送给他们的 +建议准入 Webhook 应尽可能避免副作用,这意味着该准入 Webhook 仅对发送给他们的 `AdmissionReview` 的内容起作用,并且不要进行额外更改。 如果 Webhook 没有任何副作用,则 `.webhooks[].sideEffects` 字段应设置为 `None`。 From 0111d739b6272e27c6289c768c0f11e0197a9a99 Mon Sep 17 00:00:00 2001 From: xin gu <418294249@qq.com> Date: Tue, 25 Apr 2023 21:48:06 +0800 Subject: [PATCH 15/22] sync status.md sync status.md --- .../reference/kubernetes-api/common-definitions/status.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/zh-cn/docs/reference/kubernetes-api/common-definitions/status.md b/content/zh-cn/docs/reference/kubernetes-api/common-definitions/status.md index 0d3a6de1b6f..3b1a2be574e 100644 --- a/content/zh-cn/docs/reference/kubernetes-api/common-definitions/status.md +++ b/content/zh-cn/docs/reference/kubernetes-api/common-definitions/status.md @@ -158,10 +158,10 @@ guide. You can file document formatting bugs against the 资源的 UID(当有单个可以描述的资源时)。 - 更多信息: http://kubernetes.io/docs/user-guide/identifiers#uids + 更多信息: https://kubernetes.io/zh-cn/docs/concepts/overview/working-with-objects/names#uids - **kind** (string) From f71a86263d07cc87932ee86aef1fbc8ac4e53a7b Mon Sep 17 00:00:00 2001 From: Peter Schuurman Date: Tue, 25 Apr 2023 10:38:02 -0700 Subject: [PATCH 16/22] Update title to reflect k8s 1.27 --- content/en/blog/_posts/2023-04-28-statefulset-migration.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/en/blog/_posts/2023-04-28-statefulset-migration.md b/content/en/blog/_posts/2023-04-28-statefulset-migration.md index b01e75a0565..6f52bf2eab7 100644 --- a/content/en/blog/_posts/2023-04-28-statefulset-migration.md +++ b/content/en/blog/_posts/2023-04-28-statefulset-migration.md @@ -1,6 +1,6 @@ --- layout: blog -title: "Kubernetes 1.26: StatefulSet Start Ordinal Simplifies Migration" +title: "Kubernetes 1.27: StatefulSet Start Ordinal Simplifies Migration" date: 2023-04-28 slug: 
statefulset-start-ordinal --- From fe878980eb28071af2c20be75c165a4d1f475f19 Mon Sep 17 00:00:00 2001 From: Kensei Nakada Date: Wed, 26 Apr 2023 07:58:00 +0900 Subject: [PATCH 17/22] Add the deprecation notice of KubeSchedulerConfiguration v1beta3 --- content/en/docs/reference/scheduling/config.md | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/content/en/docs/reference/scheduling/config.md b/content/en/docs/reference/scheduling/config.md index 86391158266..d1ba65585a9 100644 --- a/content/en/docs/reference/scheduling/config.md +++ b/content/en/docs/reference/scheduling/config.md @@ -20,8 +20,7 @@ by implementing one or more of these extension points. You can specify scheduling profiles by running `kube-scheduler --config `, using the -KubeSchedulerConfiguration ([v1beta3](/docs/reference/config-api/kube-scheduler-config.v1beta3/) -or [v1](/docs/reference/config-api/kube-scheduler-config.v1/)) +KubeSchedulerConfiguration [v1](/docs/reference/config-api/kube-scheduler-config.v1/) struct. A minimal configuration looks as follows: @@ -35,9 +34,10 @@ clientConnection: {{< note >}} KubeSchedulerConfiguration [v1beta2](/docs/reference/config-api/kube-scheduler-config.v1beta2/) - is deprecated in v1.25 and will be removed in v1.26. Please migrate KubeSchedulerConfiguration to - [v1beta3](/docs/reference/config-api/kube-scheduler-config.v1beta3/) or [v1](/docs/reference/config-api/kube-scheduler-config.v1/) - before upgrading Kubernetes to v1.25. + is deprecated in v1.25 and will be removed in v1.28. + KubeSchedulerConfiguration [v1beta3](/docs/reference/config-api/kube-scheduler-config.v1beta3/) + is deprecated in v1.26 and will be removed in v1.29. + Please migrate KubeSchedulerConfiguration to [v1](/docs/reference/config-api/kube-scheduler-config.v1/). 
{{< /note >}} ## Profiles From cd03bdddcf4eabf8d104418213b2acc1666233a7 Mon Sep 17 00:00:00 2001 From: xing-yang Date: Mon, 10 Apr 2023 03:31:30 +0000 Subject: [PATCH 18/22] Address review comments --- .../2023-05-08-volume-group-snapshot-alpha.md | 97 +++++++++++-------- 1 file changed, 56 insertions(+), 41 deletions(-) diff --git a/content/en/blog/_posts/2023-05-08-volume-group-snapshot-alpha.md b/content/en/blog/_posts/2023-05-08-volume-group-snapshot-alpha.md index 14c396449d3..0bc231c3933 100644 --- a/content/en/blog/_posts/2023-05-08-volume-group-snapshot-alpha.md +++ b/content/en/blog/_posts/2023-05-08-volume-group-snapshot-alpha.md @@ -1,19 +1,19 @@ --- layout: blog -title: "Introducing Volume Group Snapshot" -date: 2023-05-08T10:00:00-08:00 +title: "Kubernetes 1.27: Introducing An API For Volume Group Snapshots" +date: 2023-05-08 slug: kubernetes-1-27-volume-group-snapshot-alpha --- **Author:** Xing Yang (VMware) Volume group snapshot is introduced as an Alpha feature in Kubernetes v1.27. -This feature introduces a Kubernetes API that allows users to take a crash consistent -snapshot for multiple volumes together. It uses a label selector to group multiple -PersistentVolumeClaims for snapshotting. -This new feature is only supported for CSI volume drivers. +This feature introduces a Kubernetes API that allows users to take crash consistent +snapshots for multiple volumes together. It uses a label selector to group multiple +`PersistentVolumeClaims` for snapshotting. +This new feature is only supported for [CSI](https://kubernetes-csi.github.io/docs/) volume drivers. -## What is Volume Group Snapshot +## An overview of volume group snapshots Some storage systems provide the ability to create a crash consistent snapshot of multiple volumes. A group snapshot represents “copies” from multiple volumes that @@ -21,7 +21,7 @@ are taken at the same point-in-time. 
A group snapshot can be used either to rehydrate new volumes (pre-populated with the snapshot data) or to restore existing volumes to a previous state (represented by the snapshots).
-## Why add Volume Group Snapshots to Kubernetes?
+## Why add volume group snapshots to Kubernetes?
The Kubernetes volume plugin system already provides a powerful abstraction that automates the provisioning, attaching, mounting, resizing, and snapshotting of block
@@ -30,9 +30,9 @@ and file storage.
Underpinning all these features is the Kubernetes goal of workload portability: Kubernetes aims to create an abstraction layer between distributed applications and underlying clusters so that applications can be agnostic to the specifics of the
-cluster they run on and application deployment requires no “cluster specific” knowledge.
+cluster they run on and application deployment requires no cluster specific knowledge.
-There is already a [VolumeSnapshot API](https://github.com/kubernetes/enhancements/tree/master/keps/sig-storage/177-volume-snapshot)
+There is already a [VolumeSnapshot](/docs/concepts/storage/volume-snapshots/) API
that provides the ability to take a snapshot of a persistent volume to protect against data loss or data corruption. However, there are other snapshotting functionalities not covered by the VolumeSnapshot API.
@@ -45,25 +45,28 @@ If snapshots for the data volume and the logs volume are taken at different time
the application will not be consistent and will not function properly if it is restored from those snapshots when a disaster strikes.
-It is true that we can quiesce the application first, take an individual snapshot from
+It is true that you can quiesce the application first, take an individual snapshot from
each volume that is part of the application one after the other, and then unquiesce the
-application after all the individual snapshots are taken. This way we will get application
-consistent snapshots.
-However, application quiesce is time consuming.
Sometimes it may not be possible to
-quiesce an application. Taking individual snapshots one after another may also take
-longer time compared to taking a consistent group snapshot. Some users may not want
-to do application quiesce very frequently for these reasons. For example, a user may
-want to run weekly backups with application quiesce and nightly backups without
-application quiesce but with consistent group support which provides crash consistency
-across all volumes in the group.
+application after all the individual snapshots are taken. This way, you would get
+application consistent snapshots.
+
+However, sometimes it may not be possible to quiesce an application, or the application
+quiesce can be too expensive, so you want to do it less frequently. Taking individual
+snapshots one after another may also take a longer time compared to taking a consistent
+group snapshot. Some users may not want to do application quiesce very often for these
+reasons. For example, a user may want to run weekly backups with application quiesce
+and nightly backups without application quiesce but with consistent group support which
+provides crash consistency across all volumes in the group.
## Kubernetes Volume Group Snapshots API
-Kubernetes Volume Group Snapshots introduce [three new API objects](https://github.com/kubernetes-csi/external-snapshotter/blob/master/client/apis/volumegroupsnapshot/v1alpha1/types.go) for managing snapshots:
+Kubernetes Volume Group Snapshots introduce [three new API
+objects](https://github.com/kubernetes-csi/external-snapshotter/blob/master/client/apis/volumegroupsnapshot/v1alpha1/types.go)
+for managing snapshots:
`VolumeGroupSnapshot`
: Created by a Kubernetes user (or perhaps by your own automation) to request
-creation of a volume group snapshot for multiple volumes.
+creation of a volume group snapshot for multiple persistent volume claims.
It contains information about the volume group snapshot operation such as the timestamp when the volume group snapshot was taken and whether it is ready to use. The creation and deletion of this object represents a desire to create or delete a @@ -81,41 +84,50 @@ was created with a one-to-one mapping. : Created by cluster administrators to describe how volume group snapshots should be created. including the driver information, the deletion policy, etc. -The Volume Group Snapshot objects are defined as CustomResourceDefinitions (CRDs). +These three API kinds are defined as CustomResourceDefinitions (CRDs). These CRDs must be installed in a Kubernetes cluster for a CSI Driver to support volume group snapshots. ## How do I use Kubernetes Volume Group Snapshots -Volume Group Snapshot feature is implemented in the +Volume group snapshots are implemented in the [external-snapshotter](https://github.com/kubernetes-csi/external-snapshotter) repository. Implementing volume group snapshots meant adding or changing several components: -* Kubernetes Volume Group Snapshot CRDs +* Added new CustomResourceDefinitions for VolumeGroupSnapshot and two supporting APIs. * Volume group snapshot controller logic is added to the common snapshot controller. * Volume group snapshot validation webhook logic is added to the common snapshot validation webhook. -* Logic to make CSI calls is added to CSI Snapshotter sidecar controller. +* Adding logic to make CSI calls into the snapshotter sidecar controller. The volume snapshot controller, CRDs, and validation webhook are deployed once per cluster, while the sidecar is bundled with each CSI driver. Therefore, it makes sense to deploy the volume snapshot controller, CRDs, and validation -webhook as a cluster addon. It is strongly recommended that Kubernetes distributors +webhook as a cluster addon. 
I strongly recommend that Kubernetes distributors bundle and deploy the volume snapshot controller, CRDs, and validation webhook as part of their Kubernetes cluster management process (independent of any CSI Driver). ### Creating a new group snapshot with Kubernetes Once a VolumeGroupSnapshotClass object is defined and you have volumes you want to -snapshot together, you may create a new group snapshot by creating a VolumeGroupSnapshot +snapshot together, you may request a new group snapshot by creating a VolumeGroupSnapshot object. The source of the group snapshot specifies whether the underlying group snapshot should be dynamically created or if a pre-existing VolumeGroupSnapshotContent -should be used. One of the following members in the source must be set. +should be used. -* Selector - Selector is a label query over persistent volume claims that are to be grouped together for snapshotting. This labelSelector will be used to match the label added to a PVC. -* VolumeGroupSnapshotContentName - specifies the name of a pre-existing VolumeGroupSnapshotContent object representing an existing volume group snapshot. +A pre-existing VolumeGroupSnapshotContent is created by a cluster administrator. +It contains the details of the real volume group snapshot on the storage system which +is available for use by cluster users. + +One of the following members in the source of the group snapshot must be set. + +* `selector` - a label query over PersistentVolumeClaims that are to be grouped + together for snapshotting. This labelSelector will be used to match the label + added to a PVC. +* `volumeGroupSnapshotContentName` - specifies the name of a pre-existing + VolumeGroupSnapshotContent object representing an existing volume group snapshot. For dynamic provisioning, a selector must be set so that the snapshot controller can find PVCs with the matching labels to be snapshotted together. 
@@ -130,7 +142,8 @@ spec: volumeGroupSnapshotClassName: csi-groupSnapclass source: selector: - group: myGroup + matchLabels: + group: myGroup ``` In the VolumeGroupSnapshot spec, a user can specify the VolumeGroupSnapshotClass which @@ -187,7 +200,7 @@ snapshots that are part of a group snapshot. ## As a storage vendor, how do I add support for group snapshots to my CSI driver? -To implement the volume group snapshot feature, a CSI driver MUST: +To implement the volume group snapshot feature, a CSI driver **must**: * Implement a new group controller service. * Implement group controller RPCs: `CreateVolumeGroupSnapshot`, `DeleteVolumeGroupSnapshot`, and `GetVolumeGroupSnapshot`. @@ -197,7 +210,6 @@ See the [CSI spec](https://github.com/container-storage-interface/spec/blob/mast and the [Kubernetes-CSI Driver Developer Guide](https://kubernetes-csi.github.io/docs/) for more details. -Although Kubernetes poses as little prescriptive on the packaging and deployment of a CSI Volume Driver as possible, it provides a suggested mechanism to deploy a containerized CSI driver to simplify the process. @@ -215,8 +227,11 @@ The external-snapshotter watches the Kubernetes API server for the The alpha implementation of volume group snapshots for Kubernetes has the following limitations: -* Does not support reverting an existing PVC to an earlier state represented by a snapshot that is part of a group snapshot (only supports provisioning a new volume from a snapshot). -* No application consistency guarantees beyond any guarantees provided by the storage system (e.g. crash consistency). +* Does not support reverting an existing PVC to an earlier state represented by + a snapshot (only supports provisioning a new volume from a snapshot). +* No application consistency guarantees beyond any guarantees provided by the storage system + (e.g. crash consistency). 
See this [doc](https://github.com/kubernetes/community/blob/master/wg-data-protection/data-protection-workflows-white-paper.md#quiesce-and-unquiesce-hooks) + for more discussions on application consistency. ## What’s next? @@ -227,11 +242,11 @@ replication group, volume placement, application quiescing, changed block tracki ## How can I learn more? -The design spec for the volume group snapshot feature is [here](https://github.com/kubernetes/enhancements/tree/master/keps/sig-storage/3476-volume-group-snapshot). - -The code repository for volume group snapshot APIs and controller is [here](https://github.com/kubernetes-csi/external-snapshotter). - -Check out additional documentation on the group snapshot feature [here](https://kubernetes-csi.github.io/docs/). +- The [design spec](https://github.com/kubernetes/enhancements/tree/master/keps/sig-storage/3476-volume-group-snapshot) + for the volume group snapshot feature. +- The [code repository](https://github.com/kubernetes-csi/external-snapshotter) for volume group + snapshot APIs and controller. +- CSI [documentation](https://kubernetes-csi.github.io/docs/) on the group snapshot feature. ## How do I get involved? 
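Editor's note on the group snapshot patch above: the blog's example VolumeGroupSnapshot references a VolumeGroupSnapshotClass named `csi-groupSnapclass`, but the patch does not show that class. For context, a minimal sketch of such a class is given below. The API group and version match the v1alpha1 CRDs from the external-snapshotter repository; the driver name and deletion policy are illustrative assumptions, not values taken from the patch.

```yaml
# Sketch of a VolumeGroupSnapshotClass for the alpha (v1alpha1) API.
# The driver name and deletionPolicy are illustrative assumptions;
# substitute the name your CSI driver registers.
apiVersion: groupsnapshot.storage.k8s.io/v1alpha1
kind: VolumeGroupSnapshotClass
metadata:
  name: csi-groupSnapclass
driver: hostpath.csi.k8s.io
deletionPolicy: Delete
```

A dynamically provisioned VolumeGroupSnapshot that sets `volumeGroupSnapshotClassName: csi-groupSnapclass` would then use this class to decide which CSI driver handles the group snapshot and what happens to the bound VolumeGroupSnapshotContent on deletion.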
From 136bdad0d4a00c296da6ca72baa09731aba729c7 Mon Sep 17 00:00:00 2001 From: Niranjan Darshan Date: Wed, 26 Apr 2023 08:30:16 +0530 Subject: [PATCH 19/22] Fixed broken link blog 2022 (#40663) * Fixed broken link blog 2022 * added correct link --- content/en/blog/_posts/2022-12-27-cpumanager-goes-GA.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/en/blog/_posts/2022-12-27-cpumanager-goes-GA.md b/content/en/blog/_posts/2022-12-27-cpumanager-goes-GA.md index d1edc4575b0..d7ced81802d 100644 --- a/content/en/blog/_posts/2022-12-27-cpumanager-goes-GA.md +++ b/content/en/blog/_posts/2022-12-27-cpumanager-goes-GA.md @@ -40,7 +40,7 @@ compatible behavior when disabled, and to document how to interact with each oth This enabled the Kubernetes project to graduate to GA the CPU Manager core component and core CPU allocation algorithms to GA, while also enabling a new age of experimentation in this area. -In Kubernetes v1.26, the CPU Manager supports [three different policy options](/docs/tasks/administer-cluster/cpu-management-policies.md#static-policy-options): +In Kubernetes v1.26, the CPU Manager supports [three different policy options](/docs/tasks/administer-cluster/cpu-management-policies#static-policy-options): `full-pcpus-only` : restrict the CPU Manager core allocation algorithm to full physical cores only, reducing noisy neighbor issues from hardware technologies that allow sharing cores. 
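As a companion to the `full-pcpus-only` policy option described in the CPU Manager patch above, the option is switched on through the kubelet configuration. The fragment below is a sketch rather than a complete kubelet configuration; all unrelated KubeletConfiguration fields are omitted.

```yaml
# Illustrative KubeletConfiguration fragment: enables the static CPU manager
# policy and the full-pcpus-only policy option discussed above.
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cpuManagerPolicy: static
cpuManagerPolicyOptions:
  full-pcpus-only: "true"
```

Note that changing the CPU manager policy on an existing node typically also requires removing the CPU manager state file before restarting the kubelet; consult the CPU management policies task page for the exact procedure.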
From 235c5be1b773135ebfc510ba966ad3751684aef2 Mon Sep 17 00:00:00 2001 From: xin gu <418294249@qq.com> Date: Tue, 25 Apr 2023 21:54:13 +0800 Subject: [PATCH 20/22] sync service-topology.md sync service-topology.md --- .../docs/concepts/services-networking/service-topology.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/zh-cn/docs/concepts/services-networking/service-topology.md b/content/zh-cn/docs/concepts/services-networking/service-topology.md index f605350272e..b9a3f188a05 100644 --- a/content/zh-cn/docs/concepts/services-networking/service-topology.md +++ b/content/zh-cn/docs/concepts/services-networking/service-topology.md @@ -20,12 +20,12 @@ weight: 150 此功能特性,尤其是 Alpha 阶段的 `topologyKeys` API,在 Kubernetes v1.21 版本中已被废弃。Kubernetes v1.21 版本中引入的 -[拓扑感知的提示](/zh-cn/docs/concepts/services-networking/topology-aware-hints/), +[拓扑感知路由](/zh-cn/docs/concepts/services-networking/topology-aware-routing/), 提供类似的功能。 {{}} From d5a5370ba2614f0fe38b2e1275b0be349898cc46 Mon Sep 17 00:00:00 2001 From: ydFu Date: Wed, 26 Apr 2023 16:25:41 +0800 Subject: [PATCH 21/22] [zh] sync blog\_posts\2022-12-27-cpumanager-goes-GA.md Signed-off-by: ydFu --- content/zh-cn/blog/_posts/2022-12-27-cpumanager-goes-GA.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/zh-cn/blog/_posts/2022-12-27-cpumanager-goes-GA.md b/content/zh-cn/blog/_posts/2022-12-27-cpumanager-goes-GA.md index 9e7b876d103..48f66608bb9 100644 --- a/content/zh-cn/blog/_posts/2022-12-27-cpumanager-goes-GA.md +++ b/content/zh-cn/blog/_posts/2022-12-27-cpumanager-goes-GA.md @@ -88,11 +88,11 @@ compatible behavior when disabled, and to document how to interact with each oth 这使得 Kubernetes 项目能够将 CPU 管理器核心组件和核心 CPU 分配算法进阶至 GA,同时也开启了该领域新的实验时代。 在 Kubernetes v1.26 中,CPU -管理器支持[三个不同的策略选项](/zh-cn/docs/tasks/administer-cluster/cpu-management-policies.md#static-policy-options): +管理器支持[三个不同的策略选项](/zh-cn/docs/tasks/administer-cluster/cpu-management-policies#static-policy-options):