From 83bb609c1e918aca218d5b7a01cce6aeebfa459a Mon Sep 17 00:00:00 2001 From: Nabarun Pal Date: Fri, 30 Jun 2023 23:25:34 +0530 Subject: [PATCH 01/82] add authorization config documentation Signed-off-by: Nabarun Pal --- content/en/docs/reference/access-authn-authz/authorization.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/content/en/docs/reference/access-authn-authz/authorization.md b/content/en/docs/reference/access-authn-authz/authorization.md index 9a77c7ac211..d81b4a9f55f 100644 --- a/content/en/docs/reference/access-authn-authz/authorization.md +++ b/content/en/docs/reference/access-authn-authz/authorization.md @@ -209,6 +209,10 @@ The following flags can be used: You can choose more than one authorization module. Modules are checked in order so an earlier module has higher priority to allow or deny a request. +## Configuring the API Server using an Authorization Config File + + + ## Privilege escalation via workload creation or edits {#privilege-escalation-via-pod-creation} Users who can create/edit pods in a namespace, either directly or through a [controller](/docs/concepts/architecture/controller/) From 2ff476c80db339f3d262900ef37885ac5cca5f8d Mon Sep 17 00:00:00 2001 From: xuzhenglun Date: Tue, 29 Aug 2023 20:17:57 +0800 Subject: [PATCH 03/82] KEP-3668: Update Service and feature-gate docs for GA --- .../en/docs/concepts/services-networking/service.md | 11 +++++------ .../command-line-tools-reference/feature-gates.md | 5 +++-- 2 files changed, 8 insertions(+), 8 deletions(-) diff --git a/content/en/docs/concepts/services-networking/service.md b/content/en/docs/concepts/services-networking/service.md index d1f54d32df3..46ceed9fbf6 100644 --- a/content/en/docs/concepts/services-networking/service.md +++ b/content/en/docs/concepts/services-networking/service.md @@ -517,16 +517,15 @@ spec: #### Reserve Nodeport Ranges to avoid collisions when port assigning -{{< feature-state for_k8s_version="v1.28" state="beta" >}} +{{< feature-state 
for_k8s_version="v1.29" state="stable" >}} The policy for assigning ports to NodePort services applies to both the auto-assignment and the manual assignment scenarios. When a user wants to create a NodePort service that uses a specific port, the target port may conflict with another port that has already been assigned. -In this case, you can enable the feature gate `ServiceNodePortStaticSubrange`, which allows you -to use a different port allocation strategy for NodePort Services. The port range for NodePort services -is divided into two bands. Dynamic port assignment uses the upper band by default, and it may use -the lower band once the upper band has been exhausted. Users can then allocate from the lower band -with a lower risk of port collision. + +To avoid this problem, the port range for NodePort services is divided into two bands. +Dynamic port assignment uses the upper band by default, and it may use the lower band once the +upper band has been exhausted. Users can then allocate from the lower band with a lower risk of port collision. 
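The two-band split described above can be illustrated with a short sketch. The patch text does not state how the boundary between the bands is chosen; the formula below, `min(max(16, size / 32), 128)`, is an assumption taken from KEP-3668 as best recalled, and the function name `nodeport_bands` is hypothetical.

```python
def nodeport_bands(start=30000, end=32767):
    """Split an inclusive NodePort range into (static_band, dynamic_band).

    ASSUMPTION: the band offset follows min(max(16, size // 32), 128),
    as recalled from KEP-3668. It is not stated in this patch.
    """
    size = end - start + 1
    offset = min(max(16, size // 32), 128)
    # Guard for very small ranges so the static band never exceeds the range.
    offset = min(offset, size)
    static_band = range(start, start + offset)      # lower band: manual picks
    dynamic_band = range(start + offset, end + 1)   # upper band: auto-assigned first
    return static_band, dynamic_band


static_band, dynamic_band = nodeport_bands()
print(f"static:  {static_band.start}-{static_band.stop - 1}")
print(f"dynamic: {dynamic_band.start}-{dynamic_band.stop - 1}")
```

Under that assumption, picking a manual `nodePort` from the lower band keeps it away from the ports that dynamic assignment tries first.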
#### Custom IP address configuration for `type: NodePort` Services {#service-nodeport-custom-listen-address} diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates.md b/content/en/docs/reference/command-line-tools-reference/feature-gates.md index 797e123e55c..6e1ecef7ea1 100644 --- a/content/en/docs/reference/command-line-tools-reference/feature-gates.md +++ b/content/en/docs/reference/command-line-tools-reference/feature-gates.md @@ -186,8 +186,6 @@ For a reference to old feature gates that are removed, please refer to | `SELinuxMountReadWriteOncePod` | `false` | Alpha | 1.25 | 1.26 | | `SELinuxMountReadWriteOncePod` | `true` | Beta | 1.27 | | | `SecurityContextDeny` | `false` | Alpha | 1.27 | | -| `ServiceNodePortStaticSubrange` | `false` | Alpha | 1.27 | 1.27 | -| `ServiceNodePortStaticSubrange` | `true` | Beta | 1.28 | | | `SidecarContainers` | `false` | Alpha | 1.28 | | | `SizeMemoryBackedVolumes` | `false` | Alpha | 1.20 | 1.21 | | `SizeMemoryBackedVolumes` | `true` | Beta | 1.22 | | @@ -341,6 +339,9 @@ For a reference to old feature gates that are removed, please refer to | `ServiceInternalTrafficPolicy` | `false` | Alpha | 1.21 | 1.21 | | `ServiceInternalTrafficPolicy` | `true` | Beta | 1.22 | 1.25 | | `ServiceInternalTrafficPolicy` | `true` | GA | 1.26 | - | +| `ServiceNodePortStaticSubrange` | `false` | Alpha | 1.27 | 1.27 | +| `ServiceNodePortStaticSubrange` | `true` | Beta | 1.28 | 1.28 | +| `ServiceNodePortStaticSubrange` | `true` | GA | 1.29 | - | | `TopologyManager` | `false` | Alpha | 1.16 | 1.17 | | `TopologyManager` | `true` | Beta | 1.18 | 1.26 | | `TopologyManager` | `true` | GA | 1.27 | - | From ac5112ef0c842c2fa4ad58bb35a9bbbf07bdddf0 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Wojciech=20Tyczy=C5=84ski?= Date: Thu, 31 Aug 2023 08:20:16 +0200 Subject: [PATCH 04/82] Graduate APIListChunking to GA documentation --- .../command-line-tools-reference/feature-gates.md | 7 ++++--- 
content/en/docs/reference/using-api/api-concepts.md | 6 ++---- 2 files changed, 6 insertions(+), 7 deletions(-) diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates.md b/content/en/docs/reference/command-line-tools-reference/feature-gates.md index 797e123e55c..2987460453f 100644 --- a/content/en/docs/reference/command-line-tools-reference/feature-gates.md +++ b/content/en/docs/reference/command-line-tools-reference/feature-gates.md @@ -56,8 +56,6 @@ For a reference to old feature gates that are removed, please refer to | Feature | Default | Stage | Since | Until | |---------|---------|-------|-------|-------| -| `APIListChunking` | `false` | Alpha | 1.8 | 1.8 | -| `APIListChunking` | `true` | Beta | 1.9 | | | `APIPriorityAndFairness` | `false` | Alpha | 1.18 | 1.19 | | `APIPriorityAndFairness` | `true` | Beta | 1.20 | | | `APIResponseCompression` | `false` | Alpha | 1.7 | 1.15 | @@ -225,6 +223,9 @@ For a reference to old feature gates that are removed, please refer to | Feature | Default | Stage | Since | Until | |---------|---------|-------|-------|-------| +| `APIListChunking` | `false` | Alpha | 1.8 | 1.8 | +| `APIListChunking` | `true` | Beta | 1.9 | 1.28 | +| `APIListChunking` | `true` | GA | 1.29 | - | | `APISelfSubjectReview` | `false` | Alpha | 1.26 | 1.26 | | `APISelfSubjectReview` | `true` | Beta | 1.27 | 1.27 | | `APISelfSubjectReview` | `true` | GA | 1.28 | - | @@ -805,4 +806,4 @@ Each feature gate is designed for enabling/disabling a specific feature: feature, you will also need to enable any associated API resources. For example, to enable a particular resource like `storage.k8s.io/v1beta1/csistoragecapacities`, set `--runtime-config=storage.k8s.io/v1beta1/csistoragecapacities`. - See [API Versioning](/docs/reference/using-api/#api-versioning) for more details on the command line flags. \ No newline at end of file + See [API Versioning](/docs/reference/using-api/#api-versioning) for more details on the command line flags. 
diff --git a/content/en/docs/reference/using-api/api-concepts.md b/content/en/docs/reference/using-api/api-concepts.md index e6d973a1d5a..a29ce4283c7 100644 --- a/content/en/docs/reference/using-api/api-concepts.md +++ b/content/en/docs/reference/using-api/api-concepts.md @@ -316,7 +316,7 @@ The `content-encoding` header indicates that the response is compressed with `gz ## Retrieving large results sets in chunks -{{< feature-state for_k8s_version="v1.9" state="beta" >}} +{{< feature-state for_k8s_version="v1.29" state="stable" >}} On large clusters, retrieving the collection of some resource types may result in very large responses that can impact the server and client. For instance, a cluster @@ -324,9 +324,7 @@ may have tens of thousands of Pods, each of which is equivalent to roughly 2 KiB encoded JSON. Retrieving all pods across all namespaces may result in a very large response (10-20MB) and consume a large amount of server resources. -Provided that you don't explicitly disable the `APIListChunking` -[feature gate](/docs/reference/command-line-tools-reference/feature-gates/), the -Kubernetes API server supports the ability to break a single large collection request +The Kubernetes API server supports the ability to break a single large collection request into many smaller chunks while preserving the consistency of the total request. Each chunk can be returned sequentially which reduces both the total size of the request and allows user-oriented clients to display results incrementally to improve responsiveness. 
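As a rough illustration of the chunked retrieval described above, the sketch below loops over pages using the `limit` and `continue` request parameters and the `metadata.continue` field of each response. The `make_fake_server` helper is a hypothetical in-memory stand-in for the API server, and its continue tokens are plain offsets; real tokens are opaque and should be passed back unchanged.

```python
def list_in_chunks(fetch_page, limit=500):
    """Yield every item of a collection by following continue tokens.

    fetch_page(limit, continue_token) must return a dict shaped like a
    Kubernetes list response: {"items": [...], "metadata": {"continue": ...}}.
    """
    token = None
    while True:
        page = fetch_page(limit, token)
        yield from page.get("items", [])
        token = page.get("metadata", {}).get("continue")
        if not token:  # a missing or empty token means this was the last chunk
            return


def make_fake_server(total):
    """Hypothetical stand-in for the API server, for illustration only."""
    pods = [{"metadata": {"name": f"pod-{i}"}} for i in range(total)]

    def fetch_page(limit, token):
        start = int(token) if token else 0
        end = min(start + limit, total)
        metadata = {"continue": str(end)} if end < total else {}
        return {"items": pods[start:end], "metadata": metadata}

    return fetch_page


names = [p["metadata"]["name"] for p in list_in_chunks(make_fake_server(1203))]
print(len(names), names[0], names[-1])
```

A client that displays results incrementally would consume the generator page by page instead of collecting everything into a list.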
From d26e66def0f22befb20e228ba29842d2e5e8e45f Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Arda=20G=C3=BC=C3=A7l=C3=BC?= Date: Wed, 6 Sep 2023 09:04:42 +0300 Subject: [PATCH 05/82] Remove alpha environment variable because feature is in beta --- content/en/docs/reference/kubectl/kubectl.md | 8 -------- 1 file changed, 8 deletions(-) diff --git a/content/en/docs/reference/kubectl/kubectl.md b/content/en/docs/reference/kubectl/kubectl.md index 8d6e8aae0c3..aa92d9f9685 100644 --- a/content/en/docs/reference/kubectl/kubectl.md +++ b/content/en/docs/reference/kubectl/kubectl.md @@ -369,14 +369,6 @@ kubectl [flags] - -KUBECTL_INTERACTIVE_DELETE - - -When set to true, the --interactive flag in the kubectl delete command will be activated, allowing users to preview and confirm resources before proceeding to delete by passing this flag. - - - From 38baef2152d280d01036ef60e884e9caa4fee07b Mon Sep 17 00:00:00 2001 From: Kat Cosgrove Date: Sun, 17 Sep 2023 12:56:33 +0100 Subject: [PATCH 06/82] updates hugo.toml for 1.29 release --- hugo.toml | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/hugo.toml b/hugo.toml index d8d116205ca..2ac642f6bc5 100644 --- a/hugo.toml +++ b/hugo.toml @@ -138,9 +138,9 @@ time_format_default = "January 02, 2006 at 3:04 PM PST" description = "Production-Grade Container Orchestration" showedit = true -latest = "v1.28" +latest = "v1.29" -version = "v1.28" +version = "v1.29" githubbranch = "main" docsbranch = "main" deprecated = false @@ -180,11 +180,17 @@ js = [ ] [[params.versions]] -version = "v1.28" -githubbranch = "v1.28.0" +version = "v1.29" +githubbranch = "v1.29.0" docsbranch = "main" url = "https://kubernetes.io" +[[params.versions]] +version = "v1.28" +githubbranch = "v1.28.2" +docsbranch = "release-1.28" +url = "https://v1-28.docs.kubernetes.io" + [[params.versions]] version = "v1.27" githubbranch = "v1.27.4" @@ -203,12 +209,6 @@ githubbranch = "v1.25.12" docsbranch = "release-1.25" url = 
"https://v1-25.docs.kubernetes.io" -[[params.versions]] -version = "v1.24" -githubbranch = "v1.24.16" -docsbranch = "release-1.24" -url = "https://v1-24.docs.kubernetes.io" - # User interface configuration [params.ui] # Enable to show the side bar menu in its compact state. From ba282349285bca35661b1da78761621df268cbe0 Mon Sep 17 00:00:00 2001 From: Dan Winship Date: Mon, 4 Sep 2023 09:42:04 -0400 Subject: [PATCH 07/82] update CloudDualStackNodeIPs to beta --- .../services-networking/dual-stack.md | 19 ++++++------------- .../feature-gates.md | 3 ++- 2 files changed, 8 insertions(+), 14 deletions(-) diff --git a/content/en/docs/concepts/services-networking/dual-stack.md b/content/en/docs/concepts/services-networking/dual-stack.md index b9e943daa54..e4aa246b552 100644 --- a/content/en/docs/concepts/services-networking/dual-stack.md +++ b/content/en/docs/concepts/services-networking/dual-stack.md @@ -65,12 +65,12 @@ To configure IPv4/IPv6 dual-stack, set dual-stack cluster network assignments: * kube-proxy: * `--cluster-cidr=,` * kubelet: - * when there is no `--cloud-provider` the administrator can pass a comma-separated pair of IP - addresses via `--node-ip` to manually configure dual-stack `.status.addresses` for that Node. - If a Pod runs on that node in HostNetwork mode, the Pod reports these IP addresses in its - `.status.podIPs` field. - All `podIPs` in a node match the IP family preference defined by the `.status.addresses` - field for that Node. + * `--node-ip=,` + * This option is required for bare metal dual-stack nodes (nodes that do not define a + cloud provider with the `--cloud-provider` flag). If you are using a cloud provider + and choose to override the node IPs chosen by the cloud provider, set the + `--node-ip` option. + * (The legacy built-in cloud providers do not support dual-stack `--node-ip`.) 
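A dual-stack `--node-ip` value is a comma-separated pair of addresses. The sketch below is one way to sanity-check such a value before passing it to kubelet. It assumes that a valid pair holds exactly one IPv4 and one IPv6 address; `parse_node_ip` is a hypothetical helper and is not kubelet's own validation code.

```python
import ipaddress


def parse_node_ip(value):
    """Parse a --node-ip style value into a list of ip_address objects.

    ASSUMPTION: one address is single-stack; two addresses must be one
    IPv4 and one IPv6. This only approximates what kubelet accepts.
    """
    addrs = [ipaddress.ip_address(part.strip()) for part in value.split(",")]
    if len(addrs) > 2:
        raise ValueError("at most two addresses are expected")
    if len(addrs) == 2 and addrs[0].version == addrs[1].version:
        raise ValueError("a dual-stack pair needs one IPv4 and one IPv6 address")
    return addrs


# Documentation-range addresses, used here purely as placeholders.
for addr in parse_node_ip("192.0.2.10,2001:db8::10"):
    print(addr.version, addr)
```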
{{< note >}} An example of an IPv4 CIDR: `10.244.0.0/16` (though you would supply your own address range) @@ -79,13 +79,6 @@ An example of an IPv6 CIDR: `fdXY:IJKL:MNOP:15::/64` (this shows the format but address - see [RFC 4193](https://tools.ietf.org/html/rfc4193)) {{< /note >}} -{{< feature-state for_k8s_version="v1.27" state="alpha" >}} - -When using an external cloud provider, you can pass a dual-stack `--node-ip` value to -kubelet if you enable the `CloudDualStackNodeIPs` feature gate in both kubelet and the -external cloud provider. This is only supported for cloud providers that support dual -stack clusters. - ## Services You can create {{< glossary_tooltip text="Services" term_id="service" >}} which can use IPv4, IPv6, or both. diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates.md b/content/en/docs/reference/command-line-tools-reference/feature-gates.md index 797e123e55c..d881f5acf32 100644 --- a/content/en/docs/reference/command-line-tools-reference/feature-gates.md +++ b/content/en/docs/reference/command-line-tools-reference/feature-gates.md @@ -83,7 +83,8 @@ For a reference to old feature gates that are removed, please refer to | `CSINodeExpandSecret` | `true` | Beta | 1.27 | | | `CSIVolumeHealth` | `false` | Alpha | 1.21 | | | `CloudControllerManagerWebhook` | false | Alpha | 1.27 | | -| `CloudDualStackNodeIPs` | false | Alpha | 1.27 | | +| `CloudDualStackNodeIPs` | false | Alpha | 1.27 | 1.28 | +| `CloudDualStackNodeIPs` | true | Beta | 1.29 | | | `ClusterTrustBundle` | false | Alpha | 1.27 | | | `ComponentSLIs` | `false` | Alpha | 1.26 | 1.26 | | `ComponentSLIs` | `true` | Beta | 1.27 | | From 3768e3f81389d86e4a4b0934a38f0a4a109ae14b Mon Sep 17 00:00:00 2001 From: SataQiu Date: Mon, 16 Oct 2023 09:51:08 +0800 Subject: [PATCH 08/82] move feature gates CronJobTimeZone, JobMutableNodeSchedulingDirectives and LegacyServiceAccountTokenNoAutoGeneration to feature-gates-removed page --- .../feature-gates-removed.md | 23 
+++++++++++++++---- .../feature-gates.md | 12 ---------- 2 files changed, 19 insertions(+), 16 deletions(-) diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates-removed.md b/content/en/docs/reference/command-line-tools-reference/feature-gates-removed.md index 75cf88333d0..5882f09a4f0 100644 --- a/content/en/docs/reference/command-line-tools-reference/feature-gates-removed.md +++ b/content/en/docs/reference/command-line-tools-reference/feature-gates-removed.md @@ -120,6 +120,9 @@ In the following table: | `CronJobControllerV2` | `false` | Alpha | 1.20 | 1.20 | | `CronJobControllerV2` | `true` | Beta | 1.21 | 1.21 | | `CronJobControllerV2` | `true` | GA | 1.22 | 1.23 | +| `CronJobTimeZone` | `false` | Alpha | 1.24 | 1.24 | +| `CronJobTimeZone` | `true` | Beta | 1.25 | 1.26 | +| `CronJobTimeZone` | `true` | GA | 1.27 | 1.28 | | `CustomPodDNS` | `false` | Alpha | 1.9 | 1.9 | | `CustomPodDNS` | `true` | Beta| 1.10 | 1.13 | | `CustomPodDNS` | `true` | GA | 1.14 | 1.16 | @@ -228,6 +231,8 @@ In the following table: | `IngressClassNamespacedParams` | `true` | GA | 1.23 | 1.24 | | `Initializers` | `false` | Alpha | 1.7 | 1.13 | | `Initializers` | - | Deprecated | 1.14 | 1.14 | +| `JobMutableNodeSchedulingDirectives` | `true` | Beta | 1.23 | 1.26 | +| `JobMutableNodeSchedulingDirectives` | `true` | GA | 1.27 | 1.28 | | `KMSv1` | `true` | Deprecated | 1.28 | | | `KubeletConfigFile` | `false` | Alpha | 1.8 | 1.9 | | `KubeletConfigFile` | - | Deprecated | 1.10 | 1.10 | @@ -240,6 +245,8 @@ In the following table: | `LegacyNodeRoleBehavior` | `false` | Alpha | 1.16 | 1.18 | | `LegacyNodeRoleBehavior` | `true` | Beta | 1.19 | 1.20 | | `LegacyNodeRoleBehavior` | `false` | GA | 1.21 | 1.22 | +| `LegacyServiceAccountTokenNoAutoGeneration` | `true` | Beta | 1.24 | 1.25 | +| `LegacyServiceAccountTokenNoAutoGeneration` | `true` | GA | 1.26 | 1.28 | | `LocalStorageCapacityIsolation` | `false` | Alpha | 1.7 | 1.9 | | `LocalStorageCapacityIsolation` | `true` | 
Beta | 1.10 | 1.24 | | `LocalStorageCapacityIsolation` | `true` | GA | 1.25 | 1.26 | @@ -591,10 +598,6 @@ In the following table: [Configure volume permission and ownership change policy for Pods](/docs/tasks/configure-pod-container/security-context/#configure-volume-permission-and-ownership-change-policy-for-pods) for more details. -- `CronJobControllerV2`: Use an alternative implementation of the - {{< glossary_tooltip text="CronJob" term_id="cronjob" >}} controller. Otherwise, - version 1 of the same controller is selected. - - `ControllerManagerLeaderMigration`: Enables Leader Migration for [kube-controller-manager](/docs/tasks/administer-cluster/controller-manager-leader-migration/#initial-leader-migration-configuration) and [cloud-controller-manager](/docs/tasks/administer-cluster/controller-manager-leader-migration/#deploy-cloud-controller-manager) @@ -602,6 +605,12 @@ In the following table: controllers from the kube-controller-manager into an external controller-manager (e.g. the cloud-controller-manager) in an HA cluster without downtime. +- `CronJobControllerV2`: Use an alternative implementation of the + {{< glossary_tooltip text="CronJob" term_id="cronjob" >}} controller. Otherwise, + version 1 of the same controller is selected. + +- `CronJobTimeZone`: Allow the use of the `timeZone` optional field in [CronJobs](/docs/concepts/workloads/controllers/cron-jobs/) + - `CustomPodDNS`: Enable customizing the DNS settings for a Pod using its `dnsConfig` property. Check [Pod's DNS Config](/docs/concepts/services-networking/dns-pod-service/#pods-dns-config) for more details. @@ -731,6 +740,9 @@ In the following table: - `Initializers`: Allow asynchronous coordination of object creation using the Initializers admission plugin. +- `JobMutableNodeSchedulingDirectives`: Allows updating node scheduling directives in + the pod template of [Job](/docs/concepts/workloads/controllers/job). 
+ - `KubeletConfigFile`: Enable loading kubelet configuration from a file specified using a config file. See [setting kubelet parameters via a config file](/docs/tasks/administer-cluster/kubelet-config-file/) @@ -746,6 +758,9 @@ In the following table: node disruption will ignore the `node-role.kubernetes.io/master` label in favor of the feature-specific labels provided by `NodeDisruptionExclusion` and `ServiceNodeExclusion`. +- `LegacyServiceAccountTokenNoAutoGeneration`: Stop auto-generation of Secret-based + [service account tokens](/docs/concepts/security/service-accounts/#get-a-token). + - `LocalStorageCapacityIsolation`: Enable the consumption of [local ephemeral storage](/docs/concepts/configuration/manage-resources-containers/) and also the `sizeLimit` property of an diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates.md b/content/en/docs/reference/command-line-tools-reference/feature-gates.md index 5ef1c5aa875..8701ea7d4fe 100644 --- a/content/en/docs/reference/command-line-tools-reference/feature-gates.md +++ b/content/en/docs/reference/command-line-tools-reference/feature-gates.md @@ -245,9 +245,6 @@ For a reference to old feature gates that are removed, please refer to | `CSIMigrationvSphere` | `true` | Beta | 1.25 | 1.25 | | `CSIMigrationvSphere` | `true` | GA | 1.26 | - | | `ConsistentHTTPGetHandlers` | `true` | GA | 1.25 | - | -| `CronJobTimeZone` | `false` | Alpha | 1.24 | 1.24 | -| `CronJobTimeZone` | `true` | Beta | 1.25 | 1.26 | -| `CronJobTimeZone` | `true` | GA | 1.27 | - | | `DaemonSetUpdateSurge` | `false` | Alpha | 1.21 | 1.21 | | `DaemonSetUpdateSurge` | `true` | Beta | 1.22 | 1.24 | | `DaemonSetUpdateSurge` | `true` | GA | 1.25 | | @@ -273,8 +270,6 @@ For a reference to old feature gates that are removed, please refer to | `IPTablesOwnershipCleanup` | `true` | GA | 1.28 | | | `InTreePluginRBDUnregister` | `false` | Alpha | 1.23 | 1.27 | | `InTreePluginRBDUnregister` | `false` | Deprecated | 1.28 | | -| 
`JobMutableNodeSchedulingDirectives` | `true` | Beta | 1.23 | 1.26 | -| `JobMutableNodeSchedulingDirectives` | `true` | GA | 1.27 | | | `JobTrackingWithFinalizers` | `false` | Alpha | 1.22 | 1.22 | | `JobTrackingWithFinalizers` | `false` | Beta | 1.23 | 1.24 | | `JobTrackingWithFinalizers` | `true` | Beta | 1.25 | 1.25 | @@ -286,8 +281,6 @@ For a reference to old feature gates that are removed, please refer to | `KubeletPodResourcesGetAllocatable` | `false` | Alpha | 1.21 | 1.22 | | `KubeletPodResourcesGetAllocatable` | `true` | Beta | 1.23 | 1.27 | | `KubeletPodResourcesGetAllocatable` | `true` | GA | 1.28 | | -| `LegacyServiceAccountTokenNoAutoGeneration` | `true` | Beta | 1.24 | 1.25 | -| `LegacyServiceAccountTokenNoAutoGeneration` | `true` | GA | 1.26 | | | `LegacyServiceAccountTokenTracking` | `false` | Alpha | 1.26 | 1.26 | | `LegacyServiceAccountTokenTracking` | `true` | Beta | 1.27 | 1.27 | | `LegacyServiceAccountTokenTracking` | `true` | GA | 1.28 | | @@ -457,7 +450,6 @@ Each feature gate is designed for enabling/disabling a specific feature: - `CronJobsScheduledAnnotation`: Set the scheduled job time as an {{< glossary_tooltip text="annotation" term_id="annotation" >}} on Jobs that were created on behalf of a CronJob. -- `CronJobTimeZone`: Allow the use of the `timeZone` optional field in [CronJobs](/docs/concepts/workloads/controllers/cron-jobs/) - `CRDValidationRatcheting`: Enable updates to custom resources to contain violations of their OpenAPI schema if the offending portions of the resource update did not change. See [Validation Ratcheting](/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#validation-ratcheting) for more details. @@ -550,8 +542,6 @@ Each feature gate is designed for enabling/disabling a specific feature: and volume controllers. - `InTreePluginvSphereUnregister`: Stops registering the vSphere in-tree plugin in kubelet and volume controllers. 
-- `JobMutableNodeSchedulingDirectives`: Allows updating node scheduling directives in - the pod template of [Job](/docs/concepts/workloads/controllers/job). - `JobBackoffLimitPerIndex`: Allows specifying the maximal number of pod retries per index in Indexed jobs. - `JobPodFailurePolicy`: Allow users to specify handling of pod failures based on container @@ -604,8 +594,6 @@ Each feature gate is designed for enabling/disabling a specific feature: When enabled, kubelet CRI interface and authenticated http servers are instrumented to generate OpenTelemetry trace spans. See [Traces for Kubernetes System Components](/docs/concepts/cluster-administration/system-traces) for more details. -- `LegacyServiceAccountTokenNoAutoGeneration`: Stop auto-generation of Secret-based - [service account tokens](/docs/concepts/security/service-accounts/#get-a-token). - `LegacyServiceAccountTokenCleanUp`: Enable cleaning up Secret-based [service account tokens](/docs/concepts/security/service-accounts/#get-a-token) when they are not used in a specified time (default to be one year). From d83c806f370ee2cfc70932d0df070ef682e11a28 Mon Sep 17 00:00:00 2001 From: Aohan Yang Date: Fri, 13 Oct 2023 14:57:56 +0800 Subject: [PATCH 09/82] add doc for feature LoadBalancerIPMode --- .../concepts/services-networking/service.md | 22 +++++++++++++++++++ .../feature-gates.md | 4 ++++ 2 files changed, 26 insertions(+) diff --git a/content/en/docs/concepts/services-networking/service.md b/content/en/docs/concepts/services-networking/service.md index 4cbf42455ad..9ca2cd7f6d6 100644 --- a/content/en/docs/concepts/services-networking/service.md +++ b/content/en/docs/concepts/services-networking/service.md @@ -666,6 +666,28 @@ The value of `spec.loadBalancerClass` must be a label-style identifier, with an optional prefix such as "`internal-vip`" or "`example.com/internal-vip`". Unprefixed names are reserved for end-users. 
+#### Specifying IPMode of load balancer status {#load-balancer-ip-mode} + +{{< feature-state for_k8s_version="v1.29" state="alpha" >}} + +Starting as Alpha in Kubernetes 1.29, +a [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) +named `LoadBalancerIPMode` allows you to set the `.status.loadBalancer.ingress.ipMode` +for a Service with `type` set to `LoadBalancer`. +The `.status.loadBalancer.ingress.ipMode` specifies how the load-balancer IP behaves. +It may be specified only when the `.status.loadBalancer.ingress.ip` field is also specified. + +There are two possible values for `.status.loadBalancer.ingress.ipMode`: "VIP" and "Proxy". +The default value is "VIP", meaning that traffic is delivered to the node +with the destination set to the load-balancer's IP and port. +There are two cases when setting this to "Proxy", depending on how the load-balancer +from the cloud provider delivers the traffic: + +- If the traffic is delivered to the node and then DNATed to the pod, the destination would be set to the node's IP and node port; +- If the traffic is delivered directly to the pod, the destination would be set to the pod's IP and port. + +Service implementations may use this information to adjust traffic routing. 
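The three cases above can be summarised as a small lookup. This is only a sketch of the rules as stated in the text; the function name, the `delivered_to_node` flag, and the address tuples are hypothetical and do not correspond to any Kubernetes API.

```python
def expected_destination(ip_mode, load_balancer, node, pod, delivered_to_node=True):
    """Return the (ip, port) a packet is expected to carry as its destination.

    load_balancer, node and pod are (ip, port) tuples; for node, the port
    is the node port. delivered_to_node only matters for "Proxy".
    """
    if ip_mode == "VIP":
        # Traffic reaches the node still addressed to the load balancer.
        return load_balancer
    if ip_mode == "Proxy":
        # Either node IP + node port (then DNAT to the pod), or pod IP + port.
        return node if delivered_to_node else pod
    raise ValueError(f"unknown ipMode: {ip_mode!r}")


lb = ("203.0.113.5", 80)
node = ("10.0.0.7", 31234)
pod = ("10.244.1.9", 8080)
print(expected_destination("VIP", lb, node, pod))
print(expected_destination("Proxy", lb, node, pod, delivered_to_node=False))
```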
+ #### Internal load balancer In a mixed environment it is sometimes necessary to route traffic from Services inside the same diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates.md b/content/en/docs/reference/command-line-tools-reference/feature-gates.md index 5ef1c5aa875..6fe21c711a5 100644 --- a/content/en/docs/reference/command-line-tools-reference/feature-gates.md +++ b/content/en/docs/reference/command-line-tools-reference/feature-gates.md @@ -135,6 +135,7 @@ For a reference to old feature gates that are removed, please refer to | `KubeletTracing` | `false` | Alpha | 1.25 | 1.26 | | `KubeletTracing` | `true` | Beta | 1.27 | | | `LegacyServiceAccountTokenCleanUp` | `false` | Alpha | 1.28 | | +| `LoadBalancerIPMode` | `false` | Alpha | 1.29 | | | `LocalStorageCapacityIsolationFSQuotaMonitoring` | `false` | Alpha | 1.15 | - | | `LogarithmicScaleDown` | `false` | Alpha | 1.21 | 1.21 | | `LogarithmicScaleDown` | `true` | Beta | 1.22 | | @@ -611,6 +612,9 @@ Each feature gate is designed for enabling/disabling a specific feature: when they are not used in a specified time (default to be one year). - `LegacyServiceAccountTokenTracking`: Track usage of Secret-based [service account tokens](/docs/concepts/security/service-accounts/#get-a-token). +- `LoadBalancerIPMode`: Allows setting `ipMode` for Services where `type` is set to `LoadBalancer`. + See [Specifying IPMode of load balancer status](/docs/concepts/services-networking/service/#load-balancer-ip-mode) + for more information. 
- `LocalStorageCapacityIsolationFSQuotaMonitoring`: When `LocalStorageCapacityIsolation` is enabled for [local ephemeral storage](/docs/concepts/configuration/manage-resources-containers/) From 1c73d4a8ce640bb15180a83bc54433403ec1ae43 Mon Sep 17 00:00:00 2001 From: Dave Chen Date: Mon, 16 Oct 2023 19:06:33 +0800 Subject: [PATCH 10/82] Introduce of the deprecated FG: MergeCLIArgumentsWithConfig --- .../en/docs/reference/setup-tools/kubeadm/kubeadm-init.md | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/content/en/docs/reference/setup-tools/kubeadm/kubeadm-init.md b/content/en/docs/reference/setup-tools/kubeadm/kubeadm-init.md index a766250ef1f..811bac73646 100644 --- a/content/en/docs/reference/setup-tools/kubeadm/kubeadm-init.md +++ b/content/en/docs/reference/setup-tools/kubeadm/kubeadm-init.md @@ -189,6 +189,7 @@ List of deprecated feature gates: Feature | Default :-------|:-------- `UpgradeAddonsBeforeControlPlane` | `false` +`MergeCLIArgumentsWithConfig` | `false` {{< /table >}} Feature gate descriptions: @@ -207,6 +208,13 @@ instance is upgraded. The deprecated `UpgradeAddonsBeforeControlPlane` feature g behavior. You should not need this old behavior; if you do, you should consider changing your cluster or upgrade processes, as this feature gate will be removed in a future release. + +`MergeCLIArgumentsWithConfig` +: This feature gate is introduced in Kubernetes v1.29 and defaults to `false`. Enabling this feature gate will merge the value passed +by the `--ignore-preflight-errors` flag with the value of `ignorePreflightErrors` defined in the config file, which is the default behavior +in v1.28 and earlier release. The default value `false` means that if the `--ignore-preflight-errors` flag is set and `ignorePreflightErrors` +is specified in the config file, the value defined in the config file will be ignored. 
+ List of removed feature gates: {{< table caption="kubeadm removed feature gates" >}} From 280a9335dae5ccc51a28ced62ee39546e11ecfba Mon Sep 17 00:00:00 2001 From: shubham82 Date: Tue, 17 Oct 2023 13:43:38 +0530 Subject: [PATCH 11/82] Remove RetroactiveDefaultStorageClass feature gate. --- .../command-line-tools-reference/feature-gates-removed.md | 5 +++++ .../reference/command-line-tools-reference/feature-gates.md | 4 ---- 2 files changed, 5 insertions(+), 4 deletions(-) diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates-removed.md b/content/en/docs/reference/command-line-tools-reference/feature-gates-removed.md index 75cf88333d0..9420c98e305 100644 --- a/content/en/docs/reference/command-line-tools-reference/feature-gates-removed.md +++ b/content/en/docs/reference/command-line-tools-reference/feature-gates-removed.md @@ -303,6 +303,9 @@ In the following table: | `ResourceQuotaScopeSelectors` | `false` | Alpha | 1.11 | 1.11 | | `ResourceQuotaScopeSelectors` | `true` | Beta | 1.12 | 1.16 | | `ResourceQuotaScopeSelectors` | `true` | GA | 1.17 | 1.18 | +| `RetroactiveDefaultStorageClass` | `false` | Alpha | 1.25 | 1.25 | +| `RetroactiveDefaultStorageClass` | `true` | Beta | 1.26 | 1.27 | +| `RetroactiveDefaultStorageClass` | `true` | GA | 1.28 | 1.28 | | `RootCAConfigMap` | `false` | Alpha | 1.13 | 1.19 | | `RootCAConfigMap` | `true` | Beta | 1.20 | 1.20 | | `RootCAConfigMap` | `true` | GA | 1.21 | 1.22 | @@ -818,6 +821,8 @@ In the following table: - `ResourceQuotaScopeSelectors`: Enable resource quota scope selectors. +- `RetroactiveDefaultStorageClass`: Allow assigning StorageClass to unbound PVCs retroactively. + - `RootCAConfigMap`: Configure the `kube-controller-manager` to publish a {{< glossary_tooltip text="ConfigMap" term_id="configmap" >}} named `kube-root-ca.crt` to every namespace. 
This ConfigMap contains a CA bundle used for verifying connections diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates.md b/content/en/docs/reference/command-line-tools-reference/feature-gates.md index 5ef1c5aa875..dca0ac0d28d 100644 --- a/content/en/docs/reference/command-line-tools-reference/feature-gates.md +++ b/content/en/docs/reference/command-line-tools-reference/feature-gates.md @@ -310,9 +310,6 @@ For a reference to old feature gates that are removed, please refer to | `RemoveSelfLink` | `false` | Alpha | 1.16 | 1.19 | | `RemoveSelfLink` | `true` | Beta | 1.20 | 1.23 | | `RemoveSelfLink` | `true` | GA | 1.24 | | -| `RetroactiveDefaultStorageClass` | `false` | Alpha | 1.25 | 1.25 | -| `RetroactiveDefaultStorageClass` | `true` | Beta | 1.26 | 1.27 | -| `RetroactiveDefaultStorageClass` | `true` | GA | 1.28 | | | `SeccompDefault` | `false` | Alpha | 1.22 | 1.24 | | `SeccompDefault` | `true` | Beta | 1.25 | 1.26 | | `SeccompDefault` | `true` | GA | 1.27 | - | @@ -703,7 +700,6 @@ Each feature gate is designed for enabling/disabling a specific feature: objects and collections. This field has been deprecated since the Kubernetes v1.16 release. When this feature is enabled, the `.metadata.selfLink` field remains part of the Kubernetes API, but is always unset. -- `RetroactiveDefaultStorageClass`: Allow assigning StorageClass to unbound PVCs retroactively. - `RotateKubeletServerCertificate`: Enable the rotation of the server TLS certificate on the kubelet. See [kubelet configuration](/docs/reference/access-authn-authz/kubelet-tls-bootstrapping/#kubelet-configuration) for more details. 
From a7241452dbbaa366ae93da5997fb13f766597621 Mon Sep 17 00:00:00 2001 From: "Dave (Wei) Chen" Date: Tue, 17 Oct 2023 21:51:41 +0800 Subject: [PATCH 12/82] Revert "Introduce of the deprecated FG: MergeCLIArgumentsWithConfig" --- .../en/docs/reference/setup-tools/kubeadm/kubeadm-init.md | 8 -------- 1 file changed, 8 deletions(-) diff --git a/content/en/docs/reference/setup-tools/kubeadm/kubeadm-init.md b/content/en/docs/reference/setup-tools/kubeadm/kubeadm-init.md index 811bac73646..a766250ef1f 100644 --- a/content/en/docs/reference/setup-tools/kubeadm/kubeadm-init.md +++ b/content/en/docs/reference/setup-tools/kubeadm/kubeadm-init.md @@ -189,7 +189,6 @@ List of deprecated feature gates: Feature | Default :-------|:-------- `UpgradeAddonsBeforeControlPlane` | `false` -`MergeCLIArgumentsWithConfig` | `false` {{< /table >}} Feature gate descriptions: @@ -208,13 +207,6 @@ instance is upgraded. The deprecated `UpgradeAddonsBeforeControlPlane` feature g behavior. You should not need this old behavior; if you do, you should consider changing your cluster or upgrade processes, as this feature gate will be removed in a future release. - -`MergeCLIArgumentsWithConfig` -: This feature gate is introduced in Kubernetes v1.29 and defaults to `false`. Enabling this feature gate will merge the value passed -by the `--ignore-preflight-errors` flag with the value of `ignorePreflightErrors` defined in the config file, which is the default behavior -in v1.28 and earlier release. The default value `false` means that if the `--ignore-preflight-errors` flag is set and `ignorePreflightErrors` -is specified in the config file, the value defined in the config file will be ignored. 
- List of removed feature gates: {{< table caption="kubeadm removed feature gates" >}} From d485edf7fe19dee32e70ae28c0cf8d9a5b77b52d Mon Sep 17 00:00:00 2001 From: Roman Bednar Date: Tue, 17 Oct 2023 16:20:00 +0200 Subject: [PATCH 13/82] graduate PersistentVolumeLastPhaseTransitionTime to beta in v1.29 --- content/en/docs/concepts/storage/persistent-volumes.md | 2 +- .../reference/command-line-tools-reference/feature-gates.md | 3 ++- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/content/en/docs/concepts/storage/persistent-volumes.md b/content/en/docs/concepts/storage/persistent-volumes.md index 2d7c00b7b71..13b81f9c1f7 100644 --- a/content/en/docs/concepts/storage/persistent-volumes.md +++ b/content/en/docs/concepts/storage/persistent-volumes.md @@ -766,7 +766,7 @@ You can see the name of the PVC bound to the PV using `kubectl describe persiste #### Phase transition timestamp -{{< feature-state for_k8s_version="v1.28" state="alpha" >}} +{{< feature-state for_k8s_version="v1.29" state="beta" >}} The `.status` field for a PersistentVolume can include an alpha `lastPhaseTransitionTime` field. This field records the timestamp of when the volume last transitioned its phase. 
For newly created diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates.md b/content/en/docs/reference/command-line-tools-reference/feature-gates.md index 8701ea7d4fe..1d104bac5df 100644 --- a/content/en/docs/reference/command-line-tools-reference/feature-gates.md +++ b/content/en/docs/reference/command-line-tools-reference/feature-gates.md @@ -162,7 +162,8 @@ For a reference to old feature gates that are removed, please refer to | `OpenAPIEnums` | `true` | Beta | 1.24 | | | `PDBUnhealthyPodEvictionPolicy` | `false` | Alpha | 1.26 | 1.26 | | `PDBUnhealthyPodEvictionPolicy` | `true` | Beta | 1.27 | | -| `PersistentVolumeLastPhaseTransistionTime` | `false` | Alpha | 1.28 | | +| `PersistentVolumeLastPhaseTransistionTime` | `false` | Alpha | 1.28 | 1.28 | +| `PersistentVolumeLastPhaseTransistionTime` | `true` | Beta | 1.29 | | | `PodAndContainerStatsFromCRI` | `false` | Alpha | 1.23 | | | `PodDeletionCost` | `false` | Alpha | 1.21 | 1.21 | | `PodDeletionCost` | `true` | Beta | 1.22 | | From abb8d0bd350fe1013282216ad8f95050fcdd2a0a Mon Sep 17 00:00:00 2001 From: Paco Xu Date: Wed, 18 Oct 2023 17:14:43 +0800 Subject: [PATCH 14/82] remove GAed FG DownwardAPIHugePages --- .../command-line-tools-reference/feature-gates-removed.md | 7 +++++++ .../command-line-tools-reference/feature-gates.md | 6 ------ 2 files changed, 7 insertions(+), 6 deletions(-) diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates-removed.md b/content/en/docs/reference/command-line-tools-reference/feature-gates-removed.md index 5882f09a4f0..5d9843ab0f5 100644 --- a/content/en/docs/reference/command-line-tools-reference/feature-gates-removed.md +++ b/content/en/docs/reference/command-line-tools-reference/feature-gates-removed.md @@ -156,6 +156,10 @@ In the following table: | `DisableAcceleratorUsageMetrics` | `false` | Alpha | 1.19 | 1.19 | | `DisableAcceleratorUsageMetrics` | `true` | Beta | 1.20 | 1.24 | | `DisableAcceleratorUsageMetrics` | 
`true` | GA | 1.25 | 1.27 | +| `DownwardAPIHugePages` | `false` | Alpha | 1.20 | 1.20 | +| `DownwardAPIHugePages` | `false` | Beta | 1.21 | 1.21 | +| `DownwardAPIHugePages` | `true` | Beta | 1.22 | 1.26 | +| `DownwardAPIHugePages` | `true` | GA | 1.27 | 1.29 | | `DryRun` | `false` | Alpha | 1.12 | 1.12 | | `DryRun` | `true` | Beta | 1.13 | 1.18 | | `DryRun` | `true` | GA | 1.19 | 1.27 | @@ -645,6 +649,9 @@ In the following table: - `DisableAcceleratorUsageMetrics`: [Disable accelerator metrics collected by the kubelet](/docs/concepts/cluster-administration/system-metrics/#disable-accelerator-metrics). +- `DownwardAPIHugePages`: Enables usage of hugepages in + [downward API](/docs/tasks/inject-data-application/downward-api-volume-expose-pod-information). + - `DryRun`: Enable server-side [dry run](/docs/reference/using-api/api-concepts/#dry-run) requests so that validation, merging, and mutation can be tested without committing. diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates.md b/content/en/docs/reference/command-line-tools-reference/feature-gates.md index cbe767656e7..01dde16f7f3 100644 --- a/content/en/docs/reference/command-line-tools-reference/feature-gates.md +++ b/content/en/docs/reference/command-line-tools-reference/feature-gates.md @@ -250,10 +250,6 @@ For a reference to old feature gates that are removed, please refer to | `DaemonSetUpdateSurge` | `true` | Beta | 1.22 | 1.24 | | `DaemonSetUpdateSurge` | `true` | GA | 1.25 | | | `DefaultHostNetworkHostPortsInPodTemplates` | `false` | Deprecated | 1.28 | | -| `DownwardAPIHugePages` | `false` | Alpha | 1.20 | 1.20 | -| `DownwardAPIHugePages` | `false` | Beta | 1.21 | 1.21 | -| `DownwardAPIHugePages` | `true` | Beta | 1.22 | 1.26 | -| `DownwardAPIHugePages` | `true` | GA | 1.27 | | | `EfficientWatchResumption` | `false` | Alpha | 1.20 | 1.20 | | `EfficientWatchResumption` | `true` | Beta | 1.21 | 1.23 | | `EfficientWatchResumption` | `true` | GA | 1.24 | | @@ -477,8 +473,6 @@ 
Each feature gate is designed for enabling/disabling a specific feature: component flag. - `DisableKubeletCloudCredentialProviders`: Disable the in-tree functionality in kubelet to authenticate to a cloud provider container registry for image pull credentials. -- `DownwardAPIHugePages`: Enables usage of hugepages in - [downward API](/docs/tasks/inject-data-application/downward-api-volume-expose-pod-information). - `DynamicResourceAllocation`: Enables support for resources with custom parameters and a lifecycle that is independent of a Pod. - `ElasticIndexedJob`: Enables Indexed Jobs to be scaled up or down by mutating both From bea7fa8a69b5fc729c866051063fd81ed7432631 Mon Sep 17 00:00:00 2001 From: Paco Xu Date: Wed, 18 Oct 2023 18:26:45 +0800 Subject: [PATCH 15/82] Update feature-gates-removed.md Co-authored-by: Shubham --- .../command-line-tools-reference/feature-gates-removed.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates-removed.md b/content/en/docs/reference/command-line-tools-reference/feature-gates-removed.md index 5d9843ab0f5..7574cf605bf 100644 --- a/content/en/docs/reference/command-line-tools-reference/feature-gates-removed.md +++ b/content/en/docs/reference/command-line-tools-reference/feature-gates-removed.md @@ -159,7 +159,7 @@ In the following table: | `DownwardAPIHugePages` | `false` | Alpha | 1.20 | 1.20 | | `DownwardAPIHugePages` | `false` | Beta | 1.21 | 1.21 | | `DownwardAPIHugePages` | `true` | Beta | 1.22 | 1.26 | -| `DownwardAPIHugePages` | `true` | GA | 1.27 | 1.29 | +| `DownwardAPIHugePages` | `true` | GA | 1.27 | 1.28 | | `DryRun` | `false` | Alpha | 1.12 | 1.12 | | `DryRun` | `true` | Beta | 1.13 | 1.18 | | `DryRun` | `true` | GA | 1.19 | 1.27 | From 53a8725ba7d2d3f8ede6b9c83d0ffc1c8586f3f3 Mon Sep 17 00:00:00 2001 From: Han Kang Date: Wed, 18 Oct 2023 10:00:09 -0700 Subject: [PATCH 16/82] update documentation for component-slis --- 
.../reference/command-line-tools-reference/feature-gates.md | 5 +++-- content/en/docs/reference/instrumentation/slis.md | 2 +- 2 files changed, 4 insertions(+), 3 deletions(-) diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates.md b/content/en/docs/reference/command-line-tools-reference/feature-gates.md index 01dde16f7f3..b5942d51830 100644 --- a/content/en/docs/reference/command-line-tools-reference/feature-gates.md +++ b/content/en/docs/reference/command-line-tools-reference/feature-gates.md @@ -85,8 +85,6 @@ For a reference to old feature gates that are removed, please refer to | `CloudDualStackNodeIPs` | false | Alpha | 1.27 | 1.28 | | `CloudDualStackNodeIPs` | true | Beta | 1.29 | | | `ClusterTrustBundle` | false | Alpha | 1.27 | | -| `ComponentSLIs` | `false` | Alpha | 1.26 | 1.26 | -| `ComponentSLIs` | `true` | Beta | 1.27 | | | `ConsistentListFromCache` | `false` | Alpha | 1.28 | | `ContainerCheckpoint` | `false` | Alpha | 1.25 | | | `ContextualLogging` | `false` | Alpha | 1.24 | | @@ -245,6 +243,9 @@ For a reference to old feature gates that are removed, please refer to | `CSIMigrationvSphere` | `false` | Beta | 1.19 | 1.24 | | `CSIMigrationvSphere` | `true` | Beta | 1.25 | 1.25 | | `CSIMigrationvSphere` | `true` | GA | 1.26 | - | +| `ComponentSLIs` | `false` | Alpha | 1.26 | 1.26 | +| `ComponentSLIs` | `true` | Beta | 1.27 | 1.28| +| `ComponentSLIs` | `true` | GA | 1.29 | - | | `ConsistentHTTPGetHandlers` | `true` | GA | 1.25 | - | | `DaemonSetUpdateSurge` | `false` | Alpha | 1.21 | 1.21 | | `DaemonSetUpdateSurge` | `true` | Beta | 1.22 | 1.24 | diff --git a/content/en/docs/reference/instrumentation/slis.md b/content/en/docs/reference/instrumentation/slis.md index 3b559a398c9..e520d0a9344 100644 --- a/content/en/docs/reference/instrumentation/slis.md +++ b/content/en/docs/reference/instrumentation/slis.md @@ -9,7 +9,7 @@ weight: 20 -{{< feature-state for_k8s_version="v1.27" state="beta" >}} +{{< feature-state 
for_k8s_version="v1.29" state="stable" >}} By default, Kubernetes {{< skew currentVersion >}} publishes Service Level Indicator (SLI) metrics for each Kubernetes component binary. This metric endpoint is exposed on the serving From 5aa6dc7c39f119fbe1e77a9995401c3f144d6bc8 Mon Sep 17 00:00:00 2001 From: AxeZhan Date: Wed, 18 Oct 2023 00:00:10 +0800 Subject: [PATCH 17/82] PodLifecycleSleepAction --- .../en/docs/concepts/containers/container-lifecycle-hooks.md | 5 ++++- .../reference/command-line-tools-reference/feature-gates.md | 2 ++ 2 files changed, 6 insertions(+), 1 deletion(-) diff --git a/content/en/docs/concepts/containers/container-lifecycle-hooks.md b/content/en/docs/concepts/containers/container-lifecycle-hooks.md index 8e1cd2eb596..aec5433a75c 100644 --- a/content/en/docs/concepts/containers/container-lifecycle-hooks.md +++ b/content/en/docs/concepts/containers/container-lifecycle-hooks.md @@ -55,12 +55,15 @@ There are two types of hook handlers that can be implemented for Containers: * Exec - Executes a specific command, such as `pre-stop.sh`, inside the cgroups and namespaces of the Container. Resources consumed by the command are counted against the Container. * HTTP - Executes an HTTP request against a specific endpoint on the Container. +* Sleep - Pauses the container for a specified duration. + The "Sleep" action is available when the [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) + `PodLifecycleSleepAction` is enabled. ### Hook handler execution When a Container lifecycle management hook is called, the Kubernetes management system executes the handler according to the hook action, -`httpGet` and `tcpSocket` are executed by the kubelet process, and `exec` is executed in the container. +`httpGet`, `tcpSocket` and `sleep` are executed by the kubelet process, and `exec` is executed in the container. Hook handler calls are synchronous within the context of the Pod containing the Container. 
This means that for a `PostStart` hook, diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates.md b/content/en/docs/reference/command-line-tools-reference/feature-gates.md index 8701ea7d4fe..7aa695ab9d5 100644 --- a/content/en/docs/reference/command-line-tools-reference/feature-gates.md +++ b/content/en/docs/reference/command-line-tools-reference/feature-gates.md @@ -170,6 +170,7 @@ For a reference to old feature gates that are removed, please refer to | `PodDisruptionConditions` | `true` | Beta | 1.26 | | | `PodHostIPs` | `false` | Alpha | 1.28 | | | `PodIndexLabel` | `true` | Beta | 1.28 | | +| `PodLifecycleSleepAction` | `false` | Alpha | 1.29 | | | `PodReadyToStartContainersCondition` | `false` | Alpha | 1.28 | | | `PodSchedulingReadiness` | `false` | Alpha | 1.26 | 1.26 | | `PodSchedulingReadiness` | `true` | Beta | 1.27 | | @@ -664,6 +665,7 @@ Each feature gate is designed for enabling/disabling a specific feature: - `PodHostIPs`: Enable the `status.hostIPs` field for pods and the {{< glossary_tooltip term_id="downward-api" text="downward API" >}}. The field lets you expose host IP addresses to workloads. - `PodIndexLabel`: Enables the Job controller and StatefulSet controller to add the pod index as a label when creating new pods. See [Job completion mode docs](/docs/concepts/workloads/controllers/job#completion-mode) and [StatefulSet pod index label docs](/docs/concepts/workloads/controllers/statefulset/#pod-index-label) for more details. +- `PodLifecycleSleepAction`: Enables the `sleep` action in Container lifecycle hooks. - `PodReadyToStartContainersCondition`: Enable the kubelet to mark the [PodReadyToStartContainers](/docs/concepts/workloads/pods/pod-lifecycle/#pod-has-network) condition on pods. This was previously (1.25-1.27) known as `PodHasNetworkCondition`. 
- `PodSchedulingReadiness`: Enable setting `schedulingGates` field to control a Pod's [scheduling readiness](/docs/concepts/scheduling-eviction/pod-scheduling-readiness). From 53be005d91c47ae5836d83c5f7db8937e664f7fe Mon Sep 17 00:00:00 2001 From: Paco Xu Date: Wed, 18 Oct 2023 17:09:31 +0800 Subject: [PATCH 18/82] remove GAed Feature Gate GRPCContainerProbe Co-authored-by: Shubham --- .../command-line-tools-reference/feature-gates-removed.md | 6 ++++++ .../reference/command-line-tools-reference/feature-gates.md | 5 ----- 2 files changed, 6 insertions(+), 5 deletions(-) diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates-removed.md b/content/en/docs/reference/command-line-tools-reference/feature-gates-removed.md index 5882f09a4f0..28419c53e7d 100644 --- a/content/en/docs/reference/command-line-tools-reference/feature-gates-removed.md +++ b/content/en/docs/reference/command-line-tools-reference/feature-gates-removed.md @@ -203,6 +203,9 @@ In the following table: | `ExternalPolicyForExternalIP` | `true` | GA | 1.18 | 1.22 | | `GCERegionalPersistentDisk` | `true` | Beta | 1.10 | 1.12 | | `GCERegionalPersistentDisk` | `true` | GA | 1.13 | 1.16 | +| `GRPCContainerProbe` | `false` | Alpha | 1.23 | 1.23 | +| `GRPCContainerProbe` | `true` | Beta | 1.24 | 1.26 | +| `GRPCContainerProbe` | `true` | GA | 1.27 | 1.28 | | `GenericEphemeralVolume` | `false` | Alpha | 1.19 | 1.20 | | `GenericEphemeralVolume` | `true` | Beta | 1.21 | 1.22 | | `GenericEphemeralVolume` | `true` | GA | 1.23 | 1.24 | @@ -704,6 +707,9 @@ In the following table: - `GCERegionalPersistentDisk`: Enable the regional PD feature on GCE. +- `GRPCContainerProbe`: Enables the gRPC probe method for {Liveness,Readiness,Startup}Probe. + See [Configure Liveness, Readiness and Startup Probes](/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#define-a-grpc-liveness-probe). 
+ - `GenericEphemeralVolume`: Enables ephemeral, inline volumes that support all features of normal volumes (can be provided by third-party storage vendors, storage capacity tracking, restore from snapshot, etc.). diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates.md b/content/en/docs/reference/command-line-tools-reference/feature-gates.md index cbe767656e7..01bbd6d6881 100644 --- a/content/en/docs/reference/command-line-tools-reference/feature-gates.md +++ b/content/en/docs/reference/command-line-tools-reference/feature-gates.md @@ -263,9 +263,6 @@ For a reference to old feature gates that are removed, please refer to | `ExpandedDNSConfig` | `true` | GA | 1.28 | | | `ExperimentalHostUserNamespaceDefaulting` | `false` | Beta | 1.5 | 1.27 | | `ExperimentalHostUserNamespaceDefaulting` | `false` | Deprecated | 1.28 | | -| `GRPCContainerProbe` | `false` | Alpha | 1.23 | 1.23 | -| `GRPCContainerProbe` | `true` | Beta | 1.24 | 1.26 | -| `GRPCContainerProbe` | `true` | GA | 1.27 | | | `IPTablesOwnershipCleanup` | `false` | Alpha | 1.25 | 1.26 | | `IPTablesOwnershipCleanup` | `true` | Beta | 1.27 | 1.27 | | `IPTablesOwnershipCleanup` | `true` | GA | 1.28 | | @@ -515,8 +512,6 @@ Each feature gate is designed for enabling/disabling a specific feature: for more details. - `GracefulNodeShutdownBasedOnPodPriority`: Enables the kubelet to check Pod priorities when shutting down a node gracefully. -- `GRPCContainerProbe`: Enables the gRPC probe method for {Liveness,Readiness,Startup}Probe. - See [Configure Liveness, Readiness and Startup Probes](/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#define-a-grpc-liveness-probe). - `HonorPVReclaimPolicy`: Honor persistent volume reclaim policy when it is `Delete` irrespective of PV-PVC deletion ordering. 
For more details, check the [PersistentVolume deletion protection finalizer](/docs/concepts/storage/persistent-volumes/#persistentvolume-deletion-protection-finalizer) From 057e4d460e975e1ebf4e6129427878b139319c73 Mon Sep 17 00:00:00 2001 From: Paco Xu Date: Thu, 31 Aug 2023 16:00:18 +0800 Subject: [PATCH 19/82] kubeadm: EtcdLearnerMode is beta in v1.29 --- .../setup-tools/kubeadm/kubeadm-init.md | 20 +++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/content/en/docs/reference/setup-tools/kubeadm/kubeadm-init.md b/content/en/docs/reference/setup-tools/kubeadm/kubeadm-init.md index a766250ef1f..3d40619bb7f 100644 --- a/content/en/docs/reference/setup-tools/kubeadm/kubeadm-init.md +++ b/content/en/docs/reference/setup-tools/kubeadm/kubeadm-init.md @@ -157,9 +157,9 @@ List of feature gates: {{< table caption="kubeadm feature gates" >}} Feature | Default | Alpha | Beta | GA :-------|:--------|:------|:-----|:---- +`EtcdLearnerMode` | `true` | 1.27 | 1.29 | - `PublicKeysECDSA` | `false` | 1.19 | - | - `RootlessControlPlane` | `false` | 1.22 | - | - -`EtcdLearnerMode` | `false` | 1.27 | - | - {{< /table >}} {{< note >}} @@ -168,6 +168,10 @@ Once a feature gate goes GA its value becomes locked to `true` by default. Feature gate descriptions: +`EtcdLearnerMode` +: With this feature gate enabled, when joining a new control plane node, a new etcd member will be created +as a learner and promoted to a voting member only after the etcd data are fully aligned. + `PublicKeysECDSA` : Can be used to create a cluster that uses ECDSA certificates instead of the default RSA algorithm. Renewal of existing ECDSA certificates is also supported using `kubeadm certs renew`, but you cannot @@ -179,10 +183,6 @@ for `kube-apiserver`, `kube-controller-manager`, `kube-scheduler` and `etcd` to If the flag is not set, those components run as root. You can change the value of this feature gate before you upgrade to a newer version of Kubernetes. 
-`EtcdLearnerMode` -: With this feature gate enabled, when joining a new control plane node, a new etcd member will be created -as a learner and promoted to a voting member only after the etcd data are fully aligned. - List of deprecated feature gates: {{< table caption="kubeadm deprecated feature gates" >}} @@ -212,12 +212,16 @@ List of removed feature gates: {{< table caption="kubeadm removed feature gates" >}} Feature | Alpha | Beta | GA | Removed :-------|:------|:-----|:---|:------- -`UnversionedKubeletConfigMap` | 1.22 | 1.23 | 1.25 | 1.26 `IPv6DualStack` | 1.16 | 1.21 | 1.23 | 1.24 +`UnversionedKubeletConfigMap` | 1.22 | 1.23 | 1.25 | 1.26 {{< /table >}} Feature gate descriptions: +`IPv6DualStack` +: This flag helps to configure components dual stack when the feature is in progress. For more details on Kubernetes +dual-stack support see [Dual-stack support with kubeadm](/docs/setup/production-environment/tools/kubeadm/dual-stack-support/). + `UnversionedKubeletConfigMap` : This flag controls the name of the {{< glossary_tooltip text="ConfigMap" term_id="configmap" >}} where kubeadm stores kubelet configuration data. With this flag not specified or set to `true`, the ConfigMap is named `kubelet-config`. @@ -228,10 +232,6 @@ or `kubeadm upgrade apply`), kubeadm respects the value of `UnversionedKubeletCo (during `kubeadm join`, `kubeadm reset`, `kubeadm upgrade ...`), kubeadm attempts to use unversioned ConfigMap name first; if that does not succeed, kubeadm falls back to using the legacy (versioned) name for that ConfigMap. -`IPv6DualStack` -: This flag helps to configure components dual stack when the feature is in progress. For more details on Kubernetes -dual-stack support see [Dual-stack support with kubeadm](/docs/setup/production-environment/tools/kubeadm/dual-stack-support/). 
- ### Adding kube-proxy parameters {#kube-proxy} For information about kube-proxy parameters in the kubeadm configuration see: From fab60723a8a43ab64a1af631b9dd7066b0c7fde0 Mon Sep 17 00:00:00 2001 From: Paco Xu Date: Thu, 19 Oct 2023 15:06:08 +0800 Subject: [PATCH 20/82] update SkipReadOnlyValidationGCE status to Deprecated --- .../reference/command-line-tools-reference/feature-gates.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates.md b/content/en/docs/reference/command-line-tools-reference/feature-gates.md index 9db40320fe0..6dcdd9dd5c6 100644 --- a/content/en/docs/reference/command-line-tools-reference/feature-gates.md +++ b/content/en/docs/reference/command-line-tools-reference/feature-gates.md @@ -191,7 +191,6 @@ For a reference to old feature gates that are removed, please refer to | `SidecarContainers` | `false` | Alpha | 1.28 | | | `SizeMemoryBackedVolumes` | `false` | Alpha | 1.20 | 1.21 | | `SizeMemoryBackedVolumes` | `true` | Beta | 1.22 | | -| `SkipReadOnlyValidationGCE` | `false` | Alpha | 1.28 | | | `StableLoadBalancerNodeSet` | `true` | Beta | 1.27 | | | `StatefulSetAutoDeletePVC` | `false` | Alpha | 1.23 | 1.26 | | `StatefulSetAutoDeletePVC` | `false` | Beta | 1.27 | | @@ -318,6 +317,8 @@ For a reference to old feature gates that are removed, please refer to | `ServiceNodePortStaticSubrange` | `false` | Alpha | 1.27 | 1.27 | | `ServiceNodePortStaticSubrange` | `true` | Beta | 1.28 | 1.28 | | `ServiceNodePortStaticSubrange` | `true` | GA | 1.29 | - | +| `SkipReadOnlyValidationGCE` | `false` | Alpha | 1.28 | 1.28 | +| `SkipReadOnlyValidationGCE` | `true` | Deprecated | 1.29 | | | `TopologyManager` | `false` | Alpha | 1.16 | 1.17 | | `TopologyManager` | `true` | Beta | 1.18 | 1.26 | | `TopologyManager` | `true` | GA | 1.27 | - | From aeeb380c39ee0d424d43849223c98339b7839fb5 Mon Sep 17 00:00:00 2001 From: Humble Chirammal Date: Tue, 17 Oct 2023 21:34:14 +0530 
Subject: [PATCH 21/82] Promote CSINodeExpandSecret feature to GA Signed-off-by: Humble Chirammal --- .../reference/command-line-tools-reference/feature-gates.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates.md b/content/en/docs/reference/command-line-tools-reference/feature-gates.md index 3795ccf7918..d6ec5f42599 100644 --- a/content/en/docs/reference/command-line-tools-reference/feature-gates.md +++ b/content/en/docs/reference/command-line-tools-reference/feature-gates.md @@ -78,8 +78,6 @@ For a reference to old feature gates that are removed, please refer to | CRDValidationRatcheting | false | Alpha | 1.28 | | `CSIMigrationPortworx` | `false` | Alpha | 1.23 | 1.24 | | `CSIMigrationPortworx` | `false` | Beta | 1.25 | | -| `CSINodeExpandSecret` | `false` | Alpha | 1.25 | 1.26 | -| `CSINodeExpandSecret` | `true` | Beta | 1.27 | | | `CSIVolumeHealth` | `false` | Alpha | 1.21 | | | `CloudControllerManagerWebhook` | false | Alpha | 1.27 | | | `CloudDualStackNodeIPs` | false | Alpha | 1.27 | 1.28 | @@ -243,6 +241,9 @@ For a reference to old feature gates that are removed, please refer to | `CSIMigrationvSphere` | `false` | Beta | 1.19 | 1.24 | | `CSIMigrationvSphere` | `true` | Beta | 1.25 | 1.25 | | `CSIMigrationvSphere` | `true` | GA | 1.26 | - | +| `CSINodeExpandSecret` | `false` | Alpha | 1.25 | 1.26 | +| `CSINodeExpandSecret` | `true` | Beta | 1.27 | 1.28 | +| `CSINodeExpandSecret` | `true` | GA | 1.29 | | | `ComponentSLIs` | `false` | Alpha | 1.26 | 1.26 | | `ComponentSLIs` | `true` | Beta | 1.27 | 1.28| | `ComponentSLIs` | `true` | GA | 1.29 | - | From 314f5df8afc5ee80534cd89cd04038f764895cf5 Mon Sep 17 00:00:00 2001 From: Rey Lejano Date: Sun, 22 Oct 2023 00:10:55 -0700 Subject: [PATCH 22/82] Replacement PR for PR 43554 that targets the dev-1.29 branch --- .../command-line-tools-reference/feature-gates-removed.md | 7 +++++++ 
.../command-line-tools-reference/feature-gates.md | 6 ------ 2 files changed, 7 insertions(+), 6 deletions(-) diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates-removed.md b/content/en/docs/reference/command-line-tools-reference/feature-gates-removed.md index e5475e11732..6d3c129d28f 100644 --- a/content/en/docs/reference/command-line-tools-reference/feature-gates-removed.md +++ b/content/en/docs/reference/command-line-tools-reference/feature-gates-removed.md @@ -410,6 +410,9 @@ In the following table: | `TokenRequestProjection` | `false` | Alpha | 1.11 | 1.11 | | `TokenRequestProjection` | `true` | Beta | 1.12 | 1.19 | | `TokenRequestProjection` | `true` | GA | 1.20 | 1.21 | +| `TopologyManager` | `false` | Alpha | 1.16 | 1.17 | +| `TopologyManager` | `true` | Beta | 1.18 | 1.26 | +| `TopologyManager` | `true` | GA | 1.27 | 1.28 | | `UserNamespacesStatelessPodsSupport` | `false` | Alpha | 1.25 | 1.27 | | `ValidateProxyRedirects` | `false` | Alpha | 1.12 | 1.13 | | `ValidateProxyRedirects` | `true` | Beta | 1.14 | 1.21 | @@ -953,6 +956,10 @@ In the following table: - `TokenRequestProjection`: Enable the injection of service account tokens into a Pod through a [`projected` volume](/docs/concepts/storage/volumes/#projected). +- `TopologyManager`: Enable a mechanism to coordinate fine-grained hardware resource + assignments for different components in Kubernetes. See + [Control Topology Management Policies on a node](/docs/tasks/administer-cluster/topology-manager/). + - `UserNamespacesStatelessPodsSupport`: Enable user namespace support for stateless Pods. This flag was renamed on newer releases to `UserNamespacesSupport`. 
- `ValidateProxyRedirects`: This flag controls whether the API server should validate that redirects diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates.md b/content/en/docs/reference/command-line-tools-reference/feature-gates.md index 3795ccf7918..c6f53a87f42 100644 --- a/content/en/docs/reference/command-line-tools-reference/feature-gates.md +++ b/content/en/docs/reference/command-line-tools-reference/feature-gates.md @@ -318,9 +318,6 @@ For a reference to old feature gates that are removed, please refer to | `ServiceNodePortStaticSubrange` | `true` | GA | 1.29 | - | | `SkipReadOnlyValidationGCE` | `false` | Alpha | 1.28 | 1.28 | | `SkipReadOnlyValidationGCE` | `true` | Deprecated | 1.29 | | -| `TopologyManager` | `false` | Alpha | 1.16 | 1.17 | -| `TopologyManager` | `true` | Beta | 1.18 | 1.26 | -| `TopologyManager` | `true` | GA | 1.27 | - | | `WatchBookmark` | `false` | Alpha | 1.15 | 1.15 | | `WatchBookmark` | `true` | Beta | 1.16 | 1.16 | | `WatchBookmark` | `true` | GA | 1.17 | - | @@ -725,9 +722,6 @@ Each feature gate is designed for enabling/disabling a specific feature: in EndpointSlices. See [Topology Aware Hints](/docs/concepts/services-networking/topology-aware-hints/) for more details. -- `TopologyManager`: Enable a mechanism to coordinate fine-grained hardware resource - assignments for different components in Kubernetes. See - [Control Topology Management Policies on a node](/docs/tasks/administer-cluster/topology-manager/). - `TopologyManagerPolicyAlphaOptions`: Allow fine-tuning of topology manager policies, experimental, Alpha-quality options. This feature gate guards *a group* of topology manager options whose quality level is alpha. 
From 25615ec29cd643630111cc2b68153a1c03640c66 Mon Sep 17 00:00:00 2001 From: Ed Bartosh Date: Wed, 11 Oct 2023 12:43:21 +0300 Subject: [PATCH 23/82] Device Plugins: add info about beta graduation Co-authored-by: Michael Co-authored-by: Kevin Klues --- .../extend-kubernetes/compute-storage-net/device-plugins.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/en/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins.md b/content/en/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins.md index 4770e2c3be1..8dd955cdad9 100644 --- a/content/en/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins.md +++ b/content/en/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins.md @@ -159,8 +159,8 @@ The general workflow of a device plugin includes the following steps: {{< note >}} The processing of the fully-qualified CDI device names by the Device Manager requires that the `DevicePluginCDIDevices` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) - is enabled for the kubelet and the kube-apiserver. This was added as an alpha feature in Kubernetes - v1.28. + is enabled for both the kubelet and the kube-apiserver. This was added as an alpha feature in Kubernetes + v1.28 and graduated to beta in v1.29. {{< /note >}} ### Handling kubelet restarts From a9478b46acb19593987dc792220077fda51a6836 Mon Sep 17 00:00:00 2001 From: "Lubomir I. Ivanov" Date: Tue, 17 Oct 2023 21:18:58 +0300 Subject: [PATCH 24/82] kubeadm: introduce documentation changes for super-admin.conf - Update most pages where the kubeadm generated admin.conf is discussed. Include information about the new file "super-admin.conf". 
--- .../kubeadm_certs_renew_super-admin.conf.md | 92 +++++++++++++ ...beadm_init_phase_kubeconfig_super-admin.md | 121 ++++++++++++++++++ .../kubeadm/implementation-details.md | 22 +++- .../setup-tools/kubeadm/kubeadm-certs.md | 1 + .../setup-tools/kubeadm/kubeadm-init-phase.md | 1 + .../setup-tools/kubeadm/kubeadm-init.md | 14 +- .../docs/setup/best-practices/certificates.md | 31 ++++- .../tools/kubeadm/create-cluster-kubeadm.md | 18 ++- .../kubeadm/kubeadm-certs.md | 51 ++++++-- 9 files changed, 318 insertions(+), 33 deletions(-) create mode 100644 content/en/docs/reference/setup-tools/kubeadm/generated/kubeadm_certs_renew_super-admin.conf.md create mode 100644 content/en/docs/reference/setup-tools/kubeadm/generated/kubeadm_init_phase_kubeconfig_super-admin.md diff --git a/content/en/docs/reference/setup-tools/kubeadm/generated/kubeadm_certs_renew_super-admin.conf.md b/content/en/docs/reference/setup-tools/kubeadm/generated/kubeadm_certs_renew_super-admin.conf.md new file mode 100644 index 00000000000..db00d62bb57 --- /dev/null +++ b/content/en/docs/reference/setup-tools/kubeadm/generated/kubeadm_certs_renew_super-admin.conf.md @@ -0,0 +1,92 @@ + + + +Renew the certificate embedded in the kubeconfig file for the super-admin + +### Synopsis + + +Renew the certificate embedded in the kubeconfig file for the super-admin. + +Renewals run unconditionally, regardless of certificate expiration date; extra attributes such as SANs will be based on the existing file/certificates, there is no need to resupply them. + +Renewal by default tries to use the certificate authority in the local PKI managed by kubeadm; as an alternative it is possible to use K8s certificate API for certificate renewal, or as a last option, to generate a CSR request. + +After renewal, in order to make changes effective, it is required to restart control-plane components and eventually re-distribute the renewed certificate in case the file is used elsewhere. 
+ +``` +kubeadm certs renew super-admin.conf [flags] +``` + +### Options + + ++++ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
--cert-dir string     Default: "/etc/kubernetes/pki"

The path where to save the certificates

--config string

Path to a kubeadm configuration file.

-h, --help

help for super-admin.conf

--kubeconfig string     Default: "/etc/kubernetes/admin.conf"

The kubeconfig file to use when talking to the cluster. If the flag is not set, a set of standard locations can be searched for an existing kubeconfig file.

+ + + +### Options inherited from parent commands + + ++++ + + + + + + + + + + +
--rootfs string

[EXPERIMENTAL] The path to the 'real' host root filesystem.

+ + + diff --git a/content/en/docs/reference/setup-tools/kubeadm/generated/kubeadm_init_phase_kubeconfig_super-admin.md b/content/en/docs/reference/setup-tools/kubeadm/generated/kubeadm_init_phase_kubeconfig_super-admin.md new file mode 100644 index 00000000000..14de2fdbfb5 --- /dev/null +++ b/content/en/docs/reference/setup-tools/kubeadm/generated/kubeadm_init_phase_kubeconfig_super-admin.md @@ -0,0 +1,121 @@ + + + +Generate a kubeconfig file for the super-admin + +### Synopsis + + +Generate a kubeconfig file for the super-admin. + +``` +kubeadm init phase kubeconfig super-admin [flags] +``` + +### Options + + ++++ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
--apiserver-advertise-address string

The IP address the API Server will advertise it's listening on. If not set, the default network interface will be used.

--apiserver-bind-port int32     Default: 6443

Port for the API Server to bind to.

--cert-dir string     Default: "/etc/kubernetes/pki"

The path where to save and store the certificates.

--config string

Path to a kubeadm configuration file.

--control-plane-endpoint string

Specify a stable IP address or DNS name for the control plane.

--dry-run

Don't apply any changes; just output what would be done.

-h, --help

help for super-admin

--kubeconfig-dir string     Default: "/etc/kubernetes"

The path where to save the kubeconfig file.

--kubernetes-version string     Default: "stable-1"

Choose a specific Kubernetes version for the control plane.

+ + + +### Options inherited from parent commands + + ++++ + + + + + + + + + + +
--rootfs string

[EXPERIMENTAL] The path to the 'real' host root filesystem.

+ + + diff --git a/content/en/docs/reference/setup-tools/kubeadm/implementation-details.md b/content/en/docs/reference/setup-tools/kubeadm/implementation-details.md index 7a0d5b3bf11..33463afb8ca 100644 --- a/content/en/docs/reference/setup-tools/kubeadm/implementation-details.md +++ b/content/en/docs/reference/setup-tools/kubeadm/implementation-details.md @@ -64,6 +64,7 @@ in a majority of cases, and the most intuitive location; other constants paths a - `controller-manager.conf` - `scheduler.conf` - `admin.conf` for the cluster admin and kubeadm itself + - `super-admin.conf` for the cluster super-admin that can bypass RBAC - Names of certificates and key files : @@ -209,12 +210,21 @@ Kubeadm generates kubeconfig files with identities for control plane components: This client cert should have the CN `system:kube-scheduler`, as defined by default [RBAC core components roles](/docs/reference/access-authn-authz/rbac/#core-component-roles) -Additionally, a kubeconfig file for kubeadm itself and the admin is generated and saved into the -`/etc/kubernetes/admin.conf` file. The "admin" here is defined as the actual person(s) that is -administering the cluster and wants to have full control (**root**) over the cluster. The -embedded client certificate for admin should be in the `system:masters` organization, as defined -by default [RBAC user facing role bindings](/docs/reference/access-authn-authz/rbac/#user-facing-roles). -It should also include a CN. Kubeadm uses the `kubernetes-admin` CN. +Additionally, a kubeconfig file for kubeadm as an administrative entity is generated and stored +in `/etc/kubernetes/admin.conf`. This file includes a certificate with +`Subject: O = kubeadm:cluster-admins, CN = kubernetes-admin`. `kubeadm:cluster-admins` +is a group managed by kubeadm. It is bound to the `cluster-admin` ClusterRole during `kubeadm init`, +by using the `super-admin.conf` file, which does not require RBAC. 
+This `admin.conf` file must remain on control plane nodes and not be shared with additional users. + +During `kubeadm init` another kubeconfig file is generated and stored in `/etc/kubernetes/super-admin.conf`. +This file includes a certificate with `Subject: O = system:masters, CN = kubernetes-super-admin`. +`system:masters` is a super user group that bypasses RBAC and makes `super-admin.conf` useful in case +of an emergency where a cluster is locked due to RBAC misconfiguration. +The `super-admin.conf` file should be stored in a safe location and not be shared with additional users. + +See [RBAC user facing role bindings](/docs/reference/access-authn-authz/rbac/#user-facing-roles) +for additional information about RBAC and built-in ClusterRoles and groups. Please note that: diff --git a/content/en/docs/reference/setup-tools/kubeadm/kubeadm-certs.md b/content/en/docs/reference/setup-tools/kubeadm/kubeadm-certs.md index 3bce10ccf0b..95cd0eb0963 100644 --- a/content/en/docs/reference/setup-tools/kubeadm/kubeadm-certs.md +++ b/content/en/docs/reference/setup-tools/kubeadm/kubeadm-certs.md @@ -34,6 +34,7 @@ For more details see [Manual certificate renewal](/docs/tasks/administer-cluster {{< tab name="etcd-server" include="generated/kubeadm_certs_renew_etcd-server.md" />}} {{< tab name="front-proxy-client" include="generated/kubeadm_certs_renew_front-proxy-client.md" />}} {{< tab name="scheduler.conf" include="generated/kubeadm_certs_renew_scheduler.conf.md" />}} +{{< tab name="super-admin.conf" include="generated/kubeadm_certs_renew_super-admin.conf.md" />}} {{< /tabs >}} ## kubeadm certs certificate-key {#cmd-certs-certificate-key} diff --git a/content/en/docs/reference/setup-tools/kubeadm/kubeadm-init-phase.md b/content/en/docs/reference/setup-tools/kubeadm/kubeadm-init-phase.md index 2bab24f74d7..c08427d4b67 100644 --- a/content/en/docs/reference/setup-tools/kubeadm/kubeadm-init-phase.md +++ b/content/en/docs/reference/setup-tools/kubeadm/kubeadm-init-phase.md @@ -58,6 +58,7 
@@ You can create all required kubeconfig files by calling the `all` subcommand or {{< tab name="kubelet" include="generated/kubeadm_init_phase_kubeconfig_kubelet.md" />}} {{< tab name="controller-manager" include="generated/kubeadm_init_phase_kubeconfig_controller-manager.md" />}} {{< tab name="scheduler" include="generated/kubeadm_init_phase_kubeconfig_scheduler.md" />}} +{{< tab name="super-admin" include="generated/kubeadm_init_phase_kubeconfig_super-admin.md" />}} {{< /tabs >}} ## kubeadm init phase control-plane {#cmd-phase-control-plane} diff --git a/content/en/docs/reference/setup-tools/kubeadm/kubeadm-init.md b/content/en/docs/reference/setup-tools/kubeadm/kubeadm-init.md index a766250ef1f..dbde5dc8e1a 100644 --- a/content/en/docs/reference/setup-tools/kubeadm/kubeadm-init.md +++ b/content/en/docs/reference/setup-tools/kubeadm/kubeadm-init.md @@ -32,8 +32,9 @@ following steps: arguments, lowercased if necessary. 1. Writes kubeconfig files in `/etc/kubernetes/` for the kubelet, the controller-manager and the - scheduler to use to connect to the API server, each with its own identity, as well as an - additional kubeconfig file for administration named `admin.conf`. + scheduler to use to connect to the API server, each with its own identity. + Additional kubeconfig files are also written: one for kubeadm as an administrative entity (`admin.conf`) + and one for a super admin user that can bypass RBAC (`super-admin.conf`). 1. Generates static Pod manifests for the API server, controller-manager and scheduler. In case an external etcd is not provided, @@ -186,7 +187,7 @@ as a learner and promoted to a voting member only after the etcd data are fully List of deprecated feature gates: {{< table caption="kubeadm deprecated feature gates" >}} -Feature | Default +Feature | Default :-------|:-------- `UpgradeAddonsBeforeControlPlane` | `false` {{< /table >}} @@ -291,7 +292,7 @@ for etcd and CoreDNS. 
#### Custom sandbox (pause) images {#custom-pause-image} -To set a custom image for these you need to configure this in your +To set a custom image for these you need to configure this in your {{< glossary_tooltip text="container runtime" term_id="container-runtime" >}} to use the image. Consult the documentation for your container runtime to find out how to change this setting; @@ -386,8 +387,9 @@ DNS name or an address of a load balancer. kubeadm certs certificate-key ``` -Once the cluster is up, you can grab the admin credentials from the control-plane node -at `/etc/kubernetes/admin.conf` and use that to talk to the cluster. +Once the cluster is up, you can use the `/etc/kubernetes/admin.conf` file from +a control-plane node to talk to the cluster with administrator credentials, or see +[Generating kubeconfig files for additional users](/docs/tasks/administer-cluster/kubeadm/kubeadm-certs#kubeconfig-additional-users). Note that this style of bootstrap has some relaxed security guarantees because it does not allow the root CA hash to be validated with diff --git a/content/en/docs/setup/best-practices/certificates.md b/content/en/docs/setup/best-practices/certificates.md index 8bcf5f7e1ec..4de7277e4af 100644 --- a/content/en/docs/setup/best-practices/certificates.md +++ b/content/en/docs/setup/best-practices/certificates.md @@ -184,12 +184,13 @@ you need to provide if you are generating all of your own keys and certificates: You must manually configure these administrator account and service accounts: -| filename | credential name | Default CN | O (in Subject) | -|-------------------------|----------------------------|-------------------------------------|----------------| -| admin.conf | default-admin | kubernetes-admin | system:masters | -| kubelet.conf | default-auth | system:node:`` (see note) | system:nodes | -| controller-manager.conf | default-controller-manager | system:kube-controller-manager | | -| scheduler.conf | default-scheduler | system:kube-scheduler | 
| +| filename | credential name | Default CN | O (in Subject) | +|-------------------------|----------------------------|-------------------------------------|------------------------| +| admin.conf | default-admin | kubernetes-admin | `` | +| super-admin.conf | default-super-admin | kubernetes-super-admin | system:masters | +| kubelet.conf | default-auth | system:node:`` (see note) | system:nodes | +| controller-manager.conf | default-controller-manager | system:kube-controller-manager | | +| scheduler.conf | default-scheduler | system:kube-scheduler | | {{< note >}} The value of `` for `kubelet.conf` **must** match precisely the value of the node name @@ -197,6 +198,22 @@ provided by the kubelet as it registers with the apiserver. For further details, [Node Authorization](/docs/reference/access-authn-authz/node/). {{< /note >}} +{{< note >}} +In the above example `` is implementation specific. Some tools sign the +certificate in the default `admin.conf` to be part of the `system:masters` group. +`system:masters` is a break-glass, super user group that can bypass the authorization +layer of Kubernetes, such as RBAC. Also, some tools do not generate a separate +`super-admin.conf` with a certificate bound to this super user group. + +kubeadm generates two separate administrator certificates in kubeconfig files. +One is in `admin.conf` and has `Subject: O = kubeadm:cluster-admins, CN = kubernetes-admin`. +`kubeadm:cluster-admins` is a custom group bound to the `cluster-admin` ClusterRole. +This file is generated on all kubeadm managed control plane machines. + +The other is in `super-admin.conf` and has `Subject: O = system:masters, CN = kubernetes-super-admin`. +This file is generated only on the node where `kubeadm init` was called. +{{< /note >}} + 1. For each config, generate an x509 cert/key pair with the given CN and O. 1. 
Run `kubectl` as follows for each config: @@ -213,6 +230,7 @@ These files are used as follows: | filename | command | comment | |-------------------------|-------------------------|-----------------------------------------------------------------------| | admin.conf | kubectl | Configures administrator user for the cluster | +| super-admin.conf | kubectl | Configures super administrator user for the cluster | | kubelet.conf | kubelet | One required for each node in the cluster. | | controller-manager.conf | kube-controller-manager | Must be added to manifest in `manifests/kube-controller-manager.yaml` | | scheduler.conf | kube-scheduler | Must be added to manifest in `manifests/kube-scheduler.yaml` | @@ -221,6 +239,7 @@ The following files illustrate full paths to the files listed in the previous ta ``` /etc/kubernetes/admin.conf +/etc/kubernetes/super-admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf diff --git a/content/en/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm.md b/content/en/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm.md index 5db35089784..4a1358e9f9f 100644 --- a/content/en/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm.md +++ b/content/en/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm.md @@ -211,11 +211,19 @@ export KUBECONFIG=/etc/kubernetes/admin.conf ``` {{< warning >}} -Kubeadm signs the certificate in the `admin.conf` to have `Subject: O = system:masters, CN = kubernetes-admin`. -`system:masters` is a break-glass, super user group that bypasses the authorization layer (e.g. RBAC). -Do not share the `admin.conf` file with anyone and instead grant users custom permissions by generating -them a kubeconfig file using the `kubeadm kubeconfig user` command. 
For more details see -[Generating kubeconfig files for additional users](/docs/tasks/administer-cluster/kubeadm/kubeadm-certs#kubeconfig-additional-users). +The kubeconfig file `admin.conf` that `kubeadm init` generates contains a certificate with +`Subject: O = kubeadm:cluster-admins, CN = kubernetes-admin`. The group `kubeadm:cluster-admins` +is bound to the built-in `cluster-admin` ClusterRole. +Do not share the `admin.conf` file with anyone. + +`kubeadm init` generates another kubeconfig file `super-admin.conf` that contains a certificate with +`Subject: O = system:masters, CN = kubernetes-super-admin`. +`system:masters` is a break-glass, super user group that bypasses the authorization layer (for example RBAC). +Do not share the `super-admin.conf` file with anyone. It is recommended to move the file to a safe location. + +See +[Generating kubeconfig files for additional users](/docs/tasks/administer-cluster/kubeadm/kubeadm-certs#kubeconfig-additional-users) +on how to use `kubeadm kubeconfig user` to generate kubeconfig files for additional users. {{< /warning >}} Make a record of the `kubeadm join` command that `kubeadm init` outputs. You diff --git a/content/en/docs/tasks/administer-cluster/kubeadm/kubeadm-certs.md b/content/en/docs/tasks/administer-cluster/kubeadm/kubeadm-certs.md index fead85f7e6a..51f3b748126 100644 --- a/content/en/docs/tasks/administer-cluster/kubeadm/kubeadm-certs.md +++ b/content/en/docs/tasks/administer-cluster/kubeadm/kubeadm-certs.md @@ -16,7 +16,6 @@ to kubeadm certificate management. ## {{% heading "prerequisites" %}} - You should be familiar with [PKI certificates and requirements in Kubernetes](/docs/setup/best-practices/certificates/). 
@@ -71,6 +70,7 @@ etcd-peer Dec 30, 2020 23:36 UTC 364d etcd-ca etcd-server Dec 30, 2020 23:36 UTC 364d etcd-ca no front-proxy-client Dec 30, 2020 23:36 UTC 364d front-proxy-ca no scheduler.conf Dec 30, 2020 23:36 UTC 364d no +super-admin.conf Dec 30, 2020 23:36 UTC 364d no CERTIFICATE AUTHORITY EXPIRES RESIDUAL TIME EXTERNALLY MANAGED ca Dec 28, 2029 23:36 UTC 9y no @@ -81,6 +81,7 @@ front-proxy-ca Dec 28, 2029 23:36 UTC 9y no The command shows expiration/residual time for the client certificates in the `/etc/kubernetes/pki` folder and for the client certificate embedded in the kubeconfig files used by kubeadm (`admin.conf`, `controller-manager.conf` and `scheduler.conf`). +If missing, `super-admin.conf` will not cause an error. Additionally, kubeadm informs the user if the certificate is externally managed; in this case, the user should take care of managing certificate renewal manually/using other tools. @@ -326,14 +327,18 @@ CSRs requesting serving certificates for any IP or domain name. ## Generating kubeconfig files for additional users {#kubeconfig-additional-users} -During cluster creation, kubeadm signs the certificate in the `admin.conf` to have -`Subject: O = system:masters, CN = kubernetes-admin`. -[`system:masters`](/docs/reference/access-authn-authz/rbac/#user-facing-roles) -is a break-glass, super user group that bypasses the authorization layer (for example, -[RBAC](/docs/reference/access-authn-authz/rbac/)). -Sharing the `admin.conf` with additional users is **not recommended**! +The kubeconfig file `admin.conf` that `kubeadm init` generates contains a certificate with +`Subject: O = kubeadm:cluster-admins, CN = kubernetes-admin`. The group `kubeadm:cluster-admins` +is bound to the built-in `cluster-admin` ClusterRole. +Do not share the `admin.conf` file with anyone. 
-Instead, you can use the [`kubeadm kubeconfig user`](/docs/reference/setup-tools/kubeadm/kubeadm-kubeconfig) +`kubeadm init` generates another kubeconfig file `super-admin.conf` that contains a certificate with +`Subject: O = system:masters, CN = kubernetes-super-admin`. +`system:masters` is a break-glass, super user group that bypasses the authorization layer (for example RBAC). +Do not share the `super-admin.conf` file with anyone. It is recommended to move the file to a safe location. + +Instead of sharing these files, you can use the +[`kubeadm kubeconfig user`](/docs/reference/setup-tools/kubeadm/kubeadm-kubeconfig) command to generate kubeconfig files for additional users. The command accepts a mixture of command line flags and [kubeadm configuration](/docs/reference/config-api/kubeadm-config.v1beta3/) options. @@ -368,8 +373,34 @@ for a new user `johndoe` that is part of the `appdevs` group: kubeadm kubeconfig user --config example.yaml --org appdevs --client-name johndoe --validity-period 24h ``` -The following example will generate a kubeconfig file with administrator credentials valid for 1 week: +The following example binds the `cluster-admin` ClusterRole to a new group called +`my-cluster-admins`. It then creates a new kubeconfig file for an admin called `my-admin` that +is part of the `my-cluster-admins` group. The kubeconfig is valid for 1 week: ```shell -kubeadm kubeconfig user --config example.yaml --client-name admin --validity-period 168h +kubectl create clusterrolebinding my-cluster-admins-binding --clusterrole cluster-admin --group my-cluster-admins +kubeadm kubeconfig user --config example.yaml --client-name my-admin --org my-cluster-admins --validity-period 168h ``` + +Alternatively, you can create a ClusterRoleBinding for individual users. 
The following example +binds the `cluster-admin` ClusterRole to the new user `my-admin` and then creates a kubeconfig +for the user: + +```shell +kubectl create clusterrolebinding my-admin-binding --clusterrole cluster-admin --user my-admin +kubeadm kubeconfig user --config example.yaml --client-name my-admin --validity-period 168h +``` + +Removing the ClusterRoleBinding for a group of users or individual users can act +as permission revocation: + +```shell +kubectl delete clusterrolebinding +``` + +{{< warning >}} +Creating kubeconfig files for additional users that are part of the default kubeadm group +`kubeadm:cluster-admins` or the built-in Kubernetes group `system:masters` is not recommended. +Ideally, these groups should only be used for the users stored in `admin.conf` and `super-admin.conf` - +`kubernetes-admin` and `kubernetes-super-admin`, respectively. +{{< /warning >}} From fe172fc2c86121770ff89c8c62cb61723af305dd Mon Sep 17 00:00:00 2001 From: Jordan Liggitt Date: Mon, 30 Oct 2023 20:11:53 -0400 Subject: [PATCH 25/82] Add 1.32 removal info for v1beta3 flowcontrol API --- .../docs/reference/using-api/deprecation-guide.md | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/content/en/docs/reference/using-api/deprecation-guide.md b/content/en/docs/reference/using-api/deprecation-guide.md index 69f32f9eebc..9b1f5767dfa 100644 --- a/content/en/docs/reference/using-api/deprecation-guide.md +++ b/content/en/docs/reference/using-api/deprecation-guide.md @@ -20,6 +20,19 @@ deprecated API versions to newer and more stable API versions. ## Removed APIs by release +### v1.32 + +The **v1.32** release will stop serving the following deprecated API versions: + +#### Flow control resources {#flowcontrol-resources-v132} + +The **flowcontrol.apiserver.k8s.io/v1beta3** API version of FlowSchema and PriorityLevelConfiguration will no longer be served in v1.32. 
+ +* Migrate manifests and API clients to use the **flowcontrol.apiserver.k8s.io/v1** API version, available since v1.29. +* All existing persisted objects are accessible via the new API +* Notable changes in **flowcontrol.apiserver.k8s.io/v1**: + * The PriorityLevelConfiguration `spec.limited.nominalConcurrencyShares` field only defaults to 30 when unspecified, and an explicit value of 0 is not changed to 30. + ### v1.29 The **v1.29** release will stop serving the following deprecated API versions: From 1571a07f7974aba312f9cd32854fe60940a290fd Mon Sep 17 00:00:00 2001 From: HirazawaUi <695097494plus@gmail.com> Date: Sat, 7 Oct 2023 00:08:47 +0800 Subject: [PATCH 26/82] add DisableNodeKubeProxyVersion feature gate --- .../reference/command-line-tools-reference/feature-gates.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates.md b/content/en/docs/reference/command-line-tools-reference/feature-gates.md index 1d88ec7b0f3..a579e6a8aeb 100644 --- a/content/en/docs/reference/command-line-tools-reference/feature-gates.md +++ b/content/en/docs/reference/command-line-tools-reference/feature-gates.md @@ -96,6 +96,7 @@ For a reference to old feature gates that are removed, please refer to | `DevicePluginCDIDevices` | `false` | Alpha | 1.28 | | | `DisableCloudProviders` | `false` | Alpha | 1.22 | | | `DisableKubeletCloudCredentialProviders` | `false` | Alpha | 1.23 | | +| `DisableNodeKubeProxyVersion` | `false` | Alpha | 1.29 | | | `DynamicResourceAllocation` | `false` | Alpha | 1.26 | | | `ElasticIndexedJob` | `true` | Beta` | 1.27 | | | `EventedPLEG` | `false` | Alpha | 1.26 | 1.26 | @@ -467,6 +468,7 @@ Each feature gate is designed for enabling/disabling a specific feature: component flag. - `DisableKubeletCloudCredentialProviders`: Disable the in-tree functionality in kubelet to authenticate to a cloud provider container registry for image pull credentials. 
+- `DisableNodeKubeProxyVersion`: Disable setting the `kubeProxyVersion` field of the Node. - `DynamicResourceAllocation`: Enables support for resources with custom parameters and a lifecycle that is independent of a Pod. - `ElasticIndexedJob`: Enables Indexed Jobs to be scaled up or down by mutating both From 91aa69bb0e3e66c845e0c812b89951ddd4395ec7 Mon Sep 17 00:00:00 2001 From: Jordan Liggitt Date: Wed, 1 Nov 2023 08:32:48 -0400 Subject: [PATCH 27/82] Update v1beta2 flowcontrol guidance --- content/en/docs/reference/using-api/deprecation-guide.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/content/en/docs/reference/using-api/deprecation-guide.md b/content/en/docs/reference/using-api/deprecation-guide.md index 9b1f5767dfa..72df2cae7d1 100644 --- a/content/en/docs/reference/using-api/deprecation-guide.md +++ b/content/en/docs/reference/using-api/deprecation-guide.md @@ -41,8 +41,10 @@ The **v1.29** release will stop serving the following deprecated API versions: The **flowcontrol.apiserver.k8s.io/v1beta2** API version of FlowSchema and PriorityLevelConfiguration will no longer be served in v1.29. -* Migrate manifests and API clients to use the **flowcontrol.apiserver.k8s.io/v1beta3** API version, available since v1.26. +* Migrate manifests and API clients to use the **flowcontrol.apiserver.k8s.io/v1** API version, available since v1.29, or the **flowcontrol.apiserver.k8s.io/v1beta3** API version, available since v1.26. * All existing persisted objects are accessible via the new API +* Notable changes in **flowcontrol.apiserver.k8s.io/v1**: + * The PriorityLevelConfiguration `spec.limited.assuredConcurrencyShares` field is renamed to `spec.limited.nominalConcurrencyShares` and only defaults to 30 when unspecified, and an explicit value of 0 is not changed to 30. 
* Notable changes in **flowcontrol.apiserver.k8s.io/v1beta3**: * The PriorityLevelConfiguration `spec.limited.assuredConcurrencyShares` field is renamed to `spec.limited.nominalConcurrencyShares` From e9629259717fca6f9973b2675a054167cca4a63d Mon Sep 17 00:00:00 2001 From: Han Kang Date: Wed, 1 Nov 2023 15:02:32 -0700 Subject: [PATCH 28/82] update documented metrics for v1.29 --- .../docs/reference/instrumentation/metrics.md | 22 +++++++++---------- 1 file changed, 11 insertions(+), 11 deletions(-) diff --git a/content/en/docs/reference/instrumentation/metrics.md b/content/en/docs/reference/instrumentation/metrics.md index a94953a5655..0bc4338b604 100644 --- a/content/en/docs/reference/instrumentation/metrics.md +++ b/content/en/docs/reference/instrumentation/metrics.md @@ -8,7 +8,7 @@ description: >- ## Metrics (v1.29) - + This page details the metrics that different Kubernetes components export. You can query the metrics endpoint for these components using an HTTP scrape, and fetch the current metrics data in Prometheus format. @@ -635,25 +635,25 @@ Alpha metrics do not have any API guarantees. These metrics must be used at your
  • protocol, transport
  • apiserver_encryption_config_controller_automatic_reload_failures_total
    -
    Total number of failed automatic reloads of encryption configuration.
    +
    Total number of failed automatic reloads of encryption configuration split by apiserver identity.
    • ALPHA
    • Counter
    • -
    +
  • apiserver_id_hash
  • apiserver_encryption_config_controller_automatic_reload_last_timestamp_seconds
    -
    Timestamp of the last successful or failed automatic reload of encryption configuration.
    +
    Timestamp of the last successful or failed automatic reload of encryption configuration split by apiserver identity.
    • ALPHA
    • Gauge
    • -
    • status
    +
  • apiserver_id_hash, status
  • apiserver_encryption_config_controller_automatic_reload_success_total
    -
    Total number of successful automatic reloads of encryption configuration.
    +
    Total number of successful automatic reloads of encryption configuration split by apiserver identity.
    • ALPHA
    • Counter
    • -
    +
  • apiserver_id_hash
  • apiserver_envelope_encryption_dek_cache_fill_percent
    Percent of the cache slots currently occupied by cached DEKs.
    @@ -688,21 +688,21 @@ Alpha metrics do not have any API guarantees. These metrics must be used at your
    • ALPHA
    • Gauge
    • -
    • key_id_hash, provider_name, transformation_type
    +
  • apiserver_id_hash, key_id_hash, provider_name, transformation_type
  • apiserver_envelope_encryption_key_id_hash_status_last_timestamp_seconds
    The last time in seconds when a keyID was returned by the Status RPC call.
    • ALPHA
    • Gauge
    • -
    • key_id_hash, provider_name
    +
  • apiserver_id_hash, key_id_hash, provider_name
  • apiserver_envelope_encryption_key_id_hash_total
    -
    Number of times a keyID is used split by transformation type and provider.
    +
    Number of times a keyID is used split by transformation type, provider, and apiserver identity.
    • ALPHA
    • Counter
    • -
    • key_id_hash, provider_name, transformation_type
    +
  • apiserver_id_hash, key_id_hash, provider_name, transformation_type
  • apiserver_envelope_encryption_kms_operations_latency_seconds
    KMS operation duration with gRPC error code status total.
    From 73731382ba6c924b534e501ca99e6e5dbb01ef2e Mon Sep 17 00:00:00 2001 From: Michal Wozniak Date: Mon, 9 Oct 2023 11:10:32 +0200 Subject: [PATCH 29/82] Docs update for Job Backoff Limit Per Index in Beta --- content/en/docs/concepts/workloads/controllers/job.md | 2 +- .../reference/command-line-tools-reference/feature-gates.md | 3 ++- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/content/en/docs/concepts/workloads/controllers/job.md b/content/en/docs/concepts/workloads/controllers/job.md index 09914197f09..cb344b4e33a 100644 --- a/content/en/docs/concepts/workloads/controllers/job.md +++ b/content/en/docs/concepts/workloads/controllers/job.md @@ -382,7 +382,7 @@ from failed Jobs is not lost inadvertently. ### Backoff limit per index {#backoff-limit-per-index} -{{< feature-state for_k8s_version="v1.28" state="alpha" >}} +{{< feature-state for_k8s_version="v1.29" state="beta" >}} {{< note >}} You can only configure the backoff limit per index for an [Indexed](#completion-mode) Job, if you diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates.md b/content/en/docs/reference/command-line-tools-reference/feature-gates.md index 6fd45b731d4..a7e5527c7c3 100644 --- a/content/en/docs/reference/command-line-tools-reference/feature-gates.md +++ b/content/en/docs/reference/command-line-tools-reference/feature-gates.md @@ -114,7 +114,8 @@ For a reference to old feature gates that are removed, please refer to | `InTreePluginOpenStackUnregister` | `false` | Alpha | 1.21 | | | `InTreePluginPortworxUnregister` | `false` | Alpha | 1.23 | | | `InTreePluginvSphereUnregister` | `false` | Alpha | 1.21 | | -| `JobBackoffLimitPerIndex` | `false` | Alpha | 1.28 | | +| `JobBackoffLimitPerIndex` | `false` | Alpha | 1.28 | 1.28 | +| `JobBackoffLimitPerIndex` | `true` | Beta | 1.29 | | | `JobPodFailurePolicy` | `false` | Alpha | 1.25 | 1.25 | | `JobPodFailurePolicy` | `true` | Beta | 1.26 | | | `JobPodReplacementPolicy` | `false` | Alpha | 1.28 | 
| From 6886cad27e9c6571c5ea6251716ac8ae80348e77 Mon Sep 17 00:00:00 2001 From: Michal Wozniak Date: Fri, 3 Nov 2023 18:24:49 +0100 Subject: [PATCH 30/82] Docs update about JobReadyPods graduated to GA --- .../reference/command-line-tools-reference/feature-gates.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates.md b/content/en/docs/reference/command-line-tools-reference/feature-gates.md index 6fd45b731d4..91846f06376 100644 --- a/content/en/docs/reference/command-line-tools-reference/feature-gates.md +++ b/content/en/docs/reference/command-line-tools-reference/feature-gates.md @@ -118,8 +118,6 @@ For a reference to old feature gates that are removed, please refer to | `JobPodFailurePolicy` | `false` | Alpha | 1.25 | 1.25 | | `JobPodFailurePolicy` | `true` | Beta | 1.26 | | | `JobPodReplacementPolicy` | `false` | Alpha | 1.28 | | -| `JobReadyPods` | `false` | Alpha | 1.23 | 1.23 | -| `JobReadyPods` | `true` | Beta | 1.24 | | | `KMSv2` | `false` | Alpha | 1.25 | 1.26 | | `KMSv2` | `true` | Beta | 1.27 | | | `KMSv2KDF` | `false` | Beta | 1.28 | | @@ -267,6 +265,9 @@ For a reference to old feature gates that are removed, please refer to | `IPTablesOwnershipCleanup` | `true` | GA | 1.28 | | | `InTreePluginRBDUnregister` | `false` | Alpha | 1.23 | 1.27 | | `InTreePluginRBDUnregister` | `false` | Deprecated | 1.28 | | +| `JobReadyPods` | `false` | Alpha | 1.23 | 1.23 | +| `JobReadyPods` | `true` | Beta | 1.24 | 1.28 | +| `JobReadyPods` | `true` | GA | 1.29 | | | `JobTrackingWithFinalizers` | `false` | Alpha | 1.22 | 1.22 | | `JobTrackingWithFinalizers` | `false` | Beta | 1.23 | 1.24 | | `JobTrackingWithFinalizers` | `true` | Beta | 1.25 | 1.25 | From c71a21611b5807d03c5d478cd8006e9fa0370b5d Mon Sep 17 00:00:00 2001 From: charles-chenzz <73322208+charles-chenzz@users.noreply.github.com> Date: Tue, 17 Oct 2023 14:37:44 +0000 Subject: [PATCH 31/82] update docs to promote 
PodReadyToStartContainersCondition into beta --- .../reference/command-line-tools-reference/feature-gates.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates.md b/content/en/docs/reference/command-line-tools-reference/feature-gates.md index 6fd45b731d4..32201947ef6 100644 --- a/content/en/docs/reference/command-line-tools-reference/feature-gates.md +++ b/content/en/docs/reference/command-line-tools-reference/feature-gates.md @@ -169,7 +169,8 @@ For a reference to old feature gates that are removed, please refer to | `PodHostIPs` | `false` | Alpha | 1.28 | | | `PodIndexLabel` | `true` | Beta | 1.28 | | | `PodLifecycleSleepAction` | `false` | Alpha | 1.29 | | -| `PodReadyToStartContainersCondition` | `false` | Alpha | 1.28 | | +| `PodReadyToStartContainersCondition` | `false` | Alpha | 1.28 | 1.28 | +| `PodReadyToStartContainersCondition` | `true` | Beta | 1.29 | | | `PodSchedulingReadiness` | `false` | Alpha | 1.26 | 1.26 | | `PodSchedulingReadiness` | `true` | Beta | 1.27 | | | `ProcMountType` | `false` | Alpha | 1.12 | | From 97a1c7482b907493985951dc0a9c85f07289e682 Mon Sep 17 00:00:00 2001 From: Paco Xu Date: Wed, 1 Nov 2023 16:20:11 +0800 Subject: [PATCH 32/82] v1.29: kubeadm skew policy for kubelet is n-3 --- .../tools/kubeadm/create-cluster-kubeadm.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/en/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm.md b/content/en/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm.md index 4a1358e9f9f..82ed96872ee 100644 --- a/content/en/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm.md +++ b/content/en/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm.md @@ -559,7 +559,7 @@ version as kubeadm or one version older. 
Example: * kubeadm is at {{< skew currentVersion >}} -* kubelet on the host must be at {{< skew currentVersion >}} or {{< skew currentVersionAddMinor -1 >}} +* kubelet on the host must be at {{< skew currentVersion >}}, {{< skew currentVersionAddMinor -1 >}}, {{< skew currentVersionAddMinor -2 >}} or {{< skew currentVersionAddMinor -3 >}} ### kubeadm's skew against kubeadm From ddb784aab1bba6a62d62898dccfe31a9d1dcd5e8 Mon Sep 17 00:00:00 2001 From: "Lubomir I. Ivanov" Date: Fri, 10 Nov 2023 14:27:38 +0200 Subject: [PATCH 33/82] certificates.md: add note about system:masters in apiserver cert The kube-apiserver flag --kubelet-client-certificate accepts a client certificate (kube-apiserver-kubelet-client.crt) to connect to the kubelet. There is no need for this certificate to have "system:masters" as "O" in the Subject, instead it can be a less privileged group like kubeadm's "kubeadm:cluster-admins". --- content/en/docs/setup/best-practices/certificates.md | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/content/en/docs/setup/best-practices/certificates.md b/content/en/docs/setup/best-practices/certificates.md index 4de7277e4af..f8af369c804 100644 --- a/content/en/docs/setup/best-practices/certificates.md +++ b/content/en/docs/setup/best-practices/certificates.md @@ -95,6 +95,12 @@ Required certificates: | kube-apiserver-kubelet-client | kubernetes-ca | system:masters | client | | | front-proxy-client | kubernetes-front-proxy-ca | | client | | +{{< note >}} +Instead of using the super-user group `system:masters` for `kube-apiserver-kubelet-client` +a less privileged group can be used. kubeadm uses the `kubeadm:cluster-admins` group for +that purpose. 
+{{< /note >}} + [1]: any other IP or DNS name you contact your cluster on (as used by [kubeadm](/docs/reference/setup-tools/kubeadm/) the load balancer stable IP and/or DNS name, `kubernetes`, `kubernetes.default`, `kubernetes.default.svc`, `kubernetes.default.svc.cluster`, `kubernetes.default.svc.cluster.local`) From 3be75f26fd7a5025d3e4bccbae63748c3da425c7 Mon Sep 17 00:00:00 2001 From: Aldo Culquicondor Date: Tue, 17 Oct 2023 09:31:02 -0400 Subject: [PATCH 34/82] Graduate JobPodReplacementPolicy to beta --- content/en/docs/concepts/workloads/controllers/job.md | 5 +++-- .../reference/command-line-tools-reference/feature-gates.md | 3 ++- 2 files changed, 5 insertions(+), 3 deletions(-) diff --git a/content/en/docs/concepts/workloads/controllers/job.md b/content/en/docs/concepts/workloads/controllers/job.md index cb344b4e33a..0ffc3087925 100644 --- a/content/en/docs/concepts/workloads/controllers/job.md +++ b/content/en/docs/concepts/workloads/controllers/job.md @@ -953,11 +953,12 @@ scaling an indexed Job, such as MPI, Horovord, Ray, and PyTorch training jobs. ### Delayed creation of replacement pods {#pod-replacement-policy} -{{< feature-state for_k8s_version="v1.28" state="alpha" >}} +{{< feature-state for_k8s_version="v1.29" state="beta" >}} {{< note >}} You can only set `podReplacementPolicy` on Jobs if you enable the `JobPodReplacementPolicy` -[feature gate](/docs/reference/command-line-tools-reference/feature-gates/). +[feature gate](/docs/reference/command-line-tools-reference/feature-gates/) +(enabled by default). {{< /note >}} By default, the Job controller recreates Pods as soon they either fail or are terminating (have a deletion timestamp). 
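For illustration only (a sketch, not part of the patch): the `podReplacementPolicy` field described in the hunk above could be set on a Job roughly as follows. The Job name, image, and command are hypothetical placeholders. `Failed` is assumed here as the value that delays replacement until a Pod is fully terminal.

```yaml
# Hypothetical Job: names, image, and command are placeholders for illustration.
apiVersion: batch/v1
kind: Job
metadata:
  name: example-replacement-policy
spec:
  # Only create a replacement Pod once the old Pod has reached the Failed phase,
  # rather than as soon as it starts terminating.
  podReplacementPolicy: Failed
  completions: 1
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: worker
        image: busybox:1.36
        command: ["sh", "-c", "echo working; sleep 5"]
```

With the gate enabled by default in 1.29, a manifest like this should be accepted without extra API server flags. On a cluster where the gate is off, the field is expected to be dropped.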
diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates.md b/content/en/docs/reference/command-line-tools-reference/feature-gates.md index 2e09dddefb4..d272cbf503b 100644 --- a/content/en/docs/reference/command-line-tools-reference/feature-gates.md +++ b/content/en/docs/reference/command-line-tools-reference/feature-gates.md @@ -118,7 +118,8 @@ For a reference to old feature gates that are removed, please refer to | `JobBackoffLimitPerIndex` | `true` | Beta | 1.29 | | | `JobPodFailurePolicy` | `false` | Alpha | 1.25 | 1.25 | | `JobPodFailurePolicy` | `true` | Beta | 1.26 | | -| `JobPodReplacementPolicy` | `false` | Alpha | 1.28 | | +| `JobPodReplacementPolicy` | `false` | Alpha | 1.28 | 1.28 | +| `JobPodReplacementPolicy` | `true` | Beta | 1.29 | | | `KMSv2` | `false` | Alpha | 1.25 | 1.26 | | `KMSv2` | `true` | Beta | 1.27 | | | `KMSv2KDF` | `false` | Beta | 1.28 | | From 725f68ff18f0dd28144af73eb0c1e91879650a53 Mon Sep 17 00:00:00 2001 From: Patrick Ohly Date: Mon, 13 Nov 2023 14:48:34 +0100 Subject: [PATCH 35/82] dra: warn about scheduling performance That pods with ResourceClaims get scheduled more slowly, but that this also affects other pods may be surprising and is worth calling out. --- .../dynamic-resource-allocation.md | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/content/en/docs/concepts/scheduling-eviction/dynamic-resource-allocation.md b/content/en/docs/concepts/scheduling-eviction/dynamic-resource-allocation.md index f34f7a2c5ad..47420240d94 100644 --- a/content/en/docs/concepts/scheduling-eviction/dynamic-resource-allocation.md +++ b/content/en/docs/concepts/scheduling-eviction/dynamic-resource-allocation.md @@ -162,6 +162,17 @@ gets scheduled onto one node and then cannot run there, which is bad because such a pending Pod also blocks all other resources like RAM or CPU that were set aside for it. 
+{{< note >}} + +Scheduling of pods which use ResourceClaims is going to be slower because of +the additional communication that is required. Beware that this may also impact +pods that don't use ResourceClaims because only one pod at a time gets +scheduled, blocking API calls are made while handling a pod with +ResourceClaims, and thus scheduling the next pod gets delayed. + +{{< /note >}} + + ## Monitoring resources The kubelet provides a gRPC service to enable discovery of dynamic resources of From d820f2b988d90756c11b0c9371b4a830e7cc6c66 Mon Sep 17 00:00:00 2001 From: Alexander Zielenski <351783+alexzielenski@users.noreply.github.com> Date: Wed, 18 Oct 2023 09:19:36 -0700 Subject: [PATCH 36/82] add CRDValidationRatcheting 1.29 docs --- .../custom-resource-definitions.md | 39 +++++++++++++++++-- 1 file changed, 36 insertions(+), 3 deletions(-) diff --git a/content/en/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions.md b/content/en/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions.md index 28b3493a046..b5ff4b07313 100644 --- a/content/en/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions.md +++ b/content/en/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions.md @@ -749,8 +749,12 @@ validations are not supported by ratcheting under the implementation in Kubernet - `not` - any validations in a descendent of one of these fields - `x-kubernetes-validations` - For Kubernetes {{< skew currentVersion >}}, CRD validation rules](#validation-rules) are ignored by - ratcheting. This may change in later Kubernetes releases. + For Kubernetes 1.28, [CRD validation rules](#validation-rules) are ignored by + ratcheting. Starting with Alpha 2 in Kubernetes 1.29, `x-kubernetes-validations` + are ratcheted. + + Transition Rules are never ratcheted: only errors raised by rules that do not + use `oldSelf` will be automatically ratcheted if their values are unchanged.
- `x-kubernetes-list-type` Errors arising from changing the list type of a subschema will not be ratcheted. For example adding `set` onto a list with duplicates will always @@ -767,7 +771,9 @@ validations are not supported by ratcheting under the implementation in Kubernet - `additionalProperties` To remove a previously specified `additionalProperties` validation will not be ratcheted. - +- `metadata` + Errors arising from changes to fields within an object's `metadata` are not + ratcheted. ### Validation rules @@ -1177,6 +1183,33 @@ The `fieldPath` field does not support indexing arrays numerically. Setting `fieldPath` is optional. +#### The `optionalOldSelf` field {#field-optional-oldself} + +The `optionalOldSelf` field is a boolean field added in Kubernetes 1.29. The feature +[CRDValidationRatcheting](#validation-ratcheting) must be enabled in order to +make use of this field. + +This field alters the behavior of [Transition Rules](#transition-rules) described +below. Normally, a transition rule will not evaluate if `oldSelf` cannot be determined: +during object creation or when a new value is introduced in an update. + +If `optionalOldSelf` is set to true, then transition rules will always be +evaluated and the type of `oldSelf` is changed to a CEL [`Optional`](https://pkg.go.dev/github.com/google/cel-go/cel#OptionalTypes) type. + +`optionalOldSelf` is useful in cases where schema authors would like a more +powerful tool than [implicit deepequal validation ratcheting](#validation-ratcheting) +to introduce newer, usually stricter constraints on new values, while still +allowing old values to be "grandfathered" or ratcheted using the older validation. + +Example Usage: + +| CEL | Description | +|-----------------------------------------|-------------| +| `self.foo == "foo" || (oldSelf.hasValue() && oldSelf.value().foo != "foo")` | Ratcheted rule. Once a value is set to "foo", it must stay foo.
But if it existed before the "foo" constraint was introduced, it may use any value | +| `[oldSelf.orValue(""), self].all(x, ["OldCase1", "OldCase2"].exists(case, x == case)) || ["NewCase1", "NewCase2"].exists(case, self == case) || ["NewCase"].has(self)` | Ratcheted validation for removed enum cases if oldSelf used them | +| `oldSelf.optMap(o, o.size()).orValue(0) < 4 || self.size() >= 4` | Ratcheted validation of newly increased minimum map or list size | + + #### Validation functions {#available-validation-functions} Functions available include: From 407407e92131e1ea44969e6034acc52e233f0b7c Mon Sep 17 00:00:00 2001 From: Sean Sullivan Date: Fri, 20 Oct 2023 22:57:17 +0000 Subject: [PATCH 37/82] Placeholder for KEP-4006 --- .../command-line-tools-reference/feature-gates.md | 4 ++++ content/en/docs/reference/kubectl/kubectl.md | 8 ++++++++ 2 files changed, 12 insertions(+) diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates.md b/content/en/docs/reference/command-line-tools-reference/feature-gates.md index d272cbf503b..e4425639e25 100644 --- a/content/en/docs/reference/command-line-tools-reference/feature-gates.md +++ b/content/en/docs/reference/command-line-tools-reference/feature-gates.md @@ -205,6 +205,7 @@ For a reference to old feature gates that are removed, please refer to | `TopologyManagerPolicyBetaOptions` | `true` | Beta | 1.28 | | | `TopologyManagerPolicyOptions` | `false` | Alpha | 1.26 | 1.27 | | `TopologyManagerPolicyOptions` | `true` | Beta | 1.28 | | +| `TranslateStreamCloseWebsocketRequests` | `false` | Alpha | 1.29 | | | `UnknownVersionInteroperabilityProxy` | `false` | Alpha | 1.28 | | | `UserNamespacesSupport` | `false` | Alpha | 1.28 | | | `ValidatingAdmissionPolicy` | `false` | Alpha | 1.26 | 1.27 | @@ -739,6 +740,9 @@ Each feature gate is designed for enabling/disabling a specific feature: This feature gate guards *a group* of topology manager options whose quality level is beta.
This feature gate will never graduate to stable. - `TopologyManagerPolicyOptions`: Allow fine-tuning of topology manager policies, +- `TranslateStreamCloseWebsocketRequests`: Allow WebSocket streaming of the + remote command sub-protocol (`exec`, `cp`, `attach`) from clients requesting + version 5 (v5) of the sub-protocol. - `UnknownVersionInteroperabilityProxy`: Proxy resource requests to the correct peer kube-apiserver when multiple kube-apiservers exist at varied versions. See [Mixed version proxy](/docs/concepts/architecture/mixed-version-proxy/) for more information. diff --git a/content/en/docs/reference/kubectl/kubectl.md b/content/en/docs/reference/kubectl/kubectl.md index aa92d9f9685..80377dce33e 100644 --- a/content/en/docs/reference/kubectl/kubectl.md +++ b/content/en/docs/reference/kubectl/kubectl.md @@ -369,6 +369,14 @@ kubectl [flags] + +KUBECTL_REMOTE_COMMAND_WEBSOCKETS + + +When set to true, the kubectl exec, cp, and attach commands will attempt to stream using the websockets protocol. If the upgrade to websockets fails, the commands will fallback to use the current SPDY protocol. 
+ + + From b1d5b82a4a72ecd4b04dbad8f27e76fe614d2014 Mon Sep 17 00:00:00 2001 From: Antonio Ojea Date: Tue, 14 Nov 2023 22:35:32 +0000 Subject: [PATCH 38/82] remove MultiCIDRRangeAllocator --- .../reference/command-line-tools-reference/feature-gates.md | 2 -- 1 file changed, 2 deletions(-) diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates.md b/content/en/docs/reference/command-line-tools-reference/feature-gates.md index d272cbf503b..8312e733bf4 100644 --- a/content/en/docs/reference/command-line-tools-reference/feature-gates.md +++ b/content/en/docs/reference/command-line-tools-reference/feature-gates.md @@ -146,7 +146,6 @@ For a reference to old feature gates that are removed, please refer to | `MinDomainsInPodTopologySpread` | `false` | Alpha | 1.24 | 1.24 | | `MinDomainsInPodTopologySpread` | `false` | Beta | 1.25 | 1.26 | | `MinDomainsInPodTopologySpread` | `true` | Beta | 1.27 | | -| `MultiCIDRRangeAllocator` | `false` | Alpha | 1.25 | | | `MultiCIDRServiceAllocator` | `false` | Alpha | 1.27 | | | `NewVolumeManagerReconstruction` | `false` | Beta | 1.27 | 1.27 | | `NewVolumeManagerReconstruction` | `true` | Beta | 1.28 | | @@ -619,7 +618,6 @@ Each feature gate is designed for enabling/disabling a specific feature: [Pod topology spread constraints](/docs/concepts/scheduling-eviction/topology-spread-constraints/). - `MinimizeIPTablesRestore`: Enables new performance improvement logics in the kube-proxy iptables mode. -- `MultiCIDRRangeAllocator`: Enables the MultiCIDR range allocator. - `MultiCIDRServiceAllocator`: Track IP address allocations for Service cluster IPs using IPAddress objects. - `NewVolumeManagerReconstruction`: Enables improved discovery of mounted volumes during kubelet startup. 
Since this code has been significantly refactored, we allow to opt-out in case kubelet From 8f7cfdbf9c856a0d6a1c339cf5fc97211569647f Mon Sep 17 00:00:00 2001 From: Matthias Bertschy Date: Wed, 15 Nov 2023 01:21:56 +0100 Subject: [PATCH 39/82] modifying docs for SidecarContainers beta graduation (#43471) Signed-off-by: Matthias Bertschy --- .../en/docs/concepts/workloads/pods/_index.md | 4 ++-- .../concepts/workloads/pods/init-containers.md | 6 +++--- .../concepts/workloads/pods/pod-lifecycle.md | 18 +++++++++++++++++- .../feature-gates.md | 3 ++- 4 files changed, 24 insertions(+), 7 deletions(-) diff --git a/content/en/docs/concepts/workloads/pods/_index.md b/content/en/docs/concepts/workloads/pods/_index.md index febf062c2eb..1132c38793c 100644 --- a/content/en/docs/concepts/workloads/pods/_index.md +++ b/content/en/docs/concepts/workloads/pods/_index.md @@ -111,9 +111,9 @@ Some Pods have {{< glossary_tooltip text="init containers" term_id="init-contain as well as {{< glossary_tooltip text="app containers" term_id="app-container" >}}. By default, init containers run and complete before the app containers are started. -{{< feature-state for_k8s_version="v1.28" state="alpha" >}} +{{< feature-state for_k8s_version="v1.29" state="beta" >}} -Enabling the `SidecarContainers` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) +Enabled by default, the `SidecarContainers` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) allows you to specify `restartPolicy: Always` for init containers. Setting the `Always` restart policy ensures that the init containers where you set it are kept running during the entire lifetime of the Pod. 
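For illustration only (a sketch, not part of the patch): the hunk above describes setting `restartPolicy: Always` on an init container so that it keeps running as a sidecar. A Pod using that field might look like the following. The Pod name, container names, images, commands, and the shared `emptyDir` volume are hypothetical choices made for the example.

```yaml
# Hypothetical Pod: names, images, and commands are placeholders for illustration.
apiVersion: v1
kind: Pod
metadata:
  name: example-with-sidecar
spec:
  initContainers:
  - name: log-shipper
    image: busybox:1.36
    # restartPolicy: Always marks this init container as a sidecar: it starts
    # before the app container and keeps running for the lifetime of the Pod.
    restartPolicy: Always
    command: ["sh", "-c", "tail -F /var/log/app/app.log"]
    volumeMounts:
    - name: logs
      mountPath: /var/log/app
  containers:
  - name: app
    image: busybox:1.36
    command: ["sh", "-c", "while true; do date >> /var/log/app/app.log; sleep 5; done"]
    volumeMounts:
    - name: logs
      mountPath: /var/log/app
  volumes:
  - name: logs
    emptyDir: {}
```

The shared volume is only there to give the sidecar something to do. The field that matters for this patch is `restartPolicy: Always` on the init container.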
diff --git a/content/en/docs/concepts/workloads/pods/init-containers.md b/content/en/docs/concepts/workloads/pods/init-containers.md index 2533c286d79..03152840dd0 100644 --- a/content/en/docs/concepts/workloads/pods/init-containers.md +++ b/content/en/docs/concepts/workloads/pods/init-containers.md @@ -291,9 +291,9 @@ validation error is thrown for any container sharing a name with another. #### API for sidecar containers -{{< feature-state for_k8s_version="v1.28" state="alpha" >}} +{{< feature-state for_k8s_version="v1.29" state="beta" >}} -Starting with Kubernetes 1.28 in alpha, a feature gate named `SidecarContainers` +Enabled by default with Kubernetes 1.29, a feature gate named `SidecarContainers` allows you to specify a `restartPolicy` for init containers which is independent of the Pod and other init containers. Container [probes](/docs/concepts/workloads/pods/pod-lifecycle/#types-of-probe) can also be added to control their lifecycle. @@ -376,4 +376,4 @@ Kubernetes, consult the documentation for the version you are using. * Read about [creating a Pod that has an init container](/docs/tasks/configure-pod-container/configure-pod-initialization/#create-a-pod-that-has-an-init-container) * Learn how to [debug init containers](/docs/tasks/debug/debug-application/debug-init-containers/) * Read about an overview of [kubelet](/docs/reference/command-line-tools-reference/kubelet/) and [kubectl](/docs/reference/kubectl/) -* Learn about the [types of probes](/docs/concepts/workloads/pods/pod-lifecycle/#types-of-probe): liveness, readiness, startup probe. +* Learn about the [types of probes](/docs/concepts/workloads/pods/pod-lifecycle/#types-of-probe): liveness, readiness, startup probe. 
\ No newline at end of file diff --git a/content/en/docs/concepts/workloads/pods/pod-lifecycle.md b/content/en/docs/concepts/workloads/pods/pod-lifecycle.md index 1f73ccbe3ff..fda5fdb89d8 100644 --- a/content/en/docs/concepts/workloads/pods/pod-lifecycle.md +++ b/content/en/docs/concepts/workloads/pods/pod-lifecycle.md @@ -504,6 +504,22 @@ termination grace period _begins_. The behavior above is described when the feature gate `EndpointSliceTerminatingCondition` is enabled. {{< /note >}} +{{< note >}} +Beginning with Kubernetes 1.29, if your Pod includes one or more sidecar containers +(init containers with an Always restart policy), the kubelet will delay sending +the TERM signal to these sidecar containers until the last main container has fully terminated. +The sidecar containers will be terminated in the reverse order they are defined in the Pod spec. +This ensures that sidecar containers continue serving the other containers in the Pod until they are no longer needed. + +Note that slow termination of a main container will also delay the termination of the sidecar containers. +If the grace period expires before the termination process is complete, the Pod may enter emergency termination. +In this case, all remaining containers in the Pod will be terminated simultaneously with a short grace period. + +Similarly, if the Pod has a preStop hook that exceeds the termination grace period, emergency termination may occur. +In general, if you have used preStop hooks to control the termination order without sidecar containers, you can now +remove them and allow the kubelet to manage sidecar termination automatically. +{{< /note >}} + 1. When the grace period expires, the kubelet triggers forcible shutdown. The container runtime sends `SIGKILL` to any processes still running in any container in the Pod. The kubelet also cleans up a hidden `pause` container if that container runtime uses one. @@ -584,4 +600,4 @@ for more details.
* For detailed information about Pod and container status in the API, see the API reference documentation covering - [`status`](/docs/reference/kubernetes-api/workload-resources/pod-v1/#PodStatus) for Pod. + [`status`](/docs/reference/kubernetes-api/workload-resources/pod-v1/#PodStatus) for Pod. \ No newline at end of file diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates.md b/content/en/docs/reference/command-line-tools-reference/feature-gates.md index 8312e733bf4..f6639c72c71 100644 --- a/content/en/docs/reference/command-line-tools-reference/feature-gates.md +++ b/content/en/docs/reference/command-line-tools-reference/feature-gates.md @@ -185,7 +185,8 @@ For a reference to old feature gates that are removed, please refer to | `SELinuxMountReadWriteOncePod` | `true` | Beta | 1.28 | | | `SchedulerQueueingHints` | `true` | Beta | 1.28 | | | `SecurityContextDeny` | `false` | Alpha | 1.27 | | -| `SidecarContainers` | `false` | Alpha | 1.28 | | +| `SidecarContainers` | `false` | Alpha | 1.28 | 1.28 | +| `SidecarContainers` | `true` | Beta | 1.29 | | | `SizeMemoryBackedVolumes` | `false` | Alpha | 1.20 | 1.21 | | `SizeMemoryBackedVolumes` | `true` | Beta | 1.22 | | | `StableLoadBalancerNodeSet` | `true` | Beta | 1.27 | | From 16fb2e68c6dd40c18952925dbccc9d61f39e37e1 Mon Sep 17 00:00:00 2001 From: Cici Huang Date: Wed, 11 Oct 2023 17:52:25 +0000 Subject: [PATCH 40/82] Promote CRD validation rules to GA --- .../command-line-tools-reference/feature-gates.md | 5 +++-- .../custom-resources/custom-resource-definitions.md | 10 +--------- 2 files changed, 4 insertions(+), 11 deletions(-) diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates.md b/content/en/docs/reference/command-line-tools-reference/feature-gates.md index d272cbf503b..d7f4d736c06 100644 --- a/content/en/docs/reference/command-line-tools-reference/feature-gates.md +++ b/content/en/docs/reference/command-line-tools-reference/feature-gates.md @@ -89,8 
+89,6 @@ For a reference to old feature gates that are removed, please refer to | `CronJobsScheduledAnnotation` | `true` | Beta | 1.28 | | | `CrossNamespaceVolumeDataSource` | `false` | Alpha| 1.26 | | | `CustomCPUCFSQuotaPeriod` | `false` | Alpha | 1.12 | | -| `CustomResourceValidationExpressions` | `false` | Alpha | 1.23 | 1.24 | -| `CustomResourceValidationExpressions` | `true` | Beta | 1.25 | | | `DevicePluginCDIDevices` | `false` | Alpha | 1.28 | | | `DisableCloudProviders` | `false` | Alpha | 1.22 | | | `DisableKubeletCloudCredentialProviders` | `false` | Alpha | 1.23 | | @@ -249,6 +247,9 @@ For a reference to old feature gates that are removed, please refer to | `ComponentSLIs` | `true` | Beta | 1.27 | 1.28| | `ComponentSLIs` | `true` | GA | 1.29 | - | | `ConsistentHTTPGetHandlers` | `true` | GA | 1.25 | - | +| `CustomResourceValidationExpressions` | `false` | Alpha | 1.23 | 1.24 | +| `CustomResourceValidationExpressions` | `true` | Beta | 1.25 | 1.28 | +| `CustomResourceValidationExpressions` | `true` | GA | 1.29 | - | | `DaemonSetUpdateSurge` | `false` | Alpha | 1.21 | 1.21 | | `DaemonSetUpdateSurge` | `true` | Beta | 1.22 | 1.24 | | `DaemonSetUpdateSurge` | `true` | GA | 1.25 | | diff --git a/content/en/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions.md b/content/en/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions.md index 28b3493a046..ff7bb46c324 100644 --- a/content/en/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions.md +++ b/content/en/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions.md @@ -771,15 +771,7 @@ validations are not supported by ratcheting under the implementation in Kubernet ### Validation rules -{{< feature-state state="beta" for_k8s_version="v1.25" >}} - - -Validation rules are in beta since 1.25 and the `CustomResourceValidationExpressions` -[feature gate](/docs/reference/command-line-tools-reference/feature-gates/) is enabled by default 
to -validate custom resource based on _validation rules_. You can disable this feature by explicitly -setting the `CustomResourceValidationExpressions` feature gate to `false`, for the -[kube-apiserver](/docs/reference/command-line-tools-reference/kube-apiserver/) component. This -feature is only available if the schema is a [structural schema](#specifying-a-structural-schema). +{{< feature-state state="stable" for_k8s_version="v1.29" >}} Validation rules use the [Common Expression Language (CEL)](https://github.com/google/cel-spec) to validate custom resource values. Validation rules are included in From 50ea97524ef7b0de50abb0e671361a243fcd3f32 Mon Sep 17 00:00:00 2001 From: charles-chenzz Date: Thu, 16 Nov 2023 20:03:14 +0800 Subject: [PATCH 41/82] update pod-lifecycle.md to reflect the state of podreadytostartcontainer --- content/en/docs/concepts/workloads/pods/pod-lifecycle.md | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/content/en/docs/concepts/workloads/pods/pod-lifecycle.md b/content/en/docs/concepts/workloads/pods/pod-lifecycle.md index 1f73ccbe3ff..85f20701656 100644 --- a/content/en/docs/concepts/workloads/pods/pod-lifecycle.md +++ b/content/en/docs/concepts/workloads/pods/pod-lifecycle.md @@ -164,7 +164,7 @@ through which the Pod has or has not passed. Kubelet manages the following PodConditions: * `PodScheduled`: the Pod has been scheduled to a node. -* `PodReadyToStartContainers`: (alpha feature; must be [enabled explicitly](#pod-has-network)) the +* `PodReadyToStartContainers`: (beta feature; enable by [default now](#pod-has-network)) the Pod sandbox has been successfully created and networking configured. * `ContainersReady`: all containers in the Pod are ready. 
* `Initialized`: all [init containers](/docs/concepts/workloads/pods/init-containers/) @@ -242,17 +242,16 @@ When a Pod's containers are Ready but at least one custom condition is missing o ### Pod network readiness {#pod-has-network} -{{< feature-state for_k8s_version="v1.25" state="alpha" >}} +{{< feature-state for_k8s_version="v1.29" state="beta" >}} {{< note >}} -This condition was renamed from PodHasNetwork to PodReadyToStartContainers. +This condition was renamed from PodHasNetwork to PodReadyToStartContainers. And now is enable by default {{< /note >}} After a Pod gets scheduled on a node, it needs to be admitted by the Kubelet and have any volumes mounted. Once these phases are complete, the Kubelet works with a container runtime (using {{< glossary_tooltip term_id="cri" >}}) to set up a -runtime sandbox and configure networking for the Pod. If the -`PodReadyToStartContainersCondition` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) is enabled, +runtime sandbox and configure networking for the Pod, Kubelet reports whether a pod has reached this initialization milestone through the `PodReadyToStartContainers` condition in the `status.conditions` field of a Pod. From e109ce70bac38de8443c2ccb68ac06e645e97e7d Mon Sep 17 00:00:00 2001 From: charles-chenzz Date: Thu, 16 Nov 2023 21:17:19 +0800 Subject: [PATCH 42/82] first round of comment address --- .../concepts/workloads/pods/pod-lifecycle.md | 17 ++++++++++------- 1 file changed, 10 insertions(+), 7 deletions(-) diff --git a/content/en/docs/concepts/workloads/pods/pod-lifecycle.md b/content/en/docs/concepts/workloads/pods/pod-lifecycle.md index 85f20701656..f7405b412e9 100644 --- a/content/en/docs/concepts/workloads/pods/pod-lifecycle.md +++ b/content/en/docs/concepts/workloads/pods/pod-lifecycle.md @@ -164,7 +164,7 @@ through which the Pod has or has not passed. Kubelet manages the following PodConditions: * `PodScheduled`: the Pod has been scheduled to a node. 
-* `PodReadyToStartContainers`: (beta feature; enable by [default now](#pod-has-network)) the +* `PodReadyToStartContainers`: (beta feature; enabled by [default](#pod-has-network)) the Pod sandbox has been successfully created and networking configured. * `ContainersReady`: all containers in the Pod are ready. * `Initialized`: all [init containers](/docs/concepts/workloads/pods/init-containers/) @@ -245,15 +245,18 @@ When a Pod's containers are Ready but at least one custom condition is missing o {{< feature-state for_k8s_version="v1.29" state="beta" >}} {{< note >}} -This condition was renamed from PodHasNetwork to PodReadyToStartContainers. And now is enable by default +During its early development, this condition was named `PodHasNetwork`. {{< /note >}} -After a Pod gets scheduled on a node, it needs to be admitted by the Kubelet and -have any volumes mounted. Once these phases are complete, the Kubelet works with +After a Pod gets scheduled on a node, it needs to be admitted by the kubelet and +to have any required storage volumes mounted. Once these phases are complete, +the kubelet works with a container runtime (using {{< glossary_tooltip term_id="cri" >}}) to set up a -runtime sandbox and configure networking for the Pod, -Kubelet reports whether a pod has reached this initialization milestone through -the `PodReadyToStartContainers` condition in the `status.conditions` field of a Pod. +runtime sandbox and configure networking for the Pod. If the +`PodReadyToStartContainersCondition` +[feature gate](/docs/reference/command-line-tools-reference/feature-gates/) is enabled +(it is enabled by default for Kubernetes {{< skew currentVersion >}}), the +`PodReadyToStartContainers` condition will be added to the `status.conditions` field of a Pod. The `PodReadyToStartContainers` condition is set to `False` by the Kubelet when it detects a Pod does not have a runtime sandbox with networking configured.
This occurs in From 42c9e4e20fdb3ba5f0f6b9251c5e8e63d03ddef3 Mon Sep 17 00:00:00 2001 From: Monis Khan Date: Wed, 15 Nov 2023 15:51:42 -0500 Subject: [PATCH 43/82] KEP-4193: bound service account token improvements Signed-off-by: Monis Khan --- .../concepts/security/service-accounts.md | 5 ++- .../service-accounts-admin.md | 4 +- .../feature-gates.md | 10 +++++ .../configure-service-account.md | 45 ++++++++++++++++++- 4 files changed, 58 insertions(+), 6 deletions(-) diff --git a/content/en/docs/concepts/security/service-accounts.md b/content/en/docs/concepts/security/service-accounts.md index 365074cba97..38b81ab49df 100644 --- a/content/en/docs/concepts/security/service-accounts.md +++ b/content/en/docs/concepts/security/service-accounts.md @@ -217,7 +217,8 @@ request. The API server checks the validity of that bearer token as follows: The TokenRequest API produces _bound tokens_ for a ServiceAccount. This binding is linked to the lifetime of the client, such as a Pod, that is acting -as that ServiceAccount. +as that ServiceAccount. See [Token Volume Projection](/docs/tasks/configure-pod-container/configure-service-account/#serviceaccount-token-volume-projection) +for an example of a bound pod service account token's JWT schema and payload. For tokens issued using the `TokenRequest` API, the API server also checks that the specific object reference that is using the ServiceAccount still exists, @@ -239,7 +240,7 @@ account credentials, you can use the following methods: The Kubernetes project recommends that you use the TokenReview API, because this method invalidates tokens that are bound to API objects such as Secrets, -ServiceAccounts, and Pods when those objects are deleted. For example, if you +ServiceAccounts, Pods or Nodes when those objects are deleted. For example, if you delete the Pod that contains a projected ServiceAccount token, the cluster invalidates that token immediately and a TokenReview immediately fails. 
If you use OIDC validation instead, your clients continue to treat the token diff --git a/content/en/docs/reference/access-authn-authz/service-accounts-admin.md b/content/en/docs/reference/access-authn-authz/service-accounts-admin.md index ca6f831da83..252d090d8f1 100644 --- a/content/en/docs/reference/access-authn-authz/service-accounts-admin.md +++ b/content/en/docs/reference/access-authn-authz/service-accounts-admin.md @@ -1,9 +1,7 @@ --- reviewers: - - bprashanth - - davidopp - - lavalamp - liggitt + - enj title: Managing Service Accounts content_type: concept weight: 50 diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates.md b/content/en/docs/reference/command-line-tools-reference/feature-gates.md index 7aaa7b2cd4a..4d015964d28 100644 --- a/content/en/docs/reference/command-line-tools-reference/feature-gates.md +++ b/content/en/docs/reference/command-line-tools-reference/feature-gates.md @@ -186,6 +186,10 @@ For a reference to old feature gates that are removed, please refer to | `SELinuxMountReadWriteOncePod` | `true` | Beta | 1.28 | | | `SchedulerQueueingHints` | `true` | Beta | 1.28 | | | `SecurityContextDeny` | `false` | Alpha | 1.27 | | +| `ServiceAccountTokenJTI` | `false` | Alpha | 1.29 | | +| `ServiceAccountTokenNodeBinding` | `false` | Alpha | 1.29 | | +| `ServiceAccountTokenNodeBindingValidation` | `false` | Alpha | 1.29 | | +| `ServiceAccountTokenPodNodeInfo` | `false` | Alpha | 1.29 | | | `SidecarContainers` | `false` | Alpha | 1.28 | 1.28 | | `SidecarContainers` | `true` | Beta | 1.29 | | | `SizeMemoryBackedVolumes` | `false` | Alpha | 1.20 | 1.21 | @@ -726,6 +730,12 @@ Each feature gate is designed for enabling/disabling a specific feature: - `ServerSideFieldValidation`: Enables server-side field validation. This means the validation of resource schema is performed at the API server side rather than the client side (for example, the `kubectl create` or `kubectl apply` command line). 
+- `ServiceAccountTokenJTI`: Controls whether JTIs (UUIDs) are embedded into generated service account tokens, + and whether these JTIs are recorded into the Kubernetes audit log for future requests made by these tokens. +- `ServiceAccountTokenNodeBinding`: Controls whether the apiserver allows binding service account tokens to Node objects. +- `ServiceAccountTokenNodeBindingValidation`: Controls whether the apiserver will validate a Node reference in service account tokens. +- `ServiceAccountTokenPodNodeInfo`: Controls whether the apiserver embeds the node name and uid + for the associated node when issuing service account tokens bound to Pod objects. - `SidecarContainers`: Allow setting the `restartPolicy` of an init container to `Always` so that the container becomes a sidecar container (restartable init containers). See diff --git a/content/en/docs/tasks/configure-pod-container/configure-service-account.md b/content/en/docs/tasks/configure-pod-container/configure-service-account.md index e5530ec2a78..002fc3708e9 100644 --- a/content/en/docs/tasks/configure-pod-container/configure-service-account.md +++ b/content/en/docs/tasks/configure-pod-container/configure-service-account.md @@ -1,6 +1,6 @@ --- reviewers: -- bprashanth +- enj - liggitt - thockin title: Configure Service Accounts for Pods @@ -184,6 +184,16 @@ ServiceAccount. You can request a specific token duration using the `--duration` command line argument to `kubectl create token` (the actual duration of the issued token might be shorter, or could even be longer). 
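For illustration, a duration request maps onto the `spec.expirationSeconds` field of a TokenRequest. The following is a minimal sketch, assuming the `build-robot` ServiceAccount used on this page and a 10 minute lifetime picked only as an example. It is roughly what a client would submit to the ServiceAccount's `token` subresource, not necessarily the exact object that `kubectl` builds:

```yaml
# Sketch only: a TokenRequest asking for a 10 minute token.
# It would be sent to the `token` subresource of the build-robot
# ServiceAccount (the ServiceAccount name is taken from this page).
apiVersion: authentication.k8s.io/v1
kind: TokenRequest
spec:
  # 600 seconds is an illustrative value, comparable to `--duration 10m`
  expirationSeconds: 600
```

The API server may issue a token with a different lifetime than requested; the response reports the actual expiry in `status.expirationTimestamp`.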
+When the `ServiceAccountTokenNodeBinding` and `ServiceAccountTokenNodeBindingValidation` +features are enabled and the `KUBECTL_NODE_BOUND_TOKENS` environment variable is set to `true`, +it is possible to create a service account token that is directly bound to a `Node`: + +```shell +KUBECTL_NODE_BOUND_TOKENS=true kubectl create token build-robot --bound-object-kind Node --bound-object-name node-001 --bound-object-uid 123...456 +``` + +The token will be valid until it expires or until either the associated `Node` or the service account is deleted. + {{< note >}} Versions of Kubernetes before v1.22 automatically created long term credentials for accessing the Kubernetes API. This older mechanism was based on creating token Secrets @@ -408,6 +418,39 @@ You can configure this behavior for the `spec` of a Pod using a [projected volume](/docs/concepts/storage/volumes/#projected) type called `ServiceAccountToken`. +The token from this projected volume is a JSON Web Token (JWT). +The JSON payload of this token follows a well-defined schema. An example payload for a Pod-bound token: + +```yaml +{ + "aud": [ # matches the requested audiences, or the API server's default audiences when none are explicitly requested + "https://kubernetes.default.svc" + ], + "exp": 1731613413, + "iat": 1700077413, + "iss": "https://kubernetes.default.svc", # matches the first value passed to the --service-account-issuer flag + "jti": "ea28ed49-2e11-4280-9ec5-bc3d1d84661a", # ServiceAccountTokenJTI feature must be enabled for the claim to be present + "kubernetes.io": { + "namespace": "kube-system", + "node": { # ServiceAccountTokenPodNodeInfo feature must be enabled for the API server to add this node reference claim + "name": "127.0.0.1", + "uid": "58456cb0-dd00-45ed-b797-5578fdceaced" + }, + "pod": { + "name": "coredns-69cbfb9798-jv9gn", + "uid": "778a530c-b3f4-47c0-9cd5-ab018fb64f33" + }, + "serviceaccount": { + "name": "coredns", + "uid": "a087d5a0-e1dd-43ec-93ac-f13d89cd13af" + }, + "warnafter": 1700081020 
+ }, + "nbf": 1700077413, + "sub": "system:serviceaccount:kube-system:coredns" +} +``` + ### Launch a Pod using service account token projection To provide a Pod with a token with an audience of `vault` and a validity duration From bcb527b5bebd66437ab6437333709147e6844e8c Mon Sep 17 00:00:00 2001 From: tinatingyu Date: Sat, 28 Oct 2023 03:52:22 +0000 Subject: [PATCH 44/82] Add LegacyServiceAccountTokenCleanUp feature to beta --- .../service-accounts-admin.md | 107 ++++++++++++++++++ .../feature-gates.md | 9 +- .../labels-annotations-taints/_index.md | 17 +++ 3 files changed, 130 insertions(+), 3 deletions(-) diff --git a/content/en/docs/reference/access-authn-authz/service-accounts-admin.md b/content/en/docs/reference/access-authn-authz/service-accounts-admin.md index ca6f831da83..778a5dc72a0 100644 --- a/content/en/docs/reference/access-authn-authz/service-accounts-admin.md +++ b/content/en/docs/reference/access-authn-authz/service-accounts-admin.md @@ -140,6 +140,62 @@ using [TokenRequest](/docs/reference/kubernetes-api/authentication-resources/tok to obtain short-lived API access tokens is recommended instead. {{< /note >}} +## Auto-generated legacy ServiceAccount token clean up {#auto-generated-legacy-serviceaccount-token-clean-up} + +Before version 1.24, Kubernetes automatically generated Secret-based tokens for +ServiceAccounts. To distinguish between automatically generated tokens and +manually created ones, Kubernetes checks for a reference from the +ServiceAccount's secrets field. If the Secret is referenced in the `secrets` +field, it is considered an auto-generated legacy token. Otherwise, it is +considered a manually created legacy token. 
For example: + +```yaml +apiVersion: v1 +kind: ServiceAccount +metadata: + name: build-robot + namespace: default +secrets: + - name: build-robot-secret # usually NOT present for a manually generated token +``` + +Beginning from version 1.29, legacy ServiceAccount tokens that were generated +automatically will be marked as invalid if they remain unused for a certain +period of time (set to default at one year). Tokens that continue to be unused +for this defined period (again, by default, one year) will subsequently be +purged by the control plane. + +If users use an invalidated auto-generated token, the token validator will + +1. add an audit annotation for the key-value pair + `authentication.k8s.io/legacy-token-invalidated: /`, +1. increment the `invalid_legacy_auto_token_uses_total` metric count, +1. update the Secret label `kubernetes.io/legacy-token-last-used` with the new + date, +1. return an error indicating that the token has been invalidated. + +When receiving this validation error, users can update the Secret to remove the +`kubernetes.io/legacy-token-invalid-since` label to temporarily allow use of +this token. + +Here's an example of an auto-generated legacy token that has been marked with the +`kubernetes.io/legacy-token-last-used` and `kubernetes.io/legacy-token-invalid-since` +labels: + +```yaml +apiVersion: v1 +kind: Secret +metadata: + name: build-robot-secret + namespace: default + labels: + kubernetes.io/legacy-token-last-used: 2022-10-24 + kubernetes.io/legacy-token-invalid-since: 2023-10-25 + annotations: + kubernetes.io/service-account.name: build-robot +type: kubernetes.io/service-account-token +``` + ## Control plane details ### ServiceAccount controller @@ -193,6 +249,51 @@ it does the following when a Pod is created: 1. If the spec of the incoming Pod doesn't already contain any `imagePullSecrets`, then the admission controller adds `imagePullSecrets`, copying them from the `ServiceAccount`. 
+### Legacy ServiceAccount token tracking controller + +{{< feature-state for_k8s_version="v1.28" state="stable" >}} + +This controller generates a ConfigMap called +`kube-system/kube-apiserver-legacy-service-account-token-tracking` in the +`kube-system` namespace. The ConfigMap records the timestamp when legacy service +account tokens began to be monitored by the system. + +### Legacy ServiceAccount token cleaner + +{{< feature-state for_k8s_version="v1.29" state="beta" >}} + +The legacy ServiceAccount token cleaner runs as part of the +`kube-controller-manager` and checks every 24 hours to see if any auto-generated +legacy ServiceAccount token has not been used in a *specified amount of time*. +If so, the cleaner marks those tokens as invalid. + +The cleaner works by first checking the ConfigMap created by the control plane +(provided that `LegacyServiceAccountTokenTracking` is enabled). If the current +time is a *specified amount of time* after the date in the ConfigMap, the +cleaner then loops through the list of Secrets in the cluster and evaluates each +Secret that has the type `kubernetes.io/service-account-token`. + +If a Secret meets all of the following conditions, the cleaner marks it as +invalid: + +- The Secret is auto-generated, meaning that it is bi-directionally referenced + by a ServiceAccount. +- The Secret is not currently mounted by any pods. +- The Secret has not been used in a *specified amount of time* since it was + created or since it was last used. + +The cleaner marks a Secret invalid by adding a label called +`kubernetes.io/legacy-token-invalid-since` to the Secret, with the current date +as the value. If an invalid Secret is not used in a *specified amount of time*, +the cleaner will delete it. + +{{< note >}} +All the *specified amount of time* above defaults to one year. 
The cluster +administrator can configure this value through the +`--legacy-service-account-token-clean-up-period` command line argument for the +`kube-controller-manager` component. +{{< /note >}} + ### TokenRequest API {{< feature-state for_k8s_version="v1.22" state="stable" >}} @@ -300,6 +401,12 @@ token: ... If you launch a new Pod into the `examplens` namespace, it can use the `myserviceaccount` service-account-token Secret that you just created. +{{< caution >}} +Do not reference manually created Secrets in the `secrets` field of a +ServiceAccount. Otherwise, the manually created Secrets will be cleaned up if they are not used for a long +time. Please refer to [auto-generated legacy ServiceAccount token clean up](#auto-generated-legacy-serviceaccount-token-clean-up). +{{< /caution >}} + ## Delete/invalidate a ServiceAccount token {#delete-token} If you know the name of the Secret that contains the token you want to remove: diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates.md b/content/en/docs/reference/command-line-tools-reference/feature-gates.md index 4bf9a734983..ab6e7cd67ca 100644 --- a/content/en/docs/reference/command-line-tools-reference/feature-gates.md +++ b/content/en/docs/reference/command-line-tools-reference/feature-gates.md @@ -129,7 +129,8 @@ For a reference to old feature gates that are removed, please refer to | `KubeletPodResourcesGet` | `false` | Alpha | 1.27 | | | `KubeletTracing` | `false` | Alpha | 1.25 | 1.26 | | `KubeletTracing` | `true` | Beta | 1.27 | | -| `LegacyServiceAccountTokenCleanUp` | `false` | Alpha | 1.28 | | +| `LegacyServiceAccountTokenCleanUp` | `false` | Alpha | 1.28 | 1.28 | +| `LegacyServiceAccountTokenCleanUp` | `true` | Beta | 1.29 | | | `LoadBalancerIPMode` | `false` | Alpha | 1.29 | | | `LocalStorageCapacityIsolationFSQuotaMonitoring` | `false` | Alpha | 1.15 | - | | `LogarithmicScaleDown` | `false` | Alpha | 1.21 | 1.21 | @@ -603,9 +604,11 @@ Each feature gate is designed for enabling/disabling a 
specific feature: See [Traces for Kubernetes System Components](/docs/concepts/cluster-administration/system-traces) for more details. - `LegacyServiceAccountTokenNoAutoGeneration`: Stop auto-generation of Secret-based [service account tokens](/docs/concepts/security/service-accounts/#get-a-token). -- `LegacyServiceAccountTokenCleanUp`: Enable cleaning up Secret-based +- `LegacyServiceAccountTokenCleanUp`: Enable invalidating auto-generated Secret-based [service account tokens](/docs/concepts/security/service-accounts/#get-a-token) - when they are not used in a specified time (default to be one year). + when they have not been used in a specified time (defaults to one year). Clean up + the auto-generated Secret-based tokens if they have been invalidated for a specified time + (defaults to one year). - `LegacyServiceAccountTokenTracking`: Track usage of Secret-based [service account tokens](/docs/concepts/security/service-accounts/#get-a-token). - `LoadBalancerIPMode`: Allows setting `ipMode` for Services where `type` is set to `LoadBalancer`. diff --git a/content/en/docs/reference/labels-annotations-taints/_index.md b/content/en/docs/reference/labels-annotations-taints/_index.md index fd00019d528..5d16729dbda 100644 --- a/content/en/docs/reference/labels-annotations-taints/_index.md +++ b/content/en/docs/reference/labels-annotations-taints/_index.md @@ -1028,6 +1028,23 @@ last saw a request where the client authenticated using the service account toke If a legacy token was last used before the cluster gained the feature (added in Kubernetes v1.26), then the label isn't set. 
+### kubernetes.io/legacy-token-invalid-since + +Type: Label + +Example: `kubernetes.io/legacy-token-invalid-since: 2023-10-27` + +Used on: Secret + +The control plane automatically adds this label to auto-generated Secrets that +have the type `kubernetes.io/service-account-token`, provided that you have the +`LegacyServiceAccountTokenCleanUp` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) +enabled. Kubernetes {{< skew currentVersion >}} enables that behavior by default. +This label marks the Secret-based token as invalid for authentication. The value +of this label records the date (ISO 8601 format, UTC time zone) when the control +plane detects that the auto-generated Secret has not been used for a specified +duration (defaults to one year). + ### endpointslice.kubernetes.io/managed-by {#endpointslicekubernetesiomanaged-by} Type: Label From 8598729e5dd39784da921aa0174a72b61fe0c134 Mon Sep 17 00:00:00 2001 From: Anish Ramasekar Date: Tue, 10 Oct 2023 00:10:40 +0000 Subject: [PATCH 45/82] update docs for KMSv2 and KMSv2KDF stable Signed-off-by: Anish Ramasekar --- .../feature-gates.md | 11 ++- .../tasks/administer-cluster/encrypt-data.md | 13 +--- .../tasks/administer-cluster/kms-provider.md | 77 ++++++------------- 3 files changed, 34 insertions(+), 67 deletions(-) diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates.md b/content/en/docs/reference/command-line-tools-reference/feature-gates.md index d272cbf503b..07f18f43f4c 100644 --- a/content/en/docs/reference/command-line-tools-reference/feature-gates.md +++ b/content/en/docs/reference/command-line-tools-reference/feature-gates.md @@ -120,9 +120,6 @@ For a reference to old feature gates that are removed, please refer to | `JobPodFailurePolicy` | `true` | Beta | 1.26 | | | `JobPodReplacementPolicy` | `false` | Alpha | 1.28 | 1.28 | | `JobPodReplacementPolicy` | `true` | Beta | 1.29 | | -| `KMSv2` | `false` | Alpha | 1.25 | 1.26 | -| `KMSv2` | `true` | Beta 
| 1.27 | | -| `KMSv2KDF` | `false` | Beta | 1.28 | | | `KubeProxyDrainingTerminatingNodes` | `false` | Alpha | 1.28 | | | `KubeletCgroupDriverFromCRI` | `false` | Alpha | 1.28 | | | `KubeletInUserNamespace` | `false` | Alpha | 1.22 | | @@ -274,7 +271,13 @@ For a reference to old feature gates that are removed, please refer to | `JobTrackingWithFinalizers` | `false` | Beta | 1.23 | 1.24 | | `JobTrackingWithFinalizers` | `true` | Beta | 1.25 | 1.25 | | `JobTrackingWithFinalizers` | `true` | GA | 1.26 | | -| `KMSv1` | `true` | Deprecated | 1.28 | | +| `KMSv1` | `true` | Deprecated | 1.28 | 1.29 | +| `KMSv1` | `false` | Deprecated | 1.29 | | +| `KMSv2` | `false` | Alpha | 1.25 | 1.26 | +| `KMSv2` | `true` | Beta | 1.27 | 1.28 | +| `KMSv2` | `true` | GA | 1.29 | | +| `KMSv2KDF` | `false` | Beta | 1.28 | 1.29 | +| `KMSv2KDF` | `true` | GA | 1.29 | | | `KubeletPodResources` | `false` | Alpha | 1.13 | 1.14 | | `KubeletPodResources` | `true` | Beta | 1.15 | 1.27 | | `KubeletPodResources` | `true` | GA | 1.28 | | diff --git a/content/en/docs/tasks/administer-cluster/encrypt-data.md b/content/en/docs/tasks/administer-cluster/encrypt-data.md index 5a86d2df14c..402cd0e0632 100644 --- a/content/en/docs/tasks/administer-cluster/encrypt-data.md +++ b/content/en/docs/tasks/administer-cluster/encrypt-data.md @@ -248,7 +248,7 @@ The following table describes each available provider. - kms v2 (beta) + kms v2 Uses envelope encryption scheme with DEK per API server. Strongest Fast @@ -259,14 +259,10 @@ The following table describes each available provider. Data is encrypted by data encryption keys (DEKs) using AES-GCM; DEKs are encrypted by key encryption keys (KEKs) according to configuration in Key Management Service (KMS). - Kubernetes defaults to generating a new DEK at API server startup, which is then - reused for object encryption. - If you enable the KMSv2KDF - feature gate, - Kubernetes instead generates a new DEK per encryption from a secret seed. 
- Whichever approach you configure, the DEK or seed is also rotated whenever the KEK is rotated.
    + Kubernetes generates a new DEK per encryption from a secret seed. + The seed is rotated whenever the KEK is rotated.
    A good choice if using a third party tool for key management. - Available in beta from Kubernetes v1.27. + Available in stable from Kubernetes v1.29.
    Read how to configure the KMS V2 provider. @@ -538,4 +534,3 @@ To allow automatic reloading, configure the API server to run with: * Read about [decrypting data that are already stored at rest](/docs/tasks/administer-cluster/decrypt-data/) * Learn more about the [EncryptionConfiguration configuration API (v1)](/docs/reference/config-api/apiserver-encryption.v1/). - diff --git a/content/en/docs/tasks/administer-cluster/kms-provider.md b/content/en/docs/tasks/administer-cluster/kms-provider.md index 921e13d29fe..d00ad97b190 100644 --- a/content/en/docs/tasks/administer-cluster/kms-provider.md +++ b/content/en/docs/tasks/administer-cluster/kms-provider.md @@ -9,7 +9,7 @@ weight: 370 This page shows how to configure a Key Management Service (KMS) provider and plugin to enable secret data encryption. In Kubernetes {{< skew currentVersion >}} there are two versions of KMS at-rest encryption. -You should use KMS v2 if feasible because KMS v1 is deprecated (since Kubernetes v1.28). +You should use KMS v2 if feasible because KMS v1 is deprecated (since Kubernetes v1.28) and disabled by default (since Kubernetes v1.29). However, you should also read and observe the **Caution** notices in this page that highlight specific cases when you must not use KMS v2. KMS v2 offers significantly better performance characteristics than KMS v1. @@ -24,7 +24,7 @@ you have selected. Kubernetes recommends using KMS v2. (if you are running a different version of Kubernetes that also supports the v2 KMS API, switch to the documentation for that version of Kubernetes). - If you selected KMS API v1 to support clusters prior to version v1.27 - or if you have a legacy KMS plugin that only supports KMS v1, + or if you have a legacy KMS plugin that only supports KMS v1, any supported Kubernetes version will work. This API is deprecated as of Kubernetes v1.28. Kubernetes does not recommend the use of this API. @@ -35,18 +35,17 @@ you have selected. Kubernetes recommends using KMS v2. 
* Kubernetes version 1.10.0 or later is required +* For version 1.29 and later, the feature is disabled by default. + To enable the feature, set `--feature-gates=KMSv1=true` to configure a KMS v1 provider. + * Your cluster must use etcd v3 or later ### KMS v2 -{{< feature-state for_k8s_version="v1.27" state="beta" >}} +{{< feature-state for_k8s_version="v1.29" state="stable" >}} * For version 1.25 and 1.26, enabling the feature via kube-apiserver feature gate is required. Set `--feature-gates=KMSv2=true` to configure a KMS v2 provider. - For environments where all API servers are running version 1.28 or later, and you do not require the ability - to downgrade to Kubernetes v1.27, you can enable the `KMSv2KDF` feature gate (a beta feature) for more - robust data encryption key generation. The Kubernetes project recommends enabling KMS v2 KDF if those - preconditions are met. - + * Your cluster must use etcd v3 or later {{< caution >}} @@ -56,8 +55,9 @@ enabled will result in data loss. --- -Running mixed API server versions with some servers at v1.27, and others at v1.28 _with the -`KMSv2KDF` feature gate enabled_ is **not supported** - and is likely to result in data loss. +`KMSv2KDF` feature gate is enabled by default in v1.29 and cannot be disabled. +Running mixed API server versions with some servers at v1.28 _with the `KMSv2KDF` feature gate disabled_, +and others at v1.29 is **not supported** - and is likely to result in data loss. {{< /caution >}} @@ -68,47 +68,16 @@ The DEKs are encrypted with a key encryption key (KEK) that is stored and manage With KMS v1, a new DEK is generated for each encryption. -With KMS v2, there are two ways for the API server to generate a DEK. -Kubernetes defaults to generating a new DEK at API server startup, which is then reused -for resource encryption. 
However, if you use KMS v2 _and_ enable the `KMSv2KDF` -[feature gate](/docs/reference/command-line-tools-reference/feature-gates/), then -Kubernetes instead generates a new DEK **per encryption**: the API server uses a +With KMS v2, a new DEK is generated **per encryption**: the API server uses a _key derivation function_ to generate single use data encryption keys from a secret seed combined with some random data. -Whichever approach you configure, the DEK or seed is also rotated whenever the KEK is rotated +The seed is rotated whenever the KEK is rotated (see `Understanding key_id and Key Rotation` section below for more details). The KMS provider uses gRPC to communicate with a specific KMS plugin over a UNIX domain socket. The KMS plugin, which is implemented as a gRPC server and deployed on the same host(s) as the Kubernetes control plane, is responsible for all communication with the remote KMS. -{{< caution >}} - -If you are running virtual machine (VM) based nodes that leverage VM state store with this feature, -using KMS v2 is **insecure** and an information security risk unless you also explicitly enable -the `KMSv2KDF` -[feature gate](/docs/reference/command-line-tools-reference/feature-gates/). - -With KMS v2, the API server uses AES-GCM with a 12 byte nonce (8 byte atomic counter and 4 bytes random data) for encryption. -The following issues could occur if the VM is saved and restored: - -1. The counter value may be lost or corrupted if the VM is saved in an inconsistent state or restored improperly. - This can lead to a situation where the same counter value is used twice, resulting in the same nonce being used - for two different messages. -2. If the VM is restored to a previous state, the counter value may be set back to its previous value, -resulting in the same nonce being used again. - -Although both of these cases are partially mitigated by the 4 byte random nonce, this can compromise -the security of the encryption. 
- -If you have enabled the `KMSv2KDF` -[feature gate](/docs/reference/command-line-tools-reference/feature-gates/) _and_ are using KMS v2 -(not KMS v1), the API server generates single use data encryption keys from a secret seed. -This eliminates the need for a counter based nonce while avoiding nonce collision concerns. -It also removes any specific concerns with using KMS v2 and VM state store. - -{{< /caution >}} - ## Configuring the KMS provider To configure a KMS provider on the API server, include a provider of type `kms` in the @@ -197,9 +166,9 @@ Then use the functions and data structures in the stub file to develop the serve ##### KMS v2 {#developing-a-kms-plugin-gRPC-server-notes-kms-v2} -* KMS plugin version: `v2beta1` +* KMS plugin version: `v2` - In response to procedure call `Status`, a compatible KMS plugin should return `v2beta1` as `StatusResponse.version`, + In response to procedure call `Status`, a compatible KMS plugin should return `v2` as `StatusResponse.version`, "ok" as `StatusResponse.healthz` and a `key_id` (remote KMS KEK ID) as `StatusResponse.key_id`. The API server polls the `Status` procedure call approximately every minute when everything is healthy, @@ -258,20 +227,20 @@ Then use the functions and data structures in the stub file to develop the serve API server restart is required to perform KEK rotation. {{< caution >}} - Because you don't control the number of writes performed with the DEK, + Because you don't control the number of writes performed with the DEK, the Kubernetes project recommends rotating the KEK at least every 90 days. {{< /caution >}} * protocol: UNIX domain socket (`unix`) - The plugin is implemented as a gRPC server that listens at UNIX domain socket. - The plugin deployment should create a file on the file system to run the gRPC unix domain socket connection. - The API server (gRPC client) is configured with the KMS provider (gRPC server) unix - domain socket endpoint in order to communicate with it. 
- An abstract Linux socket may be used by starting the endpoint with `/@`, i.e. `unix:///@foo`. - Care must be taken when using this type of socket as they do not have concept of ACL - (unlike traditional file based sockets). - However, they are subject to Linux networking namespace, so will only be accessible to + The plugin is implemented as a gRPC server that listens at UNIX domain socket. + The plugin deployment should create a file on the file system to run the gRPC unix domain socket connection. + The API server (gRPC client) is configured with the KMS provider (gRPC server) unix + domain socket endpoint in order to communicate with it. + An abstract Linux socket may be used by starting the endpoint with `/@`, i.e. `unix:///@foo`. + Care must be taken when using this type of socket as they do not have concept of ACL + (unlike traditional file based sockets). + However, they are subject to Linux networking namespace, so will only be accessible to containers within the same pod unless host networking is used. ### Integrating a KMS plugin with the remote KMS From 5627db272093fab4f1fbb43ecf5bbcd6c81c20ca Mon Sep 17 00:00:00 2001 From: Nabarun Pal Date: Mon, 20 Nov 2023 08:58:49 +0530 Subject: [PATCH 46/82] add documentation for AuthorizationConfiguration Signed-off-by: Nabarun Pal --- .../access-authn-authz/authorization.md | 109 +++++++++++++++++- 1 file changed, 107 insertions(+), 2 deletions(-) diff --git a/content/en/docs/reference/access-authn-authz/authorization.md b/content/en/docs/reference/access-authn-authz/authorization.md index d81b4a9f55f..16cbaa4a583 100644 --- a/content/en/docs/reference/access-authn-authz/authorization.md +++ b/content/en/docs/reference/access-authn-authz/authorization.md @@ -211,7 +211,113 @@ so an earlier module has higher priority to allow or deny a request. 
## Configuring the API Server using a Authorization Config File - +{{< feature-state state="alpha" for_k8s_version="v1.29" >}} + +The Kubernetes API server's authorizer chain can be configured with a configuration file, passed through the `--authorization-config` flag. An example configuration with all possible values is provided below. To use the feature, the `StructuredAuthorizationConfiguration` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) has to be enabled. + +Note: When the feature is enabled, you cannot set `--authorization-config` and also configure an authorization webhook using the `--authorization-mode` and `--authorization-webhook-*` command line flags. If you do, the API server reports an error and exits right away. + +```yaml +# +# DO NOT USE THE CONFIG AS IS. THIS IS AN EXAMPLE. +# +apiVersion: apiserver.config.k8s.io/v1alpha1 +kind: AuthorizationConfiguration +# authorizers are defined in order of precedence +authorizers: + - type: Webhook + # Name used to describe the authorizer + # This is explicitly used in monitoring machinery for metrics + # Note: + # - Validation for this field is similar to how K8s labels are validated today. + # Required, with no default + name: webhook + webhook: + # The duration to cache 'authorized' responses from the webhook + # authorizer. + # Same as setting `--authorization-webhook-cache-authorized-ttl` flag + # Default: 5m0s + authorizedTTL: 30s + # The duration to cache 'unauthorized' responses from the webhook + # authorizer. + # Same as setting `--authorization-webhook-cache-unauthorized-ttl` flag + # Default: 30s + unauthorizedTTL: 30s + # Timeout for the webhook request + # Maximum allowed is 30s. + # Required, with no default. + timeout: 3s + # The API version of the authorization.k8s.io SubjectAccessReview to + # send to and expect from the webhook. 
+ # Same as setting `--authorization-webhook-version` flag + # Required, with no default + # Valid values: v1beta1, v1 + subjectAccessReviewVersion: v1 + # MatchConditionSubjectAccessReviewVersion specifies the SubjectAccessReview + # version the CEL expressions are evaluated against + # Valid values: v1 + # Required only if matchConditions are specified, no default value + matchConditionSubjectAccessReviewVersion: v1 + # Controls the authorization decision when a webhook request fails to + # complete or returns a malformed response or errors evaluating + # matchConditions. + # Valid values: + # - NoOpinion: continue to subsequent authorizers to see if one of + # them allows the request + # - Deny: reject the request without consulting subsequent authorizers + # Required, with no default. + failurePolicy: Deny + connectionInfo: + # Controls how the webhook should communicate with the server. + # Valid values: + # - KubeConfig: use the file specified in kubeConfigFile to locate the + # server. + # - InClusterConfig: use the in-cluster configuration to call the + # SubjectAccessReview API hosted by kube-apiserver. This mode is not + # allowed for kube-apiserver. + type: KubeConfig + # Path to KubeConfigFile for connection info + # Required, if connectionInfo.Type is KubeConfig + kubeConfigFile: /kube-system-authz-webhook.yaml + # matchConditions is a list of conditions that must be met for a request to be sent to this + # webhook. An empty list of matchConditions matches all requests. + # There are a maximum of 64 match conditions allowed. + # + # The exact matching logic is (in order): + # 1. If at least one matchCondition evaluates to FALSE, then the webhook is skipped. + # 2. If ALL matchConditions evaluate to TRUE, then the webhook is called. + # 3. 
If at least one matchCondition evaluates to an error (but none are FALSE): + # - If failurePolicy=Deny, then the webhook rejects the request + # - If failurePolicy=NoOpinion, then the error is ignored and the webhook is skipped + matchConditions: + # expression represents the expression which will be evaluated by CEL. Must evaluate to bool. + # CEL expressions have access to the contents of the SubjectAccessReview in v1 version. + # If version specified by subjectAccessReviewVersion in the request variable is v1beta1, + # the contents would be converted to the v1 version before evaluating the CEL expression. + # + # Documentation on CEL: https://kubernetes.io/docs/reference/using-api/cel/ + # + # only send resource requests to the webhook + - expression: has(request.resourceAttributes) + # only intercept requests to kube-system + - expression: request.resourceAttributes.namespace == 'kube-system' + # don't intercept requests from kube-system service accounts + - expression: "!('system:serviceaccounts:kube-system' in request.user.groups)" + - type: Node + name: node + - type: RBAC + name: rbac + - type: Webhook + name: in-cluster-authorizer + webhook: + authorizedTTL: 5m + unauthorizedTTL: 30s + timeout: 3s + subjectAccessReviewVersion: v1 + failurePolicy: NoOpinion + connectionInfo: + type: InClusterConfig +``` ## Privilege escalation via workload creation or edits {#privilege-escalation-via-pod-creation} @@ -245,4 +351,3 @@ This should be considered when deciding on your RBAC controls. * To learn more about Authentication, see **Authentication** in [Controlling Access to the Kubernetes API](/docs/concepts/security/controlling-access/). * To learn more about Admission Control, see [Using Admission Controllers](/docs/reference/access-authn-authz/admission-controllers/). 
- From 2ec25fbe7bfd01f47b2c9256a4eab5b6eaee8f32 Mon Sep 17 00:00:00 2001 From: Kensei Nakada Date: Mon, 20 Nov 2023 23:04:59 +0900 Subject: [PATCH 47/82] add: the doc for matchLabelKeys/mismatchLabelKeys in pod (anti)affinity (#43812) * add: the doc for matchLabelKeys/mismatchLabelKeys in pod (anti)affinity * fix based on reviews * add the explanation for labelSelector * address the review --- .../scheduling-eviction/assign-pod-node.md | 102 ++++++++++++++++++ .../feature-gates.md | 3 + 2 files changed, 105 insertions(+) diff --git a/content/en/docs/concepts/scheduling-eviction/assign-pod-node.md b/content/en/docs/concepts/scheduling-eviction/assign-pod-node.md index c33184b83e0..3e8a6a14073 100644 --- a/content/en/docs/concepts/scheduling-eviction/assign-pod-node.md +++ b/content/en/docs/concepts/scheduling-eviction/assign-pod-node.md @@ -358,6 +358,108 @@ The affinity term is applied to namespaces selected by both `namespaceSelector` Note that an empty `namespaceSelector` ({}) matches all namespaces, while a null or empty `namespaces` list and null `namespaceSelector` matches the namespace of the Pod where the rule is defined. +#### matchLabelKeys + +{{< feature-state for_k8s_version="v1.29" state="alpha" >}} + +{{< note >}} + +The `matchLabelKeys` field is an alpha-level field and is disabled by default in +Kubernetes {{< skew currentVersion >}}. +When you want to use it, you have to enable it via the +`MatchLabelKeysInPodAffinity` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/). +{{< /note >}} + +Kubernetes includes an optional `matchLabelKeys` field for Pod affinity +or anti-affinity. The field specifies keys for the labels that should match with the incoming Pod's labels, +when satisfying the Pod (anti)affinity. + +The keys are used to look up values from the pod labels; those key-value labels are combined +(using `AND`) with the match restrictions defined using the `labelSelector` field. 
The combined +filtering selects the set of existing pods that will be taken into Pod (anti)affinity calculation. + +A common use case is to use `matchLabelKeys` with `pod-template-hash` (set on Pods +managed as part of a Deployment, where the value is unique for each revision). +Using `pod-template-hash` in `matchLabelKeys` allows you to target the Pods that belong +to the same revision as the incoming Pod, so that a rolling upgrade won't break affinity. + +```yaml +apiVersion: apps/v1 +kind: Deployment +metadata: + name: application-server +... +spec: + template: + affinity: + podAffinity: + requiredDuringSchedulingIgnoredDuringExecution: + - labelSelector: + matchExpressions: + - key: app + operator: In + values: + - database + topologyKey: topology.kubernetes.io/zone + # Only Pods from a given rollout are taken into consideration when calculating pod affinity. + # If you update the Deployment, the replacement Pods follow their own affinity rules + # (if there are any defined in the new Pod template) + matchLabelKeys: + - pod-template-hash +``` + +#### mismatchLabelKeys + +{{< feature-state for_k8s_version="v1.29" state="alpha" >}} + +{{< note >}} + +The `mismatchLabelKeys` field is an alpha-level field and is disabled by default in +Kubernetes {{< skew currentVersion >}}. +When you want to use it, you have to enable it via the +`MatchLabelKeysInPodAffinity` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/). +{{< /note >}} + +Kubernetes includes an optional `mismatchLabelKeys` field for Pod affinity +or anti-affinity. The field specifies keys for the labels that should **not** match with the incoming Pod's labels, +when satisfying the Pod (anti)affinity. + +One example use case is to ensure that Pods are scheduled to a topology domain (node, zone, etc.) that only contains Pods from the same tenant or team. +In other words, you want to avoid running Pods from two different tenants on the same topology domain at the same time. 
+ +```yaml +apiVersion: v1 +kind: Pod +metadata: + labels: + # Assume that all relevant Pods have a "tenant" label set + tenant: tenant-a +... +spec: + affinity: + podAffinity: + requiredDuringSchedulingIgnoredDuringExecution: + # ensure that pods associated with this tenant land on the correct node pool + - matchLabelKeys: + - tenant + topologyKey: node-pool + podAntiAffinity: + requiredDuringSchedulingIgnoredDuringExecution: + # ensure that pods associated with this tenant can't schedule to nodes used for another tenant + - mismatchLabelKeys: + - tenant # whatever the value of the "tenant" label for this Pod, prevent + # scheduling to nodes in any pool where any Pod from a different + # tenant is running. + labelSelector: + # We have to have the labelSelector which selects only Pods with the tenant label, + # otherwise this Pod would hate Pods from daemonsets as well, for example, + # which aren't supposed to have the tenant label. + matchExpressions: + - key: tenant + operator: Exists + topologyKey: node-pool +``` + #### More practical use-cases Inter-pod affinity and anti-affinity can be even more useful when they are used with higher diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates.md b/content/en/docs/reference/command-line-tools-reference/feature-gates.md index 6f7de2ec1a5..47329a60710 100644 --- a/content/en/docs/reference/command-line-tools-reference/feature-gates.md +++ b/content/en/docs/reference/command-line-tools-reference/feature-gates.md @@ -136,6 +136,7 @@ For a reference to old feature gates that are removed, please refer to | `LogarithmicScaleDown` | `true` | Beta | 1.22 | | | `LoggingAlphaOptions` | `false` | Alpha | 1.24 | - | | `LoggingBetaOptions` | `true` | Beta | 1.24 | - | +| `MatchLabelKeysInPodAffinity` | `false` | Alpha | 1.29 | - | | `MatchLabelKeysInPodTopologySpread` | `false` | Alpha | 1.25 | 1.26 | | `MatchLabelKeysInPodTopologySpread` | `true` | Beta | 1.27 | - | | `MaxUnavailableStatefulSet` | 
`false` | Alpha | 1.24 | | @@ -627,6 +628,8 @@ Each feature gate is designed for enabling/disabling a specific feature: based on logarithmic bucketing of pod timestamps. - `LoggingAlphaOptions`: Allow fine-tuing of experimental, alpha-quality logging options. - `LoggingBetaOptions`: Allow fine-tuing of experimental, beta-quality logging options. +- `MatchLabelKeysInPodAffinity`: Enable the `matchLabelKeys` and `mismatchLabelKeys` field for + [pod (anti)affinity](/docs/concepts/scheduling-eviction/assign-pod-node/). - `MatchLabelKeysInPodTopologySpread`: Enable the `matchLabelKeys` field for [Pod topology spread constraints](/docs/concepts/scheduling-eviction/topology-spread-constraints/). - `MaxUnavailableStatefulSet`: Enables setting the `maxUnavailable` field for the From 8b9f3f84aa5844b7216285db998211795e878805 Mon Sep 17 00:00:00 2001 From: Anish Ramasekar Date: Fri, 17 Nov 2023 18:38:56 +0000 Subject: [PATCH 48/82] review feedback Signed-off-by: Anish Ramasekar --- .../feature-gates.md | 4 +- .../tasks/administer-cluster/encrypt-data.md | 2 +- .../tasks/administer-cluster/kms-provider.md | 45 +++++++++---------- 3 files changed, 23 insertions(+), 28 deletions(-) diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates.md b/content/en/docs/reference/command-line-tools-reference/feature-gates.md index 07f18f43f4c..7293394b731 100644 --- a/content/en/docs/reference/command-line-tools-reference/feature-gates.md +++ b/content/en/docs/reference/command-line-tools-reference/feature-gates.md @@ -271,12 +271,12 @@ For a reference to old feature gates that are removed, please refer to | `JobTrackingWithFinalizers` | `false` | Beta | 1.23 | 1.24 | | `JobTrackingWithFinalizers` | `true` | Beta | 1.25 | 1.25 | | `JobTrackingWithFinalizers` | `true` | GA | 1.26 | | -| `KMSv1` | `true` | Deprecated | 1.28 | 1.29 | +| `KMSv1` | `true` | Deprecated | 1.28 | 1.28 | | `KMSv1` | `false` | Deprecated | 1.29 | | | `KMSv2` | `false` | Alpha | 1.25 | 1.26 | | 
`KMSv2` | `true` | Beta | 1.27 | 1.28 | | `KMSv2` | `true` | GA | 1.29 | | -| `KMSv2KDF` | `false` | Beta | 1.28 | 1.29 | +| `KMSv2KDF` | `false` | Beta | 1.28 | 1.28 | | `KMSv2KDF` | `true` | GA | 1.29 | | | `KubeletPodResources` | `false` | Alpha | 1.13 | 1.14 | | `KubeletPodResources` | `true` | Beta | 1.15 | 1.27 | diff --git a/content/en/docs/tasks/administer-cluster/encrypt-data.md b/content/en/docs/tasks/administer-cluster/encrypt-data.md index 402cd0e0632..1ceed92ac3d 100644 --- a/content/en/docs/tasks/administer-cluster/encrypt-data.md +++ b/content/en/docs/tasks/administer-cluster/encrypt-data.md @@ -262,7 +262,7 @@ The following table describes each available provider. Kubernetes generates a new DEK per encryption from a secret seed. The seed is rotated whenever the KEK is rotated.
    A good choice if using a third party tool for key management. - Available in stable from Kubernetes v1.29. + Available as stable from Kubernetes v1.29.
    Read how to configure the KMS V2 provider. diff --git a/content/en/docs/tasks/administer-cluster/kms-provider.md b/content/en/docs/tasks/administer-cluster/kms-provider.md index d00ad97b190..6ed39227fb2 100644 --- a/content/en/docs/tasks/administer-cluster/kms-provider.md +++ b/content/en/docs/tasks/administer-cluster/kms-provider.md @@ -10,8 +10,16 @@ weight: 370 This page shows how to configure a Key Management Service (KMS) provider and plugin to enable secret data encryption. In Kubernetes {{< skew currentVersion >}} there are two versions of KMS at-rest encryption. You should use KMS v2 if feasible because KMS v1 is deprecated (since Kubernetes v1.28) and disabled by default (since Kubernetes v1.29). -However, you should also read and observe the **Caution** notices in this page that highlight specific -cases when you must not use KMS v2. KMS v2 offers significantly better performance characteristics than KMS v1. +KMS v2 offers significantly better performance characteristics than KMS v1. + +{{< caution >}} +This documentation is for the generally available implementation of KMS v2 (and for the +deprecated version 1 implementation). +If you are using any control plane components older than Kubernetes v1.29, please check +the equivalent page in the documentation for the version of Kubernetes that your cluster +is running. Earlier releases of Kubernetes had different behavior that may be relevant +for information security. +{{< /caution >}} ## {{% heading "prerequisites" %}} @@ -35,7 +43,7 @@ you have selected. Kubernetes recommends using KMS v2. * Kubernetes version 1.10.0 or later is required -* For version 1.29 and later, the feature is disabled by default. +* For version 1.29 and later, the v1 implementation of KMS is disabled by default. To enable the feature, set `--feature-gates=KMSv1=true` to configure a KMS v1 provider. * Your cluster must use etcd v3 or later @@ -43,36 +51,23 @@ you have selected. Kubernetes recommends using KMS v2. 
### KMS v2 {{< feature-state for_k8s_version="v1.29" state="stable" >}} -* For version 1.25 and 1.26, enabling the feature via kube-apiserver feature gate is required. -Set `--feature-gates=KMSv2=true` to configure a KMS v2 provider. - * Your cluster must use etcd v3 or later -{{< caution >}} -The KMS v2 API and implementation changed in incompatible ways in-between the alpha release in v1.25 -and the beta release in v1.27. Attempting to upgrade from old versions with the alpha feature -enabled will result in data loss. - ---- - -`KMSv2KDF` feature gate is enabled by default in v1.29 and cannot be disabled. -Running mixed API server versions with some servers at v1.28 _with the `KMSv2KDF` feature gate disabled_, -and others at v1.29 is **not supported** - and is likely to result in data loss. -{{< /caution >}} - +## KMS encryption and per-object encryption keys + The KMS encryption provider uses an envelope encryption scheme to encrypt data in etcd. The data is encrypted using a data encryption key (DEK). The DEKs are encrypted with a key encryption key (KEK) that is stored and managed in a remote KMS. -With KMS v1, a new DEK is generated for each encryption. +If you use the (deprecated) v1 implementation of KMS, a new DEK is generated for each encryption. With KMS v2, a new DEK is generated **per encryption**: the API server uses a _key derivation function_ to generate single use data encryption keys from a secret seed combined with some random data. The seed is rotated whenever the KEK is rotated -(see `Understanding key_id and Key Rotation` section below for more details). +(see the _Understanding key_id and Key Rotation_ section below for more details). The KMS provider uses gRPC to communicate with a specific KMS plugin over a UNIX domain socket. 
The KMS plugin, which is implemented as a gRPC server and deployed on the same host(s) @@ -168,8 +163,12 @@ Then use the functions and data structures in the stub file to develop the serve * KMS plugin version: `v2` - In response to procedure call `Status`, a compatible KMS plugin should return `v2` as `StatusResponse.version`, + In response to the `Status` remote procedure call, a compatible KMS plugin should return its KMS compatibility + version as `StatusResponse.version`. That status response should also include "ok" as `StatusResponse.healthz` and a `key_id` (remote KMS KEK ID) as `StatusResponse.key_id`. + The Kubernetes project recommends you make your plugin + compatible with the stable `v2` KMS API. Kubernetes {{< skew currentVersion >}} also supports the + `v2beta1` API for KMS; future Kubernetes releases are likely to continue supporting that beta version. The API server polls the `Status` procedure call approximately every minute when everything is healthy, and every 10 seconds when the plugin is not healthy. Plugins must take care to optimize this call as it will be @@ -332,10 +331,6 @@ The following table summarizes the health check endpoints for each KMS version: These healthcheck endpoint paths are hard coded and generated/controlled by the server. The indices for individual healthchecks corresponds to the order in which the KMS encryption config is processed. -At a high level, restarting an API server when a KMS plugin is unhealthy is unlikely to make the situation better. -It can make the situation significantly worse by throwing away the API server's DEK cache. Thus the general -recommendation is to ignore the API server KMS healthz checks for liveness purposes, i.e. `/livez?exclude=kms-providers`. - Until the steps defined in [Ensuring all secrets are encrypted](#ensuring-all-secrets-are-encrypted) are performed, the `providers` list should end with the `identity: {}` provider to allow unencrypted data to be read. 
Once all resources are encrypted, the `identity` provider should be removed to prevent the API server from honoring unencrypted data. For details about the `EncryptionConfiguration` format, please check the From a8d08be6317632a8cf7d76af632bbb8e3adb71b2 Mon Sep 17 00:00:00 2001 From: charles-chenzz Date: Tue, 21 Nov 2023 07:54:04 +0800 Subject: [PATCH 49/82] third round of comment address --- content/en/docs/concepts/workloads/pods/pod-lifecycle.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/en/docs/concepts/workloads/pods/pod-lifecycle.md b/content/en/docs/concepts/workloads/pods/pod-lifecycle.md index f7405b412e9..d2c7f962e79 100644 --- a/content/en/docs/concepts/workloads/pods/pod-lifecycle.md +++ b/content/en/docs/concepts/workloads/pods/pod-lifecycle.md @@ -256,7 +256,7 @@ runtime sandbox and configure networking for the Pod. If the `PodReadyToStartContainersCondition` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) is enabled (it is enabled by default for Kubernetes {{< skew currentVersion >}}), the -`PodReadyToStartContainers` condition in the `status.conditions` field of a Pod. +`PodReadyToStartContainers` condition will be added to the `status.conditions` field of a Pod. The `PodReadyToStartContainers` condition is set to `False` by the Kubelet when it detects a Pod does not have a runtime sandbox with networking configured. 
This occurs in From 6dd3091e5560f0fe6ae52c1543749e935fa7b967 Mon Sep 17 00:00:00 2001 From: Taahir Ahmed Date: Thu, 19 Oct 2023 20:21:47 -0700 Subject: [PATCH 50/82] ClusterTrustBundles: Document projected volumes --- .../concepts/storage/projected-volumes.md | 26 +++++++++++++++++ .../certificate-signing-requests.md | 8 +++++- .../feature-gates.md | 4 ++- .../storage/projected-clustertrustbundle.yaml | 28 +++++++++++++++++++ 4 files changed, 64 insertions(+), 2 deletions(-) create mode 100644 content/en/examples/pods/storage/projected-clustertrustbundle.yaml diff --git a/content/en/docs/concepts/storage/projected-volumes.md b/content/en/docs/concepts/storage/projected-volumes.md index ac64fa4d7da..8d59b802648 100644 --- a/content/en/docs/concepts/storage/projected-volumes.md +++ b/content/en/docs/concepts/storage/projected-volumes.md @@ -24,6 +24,7 @@ Currently, the following types of volume sources can be projected: * [`downwardAPI`](/docs/concepts/storage/volumes/#downwardapi) * [`configMap`](/docs/concepts/storage/volumes/#configmap) * [`serviceAccountToken`](#serviceaccounttoken) +* [`clusterTrustBundle`](#clustertrustbundle) All sources are required to be in the same namespace as the Pod. For more details, see the [all-in-one volume](https://git.k8s.io/design-proposals-archive/node/all-in-one-volume.md) design document. @@ -70,6 +71,31 @@ A container using a projected volume source as a [`subPath`](/docs/concepts/stor volume mount will not receive updates for those volume sources. {{< /note >}} +## clusterTrustBundle projected volumes {#clustertrustbundle} + +{{}} + +{{< note >}} +To use this feature in Kubernetes {{< skew currentVersion >}}, you must enable support for ClusterTrustBundle objects with the `ClusterTrustBundle` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) and `--runtime-config=certificates.k8s.io/v1alpha1/clustertrustbundles=true` kube-apiserver flag, then enable the `ClusterTrustBundleProjection` feature gate. 
+{{< /note >}} + +The `clusterTrustBundle` projected volume source injects the contents of one or more [ClusterTrustBundle](/docs/reference/access-authn-authz/certificate-signing-requests#cluster-trust-bundles) objects as an automatically-updating file in the container filesystem. + +ClusterTrustBundles can be selected either by [name](/docs/reference/access-authn-authz/certificate-signing-requests#ctb-signer-unlinked) or by [signer name](/docs/reference/access-authn-authz/certificate-signing-requests#ctb-signer-linked). + +To select by name, use the `name` field to designate a single ClusterTrustBundle object. + +To select by signer name, use the `signerName` field (and optionally the +`labelSelector` field) to designate a set of ClusterTrustBundle objects that use +the given signer name. If `labelSelector` is not present, then all +ClusterTrustBundles for that signer are selected. + +The kubelet deduplicates the certificates in the selected ClusterTrustBundle objects, normalizes the PEM representations (discarding comments and headers), reorders the certificates, and writes them into the file named by `path`. As the set of selected ClusterTrustBundles or their content changes, kubelet keeps the file up-to-date. + +By default, the kubelet will prevent the pod from starting if the named ClusterTrustBundle is not found, or if `signerName` / `labelSelector` do not match any ClusterTrustBundles. If this behavior is not what you want, then set the `optional` field to `true`, and the pod will start up with an empty file at `path`. + +{{% code_sample file="pods/storage/projected-clustertrustbundle.yaml" %}} + ## SecurityContext interactions The [proposal](https://git.k8s.io/enhancements/keps/sig-storage/2451-service-account-token-volumes#proposal) for file permission handling in projected service account volume enhancement introduced the projected files having the correct owner permissions set. 
diff --git a/content/en/docs/reference/access-authn-authz/certificate-signing-requests.md b/content/en/docs/reference/access-authn-authz/certificate-signing-requests.md index 9da1dd6c1af..ec13b0badef 100644 --- a/content/en/docs/reference/access-authn-authz/certificate-signing-requests.md +++ b/content/en/docs/reference/access-authn-authz/certificate-signing-requests.md @@ -371,7 +371,7 @@ you like. If you want to add a note for human consumption, use the {{< feature-state for_k8s_version="v1.27" state="alpha" >}} {{< note >}} -In Kubernetes {{< skew currentVersion >}}, you must enable the `ClusterTrustBundles` +In Kubernetes {{< skew currentVersion >}}, you must enable the `ClusterTrustBundle` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) _and_ the `certificates.k8s.io/v1alpha1` {{< glossary_tooltip text="API group" term_id="api-group" >}} in order to use @@ -472,6 +472,12 @@ such as role-based access control. To distinguish them from signer-linked ClusterTrustBundles, the names of signer-unlinked ClusterTrustBundles **must not** contain a colon (`:`). +### Accessing ClusterTrustBundles from pods {#ctb-projection} + +{{}} + +The contents of ClusterTrustBundles can be injected into the container filesystem, similar to ConfigMaps and Secrets. See the [clusterTrustBundle projected volume source](/docs/concepts/storage/projected-volumes#clustertrustbundle) for more details. 
+ ## How to issue a certificate for a user {#normal-user} diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates.md b/content/en/docs/reference/command-line-tools-reference/feature-gates.md index 178682c3bcc..06bcf144a73 100644 --- a/content/en/docs/reference/command-line-tools-reference/feature-gates.md +++ b/content/en/docs/reference/command-line-tools-reference/feature-gates.md @@ -82,6 +82,7 @@ For a reference to old feature gates that are removed, please refer to | `CloudDualStackNodeIPs` | `false` | Alpha | 1.27 | 1.28 | | `CloudDualStackNodeIPs` | `true` | Beta | 1.29 | | | `ClusterTrustBundle` | false | Alpha | 1.27 | | +| `ClusterTrustBundleProjection` | `false` | Alpha | 1.29 | | | `ComponentSLIs` | `false` | Alpha | 1.26 | 1.26 | | `ComponentSLIs` | `true` | Beta | 1.27 | | | `ConsistentListFromCache` | `false` | Alpha | 1.28 | | @@ -441,7 +442,8 @@ Each feature gate is designed for enabling/disabling a specific feature: - `CloudDualStackNodeIPs`: Enables dual-stack `kubelet --node-ip` with external cloud providers. See [Configure IPv4/IPv6 dual-stack](/docs/concepts/services-networking/dual-stack/#configure-ipv4-ipv6-dual-stack) for more details. -- `ClusterTrustBundle`: Enable ClusterTrustBundle objects and kubelet integration. +- `ClusterTrustBundle`: Enable ClusterTrustBundle objects. +- `ClusterTrustBundleProjection`: [`clusterTrustBundle` projected volume sources](/docs/concepts/storage/projected-volumes#clustertrustbundle). - `ComponentSLIs`: Enable the `/metrics/slis` endpoint on Kubernetes components like kubelet, kube-scheduler, kube-proxy, kube-controller-manager, cloud-controller-manager allowing you to scrape health check metrics. 
diff --git a/content/en/examples/pods/storage/projected-clustertrustbundle.yaml b/content/en/examples/pods/storage/projected-clustertrustbundle.yaml new file mode 100644 index 00000000000..452384a4451 --- /dev/null +++ b/content/en/examples/pods/storage/projected-clustertrustbundle.yaml @@ -0,0 +1,28 @@ +apiVersion: v1 +kind: Pod +metadata: + name: sa-ctb-name-test +spec: + containers: + - name: container-test + image: busybox + command: ["sleep", "3600"] + volumeMounts: + - name: root-certificates-vol + mountPath: "/root-certificates" + readOnly: true + serviceAccountName: default + volumes: + - name: root-certificates-vol + projected: + sources: + - clusterTrustBundle: + name: example + path: example-roots.pem + - clusterTrustBundle: + signerName: "example.com/mysigner" + labelSelector: + matchLabels: + version: live + path: mysigner-roots.pem + optional: true From c07ce392e4cabebd147b8fd22982c507c5d9ac0a Mon Sep 17 00:00:00 2001 From: Chris Henzie Date: Tue, 10 Oct 2023 07:55:28 -0700 Subject: [PATCH 51/82] Graduate ReadWriteOncePod to GA Included is a task for migrating existing PersistentVolumes to use ReadWriteOncePod, taken from the alpha blog post. --- .../concepts/storage/persistent-volumes.md | 26 ++- .../workloads/controllers/statefulset.md | 6 + .../feature-gates.md | 5 +- .../change-pv-access-mode-readwriteoncepod.md | 187 ++++++++++++++++++ .../configure-persistent-volume-storage.md | 6 + 5 files changed, 219 insertions(+), 11 deletions(-) create mode 100644 content/en/docs/tasks/administer-cluster/change-pv-access-mode-readwriteoncepod.md diff --git a/content/en/docs/concepts/storage/persistent-volumes.md b/content/en/docs/concepts/storage/persistent-volumes.md index 13b81f9c1f7..b6d18462c4a 100644 --- a/content/en/docs/concepts/storage/persistent-volumes.md +++ b/content/en/docs/concepts/storage/persistent-volumes.md @@ -39,8 +39,8 @@ NFS, iSCSI, or a cloud-provider-specific storage system. A _PersistentVolumeClaim_ (PVC) is a request for storage by a user. 
It is similar to a Pod. Pods consume node resources and PVCs consume PV resources. Pods can request specific levels of resources (CPU and Memory). Claims can request specific -size and access modes (e.g., they can be mounted ReadWriteOnce, ReadOnlyMany or -ReadWriteMany, see [AccessModes](#access-modes)). +size and access modes (e.g., they can be mounted ReadWriteOnce, ReadOnlyMany, +ReadWriteMany, or ReadWriteOncePod, see [AccessModes](#access-modes)). While PersistentVolumeClaims allow a user to consume abstract storage resources, it is common that users need PersistentVolumes with varying properties, such as @@ -626,7 +626,8 @@ The access modes are: `ReadWriteOnce` : the volume can be mounted as read-write by a single node. ReadWriteOnce access - mode still can allow multiple pods to access the volume when the pods are running on the same node. + mode still can allow multiple pods to access the volume when the pods are + running on the same node. For single pod access, please see ReadWriteOncePod. `ReadOnlyMany` : the volume can be mounted as read-only by many nodes. @@ -635,15 +636,22 @@ The access modes are: : the volume can be mounted as read-write by many nodes. `ReadWriteOncePod` -: {{< feature-state for_k8s_version="v1.27" state="beta" >}} +: {{< feature-state for_k8s_version="v1.29" state="stable" >}} the volume can be mounted as read-write by a single Pod. Use ReadWriteOncePod access mode if you want to ensure that only one pod across the whole cluster can - read that PVC or write to it. This is only supported for CSI volumes and - Kubernetes version 1.22+. + read that PVC or write to it. -The blog article -[Introducing Single Pod Access Mode for PersistentVolumes](/blog/2021/09/13/read-write-once-pod-access-mode-alpha/) -covers this in more detail. +{{< note >}} +The `ReadWriteOncePod` access mode is only supported for +{{< glossary_tooltip text="CSI" term_id="csi" >}} volumes and Kubernetes version +1.22+. 
To use this feature you will need to update the following +[CSI sidecars](https://kubernetes-csi.github.io/docs/sidecar-containers.html) +to these versions or greater: + +* [csi-provisioner:v3.0.0+](https://github.com/kubernetes-csi/external-provisioner/releases/tag/v3.0.0) +* [csi-attacher:v3.3.0+](https://github.com/kubernetes-csi/external-attacher/releases/tag/v3.3.0) +* [csi-resizer:v1.3.0+](https://github.com/kubernetes-csi/external-resizer/releases/tag/v1.3.0) +{{< /note >}} In the CLI, the access modes are abbreviated to: diff --git a/content/en/docs/concepts/workloads/controllers/statefulset.md b/content/en/docs/concepts/workloads/controllers/statefulset.md index 927b2e53f3e..b0176108cea 100644 --- a/content/en/docs/concepts/workloads/controllers/statefulset.md +++ b/content/en/docs/concepts/workloads/controllers/statefulset.md @@ -116,6 +116,12 @@ spec: storage: 1Gi ``` +{{< note >}} +This example uses the `ReadWriteOnce` access mode, for simplicity. For +production use, the Kubernetes project recommends using the `ReadWriteOncePod` +access mode instead. +{{< /note >}} + In the above example: * A Headless Service, named `nginx`, is used to control the network domain. 
diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates.md b/content/en/docs/reference/command-line-tools-reference/feature-gates.md index f5ff4c9a857..a897300b063 100644 --- a/content/en/docs/reference/command-line-tools-reference/feature-gates.md +++ b/content/en/docs/reference/command-line-tools-reference/feature-gates.md @@ -170,8 +170,6 @@ For a reference to old feature gates that are removed, please refer to | `PodSchedulingReadiness` | `true` | Beta | 1.27 | | | `ProcMountType` | `false` | Alpha | 1.12 | | | `QOSReserved` | `false` | Alpha | 1.11 | | -| `ReadWriteOncePod` | `false` | Alpha | 1.22 | 1.26 | -| `ReadWriteOncePod` | `true` | Beta | 1.27 | | | `RecoverVolumeExpansionFailure` | `false` | Alpha | 1.23 | | | `RemainingItemCount` | `false` | Alpha | 1.15 | 1.15 | | `RemainingItemCount` | `true` | Beta | 1.16 | | @@ -311,6 +309,9 @@ For a reference to old feature gates that are removed, please refer to | `ProxyTerminatingEndpoints` | `false` | Alpha | 1.22 | 1.25 | | `ProxyTerminatingEndpoints` | `true` | Beta | 1.26 | 1.27 | | `ProxyTerminatingEndpoints` | `true` | GA | 1.28 | | +| `ReadWriteOncePod` | `false` | Alpha | 1.22 | 1.26 | +| `ReadWriteOncePod` | `true` | Beta | 1.27 | 1.28 | +| `ReadWriteOncePod` | `true` | GA | 1.29 | | | `RemoveSelfLink` | `false` | Alpha | 1.16 | 1.19 | | `RemoveSelfLink` | `true` | Beta | 1.20 | 1.23 | | `RemoveSelfLink` | `true` | GA | 1.24 | | diff --git a/content/en/docs/tasks/administer-cluster/change-pv-access-mode-readwriteoncepod.md b/content/en/docs/tasks/administer-cluster/change-pv-access-mode-readwriteoncepod.md new file mode 100644 index 00000000000..a2dffdc702c --- /dev/null +++ b/content/en/docs/tasks/administer-cluster/change-pv-access-mode-readwriteoncepod.md @@ -0,0 +1,187 @@ +--- +title: Change the Access Mode of a PersistentVolume to ReadWriteOncePod +content_type: task +weight: 90 +min-kubernetes-server-version: v1.22 +--- + + + +This page shows how to change the access 
mode on an existing PersistentVolume to +use `ReadWriteOncePod`. + +## {{% heading "prerequisites" %}} + +{{< include "task-tutorial-prereqs.md" >}} {{< version-check >}} + +{{< note >}} +The `ReadWriteOncePod` access mode graduated to stable in the Kubernetes v1.29 +release. If you are running a version of Kubernetes older than v1.29, you might +need to enable a feature gate. Check the documentation for your version of +Kubernetes. +{{< /note >}} + +{{< note >}} +The `ReadWriteOncePod` access mode is only supported for +{{< glossary_tooltip text="CSI" term_id="csi" >}} volumes. +To use this volume access mode you will need to update the following +[CSI sidecars](https://kubernetes-csi.github.io/docs/sidecar-containers.html) +to these versions or greater: + +* [csi-provisioner:v3.0.0+](https://github.com/kubernetes-csi/external-provisioner/releases/tag/v3.0.0) +* [csi-attacher:v3.3.0+](https://github.com/kubernetes-csi/external-attacher/releases/tag/v3.3.0) +* [csi-resizer:v1.3.0+](https://github.com/kubernetes-csi/external-resizer/releases/tag/v1.3.0) +{{< /note >}} + +## Why should I use `ReadWriteOncePod`? + +Prior to Kubernetes v1.22, the `ReadWriteOnce` access mode was commonly used to +restrict PersistentVolume access for workloads that required single-writer +access to storage. However, this access mode had a limitation: it restricted +volume access to a single *node*, allowing multiple pods on the same node to +read from and write to the same volume simultaneously. This could pose a risk +for applications that demand strict single-writer access for data safety. + +If ensuring single-writer access is critical for your workloads, consider +migrating your volumes to `ReadWriteOncePod`. + + + +## Migrating existing PersistentVolumes + +If you have existing PersistentVolumes, they can be migrated to use +`ReadWriteOncePod`. Only migrations from `ReadWriteOnce` to `ReadWriteOncePod` +are supported. 
+ +In this example, there is already a `ReadWriteOnce` "cat-pictures-pvc" +PersistentVolumeClaim that is bound to a "cat-pictures-pv" PersistentVolume, +and a "cat-pictures-writer" Deployment that uses this PersistentVolumeClaim. + +{{< note >}} +If your storage plugin supports +[Dynamic provisioning](/docs/concepts/storage/dynamic-provisioning/), +the "cat-pictures-pv" will be created for you, but its name may differ. To get +your PersistentVolume's name, run: + +```shell +kubectl get pvc cat-pictures-pvc -o jsonpath='{.spec.volumeName}' +``` +{{< /note >}} + +You can view the PVC before you make changes. Either view the manifest +locally, or run `kubectl get pvc cat-pictures-pvc -o yaml`. The output is similar +to: + +```yaml +# cat-pictures-pvc.yaml +kind: PersistentVolumeClaim +apiVersion: v1 +metadata: + name: cat-pictures-pvc +spec: + accessModes: + - ReadWriteOnce + resources: + requests: + storage: 1Gi +``` + +Here's an example Deployment that relies on that PersistentVolumeClaim: + +```yaml +# cat-pictures-writer-deployment.yaml +apiVersion: apps/v1 +kind: Deployment +metadata: + name: cat-pictures-writer +spec: + replicas: 3 + selector: + matchLabels: + app: cat-pictures-writer + template: + metadata: + labels: + app: cat-pictures-writer + spec: + containers: + - name: nginx + image: nginx:1.14.2 + ports: + - containerPort: 80 + volumeMounts: + - name: cat-pictures + mountPath: /mnt + volumes: + - name: cat-pictures + persistentVolumeClaim: + claimName: cat-pictures-pvc + readOnly: false +``` + +As a first step, you need to edit your PersistentVolume's +`spec.persistentVolumeReclaimPolicy` and set it to `Retain`.
This ensures your +PersistentVolume will not be deleted when you delete the corresponding +PersistentVolumeClaim: + +```shell +kubectl patch pv cat-pictures-pv -p '{"spec":{"persistentVolumeReclaimPolicy":"Retain"}}' +``` + +Next you need to stop any workloads that are using the PersistentVolumeClaim +bound to the PersistentVolume you want to migrate, and then delete the +PersistentVolumeClaim. Avoid making any other changes to the +PersistentVolumeClaim, such as volume resizes, until after the migration is +complete. + +Once that is done, you need to clear your PersistentVolume's `spec.claimRef.uid` +to ensure PersistentVolumeClaims can bind to it upon recreation: + +```shell +kubectl scale --replicas=0 deployment cat-pictures-writer +kubectl delete pvc cat-pictures-pvc +kubectl patch pv cat-pictures-pv -p '{"spec":{"claimRef":{"uid":""}}}' +``` + +After that, replace the PersistentVolume's list of valid access modes to be +(only) `ReadWriteOncePod`: + +```shell +kubectl patch pv cat-pictures-pv -p '{"spec":{"accessModes":["ReadWriteOncePod"]}}' +``` + +{{< note >}} +The `ReadWriteOncePod` access mode cannot be combined with other access modes. +Make sure `ReadWriteOncePod` is the only access mode on the PersistentVolume +when updating, otherwise the request will fail. +{{< /note >}} + +Next you need to modify your PersistentVolumeClaim to set `ReadWriteOncePod` as +the only access mode. You should also set the PersistentVolumeClaim's +`spec.volumeName` to the name of your PersistentVolume to ensure it binds to +this specific PersistentVolume. + +Once this is done, you can recreate your PersistentVolumeClaim and start up your +workloads: + +```shell +# IMPORTANT: Make sure to edit your PVC in cat-pictures-pvc.yaml before applying. 
You need to: +# - Set ReadWriteOncePod as the only access mode +# - Set spec.volumeName to "cat-pictures-pv" + +kubectl apply -f cat-pictures-pvc.yaml +kubectl apply -f cat-pictures-writer-deployment.yaml +``` + +Lastly, you may edit your PersistentVolume's `spec.persistentVolumeReclaimPolicy` +and set it back to `Delete` if you previously changed it. + +```shell +kubectl patch pv cat-pictures-pv -p '{"spec":{"persistentVolumeReclaimPolicy":"Delete"}}' +``` + +## {{% heading "whatsnext" %}} + +* Learn more about [PersistentVolumes](/docs/concepts/storage/persistent-volumes/). +* Learn more about [PersistentVolumeClaims](/docs/concepts/storage/persistent-volumes/#persistentvolumeclaims). +* Learn more about [Configuring a Pod to Use a PersistentVolume for Storage](/docs/tasks/configure-pod-container/configure-persistent-volume-storage/). diff --git a/content/en/docs/tasks/configure-pod-container/configure-persistent-volume-storage.md b/content/en/docs/tasks/configure-pod-container/configure-persistent-volume-storage.md index dc1a01abb00..0d6c9859dc9 100644 --- a/content/en/docs/tasks/configure-pod-container/configure-persistent-volume-storage.md +++ b/content/en/docs/tasks/configure-pod-container/configure-persistent-volume-storage.md @@ -98,6 +98,12 @@ read-write by a single Node. It defines the [StorageClass name](/docs/concepts/s `manual` for the PersistentVolume, which will be used to bind PersistentVolumeClaim requests to this PersistentVolume. +{{< note >}} +This example uses the `ReadWriteOnce` access mode, for simplicity. For +production use, the Kubernetes project recommends using the `ReadWriteOncePod` +access mode instead.
+{{< /note >}} + Create the PersistentVolume: ```shell From 01e6f317e3e0b0434ceeae0d75c487be7e8408e7 Mon Sep 17 00:00:00 2001 From: Anish Ramasekar Date: Tue, 10 Oct 2023 00:07:17 +0000 Subject: [PATCH 52/82] add docs for StructuredAuthenticationConfig v1alpha1 Signed-off-by: Anish Ramasekar --- .../access-authn-authz/authentication.md | 123 ++++++++++++++++-- 1 file changed, 113 insertions(+), 10 deletions(-) diff --git a/content/en/docs/reference/access-authn-authz/authentication.md b/content/en/docs/reference/access-authn-authz/authentication.md index 7bc2a3d560f..c80434eca98 100644 --- a/content/en/docs/reference/access-authn-authz/authentication.md +++ b/content/en/docs/reference/access-authn-authz/authentication.md @@ -242,7 +242,7 @@ and are assigned to the groups `system:serviceaccounts` and `system:serviceaccou {{< warning >}} Because service account tokens can also be stored in Secret API objects, any user with -write access to Secrets can request a token, and any user with read access to those +write access to Secrets can request a token, and any user with read access to those Secrets can authenticate as the service account. Be cautious when granting permissions to service accounts and read or write capabilities for Secrets. {{< /warning >}} @@ -293,8 +293,9 @@ sequenceDiagram 1. Your identity provider will provide you with an `access_token`, `id_token` and a `refresh_token` 1. When using `kubectl`, use your `id_token` with the `--token` flag or add it directly to your `kubeconfig` 1. `kubectl` sends your `id_token` in a header called Authorization to the API server -1. The API server will make sure the JWT signature is valid by checking against the certificate named in the configuration +1. The API server will make sure the JWT signature is valid 1. Check to make sure the `id_token` hasn't expired + 1. Perform claim and/or user validation if CEL expressions are configured with `AuthenticationConfiguration`. 1. Make sure the user is authorized 1. 
Once authorized the API server returns a response to `kubectl` 1. `kubectl` provides feedback to the user @@ -312,6 +313,8 @@ very scalable solution for authentication. It does offer a few challenges: #### Configuring the API Server +##### Using flags + To enable the plugin, configure the following flags on the API server: | Parameter | Description | Example | Required | @@ -326,6 +329,106 @@ To enable the plugin, configure the following flags on the API server: | `--oidc-ca-file` | The path to the certificate for the CA that signed your identity provider's web certificate. Defaults to the host's root CAs. | `/etc/kubernetes/ssl/kc-ca.pem` | No | | `--oidc-signing-algs` | The signing algorithms accepted. Default is "RS256". | `RS512` | No | +##### Using Authentication Configuration + +{{< feature-state for_k8s_version="v1.29" state="alpha" >}} + +The API server can be configured to use a JWT authenticator via the `--authentication-config` flag. This flag takes a path to a file containing the `AuthenticationConfiguration`. An example configuration is provided below. +To use this config, the `StructuredAuthenticationConfiguration` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) +has to be enabled. + +Note: When the feature is enabled, setting both `--authentication-config` and any of the `--oidc-*` flags will result in an error. If you want to use the feature, you have to remove the `--oidc-*` flags and use the configuration file instead. + +```yaml +--- +# +# CAUTION: this is an example configuration. +# Do not use this for your own cluster! +# +apiVersion: apiserver.config.k8s.io/v1alpha1 +kind: AuthenticationConfiguration +# list of authenticators to authenticate Kubernetes users using JWT compliant tokens. +jwt: +- issuer: + url: https://example.com # Same as --oidc-issuer-url. + audiences: + - my-app # Same as --oidc-client-id. + # rules applied to validate token claims to authenticate users. 
+ claimValidationRules: + # Same as --oidc-required-claim key=value. + - claim: hd + requiredValue: example.com + # Instead of claim and requiredValue, you can use expression to validate the claim. + # expression is a CEL expression that evaluates to a boolean. + - expression: 'claims.hd == "example.com"' + # Message customizes the error message seen in the API server logs when the validation fails. + message: the hd claim must be set to example.com + - expression: 'claims.exp - claims.nbf <= 86400' + message: total token lifetime must not exceed 24 hours + claimMappings: + # username represents an option for the username attribute. + # This is the only required attribute. + username: + # Same as --oidc-username-claim. Mutually exclusive with username.expression. + claim: "sub" + # Same as --oidc-username-prefix. Mutually exclusive with username.expression. + # if username.claim is set, username.prefix is required. + # Explicitly set it to "" if no prefix is desired. + prefix: "" + # Mutually exclusive with username.claim and username.prefix. + # expression is a CEL expression that evaluates to a string. + expression: 'claims.username + ":external-user"' + # groups represents an option for the groups attribute. + groups: + # Same as --oidc-groups-claim. Mutually exclusive with groups.expression. + claim: "sub" + # Same as --oidc-groups-prefix. Mutually exclusive with groups.expression. + # if groups.claim is set, groups.prefix is required. + # Explicitly set it to "" if no prefix is desired. + prefix: "" + # Mutually exclusive with groups.claim and groups.prefix. + # expression is a CEL expression that evaluates to a string or a list of strings. + expression: 'claims.roles.split(",")' + # uid represents an option for the uid attribute. + uid: + # Mutually exclusive with uid.expression. + claim: 'sub' + # Mutually exclusive with uid.claim + # expression is a CEL expression that evaluates to a string. 
+ expression: 'claims.sub' + # extra attributes to be added to the UserInfo object. Keys must be domain-prefixed paths and must be unique. + extra: + - key: 'example.com/tenant' + # valueExpression is a CEL expression that evaluates to a string or a list of strings. + valueExpression: 'claims.tenant' + # validation rules applied to the final user object. + userValidationRules: + # expression is a CEL expression that evaluates to a boolean. + - expression: "!user.username.startsWith('system:')" + # Message customizes the error message seen in the API server logs when the validation fails. + message: 'username cannot use reserved system: prefix' + - expression: "user.groups.all(group, !group.startsWith('system:'))" + message: 'groups cannot use reserved system: prefix' +``` + +* Claim validation rule expression + + `jwt.claimValidationRules[i].expression` represents the expression which will be evaluated by CEL. + CEL expressions have access to the contents of the token payload, organized into the `claims` CEL variable. + `claims` is a map of claim names (as strings) to claim values (of any type). +* User validation rule expression + + `jwt.userValidationRules[i].expression` represents the expression which will be evaluated by CEL. + CEL expressions have access to the contents of `userInfo`, organized into the `user` CEL variable. +* Claim mapping expression + + `jwt.claimMappings.username.expression`, `jwt.claimMappings.groups.expression`, `jwt.claimMappings.uid.expression`, and + `jwt.claimMappings.extra[i].valueExpression` represent the expressions which will be evaluated by CEL. + CEL expressions have access to the contents of the token payload, organized into the `claims` CEL variable. + `claims` is a map of claim names (as strings) to claim values (of any type). + + To learn more, see the [Documentation on CEL](/docs/reference/using-api/cel/). + Importantly, the API server is not an OAuth2 client, rather it can only be configured to trust a single issuer.
This allows the use of public providers, such as Google, without trusting credentials issued to third parties. Admins who @@ -432,7 +535,7 @@ Webhook authentication is a hook for verifying bearer tokens. * `--authentication-token-webhook-config-file` a configuration file describing how to access the remote webhook service. * `--authentication-token-webhook-cache-ttl` how long to cache authentication decisions. Defaults to two minutes. -* `--authentication-token-webhook-version` determines whether to use `authentication.k8s.io/v1beta1` or `authentication.k8s.io/v1` +* `--authentication-token-webhook-version` determines whether to use `authentication.k8s.io/v1beta1` or `authentication.k8s.io/v1` `TokenReview` objects to send/receive information from the webhook. Defaults to `v1beta1`. The configuration file uses the [kubeconfig](/docs/concepts/configuration/organize-cluster-access-kubeconfig/) @@ -489,9 +592,9 @@ To opt into receiving `authentication.k8s.io/v1` token reviews, the API server m "spec": { # Opaque bearer token sent to the API server "token": "014fbff9a07c...", - + # Optional list of the audience identifiers for the server the token was presented to. - # Audience-aware token authenticators (for example, OIDC token authenticators) + # Audience-aware token authenticators (for example, OIDC token authenticators) # should verify the token was intended for at least one of the audiences in this list, # and return the intersection of this list and the valid audiences for the token in the response status. # This ensures the token is valid to authenticate to the server it was presented to. @@ -509,9 +612,9 @@ To opt into receiving `authentication.k8s.io/v1` token reviews, the API server m "spec": { # Opaque bearer token sent to the API server "token": "014fbff9a07c...", - + # Optional list of the audience identifiers for the server the token was presented to. 
- # Audience-aware token authenticators (for example, OIDC token authenticators) + # Audience-aware token authenticators (for example, OIDC token authenticators) # should verify the token was intended for at least one of the audiences in this list, # and return the intersection of this list and the valid audiences for the token in the response status. # This ensures the token is valid to authenticate to the server it was presented to. @@ -870,7 +973,7 @@ rules: {{< note >}} Impersonating a user or group allows you to perform any action as if you were that user or group; for that reason, impersonation is not namespace scoped. -If you want to allow impersonation using Kubernetes RBAC, +If you want to allow impersonation using Kubernetes RBAC, this requires using a `ClusterRole` and a `ClusterRoleBinding`, not a `Role` and `RoleBinding`. {{< /note >}} @@ -1374,7 +1477,7 @@ status: {{% /tab %}} {{< /tabs >}} -This feature is extremely useful when a complicated authentication flow is used in a Kubernetes cluster, +This feature is extremely useful when a complicated authentication flow is used in a Kubernetes cluster, for example, if you use [webhook token authentication](/docs/reference/access-authn-authz/authentication/#webhook-token-authentication) or [authenticating proxy](/docs/reference/access-authn-authz/authentication/#authenticating-proxy). @@ -1386,7 +1489,7 @@ you see the user details and properties for the user that was impersonated. {{< /note >}} By default, all authenticated users can create `SelfSubjectReview` objects when the `APISelfSubjectReview` -feature is enabled. It is allowed by the `system:basic-user` cluster role. +feature is enabled. It is allowed by the `system:basic-user` cluster role. 
{{< note >}} You can only make `SelfSubjectReview` requests if: From 394db549ac5ca90164ca26ef3cc1d672c6a63fcd Mon Sep 17 00:00:00 2001 From: Andrea Tosatto Date: Mon, 10 Jul 2023 15:50:15 +0100 Subject: [PATCH 53/82] Decouple TaintManager from NodeLifeCycleController (KEP-3902) --- .../command-line-tools-reference/feature-gates.md | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-) diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates.md b/content/en/docs/reference/command-line-tools-reference/feature-gates.md index 72a260878da..32480fa5a7f 100644 --- a/content/en/docs/reference/command-line-tools-reference/feature-gates.md +++ b/content/en/docs/reference/command-line-tools-reference/feature-gates.md @@ -184,6 +184,7 @@ For a reference to old feature gates that are removed, please refer to | `SELinuxMountReadWriteOncePod` | `true` | Beta | 1.28 | | | `SchedulerQueueingHints` | `true` | Beta | 1.28 | | | `SecurityContextDeny` | `false` | Alpha | 1.27 | | +| `SeparateTaintEvictionController` | `true` | Beta | 1.29 | | | `ServiceAccountTokenJTI` | `false` | Alpha | 1.29 | | | `ServiceAccountTokenNodeBinding` | `false` | Alpha | 1.29 | | | `ServiceAccountTokenNodeBindingValidation` | `false` | Alpha | 1.29 | | @@ -726,6 +727,11 @@ Each feature gate is designed for enabling/disabling a specific feature: for all workloads. The seccomp profile is specified in the `securityContext` of a Pod and/or a Container. - `SecurityContextDeny`: This gate signals that the `SecurityContextDeny` admission controller is deprecated. +- `SeparateTaintEvictionController`: Enables running `TaintEvictionController`, + that performs [Taint-based Evictions](/docs/concepts/scheduling-eviction/taint-and-toleration/#taint-based-evictions), + in a controller separated from `NodeLifecycleController`. 
When this feature is + enabled, users can optionally disable Taint-based Eviction by setting the + `--controllers=-taint-eviction-controller` flag on the `kube-controller-manager`. - `ServerSideApply`: Enables the [Sever Side Apply (SSA)](/docs/reference/using-api/server-side-apply/) feature on the API Server. - `ServerSideFieldValidation`: Enables server-side field validation. This means the validation @@ -794,4 +800,4 @@ Each feature gate is designed for enabling/disabling a specific feature: feature, you will also need to enable any associated API resources. For example, to enable a particular resource like `storage.k8s.io/v1beta1/csistoragecapacities`, set `--runtime-config=storage.k8s.io/v1beta1/csistoragecapacities`. - See [API Versioning](/docs/reference/using-api/#api-versioning) for more details on the command line flags. + See [API Versioning](/docs/reference/using-api/#api-versioning) for more details on the command line flags. \ No newline at end of file From fdf935bf57c90f92b867ca68bc5569a769937560 Mon Sep 17 00:00:00 2001 From: Shiming Zhang Date: Wed, 1 Nov 2023 09:53:49 +0800 Subject: [PATCH 54/82] Docs update for Beta PodHostIPs --- .../reference/command-line-tools-reference/feature-gates.md | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates.md b/content/en/docs/reference/command-line-tools-reference/feature-gates.md index 1d88ec7b0f3..0fa1541be5f 100644 --- a/content/en/docs/reference/command-line-tools-reference/feature-gates.md +++ b/content/en/docs/reference/command-line-tools-reference/feature-gates.md @@ -167,7 +167,8 @@ For a reference to old feature gates that are removed, please refer to | `PodDeletionCost` | `true` | Beta | 1.22 | | | `PodDisruptionConditions` | `false` | Alpha | 1.25 | 1.25 | | `PodDisruptionConditions` | `true` | Beta | 1.26 | | -| `PodHostIPs` | `false` | Alpha | 1.28 | | +| `PodHostIPs` | `false` | Alpha | 1.28 | 1.28 | +| `PodHostIPs`
| `true` | Beta | 1.29 | | | `PodIndexLabel` | `true` | Beta | 1.28 | | | `PodLifecycleSleepAction` | `false` | Alpha | 1.29 | | | `PodReadyToStartContainersCondition` | `false` | Alpha | 1.28 | | From 10568634b5faf70fb907f3077cfacb772ba20ed1 Mon Sep 17 00:00:00 2001 From: Nabarun Pal Date: Wed, 22 Nov 2023 10:22:33 +0530 Subject: [PATCH 55/82] Update from code review Signed-off-by: Nabarun Pal --- .../reference/access-authn-authz/authorization.md | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/content/en/docs/reference/access-authn-authz/authorization.md b/content/en/docs/reference/access-authn-authz/authorization.md index 16cbaa4a583..83a7c818339 100644 --- a/content/en/docs/reference/access-authn-authz/authorization.md +++ b/content/en/docs/reference/access-authn-authz/authorization.md @@ -209,14 +209,20 @@ The following flags can be used: You can choose more than one authorization module. Modules are checked in order so an earlier module has higher priority to allow or deny a request. -## Configuring the API Server using a Authorization Config File +## Configuring the API Server using an Authorization Config File {{< feature-state state="alpha" for_k8s_version="v1.29" >}} -Kubernetes API Server authorizer chain can be configured using a config file by passing it through the `--authorization-config` flag. An example configuration with all possible values is provided below. In order to use the feature, the `StructuredAuthorizationConfiguration` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) has to be enabled. +Kubernetes API Server authorizer chain can be configured using a config file by passing it through the `--authorization-config` flag. This feature enables creation of authorization chains with multiple webhooks with well-defined parameters that validate requests in a certain order and enables fine grained control like explicit Deny on failures. 
An example configuration with all possible values is provided below. In order to use the feature, the `StructuredAuthorizationConfiguration` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) has to be enabled. Note: When the feature is enabled, setting both `--authorization-config` and configuring an authorization webhook using the `--authorization-mode` and `--authorization-webhook-*` command line flags is not allowed. If done, there will be an error and API Server would exit right away. +{{< caution >}} +While the feature is in Alpha/Beta, there is no change if you want to keep on using command line flags. When the feature goes Beta, the feature flag would be turned on by default. The feature flag would be removed when feature goes GA. + +When configuring the authorizer chain using a config file, make sure all the apiserver nodes have the file. Also, take a note of the apiserver configuration when upgrading/downgrading the clusters. For example, if upgrading to v1.29+ clusters and using the config file, you would need to add to the flags back to the cluster bootrap mechanism. +{{< /caution >}} + ```yaml # # DO NOT USE THE CONFIG AS IS. THIS IS AN EXAMPLE. @@ -238,7 +244,7 @@ authorizers: # Same as setting `--authorization-webhook-cache-authorized-ttl` flag # Default: 5m0s authorizedTTL: 30s - # The duration to cache 'authorized' responses from the webhook + # The duration to cache 'unauthorized' responses from the webhook # authorizer. 
# Same as setting `--authorization-webhook-cache-unauthorized-ttl` flag # Default: 30s From 1c3945fa7ee241eebd36ea35e4e6e0d68243126e Mon Sep 17 00:00:00 2001 From: Abu Kashem Date: Wed, 22 Nov 2023 14:49:27 -0500 Subject: [PATCH 56/82] apiserver: update APF documentation for GA --- .../cluster-administration/flow-control.md | 39 ++++++++----------- .../health-for-strangers.yaml | 2 +- .../list-events-default-service-account.yaml | 2 +- 3 files changed, 19 insertions(+), 24 deletions(-) diff --git a/content/en/docs/concepts/cluster-administration/flow-control.md b/content/en/docs/concepts/cluster-administration/flow-control.md index a00c146d514..859e841d0b7 100644 --- a/content/en/docs/concepts/cluster-administration/flow-control.md +++ b/content/en/docs/concepts/cluster-administration/flow-control.md @@ -7,7 +7,7 @@ weight: 110 -{{< feature-state state="beta" for_k8s_version="v1.20" >}} +{{< feature-state state="stable" for_k8s_version="v1.29" >}} Controlling the behavior of the Kubernetes API server in an overload situation is a key task for cluster administrators. The {{< glossary_tooltip @@ -45,30 +45,27 @@ are not subject to the `--max-requests-inflight` limit. ## Enabling/Disabling API Priority and Fairness -The API Priority and Fairness feature is controlled by a feature gate -and is enabled by default. See [Feature -Gates](/docs/reference/command-line-tools-reference/feature-gates/) -for a general explanation of feature gates and how to enable and -disable them. The name of the feature gate for APF is -"APIPriorityAndFairness". This feature also involves an {{< -glossary_tooltip term_id="api-group" text="API Group" >}} with: (a) a -`v1alpha1` version and a `v1beta1` version, disabled by default, and -(b) `v1beta2` and `v1beta3` versions, enabled by default. You can -disable the feature gate and API group beta versions by adding the +The API Priority and Fairness feature is controlled by a command-line flag +and is enabled by default. 
See +[Options](/docs/reference/command-line-tools-reference/kube-apiserver/options/) +for a general explanation of the available kube-apiserver command-line +options and how to enable and disable them. The name of the +command-line option for APF is "--enable-priority-and-fairness". This feature +also involves an {{< glossary_tooltip term_id="api-group" text="API Group" >}} +with: (a) a stable `v1` version, introduced in v1.29 and +enabled by default, and (b) a `v1beta3` version, enabled by default and +deprecated in v1.29. You can +disable the API group beta version `v1beta3` by adding the following command-line flags to your `kube-apiserver` invocation: ```shell kube-apiserver \ ---feature-gates=APIPriorityAndFairness=false \ ---runtime-config=flowcontrol.apiserver.k8s.io/v1beta2=false,flowcontrol.apiserver.k8s.io/v1beta3=false \ +--runtime-config=flowcontrol.apiserver.k8s.io/v1beta3=false \ # …and other flags as usual ``` -Alternatively, you can enable the v1alpha1 and v1beta1 versions of the API group -with `--runtime-config=flowcontrol.apiserver.k8s.io/v1alpha1=true,flowcontrol.apiserver.k8s.io/v1beta1=true`. - The command-line flag `--enable-priority-and-fairness=false` will disable the -API Priority and Fairness feature, even if other flags have enabled it. +API Priority and Fairness feature. ## Concepts @@ -178,14 +175,12 @@ server. ## Resources The flow control API involves two kinds of resources. -[PriorityLevelConfigurations](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#prioritylevelconfiguration-v1beta2-flowcontrol-apiserver-k8s-io) +[PriorityLevelConfigurations](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#prioritylevelconfiguration-v1-flowcontrol-apiserver-k8s-io) define the available priority levels, the share of the available concurrency budget that each can handle, and allow for fine-tuning queuing behavior.
-[FlowSchemas](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#flowschema-v1beta2-flowcontrol-apiserver-k8s-io) +[FlowSchemas](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#flowschema-v1-flowcontrol-apiserver-k8s-io) are used to classify individual inbound requests, matching each to a -single PriorityLevelConfiguration. There is also a `v1alpha1` version -of the same API group, and it has the same Kinds with the same syntax and -semantics. +single PriorityLevelConfiguration. ### PriorityLevelConfiguration diff --git a/content/en/examples/priority-and-fairness/health-for-strangers.yaml b/content/en/examples/priority-and-fairness/health-for-strangers.yaml index 86b92619e7f..312f80751ff 100644 --- a/content/en/examples/priority-and-fairness/health-for-strangers.yaml +++ b/content/en/examples/priority-and-fairness/health-for-strangers.yaml @@ -1,4 +1,4 @@ -apiVersion: flowcontrol.apiserver.k8s.io/v1beta3 +apiVersion: flowcontrol.apiserver.k8s.io/v1 kind: FlowSchema metadata: name: health-for-strangers diff --git a/content/en/examples/priority-and-fairness/list-events-default-service-account.yaml b/content/en/examples/priority-and-fairness/list-events-default-service-account.yaml index 94e73ae9488..e9e1beab998 100644 --- a/content/en/examples/priority-and-fairness/list-events-default-service-account.yaml +++ b/content/en/examples/priority-and-fairness/list-events-default-service-account.yaml @@ -1,4 +1,4 @@ -apiVersion: flowcontrol.apiserver.k8s.io/v1beta3 +apiVersion: flowcontrol.apiserver.k8s.io/v1 kind: FlowSchema metadata: name: list-events-default-service-account From 03e2976d908bee79b24c503b5e973fc0e4e54273 Mon Sep 17 00:00:00 2001 From: Nabarun Pal Date: Fri, 24 Nov 2023 12:03:35 +0530 Subject: [PATCH 57/82] Add more context to downgrade example Signed-off-by: Nabarun Pal --- content/en/docs/reference/access-authn-authz/authorization.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git 
a/content/en/docs/reference/access-authn-authz/authorization.md b/content/en/docs/reference/access-authn-authz/authorization.md index 83a7c818339..cf1ee8847cd 100644 --- a/content/en/docs/reference/access-authn-authz/authorization.md +++ b/content/en/docs/reference/access-authn-authz/authorization.md @@ -220,7 +220,7 @@ Note: When the feature is enabled, setting both `--authorization-config` and con {{< caution >}} While the feature is in Alpha/Beta, there is no change if you want to keep on using command line flags. When the feature goes Beta, the feature flag would be turned on by default. The feature flag would be removed when feature goes GA. -When configuring the authorizer chain using a config file, make sure all the apiserver nodes have the file. Also, take a note of the apiserver configuration when upgrading/downgrading the clusters. For example, if upgrading to v1.29+ clusters and using the config file, you would need to add to the flags back to the cluster bootrap mechanism. +When configuring the authorizer chain using a config file, make sure all the apiserver nodes have the file. Also, take a note of the apiserver configuration when upgrading/downgrading the clusters. For example, if upgrading to v1.29+ clusters and using the config file, you would need to make sure the config file exists before upgrading the cluster. When downgrading to v1.28, you would need to add the flags back to their bootstrap mechanism. 
{{< /caution >}} ```yaml From edddb55b7a8f0dc314333a39053d86dcfcb77c4f Mon Sep 17 00:00:00 2001 From: Kirtana Ashok Date: Tue, 17 Oct 2023 11:18:21 -0700 Subject: [PATCH 58/82] KEP 4216: Doc changes for image pull per runtime class Signed-off-by: Kirtana Ashok (cherry picked from commit 10a984d1ed258b9878bee94bfd779209d1ea0f8c) Signed-off-by: Kirtana Ashok --- content/en/docs/concepts/containers/images.md | 11 +++++++++++ .../command-line-tools-reference/feature-gates.md | 3 +++ 2 files changed, 14 insertions(+) diff --git a/content/en/docs/concepts/containers/images.md b/content/en/docs/concepts/containers/images.md index b01b2fd112e..230e613c333 100644 --- a/content/en/docs/concepts/containers/images.md +++ b/content/en/docs/concepts/containers/images.md @@ -159,6 +159,17 @@ that Kubernetes will keep trying to pull the image, with an increasing back-off Kubernetes raises the delay between each attempt until it reaches a compiled-in limit, which is 300 seconds (5 minutes). +## Image pull per runtime class + +{{< feature-state for_k8s_version="v1.29" state="alpha" >}} +Kubernetes includes alpha support for performing image pulls based on the RuntimeClass of a Pod. + +If you enable the `RuntimeClassInImageCriApi` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/), +the kubelet references container images by a tuple of (image name, runtime handler) rather than just the +image name or digest. Your {{< glossary_tooltip text="container runtime" term_id="container-runtime" >}} +may adapt its behavior based on the selected runtime handler. +Pulling images based on runtime class is helpful for VM-based containers, such as Windows Hyper-V containers. + ## Serial and parallel image pulls By default, kubelet pulls images serially.
In other words, kubelet sends only diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates.md b/content/en/docs/reference/command-line-tools-reference/feature-gates.md index 8701ea7d4fe..a33af1335af 100644 --- a/content/en/docs/reference/command-line-tools-reference/feature-gates.md +++ b/content/en/docs/reference/command-line-tools-reference/feature-gates.md @@ -182,6 +182,7 @@ For a reference to old feature gates that are removed, please refer to | `RemainingItemCount` | `true` | Beta | 1.16 | | | `RotateKubeletServerCertificate` | `false` | Alpha | 1.7 | 1.11 | | `RotateKubeletServerCertificate` | `true` | Beta | 1.12 | | +| `RuntimeClassInImageCriApi` | `false` | Alpha | 1.29 | | | `SELinuxMountReadWriteOncePod` | `false` | Alpha | 1.25 | 1.26 | | `SELinuxMountReadWriteOncePod` | `false` | Beta | 1.27 | 1.27 | | `SELinuxMountReadWriteOncePod` | `true` | Beta | 1.28 | | @@ -695,6 +696,8 @@ Each feature gate is designed for enabling/disabling a specific feature: - `RotateKubeletServerCertificate`: Enable the rotation of the server TLS certificate on the kubelet. See [kubelet configuration](/docs/reference/access-authn-authz/kubelet-tls-bootstrapping/#kubelet-configuration) for more details. +- `RuntimeClassInImageCriApi`: Enables images to be pulled based on the + [runtime class](/docs/concepts/containers/runtime-class/) of the pods that reference them. - `SELinuxMountReadWriteOncePod`: Speeds up container startup by allowing kubelet to mount volumes for a Pod directly with the correct SELinux label instead of changing each file on the volumes recursively. The initial implementation focused on ReadWriteOncePod volumes.
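A minimal sketch of how the `RuntimeClassInImageCriApi` gate documented in the patch above might be exercised, assuming a node whose container runtime already has a matching handler configured. The names `vm-isolated`, `example-vm-handler`, `runtime-class-demo`, and `registry.example/app:1.0` are placeholders for illustration and do not come from the patch. The three YAML documents are shown together for brevity; the first is a kubelet configuration file fragment, while the other two are API objects.

```yaml
# Kubelet configuration file fragment: turn on the alpha gate
# (listed as `false` by default for 1.29 in the feature gate table).
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  RuntimeClassInImageCriApi: true
---
# Hypothetical RuntimeClass. The handler value must match a handler that
# the container runtime on the node actually defines.
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: vm-isolated
handler: example-vm-handler
---
# Pod that selects the RuntimeClass above. With the gate enabled, the kubelet
# references the image by (image name, runtime handler) rather than by
# image name or digest alone.
apiVersion: v1
kind: Pod
metadata:
  name: runtime-class-demo
spec:
  runtimeClassName: vm-isolated
  containers:
  - name: app
    image: registry.example/app:1.0
```

Whether the runtime pulls or stores the image differently for this handler is up to the container runtime; the patch only states that it may adapt its behavior.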
From 21ac70ee24adde952da2a452fd6e139cb0238d60 Mon Sep 17 00:00:00 2001 From: Nabarun Pal Date: Mon, 27 Nov 2023 16:16:40 +0530 Subject: [PATCH 59/82] Wrap markdown text Signed-off-by: Nabarun Pal --- .../access-authn-authz/authorization.md | 29 ++++++++++++++++--- 1 file changed, 25 insertions(+), 4 deletions(-) diff --git a/content/en/docs/reference/access-authn-authz/authorization.md b/content/en/docs/reference/access-authn-authz/authorization.md index cf1ee8847cd..621cc9773b4 100644 --- a/content/en/docs/reference/access-authn-authz/authorization.md +++ b/content/en/docs/reference/access-authn-authz/authorization.md @@ -213,14 +213,35 @@ so an earlier module has higher priority to allow or deny a request. {{< feature-state state="alpha" for_k8s_version="v1.29" >}} -Kubernetes API Server authorizer chain can be configured using a config file by passing it through the `--authorization-config` flag. This feature enables creation of authorization chains with multiple webhooks with well-defined parameters that validate requests in a certain order and enables fine grained control like explicit Deny on failures. An example configuration with all possible values is provided below. In order to use the feature, the `StructuredAuthorizationConfiguration` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) has to be enabled. +The Kubernetes API server's authorizer chain can be configured using a +configuration file. -Note: When the feature is enabled, setting both `--authorization-config` and configuring an authorization webhook using the `--authorization-mode` and `--authorization-webhook-*` command line flags is not allowed. If done, there will be an error and API Server would exit right away. +You specify the path to that authorization configuration using the +`--authorization-config` command line argument. 
This feature enables +creation of authorization chains with multiple webhooks with well-defined +parameters that validate requests in a certain order and enables fine grained +control - such as explicit Deny on failures. An example configuration with +all possible values is provided below. + +In order to customise the authorizer chain, you need to enable the +`StructuredAuthorizationConfiguration` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/). + +Note: When the feature is enabled, setting both `--authorization-config` and +configuring an authorization webhook using the `--authorization-mode` and +`--authorization-webhook-*` command line flags is not allowed. If done, there +will be an error and API Server would exit right away. {{< caution >}} -While the feature is in Alpha/Beta, there is no change if you want to keep on using command line flags. When the feature goes Beta, the feature flag would be turned on by default. The feature flag would be removed when feature goes GA. +While the feature is in Alpha/Beta, there is no change if you want to keep on +using command line flags. When the feature goes Beta, the feature flag would +be turned on by default. The feature flag would be removed when feature goes GA. -When configuring the authorizer chain using a config file, make sure all the apiserver nodes have the file. Also, take a note of the apiserver configuration when upgrading/downgrading the clusters. For example, if upgrading to v1.29+ clusters and using the config file, you would need to make sure the config file exists before upgrading the cluster. When downgrading to v1.28, you would need to add the flags back to their bootstrap mechanism. +When configuring the authorizer chain using a config file, make sure all the +apiserver nodes have the file. Also, take a note of the apiserver configuration +when upgrading/downgrading the clusters. 
For example, if upgrading to v1.29+ +clusters and using the config file, you would need to make sure the config file +exists before upgrading the cluster. When downgrading to v1.28, you would need +to add the flags back to their bootstrap mechanism. {{< /caution >}} ```yaml From 75e93c6c23c3518f853251d5c6ed5b9bb54f382c Mon Sep 17 00:00:00 2001 From: Dan Winship Date: Thu, 19 Oct 2023 09:04:00 -0400 Subject: [PATCH 60/82] Document the nftables kube-proxy mode. --- .../docs/reference/networking/virtual-ips.md | 20 +++++++++++++++++++ 1 file changed, 20 insertions(+) diff --git a/content/en/docs/reference/networking/virtual-ips.md b/content/en/docs/reference/networking/virtual-ips.md index 1595834ee5c..623cb37525c 100644 --- a/content/en/docs/reference/networking/virtual-ips.md +++ b/content/en/docs/reference/networking/virtual-ips.md @@ -62,6 +62,9 @@ On Linux nodes, the available modes for kube-proxy are: [`ipvs`](#proxy-mode-ipvs) : a mode where the kube-proxy configures packet forwarding rules using ipvs. +[`nftables`](#proxy-mode-nftables) +: a mode where the kube-proxy configures packet forwarding rules using nftables. + There is only one mode available for kube-proxy on Windows: [`kernelspace`](#proxy-mode-kernelspace) @@ -268,6 +271,23 @@ falls back to running in iptables proxy mode. {{< figure src="/images/docs/services-ipvs-overview.svg" title="Virtual IP address mechanism for Services, using IPVS mode" class="diagram-medium" >}} +### `nftables` proxy mode {#proxy-mode-nftables} + +{{< feature-state for_k8s_version="v1.29" state="alpha" >}} + +_This proxy mode is only available on Linux nodes._ + +In this mode, kube-proxy configures packet forwarding rules using the +nftables API of the kernel netfilter subsystem. For each endpoint, it +installs nftables rules which, by default, select a backend Pod at +random. 
+ +The nftables API is the successor to the iptables API, and although it +is designed to provide better performance and scalability than +iptables, the kube-proxy nftables mode is still under heavy +development as of {{< skew currentVersion >}} and is not necessarily +expected to outperform the other Linux modes at this time. + ### `kernelspace` proxy mode {#proxy-mode-kernelspace} _This proxy mode is only available on Windows nodes._ From 41e0c2f21bac90af94d1044c4125460a271c7f2c Mon Sep 17 00:00:00 2001 From: Alex Zielenski Date: Mon, 27 Nov 2023 09:08:09 -0800 Subject: [PATCH 61/82] jpbetz feedback --- .../custom-resources/custom-resource-definitions.md | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/content/en/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions.md b/content/en/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions.md index b5ff4b07313..b704c4c3797 100644 --- a/content/en/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions.md +++ b/content/en/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions.md @@ -1185,11 +1185,12 @@ Setting `fieldPath` is optional. #### The `optionalOldSelf` field {#field-optional-oldself} -The `optionalOldSelf` field is a boolean field added in Kubernetes 1.29. The feature -[CRDValidationRatcheting](#validation-ratcheting) must be enabled in order to +{{< feature-state state="alpha" for_k8s_version="v1.29" >}} + +The feature [CRDValidationRatcheting](#validation-ratcheting) must be enabled in order to make use of this field. -This field alters the behavior of [Transition Rules](#transition-rules) described +The `optionalOldSelf` field is a boolean that field alters the behavior of [Transition Rules](#transition-rules) described below. Normally, a transition rule will not evaluate if `oldSelf` cannot be determined: during object creation or when a new value is introduced in an update. 
From 6f44e15b562517c278e4edbe34e29ad10dceecbb Mon Sep 17 00:00:00 2001 From: Alex Zielenski Date: Mon, 27 Nov 2023 09:17:40 -0800 Subject: [PATCH 62/82] typo fix --- .../custom-resources/custom-resource-definitions.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/en/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions.md b/content/en/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions.md index b704c4c3797..d765d7e6481 100644 --- a/content/en/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions.md +++ b/content/en/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions.md @@ -1190,7 +1190,7 @@ Setting `fieldPath` is optional. The feature [CRDValidationRatcheting](#validation-ratcheting) must be enabled in order to make use of this field. -The `optionalOldSelf` field is a boolean that field alters the behavior of [Transition Rules](#transition-rules) described +The `optionalOldSelf` field is a boolean field that alters the behavior of [Transition Rules](#transition-rules) described below. Normally, a transition rule will not evaluate if `oldSelf` cannot be determined: during object creation or when a new value is introduced in an update. @@ -1198,7 +1198,7 @@ If `optionalOldSelf` is set to true, then transition rules will always be evaluated and the type of `oldSelf` be changed to a CEL [`Optional`](https://pkg.go.dev/github.com/google/cel-go/cel#OptionalTypes) type. `optionalOldSelf` is useful in cases where schema authors would like a more -powerful tool than [implicit deepequal validation ratcheting][#validation-ratcheting] +powerful tool than the [default equality-based behavior of validation ratcheting](#validation-ratcheting) to introduce newer, usually stricter constraints on new values, while still allowing old values to be "grandfathered" or ratcheted using the older validation.
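To make the `optionalOldSelf` behavior described in the patch above more concrete, here is a hedged sketch of a schema fragment that uses it. The `replicas` property and the rule itself are hypothetical and are not taken from the patch; the sketch assumes the `CRDValidationRatcheting` feature is enabled, as the patched text requires, and uses the `hasValue()` and `value()` functions of the CEL optional type.

```yaml
# Fragment of a CustomResourceDefinition openAPIV3Schema (illustrative only).
properties:
  replicas:
    type: integer
    x-kubernetes-validations:
    # With optionalOldSelf set, oldSelf is a CEL optional: it carries no value
    # during object creation or when the field is first introduced in an update.
    - rule: "!oldSelf.hasValue() || self >= oldSelf.value()"
      message: replicas may not decrease
      # Evaluate this transition rule even when oldSelf cannot be determined.
      optionalOldSelf: true
```

Without `optionalOldSelf: true`, a rule that mentions `oldSelf` would be skipped when no old value exists. With it, the rule is always evaluated, so the expression has to handle the empty case itself, which is what the `!oldSelf.hasValue()` operand does here.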
From 74caa0daaaf30a6a0d03742499492b8c5009181c Mon Sep 17 00:00:00 2001 From: Anish Ramasekar Date: Mon, 27 Nov 2023 19:51:49 +0000 Subject: [PATCH 63/82] review feedback Signed-off-by: Anish Ramasekar --- .../access-authn-authz/authentication.md | 187 +++++++++++++++++- 1 file changed, 186 insertions(+), 1 deletion(-) diff --git a/content/en/docs/reference/access-authn-authz/authentication.md b/content/en/docs/reference/access-authn-authz/authentication.md index c80434eca98..4831723e8fe 100644 --- a/content/en/docs/reference/access-authn-authz/authentication.md +++ b/content/en/docs/reference/access-authn-authz/authentication.md @@ -333,11 +333,16 @@ To enable the plugin, configure the following flags on the API server: {{< feature-state for_k8s_version="v1.29" state="alpha" >}} +The JWT authenticator authenticates Kubernetes users using JWT-compliant tokens. The authenticator attempts to +parse a raw ID token and verify that it has been signed by the configured issuer. The public key used to verify the signature is discovered from the issuer's public endpoint using OIDC discovery. + The API server can be configured to use a JWT authenticator via the `--authentication-config` flag. This flag takes a path to a file containing the `AuthenticationConfiguration`. An example configuration is provided below. To use this config, the `StructuredAuthenticationConfiguration` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) has to be enabled. -Note: When the feature is enabled, setting both `--authentication-config` and any of the `--oidc-*` flags will result in an error. If you want to use the feature, you have to remove the `--oidc-*` flags and use the configuration file instead. +{{< note >}} +When the feature is enabled, setting both `--authentication-config` and any of the `--oidc-*` flags will result in an error. If you want to use the feature, you have to remove the `--oidc-*` flags and use the configuration file instead.
+{{< /note >}} ```yaml --- @@ -360,6 +365,7 @@ jwt: requiredValue: example.com # Instead of claim and requiredValue, you can use expression to validate the claim. # expression is a CEL expression that evaluates to a boolean. + # all the expressions must evaluate to true for validation to succeed. - expression: 'claims.hd == "example.com"' # Message customizes the error message seen in the API server logs when the validation fails. message: the hd claim must be set to example.com @@ -404,6 +410,7 @@ jwt: # validation rules applied to the final user object. userValidationRules: # expression is a CEL expression that evaluates to a boolean. + # all the expressions must evaluate to true for the user to be valid. - expression: "!user.username.startsWith('system:')" # Message customizes the error message seen in the API server logs when the validation fails. message: 'username cannot used reserved system: prefix' @@ -420,6 +427,7 @@ jwt: `jwt.userValidationRules[i].expression` represents the expression which will be evaluated by CEL. CEL expressions have access to the contents of `userInfo`, organized into `user` CEL variable. + Refer to the [UserInfo](/docs/reference/generated/kubernetes-api/v{{< skew currentVersion >}}/#userinfo-v1-authentication-k8s-io) API documentation for the schema of `user`. * Claim mapping expression `jwt.claimMappings.username.expression`, `jwt.claimMappings.groups.expression`, `jwt.claimMappings.uid.expression` @@ -429,6 +437,183 @@ jwt: To learn more, see the [Documentation on CEL](/docs/reference/using-api/cel/) + Here are examples of the `AuthenticationConfiguration` with different token payloads. 
+ + {{< tabs name="example_configuration" >}} + {{% tab name="Valid token" %}} + ```yaml + apiVersion: apiserver.config.k8s.io/v1alpha1 + kind: AuthenticationConfiguration + jwt: + - issuer: + url: https://example.com + audiences: + - my-app + claimMappings: + username: + expression: 'claims.username + ":external-user"' + groups: + expression: 'claims.roles.split(",")' + uid: + expression: 'claims.sub' + extra: + - key: 'example.com/tenant' + valueExpression: 'claims.tenant' + userValidationRules: + - expression: "!user.username.startsWith('system:')" # the expression will evaluate to true, so validation will succeed. + message: 'username cannot used reserved system: prefix' + ``` + + ```bash + TOKEN=eyJhbGciOiJSUzI1NiIsImtpZCI6ImY3dF9tOEROWmFTQk1oWGw5QXZTWGhBUC04Y0JmZ0JVbFVpTG5oQkgxdXMiLCJ0eXAiOiJKV1QifQ.eyJhdWQiOiJrdWJlcm5ldGVzIiwiZXhwIjoxNzAzMjMyOTQ5LCJpYXQiOjE3MDExMDcyMzMsImlzcyI6Imh0dHBzOi8vZXhhbXBsZS5jb20iLCJqdGkiOiI3YzMzNzk0MjgwN2U3M2NhYTJjMzBjODY4YWMwY2U5MTBiY2UwMmRkY2JmZWJlOGMyM2I4YjVmMjdhZDYyODczIiwibmJmIjoxNzAxMTA3MjMzLCJyb2xlcyI6InVzZXIsYWRtaW4iLCJzdWIiOiJhdXRoIiwidGVuYW50IjoiNzJmOTg4YmYtODZmMS00MWFmLTkxYWItMmQ3Y2QwMTFkYjRhIiwidXNlcm5hbWUiOiJmb28ifQ.TBWF2RkQHm4QQz85AYPcwLxSk-VLvQW-mNDHx7SEOSv9LVwcPYPuPajJpuQn9C_gKq1R94QKSQ5F6UgHMILz8OfmPKmX_00wpwwNVGeevJ79ieX2V-__W56iNR5gJ-i9nn6FYk5pwfVREB0l4HSlpTOmu80gbPWAXY5hLW0ZtcE1JTEEmefORHV2ge8e3jp1xGafNy6LdJWabYuKiw8d7Qga__HxtKB-t0kRMNzLRS7rka_SfQg0dSYektuxhLbiDkqhmRffGlQKXGVzUsuvFw7IGM5ZWnZgEMDzCI357obHeM3tRqpn5WRjtB8oM7JgnCymaJi-P3iCd88iu1xnzA + ``` + where the token payload is: + + ```json + { + "aud": "kubernetes", + "exp": 1703232949, + "iat": 1701107233, + "iss": "https://example.com", + "jti": "7c337942807e73caa2c30c868ac0ce910bce02ddcbfebe8c23b8b5f27ad62873", + "nbf": 1701107233, + "roles": "user,admin", + "sub": "auth", + "tenant": "72f988bf-86f1-41af-91ab-2d7cd011db4a", + "username": "foo" + } + ``` + + The token with the above `AuthenticationConfiguration` will produce the following `UserInfo` object 
and successfully authenticate the user. + + ```json + { + "username": "foo:external-user", + "uid": "auth", + "groups": [ + "user", + "admin" + ], + "extra": { + "example.com/tenant": "72f988bf-86f1-41af-91ab-2d7cd011db4a" + } + } + ``` + {{% /tab %}} + {{% tab name="Fails claim validation" %}} + ```yaml + apiVersion: apiserver.config.k8s.io/v1alpha1 + kind: AuthenticationConfiguration + jwt: + - issuer: + url: https://example.com + audiences: + - my-app + claimValidationRules: + - expression: 'claims.hd == "example.com"' # the token below does not have this claim, so validation will fail. + message: the hd claim must be set to example.com + claimMappings: + username: + expression: 'claims.username + ":external-user"' + groups: + expression: 'claims.roles.split(",")' + uid: + expression: 'claims.sub' + extra: + - key: 'example.com/tenant' + valueExpression: 'claims.tenant' + userValidationRules: + - expression: "!user.username.startsWith('system:')" # the expression will evaluate to true, so validation will succeed.
+ message: 'username cannot used reserved system: prefix' + ``` + + ```bash + TOKEN=eyJhbGciOiJSUzI1NiIsImtpZCI6ImY3dF9tOEROWmFTQk1oWGw5QXZTWGhBUC04Y0JmZ0JVbFVpTG5oQkgxdXMiLCJ0eXAiOiJKV1QifQ.eyJhdWQiOiJrdWJlcm5ldGVzIiwiZXhwIjoxNzAzMjMyOTQ5LCJpYXQiOjE3MDExMDcyMzMsImlzcyI6Imh0dHBzOi8vZXhhbXBsZS5jb20iLCJqdGkiOiI3YzMzNzk0MjgwN2U3M2NhYTJjMzBjODY4YWMwY2U5MTBiY2UwMmRkY2JmZWJlOGMyM2I4YjVmMjdhZDYyODczIiwibmJmIjoxNzAxMTA3MjMzLCJyb2xlcyI6InVzZXIsYWRtaW4iLCJzdWIiOiJhdXRoIiwidGVuYW50IjoiNzJmOTg4YmYtODZmMS00MWFmLTkxYWItMmQ3Y2QwMTFkYjRhIiwidXNlcm5hbWUiOiJmb28ifQ.TBWF2RkQHm4QQz85AYPcwLxSk-VLvQW-mNDHx7SEOSv9LVwcPYPuPajJpuQn9C_gKq1R94QKSQ5F6UgHMILz8OfmPKmX_00wpwwNVGeevJ79ieX2V-__W56iNR5gJ-i9nn6FYk5pwfVREB0l4HSlpTOmu80gbPWAXY5hLW0ZtcE1JTEEmefORHV2ge8e3jp1xGafNy6LdJWabYuKiw8d7Qga__HxtKB-t0kRMNzLRS7rka_SfQg0dSYektuxhLbiDkqhmRffGlQKXGVzUsuvFw7IGM5ZWnZgEMDzCI357obHeM3tRqpn5WRjtB8oM7JgnCymaJi-P3iCd88iu1xnzA + ``` + where the token payload is: + ```json + { + "aud": "kubernetes", + "exp": 1703232949, + "iat": 1701107233, + "iss": "https://example.com", + "jti": "7c337942807e73caa2c30c868ac0ce910bce02ddcbfebe8c23b8b5f27ad62873", + "nbf": 1701107233, + "roles": "user,admin", + "sub": "auth", + "tenant": "72f988bf-86f1-41af-91ab-2d7cd011db4a", + "username": "foo" + } + ``` + + The token with the above `AuthenticationConfiguration` will fail to authenticate because the `hd` claim is not set to `example.com`. The API server will return `401 Unauthorized` error. + {{% /tab %}} + {{% tab name="Fails user validation" %}} + ```yaml + apiVersion: apiserver.config.k8s.io/v1alpha1 + kind: AuthenticationConfiguration + jwt: + - issuer: + url: https://example.com + audiences: + - my-app + claimValidationRules: + - expression: 'claims.hd == "example.com"' + message: the hd claim must be set to example.com + claimMappings: + username: + expression: '"system:" + claims.username' # this will prefix the username with "system:" and will fail user validation. 
+ groups: + expression: 'claims.roles.split(",")' + uid: + expression: 'claims.sub' + extra: + - key: 'example.com/tenant' + valueExpression: 'claims.tenant' + userValidationRules: + - expression: "!user.username.startsWith('system:')" # the username will be system:foo and expression will evaluate to false, so validation will fail. + message: 'username cannot used reserved system: prefix' + ``` + ```bash + TOKEN=eyJhbGciOiJSUzI1NiIsImtpZCI6ImY3dF9tOEROWmFTQk1oWGw5QXZTWGhBUC04Y0JmZ0JVbFVpTG5oQkgxdXMiLCJ0eXAiOiJKV1QifQ.eyJhdWQiOiJrdWJlcm5ldGVzIiwiZXhwIjoxNzAzMjMyOTQ5LCJoZCI6ImV4YW1wbGUuY29tIiwiaWF0IjoxNzAxMTEzMTAxLCJpc3MiOiJodHRwczovL2V4YW1wbGUuY29tIiwianRpIjoiYjViMDY1MjM3MmNkMjBlMzQ1YjZmZGZmY2RjMjE4MWY0YWZkNmYyNTlhYWI0YjdlMzU4ODEyMzdkMjkyMjBiYyIsIm5iZiI6MTcwMTExMzEwMSwicm9sZXMiOiJ1c2VyLGFkbWluIiwic3ViIjoiYXV0aCIsInRlbmFudCI6IjcyZjk4OGJmLTg2ZjEtNDFhZi05MWFiLTJkN2NkMDExZGI0YSIsInVzZXJuYW1lIjoiZm9vIn0.FgPJBYLobo9jnbHreooBlvpgEcSPWnKfX6dc0IvdlRB-F0dCcgy91oCJeK_aBk-8zH5AKUXoFTlInfLCkPivMOJqMECA1YTrMUwt_IVqwb116AqihfByUYIIqzMjvUbthtbpIeHQm2fF0HbrUqa_Q0uaYwgy8mD807h7sBcUMjNd215ff_nFIHss-9zegH8GI1d9fiBf-g6zjkR1j987EP748khpQh9IxPjMJbSgG_uH5x80YFuqgEWwq-aYJPQxXX6FatP96a2EAn7wfPpGlPRt0HcBOvq5pCnudgCgfVgiOJiLr_7robQu4T1bis0W75VPEvwWtgFcLnvcQx0JWg + ``` + where the token payload is: + + ```json + { + "aud": "kubernetes", + "exp": 1703232949, + "hd": "example.com", + "iat": 1701113101, + "iss": "https://example.com", + "jti": "b5b0652372cd20e345b6fdffcdc2181f4afd6f259aab4b7e35881237d29220bc", + "nbf": 1701113101, + "roles": "user,admin", + "sub": "auth", + "tenant": "72f988bf-86f1-41af-91ab-2d7cd011db4a", + "username": "foo" + } + ``` + + The token with the above `AuthenticationConfiguration` will produce the following `UserInfo` object: + + ```json + { + "username": "system:foo", + "uid": "auth", + "groups": [ + "user", + "admin" + ], + "extra": { + "example.com/tenant": "tenant1" + } + } + ``` + which will fail user validation because the username starts with `system:`. 
The API server will return `401 Unauthorized` error. + {{% /tab %}} + {{< /tabs >}} + Importantly, the API server is not an OAuth2 client, rather it can only be configured to trust a single issuer. This allows the use of public providers, such as Google, without trusting credentials issued to third parties. Admins who From d6f07783e17f3e816837c4fcbfd09aca6182e449 Mon Sep 17 00:00:00 2001 From: Dan Winship Date: Sun, 26 Nov 2023 18:52:36 -0500 Subject: [PATCH 64/82] Remove description of how iptables kube-proxy differs from userspace The iptables kube-proxy documentation notes that it has "lower system overhead", but doesn't mention what it's lower than; it's talking about the userspace proxy, which no longer exists, and which no current documentation readers would think to compare the iptables proxy mode to. Likewise, there is no point in explaining how iptables mode endpoint selection differs from userspace mode endpoint selection, because the iptables mode behaves in the way that everyone would consider normal. It was the userspace proxy that was weird, and so we had to document the *change* in behavior when we introduced the iptables proxy, but there's no reason to keep documenting "we don't do something you wouldn't have expected us to do" now. --- .../en/docs/reference/networking/virtual-ips.md | 14 -------------- 1 file changed, 14 deletions(-) diff --git a/content/en/docs/reference/networking/virtual-ips.md b/content/en/docs/reference/networking/virtual-ips.md index 623cb37525c..5f775b01e22 100644 --- a/content/en/docs/reference/networking/virtual-ips.md +++ b/content/en/docs/reference/networking/virtual-ips.md @@ -85,20 +85,6 @@ select a backend Pod. By default, kube-proxy in iptables mode chooses a backend at random. -Using iptables to handle traffic has a lower system overhead, because traffic -is handled by Linux netfilter without the need to switch between userspace and the -kernel space. This approach is also likely to be more reliable. 
- -If kube-proxy is running in iptables mode and the first Pod that's selected -does not respond, the connection fails. This is different from the old `userspace` -mode: in that scenario, kube-proxy would detect that the connection to the first -Pod had failed and would automatically retry with a different backend Pod. - -You can use Pod [readiness probes](/docs/concepts/workloads/pods/pod-lifecycle/#container-probes) -to verify that backend Pods are working OK, so that kube-proxy in iptables mode -only sees backends that test out as healthy. Doing this means you avoid -having traffic sent via kube-proxy to a Pod that's known to have failed. - {{< figure src="/images/docs/services-iptables-overview.svg" title="Virtual IP mechanism for Services, using iptables mode" class="diagram-medium" >}} #### Example {#packet-processing-iptables} From b34bf12fbae17b4b93b657d610bfd2b1cb61ee0c Mon Sep 17 00:00:00 2001 From: Peter Hunt Date: Tue, 14 Nov 2023 13:42:00 -0500 Subject: [PATCH 65/82] garbage collection: add blurb about ImageMaximumGCAge Signed-off-by: Peter Hunt --- .../concepts/architecture/garbage-collection.md | 16 +++++++++++++++- 1 file changed, 15 insertions(+), 1 deletion(-) diff --git a/content/en/docs/concepts/architecture/garbage-collection.md b/content/en/docs/concepts/architecture/garbage-collection.md index 947ad515fc9..98eeb9fc7d4 100644 --- a/content/en/docs/concepts/architecture/garbage-collection.md +++ b/content/en/docs/concepts/architecture/garbage-collection.md @@ -137,6 +137,20 @@ collection, which deletes images in order based on the last time they were used, starting with the oldest first. The kubelet deletes images until disk usage reaches the `LowThresholdPercent` value. +#### Garbage collection for unused container images {#image-maximum-age-gc} + +{{< feature-state for_k8s_version="v1.29" state="alpha" >}} + +As an alpha feature, you can specify the maximum time a local image can be unused for, +regardless of disk usage. 
This is a kubelet setting that you configure for each node. + +To configure the setting, enable the `ImageMaximumGCAge` +[feature gate](/docs/reference/command-line-tools-reference/feature-gates/) for the kubelet, +and also set a value for the `ImageMaximumGCAge` field in the kubelet configuration file. + +The value is specified as a Kubernetes _duration_; for example, you can set the configuration +field to `3d12h`, which means 3 days and 12 hours. + ### Container garbage collection {#container-image-garbage-collection} The kubelet garbage collects unused containers based on the following variables, @@ -178,4 +192,4 @@ configure garbage collection: * Learn more about [ownership of Kubernetes objects](/docs/concepts/overview/working-with-objects/owners-dependents/). * Learn more about Kubernetes [finalizers](/docs/concepts/overview/working-with-objects/finalizers/). -* Learn about the [TTL controller](/docs/concepts/workloads/controllers/ttlafterfinished/) that cleans up finished Jobs. +* Learn about the [TTL controller](/docs/concepts/workloads/controllers/ttlafterfinished/) that cleans up finished Jobs. 
\ No newline at end of file From 90c282e6cf2cede5d298e8eb945d8129bd73a950 Mon Sep 17 00:00:00 2001 From: Pranshu Srivastava Date: Mon, 13 Nov 2023 11:04:58 +0530 Subject: [PATCH 66/82] kep-2305: document dynamic cardinality enforcement Signed-off-by: Pranshu Srivastava --- .../cluster-administration/system-metrics.md | 17 +++++++++++++++-- 1 file changed, 15 insertions(+), 2 deletions(-) diff --git a/content/en/docs/concepts/cluster-administration/system-metrics.md b/content/en/docs/concepts/cluster-administration/system-metrics.md index 6f8b7227743..ff6b41bbcd0 100644 --- a/content/en/docs/concepts/cluster-administration/system-metrics.md +++ b/content/en/docs/concepts/cluster-administration/system-metrics.md @@ -202,10 +202,23 @@ Here is an example: --allow-label-value number_count_metric,odd_number='1,3,5', number_count_metric,even_number='2,4,6', date_gauge_metric,weekend='Saturday,Sunday' ``` +In addition to specifying this from the CLI, this can also be done within a configuration file. You +can specify the path to that configuration file using the `--allow-metric-labels-manifest` command +line argument to a component. Here's an example of the contents of that configuration file: + +```yaml +allow-list: +- "metric1,label2": "v1,v2,v3" +- "metric2,label1": "v1,v2,v3" +``` + +Additionally, the `cardinality_enforcement_unexpected_categorizations_total` meta-metric records the +count of unexpected categorizations during cardinality enforcement, that is, whenever a label value +is encountered that is not allowed with respect to the allow-list constraints.
+ ## {{% heading "whatsnext" %}} * Read about the [Prometheus text format](https://github.com/prometheus/docs/blob/master/content/docs/instrumenting/exposition_formats.md#text-based-format) for metrics * See the list of [stable Kubernetes metrics](https://github.com/kubernetes/kubernetes/blob/master/test/instrumentation/testdata/stable-metrics-list.yaml) -* Read about the [Kubernetes deprecation policy](/docs/reference/using-api/deprecation-policy/#deprecating-a-feature-or-behavior) - +* Read about the [Kubernetes deprecation policy](/docs/reference/using-api/deprecation-policy/#deprecating-a-feature-or-behavior) \ No newline at end of file From 4e156c738d411ed4965739f9ad187cd8f37b2dbc Mon Sep 17 00:00:00 2001 From: Sascha Grunert Date: Fri, 3 Nov 2023 10:29:17 +0100 Subject: [PATCH 67/82] Add documentation about user namespaces and PSS Adding required documentation for [KEP-127](https://github.com/kubernetes/enhancements/issues/127). Signed-off-by: Sascha Grunert Signed-off-by: Rodrigo Campos Co-authored-by: Tim Bannister Signed-off-by: Sascha Grunert --- .../workloads/pods/user-namespaces.md | 29 +++++++++++++++++++ .../feature-gates.md | 5 ++++ 2 files changed, 34 insertions(+) diff --git a/content/en/docs/concepts/workloads/pods/user-namespaces.md b/content/en/docs/concepts/workloads/pods/user-namespaces.md index fa51a47d305..410b3c90524 100644 --- a/content/en/docs/concepts/workloads/pods/user-namespaces.md +++ b/content/en/docs/concepts/workloads/pods/user-namespaces.md @@ -152,6 +152,35 @@ host's file owner/group. [CVE-2021-25741]: https://github.com/kubernetes/kubernetes/issues/104980 +## Integration with Pod security admission checks + +{{< feature-state state="alpha" for_k8s_version="v1.29" >}} + +For Linux Pods that enable user namespaces, Kubernetes relaxes the application of +[Pod Security Standards](/docs/concepts/security/pod-security-standards) in a controlled way. 
+This behavior can be controlled by the [feature +gate](/docs/reference/command-line-tools-reference/feature-gates/) +`UserNamespacesPodSecurityStandards`, which allows an early opt-in for end +users. Admins have to ensure that user namespaces are enabled on all nodes +within the cluster if using the feature gate. + +If you enable the associated feature gate and create a Pod that uses user +namespaces, the following fields won't be constrained even in contexts that enforce the +_Baseline_ or _Restricted_ pod security standard. This behavior does not +present a security concern because `root` inside a Pod with user namespaces +actually refers to the user inside the container, which is never mapped to a +privileged user on the host. Here's the list of fields that are **not** checked for Pods in those +circumstances: + +- `spec.securityContext.runAsNonRoot` +- `spec.containers[*].securityContext.runAsNonRoot` +- `spec.initContainers[*].securityContext.runAsNonRoot` +- `spec.ephemeralContainers[*].securityContext.runAsNonRoot` +- `spec.securityContext.runAsUser` +- `spec.containers[*].securityContext.runAsUser` +- `spec.initContainers[*].securityContext.runAsUser` +- `spec.ephemeralContainers[*].securityContext.runAsUser` + ## Limitations When using a user namespace for the pod, it is disallowed to use other host diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates.md b/content/en/docs/reference/command-line-tools-reference/feature-gates.md index 313c511600b..6415e461543 100644 --- a/content/en/docs/reference/command-line-tools-reference/feature-gates.md +++ b/content/en/docs/reference/command-line-tools-reference/feature-gates.md @@ -212,6 +212,7 @@ For a reference to old feature gates that are removed, please refer to | `TopologyManagerPolicyOptions` | `true` | Beta | 1.28 | | | `TranslateStreamCloseWebsocketRequests` | `false` | Alpha | 1.29 | | | `UnknownVersionInteroperabilityProxy` | `false` | Alpha | 1.28 | | +| 
`UserNamespacesPodSecurityStandards` | `false` | Alpha | 1.29 | | | `UserNamespacesSupport` | `false` | Alpha | 1.28 | | | `ValidatingAdmissionPolicy` | `false` | Alpha | 1.26 | 1.27 | | `ValidatingAdmissionPolicy` | `false` | Beta | 1.28 | | @@ -803,6 +804,10 @@ Each feature gate is designed for enabling/disabling a specific feature: - `UnknownVersionInteroperabilityProxy`: Proxy resource requests to the correct peer kube-apiserver when multiple kube-apiservers exist at varied versions. See [Mixed version proxy](/docs/concepts/architecture/mixed-version-proxy/) for more information. +- `UserNamespacesPodSecurityStandards`: Enable Pod Security Standards policies relaxation for pods + that run with namespaces. You must set the value of this feature gate consistently across all nodes in + your cluster, and you must also enable `UserNamespacesSupport` to use this feature. + See [User Namespaces](/docs/concepts/workloads/pods/user-namespaces/#integration-with-pod-security-standards) for more details. - `UserNamespacesSupport`: Enable user namespace support for Pods. Before Kubernetes v1.28, this feature gate was named `UserNamespacesStatelessPodsSupport`. 
- `ValidatingAdmissionPolicy`: Enable [ValidatingAdmissionPolicy](/docs/reference/access-authn-authz/validating-admission-policy/) From 6f2db0bfa6cca53f0b967ecaf990447b4017f18e Mon Sep 17 00:00:00 2001 From: Kirtana Ashok Date: Tue, 28 Nov 2023 09:43:41 -0800 Subject: [PATCH 68/82] Change font size for image pull per runtime doc Signed-off-by: Kirtana Ashok --- content/en/docs/concepts/containers/images.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/content/en/docs/concepts/containers/images.md b/content/en/docs/concepts/containers/images.md index 128f6668f26..9b36a6b7280 100644 --- a/content/en/docs/concepts/containers/images.md +++ b/content/en/docs/concepts/containers/images.md @@ -159,7 +159,7 @@ that Kubernetes will keep trying to pull the image, with an increasing back-off Kubernetes raises the delay between each attempt until it reaches a compiled-in limit, which is 300 seconds (5 minutes). -## Image pull per runtime class +### Image pull per runtime class {{< feature-state for_k8s_version="v1.29" state="alpha" >}} Kubernetes includes alpha support for performing image pulls based on the RuntimeClass of a Pod. 
From 8ccd0cc3b20bb65662c4a37d3437e5bbed54f788 Mon Sep 17 00:00:00 2001 From: Sunny Song Date: Thu, 12 Oct 2023 17:28:43 -0700 Subject: [PATCH 69/82] Add Documentation for VolumeAttributesClass KEP-3751 --- .../storage/volume-attributes-classes.md | 119 ++++++++++++++++++ 1 file changed, 119 insertions(+) create mode 100644 content/en/docs/concepts/storage/volume-attributes-classes.md diff --git a/content/en/docs/concepts/storage/volume-attributes-classes.md b/content/en/docs/concepts/storage/volume-attributes-classes.md new file mode 100644 index 00000000000..29729640500 --- /dev/null +++ b/content/en/docs/concepts/storage/volume-attributes-classes.md @@ -0,0 +1,119 @@ +--- +reviewers: +- msau42 +- xing-yang +title: Volume Attributes Classes +content_type: concept +weight: 40 +--- + + +This page assumes that you are familiar with [StorageClasses](/docs/concepts/storage/storage-classes/), +[volumes](/docs/concepts/storage/volumes/) and [PersistentVolumes](/docs/concepts/storage/persistent-volumes/) +in Kubernetes. + + + +## Introduction + +A VolumeAttributesClass provides a way for administrators to describe the mutable +"classes" of storage they offer. Different classes might map to different quality-of-service levels. +Kubernetes itself is unopinionated about what these classes represent. + + +## The VolumeAttributesClass Resource + +Each VolumeAttributesClass contains the `driverName` and `parameters`, which are used when a PersistentVolume belonging to the class needs to be dynamically provisioned or modified. + +The name of a VolumeAttributesClass object is significant and is how users can request a particular class. +Administrators set the name and other parameters of a class when first creating VolumeAttributesClass objects. +While the name of a VolumeAttributesClass object in a PersistentVolumeClaim is mutable, the parameters in an existing class are immutable. 
+ + +```yaml +apiVersion: storage.k8s.io/v1alpha1 +kind: VolumeAttributesClass +metadata: + name: silver +driverName: pd.csi.storage.gke.io +parameters: + provisioned-iops: "3000" + provisioned-throughput: "50" +``` + + +### Provisioner + +Each VolumeAttributes has a provisioner that determines what volume plugin is used for provisioning PVs. This field must be specified. + +The feature support for VolumeAttributesClass is implemented in [kubernetes-csi/external-provisioner](https://github.com/kubernetes-csi/external-provisioner). + +You are not restricted to specifying the [kubernetes-csi/external-provisioner](https://github.com/kubernetes-csi/external-provisioner). You can also run and specify external provisioners, +which are independent programs that follow a specification defined by Kubernetes. +Authors of external provisioners have full discretion over where their code lives, how +the provisioner is shipped, how it needs to be run, what volume plugin it uses (including Flex), etc. + + +### Resizer + +Each VolumeAttributes has a resizer that determines what volume plugin is used for modifying PVs. This field must be specified. + +The modifying volume feature support for VolumeAttributesClass is implemented in [kubernetes-csi/external-resizer](https://github.com/kubernetes-csi/external-resizer). 
+ +For example the existing PVC is using VolumeAttributesClass silver: + + +```yaml +apiVersion: v1 +kind: PersistentVolumeClaim +metadata: + name: test-pv-claim +spec: + … + volumeAttributesClassName: gold + … +``` + + +A new VolumeAttributesClass gold is available in the cluster: + + +```yaml +apiVersion: storage.k8s.io/v1alpha1 +kind: VolumeAttributesClass +metadata: + name: gold +driverName: pd.csi.storage.gke.io +parameters: + iops: "4000" + throughput: "60" +``` + + +The end user can update the PVC with the new VolumeAttributesClass gold and apply: + + +```yaml +apiVersion: v1 +kind: PersistentVolumeClaim +metadata: + name: test-pv-claim +spec: + … + volumeAttributesClassName: gold + … +``` + + +## Parameters + +Volume Classes have parameters that describe volumes belonging to them. Different parameters may be accepted +depending on the provisioner or the resizer. For example, the value `4000`, for the parameter `iops`, +and the parameter `throughput` are specific to GCE PD. +When a parameter is omitted, the default is used at volume provisioning. +If a user applies the PVC with a different VolumeAttributesClass that omits some parameters, whether the default values of +those parameters are used depends on the CSI driver implementation. +Please refer to the related CSI driver documentation for more details. + +There can be at most 512 parameters defined for a VolumeAttributesClass. +The total length of the parameters object including its keys and values cannot exceed 256 KiB. 
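The two limits stated above (at most 512 parameters, at most 256 KiB in total) can be checked with a short Python sketch before submitting a manifest. The function name `validate_parameters` is hypothetical, and counting length as the UTF-8 byte size of keys plus values is an assumption; the API server's exact accounting may differ.

```python
MAX_PARAMETERS = 512           # "at most 512 parameters"
MAX_TOTAL_BYTES = 256 * 1024   # "cannot exceed 256 KiB"


def validate_parameters(parameters):
    """Return a list of problems; an empty list means within both limits."""
    problems = []
    if len(parameters) > MAX_PARAMETERS:
        problems.append(
            f"{len(parameters)} parameters exceed the limit of {MAX_PARAMETERS}"
        )
    # Assumption: total length is the byte size of all keys and values.
    total = sum(
        len(key.encode("utf-8")) + len(value.encode("utf-8"))
        for key, value in parameters.items()
    )
    if total > MAX_TOTAL_BYTES:
        problems.append(
            f"total size {total} bytes exceeds the limit of {MAX_TOTAL_BYTES}"
        )
    return problems
```

For the `gold` class shown earlier, `validate_parameters({"iops": "4000", "throughput": "60"})` reports no problems under these assumptions.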
\ No newline at end of file From b90698e025c05c10f68b2071c64b64817dd6b464 Mon Sep 17 00:00:00 2001 From: Sunny Song Date: Mon, 27 Nov 2023 16:15:53 -0800 Subject: [PATCH 70/82] Update Based on Comments - Nov 27 --- .../en/docs/concepts/storage/volume-attributes-classes.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/content/en/docs/concepts/storage/volume-attributes-classes.md b/content/en/docs/concepts/storage/volume-attributes-classes.md index 29729640500..4967d004f64 100644 --- a/content/en/docs/concepts/storage/volume-attributes-classes.md +++ b/content/en/docs/concepts/storage/volume-attributes-classes.md @@ -44,19 +44,19 @@ parameters: ### Provisioner -Each VolumeAttributes has a provisioner that determines what volume plugin is used for provisioning PVs. This field must be specified. +Each VolumeAttributesClass has a provisioner that determines what volume plugin is used for provisioning PVs. The field ``driverName`` must be specified. The feature support for VolumeAttributesClass is implemented in [kubernetes-csi/external-provisioner](https://github.com/kubernetes-csi/external-provisioner). You are not restricted to specifying the [kubernetes-csi/external-provisioner](https://github.com/kubernetes-csi/external-provisioner). You can also run and specify external provisioners, which are independent programs that follow a specification defined by Kubernetes. Authors of external provisioners have full discretion over where their code lives, how -the provisioner is shipped, how it needs to be run, what volume plugin it uses (including Flex), etc. +the provisioner is shipped, how it needs to be run, what volume plugin it uses, etc. ### Resizer -Each VolumeAttributes has a resizer that determines what volume plugin is used for modifying PVs. This field must be specified. +Each VolumeAttributesClass has a resizer that determines what volume plugin is used for modifying PVs. The field ``driverName`` must be specified. 
The modifying volume feature support for VolumeAttributesClass is implemented in [kubernetes-csi/external-resizer](https://github.com/kubernetes-csi/external-resizer). From 058e522b6304ecfe5b824c29643c6d2af6d912ba Mon Sep 17 00:00:00 2001 From: Sunny Song Date: Tue, 28 Nov 2023 09:04:39 -0800 Subject: [PATCH 71/82] Update Based on Comments - Nov 28 --- .../concepts/storage/persistent-volumes.md | 3 +- .../docs/concepts/storage/storage-classes.md | 4 +-- .../storage/volume-attributes-classes.md | 36 ++++++++++++------- .../feature-gates.md | 4 +++ 4 files changed, 31 insertions(+), 16 deletions(-) diff --git a/content/en/docs/concepts/storage/persistent-volumes.md b/content/en/docs/concepts/storage/persistent-volumes.md index b6d18462c4a..4b6086a4aa8 100644 --- a/content/en/docs/concepts/storage/persistent-volumes.md +++ b/content/en/docs/concepts/storage/persistent-volumes.md @@ -17,7 +17,8 @@ weight: 20 This document describes _persistent volumes_ in Kubernetes. Familiarity with -[volumes](/docs/concepts/storage/volumes/) is suggested. +[volumes](/docs/concepts/storage/volumes/), [StorageClasses](/docs/concepts/storage/storage-classes/) +and [VolumeAttributesClasses](/docs/concepts/storage/volume-attributes-classes/) is suggested. diff --git a/content/en/docs/concepts/storage/storage-classes.md b/content/en/docs/concepts/storage/storage-classes.md index 393d72a77bf..c68f8b935ec 100644 --- a/content/en/docs/concepts/storage/storage-classes.md +++ b/content/en/docs/concepts/storage/storage-classes.md @@ -17,8 +17,6 @@ with [volumes](/docs/concepts/storage/volumes/) and -## Introduction - A StorageClass provides a way for administrators to describe the "classes" of storage they offer. Different classes might map to quality-of-service levels, or to backup policies, or to arbitrary policies determined by the cluster @@ -26,7 +24,7 @@ administrators. Kubernetes itself is unopinionated about what classes represent. 
This concept is sometimes called "profiles" in other storage systems. -## The StorageClass Resource +## The StorageClass API Each StorageClass contains the fields `provisioner`, `parameters`, and `reclaimPolicy`, which are used when a PersistentVolume belonging to the diff --git a/content/en/docs/concepts/storage/volume-attributes-classes.md b/content/en/docs/concepts/storage/volume-attributes-classes.md index 4967d004f64..69b4e412892 100644 --- a/content/en/docs/concepts/storage/volume-attributes-classes.md +++ b/content/en/docs/concepts/storage/volume-attributes-classes.md @@ -8,26 +8,40 @@ weight: 40 --- +{{< feature-state for_k8s_version="v1.29" state="alpha" >}} + This page assumes that you are familiar with [StorageClasses](/docs/concepts/storage/storage-classes/), [volumes](/docs/concepts/storage/volumes/) and [PersistentVolumes](/docs/concepts/storage/persistent-volumes/) in Kubernetes. -## Introduction - A VolumeAttributesClass provides a way for administrators to describe the mutable "classes" of storage they offer. Different classes might map to different quality-of-service levels. Kubernetes itself is unopinionated about what these classes represent. +This is an alpha feature and disabled by default. -## The VolumeAttributesClass Resource +If you want to test the feature whilst it's alpha, you need to enable the `VolumeAttributesClass` +[feature gate](/docs/reference/command-line-tools-reference/feature-gates/) for the kube-controller-manager and the kube-apiserver. You use the `--feature-gates` command line argument: -Each VolumeAttributesClass contains the `driverName` and `parameters`, which are used when a PersistentVolume belonging to the class needs to be dynamically provisioned or modified. 
+``` +--feature-gates="...,VolumeAttributesClass=true" +``` + +You can also only use VolumeAttributesClasses with storage backed by +{{< glossary_tooltip text="Container Storage Interface" term_id="csi" >}}, and only where the +relevant CSI driver implements the `ModifyVolume` API. + +## The VolumeAttributesClass API + +Each VolumeAttributesClass contains the `driverName` and `parameters`, which are +used when a PersistentVolume (PV) belonging to the class needs to be dynamically provisioned +or modified. The name of a VolumeAttributesClass object is significant and is how users can request a particular class. Administrators set the name and other parameters of a class when first creating VolumeAttributesClass objects. -While the name of a VolumeAttributesClass object in a PersistentVolumeClaim is mutable, the parameters in an existing class are immutable. +While the name of a VolumeAttributesClass object in a `PersistentVolumeClaim` is mutable, the parameters in an existing class are immutable. ```yaml @@ -44,7 +58,7 @@ parameters: ### Provisioner -Each VolumeAttributesClass has a provisioner that determines what volume plugin is used for provisioning PVs. The field ``driverName`` must be specified. +Each VolumeAttributesClass has a provisioner that determines what volume plugin is used for provisioning PVs. The field `driverName` must be specified. The feature support for VolumeAttributesClass is implemented in [kubernetes-csi/external-provisioner](https://github.com/kubernetes-csi/external-provisioner). @@ -56,12 +70,11 @@ the provisioner is shipped, how it needs to be run, what volume plugin it uses, ### Resizer -Each VolumeAttributesClass has a resizer that determines what volume plugin is used for modifying PVs. The field ``driverName`` must be specified. +Each VolumeAttributesClass has a resizer that determines what volume plugin is used for modifying PVs. The field `driverName` must be specified. 
The modifying volume feature support for VolumeAttributesClass is implemented in [kubernetes-csi/external-resizer](https://github.com/kubernetes-csi/external-resizer). -For example the existing PVC is using VolumeAttributesClass silver: - +For example, an existing PersistentVolumeClaim is using a VolumeAttributesClass named silver: ```yaml apiVersion: v1 @@ -70,11 +83,10 @@ metadata: name: test-pv-claim spec: … - volumeAttributesClassName: gold + volumeAttributesClassName: silver … ``` - A new VolumeAttributesClass gold is available in the cluster: @@ -107,7 +119,7 @@ spec: ## Parameters -Volume Classes have parameters that describe volumes belonging to them. Different parameters may be accepted +VolumeAttributesClasses have parameters that describe volumes belonging to them. Different parameters may be accepted depending on the provisioner or the resizer. For example, the value `4000`, for the parameter `iops`, and the parameter `throughput` are specific to GCE PD. When a parameter is omitted, the default is used at volume provisioning. 
diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates.md b/content/en/docs/reference/command-line-tools-reference/feature-gates.md index 151ac79404c..bb68fddea79 100644 --- a/content/en/docs/reference/command-line-tools-reference/feature-gates.md +++ b/content/en/docs/reference/command-line-tools-reference/feature-gates.md @@ -218,6 +218,7 @@ For a reference to old feature gates that are removed, please refer to | `ValidatingAdmissionPolicy` | `false` | Alpha | 1.26 | 1.27 | | `ValidatingAdmissionPolicy` | `false` | Beta | 1.28 | | | `VolumeCapacityPriority` | `false` | Alpha | 1.21 | | +| `VolumeAttributesClass` | `false` | Alpha | 1.29 | | | `WatchList` | `false` | Alpha | 1.27 | | | `WinDSR` | `false` | Alpha | 1.14 | | | `WinOverlay` | `false` | Alpha | 1.14 | 1.19 | @@ -816,6 +817,9 @@ Each feature gate is designed for enabling/disabling a specific feature: support for CEL validations be used in Admission Control. - `VolumeCapacityPriority`: Enable support for prioritizing nodes in different topologies based on available PV capacity. +- `VolumeAttributesClass`: Enable support for VolumeAttributesClasses. + See [Volume Attributes Classes](/docs/concepts/storage/volume-attributes-classes/) + for more information. - `WatchBookmark`: Enable support for watch bookmark events. - `WatchList` : Enable support for [streaming initial state of objects in watch requests](/docs/reference/using-api/api-concepts/#streaming-lists). - `WinDSR`: Allows kube-proxy to create DSR loadbalancers for Windows. 
From dff94b84bc1c9689bb42a40b319b21d08a0c6599 Mon Sep 17 00:00:00 2001 From: Antonio Ojea Date: Sat, 25 Nov 2023 17:25:30 +0000 Subject: [PATCH 72/82] KEP-1880 Multiple ServiceCIDR --- .../feature-gates.md | 3 +- .../docs/reference/networking/virtual-ips.md | 38 ++++ .../tasks/network/extend-service-ip-ranges.md | 184 ++++++++++++++++++ 3 files changed, 224 insertions(+), 1 deletion(-) create mode 100644 content/en/docs/tasks/network/extend-service-ip-ranges.md diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates.md b/content/en/docs/reference/command-line-tools-reference/feature-gates.md index 223aa166b0b..6f5076180d4 100644 --- a/content/en/docs/reference/command-line-tools-reference/feature-gates.md +++ b/content/en/docs/reference/command-line-tools-reference/feature-gates.md @@ -659,7 +659,8 @@ Each feature gate is designed for enabling/disabling a specific feature: [Pod topology spread constraints](/docs/concepts/scheduling-eviction/topology-spread-constraints/). - `MinimizeIPTablesRestore`: Enables new performance improvement logics in the kube-proxy iptables mode. -- `MultiCIDRServiceAllocator`: Track IP address allocations for Service cluster IPs using IPAddress objects. +- `MultiCIDRServiceAllocator`: Allow to dynamically configure the cluster Service IP ranges using + ServiceCIDR objects and track IP address allocations for Service cluster IPs using IPAddress objects. - `NewVolumeManagerReconstruction`: Enables improved discovery of mounted volumes during kubelet startup. Since this code has been significantly refactored, we allow to opt-out in case kubelet gets stuck at the startup or is not unmounting volumes from terminated Pods. 
Note that this diff --git a/content/en/docs/reference/networking/virtual-ips.md b/content/en/docs/reference/networking/virtual-ips.md index 623cb37525c..8c4a65364e5 100644 --- a/content/en/docs/reference/networking/virtual-ips.md +++ b/content/en/docs/reference/networking/virtual-ips.md @@ -414,6 +414,44 @@ NAME PARENTREF 2001:db8:1:2::a services/kube-system/kube-dns ``` +This feature also allow users to dynamically define the available IP ranges for Services using +ServiceCIDR objects. During bootstrap, a default ServiceCIDR object named `kubernetes` is created +from the value of the `--service-cluster-ip-range` command line argument to kube-apiserver: + +```shell +kubectl get servicecidrs +``` +``` +NAME CIDRS AGE +kubernetes 10.96.0.0/28 17m +``` + +Users can create or delete new ServiceCIDR objects to manage the available IP ranges for Services: + +```shell +cat <<'EOF' | kubectl apply -f - +apiVersion: networking.k8s.io/v1alpha1 +kind: ServiceCIDR +metadata: + name: newservicecidr +spec: + cidrs: + - 10.96.0.0/24 +EOF +``` +``` +servicecidr.networking.k8s.io/newcidr1 created +``` + +```shell +kubectl get servicecidrs +``` +``` +NAME CIDRS AGE +kubernetes 10.96.0.0/28 17m +newservicecidr 10.96.0.0/24 7m +``` + #### IP address ranges for Service virtual IP addresses {#service-ip-static-sub-range} {{< feature-state for_k8s_version="v1.26" state="stable" >}} diff --git a/content/en/docs/tasks/network/extend-service-ip-ranges.md b/content/en/docs/tasks/network/extend-service-ip-ranges.md new file mode 100644 index 00000000000..fdce843c68c --- /dev/null +++ b/content/en/docs/tasks/network/extend-service-ip-ranges.md @@ -0,0 +1,184 @@ +--- +reviewers: +- thockin +- dwinship +min-kubernetes-server-version: v1.29 +title: Extend Service IP Ranges +content_type: task +--- + + +{{< feature-state state="alpha" for_k8s_version="v1.29" >}} + +This document shares how to extend the existing Service IP range assigned to a cluster. 
+ + +## {{% heading "prerequisites" %}} + +{{< include "task-tutorial-prereqs.md" >}} + +{{< version-check >}} + + + +## API + +Kubernetes clusters with kube-apiservers that have enabled the `MultiCIDRServiceAllocator` +[feature gate](/docs/reference/command-line-tools-reference/feature-gates/) and the `networking.k8s.io/v1alpha1` API, +will create a new ServiceCIDR object that takes the well-known name `kubernetes`, and that uses an IP address range +based on the value of the `--service-cluster-ip-range` command line argument to kube-apiserver. + +```sh +kubectl get servicecidr +``` +``` +NAME CIDRS AGE +kubernetes 10.96.0.0/28 17d +``` + +The well-known `kubernetes` Service, which exposes the kube-apiserver endpoint to the Pods, calculates +the first IP address from the default ServiceCIDR range and uses that IP address as its +cluster IP address. + +```sh +kubectl get service kubernetes +``` +``` +NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE +kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 17d +``` + +The default Service, in this case, uses the ClusterIP 10.96.0.1, which has a corresponding IPAddress object. + +```sh +kubectl get ipaddress 10.96.0.1 +``` +``` +NAME PARENTREF +10.96.0.1 services/default/kubernetes +``` + +The ServiceCIDRs are protected with finalizers, to avoid leaving Service ClusterIPs orphaned; +the finalizer is only removed if there is another subnet that contains the existing IPAddresses or +there are no IPAddresses belonging to the subnet. + +## Extend the number of available IPs for Services + +In some cases, users need to increase the number of addresses available to Services. Previously, increasing the Service range was a disruptive operation that could also cause data loss. With this new feature, users only need to add a new ServiceCIDR to increase the number of available addresses. + +### Adding a new ServiceCIDR + +On a cluster with a 10.96.0.0/28 range for Services, there are only 2^(32-28) - 2 = 14 IP addresses available. 
The `kubernetes.default` Service is always created; for this example, that leaves you with only 13 possible Services. + +```sh +for i in $(seq 1 13); do kubectl create service clusterip "test-$i" --tcp 80 -o json | jq -r .spec.clusterIP; done +``` +``` +10.96.0.11 +10.96.0.5 +10.96.0.12 +10.96.0.13 +10.96.0.14 +10.96.0.2 +10.96.0.3 +10.96.0.4 +10.96.0.6 +10.96.0.7 +10.96.0.8 +10.96.0.9 +error: failed to create ClusterIP service: Internal error occurred: failed to allocate a serviceIP: range is full +``` + +You can increase the number of IP addresses available for Services, by creating a new ServiceCIDR +that extends or adds new IP address ranges. + +```sh +cat Date: Mon, 23 Oct 2023 11:29:24 +0200 Subject: [PATCH 73/82] Update information about CronJob's unsupported time zone field --- .../concepts/workloads/controllers/cron-jobs.md | 15 +++++++-------- 1 file changed, 7 insertions(+), 8 deletions(-) diff --git a/content/en/docs/concepts/workloads/controllers/cron-jobs.md b/content/en/docs/concepts/workloads/controllers/cron-jobs.md index 33f91471646..d2ce56bece0 100644 --- a/content/en/docs/concepts/workloads/controllers/cron-jobs.md +++ b/content/en/docs/concepts/workloads/controllers/cron-jobs.md @@ -181,15 +181,14 @@ A time zone database from the Go standard library is included in the binaries an ### Unsupported TimeZone specification -The implementation of the CronJob API in Kubernetes {{< skew currentVersion >}} lets you set -the `.spec.schedule` field to include a timezone; for example: `CRON_TZ=UTC * * * * *` -or `TZ=UTC * * * * *`. +Specifying a timezone using `CRON_TZ` or `TZ` variables inside `.spec.schedule` +is **not officially supported** (and never has been). -Specifying a timezone that way is **not officially supported** (and never has been). - -If you try to set a schedule that includes `TZ` or `CRON_TZ` timezone specification, -Kubernetes reports a [warning](/blog/2020/09/03/warnings/) to the client. 
-Future versions of Kubernetes will prevent setting the unofficial timezone mechanism entirely. +Starting with Kubernetes 1.29 if you try to set a schedule that includes `TZ` or `CRON_TZ` +timezone specification, Kubernetes will fail to create the resource with a validation +error. +Updates to CronJobs already using `TZ` or `CRON_TZ` will continue to report a +[warning](/blog/2020/09/03/warnings/) to the client. ### Modifying a CronJob From 1ea312d31e09bf89e5b82d8e11907f441e1edc93 Mon Sep 17 00:00:00 2001 From: Tim Bannister Date: Wed, 29 Nov 2023 00:36:51 +0000 Subject: [PATCH 74/82] Revise docs for API tracking of IP address assignment --- .../docs/reference/networking/virtual-ips.md | 39 ++++++++++++------- 1 file changed, 24 insertions(+), 15 deletions(-) diff --git a/content/en/docs/reference/networking/virtual-ips.md b/content/en/docs/reference/networking/virtual-ips.md index 8c4a65364e5..1e772f4b1db 100644 --- a/content/en/docs/reference/networking/virtual-ips.md +++ b/content/en/docs/reference/networking/virtual-ips.md @@ -364,7 +364,7 @@ ensure that no two Services can collide. Kubernetes does that by allocating each Service its own IP address from within the `service-cluster-ip-range` CIDR range that is configured for the {{< glossary_tooltip term_id="kube-apiserver" text="API Server" >}}. -#### IP address allocation tracking +### IP address allocation tracking To ensure each Service receives a unique IP, an internal allocator atomically updates a global allocation map in {{< glossary_tooltip term_id="etcd" >}} @@ -378,25 +378,34 @@ in-memory locking). Kubernetes also uses controllers to check for invalid assignments (e.g. due to administrator intervention) and for cleaning up allocated IP addresses that are no longer used by any Services. 
+#### IP address allocation tracking using the Kubernetes API {#ip-address-objects} + {{< feature-state for_k8s_version="v1.27" state="alpha" >}} + If you enable the `MultiCIDRServiceAllocator` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) and the [`networking.k8s.io/v1alpha1` API group](/docs/tasks/administer-cluster/enable-disable-api/), -the control plane replaces the existing etcd allocator with a new one, using IPAddress -objects instead of an internal global allocation map. The ClusterIP address -associated to each Service will have a referenced IPAddress object. +the control plane replaces the existing etcd allocator with a revised implementation +that uses IPAddress and ServiceCIDR objects instead of an internal global allocation map. +Each cluster IP address associated to a Service then references an IPAddress object. -The background controller is also replaced by a new one to handle the new IPAddress -objects and the migration from the old allocator model. +Enabling the feature gate also replaces a background controller with an alternative +that handles the IPAddress objects and supports migration from the old allocator model. +Kubernetes {{< skew currentVersion >}} does not support migrating from IPAddress +objects to the internal allocation map. -One of the main benefits of the new allocator is that it removes the size limitations -for the `service-cluster-ip-range`, there is no limitations for IPv4 and for IPv6 -users can use masks equal or larger than /64 (previously it was /108). +One of the main benefits of the revised allocator is that it removes the size limitations +for the IP address range that can be used for the cluster IP address of Services. +With `MultiCIDRServiceAllocator` enabled, there are no limitations for IPv4, and for IPv6 +you can use IP address netmasks that are a /64 or smaller (as opposed to /108 with the +legacy implementation). 
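To give a sense of scale for the netmask sizes mentioned above, here is a minimal Python sketch using the standard `ipaddress` module. The helper name `service_range_size` and the example prefixes are illustrative assumptions, not values taken from a real cluster.

```python
import ipaddress


def service_range_size(cidr):
    """Return the number of addresses contained in a CIDR block."""
    return ipaddress.ip_network(cidr).num_addresses


# Illustrative prefixes only (2001:db8::/32 is reserved for documentation).
legacy_ipv6 = service_range_size("2001:db8:1:2::/108")   # 2**20 addresses
revised_ipv6 = service_range_size("2001:db8:1:2::/64")   # 2**64 addresses
```

Under these assumptions, a /64 range holds 2^44 times as many addresses as a /108 range.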
-Users now will be able to inspect the IP addresses assigned to their Services, and -Kubernetes extensions such as the [Gateway](https://gateway-api.sigs.k8s.io/) API, can use this new -IPAddress object kind to enhance the Kubernetes networking capabilities, going beyond the limitations of -the built-in Service API. +Making IP address allocations available via the API means that you as a cluster administrator +can allow users to inspect the IP addresses assigned to their Services. +Kubernetes extensions, such as the [Gateway API](/docs/concepts/services-networking/gateway/), +can use the IPAddress API to extend Kubernetes' inherent networking capabilities. + +Here is a brief example of a user querying for IP addresses: ```shell kubectl get services @@ -414,7 +423,7 @@ NAME PARENTREF 2001:db8:1:2::a services/kube-system/kube-dns ``` -This feature also allow users to dynamically define the available IP ranges for Services using +Kubernetes also allows users to dynamically define the available IP ranges for Services using ServiceCIDR objects. During bootstrap, a default ServiceCIDR object named `kubernetes` is created from the value of the `--service-cluster-ip-range` command line argument to kube-apiserver: @@ -452,7 +461,7 @@ kubernetes 10.96.0.0/28 17m newservicecidr 10.96.0.0/24 7m ``` -#### IP address ranges for Service virtual IP addresses {#service-ip-static-sub-range} +### IP address ranges for Service virtual IP addresses {#service-ip-static-sub-range} {{< feature-state for_k8s_version="v1.26" state="stable" >}} From 387192d95f0e0e1e0693c7b62d476321a7d03b7c Mon Sep 17 00:00:00 2001 From: Tim Bannister Date: Wed, 29 Nov 2023 00:52:17 +0000 Subject: [PATCH 75/82] Fix style nits MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Update “Virtual IPs and Service Proxies” to align better with our style guide and SIG conventions. 
--- content/en/docs/reference/networking/virtual-ips.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/en/docs/reference/networking/virtual-ips.md b/content/en/docs/reference/networking/virtual-ips.md index 1e772f4b1db..dd0efc6583f 100644 --- a/content/en/docs/reference/networking/virtual-ips.md +++ b/content/en/docs/reference/networking/virtual-ips.md @@ -366,7 +366,7 @@ CIDR range that is configured for the {{< glossary_tooltip term_id="kube-apiserv ### IP address allocation tracking -To ensure each Service receives a unique IP, an internal allocator atomically +To ensure each Service receives a unique IP address, an internal allocator atomically updates a global allocation map in {{< glossary_tooltip term_id="etcd" >}} prior to creating each Service. The map object must exist in the registry for Services to get IP address assignments, otherwise creations will @@ -375,7 +375,7 @@ fail with a message indicating an IP address could not be allocated. In the control plane, a background controller is responsible for creating that map (needed to support migrating from older versions of Kubernetes that used in-memory locking). Kubernetes also uses controllers to check for invalid -assignments (e.g. due to administrator intervention) and for cleaning up allocated +assignments (for example: due to administrator intervention) and for cleaning up allocated IP addresses that are no longer used by any Services. 
#### IP address allocation tracking using the Kubernetes API {#ip-address-objects} From 60a0a6606ce35cf3d08e14fe8b5680e796694fdf Mon Sep 17 00:00:00 2001 From: Kat Cosgrove Date: Wed, 29 Nov 2023 14:08:14 +0000 Subject: [PATCH 76/82] Add 1.29 to release schedule for 1.29 release --- data/releases/schedule.yaml | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/data/releases/schedule.yaml b/data/releases/schedule.yaml index 59c592e750a..2277dc12fcc 100644 --- a/data/releases/schedule.yaml +++ b/data/releases/schedule.yaml @@ -2,6 +2,14 @@ # This file helps to populate the /releases page, and is also parsed to find out the # latest patch version for a minor release. schedules: +- release: 1.29 + releaseDate: 2023-12-05 + next: + release: 1.29.1 + cherryPickDeadline: 2024-01-05 + targetDate: 2024-01-10 + maintenanceModeStartDate: 2025-12-28 + endOfLifeDate: 2026-02-28 - release: 1.28 releaseDate: 2023-08-15 next: From 8dc08062a7877b4c8f59c1c82a91c4ca8490d4a9 Mon Sep 17 00:00:00 2001 From: Sascha Grunert Date: Thu, 30 Nov 2023 10:31:41 +0100 Subject: [PATCH 77/82] Link PSS to User Namespaces Signed-off-by: Sascha Grunert Co-authored-by: Tim Bannister --- content/en/docs/concepts/security/pod-security-standards.md | 6 ++++++ .../reference/command-line-tools-reference/feature-gates.md | 4 ++-- 2 files changed, 8 insertions(+), 2 deletions(-) diff --git a/content/en/docs/concepts/security/pod-security-standards.md b/content/en/docs/concepts/security/pod-security-standards.md index 35c4952b60e..2fdd2638367 100644 --- a/content/en/docs/concepts/security/pod-security-standards.md +++ b/content/en/docs/concepts/security/pod-security-standards.md @@ -485,6 +485,12 @@ Restrictions on the following controls are only required if `.spec.os.name` is n - Seccomp - Linux Capabilities +## User namespaces + +User Namespaces are a Linux-only feature to run workloads with increased +isolation. 
How they work together with Pod Security Standards is described in +the [documentation](/docs/concepts/workloads/pods/user-namespaces#integration-with-pod-security-admission-checks) for Pods that use user namespaces. + ## FAQ ### Why isn't there a profile between privileged and baseline? diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates.md b/content/en/docs/reference/command-line-tools-reference/feature-gates.md index d95bf95b798..b316b6085f8 100644 --- a/content/en/docs/reference/command-line-tools-reference/feature-gates.md +++ b/content/en/docs/reference/command-line-tools-reference/feature-gates.md @@ -811,7 +811,7 @@ Each feature gate is designed for enabling/disabling a specific feature: - `UserNamespacesPodSecurityStandards`: Enable Pod Security Standards policies relaxation for pods that run with namespaces. You must set the value of this feature gate consistently across all nodes in your cluster, and you must also enable `UserNamespacesSupport` to use this feature. - See [User Namespaces](/docs/concepts/workloads/pods/user-namespaces/#integration-with-pod-security-standards) for more details. + See [User Namespaces](/docs/concepts/workloads/pods/user-namespaces/#integration-with-pod-security-admission-checks) for more details. - `UserNamespacesSupport`: Enable user namespace support for Pods. Before Kubernetes v1.28, this feature gate was named `UserNamespacesStatelessPodsSupport`. - `ValidatingAdmissionPolicy`: Enable [ValidatingAdmissionPolicy](/docs/reference/access-authn-authz/validating-admission-policy/) @@ -835,4 +835,4 @@ Each feature gate is designed for enabling/disabling a specific feature: feature, you will also need to enable any associated API resources. For example, to enable a particular resource like `storage.k8s.io/v1beta1/csistoragecapacities`, set `--runtime-config=storage.k8s.io/v1beta1/csistoragecapacities`. 
- See [API Versioning](/docs/reference/using-api/#api-versioning) for more details on the command line flags. \ No newline at end of file + See [API Versioning](/docs/reference/using-api/#api-versioning) for more details on the command line flags. From cf47dab07cdd2f5e0a5cca56bce3c2444f8a612c Mon Sep 17 00:00:00 2001 From: Dan Winship Date: Sun, 26 Nov 2023 21:11:58 -0500 Subject: [PATCH 78/82] Fix redundancy in kube-proxy iptables and ipvs docs Move the "watches Services and EndpointSlices" and "control loop" text to the top level, since that applies to all proxy modes. Likewise, the allegedly iptables-specific graphic is actually sufficiently abstract to apply to any possible proxy. Also fix an out-of-date claim about ipvs mode falling back to iptables mode. --- .../docs/reference/networking/virtual-ips.md | 39 ++++++++++--------- 1 file changed, 20 insertions(+), 19 deletions(-) diff --git a/content/en/docs/reference/networking/virtual-ips.md b/content/en/docs/reference/networking/virtual-ips.md index 5f775b01e22..0004ee7af07 100644 --- a/content/en/docs/reference/networking/virtual-ips.md +++ b/content/en/docs/reference/networking/virtual-ips.md @@ -14,6 +14,18 @@ The `kube-proxy` component is responsible for implementing a _virtual IP_ mechanism for {{< glossary_tooltip term_id="service" text="Services">}} of `type` other than [`ExternalName`](/docs/concepts/services-networking/service/#externalname). +Each instance of kube-proxy watches the Kubernetes {{< glossary_tooltip +term_id="control-plane" text="control plane" >}} for the addition and +removal of Service and EndpointSlice {{< glossary_tooltip +term_id="object" text="objects" >}}. For each Service, kube-proxy +calls appropriate APIs (depending on the kube-proxy mode) to configure +the node to capture traffic to the Service's `clusterIP` and `port`, +and redirect that traffic to one of the Service's endpoints +(usually a Pod, but possibly an arbitrary user-provided IP address). 
A control +loop ensures that the rules on each node are reliably synchronized with +the Service and EndpointSlice state as indicated by the API server. + +{{< figure src="/images/docs/services-iptables-overview.svg" title="Virtual IP mechanism for Services, using iptables mode" class="diagram-medium" >}} A question that pops up every now and then is why Kubernetes relies on proxying to forward inbound traffic to backends. What about other @@ -57,7 +69,7 @@ The kube-proxy starts up in different modes, which are determined by its configu On Linux nodes, the available modes for kube-proxy are: [`iptables`](#proxy-mode-iptables) -: A mode where the kube-proxy configures packet forwarding rules using iptables, on Linux. +: A mode where the kube-proxy configures packet forwarding rules using iptables. [`ipvs`](#proxy-mode-ipvs) : a mode where the kube-proxy configures packet forwarding rules using ipvs. @@ -74,18 +86,10 @@ There is only one mode available for kube-proxy on Windows: _This proxy mode is only available on Linux nodes._ -In this mode, kube-proxy watches the Kubernetes -{{< glossary_tooltip term_id="control-plane" text="control plane" >}} for the addition and -removal of Service and EndpointSlice {{< glossary_tooltip term_id="object" text="objects." >}} -For each Service, it installs -iptables rules, which capture traffic to the Service's `clusterIP` and `port`, -and redirect that traffic to one of the Service's -backend sets. For each endpoint, it installs iptables rules which -select a backend Pod. - -By default, kube-proxy in iptables mode chooses a backend at random. - -{{< figure src="/images/docs/services-iptables-overview.svg" title="Virtual IP mechanism for Services, using iptables mode" class="diagram-medium" >}} +In this mode, kube-proxy configures packet forwarding rules using the +iptables API of the kernel netfilter subsystem. For each endpoint, it +installs iptables rules which, by default, select a backend Pod at +random. 
#### Example {#packet-processing-iptables} @@ -193,11 +197,8 @@ and is likely to hurt functionality more than it improves performance. _This proxy mode is only available on Linux nodes._ -In `ipvs` mode, kube-proxy watches Kubernetes Services and EndpointSlices, -calls `netlink` interface to create IPVS rules accordingly and synchronizes -IPVS rules with Kubernetes Services and EndpointSlices periodically. -This control loop ensures that IPVS status matches the desired state. -When accessing a Service, IPVS directs traffic to one of the backend Pods. +In `ipvs` mode, kube-proxy uses the kernel IPVS and iptables APIs to +create rules to redirect traffic from Service IPs to endpoint IPs. The IPVS proxy mode is based on netfilter hook function that is similar to iptables mode, but uses a hash table as the underlying data structure and works @@ -252,7 +253,7 @@ the node before starting kube-proxy. When kube-proxy starts in IPVS proxy mode, it verifies whether IPVS kernel modules are available. If the IPVS kernel modules are not detected, then kube-proxy -falls back to running in iptables proxy mode. +exits with an error. 
{{< /note >}} {{< figure src="/images/docs/services-ipvs-overview.svg" title="Virtual IP address mechanism for Services, using IPVS mode" class="diagram-medium" >}} From 9795352deb5f1b7e5510d50653975abb6a91a4d6 Mon Sep 17 00:00:00 2001 From: Troy Connor Date: Fri, 1 Dec 2023 10:35:22 -0500 Subject: [PATCH 79/82] UnauthenticatedHTTP2DOSMitigation default in 1.29 is set to true Signed-off-by: Troy Connor --- .../reference/command-line-tools-reference/feature-gates.md | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/content/en/docs/reference/command-line-tools-reference/feature-gates.md b/content/en/docs/reference/command-line-tools-reference/feature-gates.md index b316b6085f8..6b891d97090 100644 --- a/content/en/docs/reference/command-line-tools-reference/feature-gates.md +++ b/content/en/docs/reference/command-line-tools-reference/feature-gates.md @@ -212,6 +212,8 @@ For a reference to old feature gates that are removed, please refer to | `TopologyManagerPolicyOptions` | `false` | Alpha | 1.26 | 1.27 | | `TopologyManagerPolicyOptions` | `true` | Beta | 1.28 | | | `TranslateStreamCloseWebsocketRequests` | `false` | Alpha | 1.29 | | +| `UnauthenticatedHTTP2DOSMitigation` | `false` | Beta | 1.28 | | +| `UnauthenticatedHTTP2DOSMitigation` | `true` | Beta | 1.29 | | | `UnknownVersionInteroperabilityProxy` | `false` | Alpha | 1.28 | | | `UserNamespacesPodSecurityStandards` | `false` | Alpha | 1.29 | | | `UserNamespacesSupport` | `false` | Alpha | 1.28 | | @@ -805,6 +807,9 @@ Each feature gate is designed for enabling/disabling a specific feature: - `TranslateStreamCloseWebsocketRequests`: Allow WebSocket streaming of the remote command sub-protocol (`exec`, `cp`, `attach`) from clients requesting version 5 (v5) of the sub-protocol. +- `UnauthenticatedHTTP2DOSMitigation`: Enables HTTP/2 Denial of Service (DoS) + mitigations for unauthenticated clients. + Kubernetes v1.28.0 through v1.28.2 do not include this feature gate. 
- `UnknownVersionInteroperabilityProxy`: Proxy resource requests to the correct peer kube-apiserver when multiple kube-apiservers exist at varied versions. See [Mixed version proxy](/docs/concepts/architecture/mixed-version-proxy/) for more information. From d5c530002f2d9e2439abb60cc612fb672fa19776 Mon Sep 17 00:00:00 2001 From: Dan Winship Date: Sun, 26 Nov 2023 21:21:31 -0500 Subject: [PATCH 80/82] Clarify iptables performance slightly --- content/en/docs/reference/networking/virtual-ips.md | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/content/en/docs/reference/networking/virtual-ips.md b/content/en/docs/reference/networking/virtual-ips.md index 0004ee7af07..3d5761aa7cb 100644 --- a/content/en/docs/reference/networking/virtual-ips.md +++ b/content/en/docs/reference/networking/virtual-ips.md @@ -115,8 +115,10 @@ through a load-balancer, though in those cases the client IP address does get al #### Optimizing iptables mode performance -In large clusters (with tens of thousands of Pods and Services), the -iptables mode of kube-proxy may take a long time to update the rules +In iptables mode, kube-proxy creates a few iptables rules for every +Service, and a few iptables rules for each endpoint IP address. In +clusters with tens of thousands of Pods and Services, this means tens +of thousands of iptables rules, and kube-proxy may take a long time to update the rules in the kernel when Services (or their EndpointSlices) change. You can adjust the syncing behavior of kube-proxy via options in the [`iptables` section](/docs/reference/config-api/kube-proxy-config.v1alpha1/#kubeproxy-config-k8s-io-v1alpha1-KubeProxyIPTablesConfiguration) of the @@ -205,7 +207,7 @@ iptables mode, but uses a hash table as the underlying data structure and works in the kernel space. That means kube-proxy in IPVS mode redirects traffic with lower latency than kube-proxy in iptables mode, with much better performance when synchronizing -proxy rules. 
Compared to the other proxy modes, IPVS mode also supports a +proxy rules. Compared to the iptables proxy mode, IPVS mode also supports a higher throughput of network traffic. IPVS provides more options for balancing traffic to backend Pods; From dd5be8b7ad27b2e8c48f33439a208fabf515b068 Mon Sep 17 00:00:00 2001 From: Kat Cosgrove Date: Fri, 1 Dec 2023 20:33:55 +0000 Subject: [PATCH 81/82] updating dates to reflect delayed release --- data/releases/schedule.yaml | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/data/releases/schedule.yaml b/data/releases/schedule.yaml index 2277dc12fcc..d5c7d4147c8 100644 --- a/data/releases/schedule.yaml +++ b/data/releases/schedule.yaml @@ -3,11 +3,11 @@ # latest patch version for a minor release. schedules: - release: 1.29 - releaseDate: 2023-12-05 + releaseDate: 2023-12-13 next: release: 1.29.1 - cherryPickDeadline: 2024-01-05 - targetDate: 2024-01-10 + cherryPickDeadline: 2024-01-12 + targetDate: 2024-01-17 maintenanceModeStartDate: 2025-12-28 endOfLifeDate: 2026-02-28 - release: 1.28 From 38d537b2d24aa5cf921633085d45d5f00013d6c0 Mon Sep 17 00:00:00 2001 From: Kat Cosgrove Date: Mon, 11 Dec 2023 15:05:29 +0000 Subject: [PATCH 82/82] Update data/releases/schedule.yaml MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-authored-by: Marko Mudrinić --- data/releases/schedule.yaml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/data/releases/schedule.yaml b/data/releases/schedule.yaml index d5c7d4147c8..e0c0171a725 100644 --- a/data/releases/schedule.yaml +++ b/data/releases/schedule.yaml @@ -8,8 +8,8 @@ schedules: release: 1.29.1 cherryPickDeadline: 2024-01-12 targetDate: 2024-01-17 - maintenanceModeStartDate: 2025-12-28 - endOfLifeDate: 2026-02-28 + maintenanceModeStartDate: 2024-12-28 + endOfLifeDate: 2025-02-28 - release: 1.28 releaseDate: 2023-08-15 next: