Merge pull request #41108 from Zhuzhenghao/clean

Cleanup page garbage-collection and nodes
Kubernetes Prow Robot 2023-05-13 21:33:26 -07:00 committed by GitHub
commit c6ff7b40db
2 changed files with 68 additions and 69 deletions


@@ -8,17 +8,17 @@ weight: 70
{{<glossary_definition term_id="garbage-collection" length="short">}} This
allows the clean up of resources like the following:

* [Terminated pods](/docs/concepts/workloads/pods/pod-lifecycle/#pod-garbage-collection)
* [Completed Jobs](/docs/concepts/workloads/controllers/ttlafterfinished/)
* [Objects without owner references](#owners-dependents)
* [Unused containers and container images](#containers-images)
* [Dynamically provisioned PersistentVolumes with a StorageClass reclaim policy of Delete](/docs/concepts/storage/persistent-volumes/#delete)
* [Stale or expired CertificateSigningRequests (CSRs)](/docs/reference/access-authn-authz/certificate-signing-requests/#request-signing-process)
* {{<glossary_tooltip text="Nodes" term_id="node">}} deleted in the following scenarios:
  * On a cloud when the cluster uses a [cloud controller manager](/docs/concepts/architecture/cloud-controller/)
  * On-premises when the cluster uses an addon similar to a cloud controller
    manager
* [Node Lease objects](/docs/concepts/architecture/nodes/#heartbeats)

## Owners and dependents {#owners-dependents}
@@ -63,8 +63,8 @@ delete an object, you can control whether Kubernetes deletes the object's
dependents automatically, in a process called *cascading deletion*. There are
two types of cascading deletion, as follows:

* Foreground cascading deletion
* Background cascading deletion

You can also control how and when garbage collection deletes resources that have
owner references using Kubernetes {{<glossary_tooltip text="finalizers" term_id="finalizer">}}.
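
For example, you can choose the cascading deletion type per request with `kubectl`. This is a minimal sketch that assumes a Deployment named `nginx-deployment` exists:

```shell
# Foreground cascading deletion: dependents are deleted first,
# then the owner object itself is removed.
kubectl delete deployment nginx-deployment --cascade=foreground

# Background cascading deletion (the default): the owner is deleted
# immediately and the garbage collector cleans up dependents afterwards.
kubectl delete deployment nginx-deployment --cascade=background
```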
@@ -75,12 +75,12 @@ In foreground cascading deletion, the owner object you're deleting first enters
a *deletion in progress* state. In this state, the following happens to the
owner object:

* The Kubernetes API server sets the object's `metadata.deletionTimestamp`
  field to the time the object was marked for deletion.
* The Kubernetes API server also sets the `metadata.finalizers` field to
  `foregroundDeletion`.
* The object remains visible through the Kubernetes API until the deletion
  process is complete.

After the owner object enters the deletion in progress state, the controller
deletes the dependents. After deleting all the dependent objects, the controller
@@ -129,8 +129,8 @@ which is part of the kubelet, with the cooperation of
considers the following disk usage limits when making garbage collection
decisions:

* `HighThresholdPercent`
* `LowThresholdPercent`

Disk usage above the configured `HighThresholdPercent` value triggers garbage
collection, which deletes images in order based on the last time they were used,
@@ -142,12 +142,12 @@ until disk usage reaches the `LowThresholdPercent` value.
The kubelet garbage collects unused containers based on the following variables,
which you can define:

* `MinAge`: the minimum age at which the kubelet can garbage collect a
  container. Disable by setting to `0`.
* `MaxPerPodContainer`: the maximum number of dead containers each Pod
  can have. Disable by setting to less than `0`.
* `MaxContainers`: the maximum number of dead containers the cluster can have.
  Disable by setting to less than `0`.

In addition to these variables, the kubelet garbage collects unidentified and
deleted containers, typically starting with the oldest first.
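
As a rough, hedged sketch of how these limits and variables can be set, the following legacy kubelet flags map onto them (the values are illustrative only; current releases favor the kubelet configuration file, and the container garbage collection flags are deprecated):

```shell
# Illustrative values only, not recommendations.
#   --image-gc-high-threshold               ~ HighThresholdPercent
#   --image-gc-low-threshold                ~ LowThresholdPercent
#   --minimum-container-ttl-duration        ~ MinAge
#   --maximum-dead-containers-per-container ~ MaxPerPodContainer
#   --maximum-dead-containers               ~ MaxContainers
kubelet \
  --image-gc-high-threshold=85 \
  --image-gc-low-threshold=80 \
  --minimum-container-ttl-duration=1m \
  --maximum-dead-containers-per-container=2 \
  --maximum-dead-containers=100
```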
@@ -171,8 +171,8 @@ You can tune garbage collection of resources by configuring options specific to
the controllers managing those resources. The following pages show you how to
configure garbage collection:

* [Configuring cascading deletion of Kubernetes objects](/docs/tasks/administer-cluster/use-cascading-deletion/)
* [Configuring cleanup of finished Jobs](/docs/concepts/workloads/controllers/ttlafterfinished/)

<!-- * [Configuring unused container and image garbage collection](/docs/tasks/administer-cluster/reconfigure-kubelet/) -->


@@ -81,7 +81,7 @@ first and re-added after the update.
### Self-registration of Nodes

When the kubelet flag `--register-node` is true (the default), the kubelet will attempt to
register itself with the API server. This is the preferred pattern, used by most distros.

For self-registration, the kubelet is started with the following options:
@@ -122,7 +122,7 @@ Pods already scheduled on the Node may misbehave or cause issues if the Node
configuration changes on kubelet restart. For example, an already running
Pod may be tainted against the new labels assigned to the Node, while other
Pods that are incompatible with that Pod will be scheduled based on this new
label. Node re-registration ensures all Pods will be drained and properly
re-scheduled.
{{< /note >}}
@@ -225,9 +225,9 @@ of the Node resource. For example, the following JSON structure describes a heal
When problems occur on nodes, the Kubernetes control plane automatically creates
[taints](/docs/concepts/scheduling-eviction/taint-and-toleration/) that match the conditions
affecting the node. An example of this is when the `status` of the Ready condition
remains `Unknown` or `False` for longer than the kube-controller-manager's `NodeMonitorGracePeriod`,
which defaults to 40 seconds. This will cause either a `node.kubernetes.io/unreachable` taint, for an `Unknown` status,
or a `node.kubernetes.io/not-ready` taint, for a `False` status, to be added to the Node.
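
To see whether the control plane has applied one of these condition taints, you can inspect the Node object directly; a minimal sketch, assuming a node named `node-1`:

```shell
# Print the taints currently set on the node; an unreachable node would
# show node.kubernetes.io/unreachable here.
kubectl get node node-1 -o jsonpath='{.spec.taints}'
```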
These taints affect pending pods as the scheduler takes the Node's taints into consideration when
@@ -321,7 +321,7 @@ This period can be configured using the `--node-monitor-period` flag on the
### Rate limits on eviction

In most cases, the node controller limits the eviction rate to
`--node-eviction-rate` (default 0.1) per second, meaning it won't evict pods
from more than 1 node per 10 seconds.
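
As a hedged illustration of how this rate works out in practice (the value below is an example, not a recommendation):

```shell
# The rate is expressed in nodes per second, so 0.05 means pods are evicted
# from at most one node every 20 seconds (1 / 0.05 = 20).
# Other required kube-controller-manager flags are omitted here.
kube-controller-manager --node-eviction-rate=0.05
```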
@@ -345,7 +345,7 @@ then the eviction mechanism does not take per-zone unavailability into account.
A key reason for spreading your nodes across availability zones is so that the
workload can be shifted to healthy zones when one entire zone goes down.
Therefore, if all nodes in a zone are unhealthy, then the node controller evicts at
the normal rate of `--node-eviction-rate`. The corner case is when all zones are
completely unhealthy (none of the nodes in the cluster are healthy). In such a
case, the node controller assumes that there is some problem with connectivity
between the control plane and the nodes, and doesn't perform any evictions.
@@ -568,7 +568,7 @@ shutdown node comes up, the pods will be deleted by kubelet and new pods will be
created on a different running node. If the original shutdown node does not come up,
these pods will be stuck in terminating status on the shutdown node forever.

To mitigate the above situation, a user can manually add the taint `node.kubernetes.io/out-of-service` with either `NoExecute`
or `NoSchedule` effect to a Node marking it out-of-service.
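
For example, the taint can be applied with `kubectl` (the node name and taint value below are placeholders):

```shell
# Mark the shut-down node as out of service so that its volumes can be
# detached and its terminating Pods can be cleaned up. Remove the taint
# once the node recovers or is removed from the cluster.
kubectl taint nodes node-1 node.kubernetes.io/out-of-service=nodeshutdown:NoExecute
```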
If the `NodeOutOfServiceVolumeDetach` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
is enabled on {{< glossary_tooltip text="kube-controller-manager" term_id="kube-controller-manager" >}}, and a Node is marked out-of-service with this taint, the
@@ -641,10 +641,9 @@ see [KEP-2400](https://github.com/kubernetes/enhancements/issues/2400) and its
## {{% heading "whatsnext" %}}

Learn more about the following:

* [Components](/docs/concepts/overview/components/#node-components) that make up a node.
* [API definition for Node](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#node-v1-core).
* [Node](https://git.k8s.io/design-proposals-archive/architecture/architecture.md#the-kubernetes-node) section of the architecture design document.
* [Taints and Tolerations](/docs/concepts/scheduling-eviction/taint-and-toleration/).
* [Node Resource Managers](/docs/concepts/policy/node-resource-managers/).
* [Resource Management for Windows nodes](/docs/concepts/configuration/windows-resource-management/).