Revise cluster management task

After removing the sections of the page that are not in line with the
content guide, there is little left.

Edit pages that link to the removed task so that they no longer link there.
Redirect using a 302 status so that there is a future opportunity to reinstate
the page, or something like it.

- Avoid links to removed cluster management task
- Broaden applicability of “Safely Drain A Node”
- Add (stub) cluster upgrade task page
- Add a basic page about upgrading your cluster.
- Add a task page about enabling or disabling HTTP APIs
pull/24415/head
Tim Bannister 2020-10-07 19:16:47 +01:00
parent bc687f287c
commit 59dcd57cc9
11 changed files with 140 additions and 241 deletions

View File

@ -12,7 +12,7 @@ Kubernetes is well-known for running scalable workloads. It scales your workload
## Guaranteed scheduling with controlled cost
-[Kubernetes Cluster Autoscaler](https://kubernetes.io/docs/tasks/administer-cluster/cluster-management/#cluster-autoscaling) is an excellent tool in the ecosystem which adds more nodes to your cluster when your applications need them. However, cluster autoscaler has some limitations and may not work for all users:
+[Kubernetes Cluster Autoscaler](https://github.com/kubernetes/autoscaler/) is an excellent tool in the ecosystem which adds more nodes to your cluster when your applications need them. However, cluster autoscaler has some limitations and may not work for all users:
- It does not work in physical clusters.
- Adding more nodes to the cluster costs more.

View File

@ -92,9 +92,8 @@ Controllers that interact with external state find their desired state from
the API server, then communicate directly with an external system to bring
the current state closer in line.
-(There actually is a controller that horizontally scales the
-nodes in your cluster. See
-[Cluster autoscaling](/docs/tasks/administer-cluster/cluster-management/#cluster-autoscaling)).
+(There actually is a [controller](https://github.com/kubernetes/autoscaler/)
+that horizontally scales the nodes in your cluster.)
The important point here is that the controller makes some change to bring about
your desired state, and then reports current state back to your cluster's API server.

View File

@ -338,5 +338,4 @@ for more information.
* Read the [Node](https://git.k8s.io/community/contributors/design-proposals/architecture/architecture.md#the-kubernetes-node)
  section of the architecture design document.
* Read about [taints and tolerations](/docs/concepts/scheduling-eviction/taint-and-toleration/).
-* Read about [cluster autoscaling](/docs/tasks/administer-cluster/cluster-management/#cluster-autoscaling).

View File

@ -39,8 +39,6 @@ Before choosing a guide, here are some considerations:
## Managing a cluster
-* [Managing a cluster](/docs/tasks/administer-cluster/cluster-management/) describes several topics related to the lifecycle of a cluster: creating a new cluster, upgrading your cluster's master and worker nodes, performing node maintenance (e.g. kernel upgrades), and upgrading the Kubernetes API version of a running cluster.
* Learn how to [manage nodes](/docs/concepts/architecture/nodes/).
* Learn how to set up and manage the [resource quota](/docs/concepts/policy/resource-quotas/) for shared clusters.

View File

@ -321,9 +321,7 @@ Pod may be created that fits on the same Node. In this case, the scheduler will
schedule the higher priority Pod instead of the preemptor.
This is expected behavior: the Pod with the higher priority should take the place
-of a Pod with a lower priority. Other controller actions, such as
-[cluster autoscaling](/docs/tasks/administer-cluster/cluster-management/#cluster-autoscaling),
-may eventually provide capacity to schedule the pending Pods.
+of a Pod with a lower priority.
### Higher priority Pods are preempted before lower priority pods

View File

@ -1,223 +0,0 @@
---
reviewers:
- lavalamp
- thockin
title: Cluster Management
content_type: concept
---
<!-- overview -->
This document describes several topics related to the lifecycle of a cluster: creating a new cluster,
upgrading your cluster's
master and worker nodes, performing node maintenance (e.g. kernel upgrades), and upgrading the Kubernetes API version of a
running cluster.
<!-- body -->
## Creating and configuring a Cluster
To install Kubernetes on a set of machines, consult one of the existing [Getting Started guides](/docs/setup/) depending on your environment.
## Upgrading a cluster
The current state of cluster upgrades is provider dependent, and some releases may require special care when upgrading. It is recommended that administrators consult both the [release notes](https://git.k8s.io/kubernetes/CHANGELOG/README.md), as well as the version specific upgrade notes prior to upgrading their clusters.
### Upgrading an Azure Kubernetes Service (AKS) cluster
Azure Kubernetes Service enables easy self-service upgrades of the control plane and nodes in your cluster. The process is
currently user-initiated and is described in the [Azure AKS documentation](https://docs.microsoft.com/en-us/azure/aks/upgrade-cluster).
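For illustration only (the resource group, cluster name, and version below are placeholders), the upgrade can be driven from the Azure CLI:
```shell
# List the Kubernetes versions this cluster can upgrade to.
az aks get-upgrades --resource-group myResourceGroup --name myAKSCluster --output table

# Upgrade the control plane and nodes to the chosen version.
az aks upgrade --resource-group myResourceGroup --name myAKSCluster --kubernetes-version 1.19.3
```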
### Upgrading Google Compute Engine clusters
Google Compute Engine Open Source (GCE-OSS) supports master upgrades by deleting and
recreating the master, while maintaining the same Persistent Disk (PD) to ensure that data is retained across the
upgrade.
Node upgrades for GCE use a [Managed Instance Group](https://cloud.google.com/compute/docs/instance-groups/); each node
is sequentially destroyed and then recreated with new software. Any Pods that are running on that node need to be
controlled by a Replication Controller, or manually re-created after the roll out.
Upgrades on open source Google Compute Engine (GCE) clusters are controlled by the `cluster/gce/upgrade.sh` script.
Get its usage by running `cluster/gce/upgrade.sh -h`.
For example, to upgrade just your master to a specific version (v1.0.2):
```shell
cluster/gce/upgrade.sh -M v1.0.2
```
Alternatively, to upgrade your entire cluster to the latest stable release:
```shell
cluster/gce/upgrade.sh release/stable
```
### Upgrading Google Kubernetes Engine clusters
Google Kubernetes Engine automatically updates master components (e.g. `kube-apiserver`, `kube-scheduler`) to the latest version. It also handles upgrading the operating system and other components that the master runs on.
The node upgrade process is user-initiated and is described in the [Google Kubernetes Engine documentation](https://cloud.google.com/kubernetes-engine/docs/clusters/upgrade).
### Upgrading an Amazon EKS Cluster
Amazon EKS cluster's master components can be upgraded by using eksctl, AWS Management Console, or AWS CLI. The process is user-initiated and is described in the [Amazon EKS documentation](https://docs.aws.amazon.com/eks/latest/userguide/update-cluster.html).
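For example, with `eksctl` the control plane upgrade is a single command (the cluster name is a placeholder; without `--approve`, eksctl only prints the plan):
```shell
# Upgrade the EKS control plane by one minor version.
eksctl upgrade cluster --name=my-cluster --approve
```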
### Upgrading an Oracle Cloud Infrastructure Container Engine for Kubernetes (OKE) cluster
Oracle creates and manages a set of master nodes in the Oracle control plane on your behalf (and associated Kubernetes infrastructure such as etcd nodes) to ensure you have a highly available managed Kubernetes control plane. You can also seamlessly upgrade these master nodes to new versions of Kubernetes with zero downtime. These actions are described in the [OKE documentation](https://docs.cloud.oracle.com/iaas/Content/ContEng/Tasks/contengupgradingk8smasternode.htm).
### Upgrading clusters on other platforms
Different providers, and tools, will manage upgrades differently. It is recommended that you consult their main documentation regarding upgrades.
* [kops](https://github.com/kubernetes/kops)
* [kubespray](https://github.com/kubernetes-sigs/kubespray)
* [CoreOS Tectonic](https://coreos.com/tectonic/docs/latest/admin/upgrade.html)
* [Digital Rebar](https://provision.readthedocs.io/en/tip/doc/content-packages/krib.html)
* ...
To upgrade a cluster on a platform not mentioned in the above list, check the order of component upgrade on the
[Skewed versions](/docs/setup/release/version-skew-policy/#supported-component-upgrade-order) page.
## Resizing a cluster
If your cluster runs short on resources you can easily add more machines to it if your cluster
is running in [Node self-registration mode](/docs/concepts/architecture/nodes/#self-registration-of-nodes).
If you're using GCE or Google Kubernetes Engine it's done by resizing the Instance Group managing your Nodes.
It can be accomplished by modifying the number of instances on the
`Compute > Compute Engine > Instance groups > your group > Edit group`
[Google Cloud Console page](https://console.developers.google.com) or by using the gcloud CLI:
```shell
gcloud compute instance-groups managed resize kubernetes-node-pool --size=42 --zone=$ZONE
```
The Instance Group will take care of putting the appropriate image on new machines and starting them,
while the Kubelet will register its Node with the API server to make it available for scheduling.
If you scale the instance group down, the system will randomly choose Nodes to kill.
In other environments you may need to configure the machine yourself and tell the Kubelet on which machine the API server is running.
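One way to do that, sketched here on the assumption that you hand each node a kubeconfig file (every address and file path below is a placeholder), is:
```shell
# Build a kubeconfig that names the API server endpoint and the node's credentials.
kubectl config set-cluster default-cluster --server=https://203.0.113.10:6443 \
  --certificate-authority=/etc/kubernetes/pki/ca.crt --embed-certs=true \
  --kubeconfig=/var/lib/kubelet/kubeconfig
kubectl config set-credentials default-node \
  --client-certificate=/var/lib/kubelet/pki/kubelet-client.crt \
  --client-key=/var/lib/kubelet/pki/kubelet-client.key \
  --kubeconfig=/var/lib/kubelet/kubeconfig
kubectl config set-context default --cluster=default-cluster --user=default-node \
  --kubeconfig=/var/lib/kubelet/kubeconfig
kubectl config use-context default --kubeconfig=/var/lib/kubelet/kubeconfig

# Start the kubelet with that kubeconfig so it can find the API server and register its Node.
kubelet --kubeconfig=/var/lib/kubelet/kubeconfig
```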
### Resizing an Azure Kubernetes Service (AKS) cluster
Azure Kubernetes Service enables user-initiated resizing of the cluster from either the CLI or
the Azure Portal and is described in the
[Azure AKS documentation](https://docs.microsoft.com/en-us/azure/aks/scale-cluster).
### Cluster autoscaling
If you are using GCE or Google Kubernetes Engine, you can configure your cluster so that it is automatically rescaled based on
pod needs.
As described in [Compute Resource](/docs/concepts/configuration/manage-resources-containers/),
users can reserve how much CPU and memory is allocated to pods.
This information is used by the Kubernetes scheduler to find a place to run the pod. If there is
no node that has enough free capacity (or that meets the pod's other requirements) then the pod has
to wait until some pods are terminated or a new node is added.
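For instance, a minimal sketch of a Pod that reserves CPU and memory (the name, image, and amounts are illustrative):
```shell
kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: resource-demo        # illustrative name
spec:
  containers:
  - name: app
    image: nginx             # illustrative image
    resources:
      requests:
        cpu: "500m"          # reserve half a CPU core
        memory: "512Mi"      # reserve 512 MiB of memory
EOF
```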
Cluster autoscaler looks for the pods that cannot be scheduled and checks if adding a new node, similar
to the others in the cluster, would help. If yes, then it resizes the cluster to accommodate the waiting pods.
Cluster autoscaler also scales down the cluster if it notices that one or more nodes are not needed anymore for
an extended period of time (10 minutes, but this may change in the future).
Cluster autoscaler is configured per instance group (GCE) or node pool (Google Kubernetes Engine).
If you are using GCE, you can enable it while creating a cluster with the kube-up.sh script.
To configure cluster autoscaler you have to set three environment variables:
* `KUBE_ENABLE_CLUSTER_AUTOSCALER` - it enables cluster autoscaler if set to true.
* `KUBE_AUTOSCALER_MIN_NODES` - minimum number of nodes in the cluster.
* `KUBE_AUTOSCALER_MAX_NODES` - maximum number of nodes in the cluster.
Example:
```shell
KUBE_ENABLE_CLUSTER_AUTOSCALER=true KUBE_AUTOSCALER_MIN_NODES=3 KUBE_AUTOSCALER_MAX_NODES=10 NUM_NODES=5 ./cluster/kube-up.sh
```
On Google Kubernetes Engine you configure cluster autoscaler either on cluster creation or update, or when creating a particular node pool
(which you want to be autoscaled), by passing the flags `--enable-autoscaling`, `--min-nodes` and `--max-nodes`
to the corresponding `gcloud` commands.
Examples:
```shell
gcloud container clusters create mytestcluster --zone=us-central1-b --enable-autoscaling --min-nodes=3 --max-nodes=10 --num-nodes=5
```
```shell
gcloud container clusters update mytestcluster --enable-autoscaling --min-nodes=1 --max-nodes=15
```
**Cluster autoscaler expects that nodes have not been manually modified (e.g. by adding labels via kubectl) as those properties would not be propagated to the new nodes within the same instance group.**
For more details about how the cluster autoscaler decides whether, when and how
to scale a cluster, please refer to the [FAQ](https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md)
documentation from the autoscaler project.
## Maintenance on a Node
If you need to reboot a node (such as for a kernel upgrade, libc upgrade, hardware repair, etc.), and the downtime is
brief, then when the Kubelet restarts, it will attempt to restart the pods scheduled to it. If the reboot takes longer
(the default time is 5 minutes, controlled by `--pod-eviction-timeout` on the controller-manager),
then the node controller will terminate the pods that are bound to the unavailable node. If there is a corresponding
replica set (or replication controller), then a new copy of the pod will be started on a different node. So, in the case where all
pods are replicated, upgrades can be done without special coordination, assuming that not all nodes will go down at the same time.
If you want more control over the upgrading process, you may use the following workflow:
Use `kubectl drain` to gracefully terminate all pods on the node while marking the node as unschedulable:
```shell
kubectl drain $NODENAME
```
This keeps new pods from landing on the node while you are trying to get them off.
For pods with a replica set, the pod will be replaced by a new pod which will be scheduled to a new node. Additionally, if the pod is part of a service, then clients will automatically be redirected to the new pod.
For pods with no replica set, you need to bring up a new copy of the pod, and assuming it is not part of a service, redirect clients to it.
Perform maintenance work on the node.
Make the node schedulable again:
```shell
kubectl uncordon $NODENAME
```
If you deleted the node's VM instance and created a new one, then a new schedulable node resource will
be created automatically (if you're using a cloud provider that supports
node discovery; currently this is only Google Compute Engine, not including CoreOS on Google Compute Engine using kube-register).
See [Node](/docs/concepts/architecture/nodes/) for more details.
## Advanced Topics
### Turn on or off an API version for your cluster
Specific API versions can be turned on or off by passing `--runtime-config=api/<version>` flag while bringing up the API server. For example: to turn off v1 API, pass `--runtime-config=api/v1=false`.
`runtime-config` also supports 2 special keys: `api/all` and `api/legacy`, to control all and legacy APIs respectively.
For example, to turn off all API versions except v1, pass `--runtime-config=api/all=false,api/v1=true`.
For the purposes of these flags, _legacy_ APIs are those APIs which have been explicitly deprecated (e.g. `v1beta3`).
### Switching your cluster's storage API version
The objects that are stored to disk for a cluster's internal representation of the Kubernetes resources active in the cluster are written using a particular version of the API.
When the supported API changes, these objects may need to be rewritten in the newer API. Failure to do this will eventually result in resources that are no longer decodable or usable
by the Kubernetes API server.
### Switching your config files to a new API version
You can use `kubectl convert` command to convert config files between different API versions.
```shell
kubectl convert -f pod.yaml --output-version v1
```
For more options, please refer to the usage of [kubectl convert](/docs/reference/generated/kubectl/kubectl-commands#convert) command.

View File

@ -0,0 +1,93 @@
---
title: Upgrade A Cluster
content_type: task
---
<!-- overview -->
This page provides an overview of the steps you should follow to upgrade a
Kubernetes cluster.
The way that you upgrade a cluster depends on how you initially deployed it
and on any subsequent changes.
At a high level, the steps you perform are:
- Upgrade the {{< glossary_tooltip text="control plane" term_id="control-plane" >}}
- Upgrade the nodes in your cluster
- Upgrade clients such as {{< glossary_tooltip text="kubectl" term_id="kubectl" >}}
- Adjust manifests and other resources based on the API changes that accompany the
new Kubernetes version
## {{% heading "prerequisites" %}}
You must have an existing cluster. This page is about upgrading from Kubernetes
{{< skew prevMinorVersion >}} to Kubernetes {{< skew latestVersion >}}. If your cluster
is not currently running Kubernetes {{< skew prevMinorVersion >}} then please check
the documentation for the version of Kubernetes that you plan to upgrade to.
## Upgrade approaches
### kubeadm {#upgrade-kubeadm}
If your cluster was deployed using the `kubeadm` tool, refer to
[Upgrading kubeadm clusters](/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/)
for detailed information on how to upgrade the cluster.
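As a rough sketch of that flow (the target version is a placeholder, and the package upgrade commands vary by distribution; follow the linked page for the authoritative steps):
```shell
# On a control plane node: check which versions you can upgrade to.
kubeadm upgrade plan

# Apply the upgrade to the control plane (replace vX.Y.Z with the target version).
sudo kubeadm upgrade apply vX.Y.Z

# On every other node, pull in the new kubeadm configuration.
sudo kubeadm upgrade node

# After upgrading the kubelet package on each node with your package manager,
# restart the kubelet.
sudo systemctl restart kubelet
```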
Once you have upgraded the cluster, remember to
[install the latest version of `kubectl`](/docs/tasks/tools/install-kubectl/).
### Manual deployments
{{< caution >}}
These steps do not account for third-party extensions such as network and storage
plugins.
{{< /caution >}}
You should manually update the control plane following this sequence:
- etcd (all instances)
- kube-apiserver (all control plane hosts)
- kube-controller-manager
- kube-scheduler
- cloud controller manager, if you use one
At this point you should
[install the latest version of `kubectl`](/docs/tasks/tools/install-kubectl/).
For each node in your cluster, [drain](/docs/tasks/administer-cluster/safely-drain-node/)
that node and then either replace it with a new node that uses the {{< skew latestVersion >}}
kubelet, or upgrade the kubelet on that node and bring the node back into service.
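A sketch of that per-node loop, assuming you upgrade the kubelet in place rather than replacing the machine (`$NODENAME` is a placeholder):
```shell
# Move workloads off the node and mark it unschedulable.
kubectl drain $NODENAME --ignore-daemonsets

# Upgrade the kubelet on the node (for example, via your package manager), then restart it.
sudo systemctl restart kubelet

# Check that the node reports the new kubelet version, then let Pods schedule onto it again.
kubectl get node $NODENAME
kubectl uncordon $NODENAME
```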
### Other deployments {#upgrade-other}
Refer to the documentation for your cluster deployment tool to learn the recommended
steps for maintenance.
## Post-upgrade tasks
### Switch your cluster's storage API version
The objects that are serialized into etcd for a cluster's internal
representation of the Kubernetes resources active in the cluster are
written using a particular version of the API.
When the supported API changes, these objects may need to be rewritten
in the newer API. Failure to do this will eventually result in resources
that are no longer decodable or usable by the Kubernetes API server.
For each affected object, fetch it using the latest supported API and then
write it back also using the latest supported API.
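For example, one way to rewrite a single object (here a hypothetical Deployment named `my-app`) is to read it back through the API server and write it straight out again:
```shell
# kubectl fetches the object at a currently supported API version; replacing it
# re-serializes the stored copy using that version.
kubectl get deployment my-app -o yaml | kubectl replace -f -
```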
### Update manifests
Upgrading to a new Kubernetes version can provide new APIs.
You can use the `kubectl convert` command to convert manifests between different API versions.
For example:
```shell
kubectl convert -f pod.yaml --output-version v1
```
The `kubectl` tool reads `pod.yaml` and prints a manifest that sets `kind` to
Pod (unchanged), but with a revised `apiVersion`.
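If you want to keep the converted manifest, redirect the output to a new file (the filename here is arbitrary):
```shell
kubectl convert -f pod.yaml --output-version v1 > pod-v1.yaml
```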

View File

@ -0,0 +1,29 @@
---
title: Enable Or Disable A Kubernetes API
content_type: task
---
<!-- overview -->
This page shows how to enable or disable an API version from your cluster's
{{< glossary_tooltip text="control plane" term_id="control-plane" >}}.
<!-- steps -->
Specific API versions can be turned on or off by passing `--runtime-config=api/<version>` as a
command line argument to the API server. The values for this argument are a comma-separated
list of API versions. Later values override earlier values.
The `runtime-config` command line argument also supports 2 special keys:
- `api/all`, representing all known APIs
- `api/legacy`, representing only legacy APIs. Legacy APIs are any APIs that have been
explicitly [deprecated](/docs/reference/using-api/deprecation-policy/).
For example, to turning off all API versions except v1, pass `--runtime-config=api/all=false,api/v1=true`
to the `kube-apiserver`.
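Where you set this flag depends on how your API server runs. As a sketch, on a cluster created with kubeadm the API server is a static Pod, so you would add the flag to its manifest (the path below is kubeadm's default) and the kubelet restarts the API server when the file changes:
```shell
# Edit the static Pod manifest for the API server and add the flag to the
# container's command list, for example:
#   - --runtime-config=api/all=false,api/v1=true
sudo vi /etc/kubernetes/manifests/kube-apiserver.yaml
```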
## {{% heading "whatsnext" %}}
Read the [full documentation](/docs/reference/command-line-tools-reference/kube-apiserver/)
for the `kube-apiserver` component.

View File

@ -4,14 +4,14 @@ reviewers:
- mml
- foxish
- kow3ns
-title: Safely Drain a Node while Respecting the PodDisruptionBudget
+title: Safely Drain a Node
content_type: task
min-kubernetes-server-version: 1.5
---
<!-- overview -->
This page shows how to safely drain a {{< glossary_tooltip text="node" term_id="node" >}},
-respecting the PodDisruptionBudget you have defined.
+optionally respecting the PodDisruptionBudget you have defined.
## {{% heading "prerequisites" %}}
@ -27,6 +27,15 @@ This task also assumes that you have met the following prerequisites:
<!-- steps -->
## (Optional) Configure a disruption budget {#configure-poddisruptionbudget}
To ensure that your workloads remain available during maintenance, you can
configure a [PodDisruptionBudget](/docs/concepts/workloads/pods/disruptions/).
If availability is important for any applications that run or could run on the node(s)
that you are draining, [configure a PodDisruptionBudget](/docs/tasks/run-application/configure-pdb/)
first and then continue following this guide.
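For example, a minimal PodDisruptionBudget that keeps at least two Pods of a hypothetical `zookeeper` application available during voluntary disruptions such as a drain:
```shell
kubectl apply -f - <<EOF
apiVersion: policy/v1beta1      # use policy/v1 on clusters where it is available
kind: PodDisruptionBudget
metadata:
  name: zk-pdb                  # illustrative name
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: zookeeper            # must match the labels on your Pods
EOF
```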
## Use `kubectl drain` to remove a node from service
You can use `kubectl drain` to safely evict all of your pods from a
@ -158,7 +167,4 @@ application owners and cluster owners to establish an agreement on behavior in t
* Follow steps to protect your application by [configuring a Pod Disruption Budget](/docs/tasks/run-application/configure-pdb/).
-* Learn more about [maintenance on a node](/docs/tasks/administer-cluster/cluster-management/#maintenance-on-a-node).

View File

@ -47,7 +47,7 @@ can not schedule your pod. Reasons include:
You may have exhausted the supply of CPU or Memory in your cluster. In this
case you can try several things:
-* [Add more nodes](/docs/tasks/administer-cluster/cluster-management/#resizing-a-cluster) to the cluster.
+* Add more nodes to the cluster.
* [Terminate unneeded pods](/docs/concepts/workloads/pods/#pod-termination)
  to make room for pending pods.

View File

@ -28,7 +28,7 @@
/docs/admin/audit/ /docs/tasks/debug-application-cluster/audit/ 301
/docs/admin/authorization/rbac.md /docs/admin/authorization/rbac/ 301
/docs/admin/cluster-components/ /docs/concepts/overview/components/ 301
-/docs/admin/cluster-management/ /docs/tasks/administer-cluster/cluster-management/ 301
+/docs/admin/cluster-management/ /docs/tasks/administer-cluster/ 302
/docs/admin/cluster-troubleshooting/ /docs/tasks/debug-application-cluster/debug-cluster/ 301
/docs/admin/daemons/ /docs/concepts/workloads/controllers/daemonset/ 301
/docs/admin/disruptions/ /docs/concepts/workloads/pods/disruptions/ 301
@ -83,7 +83,7 @@
/docs/concepts/cluster-administration/access-cluster/ /docs/tasks/access-application-cluster/access-cluster/ 301
/docs/concepts/cluster-administration/audit/ /docs/tasks/debug-application-cluster/audit/ 301
/docs/concepts/cluster-administration/authenticate-across-clusters-kubeconfig /docs/tasks/access-application-cluster/authenticate-across-clusters-kubeconfig/ 301
-/docs/concepts/cluster-administration/cluster-management/ /docs/tasks/administer-cluster/cluster-management/ 301
+/docs/concepts/cluster-administration/cluster-management/ /docs/tasks/administer-cluster/ 302
/docs/concepts/cluster-administration/configure-etcd/ /docs/tasks/administer-cluster/configure-upgrade-etcd/ 301
/docs/concepts/cluster-administration/device-plugins/ /docs/concepts/extend-kubernetes/compute-storage-net/device-plugins/ 301
/docs/concepts/cluster-administration/etcd-upgrade/ /docs/tasks/administer-cluster/configure-upgrade-etcd/ 301