Merge branch 'main' into update-page-weights
commit 9efe14f6a7
@@ -44,6 +44,7 @@ aliases:
- divya-mohan0209
- kbhawkey
- mehabhalodiya
- mengjiao-liu
- natalisucks
- nate-double-u
- onlydole
@@ -104,11 +105,9 @@ aliases:
- atoato88
- bells17
- kakts
- makocchi-git
- ptux
- t-inu
sig-docs-ko-owners: # Admins for Korean content
- ClaudiaJKang
- gochist
- ianychoi
- jihoon-seo
@@ -116,7 +115,6 @@ aliases:
- yoonian
- ysyukr
sig-docs-ko-reviews: # PR reviews for Korean content
- ClaudiaJKang
- gochist
- ianychoi
- jihoon-seo
@@ -146,6 +144,7 @@ aliases:
- chenxuc
- howieyuen
# idealhack
- kinzhi
- mengjiao-liu
- my-git9
# pigletfly
@@ -160,9 +159,7 @@ aliases:
- devlware
- edsoncelio
- femrtnz
- jailton
- jcjesus
- jhonmike
- rikatz
- stormqueen1990
- yagonobre
@@ -170,9 +167,8 @@ aliases:
- devlware
- edsoncelio
- femrtnz
- jailton
- jcjesus
- jhonmike
- mrerlison
- rikatz
- stormqueen1990
- yagonobre
@@ -196,9 +192,7 @@ aliases:
- mfilocha
- nvtkaszpir
sig-docs-uk-owners: # Admins for Ukrainian content
- anastyakulyk
- Arhell
- butuzov
- MaxymVlasov
sig-docs-uk-reviews: # PR reviews for Ukrainian content
- Arhell
@@ -16,7 +16,7 @@ To run the website locally using Hugo (Extended version)
To use this repository, you need to install the following locally:

- [npm](https://www.npmjs.com/)
- [Go](https://golang.org/)
- [Go](https://go.dev/)
- [Hugo(Extended version)](https://gohugo.io/)
- A container runtime, such as [Docker](https://www.docker.com/)
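With those prerequisites in place, a minimal local preview might look like the following sketch. This is not taken from the diff; the exact make targets and submodule layout of the repository may differ, so treat the commands as assumptions about the standard Hugo workflow:

```shell
# Minimal sketch: preview the site locally with Hugo (Extended).
git clone https://github.com/kubernetes/website.git
cd website
git submodule update --init --recursive   # fetch the theme and other submodules, if used
npm ci                                    # front-end build dependencies
hugo server --buildFuture                 # serve a local preview, typically at http://localhost:1313
```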
@@ -13,7 +13,7 @@ You can run the website locally using Hugo (Extended version), or
To use this repository, you need to install:

- [npm](https://www.npmjs.com/)
- [Go](https://golang.org/)
- [Go](https://go.dev/)
- [Hugo (Extended version)](https://gohugo.io/)
- A container runtime, for example [Docker](https://www.docker.com/).
@@ -878,3 +878,17 @@ div.alert > em.javascript-required {
  color: #fff;
  background: #326de6;
}

// Adjust Bing search result page
#bing-results-container {
  padding: 1em;
}
#bing-pagination-container {
  padding: 1em;
  margin-bottom: 1em;

  a.bing-page-anchor {
    padding: 0.5em;
    margin: 0.25em;
  }
}
@@ -30,7 +30,9 @@ Whether testing locally or running a global enterprise, Kubernetes flexibility g
{{% blocks/feature image="suitcase" %}}
#### Run K8s Anywhere

Kubernetes is open source giving you the freedom to take advantage of on-premises, hybrid, or public cloud infrastructure, letting you effortlessly move workloads to where it matters to you.
Kubernetes is open source giving you the freedom to take advantage of on-premises, hybrid, or public cloud infrastructure, letting you effortlessly move workloads to where it matters to you.

To download Kubernetes, visit the [download](/releases/download/) section.

{{% /blocks/feature %}}

@@ -43,12 +45,12 @@ Kubernetes is open source giving you the freedom to take advantage of on-premise
<button id="desktopShowVideoButton" onclick="kub.showVideo()">Watch Video</button>
<br>
<br>
<a href="https://events.linuxfoundation.org/kubecon-cloudnativecon-north-america" button id="desktopKCButton">Attend KubeCon North America on October 24-28, 2022</a>
<a href="https://events.linuxfoundation.org/kubecon-cloudnativecon-europe/" button id="desktopKCButton">Attend KubeCon + CloudNativeCon Europe on April 18-21, 2023</a>
<br>
<br>
<br>
<br>
<a href="https://events.linuxfoundation.org/kubecon-cloudnativecon-europe/" button id="desktopKCButton">Attend KubeCon Europe on April 17-21, 2023</a>
<a href="https://events.linuxfoundation.org/kubecon-cloudnativecon-north-america/" button id="desktopKCButton">Attend KubeCon + CloudNativeCon North America on November 6-9, 2023</a>
</div>
<div id="videoPlayer">
<iframe data-url="https://www.youtube.com/embed/H06qrNmGqyE?autoplay=1" frameborder="0" allowfullscreen></iframe>
@@ -16,7 +16,7 @@ To give you a flavor, here are four Kubernetes features that came from our exper
1) [Pods](https://github.com/GoogleCloudPlatform/kubernetes/blob/master/docs/pods.md). A pod is the unit of scheduling in Kubernetes. It is a resource envelope in which one or more containers run. Containers that are part of the same pod are guaranteed to be scheduled together onto the same machine, and can share state via local volumes.
1) [Pods](/docs/concepts/workloads/pods/). A pod is the unit of scheduling in Kubernetes. It is a resource envelope in which one or more containers run. Containers that are part of the same pod are guaranteed to be scheduled together onto the same machine, and can share state via local volumes.
@@ -24,15 +24,15 @@ Borg has a similar abstraction, called an alloc (short for “resource allocatio
2) [Services](https://github.com/GoogleCloudPlatform/kubernetes/blob/master/docs/services.md). Although Borg’s primary role is to manage the lifecycles of tasks and machines, the applications that run on Borg benefit from many other cluster services, including naming and load balancing. Kubernetes supports naming and load balancing using the service abstraction: a service has a name and maps to a dynamic set of pods defined by a label selector (see next section). Any container in the cluster can connect to the service using the service name. Under the covers, Kubernetes automatically load-balances connections to the service among the pods that match the label selector, and keeps track of where the pods are running as they get rescheduled over time due to failures.
2) [Services](/docs/concepts/services-networking/service/). Although Borg’s primary role is to manage the lifecycles of tasks and machines, the applications that run on Borg benefit from many other cluster services, including naming and load balancing. Kubernetes supports naming and load balancing using the service abstraction: a service has a name and maps to a dynamic set of pods defined by a label selector (see next section). Any container in the cluster can connect to the service using the service name. Under the covers, Kubernetes automatically load-balances connections to the service among the pods that match the label selector, and keeps track of where the pods are running as they get rescheduled over time due to failures.
3) [Labels](https://github.com/GoogleCloudPlatform/kubernetes/blob/master/docs/labels.md). A container in Borg is usually one replica in a collection of identical or nearly identical containers that correspond to one tier of an Internet service (e.g. the front-ends for Google Maps) or to the workers of a batch job (e.g. a MapReduce). The collection is called a Job, and each replica is called a Task. While the Job is a very useful abstraction, it can be limiting. For example, users often want to manage their entire service (composed of many Jobs) as a single entity, or to uniformly manage several related instances of their service, for example separate canary and stable release tracks. At the other end of the spectrum, users frequently want to reason about and control subsets of tasks within a Job -- the most common example is during rolling updates, when different subsets of the Job need to have different configurations.
3) [Labels](/docs/concepts/overview/working-with-objects/labels/). A container in Borg is usually one replica in a collection of identical or nearly identical containers that correspond to one tier of an Internet service (e.g. the front-ends for Google Maps) or to the workers of a batch job (e.g. a MapReduce). The collection is called a Job, and each replica is called a Task. While the Job is a very useful abstraction, it can be limiting. For example, users often want to manage their entire service (composed of many Jobs) as a single entity, or to uniformly manage several related instances of their service, for example separate canary and stable release tracks. At the other end of the spectrum, users frequently want to reason about and control subsets of tasks within a Job -- the most common example is during rolling updates, when different subsets of the Job need to have different configurations.
Kubernetes supports more flexible collections than Borg by organizing pods using labels, which are arbitrary key/value pairs that users attach to pods (and in fact to any object in the system). Users can create groupings equivalent to Borg Jobs by using a “job:\<jobname\>” label on their pods, but they can also use additional labels to tag the service name, service instance (production, staging, test), and in general, any subset of their pods. A label query (called a “label selector”) is used to select which set of pods an operation should be applied to. Taken together, labels and [replication controllers](https://github.com/GoogleCloudPlatform/kubernetes/blob/master/docs/replication-controller.md) allow for very flexible update semantics, as well as for operations that span the equivalent of Borg Jobs.
Kubernetes supports more flexible collections than Borg by organizing pods using labels, which are arbitrary key/value pairs that users attach to pods (and in fact to any object in the system). Users can create groupings equivalent to Borg Jobs by using a “job:\<jobname\>” label on their pods, but they can also use additional labels to tag the service name, service instance (production, staging, test), and in general, any subset of their pods. A label query (called a “label selector”) is used to select which set of pods an operation should be applied to. Taken together, labels and [replication controllers](/docs/concepts/workloads/controllers/replicationcontroller/) allow for very flexible update semantics, as well as for operations that span the equivalent of Borg Jobs.
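A minimal sketch of that label/selector pairing in today's API may help; the names and image below are illustrative, not taken from the original post:

```yaml
# Sketch: a Pod labeled with a "job" key, and a Service selecting Pods by that label.
apiVersion: v1
kind: Pod
metadata:
  name: frontend-1
  labels:
    job: frontend        # equivalent of the "job:<jobname>" grouping described above
    track: production    # extra label for a release track
spec:
  containers:
  - name: web
    image: registry.k8s.io/nginx-slim:0.8
---
apiVersion: v1
kind: Service
metadata:
  name: frontend
spec:
  selector:
    job: frontend        # the label selector: all Pods carrying job=frontend
  ports:
  - port: 80
    targetPort: 80
```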
@@ -143,7 +143,7 @@ When a default StorageClass exists and a user creates a PersistentVolumeClaim wi
Kubernetes 1.4 maintains backwards compatibility with the alpha version of the dynamic provisioning feature to allow for a smoother transition to the beta version. The alpha behavior is triggered by the existance of the alpha dynamic provisioning annotation (volume. **alpha**.kubernetes.io/storage-class). Keep in mind that if the beta annotation (volume. **beta**.kubernetes.io/storage-class) is present, it takes precedence, and triggers the beta behavior.
Kubernetes 1.4 maintains backwards compatibility with the alpha version of the dynamic provisioning feature to allow for a smoother transition to the beta version. The alpha behavior is triggered by the existence of the alpha dynamic provisioning annotation (volume. **alpha**.kubernetes.io/storage-class). Keep in mind that if the beta annotation (volume. **beta**.kubernetes.io/storage-class) is present, it takes precedence, and triggers the beta behavior.
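For readers following along, a claim using the beta annotation described above might look like this sketch (the class name "slow" is illustrative):

```yaml
# Sketch: a Kubernetes 1.4-era PVC selecting a StorageClass via the beta annotation.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: claim1
  annotations:
    volume.beta.kubernetes.io/storage-class: "slow"   # takes precedence over the alpha annotation
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 3Gi
```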
@@ -192,7 +192,7 @@ To modify/add your own DAGs, you can use `kubectl cp` to upload local files into
# Get Involved
This feature is just the beginning of multiple major efforts to improves Apache Airflow integration into Kubernetes. The Kubernetes Operator has been merged into the [1.10 release branch of Airflow](https://github.com/apache/incubator-airflow/tree/v1-10-test) (the executor in experimental mode), along with a fully k8s native scheduler called the Kubernetes Executor (article to come). These features are still in a stage where early adopters/contributers can have a huge influence on the future of these features.
This feature is just the beginning of multiple major efforts to improves Apache Airflow integration into Kubernetes. The Kubernetes Operator has been merged into the [1.10 release branch of Airflow](https://github.com/apache/incubator-airflow/tree/v1-10-test) (the executor in experimental mode), along with a fully k8s native scheduler called the Kubernetes Executor (article to come). These features are still in a stage where early adopters/contributors can have a huge influence on the future of these features.
For those interested in joining these efforts, I'd recommend checkint out these steps:
@@ -460,7 +460,7 @@ Now you can configure your DHCP. Basically you should set the `next-server` and
I use ISC-DHCP server, and here is an example `dhcpd.conf`:

```
shared-network ltsp-netowrk {
shared-network ltsp-network {
  subnet 10.9.0.0 netmask 255.255.0.0 {
    authoritative;
    default-lease-time -1;
@@ -8,7 +8,7 @@ date: 2018-12-12
Kubernetes provides great primitives for deploying applications to a cluster: it can be as simple as `kubectl create -f app.yaml`. Deploy apps across multiple clusters has never been that simple. How should app workloads be distributed? Should the app resources be replicated into all clusters, replicated into selected clusters, or partitioned into clusters? How is access to the clusters managed? What happens if some of the resources that a user wants to distribute pre-exist, in some or all of the clusters, in some form?
In SIG Multicluster, our journey has revealed that there are multiple possible models to solve these problems and there probably is no single best-fit, all-scenario solution. [Federation](/docs/concepts/cluster-administration/federation/), however, is the single biggest Kubernetes open source sub-project, and has seen the maximum interest and contribution from the community in this problem space. The project initially reused the Kubernetes API to do away with any added usage complexity for an existing Kubernetes user. This approach was not viable, because of the problems summarised below:
In SIG Multicluster, our journey has revealed that there are multiple possible models to solve these problems and there probably is no single best-fit, all-scenario solution. [Kubernetes Cluster Federation (KubeFed for short)](https://github.com/kubernetes-sigs/kubefed), however, is the single biggest Kubernetes open source sub-project, and has seen the maximum interest and contribution from the community in this problem space. The project initially reused the Kubernetes API to do away with any added usage complexity for an existing Kubernetes user. This approach was not viable, because of the problems summarised below:
* Difficulties in re-implementing the Kubernetes API at the cluster level, as federation-specific extensions were stored in annotations.
* Limited flexibility in federated types, placement and reconciliation, due to 1:1 emulation of the Kubernetes API.
@@ -129,7 +129,7 @@ spec:
spec:
containers:
- name: test-container
image: k8s.gcr.io/busybox
image: registry.k8s.io/busybox # updated after publication (previously used k8s.gcr.io/busybox)
command:
- "/bin/sh"
args:
@@ -27,7 +27,7 @@ Our goal is for Kubernetes docs to be a trustworthy guide to Kubernetes features
### Re-homing content
Some content will be removed that readers may find helpful. To make sure readers have continous access to information, we're giving stakeholders until the [1.19 release deadline for docs](https://github.com/kubernetes/sig-release/tree/master/releases/release-1.19), **July 9th, 2020** to re-home any content slated for removal.
Some content will be removed that readers may find helpful. To make sure readers have continuous access to information, we're giving stakeholders until the [1.19 release deadline for docs](https://github.com/kubernetes/sig-release/tree/master/releases/release-1.19), **July 9th, 2020** to re-home any content slated for removal.
Over the next few months you'll see less third party content in the docs as contributors open PRs to remove content.
@@ -520,7 +520,7 @@ And the real strength of WSL2 integration, the port `8443` once open on WSL2 dis

Working on the command line is always good and very insightful. However, when dealing with Kubernetes we might want, at some point, to have a visual overview.

For that, Minikube embeded the [Kubernetes Dashboard](https://github.com/kubernetes/dashboard). Thanks to it, running and accessing the Dashboard is very simple:
For that, Minikube embedded the [Kubernetes Dashboard](https://github.com/kubernetes/dashboard). Thanks to it, running and accessing the Dashboard is very simple:

```bash
# Enable the Dashboard service
@@ -55,7 +55,7 @@ The team has made progress in the last few months that is well worth celebrating
- The K8s-Infrastructure Working Group released an automated billing report that they start every meeting off by reviewing as a group.
- DNS for k8s.io and kubernetes.io are also fully [community-owned](https://groups.google.com/g/kubernetes-dev/c/LZTYJorGh7c/m/u-ydk-yNEgAJ), with community members able to [file issues](https://github.com/kubernetes/k8s.io/issues/new?assignees=&labels=wg%2Fk8s-infra&template=dns-request.md&title=DNS+REQUEST%3A+%3Cyour-dns-record%3E) to manage records.
- The container registry [k8s.gcr.io](https://github.com/kubernetes/k8s.io/tree/main/k8s.gcr.io) is also fully community-owned and available for all Kubernetes subprojects to use.
- The container registry [registry.k8s.io](https://github.com/kubernetes/k8s.io/tree/main/registry.k8s.io) is also fully community-owned and available for all Kubernetes subprojects to use.
_Note:_ The container registry has changed to registry.k8s.io. Updated on August 25, 2022.
- The Kubernetes [publishing-bot](https://github.com/kubernetes/publishing-bot) responsible for keeping k8s.io/kubernetes/staging repositories published to their own top-level repos (For example: [kubernetes/api](https://github.com/kubernetes/api)) runs on a community-owned cluster.
- The gcsweb.k8s.io service used to provide anonymous access to GCS buckets for kubernetes artifacts runs on a community-owned cluster.
@@ -198,7 +198,7 @@ GUINEVERE SAENGER: I would want Jorge to be really on top of making sure that ev
Greater communication of timelines and just giving people more time and space to be able to get in their changes, or at least, seemingly give them more time and space by sending early warnings, is going to be helpful. Of course, he's going to have a slightly longer release, too, than I did. This might be related to a unique Q4 challenge. Overall, I would encourage him to take more breaks, to rely more on his release shadows, and split out the work in a fashion that allows everyone to have a turn and everyone to have a break as well.
**ADAM GLICK: What would your advice be to someone who is hearing your experience and is inspired to get involved with the Kubernetes release or contributer process?**
**ADAM GLICK: What would your advice be to someone who is hearing your experience and is inspired to get involved with the Kubernetes release or contributor process?**
GUINEVERE SAENGER: Those are two separate questions. So let me tackle the Kubernetes release question first. Kubernetes [SIG Release](https://github.com/kubernetes/sig-release/#readme) has, in my opinion, a really excellent onboarding program for new members. We have what is called the [Release Team Shadow Program](https://github.com/kubernetes/sig-release/blob/master/release-team/shadows.md). We also have the Release Engineering Shadow Program, or the Release Management Shadow Program. Those are two separate subprojects within SIG Release. And each subproject has a team of roles, and each role can have two to four shadows that are basically people who are part of that role team, and they are learning that role as they are doing it.
@@ -81,7 +81,7 @@ If the `ServerSideFieldValidation` feature gate is enabled starting 1.23, users

With the feature gate enabled, we also introduce the `fieldValidation` query parameter so that users can specify the desired behavior of the server on a per request basis. Valid values for the `fieldValidation` query parameter are:

- Ignore (default when feature gate is disabled, same as pre-1.23 behavior of dropping/ignoring unkonwn fields)
- Ignore (default when feature gate is disabled, same as pre-1.23 behavior of dropping/ignoring unknown fields)
- Warn (default when feature gate is enabled).
- Strict (this will fail the request with an Invalid Request error)
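As a rough illustration of the query parameter described above (not taken from the original post; the namespace, manifest file name, and proxy port are assumptions):

```shell
# Sketch: ask the API server for strict server-side field validation on a create request.
kubectl proxy --port=8001 &
curl -sS -X POST \
  'http://127.0.0.1:8001/api/v1/namespaces/default/pods?fieldValidation=Strict' \
  -H 'Content-Type: application/yaml' \
  --data-binary @pod.yaml
```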
@@ -5,6 +5,7 @@ linkTitle: "Dockershim Removal FAQ"
date: 2022-02-17
slug: dockershim-faq
aliases: [ '/dockershim' ]
evergreen: true
---

**This supersedes the original
@@ -32,7 +32,7 @@ Caleb is also a co-organizer of the [CloudNative NZ](https://www.meetup.com/clou

## [Dylan Graham](https://github.com/DylanGraham)

Dylan Graham is a cloud engineer from Adeliade, Australia. He has been contributing to the upstream Kubernetes project since 2018.
Dylan Graham is a cloud engineer from Adelaide, Australia. He has been contributing to the upstream Kubernetes project since 2018.

He stated that being a part of such a large-scale project was initially overwhelming, but that the community's friendliness and openness assisted him in getting through it.
@@ -115,7 +115,8 @@ metadata:
spec:
containers:
- name: agnhost
image: k8s.gcr.io/e2e-test-images/agnhost:2.35
# image changed since publication (previously used registry "k8s.gcr.io")
image: registry.k8s.io/e2e-test-images/agnhost:2.35
command: ["/agnhost", "grpc-health-checking"]
ports:
- containerPort: 5000
@@ -18,7 +18,7 @@ case where you're using the `OrderedReady` Pod management policy for a StatefulS
Here are some examples:

- I am using a StatefulSet to orchestrate a multi-instance, cache based application where the size of the cache is large. The cache
starts cold and requires some siginificant amount of time before the container can start. There could be more initial startup tasks
starts cold and requires some significant amount of time before the container can start. There could be more initial startup tasks
that are required. A RollingUpdate on this StatefulSet would take a lot of time before the application is fully updated. If the
StatefulSet supported updating more than one pod at a time, it would result in a much faster update.

@@ -50,7 +50,8 @@ spec:
app: nginx
spec:
containers:
- image: k8s.gcr.io/nginx-slim:0.8
# image changed since publication (previously used registry "k8s.gcr.io")
- image: registry.k8s.io/nginx-slim:0.8
imagePullPolicy: IfNotPresent
name: nginx
updateStrategy:
@@ -66,7 +67,7 @@ If you enable the new feature and you don't specify a value for `maxUnavailable`

I'll run through a scenario based on that example manifest to demonstrate how this feature works. I will deploy a StatefulSet that
has 5 replicas, with `maxUnavailable` set to 2 and `partition` set to 0.
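The relevant part of such a StatefulSet spec might look like this sketch (only the fields mentioned in the scenario; the rest of the manifest is omitted):

```yaml
# Sketch: rolling update settings for the scenario described above.
spec:
  replicas: 5
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 2   # up to two Pods may be unavailable during the update
      partition: 0        # update all ordinals
```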
I can trigger a rolling update by changing the image to `k8s.gcr.io/nginx-slim:0.9`. Once I initiate the rolling update, I can
I can trigger a rolling update by changing the image to `registry.k8s.io/nginx-slim:0.9`. Once I initiate the rolling update, I can
watch the pods update 2 at a time as the current value of maxUnavailable is 2. The below output shows a span of time and is not
complete. The maxUnavailable can be an absolute number (for example, 2) or a percentage of desired Pods (for example, 10%). The
absolute number is calculated from percentage by rounding up to the nearest integer.
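One way to kick off and observe that update is sketched below; the StatefulSet name `web` is an assumption, while the container name and label come from the example manifest:

```shell
# Sketch: trigger the rolling update and watch Pods change two at a time.
kubectl set image statefulset/web nginx=registry.k8s.io/nginx-slim:0.9
kubectl get pods --watch -l app=nginx
```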
@@ -145,7 +145,7 @@ workstream within the Gateway API subproject focused on Gateway API for Mesh
Management and Administration.

This group will deliver [enhancement
proposals](https://gateway-api.sigs.k8s.io/v1beta1/contributing/gep/) consisting
proposals](https://gateway-api.sigs.k8s.io/geps/overview/) consisting
of resources, additions, and modifications to the Gateway API specification for
mesh and mesh-adjacent use-cases.
@@ -89,7 +89,7 @@ To use cgroup v2 with Kubernetes, you must meet the following requirements:
* The kubelet and the container runtime are configured to use the [systemd cgroup driver](/docs/setup/production-environment/container-runtimes#systemd-cgroup-driver)

The kubelet and container runtime use a [cgroup driver](/docs/setup/production-environment/container-runtimes#cgroup-drivers)
to set cgroup paramaters. When using cgroup v2, it's strongly recommended that both
to set cgroup parameters. When using cgroup v2, it's strongly recommended that both
the kubelet and your container runtime use the
[systemd cgroup driver](/docs/setup/production-environment/container-runtimes#systemd-cgroup-driver),
so that there's a single cgroup manager on the system. To configure the kubelet
@@ -438,7 +438,7 @@ kubectl apply -f crds/stable.example.com_appendonlylists.yaml
customresourcedefinition.apiextensions.k8s.io/appendonlylists.stable.example.com created
```

Creating an inital list with one element inside should succeed without problem:
Creating an initial list with one element inside should succeed without problem:
```shell
kubectl apply -f - <<EOF
---
@@ -30,7 +30,7 @@ cloud computing resources.
In this release we want to recognise the importance of all these building blocks on which Kubernetes
is developed and used, while at the same time raising awareness on the importance of taking the
energy consumption footprint into account: environmental sustainability is an inescapable concern of
creators and users of any software solution, and the environmental footprint of sofware, like
creators and users of any software solution, and the environmental footprint of software, like
Kubernetes, an area which we believe will play a significant role in future releases.

As a community, we always work to make each new release process better than before (in this release,
@@ -54,7 +54,7 @@ closed and the storage will be unmounted.
HostProcess and Linux privileged containers enable similar scenarios but differ
greatly in their implementation (hence the naming difference). HostProcess containers
have their own [PodSecurityContext](https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.25/#windowssecuritycontextoptions-v1-core) fields.
have their own [PodSecurityContext](/docs/reference/generated/kubernetes-api/v1.25/#windowssecuritycontextoptions-v1-core) fields.
Those used to configure Linux privileged containers **do not** apply. Enabling privileged access to a Windows host is a
fundamentally different process than with Linux so the configuration and
capabilities of each differ significantly. Below is a diagram detailing the
@@ -110,7 +110,7 @@ Please note that within a Pod, you can't mix HostProcess containers with normal
- Work through [Create a Windows HostProcess Pod](/docs/tasks/configure-pod-container/create-hostprocess-pod/)
- Read about Kubernetes [Pod Security Standards](/docs/concepts/security/pod-security-standards/) and [Pod Security Admission](docs/concepts/security/pod-security-admission/)
- Read about Kubernetes [Pod Security Standards](/docs/concepts/security/pod-security-standards/) and [Pod Security Admission](/docs/concepts/security/pod-security-admission/)
- Read the enhancement proposal [Windows Privileged Containers and Host Networking Mode](https://github.com/kubernetes/enhancements/tree/master/keps/sig-windows/1981-windows-privileged-container-support) (KEP-1981)
@@ -60,12 +60,14 @@ kind: ValidatingAdmissionPolicyBinding
metadata:
name: "demo-binding-test.example.com"
spec:
policy: "demo-policy.example.com"
policyName: "demo-policy.example.com"
matchResources:
namespaceSelector:
- key: environment,
operator: In,
values: ["test"]
matchExpressions:
- key: environment
operator: In
values:
- test
```

This `ValidatingAdmissionPolicyBinding` resource binds the above policy only to
@@ -115,14 +117,16 @@ kind: ValidatingAdmissionPolicyBinding
metadata:
name: "demo-binding-production.example.com"
spec:
policy: "demo-policy.example.com"
paramsRef:
policyName: "demo-policy.example.com"
paramRef:
name: "demo-params-production.example.com"
matchResources:
namespaceSelector:
- key: environment,
operator: In,
values: ["production"]
matchExpressions:
- key: environment
operator: In
values:
- production
```

```yaml
@@ -90,7 +90,7 @@ This dependency made the tracking of Job status unreliable, because Pods can be
deleted from the API for a number of reasons, including:
- The garbage collector removing orphan Pods when a Node goes down.
- The garbage collector removing terminated Pods when they reach a threshold.
- The Kubernetes scheduler preempting a Pod to accomodate higher priority Pods.
- The Kubernetes scheduler preempting a Pod to accommodate higher priority Pods.
- The taint manager evicting a Pod that doesn't tolerate a `NoExecute` taint.
- External controllers, not included as part of Kubernetes, or humans deleting
Pods.
@@ -107,38 +107,42 @@ If you want to test the feature whilst it's alpha, you need to enable the releva
If you would like to see the feature in action and verify it works fine in your cluster here's what you can try:

1. Define a basic PersistentVolumeClaim:
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: pvc-1
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
```

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: pvc-1
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
```

2. Create the PersistentVolumeClaim when there is no default StorageClass. The PVC won't provision or bind (unless there is an existing, suitable PV already present) and will remain in <code>Pending</code> state.
```
$ kc get pvc
NAME    STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
pvc-1   Pending
```

```
$ kc get pvc
NAME    STATUS    VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS   AGE
pvc-1   Pending
```

3. Configure one StorageClass as default.
```
$ kc patch sc -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
storageclass.storage.k8s.io/my-storageclass patched
```

```
$ kc patch sc -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'
storageclass.storage.k8s.io/my-storageclass patched
```

4. Verify that PersistentVolumeClaims is now provisioned correctly and was updated retroactively with new default StorageClass.
```
$ kc get pvc
NAME    STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
pvc-1   Bound    pvc-06a964ca-f997-4780-8627-b5c3bf5a87d8   1Gi        RWO            my-storageclass   87m
```

```
$ kc get pvc
NAME    STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS      AGE
pvc-1   Bound    pvc-06a964ca-f997-4780-8627-b5c3bf5a87d8   1Gi        RWO            my-storageclass   87m
```

### New metrics
@@ -103,7 +103,7 @@ spec:
resources:
requests:
memory: "256Mi"
cpu: "0.2"
cpu: "0.2"
limits:
memory: ".5Gi"
cpu: "0.5"
@@ -0,0 +1,84 @@
---
layout: blog
title: Consider All Microservices Vulnerable — And Monitor Their Behavior
date: 2023-01-20
slug: security-behavior-analysis
---

**Author:**
David Hadas (IBM Research Labs)
_This post warns DevOps against a false sense of security. Following security best practices when developing and configuring microservices does not result in non-vulnerable microservices. The post shows that although all deployed microservices are vulnerable, there is much that can be done to ensure microservices are not exploited. It explains how analyzing the behavior of clients and services from a security standpoint, named here **"Security-Behavior Analytics"**, can protect the deployed vulnerable microservices. It points to [Guard](http://knative.dev/security-guard), an open source project offering security-behavior monitoring and control of Kubernetes microservices presumed vulnerable._
As cyber attacks continue to intensify in sophistication, organizations deploying cloud services continue to grow their cyber investments aiming to produce safe and non-vulnerable services. However, the year-by-year growth in cyber investments does not result in a parallel reduction in cyber incidents. Instead, the number of cyber incidents continues to grow annually. Evidently, organizations are doomed to fail in this struggle - no matter how much effort is made to detect and remove cyber weaknesses from deployed services, it seems offenders always have the upper hand.
Considering the current spread of offensive tools, sophistication of offensive players, and ever-growing cyber financial gains to offenders, any cyber strategy that relies on constructing a non-vulnerable, weakness-free service in 2023 is clearly too naïve. It seems the only viable strategy is to:
➥ **Admit that your services are vulnerable!**
In other words, consciously accept that you will never create completely invulnerable services. If your opponents find even a single weakness as an entry-point, you lose! Admitting that in spite of your best efforts, all your services are still vulnerable is an important first step. Next, this post discusses what you can do about it...
## How to protect microservices from being exploited
Being vulnerable does not necessarily mean that your service will be exploited. Though your services are vulnerable in some ways unknown to you, offenders still need to identify these vulnerabilities and then exploit them. If offenders fail to exploit your service vulnerabilities, you win! In other words, having a vulnerability that can’t be exploited, represents a risk that can’t be realized.
{{< figure src="security_behavior_figure_1.svg" alt="Image of an example of offender gaining foothold in a service" class="diagram-large" caption="Figure 1. An Offender gaining foothold in a vulnerable service" >}}
The above diagram shows an example in which the offender does not yet have a foothold in the service; that is, it is assumed that your service does not run code controlled by the offender on day 1. In our example the service has vulnerabilities in the API exposed to clients. To gain an initial foothold the offender uses a malicious client to try and exploit one of the service API vulnerabilities. The malicious client sends an exploit that triggers some unplanned behavior of the service.
More specifically, let’s assume the service is vulnerable to an SQL injection. The developer failed to sanitize the user input properly, thereby allowing clients to send values that would change the intended behavior. In our example, if a client sends a query string with key “username” and value of _“tom or 1=1”_, the client will receive the data of all users. Exploiting this vulnerability requires the client to send an irregular string as the value. Note that benign users will not be sending a string with spaces or with the equal sign character as a username, instead they will normally send legal usernames which for example may be defined as a short sequence of characters a-z. No legal username can trigger service unplanned behavior.
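As a toy illustration of the regularity argument above (the username pattern is an assumption made for this example, not something prescribed by Guard):

```shell
# Sketch: a benign username matches a narrow pattern; the exploit payload does not.
is_regular() { printf '%s' "$1" | grep -Eq '^[a-z]{1,16}$' && echo regular || echo irregular; }
is_regular "tom"          # prints "regular"
is_regular "tom or 1=1"   # prints "irregular" (spaces and '=' fall outside the expected pattern)
```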
In this simple example, one can already identify several opportunities to detect and block an attempt to exploit the vulnerability (un)intentionally left behind by the developer, making the vulnerability unexploitable. First, the malicious client behavior differs from the behavior of benign clients, as it sends irregular requests. If such a change in behavior is detected and blocked, the exploit will never reach the service. Second, the service behavior in response to the exploit differs from the service behavior in response to a regular request. Such behavior may include making subsequent irregular calls to other services such as a data store, taking irregular time to respond, and/or responding to the malicious client with an irregular response (for example, containing much more data than normally sent in case of benign clients making regular requests). Service behavioral changes, if detected, will also allow blocking the exploit in different stages of the exploitation attempt.
More generally:
- Monitoring the behavior of clients can help detect and block exploits against service API vulnerabilities. In fact, deploying efficient client behavior monitoring makes many vulnerabilities unexploitable and others very hard to achieve. To succeed, the offender needs to create an exploit undetectable from regular requests.
- Monitoring the behavior of services can help detect services as they are being exploited regardless of the attack vector used. Efficient service behavior monitoring limits what an attacker may be able to achieve as the offender needs to ensure the service behavior is undetectable from regular service behavior.
Combining both approaches may add a protection layer to the deployed vulnerable services, drastically decreasing the probability for anyone to successfully exploit any of the deployed vulnerable services. Next, let us identify four use cases where you need to use security-behavior monitoring.
## Use cases
One can identify the following four different stages in the life of any service from a security standpoint. In each stage, security-behavior monitoring is required to meet different challenges:
Service State | Use case | What do you need in order to cope with this use case?
------------- | ------------- | -----------------------------------------
**Normal** | **No known vulnerabilities:** The service owner is normally not aware of any known vulnerabilities in the service image or configuration. Yet, it is reasonable to assume that the service has weaknesses. | **Provide generic protection against any unknown, zero-day, service vulnerabilities** - Detect/block irregular patterns sent as part of incoming client requests that may be used as exploits.
**Vulnerable** | **An applicable CVE is published:** The service owner is required to release a new non-vulnerable revision of the service. Research shows that in practice this process of removing a known vulnerability may take many weeks to accomplish (2 months on average). | **Add protection based on the CVE analysis** - Detect/block incoming requests that include specific patterns that may be used to exploit the discovered vulnerability. Continue to offer services, although the service has a known vulnerability.
**Exploitable** | **A known exploit is published:** The service owner needs a way to filter incoming requests that contain the known exploit. | **Add protection based on a known exploit signature** - Detect/block incoming client requests that carry signatures identifying the exploit. Continue to offer services, despite the presence of an exploit.
**Misused** | **An offender misuses pods backing the service:** The offender can follow an attack pattern enabling him/her to misuse pods. The service owner needs to restart any compromised pods while using non compromised pods to continue offering the service. Note that once a pod is restarted, the offender needs to repeat the attack pattern before he/she may again misuse it. | **Identify and restart instances of the component that is being misused** - At any given time, some backing pods may be compromised and misused, while others behave as designed. Detect/remove the misused pods while allowing other pods to continue servicing client requests.
Fortunately, microservice architecture is well suited to security-behavior monitoring as discussed next.
## Security-Behavior of microservices versus monoliths {#microservices-vs-monoliths}
Kubernetes is often used to support workloads designed with microservice architecture. By design, microservices aim to follow the UNIX philosophy of "Do One Thing And Do It Well". Each microservice has a bounded context and a clear interface. In other words, you can expect the microservice clients to send relatively regular requests and the microservice to present a relatively regular behavior as a response to these requests. Consequently, a microservice architecture is an excellent candidate for security-behavior monitoring.
{{< figure src="security_behavior_figure_2.svg" alt="Image showing why microservices are well suited for security-behavior monitoring" class="diagram-large" caption="Figure 2. Microservices are well suited for security-behavior monitoring" >}}
The diagram above clarifies how dividing a monolithic service to a set of microservices improves our ability to perform security-behavior monitoring and control. In a monolithic service approach, different client requests are intertwined, resulting in a diminished ability to identify irregular client behaviors. Without prior knowledge, an observer of the intertwined client requests will find it hard to distinguish between types of requests and their related characteristics. Further, internal client requests are not exposed to the observer. Lastly, the aggregated behavior of the monolithic service is a compound of the many different internal behaviors of its components, making it hard to identify irregular service behavior.
In a microservice environment, each microservice is expected by design to offer a more well-defined service and serve better defined type of requests. This makes it easier for an observer to identify irregular client behavior and irregular service behavior. Further, a microservice design exposes the internal requests and internal services which offer more security-behavior data to identify irregularities by an observer. Overall, this makes the microservice design pattern better suited for security-behavior monitoring and control.
## Security-Behavior monitoring on Kubernetes
Kubernetes deployments seeking to add Security-Behavior may use [Guard](http://knative.dev/security-guard), developed under the CNCF project Knative. Guard is integrated into the full Knative automation suite that runs on top of Kubernetes. Alternatively, **you can deploy Guard as a standalone tool** to protect any HTTP-based workload on Kubernetes.
See:
- [Guard](https://github.com/knative-sandbox/security-guard) on Github, for using Guard as a standalone tool.
- The Knative automation suite - Read about Knative, in the blog post [Opinionated Kubernetes](https://davidhadas.wordpress.com/2022/08/29/knative-an-opinionated-kubernetes) which describes how Knative simplifies and unifies the way web services are deployed on Kubernetes.
- You may contact Guard maintainers on the [SIG Security](https://kubernetes.slack.com/archives/C019LFTGNQ3) Slack channel or on the Knative community [security](https://knative.slack.com/archives/CBYV1E0TG) Slack channel. The Knative community channel will move soon to the [CNCF Slack](https://communityinviter.com/apps/cloud-native/cncf) under the name `#knative-security`.
The goal of this post is to invite the Kubernetes community to action and introduce Security-Behavior monitoring and control to help secure Kubernetes based deployments. Hopefully, the community as a follow up will:
1. Analyze the cyber challenges presented for different Kubernetes use cases
1. Add appropriate security documentation for users on how to introduce Security-Behavior monitoring and control.
1. Consider how to integrate with tools that can help users monitor and control their vulnerable services.
## Getting involved
You are welcome to get involved and join the effort to develop security behavior monitoring
and control for Kubernetes; to share feedback and contribute to code or documentation;
and to make or suggest improvements of any kind.
[Two SVG figure files are added with this post (379 KiB and 421 KiB); their file diffs are suppressed.]
@@ -0,0 +1,141 @@
---
layout: blog
title: "Spotlight on SIG Instrumentation"
slug: sig-instrumentation-spotlight-2023
date: 2023-02-03
canonicalUrl: https://www.kubernetes.dev/blog/2023/02/03/sig-instrumentation-spotlight-2023/
---

**Author:** Imran Noor Mohamed (Delivery Hero)

Observability requires the right data at the right time for the right consumer
(human or piece of software) to make the right decision. In the context of Kubernetes,
having best practices for cluster observability across all Kubernetes components is crucial.

SIG Instrumentation helps to address this issue by providing best practices and tools
that all other SIGs use to instrument Kubernetes components-like the *API server*,
*scheduler*, *kubelet* and *kube-controller-manager*.

In this SIG Instrumentation spotlight, [Imran Noor Mohamed](https://www.linkedin.com/in/imrannoormohamed/),
SIG ContribEx-Comms tech lead talked with [Elana Hashman](https://twitter.com/ehashdn),
and [Han Kang](https://www.linkedin.com/in/hankang), chairs of SIG Instrumentation,
on how the SIG is organized, what are the current challenges and how anyone can get involved and contribute.

## About SIG Instrumentation

**Imran (INM)**: Hello, thank you for the opportunity of learning more about SIG Instrumentation.
Could you tell us a bit about yourself, your role, and how you got involved in SIG Instrumentation?

**Han (HK)**: I started in SIG Instrumentation in 2018, and became a chair in 2020.
I primarily got involved with SIG instrumentation due to a number of upstream issues
with metrics which ended up affecting GKE in bad ways. As a result, we ended up
launching an initiative to stabilize our metrics and make metrics a proper API.

**Elana (EH)**: I also joined SIG Instrumentation in 2018 and became a chair at the
same time as Han. I was working as a site reliability engineer (SRE) on bare metal
Kubernetes clusters and was working to build out our observability stack.
I encountered some issues with label joins where Kubernetes metrics didn’t match
kube-state-metrics ([KSM](https://github.com/kubernetes/kube-state-metrics)) and
started participating in SIG meetings to improve things. I helped test performance
improvements to kube-state-metrics and ultimately coauthored a KEP for overhauling
metrics in the 1.14 release to improve usability.

**Imran (INM)**: Interesting! Does that mean SIG Instrumentation involves a lot of plumbing?

**Han (HK)**: I wouldn’t say it involves a ton of plumbing, though it does touch
basically every code base. We have our own dedicated directories for our metrics,
logs, and tracing frameworks which we tend to work out of primarily. We do have to
interact with other SIGs in order to propagate our changes which makes us more of
a horizontal SIG.

**Imran (INM)**: Speaking about interaction and coordination with other SIGs, could
you describe how the SIG is organized?

**Elana (EH)**: In SIG Instrumentation, we have two chairs, Han and myself, as well
as two tech leads, David Ashpole and Damien Grisonnet. We all work together as the
SIG’s leads in order to run meetings, triage issues and PRs, review and approve KEPs,
plan for each release, present at KubeCon and community meetings, and write our annual
report. Within the SIG we also have a number of important subprojects, each of which is
stewarded by its subproject owners. For example, Marek Siarkowicz is a subproject owner
of [metrics-server](https://github.com/kubernetes-sigs/metrics-server).

Because we’re a horizontal SIG, some of our projects have a wide scope and require
coordination from a dedicated group of contributors. For example, in order to guide
the Kubernetes migration to structured logging, we chartered the
[Structured Logging](https://github.com/kubernetes/community/blob/master/wg-structured-logging/README.md)
Working Group (WG), organized by Marek and Patrick Ohly. The WG doesn’t own any code,
but helps with various components such as the *kubelet*, *scheduler*, etc. in migrating
their code to use structured logs.

**Imran (INM)**: Walking through the
[charter](https://github.com/kubernetes/community/blob/master/sig-instrumentation/charter.md)
alone it’s clear that SIG Instrumentation has a lot of sub-projects.
Could you highlight some important ones?

**Han (HK)**: We have many different sub-projects and we are in dire need of
people who can come and help shepherd them. Our most important projects in-tree
(that is, within the kubernetes/kubernetes repo) are metrics, tracing, and,
structured logging. Our most important projects out-of-tree are
(a) KSM (kube-state-metrics) and (b) metrics-server.

**Elana (EH)**: Echoing this, we would love to bring on more maintainers for
kube-state-metrics and metrics-server. Our friends at WG Structured Logging are
also looking for contributors. Other subprojects include klog, prometheus-adapter,
and a new subproject that we just launched for collecting high-fidelity, scalable
utilization metrics called [usage-metrics-collector](https://github.com/kubernetes-sigs/usage-metrics-collector).
All are seeking new contributors!

## Current status and ongoing challenges

**Imran (INM)**: For release [1.26](https://github.com/kubernetes/sig-release/tree/master/releases/release-1.26)
we can see that there are a relevant number of metrics, logs, and tracing
[KEPs](https://www.k8s.dev/resources/keps/) in the pipeline. Would you like to
point out important things for last release (maybe alpha & stable milestone candidates?)

**Han (HK)**: We can now generate [documentation](https://kubernetes.io/docs/reference/instrumentation/metrics/)
for every single metric in the main Kubernetes code base! We have a pretty fancy
static analysis pipeline that enables this functionality. We’ve also added feature
metrics so that you can look at your metrics to determine which features are enabled
in your cluster at a given time. Lastly, we added a component-sli endpoint, which
should make it easy for people to create availability SLOs for *control-plane* components.
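For readers who want to poke at the endpoints mentioned here, a rough sketch is below (cluster access is assumed, and the SLI endpoint is only served where the corresponding feature is enabled in your version):

```shell
# Sketch: inspect instrumentation endpoints on the API server.
kubectl get --raw /metrics | head        # Prometheus-format metrics, including feature metrics
kubectl get --raw /metrics/slis | head   # component SLI metrics, where exposed
```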

**Elana (EH)**: We’ve also been working on tracing KEPs for both the *API server*
and *kubelet*, though neither graduated in 1.26. I’m also really excited about the
work Han is doing with WG Reliability to extend and improve our metrics stability framework.

**Imran (INM)**: What do you think are the Kubernetes-specific challenges tackled by
the SIG Instrumentation? What are the future efforts to solve them?

**Han (HK)**: SIG instrumentation suffered a bit in the past from being a horizontal SIG.
We did not have an obvious location to put our code and did not have a good mechanism to
audit metrics that people would randomly add. We’ve fixed this over the years and now we
have dedicated spots for our code and a reliable mechanism for auditing new metrics.
We also now offer stability guarantees for metrics. We hope to have full-blown tracing
up and down the kubernetes stack, and metric support via exemplars.

**Elana (EH)**: I think SIG Instrumentation is a really interesting SIG because it
poses different kinds of opportunities to get involved than in other SIGs. You don’t
have to be a software developer to contribute to our SIG! All of our components and
subprojects are focused on better understanding Kubernetes and its performance in
production, which allowed me to get involved as one of the few SIG Chairs working as
an SRE at that time. I like that we provide opportunities for newcomers to contribute
through using, testing, and providing feedback on our subprojects, which is a lower
barrier to entry. Because many of these projects are out-of-tree, I think one of our
challenges is to figure out what’s in scope for core Kubernetes SIGs instrumentation
subprojects, what’s missing, and then fill in the gaps.

## Community and contribution

**Imran (INM)**: Kubernetes values community over products. Any recommendation
for anyone looking into getting involved in SIG Instrumentation work? Where
should they start (new contributor-friendly areas within SIG?)

**Han(HK) and Elana (EH)**: Come to our bi-weekly triage
[meetings](https://github.com/kubernetes/community/tree/master/sig-instrumentation#meetings)!
They aren’t recorded and are a great place to ask questions and learn about our ongoing work.
We strive to be a friendly community and one of the easiest SIGs to get started with.
You can check out our latest KubeCon NA 2022 [SIG Instrumentation Deep Dive](https://youtu.be/JIzrlWtAA8Y)
to get more insight into our work. We also invite you to join our Slack channel #sig-instrumentation
and feel free to reach out to any of our SIG leads or subproject owners directly.

Thank you so much for your time and insights into the workings of SIG Instrumentation!
|
@ -0,0 +1,50 @@
|
|||
---
|
||||
layout: blog
|
||||
title: "k8s.gcr.io Image Registry Will Be Frozen From the 3rd of April 2023"
|
||||
date: 2023-02-06
|
||||
slug: k8s-gcr-io-freeze-announcement
|
||||
---
|
||||
|
||||
**Authors**: Mahamed Ali (Rackspace Technology)
|
||||
|
||||
The Kubernetes project runs a community-owned image registry called `registry.k8s.io` to host its container images. On the 3rd of April 2023, the old registry `k8s.gcr.io` will be frozen and no further images for Kubernetes and related subprojects will be pushed to the old registry.
|
||||
|
||||
The new registry, `registry.k8s.io`, replaced the old one and has been generally available for several months. We have published a [blog post](/blog/2022/11/28/registry-k8s-io-faster-cheaper-ga/) about its benefits to the community and the Kubernetes project. That post also announced that future versions of Kubernetes will not be available in the old registry. Now that time has come.
|
||||
|
||||
What does this change mean for contributors:
|
||||
- If you are a maintainer of a subproject, you will need to update your manifests and Helm charts to use the new registry.
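  For instance, a rough sketch of a bulk update (the `manifests/` and `charts/` paths are hypothetical; review the resulting diff before committing):

  ```shell
  # Find files that still reference the old registry and rewrite them in place.
  grep -rl 'k8s.gcr.io' manifests/ charts/ | xargs -r sed -i 's|k8s.gcr.io|registry.k8s.io|g'
  ```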
|
||||
|
||||
What does this change mean for end users:
|
||||
- The Kubernetes 1.27 release will not be published to the old registry.
|
||||
- Patch releases for 1.24, 1.25, and 1.26 will no longer be published to the old registry from April. Please read the timelines below for details of the final patch releases in the old registry.
|
||||
- Starting in 1.25, the default image registry has been set to `registry.k8s.io`. This value is overridable in `kubeadm` and the `kubelet`, but setting it to `k8s.gcr.io` will fail for new releases after April, as they won't be present in the old registry (see the sketch after this list for one way to override the registry that `kubeadm` uses).
|
||||
- If you want to increase the reliability of your cluster and remove your dependency on the community-owned registry, or if you are running Kubernetes in networks where external traffic is restricted, you should consider hosting a local image registry mirror. Some cloud vendors may offer hosted solutions for this.
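One way to override the registry for `kubeadm`-managed control-plane images is a `ClusterConfiguration` sketch like the following (the file name is only an example):

```yaml
# kubeadm-config.yaml (sketch): pass it with `kubeadm init --config kubeadm-config.yaml`
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
imageRepository: registry.k8s.io
```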
|
||||
|
||||
## Timeline of the changes
|
||||
|
||||
- `k8s.gcr.io` will be frozen on the 3rd of April 2023
|
||||
- 1.27 is expected to be released on the 12th of April 2023
|
||||
- The last 1.23 release on `k8s.gcr.io` will be 1.23.18 (1.23 goes end-of-life before the freeze)
|
||||
- The last 1.24 release on `k8s.gcr.io` will be 1.24.12
|
||||
- The last 1.25 release on `k8s.gcr.io` will be 1.25.8
|
||||
- The last 1.26 release on `k8s.gcr.io` will be 1.26.3
|
||||
|
||||
## What's next
|
||||
|
||||
Please make sure your cluster does not have any dependencies on the old image registry. For example, you can run this command to list the images used by pods:
|
||||
|
||||
|
||||
```shell
|
||||
kubectl get pods --all-namespaces -o jsonpath="{.items[*].spec.containers[*].image}" |\
|
||||
tr -s '[[:space:]]' '\n' |\
|
||||
sort |\
|
||||
uniq -c
|
||||
```
|
||||
|
||||
There may be other dependencies on the old image registry. Make sure you review any potential dependencies to keep your cluster healthy and up to date.
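For example, the command above only inspects regular containers; init containers pull images too, so a similar check for them is worthwhile (a sketch):

```shell
# List init container images across all namespaces.
kubectl get pods --all-namespaces -o jsonpath="{.items[*].spec.initContainers[*].image}" |\
tr -s '[[:space:]]' '\n' |\
sort |\
uniq -c
```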
|
||||
|
||||
## Acknowledgments
|
||||
|
||||
__Change is hard__, and evolving our image-serving platform is needed to ensure a sustainable future for the project. We strive to make things better for everyone using Kubernetes. Many contributors from all corners of our community have been working long and hard to ensure we are making the best decisions possible, executing plans, and doing our best to communicate those plans.
|
||||
|
||||
Thanks to Aaron Crickenberger, Arnaud Meukam, Benjamin Elder, Caleb Woodbine, Davanum Srinivas, Mahamed Ali, and Tim Hockin from SIG K8s Infra, Brian McQueen, and Sergey Kanzhelev from SIG Node, Lubomir Ivanov from SIG Cluster Lifecycle, Adolfo García Veytia, Jeremy Rickard, Sascha Grunert, and Stephen Augustus from SIG Release, Bob Killen and Kaslin Fields from SIG Contribex, Tim Allclair from the Security Response Committee. Also a big thank you to our friends acting as liaisons with our cloud provider partners: Jay Pipes from Amazon and Jon Johnson Jr. from Google.
|
|
@ -0,0 +1,36 @@
|
|||
---
|
||||
layout: blog
|
||||
title: "Free Katacoda Kubernetes Tutorials Are Shutting Down"
|
||||
date: 2023-02-14
|
||||
slug: kubernetes-katacoda-tutorials-stop-from-2023-03-31
|
||||
evergreen: true
|
||||
---
|
||||
|
||||
**Author**: Natali Vlatko, SIG Docs Co-Chair for Kubernetes
|
||||
|
||||
[Katacoda](https://katacoda.com/kubernetes), the popular learning platform from O’Reilly that has been helping people learn all about
|
||||
Java, Docker, Kubernetes, Python, Go, C++, and more, [shut down for public use in June 2022](https://www.oreilly.com/online-learning/leveraging-katacoda-technology.html).
|
||||
However, tutorials specifically for Kubernetes, linked from the Kubernetes website for our project’s
|
||||
users and contributors, remained available and active after this change. Unfortunately, this will no
|
||||
longer be the case, and Katacoda tutorials for learning Kubernetes will cease working after March 31st, 2023.
|
||||
|
||||
The Kubernetes Project wishes to thank O'Reilly Media for the many years it has supported the community
|
||||
via the Katacoda learning platform. You can read more about [the decision to shutter katacoda.com](https://www.oreilly.com/online-learning/leveraging-katacoda-technology.html)
|
||||
on O'Reilly's own site. With this change, we’ll be focusing on the work needed to remove links to
|
||||
their various tutorials. We have a general issue tracking this topic at [#33936](https://github.com/kubernetes/website/issues/33936) and a [GitHub discussion](https://github.com/kubernetes/website/discussions/38878). We're also
|
||||
interested in researching what other learning platforms could be beneficial for the Kubernetes community,
|
||||
replacing Katacoda with a link to a platform or service that has a similar user experience. However,
|
||||
this research will take time, so we’re actively looking for volunteers to help with this work.
|
||||
If a replacement is found, it will need to be supported by Kubernetes leadership, specifically,
|
||||
SIG Contributor Experience, SIG Docs, and the Kubernetes Steering Committee.
|
||||
|
||||
The Katacoda shutdown affects 25 tutorial pages and their localizations, as well as the Katacoda
|
||||
Scenario repository: [github.com/katacoda-scenarios/kubernetes-bootcamp-scenarios](https://github.com/katacoda-scenarios/kubernetes-bootcamp-scenarios). We recommend
|
||||
that any links, guides, or documentation you have that points to the Katacoda learning platform be
|
||||
updated immediately to reflect this change. While we have yet to find a replacement learning solution,
|
||||
the Kubernetes website contains a lot of helpful documentation to support your continued learning and growth.
|
||||
You can find all of our available documentation tutorials for Kubernetes at https://k8s.io/docs/tutorials/.
|
||||
|
||||
If you have any questions regarding the Katacoda shutdown, or subsequent link removal from Kubernetes
|
||||
tutorial pages, please feel free to comment on the [general issue tracking the shutdown](https://github.com/kubernetes/website/issues/33936),
|
||||
or visit the #sig-docs channel on the Kubernetes Slack.
|
|
@ -103,6 +103,8 @@ updated to newer versions that support cgroup v2. For example:
|
|||
* If you run [cAdvisor](https://github.com/google/cadvisor) as a stand-alone
|
||||
DaemonSet for monitoring pods and containers, update it to v0.43.0 or later.
|
||||
* If you use JDK, prefer to use JDK 11.0.16 and later or JDK 15 and later, which [fully support cgroup v2](https://bugs.openjdk.org/browse/JDK-8230305).
|
||||
* If you are using the [uber-go/automaxprocs](https://github.com/uber-go/automaxprocs) package, make sure
|
||||
the version you use is v1.5.1 or higher.
|
||||
|
||||
## Identify the cgroup version on Linux Nodes {#check-cgroup-version}
|
||||
|
||||
|
|
|
@ -144,7 +144,7 @@ which you can define:
|
|||
|
||||
* `MinAge`: the minimum age at which the kubelet can garbage collect a
|
||||
container. Disable by setting to `0`.
|
||||
* `MaxPerPodContainer`: the maximum number of dead containers each Pod pair
|
||||
* `MaxPerPodContainer`: the maximum number of dead containers each Pod
|
||||
can have. Disable by setting to less than `0`.
|
||||
* `MaxContainers`: the maximum number of dead containers the cluster can have.
|
||||
Disable by setting to less than `0`.
|
||||
|
|
|
@ -6,30 +6,32 @@ weight: 30
|
|||
|
||||
<!-- overview -->
|
||||
|
||||
Distributed systems often have a need for "leases", which provides a mechanism to lock shared resources and coordinate activity between nodes.
|
||||
In Kubernetes, the "lease" concept is represented by `Lease` objects in the `coordination.k8s.io` API group, which are used for system-critical
|
||||
capabilities like node heart beats and component-level leader election.
|
||||
Distributed systems often have a need for _leases_, which provide a mechanism to lock shared resources
|
||||
and coordinate activity between members of a set.
|
||||
In Kubernetes, the lease concept is represented by [Lease](/docs/reference/kubernetes-api/cluster-resources/lease-v1/)
|
||||
objects in the `coordination.k8s.io` {{< glossary_tooltip text="API Group" term_id="api-group" >}},
|
||||
which are used for system-critical capabilities such as node heartbeats and component-level leader election.
|
||||
|
||||
<!-- body -->
|
||||
|
||||
## Node Heart Beats
|
||||
## Node heartbeats {#node-heart-beats}
|
||||
|
||||
Kubernetes uses the Lease API to communicate kubelet node heart beats to the Kubernetes API server.
|
||||
Kubernetes uses the Lease API to communicate kubelet node heartbeats to the Kubernetes API server.
|
||||
For every `Node`, there is a `Lease` object with a matching name in the `kube-node-lease`
|
||||
namespace. Under the hood, every kubelet heart beat is an UPDATE request to this `Lease` object, updating
|
||||
namespace. Under the hood, every kubelet heartbeat is an **update** request to this `Lease` object, updating
|
||||
the `spec.renewTime` field for the Lease. The Kubernetes control plane uses the time stamp of this field
|
||||
to determine the availability of this `Node`.
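As an illustration (a sketch; `my-node` is a placeholder for one of your node names), you can watch a node's heartbeat by inspecting its Lease:

```shell
# The spec.renewTime field advances with every kubelet heartbeat.
kubectl -n kube-node-lease get lease my-node -o yaml
```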
|
||||
|
||||
See [Node Lease objects](/docs/concepts/architecture/nodes/#heartbeats) for more details.
|
||||
|
||||
## Leader Election
|
||||
## Leader election
|
||||
|
||||
Leases are also used in Kubernetes to ensure only one instance of a component is running at any given time.
|
||||
Kubernetes also uses Leases to ensure only one instance of a component is running at any given time.
|
||||
This is used by control plane components like `kube-controller-manager` and `kube-scheduler` in
|
||||
HA configurations, where only one instance of the component should be actively running while the other
|
||||
instances are on stand-by.
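For instance (a sketch, assuming the default lock object names), you can see which instance currently holds leadership:

```shell
# The HOLDER column identifies the currently elected instance of each component.
kubectl -n kube-system get lease kube-controller-manager kube-scheduler
```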
|
||||
|
||||
## API Server Identity
|
||||
## API server identity
|
||||
|
||||
{{< feature-state for_k8s_version="v1.26" state="beta" >}}
|
||||
|
||||
|
@ -43,22 +45,23 @@ You can inspect Leases owned by each kube-apiserver by checking for lease object
|
|||
with the name `kube-apiserver-<sha256-hash>`. Alternatively you can use the label selector `k8s.io/component=kube-apiserver`:
|
||||
|
||||
```shell
|
||||
$ kubectl -n kube-system get lease -l k8s.io/component=kube-apiserver
|
||||
kubectl -n kube-system get lease -l k8s.io/component=kube-apiserver
|
||||
```
|
||||
```
|
||||
NAME HOLDER AGE
|
||||
kube-apiserver-c4vwjftbvpc5os2vvzle4qg27a kube-apiserver-c4vwjftbvpc5os2vvzle4qg27a_9cbf54e5-1136-44bd-8f9a-1dcd15c346b4 5m33s
|
||||
kube-apiserver-dz2dqprdpsgnm756t5rnov7yka kube-apiserver-dz2dqprdpsgnm756t5rnov7yka_84f2a85d-37c1-4b14-b6b9-603e62e4896f 4m23s
|
||||
kube-apiserver-fyloo45sdenffw2ugwaz3likua kube-apiserver-fyloo45sdenffw2ugwaz3likua_c5ffa286-8a9a-45d4-91e7-61118ed58d2e 4m43s
|
||||
```
|
||||
|
||||
The SHA256 hash used in the lease name is based on the OS hostname as seen by kube-apiserver. Each kube-apiserver should be
|
||||
The SHA256 hash used in the lease name is based on the OS hostname as seen by that API server. Each kube-apiserver should be
|
||||
configured to use a hostname that is unique within the cluster. New instances of kube-apiserver that use the same hostname
|
||||
will take over existing Leases using a new holder identity, as opposed to instantiating new lease objects. You can check the
|
||||
will take over existing Leases using a new holder identity, as opposed to instantiating new Lease objects. You can check the
|
||||
hostname used by kube-apiserver by checking the value of the `kubernetes.io/hostname` label:
|
||||
|
||||
```shell
|
||||
$ kubectl -n kube-system get lease kube-apiserver-c4vwjftbvpc5os2vvzle4qg27a -o yaml
|
||||
kubectl -n kube-system get lease kube-apiserver-c4vwjftbvpc5os2vvzle4qg27a -o yaml
|
||||
```
|
||||
|
||||
```yaml
|
||||
apiVersion: coordination.k8s.io/v1
|
||||
kind: Lease
|
||||
|
@ -78,3 +81,23 @@ spec:
|
|||
```
|
||||
|
||||
Expired leases from kube-apiservers that no longer exist are garbage collected by new kube-apiservers after 1 hour.
|
||||
|
||||
You can disable API server identity leases by disabling the `APIServerIdentity`
|
||||
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/).
|
||||
|
||||
## Workloads {#custom-workload}
|
||||
|
||||
Your own workload can define its own use of Leases. For example, you might run a custom
|
||||
{{< glossary_tooltip term_id="controller" text="controller" >}} where a primary or leader member
|
||||
performs operations that its peers do not. You define a Lease so that the controller replicas can select
|
||||
or elect a leader, using the Kubernetes API for coordination.
|
||||
If you do use a Lease, it's a good practice to define a name for the Lease that is obviously linked to
|
||||
the product or component. For example, if you have a component named Example Foo, use a Lease named
|
||||
`example-foo`.
|
||||
|
||||
If a cluster operator or another end user could deploy multiple instances of a component, select a name
|
||||
prefix and pick a mechanism (such as a hash of the name of the Deployment) to avoid name collisions
|
||||
for the Leases.
|
||||
|
||||
You can use another approach so long as it achieves the same outcome: different software products do
|
||||
not conflict with one another.
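A minimal sketch of such a Lease, using hypothetical names (`example-foo` for the component, a Pod-name-derived holder identity):

```yaml
apiVersion: coordination.k8s.io/v1
kind: Lease
metadata:
  name: example-foo          # a name obviously linked to the component
  namespace: example-system  # hypothetical namespace for the component
spec:
  holderIdentity: example-foo-6d4b8f7c9-abcde  # e.g. the current leader Pod's name
  leaseDurationSeconds: 15
  renewTime: "2023-02-14T10:00:00.000000Z"
```

Leader-election helpers in client libraries (for example, `k8s.io/client-go/tools/leaderelection`) manage fields such as `holderIdentity` and `renewTime` for you.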
|
||||
|
|
|
@ -9,7 +9,7 @@ weight: 10
|
|||
|
||||
<!-- overview -->
|
||||
|
||||
Kubernetes runs your workload by placing containers into Pods to run on _Nodes_.
|
||||
Kubernetes runs your {{< glossary_tooltip text="workload" term_id="workload" >}} by placing containers into Pods to run on _Nodes_.
|
||||
A node may be a virtual or physical machine, depending on the cluster. Each node
|
||||
is managed by the
|
||||
{{< glossary_tooltip text="control plane" term_id="control-plane" >}}
|
||||
|
@ -274,7 +274,7 @@ availability of each node, and to take action when failures are detected.
|
|||
For nodes there are two forms of heartbeats:
|
||||
|
||||
* updates to the `.status` of a Node
|
||||
* [Lease](/docs/reference/kubernetes-api/cluster-resources/lease-v1/) objects
|
||||
* [Lease](/docs/concepts/architecture/leases/) objects
|
||||
within the `kube-node-lease`
|
||||
{{< glossary_tooltip term_id="namespace" text="namespace">}}.
|
||||
Each Node has an associated Lease object.
|
||||
|
@ -563,7 +563,7 @@ ShutdownGracePeriodCriticalPods are not configured properly. Please refer to abo
|
|||
section [Graceful Node Shutdown](#graceful-node-shutdown) for more details.
|
||||
|
||||
When a node is shut down but not detected by kubelet's Node Shutdown Manager, the pods
|
||||
that are part of a StatefulSet will be stuck in terminating status on
|
||||
that are part of a {{< glossary_tooltip text="StatefulSet" term_id="statefulset" >}} will be stuck in terminating status on
|
||||
the shutdown node and cannot move to a new running node. This is because kubelet on
|
||||
the shutdown node is not available to delete the pods so the StatefulSet cannot
|
||||
create a new pod with the same name. If there are volumes used by the pods, the
|
||||
|
@ -577,7 +577,7 @@ these pods will be stuck in terminating status on the shutdown node forever.
|
|||
To mitigate the above situation, a user can manually add the taint `node.kubernetes.io/out-of-service` with either `NoExecute`
|
||||
or `NoSchedule` effect to a Node marking it out-of-service.
|
||||
If the `NodeOutOfServiceVolumeDetach` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
|
||||
is enabled on `kube-controller-manager`, and a Node is marked out-of-service with this taint, the
|
||||
is enabled on {{< glossary_tooltip text="kube-controller-manager" term_id="kube-controller-manager" >}}, and a Node is marked out-of-service with this taint, the
|
||||
pods on the node will be forcefully deleted if there are no matching tolerations on it and volume
|
||||
detach operations for the pods terminating on the node will happen immediately. This allows the
|
||||
Pods on the out-of-service node to recover quickly on a different node.
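For illustration, marking a node out-of-service might look like this (a sketch; the node name is a placeholder and the taint value is arbitrary):

```shell
kubectl taint nodes my-node node.kubernetes.io/out-of-service=nodeshutdown:NoExecute
```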
|
||||
|
@ -646,9 +646,11 @@ see [KEP-2400](https://github.com/kubernetes/enhancements/issues/2400) and its
|
|||
|
||||
## {{% heading "whatsnext" %}}
|
||||
|
||||
* Learn about the [components](/docs/concepts/overview/components/#node-components) that make up a node.
|
||||
* Read the [API definition for Node](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#node-v1-core).
|
||||
* Read the [Node](https://git.k8s.io/design-proposals-archive/architecture/architecture.md#the-kubernetes-node)
|
||||
section of the architecture design document.
|
||||
* Read about [taints and tolerations](/docs/concepts/scheduling-eviction/taint-and-toleration/).
|
||||
Learn more about the following:
|
||||
* [Components](/docs/concepts/overview/components/#node-components) that make up a node.
|
||||
* [API definition for Node](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#node-v1-core).
|
||||
* [Node](https://git.k8s.io/design-proposals-archive/architecture/architecture.md#the-kubernetes-node) section of the architecture design document.
|
||||
* [Taints and Tolerations](/docs/concepts/scheduling-eviction/taint-and-toleration/).
|
||||
* [Node Resource Managers](/docs/concepts/policy/node-resource-managers/).
|
||||
* [Resource Management for Windows nodes](/docs/concepts/configuration/windows-resource-management/).
|
||||
|
||||
|
|
|
@ -17,11 +17,11 @@ This page lists some of the available add-ons and links to their respective inst
|
|||
## Networking and Network Policy
|
||||
|
||||
* [ACI](https://www.github.com/noironetworks/aci-containers) provides integrated container networking and network security with Cisco ACI.
|
||||
* [Antrea](https://antrea.io/) operates at Layer 3/4 to provide networking and security services for Kubernetes, leveraging Open vSwitch as the networking data plane.
|
||||
* [Antrea](https://antrea.io/) operates at Layer 3/4 to provide networking and security services for Kubernetes, leveraging Open vSwitch as the networking data plane. Antrea is a [CNCF project at the Sandbox level](https://www.cncf.io/projects/antrea/).
|
||||
* [Calico](https://docs.projectcalico.org/latest/introduction/) is a networking and network policy provider. Calico supports a flexible set of networking options so you can choose the most efficient option for your situation, including non-overlay and overlay networks, with or without BGP. Calico uses the same engine to enforce network policy for hosts, pods, and (if using Istio & Envoy) applications at the service mesh layer.
|
||||
* [Canal](https://projectcalico.docs.tigera.io/getting-started/kubernetes/flannel/flannel) unites Flannel and Calico, providing networking and network policy.
|
||||
* [Cilium](https://github.com/cilium/cilium) is a networking, observability, and security solution with an eBPF-based data plane. Cilium provides a simple flat Layer 3 network with the ability to span multiple clusters in either a native routing or overlay/encapsulation mode, and can enforce network policies on L3-L7 using an identity-based security model that is decoupled from network addressing. Cilium can act as a replacement for kube-proxy; it also offers additional, opt-in observability and security features.
|
||||
* [CNI-Genie](https://github.com/cni-genie/CNI-Genie) enables Kubernetes to seamlessly connect to a choice of CNI plugins, such as Calico, Canal, Flannel, or Weave.
|
||||
* [Cilium](https://github.com/cilium/cilium) is a networking, observability, and security solution with an eBPF-based data plane. Cilium provides a simple flat Layer 3 network with the ability to span multiple clusters in either a native routing or overlay/encapsulation mode, and can enforce network policies on L3-L7 using an identity-based security model that is decoupled from network addressing. Cilium can act as a replacement for kube-proxy; it also offers additional, opt-in observability and security features. Cilium is a [CNCF project at the Incubation level](https://www.cncf.io/projects/cilium/).
|
||||
* [CNI-Genie](https://github.com/cni-genie/CNI-Genie) enables Kubernetes to seamlessly connect to a choice of CNI plugins, such as Calico, Canal, Flannel, or Weave. CNI-Genie is a [CNCF project at the Sandbox level](https://www.cncf.io/projects/cni-genie/).
|
||||
* [Contiv](https://contivpp.io/) provides configurable networking (native L3 using BGP, overlay using vxlan, classic L2, and Cisco-SDN/ACI) for various use cases and a rich policy framework. Contiv project is fully [open sourced](https://github.com/contiv). The [installer](https://github.com/contiv/install) provides both kubeadm and non-kubeadm based installation options.
|
||||
* [Contrail](https://www.juniper.net/us/en/products-services/sdn/contrail/contrail-networking/), based on [Tungsten Fabric](https://tungsten.io), is an open source, multi-cloud network virtualization and policy management platform. Contrail and Tungsten Fabric are integrated with orchestration systems such as Kubernetes, OpenShift, OpenStack and Mesos, and provide isolation modes for virtual machines, containers/pods and bare metal workloads.
|
||||
* [Flannel](https://github.com/flannel-io/flannel#deploying-flannel-manually) is an overlay network provider that can be used with Kubernetes.
|
||||
|
|
|
@ -638,6 +638,10 @@ poorly-behaved workloads that may be harming system health.
|
|||
standard deviation of seat demand seen during the last concurrency
|
||||
borrowing adjustment period.
|
||||
|
||||
* `apiserver_flowcontrol_demand_seats_smoothed` is a gauge vector
|
||||
holding, for each priority level, the smoothed enveloped seat demand
|
||||
determined at the last concurrency adjustment.
|
||||
|
||||
* `apiserver_flowcontrol_target_seats` is a gauge vector holding, for
|
||||
each priority level, the concurrency target going into the borrowing
|
||||
allocation problem.
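For instance (a sketch, assuming your credentials can read the metrics endpoint), you can fetch one of these gauges directly from the API server:

```shell
# Show the current concurrency target for each priority level.
kubectl get --raw /metrics | grep '^apiserver_flowcontrol_target_seats'
```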
|
||||
|
@ -701,14 +705,15 @@ serves the following additional paths at its HTTP[S] ports.
|
|||
The output is similar to this:
|
||||
|
||||
```none
|
||||
PriorityLevelName, ActiveQueues, IsIdle, IsQuiescing, WaitingRequests, ExecutingRequests,
|
||||
workload-low, 0, true, false, 0, 0,
|
||||
global-default, 0, true, false, 0, 0,
|
||||
exempt, <none>, <none>, <none>, <none>, <none>,
|
||||
catch-all, 0, true, false, 0, 0,
|
||||
system, 0, true, false, 0, 0,
|
||||
leader-election, 0, true, false, 0, 0,
|
||||
workload-high, 0, true, false, 0, 0,
|
||||
PriorityLevelName, ActiveQueues, IsIdle, IsQuiescing, WaitingRequests, ExecutingRequests, DispatchedRequests, RejectedRequests, TimedoutRequests, CancelledRequests
|
||||
catch-all, 0, true, false, 0, 0, 1, 0, 0, 0
|
||||
exempt, <none>, <none>, <none>, <none>, <none>, <none>, <none>, <none>, <none>
|
||||
global-default, 0, true, false, 0, 0, 46, 0, 0, 0
|
||||
leader-election, 0, true, false, 0, 0, 4, 0, 0, 0
|
||||
node-high, 0, true, false, 0, 0, 34, 0, 0, 0
|
||||
system, 0, true, false, 0, 0, 48, 0, 0, 0
|
||||
workload-high, 0, true, false, 0, 0, 500, 0, 0, 0
|
||||
workload-low, 0, true, false, 0, 0, 0, 0, 0, 0
|
||||
```
|
||||
|
||||
- `/debug/api_priority_and_fairness/dump_queues` - a listing of all the
|
||||
|
@ -761,7 +766,34 @@ serves the following additional paths at its HTTP[S] ports.
|
|||
system, system-nodes, 12, 0, system:node:127.0.0.1, 2020-07-23T15:31:03.583823404Z, system:node:127.0.0.1, create, /api/v1/namespaces/scaletest/configmaps,
|
||||
system, system-nodes, 12, 1, system:node:127.0.0.1, 2020-07-23T15:31:03.594555947Z, system:node:127.0.0.1, create, /api/v1/namespaces/scaletest/configmaps,
|
||||
```
|
||||
|
||||
|
||||
### Debug logging
|
||||
|
||||
At `-v=3` or higher verbosity, the server outputs an httplog line for every
|
||||
request, and it includes the following attributes.
|
||||
|
||||
- `apf_fs`: the name of the flow schema to which the request was classified.
|
||||
- `apf_pl`: the name of the priority level for that flow schema.
|
||||
- `apf_iseats`: the number of seats determined for the initial
|
||||
(normal) stage of execution of the request.
|
||||
- `apf_fseats`: the number of seats determined for the final stage of
|
||||
execution (accounting for the associated WATCH notifications) of the
|
||||
request.
|
||||
- `apf_additionalLatency`: the duration of the final stage of
|
||||
execution of the request.
|
||||
|
||||
At higher levels of verbosity there will be log lines exposing details
|
||||
of how APF handled the request, primarily for debug purposes.
|
||||
|
||||
### Response headers
|
||||
|
||||
APF adds the following two headers to each HTTP response message.
|
||||
|
||||
- `X-Kubernetes-PF-FlowSchema-UID` holds the UID of the FlowSchema
|
||||
object to which the corresponding request was classified.
|
||||
- `X-Kubernetes-PF-PriorityLevel-UID` holds the UID of the
|
||||
PriorityLevelConfiguration object associated with that FlowSchema.
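One way to observe them (a sketch; any read request will do) is to raise kubectl's verbosity so that response headers are logged:

```shell
# At -v=8 kubectl prints response headers, including the APF ones.
kubectl get namespaces -v=8 2>&1 | grep -i 'x-kubernetes-pf-'
```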
|
||||
|
||||
## {{% heading "whatsnext" %}}
|
||||
|
||||
|
||||
|
|
|
@ -1,20 +1,26 @@
|
|||
---
|
||||
reviewers:
|
||||
- janetkuo
|
||||
title: Managing Resources
|
||||
content_type: concept
|
||||
reviewers:
|
||||
- janetkuo
|
||||
weight: 40
|
||||
---
|
||||
|
||||
<!-- overview -->
|
||||
|
||||
You've deployed your application and exposed it via a service. Now what? Kubernetes provides a number of tools to help you manage your application deployment, including scaling and updating. Among the features that we will discuss in more depth are [configuration files](/docs/concepts/configuration/overview/) and [labels](/docs/concepts/overview/working-with-objects/labels/).
|
||||
You've deployed your application and exposed it via a service. Now what? Kubernetes provides a
|
||||
number of tools to help you manage your application deployment, including scaling and updating.
|
||||
Among the features that we will discuss in more depth are
|
||||
[configuration files](/docs/concepts/configuration/overview/) and
|
||||
[labels](/docs/concepts/overview/working-with-objects/labels/).
|
||||
|
||||
<!-- body -->
|
||||
|
||||
## Organizing resource configurations
|
||||
|
||||
Many applications require multiple resources to be created, such as a Deployment and a Service. Management of multiple resources can be simplified by grouping them together in the same file (separated by `---` in YAML). For example:
|
||||
Many applications require multiple resources to be created, such as a Deployment and a Service.
|
||||
Management of multiple resources can be simplified by grouping them together in the same file
|
||||
(separated by `---` in YAML). For example:
|
||||
|
||||
{{< codenew file="application/nginx-app.yaml" >}}
|
||||
|
||||
|
@ -24,81 +30,99 @@ Multiple resources can be created the same way as a single resource:
|
|||
kubectl apply -f https://k8s.io/examples/application/nginx-app.yaml
|
||||
```
|
||||
|
||||
```shell
|
||||
```none
|
||||
service/my-nginx-svc created
|
||||
deployment.apps/my-nginx created
|
||||
```
|
||||
|
||||
The resources will be created in the order they appear in the file. Therefore, it's best to specify the service first, since that will ensure the scheduler can spread the pods associated with the service as they are created by the controller(s), such as Deployment.
|
||||
The resources will be created in the order they appear in the file. Therefore, it's best to
|
||||
specify the service first, since that will ensure the scheduler can spread the pods associated
|
||||
with the service as they are created by the controller(s), such as Deployment.
|
||||
|
||||
`kubectl apply` also accepts multiple `-f` arguments:
|
||||
|
||||
```shell
|
||||
kubectl apply -f https://k8s.io/examples/application/nginx/nginx-svc.yaml -f https://k8s.io/examples/application/nginx/nginx-deployment.yaml
|
||||
kubectl apply -f https://k8s.io/examples/application/nginx/nginx-svc.yaml \
|
||||
-f https://k8s.io/examples/application/nginx/nginx-deployment.yaml
|
||||
```
|
||||
|
||||
It is a recommended practice to put resources related to the same microservice or application tier into the same file, and to group all of the files associated with your application in the same directory. If the tiers of your application bind to each other using DNS, you can deploy all of the components of your stack together.
|
||||
|
||||
A URL can also be specified as a configuration source, which is handy for deploying directly from configuration files checked into GitHub:
|
||||
It is a recommended practice to put resources related to the same microservice or application tier
|
||||
into the same file, and to group all of the files associated with your application in the same
|
||||
directory. If the tiers of your application bind to each other using DNS, you can deploy all of
|
||||
the components of your stack together.
|
||||
|
||||
A URL can also be specified as a configuration source, which is handy for deploying directly from
|
||||
configuration files checked into GitHub:
|
||||
|
||||
```shell
|
||||
kubectl apply -f https://raw.githubusercontent.com/kubernetes/website/main/content/en/examples/application/nginx/nginx-deployment.yaml
|
||||
kubectl apply -f https://k8s.io/examples/application/nginx/nginx-deployment.yaml
|
||||
```
|
||||
|
||||
```shell
|
||||
```none
|
||||
deployment.apps/my-nginx created
|
||||
```
|
||||
|
||||
## Bulk operations in kubectl
|
||||
|
||||
Resource creation isn't the only operation that `kubectl` can perform in bulk. It can also extract resource names from configuration files in order to perform other operations, in particular to delete the same resources you created:
|
||||
Resource creation isn't the only operation that `kubectl` can perform in bulk. It can also extract
|
||||
resource names from configuration files in order to perform other operations, in particular to
|
||||
delete the same resources you created:
|
||||
|
||||
```shell
|
||||
kubectl delete -f https://k8s.io/examples/application/nginx-app.yaml
|
||||
```
|
||||
|
||||
```shell
|
||||
```none
|
||||
deployment.apps "my-nginx" deleted
|
||||
service "my-nginx-svc" deleted
|
||||
```
|
||||
|
||||
In the case of two resources, you can specify both resources on the command line using the resource/name syntax:
|
||||
In the case of two resources, you can specify both resources on the command line using the
|
||||
resource/name syntax:
|
||||
|
||||
```shell
|
||||
kubectl delete deployments/my-nginx services/my-nginx-svc
|
||||
```
|
||||
|
||||
For larger numbers of resources, you'll find it easier to specify the selector (label query) specified using `-l` or `--selector`, to filter resources by their labels:
|
||||
For larger numbers of resources, you'll find it easier to specify the selector (label query)
|
||||
specified using `-l` or `--selector`, to filter resources by their labels:
|
||||
|
||||
```shell
|
||||
kubectl delete deployment,services -l app=nginx
|
||||
```
|
||||
|
||||
```shell
|
||||
```none
|
||||
deployment.apps "my-nginx" deleted
|
||||
service "my-nginx-svc" deleted
|
||||
```
|
||||
|
||||
Because `kubectl` outputs resource names in the same syntax it accepts, you can chain operations using `$()` or `xargs`:
|
||||
Because `kubectl` outputs resource names in the same syntax it accepts, you can chain operations
|
||||
using `$()` or `xargs`:
|
||||
|
||||
```shell
|
||||
kubectl get $(kubectl create -f docs/concepts/cluster-administration/nginx/ -o name | grep service)
|
||||
kubectl create -f docs/concepts/cluster-administration/nginx/ -o name | grep service | xargs -i kubectl get {}
|
||||
```
|
||||
|
||||
```shell
|
||||
```none
|
||||
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
|
||||
my-nginx-svc LoadBalancer 10.0.0.208 <pending> 80/TCP 0s
|
||||
```
|
||||
|
||||
With the above commands, we first create resources under `examples/application/nginx/` and print the resources created with `-o name` output format
|
||||
(print each resource as resource/name). Then we `grep` only the "service", and then print it with `kubectl get`.
|
||||
With the above commands, we first create resources under `examples/application/nginx/` and print
|
||||
the resources created with `-o name` output format (print each resource as resource/name).
|
||||
Then we `grep` only the "service", and then print it with `kubectl get`.
|
||||
|
||||
If you happen to organize your resources across several subdirectories within a particular directory, you can recursively perform the operations on the subdirectories also, by specifying `--recursive` or `-R` alongside the `--filename,-f` flag.
|
||||
If you happen to organize your resources across several subdirectories within a particular
|
||||
directory, you can recursively perform the operations on the subdirectories also, by specifying
|
||||
`--recursive` or `-R` alongside the `--filename,-f` flag.
|
||||
|
||||
For instance, assume there is a directory `project/k8s/development` that holds all of the {{< glossary_tooltip text="manifests" term_id="manifest" >}} needed for the development environment, organized by resource type:
|
||||
For instance, assume there is a directory `project/k8s/development` that holds all of the
|
||||
{{< glossary_tooltip text="manifests" term_id="manifest" >}} needed for the development environment,
|
||||
organized by resource type:
|
||||
|
||||
```
|
||||
```none
|
||||
project/k8s/development
|
||||
├── configmap
|
||||
│ └── my-configmap.yaml
|
||||
|
@ -108,13 +132,15 @@ project/k8s/development
|
|||
└── my-pvc.yaml
|
||||
```
|
||||
|
||||
By default, performing a bulk operation on `project/k8s/development` will stop at the first level of the directory, not processing any subdirectories. If we had tried to create the resources in this directory using the following command, we would have encountered an error:
|
||||
By default, performing a bulk operation on `project/k8s/development` will stop at the first level
|
||||
of the directory, not processing any subdirectories. If we had tried to create the resources in
|
||||
this directory using the following command, we would have encountered an error:
|
||||
|
||||
```shell
|
||||
kubectl apply -f project/k8s/development
|
||||
```
|
||||
|
||||
```shell
|
||||
```none
|
||||
error: you must provide one or more resources by argument or filename (.json|.yaml|.yml|stdin)
|
||||
```
|
||||
|
||||
|
@ -124,13 +150,14 @@ Instead, specify the `--recursive` or `-R` flag with the `--filename,-f` flag as
|
|||
kubectl apply -f project/k8s/development --recursive
|
||||
```
|
||||
|
||||
```shell
|
||||
```none
|
||||
configmap/my-config created
|
||||
deployment.apps/my-deployment created
|
||||
persistentvolumeclaim/my-pvc created
|
||||
```
|
||||
|
||||
The `--recursive` flag works with any operation that accepts the `--filename,-f` flag such as: `kubectl {create,get,delete,describe,rollout}` etc.
|
||||
The `--recursive` flag works with any operation that accepts the `--filename,-f` flag such as:
|
||||
`kubectl {create,get,delete,describe,rollout}` etc.
|
||||
|
||||
The `--recursive` flag also works when multiple `-f` arguments are provided:
|
||||
|
||||
|
@ -138,7 +165,7 @@ The `--recursive` flag also works when multiple `-f` arguments are provided:
|
|||
kubectl apply -f project/k8s/namespaces -f project/k8s/development --recursive
|
||||
```
|
||||
|
||||
```shell
|
||||
```none
|
||||
namespace/development created
|
||||
namespace/staging created
|
||||
configmap/my-config created
|
||||
|
@ -146,36 +173,41 @@ deployment.apps/my-deployment created
|
|||
persistentvolumeclaim/my-pvc created
|
||||
```
|
||||
|
||||
If you're interested in learning more about `kubectl`, go ahead and read [Command line tool (kubectl)](/docs/reference/kubectl/).
|
||||
If you're interested in learning more about `kubectl`, go ahead and read
|
||||
[Command line tool (kubectl)](/docs/reference/kubectl/).
|
||||
|
||||
## Using labels effectively
|
||||
|
||||
The examples we've used so far apply at most a single label to any resource. There are many scenarios where multiple labels should be used to distinguish sets from one another.
|
||||
The examples we've used so far apply at most a single label to any resource. There are many
|
||||
scenarios where multiple labels should be used to distinguish sets from one another.
|
||||
|
||||
For instance, different applications would use different values for the `app` label, but a multi-tier application, such as the [guestbook example](https://github.com/kubernetes/examples/tree/master/guestbook/), would additionally need to distinguish each tier. The frontend could carry the following labels:
|
||||
For instance, different applications would use different values for the `app` label, but a
|
||||
multi-tier application, such as the [guestbook example](https://github.com/kubernetes/examples/tree/master/guestbook/),
|
||||
would additionally need to distinguish each tier. The frontend could carry the following labels:
|
||||
|
||||
```yaml
|
||||
labels:
|
||||
app: guestbook
|
||||
tier: frontend
|
||||
labels:
|
||||
app: guestbook
|
||||
tier: frontend
|
||||
```
|
||||
|
||||
while the Redis master and slave would have different `tier` labels, and perhaps even an additional `role` label:
|
||||
while the Redis master and slave would have different `tier` labels, and perhaps even an
|
||||
additional `role` label:
|
||||
|
||||
```yaml
|
||||
labels:
|
||||
app: guestbook
|
||||
tier: backend
|
||||
role: master
|
||||
labels:
|
||||
app: guestbook
|
||||
tier: backend
|
||||
role: master
|
||||
```
|
||||
|
||||
and
|
||||
|
||||
```yaml
|
||||
labels:
|
||||
app: guestbook
|
||||
tier: backend
|
||||
role: slave
|
||||
labels:
|
||||
app: guestbook
|
||||
tier: backend
|
||||
role: slave
|
||||
```
|
||||
|
||||
The labels allow us to slice and dice our resources along any dimension specified by a label:
|
||||
|
@ -185,7 +217,7 @@ kubectl apply -f examples/guestbook/all-in-one/guestbook-all-in-one.yaml
|
|||
kubectl get pods -Lapp -Ltier -Lrole
|
||||
```
|
||||
|
||||
```shell
|
||||
```none
|
||||
NAME READY STATUS RESTARTS AGE APP TIER ROLE
|
||||
guestbook-fe-4nlpb 1/1 Running 0 1m guestbook frontend <none>
|
||||
guestbook-fe-ght6d 1/1 Running 0 1m guestbook frontend <none>
|
||||
|
@ -200,7 +232,8 @@ my-nginx-o0ef1 1/1 Running 0 29m nginx
|
|||
```shell
|
||||
kubectl get pods -lapp=guestbook,role=slave
|
||||
```
|
||||
```shell
|
||||
|
||||
```none
|
||||
NAME READY STATUS RESTARTS AGE
|
||||
guestbook-redis-slave-2q2yf 1/1 Running 0 3m
|
||||
guestbook-redis-slave-qgazl 1/1 Running 0 3m
|
||||
|
@ -208,62 +241,72 @@ guestbook-redis-slave-qgazl 1/1 Running 0 3m
|
|||
|
||||
## Canary deployments
|
||||
|
||||
Another scenario where multiple labels are needed is to distinguish deployments of different releases or configurations of the same component. It is common practice to deploy a *canary* of a new application release (specified via image tag in the pod template) side by side with the previous release so that the new release can receive live production traffic before fully rolling it out.
|
||||
Another scenario where multiple labels are needed is to distinguish deployments of different
|
||||
releases or configurations of the same component. It is common practice to deploy a *canary* of a
|
||||
new application release (specified via image tag in the pod template) side by side with the
|
||||
previous release so that the new release can receive live production traffic before fully rolling
|
||||
it out.
|
||||
|
||||
For instance, you can use a `track` label to differentiate different releases.
|
||||
|
||||
The primary, stable release would have a `track` label with value as `stable`:
|
||||
|
||||
```yaml
|
||||
name: frontend
|
||||
replicas: 3
|
||||
...
|
||||
labels:
|
||||
app: guestbook
|
||||
tier: frontend
|
||||
track: stable
|
||||
...
|
||||
image: gb-frontend:v3
|
||||
```none
|
||||
name: frontend
|
||||
replicas: 3
|
||||
...
|
||||
labels:
|
||||
app: guestbook
|
||||
tier: frontend
|
||||
track: stable
|
||||
...
|
||||
image: gb-frontend:v3
|
||||
```
|
||||
|
||||
and then you can create a new release of the guestbook frontend that carries the `track` label with different value (i.e. `canary`), so that two sets of pods would not overlap:
|
||||
and then you can create a new release of the guestbook frontend that carries the `track` label
|
||||
with different value (i.e. `canary`), so that two sets of pods would not overlap:
|
||||
|
||||
```yaml
|
||||
name: frontend-canary
|
||||
replicas: 1
|
||||
...
|
||||
labels:
|
||||
app: guestbook
|
||||
tier: frontend
|
||||
track: canary
|
||||
...
|
||||
image: gb-frontend:v4
|
||||
```none
|
||||
name: frontend-canary
|
||||
replicas: 1
|
||||
...
|
||||
labels:
|
||||
app: guestbook
|
||||
tier: frontend
|
||||
track: canary
|
||||
...
|
||||
image: gb-frontend:v4
|
||||
```
|
||||
|
||||
|
||||
The frontend service would span both sets of replicas by selecting the common subset of their labels (i.e. omitting the `track` label), so that the traffic will be redirected to both applications:
|
||||
The frontend service would span both sets of replicas by selecting the common subset of their
|
||||
labels (i.e. omitting the `track` label), so that the traffic will be redirected to both
|
||||
applications:
|
||||
|
||||
```yaml
|
||||
selector:
|
||||
app: guestbook
|
||||
tier: frontend
|
||||
selector:
|
||||
app: guestbook
|
||||
tier: frontend
|
||||
```
|
||||
|
||||
You can tweak the number of replicas of the stable and canary releases to determine the ratio of each release that will receive live production traffic (in this case, 3:1).
|
||||
Once you're confident, you can update the stable track to the new application release and remove the canary one.
|
||||
You can tweak the number of replicas of the stable and canary releases to determine the ratio of
|
||||
each release that will receive live production traffic (in this case, 3:1).
|
||||
Once you're confident, you can update the stable track to the new application release and remove
|
||||
the canary one.
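For example (a sketch, assuming the `frontend` and `frontend-canary` fragments above describe Deployments):

```shell
# Adjust the stable:canary ratio, e.g. from 3:1 to 3:2.
kubectl scale deployment/frontend-canary --replicas=2
```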
|
||||
|
||||
For a more concrete example, check the [tutorial of deploying Ghost](https://github.com/kelseyhightower/talks/tree/master/kubecon-eu-2016/demo#deploy-a-canary).
|
||||
For a more concrete example, check the
|
||||
[tutorial of deploying Ghost](https://github.com/kelseyhightower/talks/tree/master/kubecon-eu-2016/demo#deploy-a-canary).
|
||||
|
||||
## Updating labels
|
||||
|
||||
Sometimes existing pods and other resources need to be relabeled before creating new resources. This can be done with `kubectl label`.
|
||||
Sometimes existing pods and other resources need to be relabeled before creating new resources.
|
||||
This can be done with `kubectl label`.
|
||||
For example, if you want to label all your nginx pods as frontend tier, run:
|
||||
|
||||
```shell
|
||||
kubectl label pods -l app=nginx tier=fe
|
||||
```
|
||||
|
||||
```shell
|
||||
```none
|
||||
pod/my-nginx-2035384211-j5fhi labeled
|
||||
pod/my-nginx-2035384211-u2c7e labeled
|
||||
pod/my-nginx-2035384211-u3t6x labeled
|
||||
|
@ -275,20 +318,25 @@ To see the pods you labeled, run:
|
|||
```shell
|
||||
kubectl get pods -l app=nginx -L tier
|
||||
```
|
||||
```shell
|
||||
|
||||
```none
|
||||
NAME READY STATUS RESTARTS AGE TIER
|
||||
my-nginx-2035384211-j5fhi 1/1 Running 0 23m fe
|
||||
my-nginx-2035384211-u2c7e 1/1 Running 0 23m fe
|
||||
my-nginx-2035384211-u3t6x 1/1 Running 0 23m fe
|
||||
```
|
||||
|
||||
This outputs all "app=nginx" pods, with an additional label column of pods' tier (specified with `-L` or `--label-columns`).
|
||||
This outputs all "app=nginx" pods, with an additional label column of pods' tier (specified with
|
||||
`-L` or `--label-columns`).
|
||||
|
||||
For more information, please see [labels](/docs/concepts/overview/working-with-objects/labels/) and [kubectl label](/docs/reference/generated/kubectl/kubectl-commands/#label).
|
||||
For more information, please see [labels](/docs/concepts/overview/working-with-objects/labels/)
|
||||
and [kubectl label](/docs/reference/generated/kubectl/kubectl-commands/#label).
|
||||
|
||||
## Updating annotations
|
||||
|
||||
Sometimes you would want to attach annotations to resources. Annotations are arbitrary non-identifying metadata for retrieval by API clients such as tools, libraries, etc. This can be done with `kubectl annotate`. For example:
|
||||
Sometimes you would want to attach annotations to resources. Annotations are arbitrary
|
||||
non-identifying metadata for retrieval by API clients such as tools, libraries, etc.
|
||||
This can be done with `kubectl annotate`. For example:
|
||||
|
||||
```shell
|
||||
kubectl annotate pods my-nginx-v4-9gw19 description='my frontend running nginx'
|
||||
|
@ -304,17 +352,19 @@ metadata:
|
|||
...
|
||||
```
|
||||
|
||||
For more information, please see [annotations](/docs/concepts/overview/working-with-objects/annotations/) and [kubectl annotate](/docs/reference/generated/kubectl/kubectl-commands/#annotate) document.
|
||||
For more information, see [annotations](/docs/concepts/overview/working-with-objects/annotations/)
|
||||
and [kubectl annotate](/docs/reference/generated/kubectl/kubectl-commands/#annotate) document.
|
||||
|
||||
## Scaling your application
|
||||
|
||||
When load on your application grows or shrinks, use `kubectl` to scale your application. For instance, to decrease the number of nginx replicas from 3 to 1, do:
|
||||
When load on your application grows or shrinks, use `kubectl` to scale your application.
|
||||
For instance, to decrease the number of nginx replicas from 3 to 1, do:
|
||||
|
||||
```shell
|
||||
kubectl scale deployment/my-nginx --replicas=1
|
||||
```
|
||||
|
||||
```shell
|
||||
```none
|
||||
deployment.apps/my-nginx scaled
|
||||
```
|
||||
|
||||
|
@ -324,25 +374,27 @@ Now you only have one pod managed by the deployment.
|
|||
kubectl get pods -l app=nginx
|
||||
```
|
||||
|
||||
```shell
|
||||
```none
|
||||
NAME READY STATUS RESTARTS AGE
|
||||
my-nginx-2035384211-j5fhi 1/1 Running 0 30m
|
||||
```
|
||||
|
||||
To have the system automatically choose the number of nginx replicas as needed, ranging from 1 to 3, do:
|
||||
To have the system automatically choose the number of nginx replicas as needed,
|
||||
ranging from 1 to 3, do:
|
||||
|
||||
```shell
|
||||
kubectl autoscale deployment/my-nginx --min=1 --max=3
|
||||
```
|
||||
|
||||
```shell
|
||||
```none
|
||||
horizontalpodautoscaler.autoscaling/my-nginx autoscaled
|
||||
```
|
||||
|
||||
Now your nginx replicas will be scaled up and down as needed, automatically.
|
||||
|
||||
For more information, please see [kubectl scale](/docs/reference/generated/kubectl/kubectl-commands/#scale), [kubectl autoscale](/docs/reference/generated/kubectl/kubectl-commands/#autoscale) and [horizontal pod autoscaler](/docs/tasks/run-application/horizontal-pod-autoscale/) document.
|
||||
|
||||
For more information, please see [kubectl scale](/docs/reference/generated/kubectl/kubectl-commands/#scale),
|
||||
[kubectl autoscale](/docs/reference/generated/kubectl/kubectl-commands/#autoscale) and
|
||||
[horizontal pod autoscaler](/docs/tasks/run-application/horizontal-pod-autoscale/) document.
|
||||
|
||||
## In-place updates of resources
|
||||
|
||||
|
@ -353,20 +405,34 @@ Sometimes it's necessary to make narrow, non-disruptive updates to resources you
|
|||
It is suggested to maintain a set of configuration files in source control
|
||||
(see [configuration as code](https://martinfowler.com/bliki/InfrastructureAsCode.html)),
|
||||
so that they can be maintained and versioned along with the code for the resources they configure.
|
||||
Then, you can use [`kubectl apply`](/docs/reference/generated/kubectl/kubectl-commands/#apply) to push your configuration changes to the cluster.
|
||||
Then, you can use [`kubectl apply`](/docs/reference/generated/kubectl/kubectl-commands/#apply)
|
||||
to push your configuration changes to the cluster.
|
||||
|
||||
This command will compare the version of the configuration that you're pushing with the previous version and apply the changes you've made, without overwriting any automated changes to properties you haven't specified.
|
||||
This command will compare the version of the configuration that you're pushing with the previous
|
||||
version and apply the changes you've made, without overwriting any automated changes to properties
|
||||
you haven't specified.
|
||||
|
||||
```shell
|
||||
kubectl apply -f https://k8s.io/examples/application/nginx/nginx-deployment.yaml
|
||||
```
|
||||
|
||||
```none
|
||||
deployment.apps/my-nginx configured
|
||||
```
|
||||
|
||||
Note that `kubectl apply` attaches an annotation to the resource in order to determine the changes to the configuration since the previous invocation. When it's invoked, `kubectl apply` does a three-way diff between the previous configuration, the provided input and the current configuration of the resource, in order to determine how to modify the resource.
|
||||
Note that `kubectl apply` attaches an annotation to the resource in order to determine the changes
|
||||
to the configuration since the previous invocation. When it's invoked, `kubectl apply` does a
|
||||
three-way diff between the previous configuration, the provided input and the current
|
||||
configuration of the resource, in order to determine how to modify the resource.
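If you are curious, you can view that annotation, `kubectl.kubernetes.io/last-applied-configuration`, on an applied resource (a sketch using the Deployment from the earlier examples):

```shell
kubectl get deployment my-nginx \
  -o jsonpath='{.metadata.annotations.kubectl\.kubernetes\.io/last-applied-configuration}'
```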
|
||||
|
||||
Currently, resources are created without this annotation, so the first invocation of `kubectl apply` will fall back to a two-way diff between the provided input and the current configuration of the resource. During this first invocation, it cannot detect the deletion of properties set when the resource was created. For this reason, it will not remove them.
|
||||
Currently, resources are created without this annotation, so the first invocation of `kubectl
|
||||
apply` will fall back to a two-way diff between the provided input and the current configuration
|
||||
of the resource. During this first invocation, it cannot detect the deletion of properties set
|
||||
when the resource was created. For this reason, it will not remove them.
|
||||
|
||||
All subsequent calls to `kubectl apply`, and other commands that modify the configuration, such as `kubectl replace` and `kubectl edit`, will update the annotation, allowing subsequent calls to `kubectl apply` to detect and perform deletions using a three-way diff.
|
||||
All subsequent calls to `kubectl apply`, and other commands that modify the configuration, such as
|
||||
`kubectl replace` and `kubectl edit`, will update the annotation, allowing subsequent calls to
|
||||
`kubectl apply` to detect and perform deletions using a three-way diff.
|
||||
|
||||
### kubectl edit
|
||||
|
||||
|
@ -376,7 +442,8 @@ Alternatively, you may also update resources with `kubectl edit`:
|
|||
kubectl edit deployment/my-nginx
|
||||
```
|
||||
|
||||
This is equivalent to first `get` the resource, edit it in text editor, and then `apply` the resource with the updated version:
|
||||
This is equivalent to first `get` the resource, edit it in text editor, and then `apply` the
|
||||
resource with the updated version:
|
||||
|
||||
```shell
|
||||
kubectl get deployment my-nginx -o yaml > /tmp/nginx.yaml
|
||||
|
@ -389,7 +456,8 @@ deployment.apps/my-nginx configured
|
|||
rm /tmp/nginx.yaml
|
||||
```
|
||||
|
||||
This allows you to do more significant changes more easily. Note that you can specify the editor with your `EDITOR` or `KUBE_EDITOR` environment variables.
|
||||
This allows you to do more significant changes more easily. Note that you can specify the editor
|
||||
with your `EDITOR` or `KUBE_EDITOR` environment variables.
|
||||
|
||||
For more information, please see [kubectl edit](/docs/reference/generated/kubectl/kubectl-commands/#edit) document.
|
||||
|
||||
|
@ -403,20 +471,25 @@ and
|
|||
|
||||
## Disruptive updates
|
||||
|
||||
In some cases, you may need to update resource fields that cannot be updated once initialized, or you may want to make a recursive change immediately, such as to fix broken pods created by a Deployment. To change such fields, use `replace --force`, which deletes and re-creates the resource. In this case, you can modify your original configuration file:
|
||||
In some cases, you may need to update resource fields that cannot be updated once initialized, or
|
||||
you may want to make a recursive change immediately, such as to fix broken pods created by a
|
||||
Deployment. To change such fields, use `replace --force`, which deletes and re-creates the
|
||||
resource. In this case, you can modify your original configuration file:
|
||||
|
||||
```shell
|
||||
kubectl replace -f https://k8s.io/examples/application/nginx/nginx-deployment.yaml --force
|
||||
```
|
||||
|
||||
```shell
|
||||
```none
|
||||
deployment.apps/my-nginx deleted
|
||||
deployment.apps/my-nginx replaced
|
||||
```
|
||||
|
||||
## Updating your application without a service outage
|
||||
|
||||
At some point, you'll eventually need to update your deployed application, typically by specifying a new image or image tag, as in the canary deployment scenario above. `kubectl` supports several update operations, each of which is applicable to different scenarios.
|
||||
At some point, you'll eventually need to update your deployed application, typically by specifying
|
||||
a new image or image tag, as in the canary deployment scenario above. `kubectl` supports several
|
||||
update operations, each of which is applicable to different scenarios.
|
||||
|
||||
We'll guide you through how to create and update applications with Deployments.
|
||||
|
||||
|
@ -426,7 +499,7 @@ Let's say you were running version 1.14.2 of nginx:
|
|||
kubectl create deployment my-nginx --image=nginx:1.14.2
|
||||
```
|
||||
|
||||
```shell
|
||||
```none
|
||||
deployment.apps/my-nginx created
|
||||
```
|
||||
|
||||
|
@ -436,24 +509,24 @@ with 3 replicas (so the old and new revisions can coexist):
|
|||
kubectl scale deployment my-nginx --current-replicas=1 --replicas=3
|
||||
```
|
||||
|
||||
```
|
||||
```none
|
||||
deployment.apps/my-nginx scaled
|
||||
```
|
||||
|
||||
To update to version 1.16.1, change `.spec.template.spec.containers[0].image` from `nginx:1.14.2` to `nginx:1.16.1` using the previous kubectl commands.
|
||||
To update to version 1.16.1, change `.spec.template.spec.containers[0].image` from `nginx:1.14.2`
|
||||
to `nginx:1.16.1` using the previous kubectl commands.
|
||||
|
||||
```shell
|
||||
kubectl edit deployment/my-nginx
|
||||
```
|
||||
|
||||
That's it! The Deployment will declaratively update the deployed nginx application progressively behind the scene. It ensures that only a certain number of old replicas may be down while they are being updated, and only a certain number of new replicas may be created above the desired number of pods. To learn more details about it, visit [Deployment page](/docs/concepts/workloads/controllers/deployment/).
|
||||
|
||||
|
||||
That's it! The Deployment will declaratively update the deployed nginx application progressively
|
||||
behind the scene. It ensures that only a certain number of old replicas may be down while they are
|
||||
being updated, and only a certain number of new replicas may be created above the desired number
|
||||
of pods. To learn more details about it, visit [Deployment page](/docs/concepts/workloads/controllers/deployment/).
|
||||
|
||||
## {{% heading "whatsnext" %}}
|
||||
|
||||
|
||||
- Learn about [how to use `kubectl` for application introspection and debugging](/docs/tasks/debug/debug-application/debug-running-pod/).
|
||||
- See [Configuration Best Practices and Tips](/docs/concepts/configuration/overview/).
|
||||
|
||||
|
||||
|
|
|
@ -807,4 +807,5 @@ memory limit (and possibly request) for that container.
|
|||
and its [resource requirements](/docs/reference/kubernetes-api/workload-resources/pod-v1/#resources)
|
||||
* Read about [project quotas](https://xfs.org/index.php/XFS_FAQ#Q:_Quota:_Do_quotas_work_on_XFS.3F) in XFS
|
||||
* Read more about the [kube-scheduler configuration reference (v1beta3)](/docs/reference/config-api/kube-scheduler-config.v1beta3/)
|
||||
* Read more about [Quality of Service classes for Pods](/docs/concepts/workloads/pods/pod-qos/)
|
||||
|
||||
|
|
|
@ -165,15 +165,35 @@ for that Pod, including details of the problem fetching the Secret.
|
|||
|
||||
#### Optional Secrets {#restriction-secret-must-exist}
|
||||
|
||||
When you define a container environment variable based on a Secret,
|
||||
you can mark it as _optional_. The default is for the Secret to be
|
||||
required.
|
||||
When you reference a Secret in a Pod, you can mark the Secret as _optional_,
|
||||
such as in the following example. If an optional Secret doesn't exist,
|
||||
Kubernetes ignores it.
|
||||
|
||||
None of a Pod's containers will start until all non-optional Secrets are
|
||||
available.
|
||||
```yaml
|
||||
apiVersion: v1
|
||||
kind: Pod
|
||||
metadata:
|
||||
name: mypod
|
||||
spec:
|
||||
containers:
|
||||
- name: mypod
|
||||
image: redis
|
||||
volumeMounts:
|
||||
- name: foo
|
||||
mountPath: "/etc/foo"
|
||||
readOnly: true
|
||||
volumes:
|
||||
- name: foo
|
||||
secret:
|
||||
secretName: mysecret
|
||||
optional: true
|
||||
```
|
||||
|
||||
If a Pod references a specific key in a Secret and that Secret does exist, but
|
||||
is missing the named key, the Pod fails during startup.
|
||||
By default, Secrets are required. None of a Pod's containers will start until
|
||||
all non-optional Secrets are available.
|
||||
|
||||
If a Pod references a specific key in a non-optional Secret and that Secret
|
||||
does exist, but is missing the named key, the Pod fails during startup.
|
||||
|
||||
### Using Secrets as files from a Pod {#using-secrets-as-files-from-a-pod}
|
||||
|
||||
|
@ -181,181 +201,8 @@ If you want to access data from a Secret in a Pod, one way to do that is to
|
|||
have Kubernetes make the value of that Secret be available as a file inside
|
||||
the filesystem of one or more of the Pod's containers.
|
||||
|
||||
To configure that, you:
|
||||
|
||||
1. Create a secret or use an existing one. Multiple Pods can reference the same secret.
|
||||
1. Modify your Pod definition to add a volume under `.spec.volumes[]`. Name the volume anything,
|
||||
and have a `.spec.volumes[].secret.secretName` field equal to the name of the Secret object.
|
||||
1. Add a `.spec.containers[].volumeMounts[]` to each container that needs the secret. Specify
|
||||
`.spec.containers[].volumeMounts[].readOnly = true` and
|
||||
`.spec.containers[].volumeMounts[].mountPath` to an unused directory name where you would like the
|
||||
secrets to appear.
|
||||
1. Modify your image or command line so that the program looks for files in that directory. Each
|
||||
key in the secret `data` map becomes the filename under `mountPath`.
|
||||
|
||||
This is an example of a Pod that mounts a Secret named `mysecret` in a volume:
|
||||
|
||||
```yaml
|
||||
apiVersion: v1
|
||||
kind: Pod
|
||||
metadata:
|
||||
name: mypod
|
||||
spec:
|
||||
containers:
|
||||
- name: mypod
|
||||
image: redis
|
||||
volumeMounts:
|
||||
- name: foo
|
||||
mountPath: "/etc/foo"
|
||||
readOnly: true
|
||||
volumes:
|
||||
- name: foo
|
||||
secret:
|
||||
secretName: mysecret
|
||||
optional: false # default setting; "mysecret" must exist
|
||||
```
|
||||
|
||||
Each Secret you want to use needs to be referred to in `.spec.volumes`.
|
||||
|
||||
If there are multiple containers in the Pod, then each container needs its
|
||||
own `volumeMounts` block, but only one `.spec.volumes` is needed per Secret.
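For example, here is a minimal sketch (reusing the `mysecret` Secret; the container names are hypothetical) of two containers sharing a single secret volume:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  containers:
  - name: app              # hypothetical container name
    image: redis
    volumeMounts:
    - name: foo
      mountPath: "/etc/foo"
      readOnly: true
  - name: sidecar          # hypothetical container name
    image: busybox
    command: ["sleep", "3600"]
    volumeMounts:
    - name: foo
      mountPath: "/etc/foo"
      readOnly: true
  volumes:                 # a single volume entry serves both containers
  - name: foo
    secret:
      secretName: mysecret
```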
|
||||
|
||||
{{< note >}}
|
||||
Versions of Kubernetes before v1.22 automatically created credentials for accessing
|
||||
the Kubernetes API. This older mechanism was based on creating token Secrets that
|
||||
could then be mounted into running Pods.
|
||||
In more recent versions, including Kubernetes v{{< skew currentVersion >}}, API credentials
|
||||
are obtained directly by using the [TokenRequest](/docs/reference/kubernetes-api/authentication-resources/token-request-v1/) API,
|
||||
and are mounted into Pods using a [projected volume](/docs/reference/access-authn-authz/service-accounts-admin/#bound-service-account-token-volume).
|
||||
The tokens obtained using this method have bounded lifetimes, and are automatically
|
||||
invalidated when the Pod they are mounted into is deleted.
|
||||
|
||||
You can still [manually create](/docs/tasks/configure-pod-container/configure-service-account/#manually-create-a-service-account-api-token)
|
||||
a service account token Secret; for example, if you need a token that never expires.
|
||||
However, using the [TokenRequest](/docs/reference/kubernetes-api/authentication-resources/token-request-v1/)
|
||||
subresource to obtain a token to access the API is recommended instead.
|
||||
You can use the [`kubectl create token`](/docs/reference/generated/kubectl/kubectl-commands#-em-token-em-)
|
||||
command to obtain a token from the `TokenRequest` API.
|
||||
{{< /note >}}
|
||||
|
||||
#### Projection of Secret keys to specific paths
|
||||
|
||||
You can also control the paths within the volume where Secret keys are projected.
|
||||
You can use the `.spec.volumes[].secret.items` field to change the target path of each key:
|
||||
|
||||
```yaml
|
||||
apiVersion: v1
|
||||
kind: Pod
|
||||
metadata:
|
||||
name: mypod
|
||||
spec:
|
||||
containers:
|
||||
- name: mypod
|
||||
image: redis
|
||||
volumeMounts:
|
||||
- name: foo
|
||||
mountPath: "/etc/foo"
|
||||
readOnly: true
|
||||
volumes:
|
||||
- name: foo
|
||||
secret:
|
||||
secretName: mysecret
|
||||
items:
|
||||
- key: username
|
||||
path: my-group/my-username
|
||||
```
|
||||
|
||||
What will happen:
|
||||
|
||||
* the `username` key from `mysecret` is available to the container at the path
|
||||
`/etc/foo/my-group/my-username` instead of at `/etc/foo/username`.
|
||||
* the `password` key from that Secret object is not projected.
|
||||
|
||||
If `.spec.volumes[].secret.items` is used, only keys specified in `items` are projected.
|
||||
To consume all keys from the Secret, all of them must be listed in the `items` field.
|
||||
|
||||
If you list keys explicitly, then all listed keys must exist in the corresponding Secret.
|
||||
Otherwise, the volume is not created.
|
||||
|
||||
#### Secret files permissions
|
||||
|
||||
You can set the POSIX file access permission bits for a single Secret key.
|
||||
If you don't specify any permissions, `0644` is used by default.
|
||||
You can also set a default mode for the entire Secret volume and override per key if needed.
|
||||
|
||||
For example, you can specify a default mode like this:
|
||||
|
||||
```yaml
|
||||
apiVersion: v1
|
||||
kind: Pod
|
||||
metadata:
|
||||
name: mypod
|
||||
spec:
|
||||
containers:
|
||||
- name: mypod
|
||||
image: redis
|
||||
volumeMounts:
|
||||
- name: foo
|
||||
mountPath: "/etc/foo"
|
||||
volumes:
|
||||
- name: foo
|
||||
secret:
|
||||
secretName: mysecret
|
||||
defaultMode: 0400
|
||||
```
|
||||
|
||||
The secret is mounted on `/etc/foo`; all the files created by the
|
||||
secret volume mount have permission `0400`.
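If individual keys need different modes, you can override `defaultMode` per item. A sketch, assuming `mysecret` contains `username` and `password` keys:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  containers:
  - name: mypod
    image: redis
    volumeMounts:
    - name: foo
      mountPath: "/etc/foo"
  volumes:
  - name: foo
    secret:
      secretName: mysecret
      defaultMode: 0400
      items:
      - key: username
        path: username
        mode: 0444        # overrides defaultMode for this file only
      - key: password
        path: password    # no per-key mode, so this file uses the 0400 default
```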
|
||||
|
||||
{{< note >}}
|
||||
If you're defining a Pod or a Pod template using JSON, beware that the JSON
|
||||
specification doesn't support octal notation. You can use the decimal value
|
||||
for the `defaultMode` (for example, 0400 in octal is 256 in decimal) instead.
|
||||
If you're writing YAML, you can write the `defaultMode` in octal.
|
||||
{{< /note >}}
|
||||
|
||||
#### Consuming Secret values from volumes
|
||||
|
||||
Inside the container that mounts a secret volume, the secret keys appear as
|
||||
files. The secret values are base64 decoded and stored inside these files.
|
||||
|
||||
This is the result of commands executed inside the container from the example above:
|
||||
|
||||
```shell
|
||||
ls /etc/foo/
|
||||
```
|
||||
|
||||
The output is similar to:
|
||||
|
||||
```
|
||||
username
|
||||
password
|
||||
```
|
||||
|
||||
```shell
|
||||
cat /etc/foo/username
|
||||
```
|
||||
|
||||
The output is similar to:
|
||||
|
||||
```
|
||||
admin
|
||||
```
|
||||
|
||||
```shell
|
||||
cat /etc/foo/password
|
||||
```
|
||||
|
||||
The output is similar to:
|
||||
|
||||
```
|
||||
1f2d1e2e67df
|
||||
```
|
||||
|
||||
The program in a container is responsible for reading the secret data from these
|
||||
files, as needed.
|
||||
|
||||
#### Mounted Secrets are updated automatically
|
||||
For instructions, refer to
|
||||
[Distribute credentials securely using Secrets](/docs/tasks/inject-data-application/distribute-credentials-secure/#create-a-pod-that-has-access-to-the-secret-data-through-a-volume).
|
||||
|
||||
When a volume contains data from a Secret, and that Secret is updated, Kubernetes tracks
|
||||
this and updates the data in the volume, using an eventually-consistent approach.
|
||||
|
@ -388,53 +235,23 @@ watch propagation delay, the configured cache TTL, or zero for direct polling).
|
|||
To use a Secret in an {{< glossary_tooltip text="environment variable" term_id="container-env-variables" >}}
|
||||
in a Pod:
|
||||
|
||||
1. Create a Secret (or use an existing one). Multiple Pods can reference the same Secret.
|
||||
1. Modify your Pod definition in each container that you wish to consume the value of a secret
|
||||
key to add an environment variable for each secret key you wish to consume. The environment
|
||||
variable that consumes the secret key should populate the secret's name and key in `env[].valueFrom.secretKeyRef`.
|
||||
1. Modify your image and/or command line so that the program looks for values in the specified
|
||||
environment variables.
|
||||
|
||||
This is an example of a Pod that uses a Secret via environment variables:
|
||||
|
||||
```yaml
|
||||
apiVersion: v1
|
||||
kind: Pod
|
||||
metadata:
|
||||
name: secret-env-pod
|
||||
spec:
|
||||
containers:
|
||||
- name: mycontainer
|
||||
image: redis
|
||||
env:
|
||||
- name: SECRET_USERNAME
|
||||
valueFrom:
|
||||
secretKeyRef:
|
||||
name: mysecret
|
||||
key: username
|
||||
optional: false # same as default; "mysecret" must exist
|
||||
# and include a key named "username"
|
||||
- name: SECRET_PASSWORD
|
||||
valueFrom:
|
||||
secretKeyRef:
|
||||
name: mysecret
|
||||
key: password
|
||||
optional: false # same as default; "mysecret" must exist
|
||||
# and include a key named "password"
|
||||
restartPolicy: Never
|
||||
```
|
||||
1. For each container in your Pod specification, add an environment variable
|
||||
for each Secret key that you want to use to the
|
||||
`env[].valueFrom.secretKeyRef` field.
|
||||
1. Modify your image and/or command line so that the program looks for values
|
||||
in the specified environment variables.
|
||||
|
||||
For instructions, refer to
|
||||
[Define container environment variables using Secret data](/docs/tasks/inject-data-application/distribute-credentials-secure/#define-container-environment-variables-using-secret-data).
|
||||
|
||||
#### Invalid environment variables {#restriction-env-from-invalid}
|
||||
|
||||
Secrets used to populate environment variables by the `envFrom` field that have keys
|
||||
that are considered invalid environment variable names will have those keys
|
||||
skipped. The Pod is allowed to start.
|
||||
If your environment variable definitions in your Pod specification are
|
||||
considered to be invalid environment variable names, those keys aren't made
|
||||
available to your container. The Pod is allowed to start.
|
||||
|
||||
If you define a Pod with an invalid variable name, the failed Pod startup includes
|
||||
an event with the reason set to `InvalidVariableNames` and a message that lists the
|
||||
skipped invalid keys. The following example shows a Pod that refers to a Secret
|
||||
named `mysecret`, where `mysecret` contains 2 invalid keys: `1badkey` and `2alsobad`.
|
||||
Kubernetes adds an Event with the reason set to `InvalidEnvironmentVariableNames` and a
|
||||
message that lists the skipped invalid keys. The following example shows a Pod that refers to a Secret named `mysecret`, where `mysecret` contains 2 invalid keys: `1badkey` and `2alsobad`.
|
||||
|
||||
```shell
|
||||
kubectl get events
|
||||
|
@ -447,42 +264,6 @@ LASTSEEN FIRSTSEEN COUNT NAME KIND SUBOBJECT
|
|||
0s 0s 1 dapi-test-pod Pod Warning InvalidEnvironmentVariableNames kubelet, 127.0.0.1 Keys [1badkey, 2alsobad] from the EnvFrom secret default/mysecret were skipped since they are considered invalid environment variable names.
|
||||
```
|
||||
|
||||
|
||||
#### Consuming Secret values from environment variables
|
||||
|
||||
Inside a container that consumes a Secret using environment variables, the secret keys appear
|
||||
as normal environment variables. The values of those variables are the base64 decoded values
|
||||
of the secret data.
|
||||
|
||||
This is the result of commands executed inside the container from the example above:
|
||||
|
||||
```shell
|
||||
echo "$SECRET_USERNAME"
|
||||
```
|
||||
|
||||
The output is similar to:
|
||||
|
||||
```
|
||||
admin
|
||||
```
|
||||
|
||||
```shell
|
||||
echo "$SECRET_PASSWORD"
|
||||
```
|
||||
|
||||
The output is similar to:
|
||||
|
||||
```
|
||||
1f2d1e2e67df
|
||||
```
|
||||
|
||||
{{< note >}}
|
||||
If a container already consumes a Secret in an environment variable,
|
||||
a Secret update will not be seen by the container unless it is
|
||||
restarted. There are third party solutions for triggering restarts when
|
||||
secrets change.
|
||||
{{< /note >}}
|
||||
|
||||
### Container image pull secrets {#using-imagepullsecrets}
|
||||
|
||||
If you want to fetch container images from a private repository, you need a way for
|
||||
|
@ -518,43 +299,10 @@ You cannot use ConfigMaps or Secrets with {{< glossary_tooltip text="static Pods
|
|||
|
||||
## Use cases
|
||||
|
||||
### Use case: As container environment variables
|
||||
### Use case: As container environment variables {#use-case-as-container-environment-variables}
|
||||
|
||||
Create a secret
|
||||
```yaml
|
||||
apiVersion: v1
|
||||
kind: Secret
|
||||
metadata:
|
||||
name: mysecret
|
||||
type: Opaque
|
||||
data:
|
||||
USER_NAME: YWRtaW4=
|
||||
PASSWORD: MWYyZDFlMmU2N2Rm
|
||||
```
|
||||
|
||||
Create the Secret:
|
||||
```shell
|
||||
kubectl apply -f mysecret.yaml
|
||||
```
|
||||
|
||||
Use `envFrom` to define all of the Secret's data as container environment variables. The key from
|
||||
the Secret becomes the environment variable name in the Pod.
|
||||
|
||||
```yaml
|
||||
apiVersion: v1
|
||||
kind: Pod
|
||||
metadata:
|
||||
name: secret-test-pod
|
||||
spec:
|
||||
containers:
|
||||
- name: test-container
|
||||
image: registry.k8s.io/busybox
|
||||
command: [ "/bin/sh", "-c", "env" ]
|
||||
envFrom:
|
||||
- secretRef:
|
||||
name: mysecret
|
||||
restartPolicy: Never
|
||||
```
|
||||
You can create a Secret and use it to
|
||||
[set environment variables for a container](/docs/tasks/inject-data-application/distribute-credentials-secure/#define-container-environment-variables-using-secret-data).
|
||||
|
||||
### Use case: Pod with SSH keys
|
||||
|
||||
|
@ -873,13 +621,28 @@ A `kubernetes.io/service-account-token` type of Secret is used to store a
|
|||
token credential that identifies a
|
||||
{{< glossary_tooltip text="service account" term_id="service-account" >}}.
|
||||
|
||||
Since 1.22, this type of Secret is no longer used to mount credentials into Pods,
|
||||
and obtaining tokens via the [TokenRequest](/docs/reference/kubernetes-api/authentication-resources/token-request-v1/)
|
||||
API is recommended instead of using service account token Secret objects.
|
||||
Tokens obtained from the `TokenRequest` API are more secure than ones stored in Secret objects,
|
||||
because they have a bounded lifetime and are not readable by other API clients.
|
||||
You can use the [`kubectl create token`](/docs/reference/generated/kubectl/kubectl-commands#-em-token-em-)
|
||||
{{< note >}}
|
||||
Versions of Kubernetes before v1.22 automatically created credentials for
|
||||
accessing the Kubernetes API. This older mechanism was based on creating token
|
||||
Secrets that could then be mounted into running Pods.
|
||||
In more recent versions, including Kubernetes v{{< skew currentVersion >}}, API
|
||||
credentials are obtained directly by using the
|
||||
[TokenRequest](/docs/reference/kubernetes-api/authentication-resources/token-request-v1/)
|
||||
API, and are mounted into Pods using a
|
||||
[projected volume](/docs/reference/access-authn-authz/service-accounts-admin/#bound-service-account-token-volume).
|
||||
The tokens obtained using this method have bounded lifetimes, and are
|
||||
automatically invalidated when the Pod they are mounted into is deleted.
|
||||
|
||||
You can still
|
||||
[manually create](/docs/tasks/configure-pod-container/configure-service-account/#manually-create-a-service-account-api-token)
|
||||
a service account token Secret; for example, if you need a token that never
|
||||
expires. However, using the
|
||||
[TokenRequest](/docs/reference/kubernetes-api/authentication-resources/token-request-v1/)
|
||||
subresource to obtain a token to access the API is recommended instead.
|
||||
You can use the
|
||||
[`kubectl create token`](/docs/reference/generated/kubectl/kubectl-commands#-em-token-em-)
|
||||
command to obtain a token from the `TokenRequest` API.
|
||||
{{< /note >}}
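For example, assuming a ServiceAccount named `build-robot` already exists in the current namespace, requesting a short-lived token looks like this:

```shell
# "build-robot" is a hypothetical ServiceAccount name
kubectl create token build-robot
```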
|
||||
|
||||
You should only create a service account token Secret object
|
||||
if you can't use the `TokenRequest` API to obtain a token,
|
||||
|
|
|
@ -87,60 +87,65 @@ spec:
|
|||
|
||||
The general workflow of a device plugin includes the following steps:
|
||||
|
||||
* Initialization. During this phase, the device plugin performs vendor specific
|
||||
initialization and setup to make sure the devices are in a ready state.
|
||||
1. Initialization. During this phase, the device plugin performs vendor-specific
|
||||
initialization and setup to make sure the devices are in a ready state.
|
||||
|
||||
* The plugin starts a gRPC service, with a Unix socket under host path
|
||||
`/var/lib/kubelet/device-plugins/`, that implements the following interfaces:
|
||||
1. The plugin starts a gRPC service, with a Unix socket under the host path
|
||||
`/var/lib/kubelet/device-plugins/`, that implements the following interfaces:
|
||||
|
||||
```gRPC
|
||||
service DevicePlugin {
|
||||
// GetDevicePluginOptions returns options to be communicated with Device Manager.
|
||||
rpc GetDevicePluginOptions(Empty) returns (DevicePluginOptions) {}
|
||||
```gRPC
|
||||
service DevicePlugin {
|
||||
// GetDevicePluginOptions returns options to be communicated with Device Manager.
|
||||
rpc GetDevicePluginOptions(Empty) returns (DevicePluginOptions) {}
|
||||
|
||||
// ListAndWatch returns a stream of List of Devices
|
||||
// Whenever a Device state change or a Device disappears, ListAndWatch
|
||||
// returns the new list
|
||||
rpc ListAndWatch(Empty) returns (stream ListAndWatchResponse) {}
|
||||
// ListAndWatch returns a stream of List of Devices
|
||||
// Whenever a Device state change or a Device disappears, ListAndWatch
|
||||
// returns the new list
|
||||
rpc ListAndWatch(Empty) returns (stream ListAndWatchResponse) {}
|
||||
|
||||
// Allocate is called during container creation so that the Device
|
||||
// Plugin can run device specific operations and instruct Kubelet
|
||||
// of the steps to make the Device available in the container
|
||||
rpc Allocate(AllocateRequest) returns (AllocateResponse) {}
|
||||
// Allocate is called during container creation so that the Device
|
||||
// Plugin can run device specific operations and instruct Kubelet
|
||||
// of the steps to make the Device available in the container
|
||||
rpc Allocate(AllocateRequest) returns (AllocateResponse) {}
|
||||
|
||||
// GetPreferredAllocation returns a preferred set of devices to allocate
|
||||
// from a list of available ones. The resulting preferred allocation is not
|
||||
// guaranteed to be the allocation ultimately performed by the
|
||||
// devicemanager. It is only designed to help the devicemanager make a more
|
||||
// informed allocation decision when possible.
|
||||
rpc GetPreferredAllocation(PreferredAllocationRequest) returns (PreferredAllocationResponse) {}
|
||||
// GetPreferredAllocation returns a preferred set of devices to allocate
|
||||
// from a list of available ones. The resulting preferred allocation is not
|
||||
// guaranteed to be the allocation ultimately performed by the
|
||||
// devicemanager. It is only designed to help the devicemanager make a more
|
||||
// informed allocation decision when possible.
|
||||
rpc GetPreferredAllocation(PreferredAllocationRequest) returns (PreferredAllocationResponse) {}
|
||||
|
||||
// PreStartContainer is called, if indicated by Device Plugin during registeration phase,
|
||||
// before each container start. Device plugin can run device specific operations
|
||||
// such as resetting the device before making devices available to the container.
|
||||
rpc PreStartContainer(PreStartContainerRequest) returns (PreStartContainerResponse) {}
|
||||
}
|
||||
```
|
||||
// PreStartContainer is called, if indicated by Device Plugin during registeration phase,
|
||||
// before each container start. Device plugin can run device specific operations
|
||||
// such as resetting the device before making devices available to the container.
|
||||
rpc PreStartContainer(PreStartContainerRequest) returns (PreStartContainerResponse) {}
|
||||
}
|
||||
```
|
||||
|
||||
{{< note >}}
|
||||
Plugins are not required to provide useful implementations for
|
||||
`GetPreferredAllocation()` or `PreStartContainer()`. Flags indicating which
|
||||
(if any) of these calls are available should be set in the `DevicePluginOptions`
|
||||
message sent back by a call to `GetDevicePluginOptions()`. The `kubelet` will
|
||||
always call `GetDevicePluginOptions()` to see which optional functions are
|
||||
available, before calling any of them directly.
|
||||
{{< /note >}}
|
||||
{{< note >}}
|
||||
Plugins are not required to provide useful implementations for
|
||||
`GetPreferredAllocation()` or `PreStartContainer()`. Flags indicating
|
||||
the availability of these calls, if any, should be set in the `DevicePluginOptions`
|
||||
message sent back by a call to `GetDevicePluginOptions()`. The `kubelet` will
|
||||
always call `GetDevicePluginOptions()` to see which optional functions are
|
||||
available, before calling any of them directly.
|
||||
{{< /note >}}
|
||||
|
||||
* The plugin registers itself with the kubelet through the Unix socket at host
|
||||
path `/var/lib/kubelet/device-plugins/kubelet.sock`.
|
||||
1. The plugin registers itself with the kubelet through the Unix socket at host
|
||||
path `/var/lib/kubelet/device-plugins/kubelet.sock`.
|
||||
|
||||
* After successfully registering itself, the device plugin runs in serving mode, during which it keeps
|
||||
monitoring device health and reports back to the kubelet upon any device state changes.
|
||||
It is also responsible for serving `Allocate` gRPC requests. During `Allocate`, the device plugin may
|
||||
do device-specific preparation; for example, GPU cleanup or QRNG initialization.
|
||||
If the operations succeed, the device plugin returns an `AllocateResponse` that contains container
|
||||
runtime configurations for accessing the allocated devices. The kubelet passes this information
|
||||
to the container runtime.
|
||||
{{< note >}}
|
||||
The ordering of the workflow is important. A plugin MUST start serving gRPC
|
||||
service before registering itself with kubelet for successful registration.
|
||||
{{< /note >}}
|
||||
|
||||
1. After successfully registering itself, the device plugin runs in serving mode, during which it keeps
|
||||
monitoring device health and reports back to the kubelet upon any device state changes.
|
||||
It is also responsible for serving `Allocate` gRPC requests. During `Allocate`, the device plugin may
|
||||
do device-specific preparation; for example, GPU cleanup or QRNG initialization.
|
||||
If the operations succeed, the device plugin returns an `AllocateResponse` that contains container
|
||||
runtime configurations for accessing the allocated devices. The kubelet passes this information
|
||||
to the container runtime.
|
||||
|
||||
### Handling kubelet restarts
|
||||
|
||||
|
@ -292,7 +297,6 @@ However, calling `GetAllocatableResources` endpoint is not sufficient in case of
|
|||
update and Kubelet needs to be restarted to reflect the correct resource capacity and allocatable.
|
||||
{{< /note >}}
|
||||
|
||||
|
||||
```gRPC
|
||||
// AllocatableResourcesResponses contains informations about all the devices known by the kubelet
|
||||
message AllocatableResourcesResponse {
|
||||
|
@ -313,14 +317,14 @@ Preceding Kubernetes v1.23, to enable this feature `kubelet` must be started wit
|
|||
```
|
||||
|
||||
`ContainerDevices` expose the topology information declaring to which NUMA cells the device is
|
||||
affine. The NUMA cells are identified using an opaque integer ID, whose value is consistent with
|
||||
affine. The NUMA cells are identified using an opaque integer ID, whose value is consistent with
|
||||
what device plugins report
|
||||
[when they register themselves to the kubelet](/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins/#device-plugin-integration-with-the-topology-manager).
|
||||
|
||||
The gRPC service is served over a unix socket at `/var/lib/kubelet/pod-resources/kubelet.sock`.
|
||||
Monitoring agents for device plugin resources can be deployed as a daemon, or as a DaemonSet.
|
||||
The canonical directory `/var/lib/kubelet/pod-resources` requires privileged access, so monitoring
|
||||
agents must run in a privileged security context. If a device monitoring agent is running as a
|
||||
agents must run in a privileged security context. If a device monitoring agent is running as a
|
||||
DaemonSet, `/var/lib/kubelet/pod-resources` must be mounted as a
|
||||
{{< glossary_tooltip term_id="volume" >}} in the device monitoring agent's
|
||||
[PodSpec](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#podspec-v1-core).
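A minimal sketch of that wiring (the name and image below are hypothetical; the hostPath mount and privileged security context are the point) might look like:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: device-monitoring-agent          # hypothetical name
spec:
  selector:
    matchLabels:
      app: device-monitoring-agent
  template:
    metadata:
      labels:
        app: device-monitoring-agent
    spec:
      containers:
      - name: agent
        image: example.com/device-monitoring-agent:latest   # hypothetical image
        securityContext:
          privileged: true                # required to read the pod-resources socket
        volumeMounts:
        - name: pod-resources
          mountPath: /var/lib/kubelet/pod-resources
      volumes:
      - name: pod-resources
        hostPath:
          path: /var/lib/kubelet/pod-resources
```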
|
||||
|
@ -355,7 +359,7 @@ resource assignment decisions.
|
|||
`TopologyInfo` supports setting a `nodes` field to either `nil` or a list of NUMA nodes. This
|
||||
allows the Device Plugin to advertise a device that spans multiple NUMA nodes.
|
||||
|
||||
Setting `TopologyInfo` to `nil` or providing an empty list of NUMA nodes for a given device
|
||||
Setting `TopologyInfo` to `nil` or providing an empty list of NUMA nodes for a given device
|
||||
indicates that the Device Plugin does not have a NUMA affinity preference for that device.
|
||||
|
||||
An example `TopologyInfo` struct populated for a device by a Device Plugin:
|
||||
|
@ -391,4 +395,3 @@ Here are some examples of device plugin implementations:
|
|||
* Learn about the [Topology Manager](/docs/tasks/administer-cluster/topology-manager/)
|
||||
* Read about using [hardware acceleration for TLS ingress](/blog/2019/04/24/hardware-accelerated-ssl/tls-termination-in-ingress-controllers-using-kubernetes-device-plugins-and-runtimeclass/)
|
||||
with Kubernetes
|
||||
|
||||
|
|
|
@ -119,6 +119,7 @@ operator.
|
|||
* [kubebuilder](https://book.kubebuilder.io/)
|
||||
* [KubeOps](https://buehler.github.io/dotnet-operator-sdk/) (.NET operator SDK)
|
||||
* [KUDO](https://kudo.dev/) (Kubernetes Universal Declarative Operator)
|
||||
* [Mast](https://docs.ansi.services/mast/user_guide/operator/)
|
||||
* [Metacontroller](https://metacontroller.github.io/metacontroller/intro.html) along with WebHooks that
|
||||
you implement yourself
|
||||
* [Operator Framework](https://operatorframework.io)
|
||||
|
|
|
@ -8,7 +8,6 @@ weight: 60
|
|||
You can use Kubernetes annotations to attach arbitrary non-identifying metadata
|
||||
to objects. Clients such as tools and libraries can retrieve this metadata.
|
||||
|
||||
|
||||
<!-- body -->
|
||||
## Attaching metadata to objects
|
||||
|
||||
|
@ -74,10 +73,9 @@ If the prefix is omitted, the annotation Key is presumed to be private to the us
|
|||
|
||||
The `kubernetes.io/` and `k8s.io/` prefixes are reserved for Kubernetes core components.
|
||||
|
||||
For example, here's the configuration file for a Pod that has the annotation `imageregistry: https://hub.docker.com/` :
|
||||
For example, here's a manifest for a Pod that has the annotation `imageregistry: https://hub.docker.com/` :
|
||||
|
||||
```yaml
|
||||
|
||||
apiVersion: v1
|
||||
kind: Pod
|
||||
metadata:
|
||||
|
@ -90,14 +88,8 @@ spec:
|
|||
image: nginx:1.14.2
|
||||
ports:
|
||||
- containerPort: 80
|
||||
|
||||
```
|
||||
|
||||
|
||||
|
||||
## {{% heading "whatsnext" %}}
|
||||
|
||||
Learn more about [Labels and Selectors](/docs/concepts/overview/working-with-objects/labels/).
|
||||
|
||||
|
||||
|
||||
|
|
|
@ -9,9 +9,12 @@ weight: 40
|
|||
<!-- overview -->
|
||||
|
||||
_Labels_ are key/value pairs that are attached to objects, such as pods.
|
||||
Labels are intended to be used to specify identifying attributes of objects that are meaningful and relevant to users, but do not directly imply semantics to the core system.
|
||||
Labels can be used to organize and to select subsets of objects. Labels can be attached to objects at creation time and subsequently added and modified at any time.
|
||||
Each object can have a set of key/value labels defined. Each Key must be unique for a given object.
|
||||
Labels are intended to be used to specify identifying attributes of objects
|
||||
that are meaningful and relevant to users, but do not directly imply semantics
|
||||
to the core system. Labels can be used to organize and to select subsets of
|
||||
objects. Labels can be attached to objects at creation time and subsequently
|
||||
added and modified at any time. Each object can have a set of key/value labels
|
||||
defined. Each Key must be unique for a given object.
|
||||
|
||||
```json
|
||||
"metadata": {
|
||||
|
@ -30,37 +33,56 @@ and CLIs. Non-identifying information should be recorded using
|
|||
|
||||
## Motivation
|
||||
|
||||
Labels enable users to map their own organizational structures onto system objects in a loosely coupled fashion, without requiring clients to store these mappings.
|
||||
Labels enable users to map their own organizational structures onto system objects
|
||||
in a loosely coupled fashion, without requiring clients to store these mappings.
|
||||
|
||||
Service deployments and batch processing pipelines are often multi-dimensional entities (e.g., multiple partitions or deployments, multiple release tracks, multiple tiers, multiple micro-services per tier). Management often requires cross-cutting operations, which breaks encapsulation of strictly hierarchical representations, especially rigid hierarchies determined by the infrastructure rather than by users.
|
||||
Service deployments and batch processing pipelines are often multi-dimensional entities
|
||||
(e.g., multiple partitions or deployments, multiple release tracks, multiple tiers,
|
||||
multiple micro-services per tier). Management often requires cross-cutting operations,
|
||||
which breaks encapsulation of strictly hierarchical representations, especially rigid
|
||||
hierarchies determined by the infrastructure rather than by users.
|
||||
|
||||
Example labels:
|
||||
|
||||
* `"release" : "stable"`, `"release" : "canary"`
|
||||
* `"environment" : "dev"`, `"environment" : "qa"`, `"environment" : "production"`
|
||||
* `"tier" : "frontend"`, `"tier" : "backend"`, `"tier" : "cache"`
|
||||
* `"partition" : "customerA"`, `"partition" : "customerB"`
|
||||
* `"track" : "daily"`, `"track" : "weekly"`
|
||||
* `"release" : "stable"`, `"release" : "canary"`
|
||||
* `"environment" : "dev"`, `"environment" : "qa"`, `"environment" : "production"`
|
||||
* `"tier" : "frontend"`, `"tier" : "backend"`, `"tier" : "cache"`
|
||||
* `"partition" : "customerA"`, `"partition" : "customerB"`
|
||||
* `"track" : "daily"`, `"track" : "weekly"`
|
||||
|
||||
These are examples of [commonly used labels](/docs/concepts/overview/working-with-objects/common-labels/); you are free to develop your own conventions. Keep in mind that label Key must be unique for a given object.
|
||||
These are examples of
|
||||
[commonly used labels](/docs/concepts/overview/working-with-objects/common-labels/);
|
||||
you are free to develop your own conventions.
|
||||
Keep in mind that label Key must be unique for a given object.
|
||||
|
||||
## Syntax and character set
|
||||
|
||||
_Labels_ are key/value pairs. Valid label keys have two segments: an optional prefix and name, separated by a slash (`/`). The name segment is required and must be 63 characters or less, beginning and ending with an alphanumeric character (`[a-z0-9A-Z]`) with dashes (`-`), underscores (`_`), dots (`.`), and alphanumerics between. The prefix is optional. If specified, the prefix must be a DNS subdomain: a series of DNS labels separated by dots (`.`), not longer than 253 characters in total, followed by a slash (`/`).
|
||||
_Labels_ are key/value pairs. Valid label keys have two segments: an optional
|
||||
prefix and name, separated by a slash (`/`). The name segment is required and
|
||||
must be 63 characters or less, beginning and ending with an alphanumeric
|
||||
character (`[a-z0-9A-Z]`) with dashes (`-`), underscores (`_`), dots (`.`),
|
||||
and alphanumerics between. The prefix is optional. If specified, the prefix
|
||||
must be a DNS subdomain: a series of DNS labels separated by dots (`.`),
|
||||
not longer than 253 characters in total, followed by a slash (`/`).
|
||||
|
||||
If the prefix is omitted, the label Key is presumed to be private to the user. Automated system components (e.g. `kube-scheduler`, `kube-controller-manager`, `kube-apiserver`, `kubectl`, or other third-party automation) which add labels to end-user objects must specify a prefix.
|
||||
If the prefix is omitted, the label Key is presumed to be private to the user.
|
||||
Automated system components (e.g. `kube-scheduler`, `kube-controller-manager`,
|
||||
`kube-apiserver`, `kubectl`, or other third-party automation) which add labels
|
||||
to end-user objects must specify a prefix.
|
||||
|
||||
The `kubernetes.io/` and `k8s.io/` prefixes are [reserved](/docs/reference/labels-annotations-taints/) for Kubernetes core components.
|
||||
The `kubernetes.io/` and `k8s.io/` prefixes are
|
||||
[reserved](/docs/reference/labels-annotations-taints/) for Kubernetes core components.
|
||||
|
||||
Valid label value:
|
||||
|
||||
* must be 63 characters or less (can be empty),
|
||||
* unless empty, must begin and end with an alphanumeric character (`[a-z0-9A-Z]`),
|
||||
* could contain dashes (`-`), underscores (`_`), dots (`.`), and alphanumerics between.
|
||||
|
||||
For example, here's the configuration file for a Pod that has two labels `environment: production` and `app: nginx` :
|
||||
For example, here's a manifest for a Pod that has two labels
|
||||
`environment: production` and `app: nginx`:
|
||||
|
||||
```yaml
|
||||
|
||||
apiVersion: v1
|
||||
kind: Pod
|
||||
metadata:
|
||||
|
@ -74,34 +96,43 @@ spec:
|
|||
image: nginx:1.14.2
|
||||
ports:
|
||||
- containerPort: 80
|
||||
|
||||
```
|
||||
|
||||
## Label selectors
|
||||
|
||||
Unlike [names and UIDs](/docs/concepts/overview/working-with-objects/names/), labels do not provide uniqueness. In general, we expect many objects to carry the same label(s).
|
||||
Unlike [names and UIDs](/docs/concepts/overview/working-with-objects/names/), labels
|
||||
do not provide uniqueness. In general, we expect many objects to carry the same label(s).
|
||||
|
||||
Via a _label selector_, the client/user can identify a set of objects. The label selector is the core grouping primitive in Kubernetes.
|
||||
Via a _label selector_, the client/user can identify a set of objects.
|
||||
The label selector is the core grouping primitive in Kubernetes.
|
||||
|
||||
The API currently supports two types of selectors: _equality-based_ and _set-based_.
|
||||
A label selector can be made of multiple _requirements_ which are comma-separated. In the case of multiple requirements, all must be satisfied so the comma separator acts as a logical _AND_ (`&&`) operator.
|
||||
A label selector can be made of multiple _requirements_ which are comma-separated.
|
||||
In the case of multiple requirements, all must be satisfied so the comma separator
|
||||
acts as a logical _AND_ (`&&`) operator.
|
||||
|
||||
The semantics of empty or non-specified selectors are dependent on the context,
|
||||
and API types that use selectors should document the validity and meaning of
|
||||
them.
|
||||
|
||||
{{< note >}}
|
||||
For some API types, such as ReplicaSets, the label selectors of two instances must not overlap within a namespace, or the controller can see that as conflicting instructions and fail to determine how many replicas should be present.
|
||||
For some API types, such as ReplicaSets, the label selectors of two instances must
|
||||
not overlap within a namespace, or the controller can see that as conflicting
|
||||
instructions and fail to determine how many replicas should be present.
|
||||
{{< /note >}}
|
||||
|
||||
{{< caution >}}
|
||||
For both equality-based and set-based conditions there is no logical _OR_ (`||`) operator. Ensure your filter statements are structured accordingly.
|
||||
For both equality-based and set-based conditions there is no logical _OR_ (`||`) operator.
|
||||
Ensure your filter statements are structured accordingly.
|
||||
{{< /caution >}}
|
||||
|
||||
### _Equality-based_ requirement
|
||||
|
||||
_Equality-_ or _inequality-based_ requirements allow filtering by label keys and values. Matching objects must satisfy all of the specified label constraints, though they may have additional labels as well.
|
||||
Three kinds of operators are admitted `=`,`==`,`!=`. The first two represent _equality_ (and are synonyms), while the latter represents _inequality_. For example:
|
||||
_Equality-_ or _inequality-based_ requirements allow filtering by label keys and values.
|
||||
Matching objects must satisfy all of the specified label constraints, though they may
|
||||
have additional labels as well. Three kinds of operators are admitted: `=`, `==`, `!=`.
|
||||
The first two represent _equality_ (and are synonyms), while the latter represents _inequality_.
|
||||
For example:
|
||||
|
||||
```
|
||||
environment = production
|
||||
|
@ -109,8 +140,9 @@ tier != frontend
|
|||
```
|
||||
|
||||
The former selects all resources with key equal to `environment` and value equal to `production`.
|
||||
The latter selects all resources with key equal to `tier` and value distinct from `frontend`, and all resources with no labels with the `tier` key.
|
||||
One could filter for resources in `production` excluding `frontend` using the comma operator: `environment=production,tier!=frontend`
|
||||
The latter selects all resources with key equal to `tier` and value distinct from `frontend`,
|
||||
and all resources with no labels with the `tier` key. One could filter for resources in `production`
|
||||
excluding `frontend` using the comma operator: `environment=production,tier!=frontend`
|
||||
|
||||
One usage scenario for equality-based label requirement is for Pods to specify
|
||||
node selection criteria. For example, the sample Pod below selects nodes with
|
||||
|
@ -134,7 +166,9 @@ spec:
|
|||
|
||||
### _Set-based_ requirement
|
||||
|
||||
_Set-based_ label requirements allow filtering keys according to a set of values. Three kinds of operators are supported: `in`,`notin` and `exists` (only the key identifier). For example:
|
||||
_Set-based_ label requirements allow filtering keys according to a set of values.
|
||||
Three kinds of operators are supported: `in`, `notin` and `exists` (only the key identifier).
|
||||
For example:
|
||||
|
||||
```
|
||||
environment in (production, qa)
|
||||
|
@ -143,27 +177,38 @@ partition
|
|||
!partition
|
||||
```
|
||||
|
||||
* The first example selects all resources with key equal to `environment` and value equal to `production` or `qa`.
|
||||
* The second example selects all resources with key equal to `tier` and values other than `frontend` and `backend`, and all resources with no labels with the `tier` key.
|
||||
* The third example selects all resources including a label with key `partition`; no values are checked.
|
||||
* The fourth example selects all resources without a label with key `partition`; no values are checked.
|
||||
- The first example selects all resources with key equal to `environment` and value
|
||||
equal to `production` or `qa`.
|
||||
- The second example selects all resources with key equal to `tier` and values other
|
||||
than `frontend` and `backend`, and all resources with no labels with the `tier` key.
|
||||
- The third example selects all resources including a label with key `partition`;
|
||||
no values are checked.
|
||||
- The fourth example selects all resources without a label with key `partition`;
|
||||
no values are checked.
|
||||
|
||||
Similarly the comma separator acts as an _AND_ operator. So filtering resources with a `partition` key (no matter the value) and with `environment` different than `qa` can be achieved using `partition,environment notin (qa)`.
|
||||
The _set-based_ label selector is a general form of equality since `environment=production` is equivalent to `environment in (production)`; similarly for `!=` and `notin`.
|
||||
|
||||
_Set-based_ requirements can be mixed with _equality-based_ requirements. For example: `partition in (customerA, customerB),environment!=qa`.
|
||||
Similarly the comma separator acts as an _AND_ operator. So filtering resources
|
||||
with a `partition` key (no matter the value) and with `environment` different
|
||||
than `qa` can be achieved using `partition,environment notin (qa)`.
|
||||
The _set-based_ label selector is a general form of equality since
|
||||
`environment=production` is equivalent to `environment in (production)`;
|
||||
similarly for `!=` and `notin`.
|
||||
|
||||
_Set-based_ requirements can be mixed with _equality-based_ requirements.
|
||||
For example: `partition in (customerA, customerB),environment!=qa`.
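Such mixed requirements can be used anywhere a label selector is accepted, for instance with `kubectl` (a quick sketch):

```shell
# select resources in customerA or customerB partitions, excluding the qa environment
kubectl get pods -l 'partition in (customerA, customerB),environment!=qa'
```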
|
||||
|
||||
## API
|
||||
|
||||
### LIST and WATCH filtering
|
||||
|
||||
LIST and WATCH operations may specify label selectors to filter the sets of objects returned using a query parameter. Both requirements are permitted (presented here as they would appear in a URL query string):
|
||||
LIST and WATCH operations may specify label selectors to filter the sets of objects
|
||||
returned using a query parameter. Both requirements are permitted
|
||||
(presented here as they would appear in a URL query string):
|
||||
|
||||
* _equality-based_ requirements: `?labelSelector=environment%3Dproduction,tier%3Dfrontend`
|
||||
* _set-based_ requirements: `?labelSelector=environment+in+%28production%2Cqa%29%2Ctier+in+%28frontend%29`
|
||||
* _equality-based_ requirements: `?labelSelector=environment%3Dproduction,tier%3Dfrontend`
|
||||
* _set-based_ requirements: `?labelSelector=environment+in+%28production%2Cqa%29%2Ctier+in+%28frontend%29`
|
||||
|
||||
Both label selector styles can be used to list or watch resources via a REST client. For example, targeting `apiserver` with `kubectl` and using _equality-based_ one may write:
|
||||
Both label selector styles can be used to list or watch resources via a REST client.
|
||||
For example, targeting `apiserver` with `kubectl` and using _equality-based_ one may write:
|
||||
|
||||
```shell
|
||||
kubectl get pods -l environment=production,tier=frontend
|
||||
|
@ -175,13 +220,14 @@ or using _set-based_ requirements:
|
|||
kubectl get pods -l 'environment in (production),tier in (frontend)'
|
||||
```
|
||||
|
||||
As already mentioned _set-based_ requirements are more expressive. For instance, they can implement the _OR_ operator on values:
|
||||
As already mentioned _set-based_ requirements are more expressive.
|
||||
For instance, they can implement the _OR_ operator on values:
|
||||
|
||||
```shell
|
||||
kubectl get pods -l 'environment in (production, qa)'
|
||||
```
|
||||
|
||||
or restricting negative matching via _exists_ operator:
|
||||
or restricting negative matching via _notin_ operator:
|
||||
|
||||
```shell
|
||||
kubectl get pods -l 'environment,environment notin (frontend)'
|
||||
|
@ -196,23 +242,28 @@ also use label selectors to specify sets of other resources, such as
|
|||
|
||||
#### Service and ReplicationController
|
||||
|
||||
The set of pods that a `service` targets is defined with a label selector. Similarly, the population of pods that a `replicationcontroller` should manage is also defined with a label selector.
|
||||
The set of pods that a `service` targets is defined with a label selector.
|
||||
Similarly, the population of pods that a `replicationcontroller` should
|
||||
manage is also defined with a label selector.
|
||||
|
||||
Labels selectors for both objects are defined in `json` or `yaml` files using maps, and only _equality-based_ requirement selectors are supported:
|
||||
Label selectors for both objects are defined in `json` or `yaml` files using maps,
|
||||
and only _equality-based_ requirement selectors are supported:
|
||||
|
||||
```json
|
||||
"selector": {
|
||||
"component" : "redis",
|
||||
}
|
||||
```
|
||||
|
||||
or
|
||||
|
||||
```yaml
|
||||
selector:
|
||||
component: redis
|
||||
component: redis
|
||||
```
|
||||
|
||||
this selector (respectively in `json` or `yaml` format) is equivalent to `component=redis` or `component in (redis)`.
|
||||
This selector (respectively in `json` or `yaml` format) is equivalent to
|
||||
`component=redis` or `component in (redis)`.
|
||||
|
||||
#### Resources that support set-based requirements
|
||||
|
||||
|
@ -227,16 +278,23 @@ selector:
|
|||
matchLabels:
|
||||
component: redis
|
||||
matchExpressions:
|
||||
- {key: tier, operator: In, values: [cache]}
|
||||
- {key: environment, operator: NotIn, values: [dev]}
|
||||
- { key: tier, operator: In, values: [cache] }
|
||||
- { key: environment, operator: NotIn, values: [dev] }
|
||||
```
|
||||
|
||||
`matchLabels` is a map of `{key,value}` pairs. A single `{key,value}` in the `matchLabels` map is equivalent to an element of `matchExpressions`, whose `key` field is "key", the `operator` is "In", and the `values` array contains only "value". `matchExpressions` is a list of pod selector requirements. Valid operators include In, NotIn, Exists, and DoesNotExist. The values set must be non-empty in the case of In and NotIn. All of the requirements, from both `matchLabels` and `matchExpressions` are ANDed together -- they must all be satisfied in order to match.
|
||||
`matchLabels` is a map of `{key,value}` pairs. A single `{key,value}` in the
|
||||
`matchLabels` map is equivalent to an element of `matchExpressions`, whose `key`
|
||||
field is "key", the `operator` is "In", and the `values` array contains only "value".
|
||||
`matchExpressions` is a list of pod selector requirements. Valid operators include
|
||||
In, NotIn, Exists, and DoesNotExist. The values set must be non-empty in the case of
|
||||
In and NotIn. All of the requirements, from both `matchLabels` and `matchExpressions`
|
||||
are ANDed together -- they must all be satisfied in order to match.
|
||||
|
||||
#### Selecting sets of nodes
|
||||
|
||||
One use case for selecting over labels is to constrain the set of nodes onto which a pod can schedule.
|
||||
See the documentation on [node selection](/docs/concepts/scheduling-eviction/assign-pod-node/) for more information.
|
||||
One use case for selecting over labels is to constrain the set of nodes onto which
|
||||
a pod can schedule. See the documentation on
|
||||
[node selection](/docs/concepts/scheduling-eviction/assign-pod-node/) for more information.
|
||||
|
||||
## {{% heading "whatsnext" %}}
|
||||
|
||||
|
|
|
@ -24,6 +24,10 @@ For non-unique user-provided attributes, Kubernetes provides [labels](/docs/conc
|
|||
|
||||
{{< glossary_definition term_id="name" length="all" >}}
|
||||
|
||||
**Names must be unique across all [API versions](/docs/concepts/overview/kubernetes-api/#api-groups-and-versioning)
|
||||
of the same resource. API resources are distinguished by their API group, resource type, namespace
|
||||
(for namespaced resources), and name. In other words, API version is irrelevant in this context.**
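For instance (a small illustration with hypothetical names), creating two Deployments with the same name in one namespace fails:

```shell
kubectl create deployment my-app --image=nginx   # succeeds
kubectl create deployment my-app --image=httpd   # fails: a Deployment named "my-app" already exists
```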
|
||||
|
||||
{{< note >}}
|
||||
When an object represents a physical entity, like a Node representing a physical host, and the host is re-created under the same name without the Node being deleted and re-created, Kubernetes treats the new host as the old one, which may lead to inconsistencies.
|
||||
{{< /note >}}
|
||||
|
|
|
@ -44,7 +44,7 @@ Kubernetes starts with four initial namespaces:
|
|||
: Kubernetes includes this namespace so that you can start using your new cluster without first creating a namespace.
|
||||
|
||||
`kube-node-lease`
|
||||
: This namespace holds [Lease](/docs/reference/kubernetes-api/cluster-resources/lease-v1/) objects associated with each node. Node leases allow the kubelet to send [heartbeats](/docs/concepts/architecture/nodes/#heartbeats) so that the control plane can detect node failure.
|
||||
: This namespace holds [Lease](/docs/concepts/architecture/leases/) objects associated with each node. Node leases allow the kubelet to send [heartbeats](/docs/concepts/architecture/nodes/#heartbeats) so that the control plane can detect node failure.
|
||||
|
||||
`kube-public`
|
||||
: This namespace is readable by *all* clients (including those not authenticated). This namespace is mostly reserved for cluster usage, in case that some resources should be visible and readable publicly throughout the whole cluster. The public aspect of this namespace is only a convention, not a requirement.
|
||||
|
@ -147,7 +147,7 @@ kubectl api-resources --namespaced=false
|
|||
|
||||
## Automatic labelling
|
||||
|
||||
{{< feature-state state="beta" for_k8s_version="1.21" >}}
|
||||
{{< feature-state for_k8s_version="1.22" state="stable" >}}
|
||||
|
||||
The Kubernetes control plane sets an immutable {{< glossary_tooltip text="label" term_id="label" >}}
|
||||
`kubernetes.io/metadata.name` on all namespaces, provided that the `NamespaceDefaultLabelName`
|
||||
|
|
|
@ -8,16 +8,15 @@ content_type: concept
|
|||
weight: 20
|
||||
---
|
||||
|
||||
|
||||
<!-- overview -->
|
||||
|
||||
You can constrain a {{< glossary_tooltip text="Pod" term_id="pod" >}} so that it is
|
||||
You can constrain a {{< glossary_tooltip text="Pod" term_id="pod" >}} so that it is
|
||||
_restricted_ to run on particular {{< glossary_tooltip text="node(s)" term_id="node" >}},
|
||||
or to _prefer_ to run on particular nodes.
|
||||
There are several ways to do this and the recommended approaches all use
|
||||
[label selectors](/docs/concepts/overview/working-with-objects/labels/) to facilitate the selection.
|
||||
Often, you do not need to set any such constraints; the
|
||||
{{< glossary_tooltip text="scheduler" term_id="kube-scheduler" >}} will automatically do a reasonable placement
|
||||
{{< glossary_tooltip text="scheduler" term_id="kube-scheduler" >}} will automatically do a reasonable placement
|
||||
(for example, spreading your Pods across nodes so as not to place Pods on a node with insufficient free resources).
|
||||
However, there are some circumstances where you may want to control which node
|
||||
the Pod deploys to, for example, to ensure that a Pod ends up on a node with an SSD attached to it,
|
||||
|
@ -28,10 +27,10 @@ or to co-locate Pods from two different services that communicate a lot into the
|
|||
You can use any of the following methods to choose where Kubernetes schedules
|
||||
specific Pods:
|
||||
|
||||
* [nodeSelector](#nodeselector) field matching against [node labels](#built-in-node-labels)
|
||||
* [Affinity and anti-affinity](#affinity-and-anti-affinity)
|
||||
* [nodeName](#nodename) field
|
||||
* [Pod topology spread constraints](#pod-topology-spread-constraints)
|
||||
- [nodeSelector](#nodeselector) field matching against [node labels](#built-in-node-labels)
|
||||
- [Affinity and anti-affinity](#affinity-and-anti-affinity)
|
||||
- [nodeName](#nodename) field
|
||||
- [Pod topology spread constraints](#pod-topology-spread-constraints)
|
||||
|
||||
## Node labels {#built-in-node-labels}
|
||||
|
||||
|
@ -51,7 +50,7 @@ and a different value in other environments.
|
|||
Adding labels to nodes allows you to target Pods for scheduling on specific
|
||||
nodes or groups of nodes. You can use this functionality to ensure that specific
|
||||
Pods only run on nodes with certain isolation, security, or regulatory
|
||||
properties.
|
||||
properties.
|
||||
|
||||
If you use labels for node isolation, choose label keys that the {{<glossary_tooltip text="kubelet" term_id="kubelet">}}
|
||||
cannot modify. This prevents a compromised node from setting those labels on
|
||||
|
@ -59,7 +58,7 @@ itself so that the scheduler schedules workloads onto the compromised node.
|
|||
|
||||
The [`NodeRestriction` admission plugin](/docs/reference/access-authn-authz/admission-controllers/#noderestriction)
|
||||
prevents the kubelet from setting or modifying labels with a
|
||||
`node-restriction.kubernetes.io/` prefix.
|
||||
`node-restriction.kubernetes.io/` prefix.
|
||||
|
||||
To make use of that label prefix for node isolation:
|
||||
|
||||
|
@ -73,7 +72,7 @@ To make use of that label prefix for node isolation:
|
|||
You can add the `nodeSelector` field to your Pod specification and specify the
|
||||
[node labels](#built-in-node-labels) you want the target node to have.
|
||||
Kubernetes only schedules the Pod onto nodes that have each of the labels you
|
||||
specify.
|
||||
specify.
|
||||
|
||||
See [Assign Pods to Nodes](/docs/tasks/configure-pod-container/assign-pods-nodes) for more
|
||||
information.
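As a minimal sketch, assuming you have labeled a node with `disktype: ssd`, a Pod restricted to such nodes would include:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: nginx:1.14.2
  nodeSelector:
    disktype: ssd   # assumed label; only nodes carrying it are eligible
```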
|
||||
|
@ -84,20 +83,20 @@ information.
|
|||
labels. Affinity and anti-affinity expands the types of constraints you can
|
||||
define. Some of the benefits of affinity and anti-affinity include:
|
||||
|
||||
* The affinity/anti-affinity language is more expressive. `nodeSelector` only
|
||||
- The affinity/anti-affinity language is more expressive. `nodeSelector` only
|
||||
selects nodes with all the specified labels. Affinity/anti-affinity gives you
|
||||
more control over the selection logic.
|
||||
* You can indicate that a rule is *soft* or *preferred*, so that the scheduler
|
||||
- You can indicate that a rule is *soft* or *preferred*, so that the scheduler
|
||||
still schedules the Pod even if it can't find a matching node.
|
||||
* You can constrain a Pod using labels on other Pods running on the node (or other topological domain),
|
||||
- You can constrain a Pod using labels on other Pods running on the node (or other topological domain),
|
||||
instead of just node labels, which allows you to define rules for which Pods
|
||||
can be co-located on a node.
|
||||
|
||||
The affinity feature consists of two types of affinity:
|
||||
|
||||
* *Node affinity* functions like the `nodeSelector` field but is more expressive and
|
||||
- *Node affinity* functions like the `nodeSelector` field but is more expressive and
|
||||
allows you to specify soft rules.
|
||||
* *Inter-pod affinity/anti-affinity* allows you to constrain Pods against labels
|
||||
- *Inter-pod affinity/anti-affinity* allows you to constrain Pods against labels
|
||||
on other Pods.
|
||||
|
||||
### Node affinity
|
||||
|
@ -106,12 +105,12 @@ Node affinity is conceptually similar to `nodeSelector`, allowing you to constra
Pod can be scheduled on based on node labels. There are two types of node
affinity:

- `requiredDuringSchedulingIgnoredDuringExecution`: The scheduler can't
  schedule the Pod unless the rule is met. This functions like `nodeSelector`,
  but with a more expressive syntax.
- `preferredDuringSchedulingIgnoredDuringExecution`: The scheduler tries to
  find a node that meets the rule. If a matching node is not available, the
  scheduler still schedules the Pod.

{{<note>}}
In the preceding types, `IgnoredDuringExecution` means that if the node labels
@ -127,17 +126,17 @@ For example, consider the following Pod spec:

In this example, the following rules apply:

- The node *must* have a label with the key `topology.kubernetes.io/zone` and
  the value of that label *must* be either `antarctica-east1` or `antarctica-west1`.
- The node *preferably* has a label with the key `another-node-label-key` and
  the value `another-node-label-value`.

You can use the `operator` field to specify a logical operator for Kubernetes to use when
interpreting the rules. You can use `In`, `NotIn`, `Exists`, `DoesNotExist`,
`Gt` and `Lt`.

`NotIn` and `DoesNotExist` allow you to define node anti-affinity behavior.
Alternatively, you can use [node taints](/docs/concepts/scheduling-eviction/taint-and-toleration/)
to repel Pods from specific nodes.

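As a small, hedged illustration of the `NotIn` operator, a node affinity term that keeps a Pod away from two zones could look like this; the Pod name and container image are placeholders:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: avoid-zones-demo             # illustrative name
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: topology.kubernetes.io/zone
            operator: NotIn          # anti-affinity style behavior
            values:
            - antarctica-east1
            - antarctica-west1
  containers:
  - name: app
    image: registry.k8s.io/pause:3.8   # placeholder container
```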
{{<note>}}
@ -168,7 +167,7 @@ The final sum is added to the score of other priority functions for the node.
|
|||
Nodes with the highest total score are prioritized when the scheduler makes a
|
||||
scheduling decision for the Pod.
|
||||
|
||||
For example, consider the following Pod spec:
|
||||
For example, consider the following Pod spec:
|
||||
|
||||
{{< codenew file="pods/pod-with-affinity-anti-affinity.yaml" >}}
|
||||
|
||||
|
@ -268,8 +267,8 @@ to unintended behavior.
|
|||
Similar to [node affinity](#node-affinity) are two types of Pod affinity and
|
||||
anti-affinity as follows:
|
||||
|
||||
* `requiredDuringSchedulingIgnoredDuringExecution`
|
||||
* `preferredDuringSchedulingIgnoredDuringExecution`
|
||||
- `requiredDuringSchedulingIgnoredDuringExecution`
|
||||
- `preferredDuringSchedulingIgnoredDuringExecution`
|
||||
|
||||
For example, you could use
|
||||
`requiredDuringSchedulingIgnoredDuringExecution` affinity to tell the scheduler to
|
||||
|
@ -297,7 +296,7 @@ The affinity rule says that the scheduler can only schedule a Pod onto a node if
|
|||
the node is in the same zone as one or more existing Pods with the label
|
||||
`security=S1`. More precisely, the scheduler must place the Pod on a node that has the
|
||||
`topology.kubernetes.io/zone=V` label, as long as there is at least one node in
|
||||
that zone that currently has one or more Pods with the Pod label `security=S1`.
|
||||
that zone that currently has one or more Pods with the Pod label `security=S1`.
|
||||
|
||||
The anti-affinity rule says that the scheduler should try to avoid scheduling
|
||||
the Pod onto a node that is in the same zone as one or more Pods with the label
|
||||
|
@ -314,9 +313,9 @@ You can use the `In`, `NotIn`, `Exists` and `DoesNotExist` values in the
|
|||
In principle, the `topologyKey` can be any allowed label key with the following
|
||||
exceptions for performance and security reasons:
|
||||
|
||||
* For Pod affinity and anti-affinity, an empty `topologyKey` field is not allowed in both `requiredDuringSchedulingIgnoredDuringExecution`
|
||||
- For Pod affinity and anti-affinity, an empty `topologyKey` field is not allowed in both `requiredDuringSchedulingIgnoredDuringExecution`
|
||||
and `preferredDuringSchedulingIgnoredDuringExecution`.
|
||||
* For `requiredDuringSchedulingIgnoredDuringExecution` Pod anti-affinity rules,
|
||||
- For `requiredDuringSchedulingIgnoredDuringExecution` Pod anti-affinity rules,
|
||||
the admission controller `LimitPodHardAntiAffinityTopology` limits
|
||||
`topologyKey` to `kubernetes.io/hostname`. You can modify or disable the
|
||||
admission controller if you want to allow custom topologies.
|
||||
|
@ -328,17 +327,18 @@ If omitted or empty, `namespaces` defaults to the namespace of the Pod where the
affinity/anti-affinity definition appears.

#### Namespace selector

{{< feature-state for_k8s_version="v1.24" state="stable" >}}

You can also select matching namespaces using `namespaceSelector`, which is a label query over the set of namespaces.
The affinity term is applied to namespaces selected by both `namespaceSelector` and the `namespaces` field.
Note that an empty `namespaceSelector` ({}) matches all namespaces, while a null or empty `namespaces` list and
null `namespaceSelector` matches the namespace of the Pod where the rule is defined.

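A hedged sketch of how `namespaceSelector` sits inside an affinity term follows; the `team: ops` namespace label, the `app: web` Pod label, and the other names are invented for illustration:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: namespace-selector-demo      # illustrative name
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: web                 # assumed Pod label to match against
        namespaceSelector:
          matchLabels:
            team: ops                # only namespaces carrying this label are considered
        topologyKey: kubernetes.io/hostname
  containers:
  - name: app
    image: registry.k8s.io/pause:3.8   # placeholder container
```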
#### More practical use-cases

Inter-pod affinity and anti-affinity can be even more useful when they are used with higher
level collections such as ReplicaSets, StatefulSets, Deployments, etc. These
rules allow you to configure that a set of workloads should
be co-located in the same defined topology; for example, preferring to place two related
Pods onto the same node.
@ -430,10 +430,10 @@ spec:
Creating the two preceding Deployments results in the following cluster layout,
where each web server is co-located with a cache, on three separate nodes.

|    node-1     |    node-2     |    node-3     |
| :-----------: | :-----------: | :-----------: |
| *webserver-1* | *webserver-2* | *webserver-3* |
|   *cache-1*   |   *cache-2*   |   *cache-3*   |

The overall effect is that each cache instance is likely to be accessed by a single client that
is running on the same node. This approach aims to minimize both skew (imbalanced load) and latency.
@ -453,13 +453,18 @@ tries to place the Pod on that node. Using `nodeName` overrules using

Some of the limitations of using `nodeName` to select nodes are:

- If the named node does not exist, the Pod will not run, and in
  some cases may be automatically deleted.
- If the named node does not have the resources to accommodate the
  Pod, the Pod will fail and its reason will indicate why,
  for example OutOfmemory or OutOfcpu.
- Node names in cloud environments are not always predictable or stable.

{{< note >}}
`nodeName` is intended for use by custom schedulers or advanced use cases where
you need to bypass any configured schedulers. Bypassing the schedulers might lead to
failed Pods if the assigned Nodes get oversubscribed. You can use [node affinity](#node-affinity) or the [`nodeSelector` field](#nodeselector) to assign a Pod to a specific Node without bypassing the schedulers.
{{</ note >}}

Here is an example of a Pod spec using the `nodeName` field:

@ -489,12 +494,10 @@ to learn more about how these work.

## {{% heading "whatsnext" %}}

- Read more about [taints and tolerations](/docs/concepts/scheduling-eviction/taint-and-toleration/).
- Read the design docs for [node affinity](https://git.k8s.io/design-proposals-archive/scheduling/nodeaffinity.md)
  and for [inter-pod affinity/anti-affinity](https://git.k8s.io/design-proposals-archive/scheduling/podaffinity.md).
- Learn about how the [topology manager](/docs/tasks/administer-cluster/topology-manager/) takes part in node-level
  resource allocation decisions.
- Learn how to use [nodeSelector](/docs/tasks/configure-pod-container/assign-pods-nodes/).
- Learn how to use [affinity and anti-affinity](/docs/tasks/configure-pod-container/assign-pods-nodes-using-node-affinity/).

@ -6,16 +6,16 @@ weight: 100

{{<glossary_definition term_id="node-pressure-eviction" length="short">}}</br>

The {{<glossary_tooltip term_id="kubelet" text="kubelet">}} monitors resources
like memory, disk space, and filesystem inodes on your cluster's nodes.
When one or more of these resources reach specific consumption levels, the
kubelet can proactively fail one or more pods on the node to reclaim resources
and prevent starvation.

During a node-pressure eviction, the kubelet sets the `PodPhase` for the
selected pods to `Failed`. This terminates the pods.

Node-pressure eviction is not the same as
[API-initiated eviction](/docs/concepts/scheduling-eviction/api-eviction/).

The kubelet does not respect your configured `PodDisruptionBudget` or the pod's
@ -26,7 +26,7 @@ the kubelet respects your configured `eviction-max-pod-grace-period`. If you use
|
|||
If the pods are managed by a {{< glossary_tooltip text="workload" term_id="workload" >}}
|
||||
resource (such as {{< glossary_tooltip text="StatefulSet" term_id="statefulset" >}}
|
||||
or {{< glossary_tooltip text="Deployment" term_id="deployment" >}}) that
|
||||
replaces failed pods, the control plane or `kube-controller-manager` creates new
|
||||
replaces failed pods, the control plane or `kube-controller-manager` creates new
|
||||
pods in place of the evicted pods.
|
||||
|
||||
{{<note>}}
|
||||
|
@ -37,16 +37,16 @@ images when disk resources are starved.

The kubelet uses various parameters to make eviction decisions, like the following:

- Eviction signals
- Eviction thresholds
- Monitoring intervals

### Eviction signals {#eviction-signals}

Eviction signals are the current state of a particular resource at a specific
point in time. Kubelet uses eviction signals to make eviction decisions by
comparing the signals to eviction thresholds, which are the minimum amount of
the resource that should be available on the node.

Kubelet uses the following eviction signals:

@ -60,9 +60,9 @@ Kubelet uses the following eviction signals:
|
|||
| `pid.available` | `pid.available` := `node.stats.rlimit.maxpid` - `node.stats.rlimit.curproc` |
|
||||
|
||||
In this table, the `Description` column shows how kubelet gets the value of the
|
||||
signal. Each signal supports either a percentage or a literal value. Kubelet
|
||||
signal. Each signal supports either a percentage or a literal value. Kubelet
|
||||
calculates the percentage value relative to the total capacity associated with
|
||||
the signal.
|
||||
the signal.
|
||||
|
||||
The value for `memory.available` is derived from the cgroupfs instead of tools
|
||||
like `free -m`. This is important because `free -m` does not work in a
|
||||
|
@ -78,7 +78,7 @@ memory is reclaimable under pressure.
|
|||
The kubelet supports the following filesystem partitions:
|
||||
|
||||
1. `nodefs`: The node's main filesystem, used for local disk volumes, emptyDir,
|
||||
log storage, and more. For example, `nodefs` contains `/var/lib/kubelet/`.
|
||||
log storage, and more. For example, `nodefs` contains `/var/lib/kubelet/`.
|
||||
1. `imagefs`: An optional filesystem that container runtimes use to store container
|
||||
images and container writable layers.
|
||||
|
||||
|
@ -102,10 +102,10 @@ eviction decisions.
|
|||
|
||||
Eviction thresholds have the form `[eviction-signal][operator][quantity]`, where:
|
||||
|
||||
* `eviction-signal` is the [eviction signal](#eviction-signals) to use.
|
||||
* `operator` is the [relational operator](https://en.wikipedia.org/wiki/Relational_operator#Standard_relational_operators)
|
||||
- `eviction-signal` is the [eviction signal](#eviction-signals) to use.
|
||||
- `operator` is the [relational operator](https://en.wikipedia.org/wiki/Relational_operator#Standard_relational_operators)
|
||||
you want, such as `<` (less than).
|
||||
* `quantity` is the eviction threshold amount, such as `1Gi`. The value of `quantity`
|
||||
- `quantity` is the eviction threshold amount, such as `1Gi`. The value of `quantity`
|
||||
must match the quantity representation used by Kubernetes. You can use either
|
||||
literal values or percentages (`%`).
|
||||
|
||||
|
@ -120,22 +120,22 @@ You can configure soft and hard eviction thresholds.
A soft eviction threshold pairs an eviction threshold with a required
administrator-specified grace period. The kubelet does not evict pods until the
grace period is exceeded. The kubelet returns an error on startup if there is no
specified grace period.

You can specify both a soft eviction threshold grace period and a maximum
allowed pod termination grace period for kubelet to use during evictions. If you
specify a maximum allowed grace period and the soft eviction threshold is met,
the kubelet uses the lesser of the two grace periods. If you do not specify a
maximum allowed grace period, the kubelet kills evicted pods immediately without
graceful termination.

You can use the following flags to configure soft eviction thresholds:

- `eviction-soft`: A set of eviction thresholds like `memory.available<1.5Gi`
  that can trigger pod eviction if held over the specified grace period.
- `eviction-soft-grace-period`: A set of eviction grace periods like `memory.available=1m30s`
  that define how long a soft eviction threshold must hold before triggering a Pod eviction.
- `eviction-max-pod-grace-period`: The maximum allowed grace period (in seconds)
  to use when terminating pods in response to a soft eviction threshold being met.

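The same three settings can also be expressed in a kubelet configuration file. The following is a minimal sketch, assuming illustrative threshold and grace-period values rather than recommendations:

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# evict only if memory.available stays below 1.5Gi for longer than 1m30s
evictionSoft:
  memory.available: "1.5Gi"
evictionSoftGracePeriod:
  memory.available: "1m30s"
# give evicted pods at most 60 seconds to terminate gracefully
evictionMaxPodGracePeriod: 60
```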
#### Hard eviction thresholds {#hard-eviction-thresholds}

@ -144,20 +144,20 @@ A hard eviction threshold has no grace period. When a hard eviction threshold is
met, the kubelet kills pods immediately without graceful termination to reclaim
the starved resource.

You can use the `eviction-hard` flag to configure a set of hard eviction
thresholds like `memory.available<1Gi`.

The kubelet has the following default hard eviction thresholds:

- `memory.available<100Mi`
- `nodefs.available<10%`
- `imagefs.available<15%`
- `nodefs.inodesFree<5%` (Linux nodes)

These default hard eviction thresholds are only used if none of the parameters
is changed. If you change the value of any parameter, the values of the other
parameters are not inherited as defaults; they are set to zero instead. To
provide custom values, you should provide all of the thresholds.

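For example, a kubelet configuration sketch that sets custom hard thresholds explicitly; because changing any one value stops the others from defaulting, all four are listed (the values here simply mirror the defaults, for illustration only):

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
evictionHard:
  memory.available: "100Mi"
  nodefs.available: "10%"
  imagefs.available: "15%"
  nodefs.inodesFree: "5%"
```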
### Eviction monitoring interval

@ -169,9 +169,9 @@ which defaults to `10s`.

The kubelet reports node conditions to reflect that the node is under pressure
because a hard or soft eviction threshold is met, independent of configured grace
periods.

The kubelet maps eviction signals to node conditions as follows:

| Node Condition | Eviction Signal | Description |
|-------------------|---------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------|
@ -179,7 +179,7 @@ The kubelet maps eviction signals to node conditions as follows:
| `DiskPressure` | `nodefs.available`, `nodefs.inodesFree`, `imagefs.available`, or `imagefs.inodesFree` | Available disk space and inodes on either the node's root filesystem or image filesystem has satisfied an eviction threshold |
| `PIDPressure` | `pid.available` | Available process identifiers on the (Linux) node have fallen below an eviction threshold |

The kubelet updates the node conditions based on the configured
`--node-status-update-frequency`, which defaults to `10s`.

#### Node condition oscillation

@ -197,17 +197,17 @@ condition to a different state. The transition period has a default value of `5m`.
The kubelet tries to reclaim node-level resources before it evicts end-user pods.

When a `DiskPressure` node condition is reported, the kubelet reclaims node-level
resources based on the filesystems on the node.

#### With `imagefs`

If the node has a dedicated `imagefs` filesystem for container runtimes to use,
the kubelet does the following:

- If the `nodefs` filesystem meets the eviction thresholds, the kubelet garbage collects
  dead pods and containers.
- If the `imagefs` filesystem meets the eviction thresholds, the kubelet
  deletes all unused images.

#### Without `imagefs`

@ -220,7 +220,7 @@ the kubelet frees up disk space in the following order:
|
|||
### Pod selection for kubelet eviction
|
||||
|
||||
If the kubelet's attempts to reclaim node-level resources don't bring the eviction
|
||||
signal below the threshold, the kubelet begins to evict end-user pods.
|
||||
signal below the threshold, the kubelet begins to evict end-user pods.
|
||||
|
||||
The kubelet uses the following parameters to determine the pod eviction order:
|
||||
|
||||
|
@ -238,7 +238,7 @@ As a result, kubelet ranks and evicts pods in the following order:
|
|||
|
||||
{{<note>}}
|
||||
The kubelet does not use the pod's QoS class to determine the eviction order.
|
||||
You can use the QoS class to estimate the most likely pod eviction order when
|
||||
You can use the QoS class to estimate the most likely pod eviction order when
|
||||
reclaiming resources like memory. QoS does not apply to EphemeralStorage requests,
|
||||
so the above scenario will not apply if the node is, for example, under `DiskPressure`.
|
||||
{{</note>}}
|
||||
|
@ -246,7 +246,7 @@ so the above scenario will not apply if the node is, for example, under `DiskPre
|
|||
`Guaranteed` pods are guaranteed only when requests and limits are specified for
|
||||
all the containers and they are equal. These pods will never be evicted because
|
||||
of another pod's resource consumption. If a system daemon (such as `kubelet`
|
||||
and `journald`) is consuming more resources than were reserved via
|
||||
and `journald`) is consuming more resources than were reserved via
|
||||
`system-reserved` or `kube-reserved` allocations, and the node only has
|
||||
`Guaranteed` or `Burstable` pods using less resources than requests left on it,
|
||||
then the kubelet must choose to evict one of these pods to preserve node stability
|
||||
|
@ -277,14 +277,14 @@ disk usage (`local volumes + logs & writable layer of all containers`)
|
|||
|
||||
In some cases, pod eviction only reclaims a small amount of the starved resource.
|
||||
This can lead to the kubelet repeatedly hitting the configured eviction thresholds
|
||||
and triggering multiple evictions.
|
||||
and triggering multiple evictions.
|
||||
|
||||
You can use the `--eviction-minimum-reclaim` flag or a [kubelet config file](/docs/tasks/administer-cluster/kubelet-config-file/)
|
||||
to configure a minimum reclaim amount for each resource. When the kubelet notices
|
||||
that a resource is starved, it continues to reclaim that resource until it
|
||||
reclaims the quantity you specify.
|
||||
reclaims the quantity you specify.
|
||||
|
||||
For example, the following configuration sets minimum reclaim amounts:
|
||||
For example, the following configuration sets minimum reclaim amounts:
|
||||
|
||||
```yaml
|
||||
apiVersion: kubelet.config.k8s.io/v1beta1
|
||||
|
@ -302,10 +302,10 @@ evictionMinimumReclaim:
In this example, if the `nodefs.available` signal meets the eviction threshold,
the kubelet reclaims the resource until the signal reaches the threshold of `1Gi`,
and then continues to reclaim the minimum amount of `500Mi`, until the signal
reaches `1.5Gi`.

Similarly, the kubelet reclaims the `imagefs` resource until the `imagefs.available`
signal reaches `102Gi`.

The default `eviction-minimum-reclaim` is `0` for all resources.

@ -336,7 +336,7 @@ for each container. It then kills the container with the highest score.
|
|||
This means that containers in low QoS pods that consume a large amount of memory
|
||||
relative to their scheduling requests are killed first.
|
||||
|
||||
Unlike pod eviction, if a container is OOM killed, the `kubelet` can restart it
|
||||
Unlike pod eviction, if a container is OOM killed, the `kubelet` can restart it
|
||||
based on its `RestartPolicy`.
|
||||
|
||||
### Best practices {#node-pressure-eviction-good-practices}
|
||||
|
@ -351,9 +351,9 @@ immediately induce memory pressure.

Consider the following scenario:

- Node memory capacity: `10Gi`
- Operator wants to reserve 10% of memory capacity for system daemons (kernel, `kubelet`, etc.)
- Operator wants to evict Pods at 95% memory utilization to reduce incidence of system OOM.

For this to work, the kubelet is launched as follows:

@ -363,18 +363,18 @@ For this to work, the kubelet is launched as follows:
```

In this configuration, the `--system-reserved` flag reserves `1.5Gi` of memory
for the system, which is `10% of the total memory + the eviction threshold amount`.

The node can reach the eviction threshold if a pod is using more than its request,
or if the system is using more than `1Gi` of memory, which makes the `memory.available`
signal fall below `500Mi` and triggers the threshold.

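The same reservation can be written in a kubelet configuration file. This sketch assumes the `10Gi` node from the scenario above and mirrors the flag-based setup; the numbers are illustrative, not recommendations:

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
# 1Gi reserved for system daemons + 500Mi eviction threshold = 1.5Gi
systemReserved:
  memory: "1.5Gi"
evictionHard:
  memory.available: "500Mi"
```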
#### DaemonSet

Pod Priority is a major factor in making eviction decisions. If you do not want
the kubelet to evict pods that belong to a `DaemonSet`, give those pods a high
enough `priorityClass` in the pod spec. You can also use a lower `priorityClass`
or the default to only allow `DaemonSet` pods to run when there are enough
resources.

### Known issues
|
||||
|
@ -386,7 +386,7 @@ The following sections describe known issues related to out of resource handling
|
|||
By default, the kubelet polls `cAdvisor` to collect memory usage stats at a
|
||||
regular interval. If memory usage increases within that window rapidly, the
|
||||
kubelet may not observe `MemoryPressure` fast enough, and the `OOMKiller`
|
||||
will still be invoked.
|
||||
will still be invoked.
|
||||
|
||||
You can use the `--kernel-memcg-notification` flag to enable the `memcg`
|
||||
notification API on the kubelet to get notified immediately when a threshold
|
||||
|
@ -394,29 +394,29 @@ is crossed.

If you are not trying to achieve extreme utilization, but a sensible measure of
overcommit, a viable workaround for this issue is to use the `--kube-reserved`
and `--system-reserved` flags to allocate memory for the system.

#### active_file memory is not considered as available memory

On Linux, the kernel tracks the number of bytes of file-backed memory on the active
LRU list as the `active_file` statistic. The kubelet treats `active_file` memory
areas as not reclaimable. For workloads that make intensive use of block-backed
local storage, including ephemeral local storage, kernel-level caches of file
and block data mean that many recently accessed cache pages are likely to be
counted as `active_file`. If enough of these kernel block buffers are on the
active LRU list, the kubelet is liable to observe this as high resource use and
taint the node as experiencing memory pressure - triggering pod eviction.

For more details, see [https://github.com/kubernetes/kubernetes/issues/43916](https://github.com/kubernetes/kubernetes/issues/43916)

You can work around that behavior by setting the memory limit and memory request
the same for containers likely to perform intensive I/O activity. You will need
to estimate or measure an optimal memory limit value for that container.

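As a sketch of that workaround, the Pod below sets the memory request equal to the memory limit for an I/O-heavy container; the name and image are placeholders:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: io-intensive-app                   # illustrative name
spec:
  containers:
  - name: worker
    image: registry.example/worker:latest  # placeholder image
    resources:
      requests:
        memory: "2Gi"                      # request matches the limit
      limits:
        memory: "2Gi"
```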
## {{% heading "whatsnext" %}}

- Learn about [API-initiated Eviction](/docs/concepts/scheduling-eviction/api-eviction/)
- Learn about [Pod Priority and Preemption](/docs/concepts/scheduling-eviction/pod-priority-preemption/)
- Learn about [PodDisruptionBudgets](/docs/tasks/run-application/configure-pdb/)
- Learn about [Quality of Service](/docs/tasks/configure-pod-container/quality-service-pod/) (QoS)
- Check out the [Eviction API](/docs/reference/generated/kubernetes-api/{{<param "version">}}/#create-eviction-pod-v1-core)

@ -26,22 +26,7 @@ criteria that Pod should be satisfied before considered schedulable. This field
only when a Pod is created (either by the client, or mutated during admission). After creation,
each schedulingGate can be removed in arbitrary order, but addition of a new scheduling gate is disallowed.

{{<mermaid>}}
stateDiagram-v2
    s1: pod created
    s2: pod scheduling gated
    s3: pod scheduling ready
    s4: pod running
    if: empty scheduling gates?
    [*] --> s1
    s1 --> if
    s2 --> if: scheduling gate removed
    if --> s2: no
    if --> s3: yes
    s3 --> s4
    s4 --> [*]
{{< /mermaid >}}

{{< figure src="/docs/images/podSchedulingGates.svg" alt="pod-scheduling-gates-diagram" caption="Figure. Pod SchedulingGates" class="diagram-large" link="https://mermaid.live/edit#pako:eNplkktTwyAUhf8KgzuHWpukaYszutGlK3caFxQuCVMCGSDVTKf_XfKyPlhxz4HDB9wT5lYAptgHFuBRsdKxenFMClMYFIdfUdRYgbiD6ItJTEbR8wpEq5UpUfnDTf-5cbPoJjcbXdcaE61RVJIiqJvQ_Y30D-OCt-t3tFjcR5wZayiVnIGmkv4NiEfX9jijKTmmRH5jf0sRugOP0HyHUc1m6KGMFP27cM28fwSJDluPpNKaXqVJzmFNfHD2APRKSjnNFx9KhIpmzSfhVls3eHdTRrwG8QnxKfEZUUNeYTDBNbiaKRF_5dSfX-BQQQ0FpnEqQLJWhwIX5hyXsjbYl85wTINrgeC2EZd_xFQy7b_VJ6GCdd-itkxALE84dE3fAqXyIUZya6Qqe711OspVCI2ny2Vv35QqVO3-htt66ZWomAvVcZcv8yTfsiSFfJOydZoKvl_ttjLJVlJsblcJw-czwQ0zr9ZeqGDgeR77b2jD8xdtjtDn" >}}
|
||||
## Usage example

To mark a Pod not-ready for scheduling, you can create it with one or more scheduling gates like this:

@ -10,7 +10,7 @@ weight: 80
|
|||
|
||||
<!-- overview -->
|
||||
|
||||
In the [scheduling-plugin](/docs/reference/scheduling/config/#scheduling-plugins) `NodeResourcesFit` of kube-scheduler, there are two
|
||||
In the [scheduling-plugin](/docs/reference/scheduling/config/#scheduling-plugins) `NodeResourcesFit` of kube-scheduler, there are two
|
||||
scoring strategies that support the bin packing of resources: `MostAllocated` and `RequestedToCapacityRatio`.
|
||||
|
||||
<!-- body -->
|
||||
|
@ -42,7 +42,7 @@ profiles:
|
|||
name: NodeResourcesFit
|
||||
```
|
||||
|
||||
To learn more about other parameters and their default configuration, see the API documentation for
|
||||
To learn more about other parameters and their default configuration, see the API documentation for
|
||||
[`NodeResourcesFitArgs`](/docs/reference/config-api/kube-scheduler-config.v1beta3/#kubescheduler-config-k8s-io-v1beta3-NodeResourcesFitArgs).
|
||||
|
||||
## Enabling bin packing using RequestedToCapacityRatio
|
||||
|
@ -55,10 +55,10 @@ configured function of the allocated resources. The behavior of the `RequestedTo
the `NodeResourcesFit` score function can be controlled by the
[scoringStrategy](/docs/reference/config-api/kube-scheduler-config.v1beta3/#kubescheduler-config-k8s-io-v1beta3-ScoringStrategy) field.
Within the `scoringStrategy` field, you can configure two parameters: `requestedToCapacityRatio` and
`resources`. The `shape` in the `requestedToCapacityRatio`
parameter allows the user to tune the function as least requested or most
requested based on `utilization` and `score` values. The `resources` parameter
consists of the `name` of each resource to be considered during scoring and its
`weight`, which specifies the weight of that resource.

Below is an example configuration that sets
@ -87,11 +87,11 @@ profiles:
|
|||
name: NodeResourcesFit
|
||||
```
|
||||
|
||||
Referencing the `KubeSchedulerConfiguration` file with the kube-scheduler
|
||||
flag `--config=/path/to/config/file` will pass the configuration to the
|
||||
Referencing the `KubeSchedulerConfiguration` file with the kube-scheduler
|
||||
flag `--config=/path/to/config/file` will pass the configuration to the
|
||||
scheduler.
|
||||
|
||||
To learn more about other parameters and their default configuration, see the API documentation for
|
||||
To learn more about other parameters and their default configuration, see the API documentation for
|
||||
[`NodeResourcesFitArgs`](/docs/reference/config-api/kube-scheduler-config.v1beta3/#kubescheduler-config-k8s-io-v1beta3-NodeResourcesFitArgs).
|
||||
|
||||
### Tuning the score function
|
||||
|
@ -100,10 +100,10 @@ To learn more about other parameters and their default configuration, see the AP
|
|||
|
||||
```yaml
|
||||
shape:
|
||||
- utilization: 0
|
||||
score: 0
|
||||
- utilization: 100
|
||||
score: 10
|
||||
- utilization: 0
|
||||
score: 0
|
||||
- utilization: 100
|
||||
score: 10
|
||||
```
|
||||
|
||||
The above arguments give the node a `score` of 0 if `utilization` is 0% and 10 for
|
||||
|
@ -120,7 +120,7 @@ shape:
|
|||
|
||||
`resources` is an optional parameter which defaults to:
|
||||
|
||||
``` yaml
|
||||
```yaml
|
||||
resources:
|
||||
- name: cpu
|
||||
weight: 1
|
||||
|
@ -128,7 +128,7 @@ resources:
|
|||
weight: 1
|
||||
```
|
||||
|
||||
It can be used to add extended resources as follows:
|
||||
It can be used to add extended resources as follows:
|
||||
|
||||
```yaml
|
||||
resources:
|
||||
|
@ -188,8 +188,8 @@ intel.com/foo = resourceScoringFunction((2+1),4)
              = (100 - ((4-3)*100/4))
              = (100 - 25)
              = 75                      # requested + used = 75% * available
              = rawScoringFunction(75)
              = 7                       # floor(75/10)

memory = resourceScoringFunction((256+256),1024)
       = (100 -((1024-512)*100/1024))
@ -251,4 +251,3 @@ NodeScore = (5 * 5) + (7 * 1) + (10 * 3) / (5 + 1 + 3)

- Read more about the [scheduling framework](/docs/concepts/scheduling-eviction/scheduling-framework/)
- Read more about [scheduler configuration](/docs/reference/scheduling/config/)

@ -8,8 +8,8 @@ weight: 90
|
|||
|
||||
<!-- overview -->
|
||||
|
||||
The Kubernetes API server is the main point of entry to a cluster for external parties
|
||||
(users and services) interacting with it.
|
||||
The Kubernetes API server is the main point of entry to a cluster for external parties
|
||||
(users and services) interacting with it.
|
||||
|
||||
As part of this role, the API server has several key built-in security controls, such as
|
||||
audit logging and {{< glossary_tooltip text="admission controllers" term_id="admission-controller" >}}.
|
||||
|
@ -48,13 +48,13 @@ API server. However, the Pod still runs on the node. For more information, refer
### Mitigations {#static-pods-mitigations}

- Only [enable the kubelet static Pod manifest functionality](/docs/tasks/configure-pod-container/static-pod/#static-pod-creation)
  if required by the node.
- If a node uses the static Pod functionality, restrict filesystem access to the static Pod manifest directory
  or URL to users who need the access.
- Restrict access to kubelet configuration parameters and files to prevent an attacker setting
  a static Pod path or URL.
- Regularly audit and centrally report all access to directories or web storage locations that host
  static Pod manifests and kubelet configuration files.

## The kubelet API {#kubelet-api}

@ -73,7 +73,7 @@ Direct access to the kubelet API is not subject to admission control and is not
|
|||
by Kubernetes audit logging. An attacker with direct access to this API may be able to
|
||||
bypass controls that detect or prevent certain actions.
|
||||
|
||||
The kubelet API can be configured to authenticate requests in a number of ways.
|
||||
The kubelet API can be configured to authenticate requests in a number of ways.
|
||||
By default, the kubelet configuration allows anonymous access. Most Kubernetes providers
|
||||
change the default to use webhook and certificate authentication. This lets the control plane
|
||||
ensure that the caller is authorized to access the `nodes` API resource or sub-resources.
|
||||
|
@ -86,7 +86,7 @@ The default anonymous access doesn't make this assertion with the control plane.
  such as by monitoring services.
- Restrict access to the kubelet port. Only allow specified and trusted IP address
  ranges to access the port.
- Ensure that [kubelet authentication](/docs/reference/access-authn-authz/kubelet-authn-authz/#kubelet-authentication)
  is set to webhook or certificate mode.
- Ensure that the unauthenticated "read-only" Kubelet port is not enabled on the cluster.

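A hedged sketch of the corresponding kubelet configuration follows, shown only to illustrate which settings the mitigations above refer to; your distribution may already manage these for you:

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
authentication:
  anonymous:
    enabled: false        # reject unauthenticated requests
  webhook:
    enabled: true         # delegate authentication to the API server
authorization:
  mode: Webhook           # delegate authorization checks as well
readOnlyPort: 0           # disable the unauthenticated read-only port
```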
@ -108,7 +108,7 @@ cluster admin rights by accessing cluster secrets or modifying access rules. Eve
|
|||
elevating their Kubernetes RBAC privileges, an attacker who can modify etcd can retrieve any API object
|
||||
or create new workloads inside the cluster.
|
||||
|
||||
Many Kubernetes providers configure
|
||||
Many Kubernetes providers configure
|
||||
etcd to use mutual TLS (both client and server verify each other's certificate for authentication).
|
||||
There is no widely accepted implementation of authorization for the etcd API, although
|
||||
the feature exists. Since there is no authorization model, any certificate
|
||||
|
@ -124,10 +124,9 @@ that are only used for health checking can also grant full read and write access
- Consider restricting access to the etcd port at a network level, to only allow access
  from specified and trusted IP address ranges.

## Container runtime socket {#runtime-socket}

On each node in a Kubernetes cluster, access to interact with containers is controlled
by the container runtime (or runtimes, if you have configured more than one). Typically,
the container runtime exposes a Unix socket that the kubelet can access. An attacker with
access to this socket can launch new containers or interact with running containers.

@ -139,12 +138,12 @@ control plane components.

### Mitigations {#runtime-socket-mitigations}

- Ensure that you tightly control filesystem access to container runtime sockets.
  When possible, restrict this access to the `root` user.
- Isolate the kubelet from other components running on the node, using
  mechanisms such as Linux kernel namespaces.
- Ensure that you restrict or forbid the use of [`hostPath` mounts](/docs/concepts/storage/volumes/#hostpath)
  that include the container runtime socket, either directly or by mounting a parent
  directory. Also, `hostPath` mounts must be set as read-only to mitigate risks
  of attackers bypassing directory restrictions.
- Restrict user access to nodes, and especially restrict superuser access to nodes.
|
@ -131,4 +131,7 @@ current policy level:
|
|||
- [Enforcing Pod Security Standards](/docs/setup/best-practices/enforcing-pod-security-standards)
|
||||
- [Enforce Pod Security Standards by Configuring the Built-in Admission Controller](/docs/tasks/configure-pod-container/enforce-standards-admission-controller)
|
||||
- [Enforce Pod Security Standards with Namespace Labels](/docs/tasks/configure-pod-container/enforce-standards-namespace-labels)
|
||||
- [Migrate from PodSecurityPolicy to the Built-In PodSecurity Admission Controller](/docs/tasks/configure-pod-container/migrate-from-psp)
|
||||
|
||||
If you are running an older version of Kubernetes and want to upgrade
|
||||
to a version of Kubernetes that does not include PodSecurityPolicies,
|
||||
read [migrate from PodSecurityPolicy to the Built-In PodSecurity Admission Controller](/docs/tasks/configure-pod-container/migrate-from-psp).
|
||||
|
|
|
@ -152,7 +152,7 @@ fail validation.
|
|||
<tr>
|
||||
<td style="white-space: nowrap">Host Ports</td>
|
||||
<td>
|
||||
<p>HostPorts should be disallowed, or at minimum restricted to a known list.</p>
|
||||
<p>HostPorts should be disallowed entirely (recommended) or restricted to a known list</p>
|
||||
<p><strong>Restricted Fields</strong></p>
|
||||
<ul>
|
||||
<li><code>spec.containers[*].ports[*].hostPort</code></li>
|
||||
|
@ -162,7 +162,7 @@ fail validation.
|
|||
<p><strong>Allowed Values</strong></p>
|
||||
<ul>
|
||||
<li>Undefined/nil</li>
|
||||
<li>Known list</li>
|
||||
<li>Known list (not supported by the built-in <a href="/docs/concepts/security/pod-security-admission/">Pod Security Admission controller</a>)</li>
|
||||
<li><code>0</code></li>
|
||||
</ul>
|
||||
</td>
|
||||
|
|
|
@ -121,8 +121,20 @@ considered weak.

### Persistent volume creation

As noted in the [PodSecurityPolicy](/docs/concepts/security/pod-security-policy/#volumes-and-file-systems)
documentation, access to create PersistentVolumes can allow for escalation of access to the underlying host.
If someone - or some application - is allowed to create arbitrary PersistentVolumes, that access
includes the creation of `hostPath` volumes, which then means that a Pod would get access
to the underlying host filesystem(s) on the associated node. Granting that ability is a security risk.

There are many ways a container with unrestricted access to the host filesystem can escalate privileges, including
reading data from other containers, and abusing the credentials of system services, such as Kubelet.

You should only allow access to create PersistentVolume objects for:

- users (cluster operators) that need this access for their work, and who you trust,
- the Kubernetes control plane components which create PersistentVolumes based on PersistentVolumeClaims
  that are configured for automatic provisioning.
  This is usually set up by the Kubernetes provider or by the operator when installing a CSI driver.

Where access to persistent storage is required, trusted administrators should create
PersistentVolumes, and constrained users should use PersistentVolumeClaims to access that storage.
@ -0,0 +1,266 @@
---
title: Service Accounts
description: >
  Learn about ServiceAccount objects in Kubernetes.
content_type: concept
weight: 10
---

<!-- overview -->

This page introduces the ServiceAccount object in Kubernetes, providing
information about how service accounts work, use cases, limitations,
alternatives, and links to resources for additional guidance.

<!-- body -->

## What are service accounts? {#what-are-service-accounts}

A service account is a type of non-human account that, in Kubernetes, provides
a distinct identity in a Kubernetes cluster. Application Pods, system
components, and entities inside and outside the cluster can use a specific
ServiceAccount's credentials to identify as that ServiceAccount. This identity
is useful in various situations, including authenticating to the API server or
implementing identity-based security policies.

Service accounts exist as ServiceAccount objects in the API server. Service
accounts have the following properties:

* **Namespaced:** Each service account is bound to a Kubernetes
  {{<glossary_tooltip text="namespace" term_id="namespace">}}. Every namespace
  gets a [`default` ServiceAccount](#default-service-accounts) upon creation.

* **Lightweight:** Service accounts exist in the cluster and are
  defined in the Kubernetes API. You can quickly create service accounts to
  enable specific tasks.

* **Portable:** A configuration bundle for a complex containerized workload
  might include service account definitions for the system's components. The
  lightweight nature of service accounts and the namespaced identities make
  the configurations portable.

Service accounts are different from user accounts, which are authenticated
human users in the cluster. By default, user accounts don't exist in the Kubernetes
API server; instead, the API server treats user identities as opaque
data. You can authenticate as a user account using multiple methods. Some
Kubernetes distributions might add custom extension APIs to represent user
accounts in the API server.

{{< table caption="Comparison between service accounts and users" >}}

| Description | ServiceAccount | User or group |
| --- | --- | --- |
| Location | Kubernetes API (ServiceAccount object) | External |
| Access control | Kubernetes RBAC or other [authorization mechanisms](/docs/reference/access-authn-authz/authorization/#authorization-modules) | Kubernetes RBAC or other identity and access management mechanisms |
| Intended use | Workloads, automation | People |

{{< /table >}}

### Default service accounts {#default-service-accounts}

When you create a cluster, Kubernetes automatically creates a ServiceAccount
object named `default` for every namespace in your cluster. The `default`
service accounts in each namespace get no permissions by default other than the
[default API discovery permissions](/docs/reference/access-authn-authz/rbac/#default-roles-and-role-bindings)
that Kubernetes grants to all authenticated principals if role-based access control (RBAC) is enabled.
If you delete the `default` ServiceAccount object in a namespace, the
{{< glossary_tooltip text="control plane" term_id="control-plane" >}}
replaces it with a new one.

If you deploy a Pod in a namespace, and you don't
[manually assign a ServiceAccount to the Pod](#assign-to-pod), Kubernetes
assigns the `default` ServiceAccount for that namespace to the Pod.

## Use cases for Kubernetes service accounts {#use-cases}
|
||||
|
||||
As a general guideline, you can use service accounts to provide identities in
|
||||
the following scenarios:
|
||||
|
||||
* Your Pods need to communicate with the Kubernetes API server, for example in
|
||||
situations such as the following:
|
||||
* Providing read-only access to sensitive information stored in Secrets.
|
||||
* Granting [cross-namespace access](#cross-namespace), such as allowing a
|
||||
Pod in namespace `example` to read, list, and watch for Lease objects in
|
||||
the `kube-node-lease` namespace.
|
||||
* Your Pods need to communicate with an external service. For example, a
|
||||
workload Pod requires an identity for a commercially available cloud API,
|
||||
and the commercial provider allows configuring a suitable trust relationship.
|
||||
* [Authenticating to a private image registry using an `imagePullSecret`](/docs/tasks/configure-pod-container/configure-service-account/#add-imagepullsecrets-to-a-service-account).
|
||||
* An external service needs to communicate with the Kubernetes API server. For
|
||||
example, authenticating to the cluster as part of a CI/CD pipeline.
|
||||
* You use third-party security software in your cluster that relies on the
|
||||
ServiceAccount identity of different Pods to group those Pods into different
|
||||
contexts.
|
||||
|
||||
|
||||
## How to use service accounts {#how-to-use}

To use a Kubernetes service account, you do the following:

1. Create a ServiceAccount object using a Kubernetes
   client like `kubectl` or a manifest that defines the object.
1. Grant permissions to the ServiceAccount object using an authorization
   mechanism such as
   [RBAC](https://kubernetes.io/docs/reference/access-authn-authz/rbac/).
1. Assign the ServiceAccount object to Pods during Pod creation.

If you're using the identity from an external service,
[retrieve the ServiceAccount token](#get-a-token) and use it from that
service instead.

For instructions, refer to
[Configure Service Accounts for Pods](/docs/tasks/configure-pod-container/configure-service-account/).

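As a minimal sketch of steps 1 and 3 above, the manifests below create a ServiceAccount and reference it from a Pod; the names and image are placeholders, not anything this page prescribes:

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: build-robot                  # illustrative name
  namespace: default
---
apiVersion: v1
kind: Pod
metadata:
  name: build-robot-pod              # illustrative name
spec:
  serviceAccountName: build-robot    # step 3: assign the ServiceAccount to the Pod
  containers:
  - name: app
    image: registry.example/app:latest   # placeholder image
```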
### Grant permissions to a ServiceAccount {#grant-permissions}
|
||||
|
||||
You can use the built-in Kubernetes
|
||||
[role-based access control (RBAC)](/docs/reference/access-authn-authz/rbac/)
|
||||
mechanism to grant the minimum permissions required by each service account.
|
||||
You create a *role*, which grants access, and then *bind* the role to your
|
||||
ServiceAccount. RBAC lets you define a minimum set of permissions so that the
|
||||
service account permissions follow the principle of least privilege. Pods that
|
||||
use that service account don't get more permissions than are required to
|
||||
function correctly.
|
||||
|
||||
For instructions, refer to
|
||||
[ServiceAccount permissions](/docs/reference/access-authn-authz/rbac/#service-account-permissions).
|
||||
|
||||
#### Cross-namespace access using a ServiceAccount {#cross-namespace}

You can use RBAC to allow service accounts in one namespace to perform actions
on resources in a different namespace in the cluster. For example, consider a
scenario where you have a service account and Pod in the `dev` namespace and
you want your Pod to see Jobs running in the `maintenance` namespace. You could
create a Role object that grants permissions to list Job objects. Then,
you'd create a RoleBinding object in the `maintenance` namespace to bind the
Role to the ServiceAccount object. Now, Pods in the `dev` namespace can list
Job objects in the `maintenance` namespace using that service account.

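A hedged sketch of that scenario follows; the Role and RoleBinding names, and the `my-app` ServiceAccount in the `dev` namespace, are invented for illustration:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: job-reader                 # illustrative name
  namespace: maintenance           # permissions apply in this namespace
rules:
- apiGroups: ["batch"]
  resources: ["jobs"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: job-reader-binding         # illustrative name
  namespace: maintenance
subjects:
- kind: ServiceAccount
  name: my-app                     # assumed ServiceAccount in the dev namespace
  namespace: dev
roleRef:
  kind: Role
  name: job-reader
  apiGroup: rbac.authorization.k8s.io
```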
### Assign a ServiceAccount to a Pod {#assign-to-pod}

To assign a ServiceAccount to a Pod, you set the `spec.serviceAccountName`
field in the Pod specification. Kubernetes then automatically provides the
credentials for that ServiceAccount to the Pod. In v1.22 and later, Kubernetes
gets a short-lived, **automatically rotating** token using the `TokenRequest`
API and mounts the token as a
[projected volume](/docs/concepts/storage/projected-volumes/#serviceaccounttoken).

By default, Kubernetes provides the Pod
with the credentials for an assigned ServiceAccount, whether that is the
`default` ServiceAccount or a custom ServiceAccount that you specify.

To prevent Kubernetes from automatically injecting
credentials for a specified ServiceAccount or the `default` ServiceAccount, set the
`automountServiceAccountToken` field in your Pod specification to `false`.

<!-- OK to remove this historical detail after Kubernetes 1.31 is released -->

In versions earlier than 1.22, Kubernetes provides a long-lived, static token
to the Pod as a Secret.

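For example, a Pod that opts out of automatic token mounting might look like this sketch; the Pod name, ServiceAccount name, and image are placeholders:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: no-token-pod                     # illustrative name
spec:
  serviceAccountName: build-robot        # assumed existing ServiceAccount
  automountServiceAccountToken: false    # do not inject the token volume
  containers:
  - name: app
    image: registry.example/app:latest   # placeholder image
```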
#### Manually retrieve ServiceAccount credentials {#get-a-token}
|
||||
|
||||
If you need the credentials for a ServiceAccount to mount in a non-standard
|
||||
location, or for an audience that isn't the API server, use one of the
|
||||
following methods:
|
||||
|
||||
* [TokenRequest API](/docs/reference/kubernetes-api/authentication-resources/token-request-v1/)
|
||||
(recommended): Request a short-lived service account token from within
|
||||
your own *application code*. The token expires automatically and can rotate
|
||||
upon expiration.
|
||||
If you have a legacy application that is not aware of Kubernetes, you
|
||||
could use a sidecar container within the same pod to fetch these tokens
|
||||
and make them available to the application workload.
|
||||
* [Token Volume Projection](/docs/tasks/configure-pod-container/configure-service-account/#service-account-token-volume-projection)
|
||||
(also recommended): In Kubernetes v1.20 and later, use the Pod specification to
|
||||
tell the kubelet to add the service account token to the Pod as a
|
||||
*projected volume*. Projected tokens expire automatically, and the kubelet
|
||||
rotates the token before it expires.
|
||||
* [Service Account Token Secrets](/docs/tasks/configure-pod-container/configure-service-account/#manually-create-a-service-account-api-token)
|
||||
(not recommended): You can mount service account tokens as Kubernetes
|
||||
Secrets in Pods. These tokens don't expire and don't rotate. This method
|
||||
is not recommended, especially at scale, because of the risks associated
|
||||
with static, long-lived credentials. In Kubernetes v1.24 and later, the
|
||||
[LegacyServiceAccountTokenNoAutoGeneration feature gate](/docs/reference/command-line-tools-reference/feature-gates/#feature-gates-for-graduated-or-deprecated-features)
|
||||
prevents Kubernetes from automatically creating these tokens for
|
||||
ServiceAccounts. `LegacyServiceAccountTokenNoAutoGeneration` is enabled
|
||||
by default; in other words, Kubernetes does not create these tokens.
|
||||
|
||||
## Authenticating service account credentials {#authenticating-credentials}
|
||||
|
||||
ServiceAccounts use signed
|
||||
{{<glossary_tooltip term_id="jwt" text="JSON Web Tokens">}} (JWTs)
|
||||
to authenticate to the Kubernetes API server, and to any other system where a
|
||||
trust relationship exists. Depending on how the token was issued
|
||||
(either time-limited using a `TokenRequest` or using a legacy mechanism with
|
||||
a Secret), a ServiceAccount token might also have an expiry time, an audience,
|
||||
and a time after which the token *starts* being valid. When a client that is
|
||||
acting as a ServiceAccount tries to communicate with the Kubernetes API server,
|
||||
the client includes an `Authorization: Bearer <token>` header with the HTTP
|
||||
request. The API server checks the validity of that bearer token as follows:
|
||||
|
||||
1. Check the token signature.
|
||||
1. Check whether the token has expired.
|
||||
1. Check whether object references in the token claims are currently valid.
|
||||
1. Check whether the token is currently valid.
|
||||
1. Check the audience claims.
|
||||
|
||||
The TokenRequest API produces _bound tokens_ for a ServiceAccount. This
|
||||
binding is linked to the lifetime of the client, such as a Pod, that is acting
|
||||
as that ServiceAccount.
|
||||
|
||||
For tokens issued using the `TokenRequest` API, the API server also checks that
|
||||
the specific object reference that is using the ServiceAccount still exists,
|
||||
matching by the {{< glossary_tooltip term_id="uid" text="unique ID" >}} of that
|
||||
object. For legacy tokens that are mounted as Secrets in Pods, the API server
|
||||
checks the token against the Secret.
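
To make those checks concrete, here is an illustrative decoded payload of a bound token; every value below is a placeholder, and the exact claim set depends on how the token was issued:

```yaml
# Illustrative decoded JWT payload for a bound ServiceAccount token (placeholder values)
iss: "https://kubernetes.default.svc"
sub: "system:serviceaccount:default:build-robot"
aud: ["https://kubernetes.default.svc"]   # audience checked by the API server
nbf: 1700000000                           # time after which the token starts being valid
exp: 1700003600                           # expiry time
kubernetes.io:
  namespace: default
  pod:
    name: token-projection-demo           # bound object reference, matched by UID
    uid: 00000000-0000-0000-0000-000000000000
  serviceaccount:
    name: build-robot
    uid: 00000000-0000-0000-0000-000000000000
```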
|
||||
|
||||
For more information about the authentication process, refer to
|
||||
[Authentication](/docs/reference/access-authn-authz/authentication/#service-account-tokens).
|
||||
|
||||
### Authenticating service account credentials in your own code {#authenticating-in-code}
|
||||
|
||||
If you have services of your own that need to validate Kubernetes service
|
||||
account credentials, you can use the following methods:
|
||||
|
||||
* [TokenReview API](/docs/reference/kubernetes-api/authentication-resources/token-review-v1/)
|
||||
(recommended)
|
||||
* OIDC discovery
|
||||
|
||||
The Kubernetes project recommends that you use the TokenReview API, because
|
||||
this method invalidates tokens that are bound to API objects such as Secrets,
|
||||
ServiceAccounts, and Pods when those objects are deleted. For example, if you
|
||||
delete the Pod that contains a projected ServiceAccount token, the cluster
|
||||
invalidates that token immediately and a TokenReview immediately fails.
|
||||
If you use OIDC validation instead, your clients continue to treat the token
|
||||
as valid until the token reaches its expiration timestamp.
|
||||
|
||||
Your application should always define the audience that it accepts, and should
|
||||
check that the token's audiences match the audiences that the application
|
||||
expects. This helps to minimize the scope of the token so that it can only be
|
||||
used in your application and nowhere else.
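
As a sketch of that flow, your service could submit a TokenReview like the one below (the audience is a placeholder) and then inspect `status.authenticated` and `status.user` in the API server's response:

```yaml
apiVersion: authentication.k8s.io/v1
kind: TokenReview
spec:
  # the bearer token that the client presented to your service
  token: "<client-provided token>"
  audiences:
    - my-audience    # placeholder: the audience your service expects
```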
|
||||
|
||||
## Alternatives
|
||||
|
||||
* Issue your own tokens using another mechanism, and then use
|
||||
[Webhook Token Authentication](/docs/reference/access-authn-authz/authentication/#webhook-token-authentication)
|
||||
to validate bearer tokens using your own validation service.
|
||||
* Provide your own identities to Pods.
|
||||
* [Use the SPIFFE CSI driver plugin to provide SPIFFE SVIDs as X.509 certificate pairs to Pods](https://cert-manager.io/docs/projects/csi-driver-spiffe/).
|
||||
{{% thirdparty-content single="true" %}}
|
||||
* [Use a service mesh such as Istio to provide certificates to Pods](https://istio.io/latest/docs/tasks/security/cert-management/plugin-ca-cert/).
|
||||
* Authenticate from outside the cluster to the API server without using service account tokens:
|
||||
* [Configure the API server to accept OpenID Connect (OIDC) tokens from your identity provider](/docs/reference/access-authn-authz/authentication/#openid-connect-tokens).
|
||||
* Use service accounts or user accounts created using an external Identity
|
||||
and Access Management (IAM) service, such as from a cloud provider, to
|
||||
authenticate to your cluster.
|
||||
* [Use the CertificateSigningRequest API with client certificates](/docs/tasks/tls/managing-tls-in-a-cluster/).
|
||||
* [Configure the kubelet to retrieve credentials from an image registry](/docs/tasks/administer-cluster/kubelet-credential-provider/).
|
||||
* Use a Device Plugin to access a virtual Trusted Platform Module (TPM), which
|
||||
then allows authentication using a private key.
|
||||
|
||||
## {{% heading "whatsnext" %}}
|
||||
|
||||
* Learn how to [manage your ServiceAccounts as a cluster administrator](/docs/reference/access-authn-authz/service-accounts-admin/).
|
||||
* Learn how to [assign a ServiceAccount to a Pod](/docs/tasks/configure-pod-container/configure-service-account/).
|
||||
* Read the [ServiceAccount API reference](/docs/reference/kubernetes-api/authentication-resources/service-account-v1/).
|
|
@ -306,7 +306,7 @@ When the Pod above is created, the container `test` gets the following contents
|
|||
in its `/etc/resolv.conf` file:
|
||||
|
||||
```
|
||||
nameserver 1.2.3.4
|
||||
nameserver 192.0.2.1
|
||||
search ns1.svc.cluster-domain.example my.dns.search.suffix
|
||||
options ndots:2 edns0
|
||||
```
|
||||
|
|
|
@ -104,7 +104,7 @@ the pod is also terminating.
|
|||
|
||||
{{< note >}}
|
||||
|
||||
Although `serving` is almost identical to `ready`, it was added to prevent break the existing meaning
|
||||
Although `serving` is almost identical to `ready`, it was added to prevent breaking the existing meaning
|
||||
of `ready`. It may be unexpected for existing clients if `ready` could be `true` for terminating
|
||||
endpoints, since historically terminating endpoints were never included in the Endpoints or
|
||||
EndpointSlice API to begin with. For this reason, `ready` is _always_ `false` for terminating
|
||||
|
|
|
@ -69,7 +69,7 @@ The name of an Ingress object must be a valid
|
|||
[DNS subdomain name](/docs/concepts/overview/working-with-objects/names#dns-subdomain-names).
|
||||
For general information about working with config files, see [deploying applications](/docs/tasks/run-application/run-stateless-application-deployment/), [configuring containers](/docs/tasks/configure-pod-container/configure-pod-configmap/), and [managing resources](/docs/concepts/cluster-administration/manage-deployment/).
|
||||
Ingress frequently uses annotations to configure some options depending on the Ingress controller, an example of which
|
||||
is the [rewrite-target annotation](https://github.com/kubernetes/ingress-nginx/blob/master/docs/examples/rewrite/README.md).
|
||||
is the [rewrite-target annotation](https://github.com/kubernetes/ingress-nginx/blob/main/docs/examples/rewrite/README.md).
|
||||
Different [Ingress controllers](/docs/concepts/services-networking/ingress-controllers) support different annotations. Review the documentation for
|
||||
your choice of Ingress controller to learn which annotations are supported.
|
||||
|
||||
|
|
|
@ -18,22 +18,23 @@ weight: 10
|
|||
|
||||
{{< glossary_definition term_id="service" length="short" >}}
|
||||
|
||||
With Kubernetes you don't need to modify your application to use an unfamiliar service discovery mechanism.
|
||||
Kubernetes gives Pods their own IP addresses and a single DNS name for a set of Pods,
|
||||
and can load-balance across them.
|
||||
A key aim of Services in Kubernetes is that you don't need to modify your existing
|
||||
application to use an unfamiliar service discovery mechanism.
|
||||
You can run code in Pods, whether this is code designed for a cloud-native world, or
|
||||
an older app you've containerized. You use a Service to make that set of Pods available
|
||||
on the network so that clients can interact with it.
|
||||
|
||||
<!-- body -->
|
||||
|
||||
## Motivation
|
||||
|
||||
Kubernetes {{< glossary_tooltip term_id="pod" text="Pods" >}} are created and destroyed
|
||||
to match the desired state of your cluster. Pods are nonpermanent resources.
|
||||
If you use a {{< glossary_tooltip term_id="deployment" >}} to run your app,
|
||||
it can create and destroy Pods dynamically.
|
||||
that Deployment can create and destroy Pods dynamically. From one moment to the next,
|
||||
you don't know how many of those Pods are working and healthy; you might not even know
|
||||
what those healthy Pods are named.
|
||||
Kubernetes {{< glossary_tooltip term_id="pod" text="Pods" >}} are created and destroyed
|
||||
to match the desired state of your cluster. Pods are ephemeral resources (you should not
|
||||
expect that an individual Pod is reliable and durable).
|
||||
|
||||
Each Pod gets its own IP address, however in a Deployment, the set of Pods
|
||||
running in one moment in time could be different from
|
||||
the set of Pods running that application a moment later.
|
||||
Each Pod gets its own IP address (Kubernetes expects network plugins to ensure this).
|
||||
For a given Deployment in your cluster, the set of Pods running in one moment in
|
||||
time could be different from the set of Pods running that application a moment later.
|
||||
|
||||
This leads to a problem: if some set of Pods (call them "backends") provides
|
||||
functionality to other Pods (call them "frontends") inside your cluster,
|
||||
|
@ -42,14 +43,13 @@ to, so that the frontend can use the backend part of the workload?
|
|||
|
||||
Enter _Services_.
|
||||
|
||||
## Service resources {#service-resource}
|
||||
<!-- body -->
|
||||
|
||||
In Kubernetes, a Service is an abstraction which defines a logical set of Pods
|
||||
and a policy by which to access them (sometimes this pattern is called
|
||||
a micro-service). The set of Pods targeted by a Service is usually determined
|
||||
by a {{< glossary_tooltip text="selector" term_id="selector" >}}.
|
||||
To learn about other ways to define Service endpoints,
|
||||
see [Services _without_ selectors](#services-without-selectors).
|
||||
## Services in Kubernetes
|
||||
|
||||
The Service API, part of Kubernetes, is an abstraction to help you expose groups of
|
||||
Pods over a network. Each Service object defines a logical set of endpoints (usually
|
||||
these endpoints are Pods) along with a policy about how to make those pods accessible.
|
||||
|
||||
For example, consider a stateless image-processing backend which is running with
|
||||
3 replicas. Those replicas are fungible—frontends do not care which backend
|
||||
|
@ -59,6 +59,26 @@ track of the set of backends themselves.
|
|||
|
||||
The Service abstraction enables this decoupling.
|
||||
|
||||
The set of Pods targeted by a Service is usually determined
|
||||
by a {{< glossary_tooltip text="selector" term_id="selector" >}} that you
|
||||
define.
|
||||
To learn about other ways to define Service endpoints,
|
||||
see [Services _without_ selectors](#services-without-selectors).
|
||||
|
||||
If your workload speaks HTTP, you might choose to use an
|
||||
[Ingress](/docs/concepts/services-networking/ingress/) to control how web traffic
|
||||
reaches that workload.
|
||||
Ingress is not a Service type, but it acts as the entry point for your
|
||||
cluster. An Ingress lets you consolidate your routing rules into a single resource, so
|
||||
that you can expose multiple components of your workload, running separately in your
|
||||
cluster, behind a single listener.
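
For example, a minimal Ingress that routes one hostname to a Service might look like this sketch (the hostname and Service name are placeholders):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-ingress            # placeholder name
spec:
  rules:
    - host: www.example.com        # placeholder hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-service   # placeholder Service name
                port:
                  number: 80
```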
|
||||
|
||||
The [Gateway](https://gateway-api.sigs.k8s.io/#what-is-the-gateway-api) API for Kubernetes
|
||||
provides extra capabilities beyond Ingress and Service. You can add Gateway to your cluster -
|
||||
it is a family of extension APIs, implemented using
|
||||
{{< glossary_tooltip term_id="CustomResourceDefinition" text="CustomResourceDefinitions" >}} -
|
||||
and then use these to configure access to network services that are running in your cluster.
|
||||
|
||||
### Cloud-native service discovery
|
||||
|
||||
If you're able to use Kubernetes APIs for service discovery in your application,
|
||||
|
@ -69,16 +89,20 @@ whenever the set of Pods in a Service changes.
|
|||
For non-native applications, Kubernetes offers ways to place a network port or load
|
||||
balancer in between your application and the backend Pods.
|
||||
|
||||
Either way, your workload can use these [service discovery](#discovering-services)
|
||||
mechanisms to find the target it wants to connect to.
|
||||
|
||||
## Defining a Service
|
||||
|
||||
A Service in Kubernetes is a REST object, similar to a Pod. Like all of the
|
||||
REST objects, you can `POST` a Service definition to the API server to create
|
||||
a new instance.
|
||||
The name of a Service object must be a valid
|
||||
[RFC 1035 label name](/docs/concepts/overview/working-with-objects/names#rfc-1035-label-names).
|
||||
A Service in Kubernetes is an
|
||||
{{< glossary_tooltip text="object" term_id="object" >}}
|
||||
(the same way that a Pod or a ConfigMap is an object). You can create,
|
||||
view or modify Service definitions using the Kubernetes API. Usually
|
||||
you use a tool such as `kubectl` to make those API calls for you.
|
||||
|
||||
For example, suppose you have a set of Pods where each listens on TCP port 9376
|
||||
and contains a label `app.kubernetes.io/name=MyApp`:
|
||||
For example, suppose you have a set of Pods that each listen on TCP port 9376
|
||||
and are labelled as `app.kubernetes.io/name=MyApp`. You can define a Service to
|
||||
publish that TCP listener:
|
||||
|
||||
```yaml
|
||||
apiVersion: v1
|
||||
|
@ -94,16 +118,20 @@ spec:
|
|||
targetPort: 9376
|
||||
```
|
||||
|
||||
This specification creates a new Service object named "my-service", which
|
||||
targets TCP port 9376 on any Pod with the `app.kubernetes.io/name=MyApp` label.
|
||||
Applying this manifest creates a new Service named "my-service", which
|
||||
targets TCP port 9376 on any Pod with the `app.kubernetes.io/name: MyApp` label.
|
||||
|
||||
Kubernetes assigns this Service an IP address (sometimes called the "cluster IP"),
|
||||
which is used by the Service proxies
|
||||
(see [Virtual IP addressing mechanism](#virtual-ip-addressing-mechanism) below).
|
||||
Kubernetes assigns this Service an IP address (the _cluster IP_),
|
||||
that is used by the virtual IP address mechanism. For more details on that mechanism,
|
||||
read [Virtual IPs and Service Proxies](/docs/reference/networking/virtual-ips/).
|
||||
|
||||
The controller for that Service continuously scans for Pods that
|
||||
match its selector, and then makes any necessary updates to the set of
|
||||
EndpointSlices for the Service.
|
||||
|
||||
The name of a Service object must be a valid
|
||||
[RFC 1035 label name](/docs/concepts/overview/working-with-objects/names#rfc-1035-label-names).
|
||||
|
||||
The controller for the Service selector continuously scans for Pods that
|
||||
match its selector, and then POSTs any updates to an Endpoint object
|
||||
also named "my-service".
|
||||
|
||||
{{< note >}}
|
||||
A Service can map _any_ incoming `port` to a `targetPort`. By default and
|
||||
|
@ -177,8 +205,8 @@ For example:
|
|||
* You are migrating a workload to Kubernetes. While evaluating the approach,
|
||||
you run only a portion of your backends in Kubernetes.
|
||||
|
||||
In any of these scenarios you can define a Service _without_ a Pod selector.
|
||||
For example:
|
||||
In any of these scenarios you can define a Service _without_ specifying a
|
||||
selector to match Pods. For example:
|
||||
|
||||
```yaml
|
||||
apiVersion: v1
|
||||
|
@ -262,9 +290,9 @@ selector will fail due to this constraint. This prevents the Kubernetes API serv
|
|||
from being used as a proxy to endpoints the caller may not be authorized to access.
|
||||
{{< /note >}}
|
||||
|
||||
An ExternalName Service is a special case of Service that does not have
|
||||
An `ExternalName` Service is a special case of Service that does not have
|
||||
selectors and uses DNS names instead. For more information, see the
|
||||
[ExternalName](#externalname) section later in this document.
|
||||
[ExternalName](#externalname) section.
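
As a brief sketch (the Service name and external hostname are placeholders), such a Service maps an in-cluster name to an external DNS name:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-database              # placeholder name
spec:
  type: ExternalName
  externalName: db.example.com   # placeholder external DNS name
```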
|
||||
|
||||
### EndpointSlices
|
||||
|
||||
|
@ -436,7 +464,7 @@ the port number for `http`, as well as the IP address.
|
|||
|
||||
The Kubernetes DNS server is the only way to access `ExternalName` Services.
|
||||
You can find more information about `ExternalName` resolution in
|
||||
[DNS Pods and Services](/docs/concepts/services-networking/dns-pod-service/).
|
||||
[DNS for Services and Pods](/docs/concepts/services-networking/dns-pod-service/).
|
||||
|
||||
## Headless Services
|
||||
|
||||
|
@ -483,6 +511,8 @@ Kubernetes `ServiceTypes` allow you to specify what kind of Service you want.
|
|||
* `ClusterIP`: Exposes the Service on a cluster-internal IP. Choosing this value
|
||||
makes the Service only reachable from within the cluster. This is the
|
||||
default that is used if you don't explicitly specify a `type` for a Service.
|
||||
You can expose the service to the public with an [Ingress](/docs/concepts/services-networking/ingress/) or the
|
||||
[Gateway API](https://gateway-api.sigs.k8s.io/).
|
||||
* [`NodePort`](#type-nodeport): Exposes the Service on each Node's IP at a static port
|
||||
(the `NodePort`).
|
||||
To make the node port available, Kubernetes sets up a cluster IP address,
|
||||
|
@ -702,7 +732,7 @@ In a split-horizon DNS environment you would need two Services to be able to rou
|
|||
and internal traffic to your endpoints.
|
||||
|
||||
To set an internal load balancer, add one of the following annotations to your Service
|
||||
depending on the cloud Service provider you're using.
|
||||
depending on the cloud service provider you're using:
|
||||
|
||||
{{< tabs name="service_tabs" >}}
|
||||
{{% tab name="Default" %}}
|
||||
|
@ -1149,9 +1179,9 @@ spec:
|
|||
- name: http
|
||||
protocol: TCP
|
||||
port: 80
|
||||
targetPort: 9376
|
||||
targetPort: 49152
|
||||
externalIPs:
|
||||
- 80.11.12.10
|
||||
- 198.51.100.32
|
||||
```
|
||||
|
||||
## Session stickiness
|
||||
|
@ -1176,12 +1206,17 @@ mechanism Kubernetes provides to expose a Service with a virtual IP address.
|
|||
|
||||
## {{% heading "whatsnext" %}}
|
||||
|
||||
* Follow the [Connecting Applications with Services](/docs/tutorials/services/connect-applications-service/) tutorial
|
||||
* Read about [Ingress](/docs/concepts/services-networking/ingress/)
|
||||
* Read about [EndpointSlices](/docs/concepts/services-networking/endpoint-slices/)
|
||||
Learn more about Services and how they fit into Kubernetes:
|
||||
* Follow the [Connecting Applications with Services](/docs/tutorials/services/connect-applications-service/) tutorial.
|
||||
* Read about [Ingress](/docs/concepts/services-networking/ingress/), which
|
||||
exposes HTTP and HTTPS routes from outside the cluster to Services within
|
||||
your cluster.
|
||||
* Read about [Gateway](https://gateway-api.sigs.k8s.io/), an extension to
|
||||
Kubernetes that provides more flexibility than Ingress.
|
||||
|
||||
For more context:
|
||||
* Read [Virtual IPs and Service Proxies](/docs/reference/networking/virtual-ips/)
|
||||
* Read the [API reference](/docs/reference/kubernetes-api/service-resources/service-v1/) for the Service API
|
||||
* Read the [API reference](/docs/reference/kubernetes-api/service-resources/endpoints-v1/) for the Endpoints API
|
||||
* Read the [API reference](/docs/reference/kubernetes-api/service-resources/endpoint-slice-v1/) for the EndpointSlice API
|
||||
For more context, read the following:
|
||||
* [Virtual IPs and Service Proxies](/docs/reference/networking/virtual-ips/)
|
||||
* [EndpointSlices](/docs/concepts/services-networking/endpoint-slices/)
|
||||
* [Service API reference](/docs/reference/kubernetes-api/service-resources/service-v1/)
|
||||
* [EndpointSlice API reference](/docs/reference/kubernetes-api/service-resources/endpoint-slice-v1/)
|
||||
* [Endpoint API reference (legacy)](/docs/reference/kubernetes-api/service-resources/endpoints-v1/)
|
||||
|
|
|
@ -126,7 +126,7 @@ zone.
|
|||
|
||||
5. **A zone is not represented in hints:** If the kube-proxy is unable to find
|
||||
at least one endpoint with a hint targeting the zone it is running in, it falls
|
||||
to using endpoints from all zones. This is most likely to happen as you add
|
||||
back to using endpoints from all zones. This is most likely to happen as you add
|
||||
a new zone into your existing cluster.
|
||||
|
||||
## Constraints
|
||||
|
|
|
@ -16,25 +16,45 @@ weight: 20
|
|||
|
||||
<!-- overview -->
|
||||
|
||||
This document describes _persistent volumes_ in Kubernetes. Familiarity with [volumes](/docs/concepts/storage/volumes/) is suggested.
|
||||
This document describes _persistent volumes_ in Kubernetes. Familiarity with
|
||||
[volumes](/docs/concepts/storage/volumes/) is suggested.
|
||||
|
||||
<!-- body -->
|
||||
|
||||
## Introduction
|
||||
|
||||
Managing storage is a distinct problem from managing compute instances. The PersistentVolume subsystem provides an API for users and administrators that abstracts details of how storage is provided from how it is consumed. To do this, we introduce two new API resources: PersistentVolume and PersistentVolumeClaim.
|
||||
Managing storage is a distinct problem from managing compute instances.
|
||||
The PersistentVolume subsystem provides an API for users and administrators
|
||||
that abstracts details of how storage is provided from how it is consumed.
|
||||
To do this, we introduce two new API resources: PersistentVolume and PersistentVolumeClaim.
|
||||
|
||||
A _PersistentVolume_ (PV) is a piece of storage in the cluster that has been provisioned by an administrator or dynamically provisioned using [Storage Classes](/docs/concepts/storage/storage-classes/). It is a resource in the cluster just like a node is a cluster resource. PVs are volume plugins like Volumes, but have a lifecycle independent of any individual Pod that uses the PV. This API object captures the details of the implementation of the storage, be that NFS, iSCSI, or a cloud-provider-specific storage system.
|
||||
A _PersistentVolume_ (PV) is a piece of storage in the cluster that has been
|
||||
provisioned by an administrator or dynamically provisioned using
|
||||
[Storage Classes](/docs/concepts/storage/storage-classes/). It is a resource in
|
||||
the cluster just like a node is a cluster resource. PVs are volume plugins like
|
||||
Volumes, but have a lifecycle independent of any individual Pod that uses the PV.
|
||||
This API object captures the details of the implementation of the storage, be that
|
||||
NFS, iSCSI, or a cloud-provider-specific storage system.
|
||||
|
||||
A _PersistentVolumeClaim_ (PVC) is a request for storage by a user. It is similar to a Pod. Pods consume node resources and PVCs consume PV resources. Pods can request specific levels of resources (CPU and Memory). Claims can request specific size and access modes (e.g., they can be mounted ReadWriteOnce, ReadOnlyMany or ReadWriteMany, see [AccessModes](#access-modes)).
|
||||
A _PersistentVolumeClaim_ (PVC) is a request for storage by a user. It is similar
|
||||
to a Pod. Pods consume node resources and PVCs consume PV resources. Pods can
|
||||
request specific levels of resources (CPU and Memory). Claims can request specific
|
||||
size and access modes (e.g., they can be mounted ReadWriteOnce, ReadOnlyMany or
|
||||
ReadWriteMany, see [AccessModes](#access-modes)).
|
||||
|
||||
While PersistentVolumeClaims allow a user to consume abstract storage resources, it is common that users need PersistentVolumes with varying properties, such as performance, for different problems. Cluster administrators need to be able to offer a variety of PersistentVolumes that differ in more ways than size and access modes, without exposing users to the details of how those volumes are implemented. For these needs, there is the _StorageClass_ resource.
|
||||
While PersistentVolumeClaims allow a user to consume abstract storage resources,
|
||||
it is common that users need PersistentVolumes with varying properties, such as
|
||||
performance, for different problems. Cluster administrators need to be able to
|
||||
offer a variety of PersistentVolumes that differ in more ways than size and access
|
||||
modes, without exposing users to the details of how those volumes are implemented.
|
||||
For these needs, there is the _StorageClass_ resource.
|
||||
|
||||
See the [detailed walkthrough with working examples](/docs/tasks/configure-pod-container/configure-persistent-volume-storage/).
|
||||
|
||||
## Lifecycle of a volume and claim
|
||||
|
||||
PVs are resources in the cluster. PVCs are requests for those resources and also act as claim checks to the resource. The interaction between PVs and PVCs follows this lifecycle:
|
||||
PVs are resources in the cluster. PVCs are requests for those resources and also act
|
||||
as claim checks to the resource. The interaction between PVs and PVCs follows this lifecycle:
|
||||
|
||||
### Provisioning
|
||||
|
||||
|
@ -42,7 +62,9 @@ There are two ways PVs may be provisioned: statically or dynamically.
|
|||
|
||||
#### Static
|
||||
|
||||
A cluster administrator creates a number of PVs. They carry the details of the real storage, which is available for use by cluster users. They exist in the Kubernetes API and are available for consumption.
|
||||
A cluster administrator creates a number of PVs. They carry the details of the
|
||||
real storage, which is available for use by cluster users. They exist in the
|
||||
Kubernetes API and are available for consumption.
|
||||
|
||||
#### Dynamic
|
||||
|
||||
|
@ -55,7 +77,8 @@ provisioning to occur. Claims that request the class `""` effectively disable
|
|||
dynamic provisioning for themselves.
|
||||
|
||||
To enable dynamic storage provisioning based on storage class, the cluster administrator
|
||||
needs to enable the `DefaultStorageClass` [admission controller](/docs/reference/access-authn-authz/admission-controllers/#defaultstorageclass)
|
||||
needs to enable the `DefaultStorageClass`
|
||||
[admission controller](/docs/reference/access-authn-authz/admission-controllers/#defaultstorageclass)
|
||||
on the API server. This can be done, for example, by ensuring that `DefaultStorageClass` is
|
||||
among the comma-delimited, ordered list of values for the `--enable-admission-plugins` flag of
|
||||
the API server component. For more information on API server command-line flags,
|
||||
|
@ -63,26 +86,51 @@ check [kube-apiserver](/docs/admin/kube-apiserver/) documentation.
|
|||
|
||||
### Binding
|
||||
|
||||
A user creates, or in the case of dynamic provisioning, has already created, a PersistentVolumeClaim with a specific amount of storage requested and with certain access modes. A control loop in the master watches for new PVCs, finds a matching PV (if possible), and binds them together. If a PV was dynamically provisioned for a new PVC, the loop will always bind that PV to the PVC. Otherwise, the user will always get at least what they asked for, but the volume may be in excess of what was requested. Once bound, PersistentVolumeClaim binds are exclusive, regardless of how they were bound. A PVC to PV binding is a one-to-one mapping, using a ClaimRef which is a bi-directional binding between the PersistentVolume and the PersistentVolumeClaim.
|
||||
A user creates, or in the case of dynamic provisioning, has already created,
|
||||
a PersistentVolumeClaim with a specific amount of storage requested and with
|
||||
certain access modes. A control loop in the master watches for new PVCs, finds
|
||||
a matching PV (if possible), and binds them together. If a PV was dynamically
|
||||
provisioned for a new PVC, the loop will always bind that PV to the PVC. Otherwise,
|
||||
the user will always get at least what they asked for, but the volume may be in
|
||||
excess of what was requested. Once bound, PersistentVolumeClaim binds are exclusive,
|
||||
regardless of how they were bound. A PVC to PV binding is a one-to-one mapping,
|
||||
using a ClaimRef which is a bi-directional binding between the PersistentVolume
|
||||
and the PersistentVolumeClaim.
|
||||
|
||||
Claims will remain unbound indefinitely if a matching volume does not exist. Claims will be bound as matching volumes become available. For example, a cluster provisioned with many 50Gi PVs would not match a PVC requesting 100Gi. The PVC can be bound when a 100Gi PV is added to the cluster.
|
||||
Claims will remain unbound indefinitely if a matching volume does not exist.
|
||||
Claims will be bound as matching volumes become available. For example, a
|
||||
cluster provisioned with many 50Gi PVs would not match a PVC requesting 100Gi.
|
||||
The PVC can be bound when a 100Gi PV is added to the cluster.
|
||||
|
||||
### Using
|
||||
|
||||
Pods use claims as volumes. The cluster inspects the claim to find the bound volume and mounts that volume for a Pod. For volumes that support multiple access modes, the user specifies which mode is desired when using their claim as a volume in a Pod.
|
||||
Pods use claims as volumes. The cluster inspects the claim to find the bound
|
||||
volume and mounts that volume for a Pod. For volumes that support multiple
|
||||
access modes, the user specifies which mode is desired when using their claim
|
||||
as a volume in a Pod.
|
||||
|
||||
Once a user has a claim and that claim is bound, the bound PV belongs to the user for as long as they need it. Users schedule Pods and access their claimed PVs by including a `persistentVolumeClaim` section in a Pod's `volumes` block. See [Claims As Volumes](#claims-as-volumes) for more details on this.
|
||||
Once a user has a claim and that claim is bound, the bound PV belongs to the
|
||||
user for as long as they need it. Users schedule Pods and access their claimed
|
||||
PVs by including a `persistentVolumeClaim` section in a Pod's `volumes` block.
|
||||
See [Claims As Volumes](#claims-as-volumes) for more details on this.
|
||||
|
||||
### Storage Object in Use Protection
|
||||
The purpose of the Storage Object in Use Protection feature is to ensure that PersistentVolumeClaims (PVCs) in active use by a Pod and PersistentVolume (PVs) that are bound to PVCs are not removed from the system, as this may result in data loss.
|
||||
|
||||
The purpose of the Storage Object in Use Protection feature is to ensure that
|
||||
PersistentVolumeClaims (PVCs) in active use by a Pod and PersistentVolumes (PVs)
|
||||
that are bound to PVCs are not removed from the system, as this may result in data loss.
|
||||
|
||||
{{< note >}}
|
||||
A PVC is in active use by a Pod when a Pod object exists that is using the PVC.
|
||||
{{< /note >}}
|
||||
|
||||
If a user deletes a PVC in active use by a Pod, the PVC is not removed immediately. PVC removal is postponed until the PVC is no longer actively used by any Pods. Also, if an admin deletes a PV that is bound to a PVC, the PV is not removed immediately. PV removal is postponed until the PV is no longer bound to a PVC.
|
||||
If a user deletes a PVC in active use by a Pod, the PVC is not removed immediately.
|
||||
PVC removal is postponed until the PVC is no longer actively used by any Pods. Also,
|
||||
if an admin deletes a PV that is bound to a PVC, the PV is not removed immediately.
|
||||
PV removal is postponed until the PV is no longer bound to a PVC.
|
||||
|
||||
You can see that a PVC is protected when the PVC's status is `Terminating` and the `Finalizers` list includes `kubernetes.io/pvc-protection`:
|
||||
You can see that a PVC is protected when the PVC's status is `Terminating` and the
|
||||
`Finalizers` list includes `kubernetes.io/pvc-protection`:
|
||||
|
||||
```shell
|
||||
kubectl describe pvc hostpath
|
||||
|
@ -98,7 +146,8 @@ Finalizers: [kubernetes.io/pvc-protection]
|
|||
...
|
||||
```
|
||||
|
||||
You can see that a PV is protected when the PV's status is `Terminating` and the `Finalizers` list includes `kubernetes.io/pv-protection` too:
|
||||
You can see that a PV is protected when the PV's status is `Terminating` and
|
||||
the `Finalizers` list includes `kubernetes.io/pv-protection` too:
|
||||
|
||||
```shell
|
||||
kubectl describe pv task-pv-volume
|
||||
|
@ -122,29 +171,48 @@ Events: <none>
|
|||
|
||||
### Reclaiming
|
||||
|
||||
When a user is done with their volume, they can delete the PVC objects from the API that allows reclamation of the resource. The reclaim policy for a PersistentVolume tells the cluster what to do with the volume after it has been released of its claim. Currently, volumes can either be Retained, Recycled, or Deleted.
|
||||
When a user is done with their volume, they can delete the PVC objects from the
|
||||
API, which allows reclamation of the resource. The reclaim policy for a PersistentVolume
|
||||
tells the cluster what to do with the volume after it has been released of its claim.
|
||||
Currently, volumes can either be Retained, Recycled, or Deleted.
|
||||
|
||||
#### Retain
|
||||
|
||||
The `Retain` reclaim policy allows for manual reclamation of the resource. When the PersistentVolumeClaim is deleted, the PersistentVolume still exists and the volume is considered "released". But it is not yet available for another claim because the previous claimant's data remains on the volume. An administrator can manually reclaim the volume with the following steps.
|
||||
The `Retain` reclaim policy allows for manual reclamation of the resource.
|
||||
When the PersistentVolumeClaim is deleted, the PersistentVolume still exists
|
||||
and the volume is considered "released". But it is not yet available for
|
||||
another claim because the previous claimant's data remains on the volume.
|
||||
An administrator can manually reclaim the volume with the following steps.
|
||||
|
||||
1. Delete the PersistentVolume. The associated storage asset in external infrastructure (such as an AWS EBS, GCE PD, Azure Disk, or Cinder volume) still exists after the PV is deleted.
|
||||
1. Delete the PersistentVolume. The associated storage asset in external infrastructure
|
||||
(such as an AWS EBS, GCE PD, Azure Disk, or Cinder volume) still exists after the PV is deleted.
|
||||
1. Manually clean up the data on the associated storage asset accordingly.
|
||||
1. Manually delete the associated storage asset.
|
||||
|
||||
If you want to reuse the same storage asset, create a new PersistentVolume with the same storage asset definition.
|
||||
If you want to reuse the same storage asset, create a new PersistentVolume with
|
||||
the same storage asset definition.
|
||||
|
||||
#### Delete
|
||||
|
||||
For volume plugins that support the `Delete` reclaim policy, deletion removes both the PersistentVolume object from Kubernetes, as well as the associated storage asset in the external infrastructure, such as an AWS EBS, GCE PD, Azure Disk, or Cinder volume. Volumes that were dynamically provisioned inherit the [reclaim policy of their StorageClass](#reclaim-policy), which defaults to `Delete`. The administrator should configure the StorageClass according to users' expectations; otherwise, the PV must be edited or patched after it is created. See [Change the Reclaim Policy of a PersistentVolume](/docs/tasks/administer-cluster/change-pv-reclaim-policy/).
|
||||
For volume plugins that support the `Delete` reclaim policy, deletion removes
|
||||
both the PersistentVolume object from Kubernetes, as well as the associated
|
||||
storage asset in the external infrastructure, such as an AWS EBS, GCE PD,
|
||||
Azure Disk, or Cinder volume. Volumes that were dynamically provisioned
|
||||
inherit the [reclaim policy of their StorageClass](#reclaim-policy), which
|
||||
defaults to `Delete`. The administrator should configure the StorageClass
|
||||
according to users' expectations; otherwise, the PV must be edited or
|
||||
patched after it is created. See
|
||||
[Change the Reclaim Policy of a PersistentVolume](/docs/tasks/administer-cluster/change-pv-reclaim-policy/).
|
||||
|
||||
#### Recycle
|
||||
|
||||
{{< warning >}}
|
||||
The `Recycle` reclaim policy is deprecated. Instead, the recommended approach is to use dynamic provisioning.
|
||||
The `Recycle` reclaim policy is deprecated. Instead, the recommended approach
|
||||
is to use dynamic provisioning.
|
||||
{{< /warning >}}
|
||||
|
||||
If supported by the underlying volume plugin, the `Recycle` reclaim policy performs a basic scrub (`rm -rf /thevolume/*`) on the volume and makes it available again for a new claim.
|
||||
If supported by the underlying volume plugin, the `Recycle` reclaim policy performs
|
||||
a basic scrub (`rm -rf /thevolume/*`) on the volume and makes it available again for a new claim.
|
||||
|
||||
However, an administrator can configure a custom recycler Pod template using
|
||||
the Kubernetes controller manager command line arguments as described in the
|
||||
|
@ -173,7 +241,8 @@ spec:
|
|||
mountPath: /scrub
|
||||
```
|
||||
|
||||
However, the particular path specified in the custom recycler Pod template in the `volumes` part is replaced with the particular path of the volume that is being recycled.
|
||||
However, the particular path specified in the custom recycler Pod template in the
|
||||
`volumes` part is replaced with the particular path of the volume that is being recycled.
|
||||
|
||||
### PersistentVolume deletion protection finalizer
|
||||
{{< feature-state for_k8s_version="v1.23" state="alpha" >}}
|
||||
|
@ -181,10 +250,12 @@ However, the particular path specified in the custom recycler Pod template in th
|
|||
Finalizers can be added on a PersistentVolume to ensure that PersistentVolumes
|
||||
having `Delete` reclaim policy are deleted only after the backing storage are deleted.
|
||||
|
||||
The newly introduced finalizers `kubernetes.io/pv-controller` and `external-provisioner.volume.kubernetes.io/finalizer`
|
||||
The newly introduced finalizers `kubernetes.io/pv-controller` and
|
||||
`external-provisioner.volume.kubernetes.io/finalizer`
|
||||
are only added to dynamically provisioned volumes.
|
||||
|
||||
The finalizer `kubernetes.io/pv-controller` is added to in-tree plugin volumes. The following is an example
|
||||
The finalizer `kubernetes.io/pv-controller` is added to in-tree plugin volumes.
|
||||
The following is an example:
|
||||
|
||||
```shell
|
||||
kubectl describe pv pvc-74a498d6-3929-47e8-8c02-078c1ece4d78
|
||||
|
@ -213,6 +284,7 @@ Events: <none>
|
|||
|
||||
The finalizer `external-provisioner.volume.kubernetes.io/finalizer` is added for CSI volumes.
|
||||
The following is an example:
|
||||
|
||||
```shell
|
||||
Name: pvc-2f0bab97-85a8-4552-8044-eb8be45cf48d
|
||||
Labels: <none>
|
||||
|
@ -244,14 +316,17 @@ the `kubernetes.io/pv-controller` finalizer is replaced by the
|
|||
|
||||
### Reserving a PersistentVolume
|
||||
|
||||
The control plane can [bind PersistentVolumeClaims to matching PersistentVolumes](#binding) in the
|
||||
cluster. However, if you want a PVC to bind to a specific PV, you need to pre-bind them.
|
||||
The control plane can [bind PersistentVolumeClaims to matching PersistentVolumes](#binding)
|
||||
in the cluster. However, if you want a PVC to bind to a specific PV, you need to pre-bind them.
|
||||
|
||||
By specifying a PersistentVolume in a PersistentVolumeClaim, you declare a binding between that specific PV and PVC.
|
||||
If the PersistentVolume exists and has not reserved PersistentVolumeClaims through its `claimRef` field, then the PersistentVolume and PersistentVolumeClaim will be bound.
|
||||
By specifying a PersistentVolume in a PersistentVolumeClaim, you declare a binding
|
||||
between that specific PV and PVC. If the PersistentVolume exists and has not reserved
|
||||
PersistentVolumeClaims through its `claimRef` field, then the PersistentVolume and
|
||||
PersistentVolumeClaim will be bound.
|
||||
|
||||
The binding happens regardless of some volume matching criteria, including node affinity.
|
||||
The control plane still checks that [storage class](/docs/concepts/storage/storage-classes/), access modes, and requested storage size are valid.
|
||||
The control plane still checks that [storage class](/docs/concepts/storage/storage-classes/),
|
||||
access modes, and requested storage size are valid.
|
||||
|
||||
```yaml
|
||||
apiVersion: v1
|
||||
|
@ -265,7 +340,10 @@ spec:
|
|||
...
|
||||
```
|
||||
|
||||
This method does not guarantee any binding privileges to the PersistentVolume. If other PersistentVolumeClaims could use the PV that you specify, you first need to reserve that storage volume. Specify the relevant PersistentVolumeClaim in the `claimRef` field of the PV so that other PVCs can not bind to it.
|
||||
This method does not guarantee any binding privileges to the PersistentVolume.
|
||||
If other PersistentVolumeClaims could use the PV that you specify, you first
|
||||
need to reserve that storage volume. Specify the relevant PersistentVolumeClaim
|
||||
in the `claimRef` field of the PV so that other PVCs cannot bind to it.
|
||||
|
||||
```yaml
|
||||
apiVersion: v1
|
||||
|
@ -334,8 +412,9 @@ increased and that no resize is necessary.
|
|||
|
||||
{{< feature-state for_k8s_version="v1.24" state="stable" >}}
|
||||
|
||||
Support for expanding CSI volumes is enabled by default but it also requires a specific CSI driver to support volume expansion. Refer to documentation of the specific CSI driver for more information.
|
||||
|
||||
Support for expanding CSI volumes is enabled by default but it also requires a
|
||||
specific CSI driver to support volume expansion. Refer to documentation of the
|
||||
specific CSI driver for more information.
|
||||
|
||||
#### Resizing a volume containing a file system
|
||||
|
||||
|
@ -364,22 +443,33 @@ FlexVolume resize is possible only when the underlying driver supports resize.
|
|||
{{< /note >}}
|
||||
|
||||
{{< note >}}
|
||||
Expanding EBS volumes is a time-consuming operation. Also, there is a per-volume quota of one modification every 6 hours.
|
||||
Expanding EBS volumes is a time-consuming operation.
|
||||
Also, there is a per-volume quota of one modification every 6 hours.
|
||||
{{< /note >}}
|
||||
|
||||
#### Recovering from Failure when Expanding Volumes
|
||||
|
||||
If a user specifies a new size that is too big to be satisfied by underlying storage system, expansion of PVC will be continuously retried until user or cluster administrator takes some action. This can be undesirable and hence Kubernetes provides following methods of recovering from such failures.
|
||||
If a user specifies a new size that is too big to be satisfied by the underlying
|
||||
storage system, expansion of the PVC will be continuously retried until the user or
|
||||
cluster administrator takes some action. This can be undesirable, and hence
|
||||
Kubernetes provides the following methods of recovering from such failures.
|
||||
|
||||
{{< tabs name="recovery_methods" >}}
|
||||
{{% tab name="Manually with Cluster Administrator access" %}}
|
||||
|
||||
If expanding underlying storage fails, the cluster administrator can manually recover the Persistent Volume Claim (PVC) state and cancel the resize requests. Otherwise, the resize requests are continuously retried by the controller without administrator intervention.
|
||||
If expanding underlying storage fails, the cluster administrator can manually
|
||||
recover the Persistent Volume Claim (PVC) state and cancel the resize requests.
|
||||
Otherwise, the resize requests are continuously retried by the controller without
|
||||
administrator intervention.
|
||||
|
||||
1. Mark the PersistentVolume(PV) that is bound to the PersistentVolumeClaim(PVC) with `Retain` reclaim policy.
|
||||
2. Delete the PVC. Since PV has `Retain` reclaim policy - we will not lose any data when we recreate the PVC.
|
||||
3. Delete the `claimRef` entry from PV specs, so as new PVC can bind to it. This should make the PV `Available`.
|
||||
4. Re-create the PVC with smaller size than PV and set `volumeName` field of the PVC to the name of the PV. This should bind new PVC to existing PV.
|
||||
1. Mark the PersistentVolume (PV) that is bound to the PersistentVolumeClaim (PVC)
|
||||
with the `Retain` reclaim policy.
|
||||
2. Delete the PVC. Since the PV has the `Retain` reclaim policy, we will not lose any data
|
||||
when we recreate the PVC.
|
||||
3. Delete the `claimRef` entry from the PV spec, so that a new PVC can bind to it.
|
||||
This should make the PV `Available`.
|
||||
4. Re-create the PVC with a smaller size than the PV and set the `volumeName` field of the
|
||||
PVC to the name of the PV. This should bind the new PVC to the existing PV.
|
||||
5. Don't forget to restore the reclaim policy of the PV.
|
||||
|
||||
{{% /tab %}}
|
||||
|
@ -387,7 +477,11 @@ If expanding underlying storage fails, the cluster administrator can manually re
|
|||
{{% feature-state for_k8s_version="v1.23" state="alpha" %}}
|
||||
|
||||
{{< note >}}
|
||||
Recovery from failing PVC expansion by users is available as an alpha feature since Kubernetes 1.23. The `RecoverVolumeExpansionFailure` feature must be enabled for this feature to work. Refer to the [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) documentation for more information.
|
||||
Recovery from failing PVC expansion by users is available as an alpha feature
|
||||
since Kubernetes 1.23. The `RecoverVolumeExpansionFailure` feature must be
|
||||
enabled for this feature to work. Refer to the
|
||||
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
|
||||
documentation for more information.
|
||||
{{< /note >}}
|
||||
|
||||
If the feature gate `RecoverVolumeExpansionFailure` is
|
||||
|
@ -397,7 +491,8 @@ smaller proposed size, edit `.spec.resources` for that PVC and choose a value th
|
|||
value you previously tried.
|
||||
This is useful if expansion to a higher value did not succeed because of capacity constraint.
|
||||
If that has happened, or you suspect that it might have, you can retry expansion by specifying a
|
||||
size that is within the capacity limits of underlying storage provider. You can monitor status of resize operation by watching `.status.resizeStatus` and events on the PVC.
|
||||
size that is within the capacity limits of underlying storage provider. You can monitor status of
|
||||
resize operation by watching `.status.resizeStatus` and events on the PVC.
|
||||
|
||||
Note that,
|
||||
although you can specify a lower amount of storage than what was requested previously,
|
||||
|
@ -406,7 +501,6 @@ Kubernetes does not support shrinking a PVC to less than its current size.
|
|||
{{% /tab %}}
|
||||
{{% /tabs %}}
|
||||
|
||||
|
||||
## Types of Persistent Volumes
|
||||
|
||||
PersistentVolume types are implemented as plugins. Kubernetes currently supports the following plugins:
|
||||
|
@ -423,7 +517,8 @@ PersistentVolume types are implemented as plugins. Kubernetes currently supports
|
|||
* [`nfs`](/docs/concepts/storage/volumes/#nfs) - Network File System (NFS) storage
|
||||
* [`rbd`](/docs/concepts/storage/volumes/#rbd) - Rados Block Device (RBD) volume
|
||||
|
||||
The following types of PersistentVolume are deprecated. This means that support is still available but will be removed in a future Kubernetes release.
|
||||
The following types of PersistentVolume are deprecated.
|
||||
This means that support is still available but will be removed in a future Kubernetes release.
|
||||
|
||||
* [`awsElasticBlockStore`](/docs/concepts/storage/volumes/#awselasticblockstore) - AWS Elastic Block Store (EBS)
|
||||
(**deprecated** in v1.17)
|
||||
|
@ -483,14 +578,21 @@ spec:
|
|||
```
|
||||
|
||||
{{< note >}}
|
||||
Helper programs relating to the volume type may be required for consumption of a PersistentVolume within a cluster. In this example, the PersistentVolume is of type NFS and the helper program /sbin/mount.nfs is required to support the mounting of NFS filesystems.
|
||||
Helper programs relating to the volume type may be required for consumption of
|
||||
a PersistentVolume within a cluster. In this example, the PersistentVolume is
|
||||
of type NFS and the helper program /sbin/mount.nfs is required to support the
|
||||
mounting of NFS filesystems.
|
||||
{{< /note >}}
|
||||
|
||||
### Capacity
|
||||
|
||||
Generally, a PV will have a specific storage capacity. This is set using the PV's `capacity` attribute. Read the glossary term [Quantity](/docs/reference/glossary/?all=true#term-quantity) to understand the units expected by `capacity`.
|
||||
Generally, a PV will have a specific storage capacity. This is set using the PV's
|
||||
`capacity` attribute. Read the glossary term
|
||||
[Quantity](/docs/reference/glossary/?all=true#term-quantity) to understand the units
|
||||
expected by `capacity`.
|
||||
|
||||
Currently, storage size is the only resource that can be set or requested. Future attributes may include IOPS, throughput, etc.
|
||||
Currently, storage size is the only resource that can be set or requested.
|
||||
Future attributes may include IOPS, throughput, etc.
|
||||
|
||||
### Volume Mode
|
||||
|
||||
|
@ -515,12 +617,18 @@ for an example on how to use a volume with `volumeMode: Block` in a Pod.
|
|||
|
||||
### Access Modes
|
||||
|
||||
A PersistentVolume can be mounted on a host in any way supported by the resource provider. As shown in the table below, providers will have different capabilities and each PV's access modes are set to the specific modes supported by that particular volume. For example, NFS can support multiple read/write clients, but a specific NFS PV might be exported on the server as read-only. Each PV gets its own set of access modes describing that specific PV's capabilities.
|
||||
A PersistentVolume can be mounted on a host in any way supported by the resource
|
||||
provider. As shown in the table below, providers will have different capabilities
|
||||
and each PV's access modes are set to the specific modes supported by that particular
|
||||
volume. For example, NFS can support multiple read/write clients, but a specific
|
||||
NFS PV might be exported on the server as read-only. Each PV gets its own set of
|
||||
access modes describing that specific PV's capabilities.
|
||||
|
||||
The access modes are:
|
||||
|
||||
`ReadWriteOnce`
|
||||
: the volume can be mounted as read-write by a single node. ReadWriteOnce access mode still can allow multiple pods to access the volume when the pods are running on the same node.
|
||||
`ReadWriteOnce`
|
||||
: the volume can be mounted as read-write by a single node. ReadWriteOnce access
|
||||
mode can still allow multiple pods to access the volume when the pods are running on the same node.
|
||||
|
||||
`ReadOnlyMany`
|
||||
: the volume can be mounted as read-only by many nodes.
|
||||
|
@ -529,12 +637,14 @@ The access modes are:
|
|||
: the volume can be mounted as read-write by many nodes.
|
||||
|
||||
`ReadWriteOncePod`
|
||||
: the volume can be mounted as read-write by a single Pod. Use ReadWriteOncePod access mode if you want to ensure that only one pod across whole cluster can read that PVC or write to it. This is only supported for CSI volumes and Kubernetes version 1.22+.
|
||||
: the volume can be mounted as read-write by a single Pod. Use ReadWriteOncePod
|
||||
access mode if you want to ensure that only one pod across the whole cluster can
|
||||
read that PVC or write to it. This is only supported for CSI volumes and
|
||||
Kubernetes version 1.22+.
|
||||
|
||||
|
||||
|
||||
The blog article [Introducing Single Pod Access Mode for PersistentVolumes](/blog/2021/09/13/read-write-once-pod-access-mode-alpha/) covers this in more detail.
|
||||
|
||||
The blog article
|
||||
[Introducing Single Pod Access Mode for PersistentVolumes](/blog/2021/09/13/read-write-once-pod-access-mode-alpha/)
|
||||
covers this in more detail.
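
For instance, a claim that requests this single-Pod access mode could look like the sketch below (names are placeholders, and the backing CSI driver must support the mode):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: single-writer-claim       # placeholder name
spec:
  accessModes:
    - ReadWriteOncePod            # only one Pod in the cluster may use the volume
  resources:
    requests:
      storage: 1Gi
  storageClassName: csi-example   # placeholder CSI-backed StorageClass
```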
|
||||
|
||||
In the CLI, the access modes are abbreviated to:
|
||||
|
||||
|
@ -547,13 +657,15 @@ In the CLI, the access modes are abbreviated to:
|
|||
Kubernetes uses volume access modes to match PersistentVolumeClaims and PersistentVolumes.
|
||||
In some cases, the volume access modes also constrain where the PersistentVolume can be mounted.
|
||||
Volume access modes do **not** enforce write protection once the storage has been mounted.
|
||||
Even if the access modes are specified as ReadWriteOnce, ReadOnlyMany, or ReadWriteMany, they don't set any constraints on the volume.
|
||||
For example, even if a PersistentVolume is created as ReadOnlyMany, it is no guarantee that it will be read-only.
|
||||
If the access modes are specified as ReadWriteOncePod, the volume is constrained and can be mounted on only a single Pod.
|
||||
Even if the access modes are specified as ReadWriteOnce, ReadOnlyMany, or ReadWriteMany,
|
||||
they don't set any constraints on the volume. For example, even if a PersistentVolume is
|
||||
created as ReadOnlyMany, there is no guarantee that it will be read-only. If the access modes
|
||||
are specified as ReadWriteOncePod, the volume is constrained and can be mounted on only a single Pod.
|
||||
{{< /note >}}
|
||||
|
||||
> __Important!__ A volume can only be mounted using one access mode at a time, even if it supports many. For example, a GCEPersistentDisk can be mounted as ReadWriteOnce by a single node or ReadOnlyMany by many nodes, but not at the same time.
|
||||
|
||||
> __Important!__ A volume can only be mounted using one access mode at a time,
|
||||
> even if it supports many. For example, a GCEPersistentDisk can be mounted as
|
||||
> ReadWriteOnce by a single node or ReadOnlyMany by many nodes, but not at the same time.
|
||||
|
||||
| Volume Plugin | ReadWriteOnce | ReadOnlyMany | ReadWriteMany | ReadWriteOncePod |
|
||||
| :--- | :---: | :---: | :---: | - |
|
||||
|
@ -593,13 +705,16 @@ Current reclaim policies are:
|
|||
|
||||
* Retain -- manual reclamation
|
||||
* Recycle -- basic scrub (`rm -rf /thevolume/*`)
|
||||
* Delete -- associated storage asset such as AWS EBS, GCE PD, Azure Disk, or OpenStack Cinder volume is deleted
|
||||
* Delete -- associated storage asset such as AWS EBS, GCE PD, Azure Disk,
|
||||
or OpenStack Cinder volume is deleted
|
||||
|
||||
Currently, only NFS and HostPath support recycling. AWS EBS, GCE PD, Azure Disk, and Cinder volumes support deletion.
|
||||
Currently, only NFS and HostPath support recycling. AWS EBS, GCE PD, Azure Disk,
|
||||
and Cinder volumes support deletion.
|
||||
|
||||
### Mount Options
|
||||
|
||||
A Kubernetes administrator can specify additional mount options for when a Persistent Volume is mounted on a node.
|
||||
A Kubernetes administrator can specify additional mount options for when a
|
||||
Persistent Volume is mounted on a node.
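
For example, an NFS-backed PersistentVolume with extra mount options might be declared like this sketch (the server and path are placeholders):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-nfs-pv            # placeholder name
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany
  mountOptions:                   # passed to the mount command on the node
    - hard
    - nfsvers=4.1
  nfs:
    server: nfs.example.com       # placeholder NFS server
    path: /exports/data           # placeholder export path
```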
|
||||
|
||||
{{< note >}}
|
||||
Not all Persistent Volume types support mount options.
|
||||
|
@ -627,10 +742,19 @@ it will become fully deprecated in a future Kubernetes release.
|
|||
### Node Affinity
|
||||
|
||||
{{< note >}}
|
||||
For most volume types, you do not need to set this field. It is automatically populated for [AWS EBS](/docs/concepts/storage/volumes/#awselasticblockstore), [GCE PD](/docs/concepts/storage/volumes/#gcepersistentdisk) and [Azure Disk](/docs/concepts/storage/volumes/#azuredisk) volume block types. You need to explicitly set this for [local](/docs/concepts/storage/volumes/#local) volumes.
|
||||
For most volume types, you do not need to set this field. It is automatically
|
||||
populated for [AWS EBS](/docs/concepts/storage/volumes/#awselasticblockstore),
|
||||
[GCE PD](/docs/concepts/storage/volumes/#gcepersistentdisk) and
|
||||
[Azure Disk](/docs/concepts/storage/volumes/#azuredisk) volume block types. You
|
||||
need to explicitly set this for [local](/docs/concepts/storage/volumes/#local) volumes.
|
||||
{{< /note >}}
|
||||
|
||||
A PV can specify node affinity to define constraints that limit what nodes this volume can be accessed from. Pods that use a PV will only be scheduled to nodes that are selected by the node affinity. To specify node affinity, set `nodeAffinity` in the `.spec` of a PV. The [PersistentVolume](/docs/reference/kubernetes-api/config-and-storage-resources/persistent-volume-v1/#PersistentVolumeSpec) API reference has more details on this field.
|
||||
A PV can specify node affinity to define constraints that limit what nodes this
|
||||
volume can be accessed from. Pods that use a PV will only be scheduled to nodes
|
||||
that are selected by the node affinity. To specify node affinity, set
|
||||
`nodeAffinity` in the `.spec` of a PV. The
|
||||
[PersistentVolume](/docs/reference/kubernetes-api/config-and-storage-resources/persistent-volume-v1/#PersistentVolumeSpec)
|
||||
API reference has more details on this field.
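
A sketch of a `local` PersistentVolume with node affinity (the node name, path, and StorageClass are placeholders):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: example-local-pv             # placeholder name
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage    # placeholder StorageClass
  local:
    path: /mnt/disks/ssd1            # placeholder path on the node
  nodeAffinity:                      # restricts which nodes can access this volume
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - node-1             # placeholder node name
```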
|
||||
|
||||
### Phase
|
||||
|
||||
|
@ -671,24 +795,35 @@ spec:
|
|||
|
||||
### Access Modes
|
||||
|
||||
Claims use [the same conventions as volumes](#access-modes) when requesting storage with specific access modes.
|
||||
Claims use [the same conventions as volumes](#access-modes) when requesting
|
||||
storage with specific access modes.
|
||||
|
||||
### Volume Modes
|
||||
|
||||
Claims use [the same convention as volumes](#volume-mode) to indicate the consumption of the volume as either a filesystem or block device.
|
||||
Claims use [the same convention as volumes](#volume-mode) to indicate the
|
||||
consumption of the volume as either a filesystem or block device.
|
||||
|
||||
### Resources
|
||||
|
||||
Claims, like Pods, can request specific quantities of a resource. In this case, the request is for storage. The same [resource model](https://git.k8s.io/design-proposals-archive/scheduling/resources.md) applies to both volumes and claims.
|
||||
Claims, like Pods, can request specific quantities of a resource. In this case,
|
||||
the request is for storage. The same
|
||||
[resource model](https://git.k8s.io/design-proposals-archive/scheduling/resources.md)
|
||||
applies to both volumes and claims.
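Taken together, the access mode, volume mode, and resource request all appear in the claim spec. A minimal sketch with placeholder values:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-claim      # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Filesystem   # or Block for a raw block device
  resources:
    requests:
      storage: 8Gi
```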
|
||||
|
||||
### Selector
|
||||
|
||||
Claims can specify a [label selector](/docs/concepts/overview/working-with-objects/labels/#label-selectors) to further filter the set of volumes. Only the volumes whose labels match the selector can be bound to the claim. The selector can consist of two fields:
|
||||
Claims can specify a
|
||||
[label selector](/docs/concepts/overview/working-with-objects/labels/#label-selectors)
|
||||
to further filter the set of volumes. Only the volumes whose labels match the selector
|
||||
can be bound to the claim. The selector can consist of two fields:
|
||||
|
||||
* `matchLabels` - the volume must have a label with this value
|
||||
* `matchExpressions` - a list of requirements made by specifying key, list of values, and operator that relates the key and values. Valid operators include In, NotIn, Exists, and DoesNotExist.
|
||||
* `matchExpressions` - a list of requirements made by specifying key, list of values,
|
||||
and operator that relates the key and values. Valid operators include In, NotIn,
|
||||
Exists, and DoesNotExist.
|
||||
|
||||
All of the requirements, from both `matchLabels` and `matchExpressions`, are ANDed together – they must all be satisfied in order to match.
|
||||
All of the requirements, from both `matchLabels` and `matchExpressions`, are
|
||||
ANDed together – they must all be satisfied in order to match.
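A claim that uses both selector fields might look like this sketch (the label keys and values are illustrative):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: selective-claim    # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 8Gi
  selector:
    matchLabels:
      release: stable      # PV must carry this label
    matchExpressions:
      - key: environment
        operator: In
        values:
          - production
```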
|
||||
|
||||
### Class
|
||||
|
||||
|
@ -738,22 +873,38 @@ In the past, the annotation `volume.beta.kubernetes.io/storage-class` was used i
|
|||
of `storageClassName` attribute. This annotation is still working; however,
|
||||
it won't be supported in a future Kubernetes release.
|
||||
|
||||
|
||||
#### Retroactive default StorageClass assignment
|
||||
|
||||
{{< feature-state for_k8s_version="v1.26" state="beta" >}}
|
||||
|
||||
You can create a PersistentVolumeClaim without specifying a `storageClassName` for the new PVC, and you can do so even when no default StorageClass exists in your cluster. In this case, the new PVC creates as you defined it, and the `storageClassName` of that PVC remains unset until default becomes available.
|
||||
You can create a PersistentVolumeClaim without specifying a `storageClassName`
|
||||
for the new PVC, and you can do so even when no default StorageClass exists
|
||||
in your cluster. In this case, the new PVC is created as you defined it, and the
|
||||
`storageClassName` of that PVC remains unset until a default becomes available.
|
||||
|
||||
When a default StorageClass becomes available, the control plane identifies any existing PVCs without `storageClassName`. For the PVCs that either have an empty value for `storageClassName` or do not have this key, the control plane then updates those PVCs to set `storageClassName` to match the new default StorageClass. If you have an existing PVC where the `storageClassName` is `""`, and you configure a default StorageClass, then this PVC will not get updated.
|
||||
When a default StorageClass becomes available, the control plane identifies any
|
||||
existing PVCs without `storageClassName`. For the PVCs that either have an empty
|
||||
value for `storageClassName` or do not have this key, the control plane then
|
||||
updates those PVCs to set `storageClassName` to match the new default StorageClass.
|
||||
If you have an existing PVC where the `storageClassName` is `""`, and you configure
|
||||
a default StorageClass, then this PVC will not get updated.
|
||||
|
||||
In order to keep binding to PVs with `storageClassName` set to `""` (while a default StorageClass is present), you need to set the `storageClassName` of the associated PVC to `""`.
|
||||
In order to keep binding to PVs with `storageClassName` set to `""`
|
||||
(while a default StorageClass is present), you need to set the `storageClassName`
|
||||
of the associated PVC to `""`.
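A minimal sketch of such a claim, which opts out of the default StorageClass by setting the field to an empty string:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: claim-without-class    # hypothetical name
spec:
  storageClassName: ""         # bind only to PVs whose storageClassName is ""
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
```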
|
||||
|
||||
This behavior helps administrators change default StorageClass by removing the old one first and then creating or setting another one. This brief window while there is no default causes PVCs without `storageClassName` created at that time to not have any default, but due to the retroactive default StorageClass assignment this way of changing defaults is safe.
|
||||
This behavior helps administrators change default StorageClass by removing the
|
||||
old one first and then creating or setting another one. This brief window while
|
||||
there is no default causes PVCs without `storageClassName` created at that time
|
||||
to not have any default, but due to the retroactive default StorageClass
|
||||
assignment this way of changing defaults is safe.
|
||||
|
||||
## Claims As Volumes
|
||||
|
||||
Pods access storage by using the claim as a volume. Claims must exist in the same namespace as the Pod using the claim. The cluster finds the claim in the Pod's namespace and uses it to get the PersistentVolume backing the claim. The volume is then mounted to the host and into the Pod.
|
||||
Pods access storage by using the claim as a volume. Claims must exist in the
|
||||
same namespace as the Pod using the claim. The cluster finds the claim in the
|
||||
Pod's namespace and uses it to get the PersistentVolume backing the claim.
|
||||
The volume is then mounted to the host and into the Pod.
|
||||
|
||||
```yaml
|
||||
apiVersion: v1
|
||||
|
@ -775,12 +926,15 @@ spec:
|
|||
|
||||
### A Note on Namespaces
|
||||
|
||||
PersistentVolumes binds are exclusive, and since PersistentVolumeClaims are namespaced objects, mounting claims with "Many" modes (`ROX`, `RWX`) is only possible within one namespace.
|
||||
PersistentVolume bindings are exclusive, and since PersistentVolumeClaims are
|
||||
namespaced objects, mounting claims with "Many" modes (`ROX`, `RWX`) is only
|
||||
possible within one namespace.
|
||||
|
||||
### PersistentVolumes typed `hostPath`
|
||||
|
||||
A `hostPath` PersistentVolume uses a file or directory on the Node to emulate network-attached storage.
|
||||
See [an example of `hostPath` typed volume](/docs/tasks/configure-pod-container/configure-persistent-volume-storage/#create-a-persistentvolume).
|
||||
A `hostPath` PersistentVolume uses a file or directory on the Node to emulate
|
||||
network-attached storage. See
|
||||
[an example of `hostPath` typed volume](/docs/tasks/configure-pod-container/configure-persistent-volume-storage/#create-a-persistentvolume).
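A minimal sketch of such a PersistentVolume (the path and size are placeholders; `hostPath` is intended for single-node testing rather than production use):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: hostpath-pv-example   # hypothetical name
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: /mnt/data           # directory on the node
```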
|
||||
|
||||
## Raw Block Volume Support
|
||||
|
||||
|
@ -819,6 +973,7 @@ spec:
|
|||
lun: 0
|
||||
readOnly: false
|
||||
```
|
||||
|
||||
### PersistentVolumeClaim requesting a Raw Block Volume {#persistent-volume-claim-requesting-a-raw-block-volume}
|
||||
|
||||
```yaml
|
||||
|
@ -858,14 +1013,18 @@ spec:
|
|||
```
|
||||
|
||||
{{< note >}}
|
||||
When adding a raw block device for a Pod, you specify the device path in the container instead of a mount path.
|
||||
When adding a raw block device for a Pod, you specify the device path in the
|
||||
container instead of a mount path.
|
||||
{{< /note >}}
|
||||
|
||||
### Binding Block Volumes
|
||||
|
||||
If a user requests a raw block volume by indicating this using the `volumeMode` field in the PersistentVolumeClaim spec, the binding rules differ slightly from previous releases that didn't consider this mode as part of the spec.
|
||||
Listed is a table of possible combinations the user and admin might specify for requesting a raw block device. The table indicates if the volume will be bound or not given the combinations:
|
||||
Volume binding matrix for statically provisioned volumes:
|
||||
If a user requests a raw block volume by indicating this using the `volumeMode`
|
||||
field in the PersistentVolumeClaim spec, the binding rules differ slightly from
|
||||
previous releases that didn't consider this mode as part of the spec.
|
||||
The following table lists the possible combinations the user and admin might specify
|
||||
when requesting a raw block device, and indicates whether the volume will be bound
|
||||
for each combination. Volume binding matrix for statically provisioned volumes:
|
||||
|
||||
| PV volumeMode | PVC volumeMode | Result |
|
||||
| --------------|:---------------:| ----------------:|
|
||||
|
@ -880,15 +1039,19 @@ Volume binding matrix for statically provisioned volumes:
|
|||
| Filesystem | unspecified | BIND |
|
||||
|
||||
{{< note >}}
|
||||
Only statically provisioned volumes are supported for alpha release. Administrators should take care to consider these values when working with raw block devices.
|
||||
Only statically provisioned volumes are supported for the alpha release. Administrators
|
||||
should take care to consider these values when working with raw block devices.
|
||||
{{< /note >}}
|
||||
|
||||
## Volume Snapshot and Restore Volume from Snapshot Support
|
||||
|
||||
{{< feature-state for_k8s_version="v1.20" state="stable" >}}
|
||||
|
||||
Volume snapshots only support the out-of-tree CSI volume plugins. For details, see [Volume Snapshots](/docs/concepts/storage/volume-snapshots/).
|
||||
In-tree volume plugins are deprecated. You can read about the deprecated volume plugins in the [Volume Plugin FAQ](https://github.com/kubernetes/community/blob/master/sig-storage/volume-plugin-faq.md).
|
||||
Volume snapshots only support the out-of-tree CSI volume plugins.
|
||||
For details, see [Volume Snapshots](/docs/concepts/storage/volume-snapshots/).
|
||||
In-tree volume plugins are deprecated. You can read about the deprecated volume
|
||||
plugins in the
|
||||
[Volume Plugin FAQ](https://github.com/kubernetes/community/blob/master/sig-storage/volume-plugin-faq.md).
|
||||
|
||||
### Create a PersistentVolumeClaim from a Volume Snapshot {#create-persistent-volume-claim-from-volume-snapshot}
|
||||
|
||||
|
@ -912,7 +1075,8 @@ spec:
|
|||
|
||||
## Volume Cloning
|
||||
|
||||
[Volume Cloning](/docs/concepts/storage/volume-pvc-datasource/) only available for CSI volume plugins.
|
||||
[Volume Cloning](/docs/concepts/storage/volume-pvc-datasource/)
|
||||
is only available for CSI volume plugins.
|
||||
|
||||
### Create PersistentVolumeClaim from an existing PVC {#create-persistent-volume-claim-from-an-existing-pvc}
|
||||
|
||||
|
@ -949,27 +1113,32 @@ same namespace, except for core objects other than PVCs. For clusters that have
|
|||
gate enabled, use of the `dataSourceRef` is preferred over `dataSource`.
|
||||
|
||||
## Cross namespace data sources
|
||||
|
||||
{{< feature-state for_k8s_version="v1.26" state="alpha" >}}
|
||||
|
||||
Kubernetes supports cross namespace volume data sources.
|
||||
To use cross namespace volume data sources, you must enable the `AnyVolumeDataSource` and `CrossNamespaceVolumeDataSource`
|
||||
To use cross namespace volume data sources, you must enable the `AnyVolumeDataSource`
|
||||
and `CrossNamespaceVolumeDataSource`
|
||||
[feature gates](/docs/reference/command-line-tools-reference/feature-gates/) for
|
||||
the kube-apiserver and the kube-controller-manager.
|
||||
Also, you must enable the `CrossNamespaceVolumeDataSource` feature gate for the csi-provisioner.
|
||||
|
||||
Enabling the `CrossNamespaceVolumeDataSource` feature gate allow you to specify a namespace in the dataSourceRef field.
|
||||
Enabling the `CrossNamespaceVolumeDataSource` feature gate allows you to specify
|
||||
a namespace in the dataSourceRef field.
|
||||
|
||||
{{< note >}}
|
||||
When you specify a namespace for a volume data source, Kubernetes checks for a
|
||||
ReferenceGrant in the other namespace before accepting the reference.
|
||||
ReferenceGrant is part of the `gateway.networking.k8s.io` extension APIs.
|
||||
See [ReferenceGrant](https://gateway-api.sigs.k8s.io/api-types/referencegrant/) in the Gateway API documentation for details.
|
||||
See [ReferenceGrant](https://gateway-api.sigs.k8s.io/api-types/referencegrant/)
|
||||
in the Gateway API documentation for details.
|
||||
This means that you must extend your Kubernetes cluster with at least ReferenceGrant from the
|
||||
Gateway API before you can use this mechanism.
|
||||
{{< /note >}}
|
||||
|
||||
## Data source references
|
||||
|
||||
The `dataSourceRef` field behaves almost the same as the `dataSource` field. If either one is
|
||||
The `dataSourceRef` field behaves almost the same as the `dataSource` field. If one is
|
||||
specified while the other is not, the API server will give both fields the same value. Neither
|
||||
field can be changed after creation, and attempting to specify different values for the two
|
||||
fields will result in a validation error. Therefore the two fields will always have the same
|
||||
|
@ -986,7 +1155,8 @@ users should be aware of:
|
|||
|
||||
When the `CrossNamespaceVolumeDataSource` feature is enabled, there are additional differences:
|
||||
|
||||
* The `dataSource` field only allows local objects, while the `dataSourceRef` field allows objects in any namespaces.
|
||||
* The `dataSource` field only allows local objects, while the `dataSourceRef` field allows
|
||||
objects in any namespaces.
|
||||
* When namespace is specified, `dataSource` and `dataSourceRef` are not synced.
|
||||
|
||||
Users should always use `dataSourceRef` on clusters that have the feature gate enabled, and
|
||||
|
@ -1030,10 +1200,13 @@ responsibility of that populator controller to report Events that relate to volu
|
|||
the process.
|
||||
|
||||
### Using a cross-namespace volume data source
|
||||
|
||||
{{< feature-state for_k8s_version="v1.26" state="alpha" >}}
|
||||
|
||||
Create a ReferenceGrant to allow the namespace owner to accept the reference.
|
||||
You define a populated volume by specifying a cross namespace volume data source using the `dataSourceRef` field. You must already have a valid ReferenceGrant in the source namespace:
|
||||
You define a populated volume by specifying a cross namespace volume data source
|
||||
using the `dataSourceRef` field. You must already have a valid ReferenceGrant
|
||||
in the source namespace:
|
||||
|
||||
```yaml
|
||||
apiVersion: gateway.networking.k8s.io/v1beta1
|
||||
|
|
|
@ -62,22 +62,22 @@ volumeBindingMode: Immediate
|
|||
Each StorageClass has a provisioner that determines what volume plugin is used
|
||||
for provisioning PVs. This field must be specified.
|
||||
|
||||
| Volume Plugin | Internal Provisioner| Config Example |
|
||||
| :--- | :---: | :---: |
|
||||
| AWSElasticBlockStore | ✓ | [AWS EBS](#aws-ebs) |
|
||||
| AzureFile | ✓ | [Azure File](#azure-file) |
|
||||
| AzureDisk | ✓ | [Azure Disk](#azure-disk) |
|
||||
| CephFS | - | - |
|
||||
| Cinder | ✓ | [OpenStack Cinder](#openstack-cinder)|
|
||||
| FC | - | - |
|
||||
| FlexVolume | - | - |
|
||||
| GCEPersistentDisk | ✓ | [GCE PD](#gce-pd) |
|
||||
| iSCSI | - | - |
|
||||
| NFS | - | [NFS](#nfs) |
|
||||
| RBD | ✓ | [Ceph RBD](#ceph-rbd) |
|
||||
| VsphereVolume | ✓ | [vSphere](#vsphere) |
|
||||
| PortworxVolume | ✓ | [Portworx Volume](#portworx-volume) |
|
||||
| Local | - | [Local](#local) |
|
||||
| Volume Plugin | Internal Provisioner | Config Example |
|
||||
| :------------------- | :------------------: | :-----------------------------------: |
|
||||
| AWSElasticBlockStore | ✓ | [AWS EBS](#aws-ebs) |
|
||||
| AzureFile | ✓ | [Azure File](#azure-file) |
|
||||
| AzureDisk | ✓ | [Azure Disk](#azure-disk) |
|
||||
| CephFS | - | - |
|
||||
| Cinder | ✓ | [OpenStack Cinder](#openstack-cinder) |
|
||||
| FC | - | - |
|
||||
| FlexVolume | - | - |
|
||||
| GCEPersistentDisk | ✓ | [GCE PD](#gce-pd) |
|
||||
| iSCSI | - | - |
|
||||
| NFS | - | [NFS](#nfs) |
|
||||
| RBD | ✓ | [Ceph RBD](#ceph-rbd) |
|
||||
| VsphereVolume | ✓ | [vSphere](#vsphere) |
|
||||
| PortworxVolume | ✓ | [Portworx Volume](#portworx-volume) |
|
||||
| Local | - | [Local](#local) |
|
||||
|
||||
You are not restricted to specifying the "internal" provisioners
|
||||
listed here (whose names are prefixed with "kubernetes.io" and shipped
|
||||
|
@ -109,29 +109,28 @@ whatever reclaim policy they were assigned at creation.
|
|||
|
||||
{{< feature-state for_k8s_version="v1.11" state="beta" >}}
|
||||
|
||||
PersistentVolumes can be configured to be expandable. This feature when set to `true`,
|
||||
allows the users to resize the volume by editing the corresponding PVC object.
|
||||
PersistentVolumes can be configured to be expandable. When this feature is set to `true`,
|
||||
it allows users to resize the volume by editing the corresponding PVC object.
|
||||
|
||||
The following types of volumes support volume expansion, when the underlying
|
||||
StorageClass has the field `allowVolumeExpansion` set to true.
|
||||
|
||||
{{< table caption = "Table of Volume types and the version of Kubernetes they require" >}}
|
||||
|
||||
Volume type | Required Kubernetes version
|
||||
:---------- | :--------------------------
|
||||
gcePersistentDisk | 1.11
|
||||
awsElasticBlockStore | 1.11
|
||||
Cinder | 1.11
|
||||
rbd | 1.11
|
||||
Azure File | 1.11
|
||||
Azure Disk | 1.11
|
||||
Portworx | 1.11
|
||||
FlexVolume | 1.13
|
||||
CSI | 1.14 (alpha), 1.16 (beta)
|
||||
| Volume type | Required Kubernetes version |
|
||||
| :------------------- | :-------------------------- |
|
||||
| gcePersistentDisk | 1.11 |
|
||||
| awsElasticBlockStore | 1.11 |
|
||||
| Cinder | 1.11 |
|
||||
| rbd | 1.11 |
|
||||
| Azure File | 1.11 |
|
||||
| Azure Disk | 1.11 |
|
||||
| Portworx | 1.11 |
|
||||
| FlexVolume | 1.13 |
|
||||
| CSI | 1.14 (alpha), 1.16 (beta) |
|
||||
|
||||
{{< /table >}}
|
||||
|
||||
|
||||
{{< note >}}
|
||||
You can only use the volume expansion feature to grow a Volume, not to shrink it.
|
||||
{{< /note >}}
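A sketch of a StorageClass that permits expansion; the provisioner and parameters below are illustrative:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: resizable                 # hypothetical name
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-standard
allowVolumeExpansion: true        # PVCs using this class may be resized
```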
|
||||
|
@ -168,14 +167,14 @@ and [taints and tolerations](/docs/concepts/scheduling-eviction/taint-and-tolera
|
|||
|
||||
The following plugins support `WaitForFirstConsumer` with dynamic provisioning:
|
||||
|
||||
* [AWSElasticBlockStore](#aws-ebs)
|
||||
* [GCEPersistentDisk](#gce-pd)
|
||||
* [AzureDisk](#azure-disk)
|
||||
- [AWSElasticBlockStore](#aws-ebs)
|
||||
- [GCEPersistentDisk](#gce-pd)
|
||||
- [AzureDisk](#azure-disk)
|
||||
|
||||
The following plugins support `WaitForFirstConsumer` with pre-created PersistentVolume binding:
|
||||
|
||||
* All of the above
|
||||
* [Local](#local)
|
||||
- All of the above
|
||||
- [Local](#local)
|
||||
|
||||
{{< feature-state state="stable" for_k8s_version="v1.17" >}}
|
||||
[CSI volumes](/docs/concepts/storage/volumes/#csi) are also supported with dynamic provisioning
|
||||
|
@ -183,10 +182,10 @@ and pre-created PVs, but you'll need to look at the documentation for a specific
|
|||
to see its supported topology keys and examples.
|
||||
|
||||
{{< note >}}
|
||||
If you choose to use `WaitForFirstConsumer`, do not use `nodeName` in the Pod spec
|
||||
to specify node affinity. If `nodeName` is used in this case, the scheduler will be bypassed and PVC will remain in `pending` state.
|
||||
If you choose to use `WaitForFirstConsumer`, do not use `nodeName` in the Pod spec
|
||||
to specify node affinity. If `nodeName` is used in this case, the scheduler will be bypassed and the PVC will remain in a `Pending` state.
|
||||
|
||||
Instead, you can use node selector for hostname in this case as shown below.
|
||||
Instead, you can use node selector for hostname in this case as shown below.
|
||||
{{< /note >}}
|
||||
|
||||
```yaml
|
||||
|
@ -243,7 +242,7 @@ allowedTopologies:
|
|||
|
||||
Storage Classes have parameters that describe volumes belonging to the storage
|
||||
class. Different parameters may be accepted depending on the `provisioner`. For
|
||||
example, the value `io1`, for the parameter `type`, and the parameter
|
||||
example, the value `io1`, for the parameter `type`, and the parameter
|
||||
`iopsPerGB` are specific to EBS. When a parameter is omitted, some default is
|
||||
used.
|
||||
|
||||
|
@ -265,26 +264,26 @@ parameters:
|
|||
fsType: ext4
|
||||
```
|
||||
|
||||
* `type`: `io1`, `gp2`, `sc1`, `st1`. See
|
||||
- `type`: `io1`, `gp2`, `sc1`, `st1`. See
|
||||
[AWS docs](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSVolumeTypes.html)
|
||||
for details. Default: `gp2`.
|
||||
* `zone` (Deprecated): AWS zone. If neither `zone` nor `zones` is specified, volumes are
|
||||
- `zone` (Deprecated): AWS zone. If neither `zone` nor `zones` is specified, volumes are
|
||||
generally round-robin-ed across all active zones where Kubernetes cluster
|
||||
has a node. `zone` and `zones` parameters must not be used at the same time.
|
||||
* `zones` (Deprecated): A comma separated list of AWS zone(s). If neither `zone` nor `zones`
|
||||
- `zones` (Deprecated): A comma separated list of AWS zone(s). If neither `zone` nor `zones`
|
||||
is specified, volumes are generally round-robin-ed across all active zones
|
||||
where Kubernetes cluster has a node. `zone` and `zones` parameters must not
|
||||
be used at the same time.
|
||||
* `iopsPerGB`: only for `io1` volumes. I/O operations per second per GiB. AWS
|
||||
- `iopsPerGB`: only for `io1` volumes. I/O operations per second per GiB. AWS
|
||||
volume plugin multiplies this with size of requested volume to compute IOPS
|
||||
of the volume and caps it at 20 000 IOPS (maximum supported by AWS, see
|
||||
[AWS docs](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSVolumeTypes.html)).
|
||||
A string is expected here, i.e. `"10"`, not `10`.
|
||||
* `fsType`: fsType that is supported by kubernetes. Default: `"ext4"`.
|
||||
* `encrypted`: denotes whether the EBS volume should be encrypted or not.
|
||||
- `fsType`: fsType that is supported by kubernetes. Default: `"ext4"`.
|
||||
- `encrypted`: denotes whether the EBS volume should be encrypted or not.
|
||||
Valid values are `"true"` or `"false"`. A string is expected here,
|
||||
i.e. `"true"`, not `true`.
|
||||
* `kmsKeyId`: optional. The full Amazon Resource Name of the key to use when
|
||||
- `kmsKeyId`: optional. The full Amazon Resource Name of the key to use when
|
||||
encrypting the volume. If none is supplied but `encrypted` is true, a key is
|
||||
generated by AWS. See AWS docs for valid ARN value.
|
||||
|
||||
|
@ -307,17 +306,17 @@ parameters:
|
|||
replication-type: none
|
||||
```
|
||||
|
||||
* `type`: `pd-standard` or `pd-ssd`. Default: `pd-standard`
|
||||
* `zone` (Deprecated): GCE zone. If neither `zone` nor `zones` is specified, volumes are
|
||||
- `type`: `pd-standard` or `pd-ssd`. Default: `pd-standard`
|
||||
- `zone` (Deprecated): GCE zone. If neither `zone` nor `zones` is specified, volumes are
|
||||
generally round-robin-ed across all active zones where Kubernetes cluster has
|
||||
a node. `zone` and `zones` parameters must not be used at the same time.
|
||||
* `zones` (Deprecated): A comma separated list of GCE zone(s). If neither `zone` nor `zones`
|
||||
- `zones` (Deprecated): A comma separated list of GCE zone(s). If neither `zone` nor `zones`
|
||||
is specified, volumes are generally round-robin-ed across all active zones
|
||||
where Kubernetes cluster has a node. `zone` and `zones` parameters must not
|
||||
be used at the same time.
|
||||
* `fstype`: `ext4` or `xfs`. Default: `ext4`. The defined filesystem type must be supported by the host operating system.
|
||||
- `fstype`: `ext4` or `xfs`. Default: `ext4`. The defined filesystem type must be supported by the host operating system.
|
||||
|
||||
* `replication-type`: `none` or `regional-pd`. Default: `none`.
|
||||
- `replication-type`: `none` or `regional-pd`. Default: `none`.
|
||||
|
||||
If `replication-type` is set to `none`, a regular (zonal) PD will be provisioned.
|
||||
|
||||
|
@ -350,14 +349,15 @@ parameters:
|
|||
readOnly: "false"
|
||||
```
|
||||
|
||||
* `server`: Server is the hostname or IP address of the NFS server.
|
||||
* `path`: Path that is exported by the NFS server.
|
||||
* `readOnly`: A flag indicating whether the storage will be mounted as read only (default false).
|
||||
- `server`: Server is the hostname or IP address of the NFS server.
|
||||
- `path`: Path that is exported by the NFS server.
|
||||
- `readOnly`: A flag indicating whether the storage will be mounted as read only (default false).
|
||||
|
||||
Kubernetes doesn't include an internal NFS provisioner. You need to use an external provisioner to create a StorageClass for NFS.
|
||||
Here are some examples:
|
||||
* [NFS Ganesha server and external provisioner](https://github.com/kubernetes-sigs/nfs-ganesha-server-and-external-provisioner)
|
||||
* [NFS subdir external provisioner](https://github.com/kubernetes-sigs/nfs-subdir-external-provisioner)
|
||||
|
||||
- [NFS Ganesha server and external provisioner](https://github.com/kubernetes-sigs/nfs-ganesha-server-and-external-provisioner)
|
||||
- [NFS subdir external provisioner](https://github.com/kubernetes-sigs/nfs-subdir-external-provisioner)
|
||||
|
||||
### OpenStack Cinder
|
||||
|
||||
|
@ -371,7 +371,7 @@ parameters:
|
|||
availability: nova
|
||||
```
|
||||
|
||||
* `availability`: Availability Zone. If not specified, volumes are generally
|
||||
- `availability`: Availability Zone. If not specified, volumes are generally
|
||||
round-robin-ed across all active zones where Kubernetes cluster has a node.
|
||||
|
||||
{{< note >}}
|
||||
|
@ -381,7 +381,7 @@ This internal provisioner of OpenStack is deprecated. Please use [the external c
|
|||
|
||||
### vSphere
|
||||
|
||||
There are two types of provisioners for vSphere storage classes:
|
||||
There are two types of provisioners for vSphere storage classes:
|
||||
|
||||
- [CSI provisioner](#vsphere-provisioner-csi): `csi.vsphere.vmware.com`
|
||||
- [vCP provisioner](#vcp-provisioner): `kubernetes.io/vsphere-volume`
|
||||
|
@ -392,73 +392,73 @@ In-tree provisioners are [deprecated](/blog/2019/12/09/kubernetes-1-17-feature-c
|
|||
|
||||
The vSphere CSI StorageClass provisioner works with Tanzu Kubernetes clusters. For an example, refer to the [vSphere CSI repository](https://github.com/kubernetes-sigs/vsphere-csi-driver/blob/master/example/vanilla-k8s-RWM-filesystem-volumes/example-sc.yaml).
|
||||
|
||||
#### vCP Provisioner
|
||||
#### vCP Provisioner
|
||||
|
||||
The following examples use the VMware Cloud Provider (vCP) StorageClass provisioner.
|
||||
The following examples use the VMware Cloud Provider (vCP) StorageClass provisioner.
|
||||
|
||||
1. Create a StorageClass with a user specified disk format.
|
||||
|
||||
```yaml
|
||||
apiVersion: storage.k8s.io/v1
|
||||
kind: StorageClass
|
||||
metadata:
|
||||
name: fast
|
||||
provisioner: kubernetes.io/vsphere-volume
|
||||
parameters:
|
||||
diskformat: zeroedthick
|
||||
```
|
||||
```yaml
|
||||
apiVersion: storage.k8s.io/v1
|
||||
kind: StorageClass
|
||||
metadata:
|
||||
name: fast
|
||||
provisioner: kubernetes.io/vsphere-volume
|
||||
parameters:
|
||||
diskformat: zeroedthick
|
||||
```
|
||||
|
||||
`diskformat`: `thin`, `zeroedthick` and `eagerzeroedthick`. Default: `"thin"`.
|
||||
`diskformat`: `thin`, `zeroedthick` and `eagerzeroedthick`. Default: `"thin"`.
|
||||
|
||||
2. Create a StorageClass with a disk format on a user specified datastore.
|
||||
|
||||
```yaml
|
||||
apiVersion: storage.k8s.io/v1
|
||||
kind: StorageClass
|
||||
metadata:
|
||||
name: fast
|
||||
provisioner: kubernetes.io/vsphere-volume
|
||||
parameters:
|
||||
diskformat: zeroedthick
|
||||
datastore: VSANDatastore
|
||||
```
|
||||
```yaml
|
||||
apiVersion: storage.k8s.io/v1
|
||||
kind: StorageClass
|
||||
metadata:
|
||||
name: fast
|
||||
provisioner: kubernetes.io/vsphere-volume
|
||||
parameters:
|
||||
diskformat: zeroedthick
|
||||
datastore: VSANDatastore
|
||||
```
|
||||
|
||||
`datastore`: The user can also specify the datastore in the StorageClass.
|
||||
The volume will be created on the datastore specified in the StorageClass,
|
||||
which in this case is `VSANDatastore`. This field is optional. If the
|
||||
datastore is not specified, then the volume will be created on the datastore
|
||||
specified in the vSphere config file used to initialize the vSphere Cloud
|
||||
Provider.
|
||||
`datastore`: The user can also specify the datastore in the StorageClass.
|
||||
The volume will be created on the datastore specified in the StorageClass,
|
||||
which in this case is `VSANDatastore`. This field is optional. If the
|
||||
datastore is not specified, then the volume will be created on the datastore
|
||||
specified in the vSphere config file used to initialize the vSphere Cloud
|
||||
Provider.
|
||||
|
||||
3. Storage Policy Management inside kubernetes
|
||||
|
||||
* Using existing vCenter SPBM policy
|
||||
- Using existing vCenter SPBM policy
|
||||
|
||||
One of the most important features of vSphere for Storage Management is
|
||||
policy based Management. Storage Policy Based Management (SPBM) is a
|
||||
storage policy framework that provides a single unified control plane
|
||||
across a broad range of data services and storage solutions. SPBM enables
|
||||
vSphere administrators to overcome upfront storage provisioning challenges,
|
||||
such as capacity planning, differentiated service levels and managing
|
||||
capacity headroom.
|
||||
One of the most important features of vSphere for Storage Management is
|
||||
policy-based management. Storage Policy Based Management (SPBM) is a
|
||||
storage policy framework that provides a single unified control plane
|
||||
across a broad range of data services and storage solutions. SPBM enables
|
||||
vSphere administrators to overcome upfront storage provisioning challenges,
|
||||
such as capacity planning, differentiated service levels and managing
|
||||
capacity headroom.
|
||||
|
||||
The SPBM policies can be specified in the StorageClass using the
|
||||
`storagePolicyName` parameter.
|
||||
The SPBM policies can be specified in the StorageClass using the
|
||||
`storagePolicyName` parameter.
|
||||
|
||||
* Virtual SAN policy support inside Kubernetes
|
||||
- Virtual SAN policy support inside Kubernetes
|
||||
|
||||
Vsphere Infrastructure (VI) Admins will have the ability to specify custom
|
||||
Virtual SAN Storage Capabilities during dynamic volume provisioning. You
|
||||
can now define storage requirements, such as performance and availability,
|
||||
in the form of storage capabilities during dynamic volume provisioning.
|
||||
The storage capability requirements are converted into a Virtual SAN
|
||||
policy which are then pushed down to the Virtual SAN layer when a
|
||||
persistent volume (virtual disk) is being created. The virtual disk is
|
||||
distributed across the Virtual SAN datastore to meet the requirements.
|
||||
vSphere Infrastructure (VI) admins can specify custom
|
||||
Virtual SAN Storage Capabilities during dynamic volume provisioning. You
|
||||
can now define storage requirements, such as performance and availability,
|
||||
in the form of storage capabilities during dynamic volume provisioning.
|
||||
The storage capability requirements are converted into a Virtual SAN
|
||||
policy which are then pushed down to the Virtual SAN layer when a
|
||||
persistent volume (virtual disk) is being created. The virtual disk is
|
||||
distributed across the Virtual SAN datastore to meet the requirements.
|
||||
|
||||
You can see [Storage Policy Based Management for dynamic provisioning of volumes](https://github.com/vmware-archive/vsphere-storage-for-kubernetes/blob/fa4c8b8ad46a85b6555d715dd9d27ff69839df53/documentation/policy-based-mgmt.md)
|
||||
for more details on how to use storage policies for persistent volumes
|
||||
management.
|
||||
You can see [Storage Policy Based Management for dynamic provisioning of volumes](https://github.com/vmware-archive/vsphere-storage-for-kubernetes/blob/fa4c8b8ad46a85b6555d715dd9d27ff69839df53/documentation/policy-based-mgmt.md)
|
||||
for more details on how to use storage policies for persistent volumes
|
||||
management.
|
||||
|
||||
There are few
|
||||
[vSphere examples](https://github.com/kubernetes/examples/tree/master/staging/volumes/vsphere)
|
||||
|
@ -486,29 +486,30 @@ parameters:
|
|||
imageFeatures: "layering"
|
||||
```
|
||||
|
||||
* `monitors`: Ceph monitors, comma delimited. This parameter is required.
|
||||
* `adminId`: Ceph client ID that is capable of creating images in the pool.
|
||||
- `monitors`: Ceph monitors, comma delimited. This parameter is required.
|
||||
- `adminId`: Ceph client ID that is capable of creating images in the pool.
|
||||
Default is "admin".
|
||||
* `adminSecretName`: Secret Name for `adminId`. This parameter is required.
|
||||
- `adminSecretName`: Secret Name for `adminId`. This parameter is required.
|
||||
The provided secret must have type "kubernetes.io/rbd".
|
||||
* `adminSecretNamespace`: The namespace for `adminSecretName`. Default is "default".
|
||||
* `pool`: Ceph RBD pool. Default is "rbd".
|
||||
* `userId`: Ceph client ID that is used to map the RBD image. Default is the
|
||||
- `adminSecretNamespace`: The namespace for `adminSecretName`. Default is "default".
|
||||
- `pool`: Ceph RBD pool. Default is "rbd".
|
||||
- `userId`: Ceph client ID that is used to map the RBD image. Default is the
|
||||
same as `adminId`.
|
||||
* `userSecretName`: The name of Ceph Secret for `userId` to map RBD image. It
|
||||
- `userSecretName`: The name of Ceph Secret for `userId` to map RBD image. It
|
||||
must exist in the same namespace as PVCs. This parameter is required.
|
||||
The provided secret must have type "kubernetes.io/rbd", for example created in this
|
||||
way:
|
||||
|
||||
```shell
|
||||
kubectl create secret generic ceph-secret --type="kubernetes.io/rbd" \
|
||||
--from-literal=key='QVFEQ1pMdFhPUnQrSmhBQUFYaERWNHJsZ3BsMmNjcDR6RFZST0E9PQ==' \
|
||||
--namespace=kube-system
|
||||
```
|
||||
* `userSecretNamespace`: The namespace for `userSecretName`.
|
||||
* `fsType`: fsType that is supported by kubernetes. Default: `"ext4"`.
|
||||
* `imageFormat`: Ceph RBD image format, "1" or "2". Default is "2".
|
||||
* `imageFeatures`: This parameter is optional and should only be used if you
|
||||
```shell
|
||||
kubectl create secret generic ceph-secret --type="kubernetes.io/rbd" \
|
||||
--from-literal=key='QVFEQ1pMdFhPUnQrSmhBQUFYaERWNHJsZ3BsMmNjcDR6RFZST0E9PQ==' \
|
||||
--namespace=kube-system
|
||||
```
|
||||
|
||||
- `userSecretNamespace`: The namespace for `userSecretName`.
|
||||
- `fsType`: fsType that is supported by kubernetes. Default: `"ext4"`.
|
||||
- `imageFormat`: Ceph RBD image format, "1" or "2". Default is "2".
|
||||
- `imageFeatures`: This parameter is optional and should only be used if you
|
||||
set `imageFormat` to "2". Currently supported features are `layering` only.
|
||||
Default is "", and no features are turned on.
|
||||
|
||||
|
@ -528,9 +529,9 @@ parameters:
|
|||
storageAccount: azure_storage_account_name
|
||||
```
|
||||
|
||||
* `skuName`: Azure storage account Sku tier. Default is empty.
|
||||
* `location`: Azure storage account location. Default is empty.
|
||||
* `storageAccount`: Azure storage account name. If a storage account is provided,
|
||||
- `skuName`: Azure storage account Sku tier. Default is empty.
|
||||
- `location`: Azure storage account location. Default is empty.
|
||||
- `storageAccount`: Azure storage account name. If a storage account is provided,
|
||||
it must reside in the same resource group as the cluster, and `location` is
|
||||
ignored. If a storage account is not provided, a new storage account will be
|
||||
created in the same resource group as the cluster.
|
||||
|
@ -548,21 +549,21 @@ parameters:
|
|||
kind: managed
|
||||
```
|
||||
|
||||
* `storageaccounttype`: Azure storage account Sku tier. Default is empty.
|
||||
* `kind`: Possible values are `shared`, `dedicated`, and `managed` (default).
|
||||
- `storageaccounttype`: Azure storage account Sku tier. Default is empty.
|
||||
- `kind`: Possible values are `shared`, `dedicated`, and `managed` (default).
|
||||
When `kind` is `shared`, all unmanaged disks are created in a few shared
|
||||
storage accounts in the same resource group as the cluster. When `kind` is
|
||||
`dedicated`, a new dedicated storage account will be created for the new
|
||||
unmanaged disk in the same resource group as the cluster. When `kind` is
|
||||
`managed`, all managed disks are created in the same resource group as
|
||||
unmanaged disk in the same resource group as the cluster. When `kind` is
|
||||
`managed`, all managed disks are created in the same resource group as
|
||||
the cluster.
|
||||
* `resourceGroup`: Specify the resource group in which the Azure disk will be created.
|
||||
It must be an existing resource group name. If it is unspecified, the disk will be
|
||||
placed in the same resource group as the current Kubernetes cluster.
|
||||
- `resourceGroup`: Specify the resource group in which the Azure disk will be created.
|
||||
It must be an existing resource group name. If it is unspecified, the disk will be
|
||||
placed in the same resource group as the current Kubernetes cluster.
|
||||
|
||||
- Premium VM can attach both Standard_LRS and Premium_LRS disks, while Standard
|
||||
* Premium VM can attach both Standard_LRS and Premium_LRS disks, while Standard
|
||||
VM can only attach Standard_LRS disks.
|
||||
- Managed VM can only attach managed disks and unmanaged VM can only attach
|
||||
* Managed VM can only attach managed disks and unmanaged VM can only attach
|
||||
unmanaged disks.
|
||||
|
||||
### Azure File
|
||||
|
@ -579,29 +580,29 @@ parameters:
|
|||
storageAccount: azure_storage_account_name
|
||||
```
|
||||
|
||||
* `skuName`: Azure storage account Sku tier. Default is empty.
|
||||
* `location`: Azure storage account location. Default is empty.
|
||||
* `storageAccount`: Azure storage account name. Default is empty. If a storage
|
||||
- `skuName`: Azure storage account Sku tier. Default is empty.
|
||||
- `location`: Azure storage account location. Default is empty.
|
||||
- `storageAccount`: Azure storage account name. Default is empty. If a storage
|
||||
account is not provided, all storage accounts associated with the resource
|
||||
group are searched to find one that matches `skuName` and `location`. If a
|
||||
storage account is provided, it must reside in the same resource group as the
|
||||
cluster, and `skuName` and `location` are ignored.
|
||||
* `secretNamespace`: the namespace of the secret that contains the Azure Storage
|
||||
- `secretNamespace`: the namespace of the secret that contains the Azure Storage
|
||||
Account Name and Key. Default is the same as the Pod.
|
||||
* `secretName`: the name of the secret that contains the Azure Storage Account Name and
|
||||
- `secretName`: the name of the secret that contains the Azure Storage Account Name and
|
||||
Key. Default is `azure-storage-account-<accountName>-secret`
|
||||
* `readOnly`: a flag indicating whether the storage will be mounted as read only.
|
||||
Defaults to false which means a read/write mount. This setting will impact the
|
||||
- `readOnly`: a flag indicating whether the storage will be mounted as read only.
|
||||
Defaults to false which means a read/write mount. This setting will impact the
|
||||
`ReadOnly` setting in VolumeMounts as well.
|
||||
|
||||
During storage provisioning, a secret named by `secretName` is created for the
|
||||
mounting credentials. If the cluster has enabled both
|
||||
[RBAC](/docs/reference/access-authn-authz/rbac/) and
|
||||
During storage provisioning, a secret named by `secretName` is created for the
|
||||
mounting credentials. If the cluster has enabled both
|
||||
[RBAC](/docs/reference/access-authn-authz/rbac/) and
|
||||
[Controller Roles](/docs/reference/access-authn-authz/rbac/#controller-roles),
|
||||
add the `create` permission of resource `secret` for clusterrole
|
||||
`system:controller:persistent-volume-binder`.
|
||||
|
||||
In a multi-tenancy context, it is strongly recommended to set the value for
|
||||
In a multi-tenancy context, it is strongly recommended to set the value for
|
||||
`secretNamespace` explicitly, otherwise the storage account credentials may
|
||||
be read by other users.
|
||||
|
||||
|
@ -615,26 +616,25 @@ metadata:
|
|||
provisioner: kubernetes.io/portworx-volume
|
||||
parameters:
|
||||
repl: "1"
|
||||
snap_interval: "70"
|
||||
priority_io: "high"
|
||||
|
||||
snap_interval: "70"
|
||||
priority_io: "high"
|
||||
```
|
||||
|
||||
* `fs`: filesystem to be laid out: `none/xfs/ext4` (default: `ext4`).
|
||||
* `block_size`: block size in Kbytes (default: `32`).
|
||||
* `repl`: number of synchronous replicas to be provided in the form of
|
||||
- `fs`: filesystem to be laid out: `none/xfs/ext4` (default: `ext4`).
|
||||
- `block_size`: block size in Kbytes (default: `32`).
|
||||
- `repl`: number of synchronous replicas to be provided in the form of
|
||||
replication factor `1..3` (default: `1`). A string is expected here, i.e.
|
||||
`"1"` and not `1`.
|
||||
* `priority_io`: determines whether the volume will be created from higher
|
||||
- `priority_io`: determines whether the volume will be created from higher
|
||||
performance or a lower priority storage `high/medium/low` (default: `low`).
|
||||
* `snap_interval`: clock/time interval in minutes for when to trigger snapshots.
|
||||
- `snap_interval`: clock/time interval in minutes for when to trigger snapshots.
|
||||
Snapshots are incremental based on difference with the prior snapshot, 0
|
||||
disables snaps (default: `0`). A string is expected here i.e.
|
||||
`"70"` and not `70`.
|
||||
* `aggregation_level`: specifies the number of chunks the volume would be
|
||||
- `aggregation_level`: specifies the number of chunks the volume would be
|
||||
distributed into, 0 indicates a non-aggregated volume (default: `0`). A string
|
||||
is expected here i.e. `"0"` and not `0`
|
||||
* `ephemeral`: specifies whether the volume should be cleaned-up after unmount
|
||||
- `ephemeral`: specifies whether the volume should be cleaned-up after unmount
|
||||
or should be persistent. `emptyDir` use case can set this value to true and
|
||||
`persistent volumes` use case such as for databases like Cassandra should set
|
||||
to false, `true/false` (default `false`). A string is expected here i.e.
|
||||
|
@ -660,4 +660,3 @@ specified by the `WaitForFirstConsumer` volume binding mode.
|
|||
Delaying volume binding allows the scheduler to consider all of a Pod's
|
||||
scheduling constraints when choosing an appropriate PersistentVolume for a
|
||||
PersistentVolumeClaim.
|
||||
|
||||
|
|
|
@ -227,7 +227,7 @@ $ kubectl get crd volumesnapshotcontent -o yaml
|
|||
|
||||
If you want to allow users to create a `PersistentVolumeClaim` from an existing
|
||||
`VolumeSnapshot`, but with a different volume mode than the source, the annotation
|
||||
`snapshot.storage.kubernetes.io/allowVolumeModeChange: "true"`needs to be added to
|
||||
`snapshot.storage.kubernetes.io/allow-volume-mode-change: "true"` needs to be added to
|
||||
the `VolumeSnapshotContent` that corresponds to the `VolumeSnapshot`.
|
||||
|
||||
For pre-provisioned snapshots, `spec.sourceVolumeMode` needs to be populated
|
||||
|
@ -241,7 +241,7 @@ kind: VolumeSnapshotContent
|
|||
metadata:
|
||||
name: new-snapshot-content-test
|
||||
annotations:
|
||||
- snapshot.storage.kubernetes.io/allowVolumeModeChange: "true"
|
||||
- snapshot.storage.kubernetes.io/allow-volume-mode-change: "true"
|
||||
spec:
|
||||
deletionPolicy: Delete
|
||||
driver: hostpath.csi.k8s.io
|
||||
|
|
|
@ -549,7 +549,7 @@ spec:
|
|||
|
||||
<!-- maintenance note: OK to remove all mention of glusterfs once the v1.25 release of
|
||||
Kubernetes has gone out of support -->
|
||||
-
|
||||
|
||||
Kubernetes {{< skew currentVersion >}} does not include a `glusterfs` volume type.
|
||||
|
||||
The GlusterFS in-tree storage driver was deprecated in the Kubernetes v1.25 release
|
||||
|
@ -1282,8 +1282,13 @@ in `Container.volumeMounts`. Its values are:
|
|||
In similar fashion, no mounts created by the container will be visible on
|
||||
the host. This is the default mode.
|
||||
|
||||
This mode is equal to `private` mount propagation as described in the
|
||||
[Linux kernel documentation](https://www.kernel.org/doc/Documentation/filesystems/sharedsubtree.txt)
|
||||
This mode is equal to `rprivate` mount propagation as described in
|
||||
[`mount(8)`](https://man7.org/linux/man-pages/man8/mount.8.html)
|
||||
|
||||
However, the CRI runtime may choose `rslave` mount propagation (i.e.,
|
||||
`HostToContainer`) instead, when `rprivate` propagation is not applicable.
|
||||
cri-dockerd (Docker) is known to choose `rslave` mount propagation when the
|
||||
mount source contains the Docker daemon's root directory (`/var/lib/docker`).
|
||||
|
||||
* `HostToContainer` - This volume mount will receive all subsequent mounts
|
||||
that are mounted to this volume or any of its subdirectories.
|
||||
|
@ -1296,7 +1301,7 @@ in `Container.volumeMounts`. Its values are:
|
|||
propagation will see it.
|
||||
|
||||
This mode is equal to `rslave` mount propagation as described in the
|
||||
[Linux kernel documentation](https://www.kernel.org/doc/Documentation/filesystems/sharedsubtree.txt)
|
||||
[`mount(8)`](https://man7.org/linux/man-pages/man8/mount.8.html)
|
||||
|
||||
* `Bidirectional` - This volume mount behaves the same as the `HostToContainer` mount.
|
||||
In addition, all volume mounts created by the container will be propagated
|
||||
|
@ -1306,7 +1311,7 @@ in `Container.volumeMounts`. Its values are:
|
|||
a Pod that needs to mount something on the host using a `hostPath` volume.
|
||||
|
||||
This mode is equal to `rshared` mount propagation as described in the
|
||||
[Linux kernel documentation](https://www.kernel.org/doc/Documentation/filesystems/sharedsubtree.txt)
|
||||
[`mount(8)`](https://man7.org/linux/man-pages/man8/mount.8.html)
|
||||
|
||||
{{< warning >}}
|
||||
`Bidirectional` mount propagation can be dangerous. It can damage
|
||||
|
|
|
@ -274,7 +274,8 @@ This functionality requires a container runtime that supports this functionality
|
|||
|
||||
#### Field compatibility for Pod security context {#compatibility-v1-pod-spec-containers-securitycontext}
|
||||
|
||||
None of the Pod [`securityContext`](/docs/reference/kubernetes-api/workload-resources/pod-v1/#security-context) fields work on Windows.
|
||||
Only the `securityContext.runAsNonRoot` and `securityContext.windowsOptions` from the Pod
|
||||
[`securityContext`](/docs/reference/kubernetes-api/workload-resources/pod-v1/#security-context) fields work on Windows.
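As a sketch, a Windows Pod would therefore limit its Pod-level security context to those two fields; the user name and image below are illustrative:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: windows-example            # hypothetical name
spec:
  securityContext:
    runAsNonRoot: true
    windowsOptions:
      runAsUserName: "ContainerUser"
  containers:
    - name: app
      image: mcr.microsoft.com/windows/servercore:ltsc2022  # illustrative image
```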
|
||||
|
||||
## Node problem detector
|
||||
|
||||
|
|
|
@ -105,30 +105,24 @@ If you do not specify either, then the DaemonSet controller will create Pods on
|
|||
|
||||
## How Daemon Pods are scheduled
|
||||
|
||||
### Scheduled by default scheduler
|
||||
A DaemonSet ensures that all eligible nodes run a copy of a Pod. The DaemonSet
|
||||
controller creates a Pod for each eligible node and adds the
|
||||
`spec.affinity.nodeAffinity` field of the Pod to match the target host. After
|
||||
the Pod is created, the default scheduler typically takes over and then binds
|
||||
the Pod to the target host by setting the `.spec.nodeName` field. If the new
|
||||
Pod cannot fit on the node, the default scheduler may preempt (evict) some of
|
||||
the existing Pods based on the
|
||||
[priority](/docs/concepts/scheduling-eviction/pod-priority-preemption/#pod-priority)
|
||||
of the new Pod.
|
||||
|
||||
{{< feature-state for_k8s_version="1.17" state="stable" >}}
|
||||
The user can specify a different scheduler for the Pods of the DaemonSet, by
|
||||
setting the `.spec.template.spec.schedulerName` field of the DaemonSet.
|
||||
|
||||
A DaemonSet ensures that all eligible nodes run a copy of a Pod. Normally, the
|
||||
node that a Pod runs on is selected by the Kubernetes scheduler. However,
|
||||
DaemonSet pods are created and scheduled by the DaemonSet controller instead.
|
||||
That introduces the following issues:
|
||||
|
||||
* Inconsistent Pod behavior: Normal Pods waiting to be scheduled are created
|
||||
and in `Pending` state, but DaemonSet pods are not created in `Pending`
|
||||
state. This is confusing to the user.
|
||||
* [Pod preemption](/docs/concepts/scheduling-eviction/pod-priority-preemption/)
|
||||
is handled by default scheduler. When preemption is enabled, the DaemonSet controller
|
||||
will make scheduling decisions without considering pod priority and preemption.
|
||||
|
||||
`ScheduleDaemonSetPods` allows you to schedule DaemonSets using the default
|
||||
scheduler instead of the DaemonSet controller, by adding the `NodeAffinity` term
|
||||
to the DaemonSet pods, instead of the `.spec.nodeName` term. The default
|
||||
scheduler is then used to bind the pod to the target host. If node affinity of
|
||||
the DaemonSet pod already exists, it is replaced (the original node affinity was
|
||||
taken into account before selecting the target host). The DaemonSet controller only
|
||||
performs these operations when creating or modifying DaemonSet pods, and no
|
||||
changes are made to the `spec.template` of the DaemonSet.
|
||||
The original node affinity specified at the
|
||||
`.spec.template.spec.affinity.nodeAffinity` field (if specified) is taken into
|
||||
consideration by the DaemonSet controller when evaluating the eligible nodes,
|
||||
but is replaced on the created Pod with the node affinity that matches the name
|
||||
of the eligible node.
|
||||
|
||||
```yaml
|
||||
nodeAffinity:
|
||||
|
@ -141,25 +135,40 @@ nodeAffinity:
|
|||
- target-host-name
|
||||
```
|
||||
|
||||
In addition, `node.kubernetes.io/unschedulable:NoSchedule` toleration is added
|
||||
automatically to DaemonSet Pods. The default scheduler ignores
|
||||
`unschedulable` Nodes when scheduling DaemonSet Pods.
|
||||
|
||||
### Taints and Tolerations
|
||||
### Taints and tolerations
|
||||
|
||||
Although Daemon Pods respect
|
||||
[taints and tolerations](/docs/concepts/scheduling-eviction/taint-and-toleration/),
|
||||
the following tolerations are added to DaemonSet Pods automatically according to
|
||||
the related features.
|
||||
The DaemonSet controller automatically adds a set of {{< glossary_tooltip
|
||||
text="tolerations" term_id="toleration" >}} to DaemonSet Pods:
|
||||
|
||||
| Toleration Key | Effect | Version | Description |
|
||||
| ---------------------------------------- | ---------- | ------- | ----------- |
|
||||
| `node.kubernetes.io/not-ready` | NoExecute | 1.13+ | DaemonSet pods will not be evicted when there are node problems such as a network partition. |
|
||||
| `node.kubernetes.io/unreachable` | NoExecute | 1.13+ | DaemonSet pods will not be evicted when there are node problems such as a network partition. |
|
||||
| `node.kubernetes.io/disk-pressure` | NoSchedule | 1.8+ | DaemonSet pods tolerate disk-pressure attributes by default scheduler. |
|
||||
| `node.kubernetes.io/memory-pressure` | NoSchedule | 1.8+ | DaemonSet pods tolerate memory-pressure attributes by default scheduler. |
|
||||
| `node.kubernetes.io/unschedulable` | NoSchedule | 1.12+ | DaemonSet pods tolerate unschedulable attributes by default scheduler. |
|
||||
| `node.kubernetes.io/network-unavailable` | NoSchedule | 1.12+ | DaemonSet pods, who uses host network, tolerate network-unavailable attributes by default scheduler. |
|
||||
{{< table caption="Tolerations for DaemonSet pods" >}}
|
||||
|
||||
| Toleration key | Effect | Details |
|
||||
| --------------------------------------------------------------------------------------------------------------------- | ------------ | --------------------------------------------------------------------------------------------------------------------------------------------- |
|
||||
| [`node.kubernetes.io/not-ready`](/docs/reference/labels-annotations-taints/#node-kubernetes-io-not-ready) | `NoExecute` | DaemonSet Pods can be scheduled onto nodes that are not healthy or ready to accept Pods. Any DaemonSet Pods running on such nodes will not be evicted. |
|
||||
| [`node.kubernetes.io/unreachable`](/docs/reference/labels-annotations-taints/#node-kubernetes-io-unreachable) | `NoExecute` | DaemonSet Pods can be scheduled onto nodes that are unreachable from the node controller. Any DaemonSet Pods running on such nodes will not be evicted. |
|
||||
| [`node.kubernetes.io/disk-pressure`](/docs/reference/labels-annotations-taints/#node-kubernetes-io-disk-pressure) | `NoSchedule` | DaemonSet Pods can be scheduled onto nodes with disk pressure issues. |
|
||||
| [`node.kubernetes.io/memory-pressure`](/docs/reference/labels-annotations-taints/#node-kubernetes-io-memory-pressure) | `NoSchedule` | DaemonSet Pods can be scheduled onto nodes with memory pressure issues. |
|
||||
| [`node.kubernetes.io/pid-pressure`](/docs/reference/labels-annotations-taints/#node-kubernetes-io-pid-pressure) | `NoSchedule` | DaemonSet Pods can be scheduled onto nodes with process pressure issues. |
|
||||
| [`node.kubernetes.io/unschedulable`](/docs/reference/labels-annotations-taints/#node-kubernetes-io-unschedulable) | `NoSchedule` | DaemonSet Pods can be scheduled onto nodes that are unschedulable. |
|
||||
| [`node.kubernetes.io/network-unavailable`](/docs/reference/labels-annotations-taints/#node-kubernetes-io-network-unavailable) | `NoSchedule` | **Only added for DaemonSet Pods that request host networking**, i.e., Pods having `spec.hostNetwork: true`. Such DaemonSet Pods can be scheduled onto nodes with unavailable network.|
|
||||
|
||||
{{< /table >}}
|
||||
|
||||
You can add your own tolerations to the Pods of a Daemonset as well, by
|
||||
defining these in the Pod template of the DaemonSet.
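For example, a DaemonSet that should also run on control plane nodes could add a toleration for the control plane taint in its Pod template. A sketch of the relevant excerpt:

```yaml
# excerpt of a DaemonSet manifest (sketch)
spec:
  template:
    spec:
      tolerations:
        - key: node-role.kubernetes.io/control-plane
          operator: Exists
          effect: NoSchedule
```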
|
||||
|
||||
Because the DaemonSet controller sets the
|
||||
`node.kubernetes.io/unschedulable:NoSchedule` toleration automatically,
|
||||
Kubernetes can run DaemonSet Pods on nodes that are marked as _unschedulable_.
|
||||
|
||||
If you use a DaemonSet to provide an important node-level function, such as
|
||||
[cluster networking](/docs/concepts/cluster-administration/networking/), it is
|
||||
helpful that Kubernetes places DaemonSet Pods on nodes before they are ready.
|
||||
For example, without that special toleration, you could end up in a deadlock
|
||||
situation where the node is not marked as ready because the network plugin is
|
||||
not running there, and at the same time the network plugin is not running on
|
||||
that node because the node is not yet ready.
|
||||
|
||||
## Communicating with Daemon Pods
|
||||
|
||||
|
|
|
@ -794,7 +794,7 @@ These are some requirements and semantics of the API:
|
|||
are evaluated in order. Once a rule matches a Pod failure, the remaining rules
|
||||
are ignored. When no rule matches the Pod failure, the default
|
||||
handling applies.
|
||||
- you may want to restrict a rule to a specific container by specifing its name
|
||||
- you may want to restrict a rule to a specific container by specifying its name
|
||||
in `spec.podFailurePolicy.rules[*].containerName`. When not specified, the rule
|
||||
applies to all containers. When specified, it should match one of the container
|
||||
or `initContainer` names in the Pod template.
|
||||
|
|
|
@ -69,7 +69,7 @@ kubectl get rs
|
|||
|
||||
And see the frontend one you created:
|
||||
|
||||
```shell
|
||||
```
|
||||
NAME DESIRED CURRENT READY AGE
|
||||
frontend 3 3 3 6s
|
||||
```
|
||||
|
@ -118,7 +118,7 @@ kubectl get pods
|
|||
|
||||
You should see Pod information similar to:
|
||||
|
||||
```shell
|
||||
```
|
||||
NAME READY STATUS RESTARTS AGE
|
||||
frontend-b2zdv 1/1 Running 0 6m36s
|
||||
frontend-vcmts 1/1 Running 0 6m36s
|
||||
|
@ -160,7 +160,7 @@ While you can create bare Pods with no problems, it is strongly recommended to m
labels which match the selector of one of your ReplicaSets. The reason for this is because a ReplicaSet is not limited
to owning Pods specified by its template-- it can acquire other Pods in the manner specified in the previous sections.

Take the previous frontend ReplicaSet example, and the Pods specified in the following manifest:

{{< codenew file="pods/pod-rs.yaml" >}}

@ -229,9 +229,9 @@ As with all other Kubernetes API objects, a ReplicaSet needs the `apiVersion`, `

For ReplicaSets, the `kind` is always a ReplicaSet.

When the control plane creates new Pods for a ReplicaSet, the `.metadata.name` of the
ReplicaSet is part of the basis for naming those Pods. The name of a ReplicaSet must be a valid
[DNS subdomain](/docs/concepts/overview/working-with-objects/names#dns-subdomain-names)
value, but this can produce unexpected results for the Pod hostnames. For best compatibility,
the name should follow the more restrictive rules for a
[DNS label](/docs/concepts/overview/working-with-objects/names#dns-label-names).

@ -288,8 +288,8 @@ When using the REST API or the `client-go` library, you must set `propagationPol
```shell
kubectl proxy --port=8080
curl -X DELETE 'localhost:8080/apis/apps/v1/namespaces/default/replicasets/frontend' \
  -d '{"kind":"DeleteOptions","apiVersion":"v1","propagationPolicy":"Foreground"}' \
  -H "Content-Type: application/json"
```

### Deleting just a ReplicaSet

@ -303,11 +303,11 @@ For example:

```shell
kubectl proxy --port=8080
curl -X DELETE 'localhost:8080/apis/apps/v1/namespaces/default/replicasets/frontend' \
  -d '{"kind":"DeleteOptions","apiVersion":"v1","propagationPolicy":"Orphan"}' \
  -H "Content-Type: application/json"
```

Once the original is deleted, you can create a new ReplicaSet to replace it. As long
as the old and new `.spec.selector` are the same, then the new one will adopt the old Pods.
However, it will not make any effort to make existing Pods match a new, different pod template.
To update Pods to a new spec in a controlled way, use a

@ -335,19 +335,19 @@ prioritize scaling down pods based on the following general algorithm:
1. If the pods' creation times differ, the pod that was created more recently
   comes before the older pod (the creation times are bucketed on an integer log scale
   when the `LogarithmicScaleDown` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) is enabled)

If all of the above match, then selection is random.

### Pod deletion cost

{{< feature-state for_k8s_version="v1.22" state="beta" >}}

Using the [`controller.kubernetes.io/pod-deletion-cost`](/docs/reference/labels-annotations-taints/#pod-deletion-cost)
annotation, users can set a preference regarding which pods to remove first when downscaling a ReplicaSet.

The annotation should be set on the pod, the range is [-2147483647, 2147483647]. It represents the cost of
deleting a pod compared to other pods belonging to the same ReplicaSet. Pods with lower deletion
cost are preferred to be deleted before pods with higher deletion cost.

The implicit value for this annotation for pods that don't set it is 0; negative values are permitted.
Invalid values will be rejected by the API server.

@ -360,13 +360,13 @@ This feature is beta and enabled by default. You can disable it using the

- This is honored on a best-effort basis, so it does not offer any guarantees on pod deletion order.
- Users should avoid updating the annotation frequently, such as updating it based on a metric value,
  because doing so will generate a significant number of pod updates on the apiserver.
{{< /note >}}

#### Example Use Case

The different pods of an application could have different utilization levels. On scale down, the application
may prefer to remove the pods with lower utilization. To avoid frequently updating the pods, the application
should update `controller.kubernetes.io/pod-deletion-cost` once before issuing a scale down (setting the
annotation to a value proportional to pod utilization level). This works if the application itself controls
the down scaling; for example, the driver pod of a Spark deployment.

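For illustration only (the Pod name, labels and image are made up), the application could set the annotation on a Pod it would prefer to see removed first:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: frontend-lowutil             # illustrative name
  labels:
    tier: frontend
  annotations:
    # Lower values are preferred for deletion when the ReplicaSet scales down.
    controller.kubernetes.io/pod-deletion-cost: "-100"
spec:
  containers:
  - name: php-redis
    image: gcr.io/google_samples/gb-frontend:v3   # placeholder image
```
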
@ -400,7 +400,7 @@ kubectl autoscale rs frontend --max=10 --min=3 --cpu-percent=50
[`Deployment`](/docs/concepts/workloads/controllers/deployment/) is an object which can own ReplicaSets and update
them and their Pods via declarative, server-side rolling updates.
While ReplicaSets can be used independently, today they're mainly used by Deployments as a mechanism to orchestrate Pod
creation, deletion and updates. When you use Deployments you don't have to worry about managing the ReplicaSets that
they create. Deployments own and manage their ReplicaSets.
As such, it is recommended to use Deployments when you want ReplicaSets.

@ -422,7 +422,7 @@ expected to terminate on their own (that is, batch jobs).

### DaemonSet

Use a [`DaemonSet`](/docs/concepts/workloads/controllers/daemonset/) instead of a ReplicaSet for Pods that provide a
machine-level function, such as machine monitoring or machine logging. These Pods have a lifetime that is tied
to a machine lifetime: the Pod needs to be running on the machine before other Pods start, and are
safe to terminate when the machine is otherwise ready to be rebooted/shutdown.

@ -444,4 +444,3 @@ As such, ReplicaSets are preferred over ReplicationControllers

object definition to understand the API for replica sets.
* Read about [PodDisruptionBudget](/docs/concepts/workloads/pods/disruptions/) and how
  you can use it to manage application availability during disruptions.

@ -1,75 +1,87 @@
---
reviewers:
- janetkuo
title: Automatic Clean-up for Finished Jobs
title: Automatic Cleanup for Finished Jobs
content_type: concept
weight: 70
description: >-
  A time-to-live mechanism to clean up old Jobs that have finished execution.
---

<!-- overview -->

{{< feature-state for_k8s_version="v1.23" state="stable" >}}

TTL-after-finished {{<glossary_tooltip text="controller" term_id="controller">}} provides a
TTL (time to live) mechanism to limit the lifetime of resource objects that
have finished execution. TTL controller only handles
{{< glossary_tooltip text="Jobs" term_id="job" >}}.
When your Job has finished, it's useful to keep that Job in the API (and not immediately delete the Job)
so that you can tell whether the Job succeeded or failed.

Kubernetes' TTL-after-finished {{<glossary_tooltip text="controller" term_id="controller">}} provides a
TTL (time to live) mechanism to limit the lifetime of Job objects that
have finished execution.

<!-- body -->

## TTL-after-finished Controller
## Cleanup for finished Jobs

The TTL-after-finished controller is only supported for Jobs. A cluster operator can use this feature to clean
The TTL-after-finished controller is only supported for Jobs. You can use this mechanism to clean
up finished Jobs (either `Complete` or `Failed`) automatically by specifying the
`.spec.ttlSecondsAfterFinished` field of a Job, as in this
[example](/docs/concepts/workloads/controllers/job/#clean-up-finished-jobs-automatically).
The TTL-after-finished controller will assume that a job is eligible to be cleaned up
TTL seconds after the job has finished, in other words, when the TTL has expired. When the
TTL-after-finished controller cleans up a job, it will delete it cascadingly, that is to say it will delete
its dependent objects together with it. Note that when the job is deleted,
its lifecycle guarantees, such as finalizers, will be honored.

The TTL seconds can be set at any time. Here are some examples for setting the
The TTL-after-finished controller assumes that a Job is eligible to be cleaned up
TTL seconds after the Job has finished. The timer starts once the
status condition of the Job changes to show that the Job is either `Complete` or `Failed`; once the TTL has
expired, that Job becomes eligible for
[cascading](/docs/concepts/architecture/garbage-collection/#cascading-deletion) removal. When the
TTL-after-finished controller cleans up a job, it will delete it cascadingly, that is to say it will delete
its dependent objects together with it.

Kubernetes honors object lifecycle guarantees on the Job, such as waiting for
[finalizers](/docs/concepts/overview/working-with-objects/finalizers/).

You can set the TTL seconds at any time. Here are some examples for setting the
`.spec.ttlSecondsAfterFinished` field of a Job:

* Specify this field in the job manifest, so that a Job can be cleaned up
* Specify this field in the Job manifest, so that a Job can be cleaned up
  automatically some time after it finishes (see the sketch below).
* Set this field of existing, already finished jobs, to adopt this new
  feature.
* Manually set this field of existing, already finished Jobs, so that they become eligible
  for cleanup.
* Use a
  [mutating admission webhook](/docs/reference/access-authn-authz/extensible-admission-controllers/#admission-webhooks)
  to set this field dynamically at job creation time. Cluster administrators can
  [mutating admission webhook](/docs/reference/access-authn-authz/admission-controllers/#mutatingadmissionwebhook)
  to set this field dynamically at Job creation time. Cluster administrators can
  use this to enforce a TTL policy for finished jobs.
* Use a
  [mutating admission webhook](/docs/reference/access-authn-authz/extensible-admission-controllers/#admission-webhooks)
  to set this field dynamically after the job has finished, and choose
  different TTL values based on job status, labels, etc.
  [mutating admission webhook](/docs/reference/access-authn-authz/admission-controllers/#mutatingadmissionwebhook)
  to set this field dynamically after the Job has finished, and choose
  different TTL values based on job status or labels. For this case, the webhook needs
  to detect changes to the `.status` of the Job and only set a TTL when the Job
  is being marked as completed.
* Write your own controller to manage the cleanup TTL for Jobs that match a particular
  {{< glossary_tooltip term_id="selector" text="selector" >}}.

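As a sketch of the first approach (the Job name, image and command are illustrative, and 100 seconds is an arbitrary choice), the field sits directly under `spec`:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: pi-with-ttl                  # illustrative name
spec:
  ttlSecondsAfterFinished: 100       # delete the Job and its dependent Pods 100s after it finishes
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: pi
        image: docker.io/library/perl:5.34.0
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
```
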
## Caveat
## Caveats

### Updating TTL Seconds
### Updating TTL for finished Jobs

Note that the TTL period, e.g. `.spec.ttlSecondsAfterFinished` field of Jobs,
can be modified after the job is created or has finished. However, once the
Job becomes eligible to be deleted (when the TTL has expired), the system won't
guarantee that the Jobs will be kept, even if an update to extend the TTL
returns a successful API response.
You can modify the TTL period, e.g. `.spec.ttlSecondsAfterFinished` field of Jobs,
after the job is created or has finished. If you extend the TTL period after the
existing `ttlSecondsAfterFinished` period has expired, Kubernetes doesn't guarantee
to retain that Job, even if an update to extend the TTL returns a successful API
response.

### Time Skew
### Time skew

Because TTL-after-finished controller uses timestamps stored in the Kubernetes jobs to
Because the TTL-after-finished controller uses timestamps stored in the Kubernetes jobs to
determine whether the TTL has expired or not, this feature is sensitive to time
skew in the cluster, which may cause TTL-after-finish controller to clean up job objects
skew in your cluster, which may cause the control plane to clean up Job objects
at the wrong time.

Clocks aren't always correct, but the difference should be
very small. Please be aware of this risk when setting a non-zero TTL.

## {{% heading "whatsnext" %}}

* [Clean up Jobs automatically](/docs/concepts/workloads/controllers/job/#clean-up-finished-jobs-automatically)

* [Design doc](https://github.com/kubernetes/enhancements/blob/master/keps/sig-apps/592-ttl-after-finish/README.md)
* Read [Clean up Jobs automatically](/docs/concepts/workloads/controllers/job/#clean-up-finished-jobs-automatically)

* Refer to the [Kubernetes Enhancement Proposal](https://github.com/kubernetes/enhancements/blob/master/keps/sig-apps/592-ttl-after-finish/README.md)
  (KEP) for adding this mechanism.

@ -289,14 +289,31 @@ section.
## Privileged mode for containers

In Linux, any container in a Pod can enable privileged mode using the `privileged` (Linux) flag on the [security context](/docs/tasks/configure-pod-container/security-context/) of the container spec. This is useful for containers that want to use operating system administrative capabilities such as manipulating the network stack or accessing hardware devices.

If your cluster has the `WindowsHostProcessContainers` feature enabled, you can create a [Windows HostProcess pod](/docs/tasks/configure-pod-container/create-hostprocess-pod) by setting the `windowsOptions.hostProcess` flag on the security context of the pod spec. All containers in these pods must run as Windows HostProcess containers. HostProcess pods run directly on the host and can also be used to perform administrative tasks as is done with Linux privileged containers.

{{< note >}}
Your {{< glossary_tooltip text="container runtime" term_id="container-runtime" >}} must support the concept of a privileged container for this setting to be relevant.
{{< /note >}}

Any container in a pod can run in privileged mode to use operating system administrative capabilities
that would otherwise be inaccessible. This is available for both Windows and Linux.

### Linux privileged containers

In Linux, any container in a Pod can enable privileged mode using the `privileged` (Linux) flag
on the [security context](/docs/tasks/configure-pod-container/security-context/) of the
container spec. This is useful for containers that want to use operating system administrative
capabilities such as manipulating the network stack or accessing hardware devices.

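A minimal sketch of what that looks like in a manifest (the Pod name and image are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: privileged-demo              # illustrative name
spec:
  containers:
  - name: shell
    image: docker.io/library/busybox:1.36
    command: ["sleep", "3600"]
    securityContext:
      privileged: true               # the container gets host-level administrative capabilities
```
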
### Windows privileged containers

{{< feature-state for_k8s_version="v1.26" state="stable" >}}

In Windows, you can create a [Windows HostProcess pod](/docs/tasks/configure-pod-container/create-hostprocess-pod)
by setting the `windowsOptions.hostProcess` flag on the security context of the pod spec. All containers in these
pods must run as Windows HostProcess containers. HostProcess pods run directly on the host and can also be used
to perform administrative tasks as is done with Linux privileged containers. In order to use this feature, the
`WindowsHostProcessContainers` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) must be enabled.

## Static Pods

_Static Pods_ are managed directly by the kubelet daemon on a specific node,

@ -63,6 +63,9 @@ via either mechanism are:
`metadata.labels['<KEY>']`
: the text value of the pod's {{< glossary_tooltip text="label" term_id="label" >}} named `<KEY>` (for example, `metadata.labels['mylabel']`)

The following information is available through environment variables
**but not as a downwardAPI volume fieldRef**:

`spec.serviceAccountName`
: the name of the pod's {{< glossary_tooltip text="service account" term_id="service-account" >}}

@ -75,8 +78,8 @@ via either mechanism are:
`status.podIP`
: the pod's primary IP address (usually, its IPv4 address)

In addition, the following information is available through
a `downwardAPI` volume `fieldRef`, but **not as environment variables**:
The following information is available through a `downwardAPI` volume
`fieldRef`, **but not as environment variables**:

`metadata.labels`
: all of the pod's labels, formatted as `label-key="escaped-label-value"` with one label per line

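As an illustrative sketch (the Pod name, label and image are made up), exposing `metadata.labels` through a `downwardAPI` volume could look like this:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: downward-labels-demo         # illustrative name
  labels:
    app: demo
spec:
  containers:
  - name: main
    image: docker.io/library/busybox:1.36
    command: ["sh", "-c", "cat /etc/podinfo/labels; sleep 3600"]
    volumeMounts:
    - name: podinfo
      mountPath: /etc/podinfo
  volumes:
  - name: podinfo
    downwardAPI:
      items:
      - path: "labels"
        fieldRef:
          fieldPath: metadata.labels   # available via volume, not as an environment variable
```
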
@ -267,6 +267,11 @@ after successful sandbox creation and network configuration by the runtime
plugin). For a Pod without init containers, the kubelet sets the `Initialized`
condition to `True` before sandbox creation and network configuration starts.

### Pod scheduling readiness {#pod-scheduling-readiness-gate}

{{< feature-state for_k8s_version="v1.26" state="alpha" >}}

See [Pod Scheduling Readiness](/docs/concepts/scheduling-eviction/pod-scheduling-readiness/) for more information.

## Container probes

@ -0,0 +1,117 @@
---
title: Pod Quality of Service Classes
content_type: concept
weight: 85
---

<!-- overview -->

This page introduces _Quality of Service (QoS) classes_ in Kubernetes, and explains
how Kubernetes assigns a QoS class to each Pod as a consequence of the resource
constraints that you specify for the containers in that Pod. Kubernetes relies on this
classification to make decisions about which Pods to evict when there are not enough
available resources on a Node.

<!-- body -->

## Quality of Service classes

Kubernetes classifies the Pods that you run and allocates each Pod into a specific
_quality of service (QoS) class_. Kubernetes uses that classification to influence how different
pods are handled. Kubernetes does this classification based on the
[resource requests](/docs/concepts/configuration/manage-resources-containers/)
of the {{< glossary_tooltip text="Containers" term_id="container" >}} in that Pod, along with
how those requests relate to resource limits.
This is known as {{< glossary_tooltip text="Quality of Service" term_id="qos-class" >}}
(QoS) class. Kubernetes assigns every Pod a QoS class based on the resource requests
and limits of its component Containers. QoS classes are used by Kubernetes to decide
which Pods to evict from a Node experiencing
[Node Pressure](/docs/concepts/scheduling-eviction/node-pressure-eviction/). The possible
QoS classes are `Guaranteed`, `Burstable`, and `BestEffort`. When a Node runs out of resources,
Kubernetes will first evict `BestEffort` Pods running on that Node, followed by `Burstable` and
finally `Guaranteed` Pods. When this eviction is due to resource pressure, only Pods exceeding
resource requests are candidates for eviction.

### Guaranteed

Pods that are `Guaranteed` have the strictest resource limits and are least likely
to face eviction. They are guaranteed not to be killed until they exceed their limits
or there are no lower-priority Pods that can be preempted from the Node. They may
not acquire resources beyond their specified limits. These Pods can also make
use of exclusive CPUs using the
[`static`](/docs/tasks/administer-cluster/cpu-management-policies/#static-policy) CPU management policy.

#### Criteria

For a Pod to be given a QoS class of `Guaranteed` (see the sketch after this list):

* Every Container in the Pod must have a memory limit and a memory request.
* For every Container in the Pod, the memory limit must equal the memory request.
* Every Container in the Pod must have a CPU limit and a CPU request.
* For every Container in the Pod, the CPU limit must equal the CPU request.

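A minimal sketch of a Pod that meets these criteria (the name and image are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: qos-guaranteed-demo          # illustrative name
spec:
  containers:
  - name: app
    image: registry.k8s.io/pause:3.9 # placeholder image
    resources:
      requests:
        cpu: "700m"
        memory: "200Mi"
      limits:                        # limits equal requests for both CPU and memory
        cpu: "700m"
        memory: "200Mi"
```
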
### Burstable

Pods that are `Burstable` have some lower-bound resource guarantees based on the request, but
do not require a specific limit. If a limit is not specified, it defaults to a
limit equivalent to the capacity of the Node, which allows the Pods to flexibly increase
their resources if resources are available. In the event of Pod eviction due to Node
resource pressure, these Pods are evicted only after all `BestEffort` Pods are evicted.
Because a `Burstable` Pod can include a Container that has no resource limits or requests, a Pod
that is `Burstable` can try to use any amount of node resources.

#### Criteria

A Pod is given a QoS class of `Burstable` if (see the sketch after this list):

* The Pod does not meet the criteria for QoS class `Guaranteed`.
* At least one Container in the Pod has a memory or CPU request or limit.

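For example, a Pod like the following sketch (name and image are illustrative) is `Burstable`, because it sets a memory request without any limits:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: qos-burstable-demo           # illustrative name
spec:
  containers:
  - name: app
    image: registry.k8s.io/pause:3.9 # placeholder image
    resources:
      requests:
        memory: "100Mi"              # a request without a matching limit
```
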
### BestEffort

Pods in the `BestEffort` QoS class can use node resources that aren't specifically assigned
to Pods in other QoS classes. For example, if you have a node with 16 CPU cores available to the
kubelet, and you assign 4 CPU cores to a `Guaranteed` Pod, then a Pod in the `BestEffort`
QoS class can try to use any amount of the remaining 12 CPU cores.

The kubelet prefers to evict `BestEffort` Pods if the node comes under resource pressure.

#### Criteria

A Pod has a QoS class of `BestEffort` if it doesn't meet the criteria for either `Guaranteed`
or `Burstable`. In other words, a Pod is `BestEffort` only if none of the Containers in the Pod have a
memory limit or a memory request, and none of the Containers in the Pod have a
CPU limit or a CPU request.
Containers in a Pod can request other resources (not CPU or memory) and still be classified as
`BestEffort`.

## Some behavior is independent of QoS class {#class-independent-behavior}

Certain behavior is independent of the QoS class assigned by Kubernetes. For example:

* Any Container exceeding a resource limit will be killed and restarted by the kubelet without
  affecting other Containers in that Pod.

* If a Container exceeds its resource request and the node it runs on faces
  resource pressure, the Pod it is in becomes a candidate for [eviction](/docs/concepts/scheduling-eviction/node-pressure-eviction/).
  If this occurs, all Containers in the Pod will be terminated. Kubernetes may create a
  replacement Pod, usually on a different node.

* The resource request of a Pod is equal to the sum of the resource requests of
  its component Containers, and the resource limit of a Pod is equal to the sum of
  the resource limits of its component Containers.

* The kube-scheduler does not consider QoS class when selecting which Pods to
  [preempt](/docs/concepts/scheduling-eviction/pod-priority-preemption/#preemption).
  Preemption can occur when a cluster does not have enough resources to run all the Pods
  you defined.

## {{% heading "whatsnext" %}}

* Learn about [resource management for Pods and Containers](/docs/concepts/configuration/manage-resources-containers/).
* Learn about [Node-pressure eviction](/docs/concepts/scheduling-eviction/node-pressure-eviction/).
* Learn about [Pod priority and preemption](/docs/concepts/scheduling-eviction/pod-priority-preemption/).
* Learn about [Pod disruptions](/docs/concepts/workloads/pods/disruptions/).
* Learn how to [assign memory resources to containers and pods](/docs/tasks/configure-pod-container/assign-memory-resource/).
* Learn how to [assign CPU resources to containers and pods](/docs/tasks/configure-pod-container/assign-cpu-resource/).
* Learn how to [configure Quality of Service for Pods](/docs/tasks/configure-pod-container/quality-service-pod/).
|
@ -373,21 +373,21 @@ An example request body:
|
|||
|
||||
```json
|
||||
{
|
||||
"apiVersion":"imagepolicy.k8s.io/v1alpha1",
|
||||
"kind":"ImageReview",
|
||||
"spec":{
|
||||
"containers":[
|
||||
"apiVersion": "imagepolicy.k8s.io/v1alpha1",
|
||||
"kind": "ImageReview",
|
||||
"spec": {
|
||||
"containers": [
|
||||
{
|
||||
"image":"myrepo/myimage:v1"
|
||||
"image": "myrepo/myimage:v1"
|
||||
},
|
||||
{
|
||||
"image":"myrepo/myimage@sha256:beb6bd6a68f114c1dc2ea4b28db81bdf91de202a9014972bec5e4d9171d90ed"
|
||||
"image": "myrepo/myimage@sha256:beb6bd6a68f114c1dc2ea4b28db81bdf91de202a9014972bec5e4d9171d90ed"
|
||||
}
|
||||
],
|
||||
"annotations":{
|
||||
"annotations": {
|
||||
"mycluster.image-policy.k8s.io/ticket-1234": "break-glass"
|
||||
},
|
||||
"namespace":"mynamespace"
|
||||
"namespace": "mynamespace"
|
||||
}
|
||||
}
|
||||
```
|
||||
|
@ -610,9 +610,9 @@ This file may be json or yaml and has the following format:
|
|||
|
||||
```yaml
|
||||
podNodeSelectorPluginConfig:
|
||||
clusterDefaultNodeSelector: name-of-node-selector
|
||||
namespace1: name-of-node-selector
|
||||
namespace2: name-of-node-selector
|
||||
clusterDefaultNodeSelector: name-of-node-selector
|
||||
namespace1: name-of-node-selector
|
||||
namespace2: name-of-node-selector
|
||||
```
|
||||
|
||||
Reference the `PodNodeSelector` configuration file from the file provided to the API server's
|
||||
|
@ -663,23 +663,15 @@ admission plugin, which allows preventing pods from running on specifically tain
|
|||
|
||||
{{< feature-state for_k8s_version="v1.25" state="stable" >}}
|
||||
|
||||
This is the replacement for the deprecated [PodSecurityPolicy](#podsecuritypolicy) admission controller
|
||||
defined in the next section. This admission controller acts on creation and modification of the pod and
|
||||
determines if it should be admitted based on the requested security context and the
|
||||
[Pod Security Standards](/docs/concepts/security/pod-security-standards/).
|
||||
The PodSecurity admission controller checks new Pods before they are
|
||||
admitted, determines if it should be admitted based on the requested security context and the restrictions on permitted
|
||||
[Pod Security Standards](/docs/concepts/security/pod-security-standards/)
|
||||
for the namespace that the Pod would be in.
|
||||
|
||||
See the [Pod Security Admission documentation](/docs/concepts/security/pod-security-admission/)
|
||||
for more information.
|
||||
See the [Pod Security Admission](/docs/concepts/security/pod-security-admission/)
|
||||
documentation for more information.
|
||||
|
||||
### PodSecurityPolicy {#podsecuritypolicy}
|
||||
|
||||
{{< feature-state for_k8s_version="v1.21" state="deprecated" >}}
|
||||
|
||||
This admission controller acts on creation and modification of the pod and determines if it should be admitted
|
||||
based on the requested security context and the available Pod Security Policies.
|
||||
|
||||
See also the [PodSecurityPolicy](/docs/concepts/security/pod-security-policy/) documentation
|
||||
for more information.
|
||||
PodSecurity replaced an older admission controller named PodSecurityPolicy.
|
||||
|
||||
### PodTolerationRestriction {#podtolerationrestriction}
|
||||
|
||||
|
@ -744,17 +736,37 @@ for more information.
|
|||
|
||||
### SecurityContextDeny {#securitycontextdeny}
|
||||
|
||||
This admission controller will deny any Pod that attempts to set certain escalating
|
||||
[SecurityContext](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#securitycontext-v1-core)
|
||||
fields, as shown in the
|
||||
[Configure a Security Context for a Pod or Container](/docs/tasks/configure-pod-container/security-context/)
|
||||
task.
|
||||
If you don't use [Pod Security admission](/docs/concepts/security/pod-security-admission/),
|
||||
[PodSecurityPolicies](/docs/concepts/security/pod-security-policy/), nor any external enforcement mechanism,
|
||||
then you could use this admission controller to restrict the set of values a security context can take.
|
||||
{{< feature-state for_k8s_version="v1.0" state="alpha" >}}
|
||||
|
||||
See [Pod Security Standards](/docs/concepts/security/pod-security-standards/) for more context on restricting
|
||||
pod privileges.
|
||||
{{< caution >}}
|
||||
This admission controller plugin is **outdated** and **incomplete**, it may be
|
||||
unusable or not do what you would expect. It was originally designed to prevent
|
||||
the use of some, but not all, security-sensitive fields. Indeed, fields like
|
||||
`privileged`, were not filtered at creation and the plugin was not updated with
|
||||
the most recent fields, and new APIs like the `ephemeralContainers` field for a
|
||||
Pod.
|
||||
|
||||
The [Pod Security Admission](/docs/concepts/security/pod-security-admission/)
|
||||
plugin enforcing the [Pod Security Standards](/docs/concepts/security/pod-security-standards/)
|
||||
`Restricted` profile captures what this plugin was trying to achieve in a better
|
||||
and up-to-date way.
|
||||
{{< /caution >}}
|
||||
|
||||
This admission controller will deny any Pod that attempts to set the following
|
||||
[SecurityContext](/docs/reference/kubernetes-api/workload-resources/pod-v1/#security-context)
|
||||
fields:
|
||||
- `.spec.securityContext.supplementalGroups`
|
||||
- `.spec.securityContext.seLinuxOptions`
|
||||
- `.spec.securityContext.runAsUser`
|
||||
- `.spec.securityContext.fsGroup`
|
||||
- `.spec.(init)Containers[*].securityContext.seLinuxOptions`
|
||||
- `.spec.(init)Containers[*].securityContext.runAsUser`
|
||||
|
||||
For more historical context on this plugin, see
|
||||
[The birth of PodSecurityPolicy](/blog/2022/08/23/podsecuritypolicy-the-historical-context/#the-birth-of-podsecuritypolicy)
|
||||
from the Kubernetes blog article about PodSecurityPolicy and its removal. The
|
||||
article details the PodSecurityPolicy historical context and the birth of the
|
||||
`securityContext` field for Pods.
|
||||
|
||||
### ServiceAccount {#serviceaccount}
|
||||
|
||||
|
|
|
@ -104,54 +104,54 @@ Kubernetes provides built-in signers that each have a well-known `signerName`:
|
|||
|
||||
1. `kubernetes.io/kube-apiserver-client`: signs certificates that will be honored as client certificates by the API server.
|
||||
Never auto-approved by {{< glossary_tooltip term_id="kube-controller-manager" >}}.
|
||||
1. Trust distribution: signed certificates must be honored as client certificates by the API server. The CA bundle is not distributed by any other means.
|
||||
1. Permitted subjects - no subject restrictions, but approvers and signers may choose not to approve or sign.
|
||||
Certain subjects like cluster-admin level users or groups vary between distributions and installations,
|
||||
but deserve additional scrutiny before approval and signing.
|
||||
The `CertificateSubjectRestriction` admission plugin is enabled by default to restrict `system:masters`,
|
||||
but it is often not the only cluster-admin subject in a cluster.
|
||||
1. Permitted x509 extensions - honors subjectAltName and key usage extensions and discards other extensions.
|
||||
1. Permitted key usages - must include `["client auth"]`. Must not include key usages beyond `["digital signature", "key encipherment", "client auth"]`.
|
||||
1. Expiration/certificate lifetime - for the kube-controller-manager implementation of this signer, set to the minimum
|
||||
of the `--cluster-signing-duration` option or, if specified, the `spec.expirationSeconds` field of the CSR object.
|
||||
1. CA bit allowed/disallowed - not allowed.
|
||||
1. Trust distribution: signed certificates must be honored as client certificates by the API server. The CA bundle is not distributed by any other means.
|
||||
1. Permitted subjects - no subject restrictions, but approvers and signers may choose not to approve or sign.
|
||||
Certain subjects like cluster-admin level users or groups vary between distributions and installations,
|
||||
but deserve additional scrutiny before approval and signing.
|
||||
The `CertificateSubjectRestriction` admission plugin is enabled by default to restrict `system:masters`,
|
||||
but it is often not the only cluster-admin subject in a cluster.
|
||||
1. Permitted x509 extensions - honors subjectAltName and key usage extensions and discards other extensions.
|
||||
1. Permitted key usages - must include `["client auth"]`. Must not include key usages beyond `["digital signature", "key encipherment", "client auth"]`.
|
||||
1. Expiration/certificate lifetime - for the kube-controller-manager implementation of this signer, set to the minimum
|
||||
of the `--cluster-signing-duration` option or, if specified, the `spec.expirationSeconds` field of the CSR object.
|
||||
1. CA bit allowed/disallowed - not allowed.
|
||||
|
||||
1. `kubernetes.io/kube-apiserver-client-kubelet`: signs client certificates that will be honored as client certificates by the
|
||||
API server.
|
||||
May be auto-approved by {{< glossary_tooltip term_id="kube-controller-manager" >}}.
|
||||
1. Trust distribution: signed certificates must be honored as client certificates by the API server. The CA bundle
|
||||
is not distributed by any other means.
|
||||
1. Permitted subjects - organizations are exactly `["system:nodes"]`, common name starts with "`system:node:`".
|
||||
1. Permitted x509 extensions - honors key usage extensions, forbids subjectAltName extensions and drops other extensions.
|
||||
1. Permitted key usages - exactly `["key encipherment", "digital signature", "client auth"]`.
|
||||
1. Expiration/certificate lifetime - for the kube-controller-manager implementation of this signer, set to the minimum
|
||||
of the `--cluster-signing-duration` option or, if specified, the `spec.expirationSeconds` field of the CSR object.
|
||||
1. CA bit allowed/disallowed - not allowed.
|
||||
1. Trust distribution: signed certificates must be honored as client certificates by the API server. The CA bundle
|
||||
is not distributed by any other means.
|
||||
1. Permitted subjects - organizations are exactly `["system:nodes"]`, common name starts with "`system:node:`".
|
||||
1. Permitted x509 extensions - honors key usage extensions, forbids subjectAltName extensions and drops other extensions.
|
||||
1. Permitted key usages - exactly `["key encipherment", "digital signature", "client auth"]`.
|
||||
1. Expiration/certificate lifetime - for the kube-controller-manager implementation of this signer, set to the minimum
|
||||
of the `--cluster-signing-duration` option or, if specified, the `spec.expirationSeconds` field of the CSR object.
|
||||
1. CA bit allowed/disallowed - not allowed.
|
||||
|
||||
1. `kubernetes.io/kubelet-serving`: signs serving certificates that are honored as a valid kubelet serving certificate
|
||||
by the API server, but has no other guarantees.
|
||||
Never auto-approved by {{< glossary_tooltip term_id="kube-controller-manager" >}}.
|
||||
1. Trust distribution: signed certificates must be honored by the API server as valid to terminate connections to a kubelet.
|
||||
The CA bundle is not distributed by any other means.
|
||||
1. Permitted subjects - organizations are exactly `["system:nodes"]`, common name starts with "`system:node:`".
|
||||
1. Permitted x509 extensions - honors key usage and DNSName/IPAddress subjectAltName extensions, forbids EmailAddress and
|
||||
URI subjectAltName extensions, drops other extensions. At least one DNS or IP subjectAltName must be present.
|
||||
1. Permitted key usages - exactly `["key encipherment", "digital signature", "server auth"]`.
|
||||
1. Expiration/certificate lifetime - for the kube-controller-manager implementation of this signer, set to the minimum
|
||||
of the `--cluster-signing-duration` option or, if specified, the `spec.expirationSeconds` field of the CSR object.
|
||||
1. CA bit allowed/disallowed - not allowed.
|
||||
1. Trust distribution: signed certificates must be honored by the API server as valid to terminate connections to a kubelet.
|
||||
The CA bundle is not distributed by any other means.
|
||||
1. Permitted subjects - organizations are exactly `["system:nodes"]`, common name starts with "`system:node:`".
|
||||
1. Permitted x509 extensions - honors key usage and DNSName/IPAddress subjectAltName extensions, forbids EmailAddress and
|
||||
URI subjectAltName extensions, drops other extensions. At least one DNS or IP subjectAltName must be present.
|
||||
1. Permitted key usages - exactly `["key encipherment", "digital signature", "server auth"]`.
|
||||
1. Expiration/certificate lifetime - for the kube-controller-manager implementation of this signer, set to the minimum
|
||||
of the `--cluster-signing-duration` option or, if specified, the `spec.expirationSeconds` field of the CSR object.
|
||||
1. CA bit allowed/disallowed - not allowed.
|
||||
|
||||
1. `kubernetes.io/legacy-unknown`: has no guarantees for trust at all. Some third-party distributions of Kubernetes
|
||||
1. `kubernetes.io/legacy-unknown`: has no guarantees for trust at all. Some third-party distributions of Kubernetes
|
||||
may honor client certificates signed by it. The stable CertificateSigningRequest API (version `certificates.k8s.io/v1` and later)
|
||||
does not allow to set the `signerName` as `kubernetes.io/legacy-unknown`.
|
||||
Never auto-approved by {{< glossary_tooltip term_id="kube-controller-manager" >}}.
|
||||
1. Trust distribution: None. There is no standard trust or distribution for this signer in a Kubernetes cluster.
|
||||
1. Permitted subjects - any
|
||||
1. Permitted x509 extensions - honors subjectAltName and key usage extensions and discards other extensions.
|
||||
1. Permitted key usages - any
|
||||
1. Expiration/certificate lifetime - for the kube-controller-manager implementation of this signer, set to the minimum
|
||||
of the `--cluster-signing-duration` option or, if specified, the `spec.expirationSeconds` field of the CSR object.
|
||||
1. CA bit allowed/disallowed - not allowed.
|
||||
1. Trust distribution: None. There is no standard trust or distribution for this signer in a Kubernetes cluster.
|
||||
1. Permitted subjects - any
|
||||
1. Permitted x509 extensions - honors subjectAltName and key usage extensions and discards other extensions.
|
||||
1. Permitted key usages - any
|
||||
1. Expiration/certificate lifetime - for the kube-controller-manager implementation of this signer, set to the minimum
|
||||
of the `--cluster-signing-duration` option or, if specified, the `spec.expirationSeconds` field of the CSR object.
|
||||
1. CA bit allowed/disallowed - not allowed.
|
||||
|
||||
{{< note >}}
|
||||
Failures for all of these are only reported in kube-controller-manager logs.
|
||||
|
@ -238,7 +238,11 @@ Some points to note:
|
|||
- `usages` has to be '`client auth`'
|
||||
- `expirationSeconds` could be made longer (i.e. `864000` for ten days) or shorter (i.e. `3600` for one hour)
|
||||
- `request` is the base64 encoded value of the CSR file content.
|
||||
You can get the content using this command: ```cat myuser.csr | base64 | tr -d "\n"```
|
||||
You can get the content using this command:
|
||||
|
||||
```shell
|
||||
cat myuser.csr | base64 | tr -d "\n"
|
||||
```
|
||||
|
||||
### Approve certificate signing request
|
||||
|
||||
|
|
|
@ -591,7 +591,7 @@ is not considered to match.
|
|||
Use the object selector only if the webhook is opt-in, because end users may skip
|
||||
the admission webhook by setting the labels.
|
||||
|
||||
This example shows a mutating webhook that would match a `CREATE` of any resource with the label `foo: bar`:
|
||||
This example shows a mutating webhook that would match a `CREATE` of any resource (but not subresources) with the label `foo: bar`:
|
||||
|
||||
```yaml
|
||||
apiVersion: admissionregistration.k8s.io/v1
|
||||
|
|
|
@ -193,9 +193,10 @@ spec:
|
|||
matchResources:
|
||||
namespaceSelector:
|
||||
matchExpressions:
|
||||
- key: environment,
|
||||
operator: NotIn,
|
||||
values: ["test"]
|
||||
- key: environment
|
||||
operator: NotIn
|
||||
values:
|
||||
- test
|
||||
```
|
||||
|
||||
And have a parameter resource like:
|
||||
|
@ -222,7 +223,7 @@ spec:
|
|||
matchResources:
|
||||
namespaceSelector:
|
||||
matchExpressions:
|
||||
- key: environment,
|
||||
- key: environment
|
||||
operator: Exists
|
||||
```
|
||||
|
||||
|
|
|
@ -383,7 +383,7 @@ Each feature gate is designed for enabling/disabling a specific feature:
|
|||
to see the requesting subject's authentication information.
|
||||
See [API access to authentication information for a client](/docs/reference/access-authn-authz/authentication/#self-subject-review)
|
||||
for more details.
|
||||
- `APIServerIdentity`: Assign each API server an ID in a cluster.
|
||||
- `APIServerIdentity`: Assign each API server an ID in a cluster, using a [Lease](/docs/concepts/architecture/leases).
|
||||
- `APIServerTracing`: Add support for distributed tracing in the API server.
|
||||
See [Traces for Kubernetes System Components](/docs/concepts/cluster-administration/system-traces) for more details.
|
||||
- `AdvancedAuditing`: Enable [advanced auditing](/docs/tasks/debug/debug-cluster/audit/#advanced-audit)
|
||||
|
@ -697,15 +697,12 @@ Each feature gate is designed for enabling/disabling a specific feature:
|
|||
- `RotateKubeletServerCertificate`: Enable the rotation of the server TLS certificate on the kubelet.
|
||||
See [kubelet configuration](/docs/reference/access-authn-authz/kubelet-tls-bootstrapping/#kubelet-configuration)
|
||||
for more details.
|
||||
- `SELinuxMountReadWriteOncePod`: Speed up container startup by mounting volumes with the correct
|
||||
SELinux label instead of changing each file on the volumes recursively. The initial implementation
|
||||
focused on ReadWriteOncePod volumes.
|
||||
- `SELinuxMountReadWriteOncePod`: Speeds up container startup by allowing kubelet to mount volumes
|
||||
for a Pod directly with the correct SELinux label instead of changing each file on the volumes
|
||||
recursively. The initial implementation focused on ReadWriteOncePod volumes.
|
||||
- `SeccompDefault`: Enables the use of `RuntimeDefault` as the default seccomp profile
|
||||
for all workloads.
|
||||
The seccomp profile is specified in the `securityContext` of a Pod and/or a Container.
|
||||
- `SELinuxMountReadWriteOncePod`: Allows kubelet to mount volumes for a Pod directly with the
|
||||
right SELinux label instead of applying the SELinux label recursively on every file on the
|
||||
volume.
|
||||
- `ServerSideApply`: Enables the [Server Side Apply (SSA)](/docs/reference/using-api/server-side-apply/)
|
||||
feature on the API Server.
|
||||
- `ServerSideFieldValidation`: Enables server-side field validation. This means the validation
|
||||
|
|
|
@ -38,13 +38,6 @@ kubelet [flags]
|
|||
</colgroup>
|
||||
<tbody>
|
||||
|
||||
<tr>
|
||||
<td colspan="2">--add-dir-header</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">If true, adds the file directory to the header of the log messages (DEPRECATED: will be removed in a future release, see <a href="https://github.com/kubernetes/enhancements/tree/master/keps/sig-instrumentation/2845-deprecate-klog-specific-flags-in-k8s-components">here</a>.)</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<td colspan="2">--address string Default: 0.0.0.0 </td>
|
||||
</tr>
|
||||
|
@ -59,13 +52,6 @@ kubelet [flags]
|
|||
<td></td><td style="line-height: 130%; word-wrap: break-word;">Comma-separated whitelist of unsafe sysctls or unsafe sysctl patterns (ending in <code>*</code>). Use these at your own risk. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's <code>--config</code> flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<td colspan="2">--alsologtostderr</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">Log to standard error as well as files (DEPRECATED: will be removed in a future release, see <a href="https://github.com/kubernetes/enhancements/tree/master/keps/sig-instrumentation/2845-deprecate-klog-specific-flags-in-k8s-components">here</a>.)</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<td colspan="2">--anonymous-auth Default: true</td>
|
||||
</tr>
|
||||
|
@ -91,7 +77,7 @@ kubelet [flags]
|
|||
<td colspan="2">--authorization-mode string Default: <code>AlwaysAllow</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">Authorization mode for Kubelet server. Valid options are AlwaysAllow or Webhook. Webhook mode uses the SubjectAccessReview API to determine authorization. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's --config flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)</td>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">Authorization mode for Kubelet server. Valid options are AlwaysAllow or Webhook. Webhook mode uses the SubjectAccessReview API to determine authorization. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's <code>--config</code> flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
|
@ -140,7 +126,7 @@ kubelet [flags]
|
|||
<td colspan="2">--cgroup-root string Default: <code>''</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">Optional root cgroup to use for pods. This is handled by the container runtime on a best effort basis. Default: '', which means use the container runtime default. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's --config flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)</td>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">Optional root cgroup to use for pods. This is handled by the container runtime on a best effort basis. Default: '', which means use the container runtime default. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's <code>--config</code> flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
|
@ -154,7 +140,7 @@ kubelet [flags]
|
|||
<td colspan="2">--client-ca-file string</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">If set, any request presenting a client certificate signed by one of the authorities in the client-ca-file is authenticated with an identity corresponding to the <code>CommonName</code> of the client certificate. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's --config flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)</td>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">If set, any request presenting a client certificate signed by one of the authorities in the client-ca-file is authenticated with an identity corresponding to the <code>CommonName</code> of the client certificate. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's <code>--config</code> flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
|
@ -196,7 +182,7 @@ kubelet [flags]
|
|||
<td colspan="2">--container-log-max-files int32 Default: 5</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;"><Warning: Beta feature> Set the maximum number of container log files that can be present for a container. The number must be >= 2. This flag can only be used with <code>--container-runtime=remote</code>. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's --config flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)</td>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;"><Warning: Beta feature> Set the maximum number of container log files that can be present for a container. The number must be >= 2. This flag can only be used with <code>--container-runtime=remote</code>. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's <code>--config</code> flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
|
@ -249,7 +235,7 @@ kubelet [flags]
|
|||
</tr>
|
||||
|
||||
<tr>
|
||||
<td colspan="2">--cpu-manager-policy-options mapStringString</td>
|
||||
<td colspan="2">--cpu-manager-policy-options string</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">A set of key=value CPU Manager policy options to use, to fine tune their behaviour. If not supplied, keep the default behaviour. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's <code>--config</code> flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)</td>
|
||||
|
@ -287,7 +273,7 @@ kubelet [flags]
|
|||
<td colspan="2">--enforce-node-allocatable strings Default: <code>pods</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">A comma separated list of levels of node allocatable enforcement to be enforced by kubelet. Acceptable options are <code>none</code>, <code>pods</code>, <code>system-reserved</code>, and <code>kube-reserved</code>. If the latter two options are specified, <code>--system-reserved-cgroup</code> and <code>--kube-reserved-cgroup</code> must also be set, respectively. If <code>none</code> is specified, no additional options should be set. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/reserve-compute-resources/">here</a> for more details. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's --config flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)</td>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">A comma separated list of levels of node allocatable enforcement to be enforced by kubelet. Acceptable options are <code>none</code>, <code>pods</code>, <code>system-reserved</code>, and <code>kube-reserved</code>. If the latter two options are specified, <code>--system-reserved-cgroup</code> and <code>--kube-reserved-cgroup</code> must also be set, respectively. If <code>none</code> is specified, no additional options should be set. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/reserve-compute-resources/">here</a> for more details. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's <code>--config</code> flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
|
@ -305,7 +291,7 @@ kubelet [flags]
|
|||
</tr>
|
||||
|
||||
<tr>
|
||||
<td colspan="2">--eviction-hard mapStringString Default: <code>imagefs.available<15%,memory.available<100Mi,nodefs.available<10%</code></td>
|
||||
<td colspan="2">--eviction-hard string Default: <code>imagefs.available<15%,memory.available<100Mi,nodefs.available<10%</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">A set of eviction thresholds (e.g. <code>memory.available<1Gi</code>) that if met would trigger a pod eviction. On a Linux node, the default value also includes <code>nodefs.inodesFree<5%</code>. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's <code>--config</code> flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)</td>
|
||||
|
@ -319,7 +305,7 @@ kubelet [flags]
|
|||
</tr>
|
||||
|
||||
<tr>
|
||||
<td colspan="2">--eviction-minimum-reclaim mapStringString</td>
|
||||
<td colspan="2">--eviction-minimum-reclaim string</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">A set of minimum reclaims (e.g. <code>imagefs.available=2Gi</code>) that describes the minimum amount of resource the kubelet will reclaim when performing a pod eviction if that resource is under pressure. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's <code>--config</code> flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)</td>
|
||||
|
@ -333,14 +319,14 @@ kubelet [flags]
|
|||
</tr>
|
||||
|
||||
<tr>
|
||||
<td colspan="2">--eviction-soft mapStringString</td>
|
||||
<td colspan="2">--eviction-soft string</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">A set of eviction thresholds (e.g. <code>memory.available<1.5Gi</code>) that if met over a corresponding grace period would trigger a pod eviction. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's <code>--config</code> flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<td colspan="2">--eviction-soft-grace-period mapStringString</td>
|
||||
<td colspan="2">--eviction-soft-grace-period string</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">A set of eviction grace periods (e.g. <code>memory.available=1m30s</code>) that correspond to how long a soft eviction threshold must hold before triggering a pod eviction. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's <code>--config</code> flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)</td>
|
||||
|
@ -360,13 +346,6 @@ kubelet [flags]
|
|||
<td></td><td style="line-height: 130%; word-wrap: break-word;">When set to <code>true</code>, hard eviction thresholds will be ignored while calculating node allocatable. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/reserve-compute-resources/">here</a> for more details. (DEPRECATED: will be removed in 1.24 or later)</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<td colspan="2">--experimental-kernel-memcg-notification</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">Use kernelMemcgNotification configuration, this flag will be removed in 1.24 or later. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's <code>--config</code> flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<td colspan="2">--experimental-mounter-path string Default: <code>mount</code></td>
|
||||
</tr>
|
||||
|
@ -389,39 +368,34 @@ kubelet [flags]
|
|||
APIListChunking=true|false (BETA - default=true)<br/>
|
||||
APIPriorityAndFairness=true|false (BETA - default=true)<br/>
|
||||
APIResponseCompression=true|false (BETA - default=true)<br/>
|
||||
APIServerIdentity=true|false (ALPHA - default=false)<br/>
|
||||
APISelfSubjectReview=true|false (ALPHA - default=false)<br/>
|
||||
APIServerIdentity=true|false (BETA - default=true)<br/>
|
||||
APIServerTracing=true|false (ALPHA - default=false)<br/>
|
||||
AggregatedDiscoveryEndpoint=true|false (ALPHA - default=false)<br/>
|
||||
AllAlpha=true|false (ALPHA - default=false)<br/>
|
||||
AllBeta=true|false (BETA - default=false)<br/>
|
||||
AnyVolumeDataSource=true|false (BETA - default=true)<br/>
|
||||
AppArmor=true|false (BETA - default=true)<br/>
|
||||
CPUManager=true|false (BETA - default=true)<br/>
|
||||
CPUManagerPolicyAlphaOptions=true|false (ALPHA - default=false)<br/>
|
||||
CPUManagerPolicyBetaOptions=true|false (BETA - default=true)<br/>
|
||||
CPUManagerPolicyOptions=true|false (BETA - default=true)<br/>
|
||||
CSIInlineVolume=true|false (BETA - default=true)<br/>
|
||||
CSIMigration=true|false (BETA - default=true)<br/>
|
||||
CSIMigrationAWS=true|false (BETA - default=true)<br/>
|
||||
CSIMigrationAzureFile=true|false (BETA - default=true)<br/>
|
||||
CSIMigrationGCE=true|false (BETA - default=true)<br/>
|
||||
CSIMigrationPortworx=true|false (ALPHA - default=false)<br/>
|
||||
CSIMigrationPortworx=true|false (BETA - default=false)<br/>
|
||||
CSIMigrationRBD=true|false (ALPHA - default=false)<br/>
|
||||
CSIMigrationvSphere=true|false (BETA - default=false)<br/>
|
||||
CSINodeExpandSecret=true|false (ALPHA - default=false)<br/>
|
||||
CSIVolumeHealth=true|false (ALPHA - default=false)<br/>
|
||||
ComponentSLIs=true|false (ALPHA - default=false)<br/>
|
||||
ContainerCheckpoint=true|false (ALPHA - default=false)<br/>
|
||||
ContextualLogging=true|false (ALPHA - default=false)<br/>
|
||||
CronJobTimeZone=true|false (ALPHA - default=false)<br/>
|
||||
CronJobTimeZone=true|false (BETA - default=true)<br/>
|
||||
CrossNamespaceVolumeDataSource=true|false (ALPHA - default=false)<br/>
|
||||
CustomCPUCFSQuotaPeriod=true|false (ALPHA - default=false)<br/>
|
||||
CustomResourceValidationExpressions=true|false (ALPHA - default=false)<br/>
|
||||
DaemonSetUpdateSurge=true|false (BETA - default=true)<br/>
|
||||
DelegateFSGroupToCSIDriver=true|false (BETA - default=true)<br/>
|
||||
DevicePlugins=true|false (BETA - default=true)<br/>
|
||||
DisableAcceleratorUsageMetrics=true|false (BETA - default=true)<br/>
|
||||
CustomResourceValidationExpressions=true|false (BETA - default=true)<br/>
|
||||
DisableCloudProviders=true|false (ALPHA - default=false)<br/>
|
||||
DisableKubeletCloudCredentialProviders=true|false (ALPHA - default=false)<br/>
|
||||
DownwardAPIHugePages=true|false (BETA - default=true)<br/>
|
||||
EndpointSliceTerminatingCondition=true|false (BETA - default=true)<br/>
|
||||
EphemeralContainers=true|false (BETA - default=true)<br/>
|
||||
ExpandedDNSConfig=true|false (ALPHA - default=false)<br/>
|
||||
DynamicResourceAllocation=true|false (ALPHA - default=false)<br/>
|
||||
EventedPLEG=true|false (ALPHA - default=false)<br/>
|
||||
ExpandedDNSConfig=true|false (BETA - default=true)<br/>
|
||||
ExperimentalHostUserNamespaceDefaulting=true|false (BETA - default=false)<br/>
|
||||
GRPCContainerProbe=true|false (BETA - default=true)<br/>
|
||||
GracefulNodeShutdown=true|false (BETA - default=true)<br/>
|
||||
|
@ -429,7 +403,7 @@ GracefulNodeShutdownBasedOnPodPriority=true|false (BETA - default=true)<br/>
|
|||
HPAContainerMetrics=true|false (ALPHA - default=false)<br/>
|
||||
HPAScaleToZero=true|false (ALPHA - default=false)<br/>
|
||||
HonorPVReclaimPolicy=true|false (ALPHA - default=false)<br/>
|
||||
IdentifyPodOS=true|false (BETA - default=true)<br/>
|
||||
IPTablesOwnershipCleanup=true|false (ALPHA - default=false)<br/>
|
||||
InTreePluginAWSUnregister=true|false (ALPHA - default=false)<br/>
|
||||
InTreePluginAzureDiskUnregister=true|false (ALPHA - default=false)<br/>
|
||||
InTreePluginAzureFileUnregister=true|false (ALPHA - default=false)<br/>
|
||||
|
@ -439,53 +413,65 @@ InTreePluginPortworxUnregister=true|false (ALPHA - default=false)<br/>
|
|||
InTreePluginRBDUnregister=true|false (ALPHA - default=false)<br/>
|
||||
InTreePluginvSphereUnregister=true|false (ALPHA - default=false)<br/>
|
||||
JobMutableNodeSchedulingDirectives=true|false (BETA - default=true)<br/>
|
||||
JobPodFailurePolicy=true|false (BETA - default=true)<br/>
|
||||
JobReadyPods=true|false (BETA - default=true)<br/>
|
||||
JobTrackingWithFinalizers=true|false (BETA - default=false)<br/>
|
||||
KubeletCredentialProviders=true|false (BETA - default=true)<br/>
|
||||
KMSv2=true|false (ALPHA - default=false)<br/>
|
||||
KubeletInUserNamespace=true|false (ALPHA - default=false)<br/>
|
||||
KubeletPodResources=true|false (BETA - default=true)<br/>
|
||||
KubeletPodResourcesGetAllocatable=true|false (BETA - default=true)<br/>
|
||||
LegacyServiceAccountTokenNoAutoGeneration=true|false (BETA - default=true)<br/>
|
||||
LocalStorageCapacityIsolation=true|false (BETA - default=true)<br/>
|
||||
KubeletTracing=true|false (ALPHA - default=false)<br/>
|
||||
LegacyServiceAccountTokenTracking=true|false (ALPHA - default=false)<br/>
|
||||
LocalStorageCapacityIsolationFSQuotaMonitoring=true|false (ALPHA - default=false)<br/>
|
||||
LogarithmicScaleDown=true|false (BETA - default=true)<br/>
|
||||
LoggingAlphaOptions=true|false (ALPHA - default=false)<br/>
|
||||
LoggingBetaOptions=true|false (BETA - default=true)<br/>
|
||||
MatchLabelKeysInPodTopologySpread=true|false (ALPHA - default=false)<br/>
|
||||
MaxUnavailableStatefulSet=true|false (ALPHA - default=false)<br/>
|
||||
MemoryManager=true|false (BETA - default=true)<br/>
|
||||
MemoryQoS=true|false (ALPHA - default=false)<br/>
|
||||
MinDomainsInPodTopologySpread=true|false (ALPHA - default=false)<br/>
|
||||
MixedProtocolLBService=true|false (BETA - default=true)<br/>
|
||||
NetworkPolicyEndPort=true|false (BETA - default=true)<br/>
|
||||
MinDomainsInPodTopologySpread=true|false (BETA - default=false)<br/>
|
||||
MinimizeIPTablesRestore=true|false (ALPHA - default=false)<br/>
|
||||
MultiCIDRRangeAllocator=true|false (ALPHA - default=false)<br/>
|
||||
NetworkPolicyStatus=true|false (ALPHA - default=false)<br/>
|
||||
NodeOutOfServiceVolumeDetach=true|false (ALPHA - default=false)<br/>
|
||||
NodeInclusionPolicyInPodTopologySpread=true|false (BETA - default=true)<br/>
|
||||
NodeOutOfServiceVolumeDetach=true|false (BETA - default=true)<br/>
|
||||
NodeSwap=true|false (ALPHA - default=false)<br/>
|
||||
OpenAPIEnums=true|false (BETA - default=true)<br/>
|
||||
OpenAPIV3=true|false (BETA - default=true)<br/>
|
||||
PDBUnhealthyPodEvictionPolicy=true|false (ALPHA - default=false)<br/>
|
||||
PodAndContainerStatsFromCRI=true|false (ALPHA - default=false)<br/>
|
||||
PodDeletionCost=true|false (BETA - default=true)<br/>
|
||||
PodSecurity=true|false (BETA - default=true)<br/>
|
||||
ProbeTerminationGracePeriod=true|false (BETA - default=false)<br/>
|
||||
PodDisruptionConditions=true|false (BETA - default=true)<br/>
|
||||
PodHasNetworkCondition=true|false (ALPHA - default=false)<br/>
|
||||
PodSchedulingReadiness=true|false (ALPHA - default=false)<br/>
|
||||
ProbeTerminationGracePeriod=true|false (BETA - default=true)<br/>
|
||||
ProcMountType=true|false (ALPHA - default=false)<br/>
|
||||
ProxyTerminatingEndpoints=true|false (ALPHA - default=false)<br/>
|
||||
ProxyTerminatingEndpoints=true|false (BETA - default=true)<br/>
|
||||
QOSReserved=true|false (ALPHA - default=false)<br/>
|
||||
ReadWriteOncePod=true|false (ALPHA - default=false)<br/>
|
||||
RecoverVolumeExpansionFailure=true|false (ALPHA - default=false)<br/>
|
||||
RemainingItemCount=true|false (BETA - default=true)<br/>
|
||||
RetroactiveDefaultStorageClass=true|false (BETA - default=true)<br/>
|
||||
RotateKubeletServerCertificate=true|false (BETA - default=true)<br/>
|
||||
SeccompDefault=true|false (ALPHA - default=false)<br/>
|
||||
ServerSideFieldValidation=true|false (ALPHA - default=false)<br/>
|
||||
ServiceIPStaticSubrange=true|false (ALPHA - default=false)<br/>
|
||||
ServiceInternalTrafficPolicy=true|false (BETA - default=true)<br/>
|
||||
SELinuxMountReadWriteOncePod=true|false (ALPHA - default=false)<br/>
|
||||
SeccompDefault=true|false (BETA - default=true)<br/>
|
||||
ServerSideFieldValidation=true|false (BETA - default=true)<br/>
|
||||
SizeMemoryBackedVolumes=true|false (BETA - default=true)<br/>
|
||||
StatefulSetAutoDeletePVC=true|false (ALPHA - default=false)<br/>
|
||||
StatefulSetMinReadySeconds=true|false (BETA - default=true)<br/>
|
||||
StatefulSetStartOrdinal=true|false (ALPHA - default=false)<br/>
|
||||
StorageVersionAPI=true|false (ALPHA - default=false)<br/>
|
||||
StorageVersionHash=true|false (BETA - default=true)<br/>
|
||||
TopologyAwareHints=true|false (BETA - default=true)<br/>
|
||||
TopologyManager=true|false (BETA - default=true)<br/>
|
||||
TopologyManagerPolicyAlphaOptions=true|false (ALPHA - default=false)<br/>
|
||||
TopologyManagerPolicyBetaOptions=true|false (BETA - default=false)<br/>
|
||||
TopologyManagerPolicyOptions=true|false (ALPHA - default=false)<br/>
|
||||
UserNamespacesStatelessPodsSupport=true|false (ALPHA - default=false)<br/>
|
||||
ValidatingAdmissionPolicy=true|false (ALPHA - default=false)<br/>
|
||||
VolumeCapacityPriority=true|false (ALPHA - default=false)<br/>
|
||||
WinDSR=true|false (ALPHA - default=false)<br/>
|
||||
WinOverlay=true|false (BETA - default=true)<br/>
|
||||
WindowsHostProcessContainers=true|false (BETA - default=true)<br/>
|
||||
WindowsHostNetwork=true|false (ALPHA - default=true)<br/>
|
||||
(DEPRECATED: This parameter should be set via the config file specified by the Kubelet's <code>--config</code> flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)</td>
|
||||
</tr>
|
||||
|
||||
|
@ -623,7 +609,7 @@ WindowsHostProcessContainers=true|false (BETA - default=true)<br/>
|
|||
</tr>
|
||||
|
||||
<tr>
|
||||
<td colspan="2">--kube-reserved mapStringString Default: <None></td>
|
||||
<td colspan="2">--kube-reserved string Default: <None></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">A set of <code><resource name>=<resource quantity></code> (e.g. <code>cpu=200m,memory=500Mi,ephemeral-storage=1Gi,pid='100'</code>) pairs that describe resources reserved for kubernetes system components. Currently <code>cpu</code>, <code>memory</code> and local <code>ephemeral-storage</code> for root file system are supported. See <a href="http://kubernetes.io/docs/user-guide/compute-resources">here</a> for more detail. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's <code>--config</code> flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)</td>
|
||||
|
@ -650,6 +636,13 @@ WindowsHostProcessContainers=true|false (BETA - default=true)<br/>
|
|||
<td></td><td style="line-height: 130%; word-wrap: break-word;">Optional absolute name of cgroups to create and run the Kubelet in. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's <code>--config</code> flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<td colspan="2">--local-storage-capacity-isolation> Default: <code>true</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">If true, local ephemeral storage isolation is enabled. Otherwise, the local storage isolation feature is disabled. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's <code>--config</code> flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<td colspan="2">--lock-file string</td>
|
||||
</tr>
|
||||
|
@ -657,34 +650,6 @@ WindowsHostProcessContainers=true|false (BETA - default=true)<br/>
|
|||
<td></td><td style="line-height: 130%; word-wrap: break-word;"><Warning: Alpha feature> The path to file for kubelet to use as a lock file.</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<td colspan="2">--log-backtrace-at <A string of format 'file:line'> Default: <code>":0"</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">When logging hits line <code><file>:<N></code>, emit a stack trace. (DEPRECATED: will be removed in a future release, see <a href="https://github.com/kubernetes/enhancements/tree/master/keps/sig-instrumentation/2845-deprecate-klog-specific-flags-in-k8s-components">here</a>.)</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<td colspan="2">--log-dir string</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">If non-empty, write log files in this directory. (DEPRECATED: will be removed in a future release, see <a href="https://github.com/kubernetes/enhancements/tree/master/keps/sig-instrumentation/2845-deprecate-klog-specific-flags-in-k8s-components">here</a>.)</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<td colspan="2">--log-file string</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">If non-empty, use this log file. (DEPRECATED: will be removed in a future release, see <a href="https://github.com/kubernetes/enhancements/tree/master/keps/sig-instrumentation/2845-deprecate-klog-specific-flags-in-k8s-components">here</a>.)</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<td colspan="2">--log-file-max-size uint Default: 1800</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">Defines the maximum size a log file can grow to. Unit is megabytes. If the value is 0, the maximum file size is unlimited. (DEPRECATED: will be removed in a future release, see <a href="https://github.com/kubernetes/enhancements/tree/master/keps/sig-instrumentation/2845-deprecate-klog-specific-flags-in-k8s-components">here</a>.)</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<td colspan="2">--log-flush-frequency duration Default: <code>5s</code></td>
|
||||
</tr>
|
||||
|
@ -696,28 +661,21 @@ WindowsHostProcessContainers=true|false (BETA - default=true)<br/>
|
|||
<td colspan="2">--log-json-info-buffer-size string Default: <code>'0'</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">[Experimental] In JSON format with split output streams, the info messages can be buffered for a while to increase performance. The default value of zero bytes disables buffering. The size can be specified as number of bytes (512), multiples of 1000 (1K), multiples of 1024 (2Ki), or powers of those (3M, 4G, 5Mi, 6Gi). (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's <code>--config</code> flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)</td>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">[Alpha] In JSON format with split output streams, the info messages can be buffered for a while to increase performance. The default value of zero bytes disables buffering. The size can be specified as number of bytes (512), multiples of 1000 (1K), multiples of 1024 (2Ki), or powers of those (3M, 4G, 5Mi, 6Gi). Enable the <code>LoggingAlphaOptions</code> feature gate to use this. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's <code>--config</code> flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<td colspan="2">--log-json-split-stream</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">[Experimental] In JSON format, write error messages to stderr and info messages to stdout. The default is to write a single stream to stdout. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's <code>--config</code> flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)</td>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">[Alpha] In JSON format, write error messages to stderr and info messages to stdout. The default is to write a single stream to stdout. Enable the <code>LoggingAlphaOptions</code> feature gate to use this. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's <code>--config</code> flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<td colspan="2">--logging-format string Default: <code>text</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">Sets the log format. Permitted formats: <code>text</code>, <code>json</code>.<br/>Non-default formats don't honor these flags: <code>--add-dir-header</code>, <code>--alsologtostderr</code>, <code>--log-backtrace-at</code>, <code>--log-dir</code>, <code>--log-file</code>, <code>--log-file-max-size</code>, <code>--logtostderr</code>, <code>--skip_headers</code>, <code>--skip_log_headers</code>, <code>--stderrthreshold</code>, <code>--log-flush-frequency</code>.<br/>Non-default choices are currently alpha and subject to change without warning. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's <code>--config</code> flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<td colspan="2">--logtostderr Default: <code>true</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">log to standard error instead of files. (DEPRECATED: will be removed in a future release, see <a href="https://github.com/kubernetes/enhancements/tree/master/keps/sig-instrumentation/2845-deprecate-klog-specific-flags-in-k8s-components">here</a>.)</td>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">Sets the log format. Permitted formats: <code>text</code>, <code>json</code> (gated by <code>LoggingBetaOptions</code>). (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's <code>--config</code> flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
|
@ -805,7 +763,7 @@ WindowsHostProcessContainers=true|false (BETA - default=true)<br/>
|
|||
</tr>
|
||||
|
||||
<tr>
|
||||
<td colspan="2">--node-labels mapStringString</td>
|
||||
<td colspan="2">--node-labels <key=value pairs></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;"><Warning: Alpha feature> Labels to add when registering the node in the cluster. Labels must be <code>key=value pairs</code> separated by <code>','</code>. Labels in the <code>'kubernetes.io'</code> namespace must begin with an allowed prefix (<code>'kubelet.kubernetes.io'</code>, <code>'node.kubernetes.io'</code>) or be in the specifically allowed set (<code>'beta.kubernetes.io/arch'</code>, <code>'beta.kubernetes.io/instance-type'</code>, <code>'beta.kubernetes.io/os'</code>, <code>'failure-domain.beta.kubernetes.io/region'</code>, <code>'failure-domain.beta.kubernetes.io/zone'</code>, <code>'kubernetes.io/arch'</code>, <code>'kubernetes.io/hostname'</code>, <code>'kubernetes.io/os'</code>, <code>'node.kubernetes.io/instance-type'</code>, <code>'topology.kubernetes.io/region'</code>, <code>'topology.kubernetes.io/zone'</code>)</td>
|
||||
|
@ -825,13 +783,6 @@ WindowsHostProcessContainers=true|false (BETA - default=true)<br/>
|
|||
<td></td><td style="line-height: 130%; word-wrap: break-word;">Specifies how often kubelet posts node status to master. Note: be cautious when changing the constant, it must work with <code>nodeMonitorGracePeriod</code> in Node controller. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's <code>--config</code> flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<td colspan="2">--one-output</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">If true, only write logs to their native severity level (vs also writing to each lower severity level). (DEPRECATED: will be removed in a future release, see <a href="https://github.com/kubernetes/enhancements/tree/master/keps/sig-instrumentation/2845-deprecate-klog-specific-flags-in-k8s-components">here</a>.)</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<td colspan="2">--oom-score-adj int32 Default: -999</td>
|
||||
</tr>
|
||||
|
@ -847,10 +798,10 @@ WindowsHostProcessContainers=true|false (BETA - default=true)<br/>
|
|||
</tr>
|
||||
|
||||
<tr>
|
||||
<td colspan="2">--pod-infra-container-image string Default: <code>registry.k8s.io/pause:3.6</code></td>
|
||||
<td colspan="2">--pod-infra-container-image string Default: <code>registry.k8s.io/pause:3.9</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">Specified image will not be pruned by the image garbage collector. When container-runtime is set to <code>docker</code>, all containers in each pod will use the network/IPC namespaces from this image. Other CRI implementations have their own configuration to set this image.</td>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">Specified image will not be pruned by the image garbage collector. CRI implementations have their own configuration to set this image. (DEPRECATED: will be removed in 1.27. Image garbage collector will get sandbox image information from CRI.)</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
|
@ -885,7 +836,7 @@ WindowsHostProcessContainers=true|false (BETA - default=true)<br/>
|
|||
<td colspan="2">--protect-kernel-defaults</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;"> Default kubelet behaviour for kernel tuning. If set, kubelet errors if any of kernel tunables is different than kubelet defaults. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's <code>--config</code> flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)</td>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">Default kubelet behaviour for kernel tuning. If set, kubelet errors if any of kernel tunables is different than kubelet defaults. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's <code>--config</code> flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
|
@ -896,7 +847,7 @@ WindowsHostProcessContainers=true|false (BETA - default=true)<br/>
|
|||
</tr>
|
||||
|
||||
<tr>
|
||||
<td colspan="2">--qos-reserved mapStringString</td>
|
||||
<td colspan="2">--qos-reserved string</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;"><Warning: Alpha feature> A set of <code><resource name>=<percentage></code> (e.g. <code>memory=50%</code>) pairs that describe how pod resource requests are reserved at the QoS level. Currently only <code>memory</code> is supported. Requires the <code>QOSReserved</code> feature gate to be enabled. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's <code>--config</code> flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)</td>
|
||||
|
@ -913,7 +864,7 @@ WindowsHostProcessContainers=true|false (BETA - default=true)<br/>
|
|||
<td colspan="2">--register-node Default: <code>true</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">Register the node with the API server. If <code>--kubeconfig</code> is not provided, this flag is irrelevant, as the Kubelet won't have an API server to register with. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's --config flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)</td>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">Register the node with the API server. If <code>--kubeconfig</code> is not provided, this flag is irrelevant, as the Kubelet won't have an API server to register with. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's <code>--config</code> flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
|
@ -924,10 +875,10 @@ WindowsHostProcessContainers=true|false (BETA - default=true)<br/>
|
|||
</tr>
|
||||
|
||||
<tr>
|
||||
<td colspan="2">--register-with-taints mapStringString</td>
|
||||
<td colspan="2">--register-with-taints string</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">Register the node with the given list of taints (comma separated <code><key>=<value>:<effect></code>). No-op if <code>--register-node</code> is <code>false</code>. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's --config flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)</td>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">Register the node with the given list of taints (comma separated <code><key>=<value>:<effect></code>). No-op if <code>--register-node</code> is <code>false</code>. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's <code>--config</code> flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
|
@ -976,21 +927,21 @@ WindowsHostProcessContainers=true|false (BETA - default=true)<br/>
|
|||
<td colspan="2">--rotate-certificates</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;"><Warning: Beta feature> Auto rotate the kubelet client certificates by requesting new certificates from the <code>kube-apiserver</code> when the certificate expiration approaches. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's <code>--config</code> flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)</td>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">Auto rotate the kubelet client certificates by requesting new certificates from the <code>kube-apiserver</code> when the certificate expiration approaches. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's <code>--config</code> flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<td colspan="2">--rotate-server-certificates</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">Auto-request and rotate the kubelet serving certificates by requesting new certificates from the <code>kube-apiserver</code> when the certificate expiration approaches. Requires the <code>RotateKubeletServerCertificate</code> feature gate to be enabled, and approval of the submitted <code>CertificateSigningRequest</code> objects. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's <code>--config</code> flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)</td>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;"><Warning: Beta feature> Auto-request and rotate the kubelet serving certificates by requesting new certificates from the <code>kube-apiserver</code> when the certificate expiration approaches. Requires the <code>RotateKubeletServerCertificate</code> feature gate to be enabled, and approval of the submitted <code>CertificateSigningRequest</code> objects. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's <code>--config</code> flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<td colspan="2">--runonce</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">If <code>true</code>, exit after spawning pods from local manifests or remote urls. Exclusive with <code>--enable-server</code> (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's --config flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)</td>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">If <code>true</code>, exit after spawning pods from local manifests or remote urls. Exclusive with <code>--enable-server</code> (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's <code>--config</code> flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
|
@ -1011,7 +962,7 @@ WindowsHostProcessContainers=true|false (BETA - default=true)<br/>
|
|||
<td colspan="2">--seccomp-default string</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;"><Warning: Alpha feature> Enable the use of <code>RuntimeDefault</code> as the default seccomp profile for all workloads. The <code>SeccompDefault</code> feature gate must be enabled to allow this flag, which is disabled by default.</td>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;"><Warning: Beta feature> Enable the use of <code>RuntimeDefault</code> as the default seccomp profile for all workloads. The <code>SeccompDefault</code> feature gate must be enabled to allow this flag.</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
|
@ -1021,27 +972,6 @@ WindowsHostProcessContainers=true|false (BETA - default=true)<br/>
|
|||
<td></td><td style="line-height: 130%; word-wrap: break-word;">Pull images one at a time. We recommend *not* changing the default value on nodes that run docker daemon with version < 1.9 or an <code>aufs</code> storage backend. Issue #10959 has more details. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's <code>--config</code> flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<td colspan="2">--skip-headers</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">If <code>true</code>, avoid header prefixes in the log messages. (DEPRECATED: will be removed in a future release, see <a href="https://github.com/kubernetes/enhancements/tree/master/keps/sig-instrumentation/2845-deprecate-klog-specific-flags-in-k8s-components">here</a>.)</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<td colspan="2">--skip-log-headers</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">If <code>true</code>, avoid headers when opening log files. (DEPRECATED: will be removed in a future release, see <a href="https://github.com/kubernetes/enhancements/tree/master/keps/sig-instrumentation/2845-deprecate-klog-specific-flags-in-k8s-components">here</a>.)</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<td colspan="2">--stderrthreshold int Default: 2</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">logs at or above this threshold go to stderr. (DEPRECATED: will be removed in a future release, see <a href="https://github.com/kubernetes/enhancements/tree/master/keps/sig-instrumentation/2845-deprecate-klog-specific-flags-in-k8s-components">here</a>.)</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<td colspan="2">--streaming-connection-idle-timeout duration Default: <code>4h0m0s</code></td>
|
||||
</tr>
|
||||
|
@ -1064,7 +994,7 @@ WindowsHostProcessContainers=true|false (BETA - default=true)<br/>
|
|||
</tr>
|
||||
|
||||
<tr>
|
||||
<td colspan="2">--system-reserved mapStringString Default: <none></td>
|
||||
<td colspan="2">--system-reserved string Default: <none></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">A set of <code><resource name>=<resource quantity></code> (e.g. <code>cpu=200m,memory=500Mi,ephemeral-storage=1Gi,pid='100'</code>) pairs that describe resources reserved for non-kubernetes components. Currently only <code>cpu</code> and <code>memory</code> are supported. See <a href="http://kubernetes.io/docs/user-guide/compute-resources">here</a> for more detail. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's <code>--config</code> flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)</td>
|
||||
|
@ -1085,15 +1015,15 @@ WindowsHostProcessContainers=true|false (BETA - default=true)<br/>
|
|||
</tr>
|
||||
|
||||
<tr>
|
||||
<td colspan="2">--tls-cipher-suites strings</td>
|
||||
<td colspan="2">--tls-cipher-suites string</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">Comma-separated list of cipher suites for the server. If omitted, the default Go cipher suites will be used.<br/>
|
||||
Preferred values:
|
||||
`TLS_AES_128_GCM_SHA256`, `TLS_AES_256_GCM_SHA384`, `TLS_CHACHA20_POLY1305_SHA256`, `TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA`, `TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256`, `TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA`, `TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384`, `TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305`, `TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256`, `TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA`, `TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256`, `TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA`, `TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384`, `TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305`, `TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256`, `TLS_RSA_WITH_AES_128_CBC_SHA`, `TLS_RSA_WITH_AES_128_GCM_SHA256`, `TLS_RSA_WITH_AES_256_CBC_SHA`, `TLS_RSA_WITH_AES_256_GCM_SHA384`<br/>
|
||||
<code>TLS_AES_128_GCM_SHA256</code>, <code>TLS_AES_256_GCM_SHA384</code>, <code>TLS_CHACHA20_POLY1305_SHA256</code>, <code>TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA</code>, <code>TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256</code>, <code>TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA</code>, <code>TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384</code>, <code>TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305</code>, <code>TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305_SHA256</code>, <code>TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA</code>, <code>TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256</code>, <code>TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA</code>, <code>TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384</code>, <code>TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305</code>, <code>TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256</code>, <code>TLS_RSA_WITH_AES_128_CBC_SHA</code>, <code>TLS_RSA_WITH_AES_128_GCM_SHA256</code>, <code>TLS_RSA_WITH_AES_256_CBC_SHA</code>, <code>TLS_RSA_WITH_AES_256_GCM_SHA384</code><br/>
|
||||
Insecure values:
|
||||
`TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256`, `TLS_ECDHE_ECDSA_WITH_RC4_128_SHA`, `TLS_ECDHE_RSA_WITH_3DES_EDE_CBC_SHA`, `TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256`, `TLS_ECDHE_RSA_WITH_RC4_128_SHA`, `TLS_RSA_WITH_3DES_EDE_CBC_SHA`, `TLS_RSA_WITH_AES_128_CBC_SHA256`, `TLS_RSA_WITH_RC4_128_SHA`.<br/>
|
||||
(DEPRECATED: This parameter should be set via the config file specified by the Kubelet's `--config` flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)
|
||||
<code>TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256</code>, <code>TLS_ECDHE_ECDSA_WITH_RC4_128_SHA</code>, <code>TLS_ECDHE_RSA_WITH_3DES_EDE_CBC_SHA</code>, <code>TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256</code>, <code>TLS_ECDHE_RSA_WITH_RC4_128_SHA</code>, <code>TLS_RSA_WITH_3DES_EDE_CBC_SHA</code>, <code>TLS_RSA_WITH_AES_128_CBC_SHA256</code>, <code>TLS_RSA_WITH_RC4_128_SHA</code>.<br/>
|
||||
(DEPRECATED: This parameter should be set via the config file specified by the Kubelet's <code>--config</code> flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
|
@ -1118,11 +1048,18 @@ Insecure values:
|
|||
<td></td><td style="line-height: 130%; word-wrap: break-word;">Topology Manager policy to use. Possible values: <code>'none'</code>, <code>'best-effort'</code>, <code>'restricted'</code>, <code>'single-numa-node'</code>. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's <code>--config</code> flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<td colspan="2">--topology-manager-policy-options string</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">A set of key=value Topology Manager policy options to use, to fine tune their behaviour. If not supplied, keep the default behaviour. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's <code>--config</code> flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
<td colspan="2">--topology-manager-scope string Default: <code>container</code></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">Scope to which topology hints applied. Topology Manager collects hints from Hint Providers and applies them to defined scope to ensure the pod admission. Possible values: <code>'container'</code>, <code>'pod'</code>. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's --config flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)</td>
|
||||
<td></td><td style="line-height: 130%; word-wrap: break-word;">Scope to which topology hints are applied. Topology Manager collects hints from Hint Providers and applies them to the defined scope to ensure the pod admission. Possible values: <code>'container'</code>, <code>'pod'</code>. (DEPRECATED: This parameter should be set via the config file specified by the Kubelet's <code>--config</code> flag. See <a href="https://kubernetes.io/docs/tasks/administer-cluster/kubelet-config-file/">kubelet-config-file</a> for more information.)</td>
|
||||
</tr>
|
||||
|
||||
<tr>
|
||||
|
|
|
@ -0,0 +1,23 @@
|
|||
---
|
||||
title: Feature gate
|
||||
id: feature-gate
|
||||
date: 2023-01-12
|
||||
full_link: /docs/reference/command-line-tools-reference/feature-gates/
|
||||
short_description: >
|
||||
A way to control whether or not a particular Kubernetes feature is enabled.
|
||||
|
||||
aka:
|
||||
tags:
|
||||
- fundamental
|
||||
- operation
|
||||
---
|
||||
|
||||
Feature gates are a set of keys (opaque string values) that you can use to control which
|
||||
Kubernetes features are enabled in your cluster.
|
||||
|
||||
<!--more-->
|
||||
|
||||
You can turn these features on or off using the `--feature-gates` command line flag on each Kubernetes component.
|
||||
Each Kubernetes component lets you enable or disable a set of feature gates that are relevant to that component.
|
||||
The Kubernetes documentation lists all current
|
||||
[feature gates](/docs/reference/command-line-tools-reference/feature-gates/) and what they control.
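For illustration only (the gate names here are just examples drawn from a component's `--feature-gates` help text, and the rest of the command line is elided), enabling one gate and disabling another looks like this:

```shell
# Example gates only; consult the feature gate reference for the ones you need.
kubelet --feature-gates=GracefulNodeShutdown=true,ContextualLogging=false ...
```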
|
|
@ -0,0 +1,20 @@
|
|||
---
|
||||
title: JSON Web Token (JWT)
|
||||
id: jwt
|
||||
date: 2023-01-17
|
||||
full_link: https://www.rfc-editor.org/rfc/rfc7519
|
||||
short_description: >
|
||||
A means of representing claims to be transferred between two parties.
|
||||
|
||||
aka:
|
||||
tags:
|
||||
- security
|
||||
- architecture
|
||||
---
|
||||
A means of representing claims to be transferred between two parties.
|
||||
|
||||
<!--more-->
|
||||
|
||||
JWTs can be digitally signed and encrypted. Kubernetes uses JWTs as
|
||||
authentication tokens to verify the identity of entities that want to perform
|
||||
actions in a cluster.
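As a minimal sketch (assuming kubectl v1.24 or newer and an existing ServiceAccount named `build-bot`, a hypothetical name), you can ask the API server to issue such a token:

```shell
# Prints a short-lived, signed JWT bound to the "build-bot" ServiceAccount.
kubectl create token build-bot
```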
|
|
@ -2,7 +2,7 @@
|
|||
title: QoS Class
|
||||
id: qos-class
|
||||
date: 2019-04-15
|
||||
full_link:
|
||||
full_link: /docs/concepts/workloads/pods/pod-qos/
|
||||
short_description: >
|
||||
QoS Class (Quality of Service Class) provides a way for Kubernetes to classify pods within the cluster into several classes and make decisions about scheduling and eviction.
|
||||
|
||||
|
|
|
@ -171,6 +171,16 @@ There are two possible values:
|
|||
- `onstart`: The APIService should be reconciled when an API server starts up, but not otherwise.
|
||||
- `true`: The API server should reconcile this APIService continuously.
|
||||
|
||||
### service.alpha.kubernetes.io/tolerate-unready-endpoints (deprecated)
|
||||
|
||||
Used on: StatefulSet
|
||||
|
||||
This annotation on a Service indicates whether the Endpoints controller should create Endpoints for unready Pods.
|
||||
Endpoints of these Services retain their DNS records and continue receiving
|
||||
traffic for the Service from the moment the kubelet starts all containers in the pod
|
||||
and marks it _Running_, until the kubelet stops all containers and deletes the pod from
|
||||
the API server.
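For illustration only (the Service name is hypothetical, and this annotation is deprecated), the annotation can be applied like this:

```shell
# Ask the Endpoints controller to also publish endpoints for Pods that are not yet ready.
kubectl annotate service my-headless-service \
  service.alpha.kubernetes.io/tolerate-unready-endpoints="true"
```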
|
||||
|
||||
### kubernetes.io/hostname {#kubernetesiohostname}
|
||||
|
||||
Example: `kubernetes.io/hostname: "ip-172-20-114-199.ec2.internal"`
|
||||
|
@ -310,6 +320,50 @@ See [topology.kubernetes.io/zone](#topologykubernetesiozone).
|
|||
|
||||
{{< note >}} Starting in v1.17, this label is deprecated in favor of [topology.kubernetes.io/zone](#topologykubernetesiozone). {{< /note >}}
|
||||
|
||||
### pv.kubernetes.io/bind-completed {#pv-kubernetesiobind-completed}
|
||||
|
||||
Example: `pv.kubernetes.io/bind-completed: "yes"`
|
||||
|
||||
Used on: PersistentVolumeClaim
|
||||
|
||||
When this annotation is set on a PersistentVolumeClaim (PVC), it indicates that the lifecycle
|
||||
of the PVC has passed through initial binding setup. When present, that information changes
|
||||
how the control plane interprets the state of PVC objects.
|
||||
The value of this annotation does not matter to Kubernetes.
|
||||
|
||||
### pv.kubernetes.io/bound-by-controller {#pv-kubernetesioboundby-controller}
|
||||
|
||||
Example: `pv.kubernetes.io/bound-by-controller: "yes"`
|
||||
|
||||
Used on: PersistentVolume, PersistentVolumeClaim
|
||||
|
||||
If this annotation is set on a PersistentVolume or PersistentVolumeClaim, it indicates that a storage binding
|
||||
(PersistentVolume → PersistentVolumeClaim, or PersistentVolumeClaim → PersistentVolume) was installed
|
||||
by the {{< glossary_tooltip text="controller" term_id="controller" >}}.
|
||||
If the annotation isn't set, and there is a storage binding in place, the absence of that annotation means that
|
||||
the binding was done manually. The value of this annotation does not matter.
|
||||
|
||||
### pv.kubernetes.io/provisioned-by {#pv-kubernetesiodynamically-provisioned}
|
||||
|
||||
Example: `pv.kubernetes.io/provisioned-by: "kubernetes.io/rbd"`
|
||||
|
||||
Used on: PersistentVolume
|
||||
|
||||
This annotation is added to a PersistentVolume(PV) that has been dynamically provisioned by Kubernetes.
|
||||
Its value is the name of the volume plugin that created the volume. It serves both the user (to show where a PV
|
||||
comes from) and Kubernetes (to recognize dynamically provisioned PVs in its decisions).
|
||||
|
||||
### pv.kubernetes.io/migrated-to {#pv-kubernetesio-migratedto}
|
||||
|
||||
Example: `pv.kubernetes.io/migrated-to: pd.csi.storage.gke.io`
|
||||
|
||||
Used on: PersistentVolume, PersistentVolumeClaim
|
||||
|
||||
It is added to a PersistentVolume(PV) and PersistentVolumeClaim(PVC) that is supposed to be
|
||||
dynamically provisioned/deleted by its corresponding CSI driver through the `CSIMigration` feature gate.
|
||||
When this annotation is set, the Kubernetes components will "stand down" and the `external-provisioner`
|
||||
will act on the objects.
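To check which of these `pv.kubernetes.io/...` annotations the control plane has set on your own objects, you can read the metadata directly (the object names below are hypothetical):

```shell
# Print the annotations of a PersistentVolume and a PersistentVolumeClaim.
kubectl get pv example-pv -o jsonpath='{.metadata.annotations}{"\n"}'
kubectl get pvc example-claim -o jsonpath='{.metadata.annotations}{"\n"}'
```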
|
||||
|
||||
### statefulset.kubernetes.io/pod-name {#statefulsetkubernetesiopod-name}
|
||||
|
||||
Example:
|
||||
|
@ -393,6 +447,12 @@ Used on: PersistentVolumeClaim
|
|||
|
||||
This annotation is added to a PVC that requires dynamic provisioning.
|
||||
|
||||
### volume.kubernetes.io/selected-node
|
||||
|
||||
Used on: PersistentVolumeClaim
|
||||
|
||||
This annotation is added to a PVC when the scheduler triggers dynamic provisioning for it. Its value is the name of the selected node.
|
||||
|
||||
### volumes.kubernetes.io/controller-managed-attach-detach
|
||||
|
||||
Used on: Node
|
||||
|
@ -784,9 +844,9 @@ you through the steps you follow to apply a seccomp profile to a Pod or to one o
|
|||
its containers. That tutorial covers the supported mechanism for configuring seccomp in Kubernetes,
|
||||
based on setting `securityContext` within the Pod's `.spec`.
|
||||
|
||||
### snapshot.storage.kubernetes.io/allowVolumeModeChange
|
||||
### snapshot.storage.kubernetes.io/allow-volume-mode-change
|
||||
|
||||
Example: `snapshot.storage.kubernetes.io/allowVolumeModeChange: "true"`
|
||||
Example: `snapshot.storage.kubernetes.io/allow-volume-mode-change: "true"`
|
||||
|
||||
Used on: VolumeSnapshotContent
|
||||
|
||||
|
|
|
@ -6,7 +6,8 @@ weight: 50
|
|||
|
||||
<!-- overview -->
|
||||
Every {{< glossary_tooltip term_id="node" text="node" >}} in a Kubernetes
|
||||
cluster runs a [kube-proxy](/docs/reference/command-line-tools-reference/kube-proxy/)
|
||||
{{< glossary_tooltip term_id="cluster" text="cluster" >}} runs a
|
||||
[kube-proxy](/docs/reference/command-line-tools-reference/kube-proxy/)
|
||||
(unless you have deployed your own alternative component in place of `kube-proxy`).
|
||||
|
||||
The `kube-proxy` component is responsible for implementing a _virtual IP_
|
||||
|
@ -39,8 +40,10 @@ network proxying service on a computer. Although the `kube-proxy` executable su
|
|||
to use as-is.
|
||||
|
||||
<a id="example"></a>
|
||||
Some of the details in this reference refer to an example: the back end Pods for a stateless
|
||||
image-processing workload, running with three replicas. Those replicas are
|
||||
Some of the details in this reference refer to an example: the backend
|
||||
{{< glossary_tooltip term_id="pod" text="Pods" >}} for a stateless
|
||||
image-processing workload, running with
|
||||
three replicas. Those replicas are
|
||||
fungible—frontends do not care which backend they use. While the actual Pods that
|
||||
compose the backend set may change, the frontend clients should not need to be aware of that,
|
||||
nor should they need to keep track of the set of backends themselves.
|
||||
|
@ -61,8 +64,10 @@ Note that the kube-proxy starts up in different modes, which are determined by i
|
|||
|
||||
### `iptables` proxy mode {#proxy-mode-iptables}
|
||||
|
||||
In this mode, kube-proxy watches the Kubernetes control plane for the addition and
|
||||
removal of Service and EndpointSlice objects. For each Service, it installs
|
||||
In this mode, kube-proxy watches the Kubernetes
|
||||
{{< glossary_tooltip term_id="control-plane" text="control plane" >}} for the addition and
|
||||
removal of Service and EndpointSlice {{< glossary_tooltip term_id="object" text="objects." >}}
|
||||
For each Service, it installs
|
||||
iptables rules, which capture traffic to the Service's `clusterIP` and `port`,
|
||||
and redirect that traffic to one of the Service's
|
||||
backend sets. For each endpoint, it installs iptables rules which
|
||||
|
@ -84,7 +89,7 @@ to verify that backend Pods are working OK, so that kube-proxy in iptables mode
|
|||
only sees backends that test out as healthy. Doing this means you avoid
|
||||
having traffic sent via kube-proxy to a Pod that's known to have failed.
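If you want to see which backends are currently considered ready for a particular Service (the Service name below is hypothetical), the EndpointSlice objects show this directly:

```shell
# List the EndpointSlices that belong to the Service "my-service".
kubectl get endpointslices -l kubernetes.io/service-name=my-service -o wide
```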
|
||||
|
||||
{{< figure src="/images/docs/services-iptables-overview.svg" title="Services overview diagram for iptables proxy" class="diagram-medium" >}}
|
||||
{{< figure src="/images/docs/services-iptables-overview.svg" title="Virtual IP mechanism for Services, using iptables mode" class="diagram-medium" >}}
|
||||
|
||||
#### Example {#packet-processing-iptables}
|
||||
|
||||
|
@ -134,11 +139,13 @@ attempts to resynchronize iptables rules with the kernel. If it is
|
|||
every time any Service or Endpoint changes. This works fine in very
|
||||
small clusters, but it results in a lot of redundant work when lots of
|
||||
things change in a small time period. For example, if you have a
|
||||
Service backed by a Deployment with 100 pods, and you delete the
|
||||
Service backed by a {{< glossary_tooltip term_id="deployment" text="Deployment" >}}
|
||||
with 100 pods, and you delete the
|
||||
Deployment, then with `minSyncPeriod: 0s`, kube-proxy would end up
|
||||
removing the Service's Endpoints from the iptables rules one by one,
|
||||
for a total of 100 updates. With a larger `minSyncPeriod`, multiple
|
||||
Pod deletion events would get aggregated together, so kube-proxy might
|
||||
Pod deletion events would get aggregated
|
||||
together, so kube-proxy might
|
||||
instead end up making, say, 5 updates, each removing 20 endpoints,
|
||||
which will be much more efficient in terms of CPU, and result in the
|
||||
full set of changes being synchronized faster.
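As a sketch (assuming the iptables backend; check the kube-proxy reference for your release), `minSyncPeriod` can be set in the kube-proxy configuration file or via the equivalent flag:

```shell
# Run kube-proxy in iptables mode with a 1 second minimum resync interval.
kube-proxy --proxy-mode=iptables --iptables-min-sync-period=1s ...
```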
|
||||
|
@ -182,7 +189,8 @@ enable the `MinimizeIPTablesRestore` [feature
|
|||
gate](/docs/reference/command-line-tools-reference/feature-gates/) for
|
||||
kube-proxy with `--feature-gates=MinimizeIPTablesRestore=true,…`.
|
||||
|
||||
If you enable that feature gate and you were previously overriding
|
||||
If you enable that feature gate and
|
||||
you were previously overriding
|
||||
`minSyncPeriod`, you should try removing that override and letting
|
||||
kube-proxy use the default value (`1s`) or at least a smaller value
|
||||
than you were using before.
|
||||
|
@ -229,7 +237,7 @@ kernel modules are available. If the IPVS kernel modules are not detected, then
|
|||
falls back to running in iptables proxy mode.
|
||||
{{< /note >}}
|
||||
|
||||
{{< figure src="/images/docs/services-ipvs-overview.svg" title="Services overview diagram for IPVS proxy" class="diagram-medium" >}}
|
||||
{{< figure src="/images/docs/services-ipvs-overview.svg" title="Virtual IP address mechanism for Services, using IPVS mode" class="diagram-medium" >}}
|
||||
|
||||
## Session affinity
|
||||
|
||||
|
@ -274,7 +282,7 @@ someone else's choice. That is an isolation failure.
|
|||
In order to allow you to choose a port number for your Services, we must
|
||||
ensure that no two Services can collide. Kubernetes does that by allocating each
|
||||
Service its own IP address from within the `service-cluster-ip-range`
|
||||
CIDR range that is configured for the API server.
|
||||
CIDR range that is configured for the {{< glossary_tooltip term_id="kube-apiserver" text="API Server" >}}.
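For reference (a sketch only; the CIDR shown is an arbitrary example value), that range is configured on the API server like so:

```shell
# Define the CIDR from which Service cluster IPs are allocated.
kube-apiserver --service-cluster-ip-range=10.96.0.0/16 ...
```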
|
||||
|
||||
To ensure each Service receives a unique IP, an internal allocator atomically
|
||||
updates a global allocation map in {{< glossary_tooltip term_id="etcd" >}}
|
||||
|
@ -353,7 +361,8 @@ N to 0 replicas of that deployment. In some cases, external load balancers can s
|
|||
a node with 0 replicas in between health check probes. Routing traffic to terminating endpoints
|
||||
ensures that nodes that are scaling down Pods can gracefully receive and drain traffic to
|
||||
those terminating Pods. By the time the Pod completes termination, the external load balancer
|
||||
should have seen the node's health check failing and fully removed the node from the backend pool.
|
||||
should have seen the node's health check failing and fully removed the node from the backend
|
||||
pool.
|
||||
|
||||
## {{% heading "whatsnext" %}}
|
||||
|
||||
|
|
|
@ -77,7 +77,7 @@ their authors, not the Kubernetes team.
|
|||
| Ruby | [github.com/abonas/kubeclient](https://github.com/abonas/kubeclient) |
|
||||
| Ruby | [github.com/k8s-ruby/k8s-ruby](https://github.com/k8s-ruby/k8s-ruby) |
|
||||
| Ruby | [github.com/kontena/k8s-client](https://github.com/kontena/k8s-client) |
|
||||
| Rust | [github.com/clux/kube-rs](https://github.com/clux/kube-rs) |
|
||||
| Rust | [github.com/kube-rs/kube](https://github.com/kube-rs/kube) |
|
||||
| Rust | [github.com/ynqa/kubernetes-rust](https://github.com/ynqa/kubernetes-rust) |
|
||||
| Scala | [github.com/hagay3/skuber](https://github.com/hagay3/skuber) |
|
||||
| Scala | [github.com/hnaderi/scala-k8s](https://github.com/hnaderi/scala-k8s) |
|
||||
|
|
|
@ -366,12 +366,26 @@ There are two solutions:
|
|||
|
||||
First, the user defines a new configuration containing only the `replicas` field:
|
||||
|
||||
{{< codenew file="application/ssa/nginx-deployment-replicas-only.yaml" >}}
|
||||
```yaml
|
||||
# Save this file as 'nginx-deployment-replicas-only.yaml'.
|
||||
apiVersion: apps/v1
|
||||
kind: Deployment
|
||||
metadata:
|
||||
name: nginx-deployment
|
||||
spec:
|
||||
replicas: 3
|
||||
```
|
||||
|
||||
{{< note >}}
|
||||
The YAML file for SSA in this case only contains the fields you want to change.
|
||||
You are not supposed to provide a fully compliant Deployment manifest if you only
|
||||
want to modify the `spec.replicas` field using SSA.
|
||||
{{< /note >}}
|
||||
|
||||
The user applies that configuration using the field manager name `handover-to-hpa`:
|
||||
|
||||
```shell
|
||||
kubectl apply -f https://k8s.io/examples/application/ssa/nginx-deployment-replicas-only.yaml \
|
||||
kubectl apply -f nginx-deployment-replicas-only.yaml \
|
||||
--server-side --field-manager=handover-to-hpa \
|
||||
--validate=false
|
||||
```
|
||||
|
|
|
@ -9,13 +9,13 @@ weight: 10
|
|||
A cluster is a set of {{< glossary_tooltip text="nodes" term_id="node" >}} (physical
|
||||
or virtual machines) running Kubernetes agents, managed by the
|
||||
{{< glossary_tooltip text="control plane" term_id="control-plane" >}}.
|
||||
Kubernetes {{< param "version" >}} supports clusters with up to 5000 nodes. More specifically,
|
||||
Kubernetes {{< param "version" >}} supports clusters with up to 5,000 nodes. More specifically,
|
||||
Kubernetes is designed to accommodate configurations that meet *all* of the following criteria:
|
||||
|
||||
* No more than 110 pods per node
|
||||
* No more than 5000 nodes
|
||||
* No more than 150000 total pods
|
||||
* No more than 300000 total containers
|
||||
* No more than 5,000 nodes
|
||||
* No more than 150,000 total pods
|
||||
* No more than 300,000 total containers
|
||||
|
||||
You can scale your cluster by adding or removing nodes. The way you do this depends
|
||||
on how your cluster is deployed.
|
||||
|
|
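A quick way to see how close an existing cluster is to these limits:

```shell
kubectl get nodes --no-headers | wc -l                   # nodes (limit: 5,000)
kubectl get pods --all-namespaces --no-headers | wc -l   # total pods (limit: 150,000)
```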
|
@ -26,15 +26,15 @@ etcd cluster of three members that can be used by kubeadm during cluster creatio
|
|||
|
||||
## {{% heading "prerequisites" %}}
|
||||
|
||||
* Three hosts that can talk to each other over TCP ports 2379 and 2380. This
|
||||
- Three hosts that can talk to each other over TCP ports 2379 and 2380. This
|
||||
document assumes these default ports. However, they are configurable through
|
||||
the kubeadm config file.
|
||||
* Each host must have systemd and a bash compatible shell installed.
|
||||
* Each host must [have a container runtime, kubelet, and kubeadm installed](/docs/setup/production-environment/tools/kubeadm/install-kubeadm/).
|
||||
* Each host should have access to the Kubernetes container image registry (`registry.k8s.io`) or list/pull the required etcd image using
|
||||
`kubeadm config images list/pull`. This guide will set up etcd instances as
|
||||
[static pods](/docs/tasks/configure-pod-container/static-pod/) managed by a kubelet.
|
||||
* Some infrastructure to copy files between hosts. For example `ssh` and `scp`
|
||||
- Each host must have systemd and a bash compatible shell installed.
|
||||
- Each host must [have a container runtime, kubelet, and kubeadm installed](/docs/setup/production-environment/tools/kubeadm/install-kubeadm/).
|
||||
- Each host should have access to the Kubernetes container image registry (`registry.k8s.io`) or list/pull the required etcd image using
|
||||
`kubeadm config images list/pull`. This guide will set up etcd instances as
|
||||
[static pods](/docs/tasks/configure-pod-container/static-pod/) managed by a kubelet.
|
||||
- Some infrastructure to copy files between hosts. For example `ssh` and `scp`
|
||||
can satisfy this requirement.
|
||||
|
||||
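Before moving on, a short sketch like this can confirm the prerequisites. The peer address `10.0.0.7` and the use of `nc` are assumptions; substitute your own hosts and tooling.

```shell
# Reachability of the etcd client and peer ports on another host.
nc -zv 10.0.0.7 2379
nc -zv 10.0.0.7 2380

# The etcd image kubeadm expects, to verify registry access.
kubeadm config images list | grep etcd
```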
<!-- steps -->
|
||||
|
@ -42,7 +42,7 @@ etcd cluster of three members that can be used by kubeadm during cluster creatio
|
|||
## Setting up the cluster
|
||||
|
||||
The general approach is to generate all certs on one node and only distribute
|
||||
the *necessary* files to the other nodes.
|
||||
the _necessary_ files to the other nodes.
|
||||
|
||||
{{< note >}}
|
||||
kubeadm contains all the necessary cryptographic machinery to generate
|
||||
|
@ -59,242 +59,241 @@ on Kubernetes dual-stack support see [Dual-stack support with kubeadm](/docs/set
|
|||
1. Configure the kubelet to be a service manager for etcd.
|
||||
|
||||
{{< note >}}You must do this on every host where etcd should be running.{{< /note >}}
|
||||
Since etcd was created first, you must override the service priority by creating a new unit file
|
||||
that has higher precedence than the kubeadm-provided kubelet unit file.
|
||||
Since etcd was created first, you must override the service priority by creating a new unit file
|
||||
that has higher precedence than the kubeadm-provided kubelet unit file.
|
||||
|
||||
```sh
|
||||
cat << EOF > /etc/systemd/system/kubelet.service.d/20-etcd-service-manager.conf
|
||||
[Service]
|
||||
ExecStart=
|
||||
# Replace "systemd" with the cgroup driver of your container runtime. The default value in the kubelet is "cgroupfs".
|
||||
# Replace the value of "--container-runtime-endpoint" for a different container runtime if needed.
|
||||
ExecStart=/usr/bin/kubelet --address=127.0.0.1 --pod-manifest-path=/etc/kubernetes/manifests --cgroup-driver=systemd --container-runtime=remote --container-runtime-endpoint=unix:///var/run/containerd/containerd.sock
|
||||
Restart=always
|
||||
EOF
|
||||
```sh
|
||||
cat << EOF > /etc/systemd/system/kubelet.service.d/20-etcd-service-manager.conf
|
||||
[Service]
|
||||
ExecStart=
|
||||
# Replace "systemd" with the cgroup driver of your container runtime. The default value in the kubelet is "cgroupfs".
|
||||
# Replace the value of "--container-runtime-endpoint" for a different container runtime if needed.
|
||||
ExecStart=/usr/bin/kubelet --address=127.0.0.1 --pod-manifest-path=/etc/kubernetes/manifests --cgroup-driver=systemd --container-runtime=remote --container-runtime-endpoint=unix:///var/run/containerd/containerd.sock
|
||||
Restart=always
|
||||
EOF
|
||||
|
||||
systemctl daemon-reload
|
||||
systemctl restart kubelet
|
||||
```
|
||||
systemctl daemon-reload
|
||||
systemctl restart kubelet
|
||||
```
|
||||
|
||||
Check the kubelet status to ensure it is running.
|
||||
Check the kubelet status to ensure it is running.
|
||||
|
||||
```sh
|
||||
systemctl status kubelet
|
||||
```
|
||||
```sh
|
||||
systemctl status kubelet
|
||||
```
|
||||
|
||||
1. Create configuration files for kubeadm.
|
||||
|
||||
Generate one kubeadm configuration file for each host that will have an etcd
|
||||
member running on it using the following script.
|
||||
Generate one kubeadm configuration file for each host that will have an etcd
|
||||
member running on it using the following script.
|
||||
|
||||
```sh
|
||||
# Update HOST0, HOST1 and HOST2 with the IPs of your hosts
|
||||
export HOST0=10.0.0.6
|
||||
export HOST1=10.0.0.7
|
||||
export HOST2=10.0.0.8
|
||||
```sh
|
||||
# Update HOST0, HOST1 and HOST2 with the IPs of your hosts
|
||||
export HOST0=10.0.0.6
|
||||
export HOST1=10.0.0.7
|
||||
export HOST2=10.0.0.8
|
||||
|
||||
# Update NAME0, NAME1 and NAME2 with the hostnames of your hosts
|
||||
export NAME0="infra0"
|
||||
export NAME1="infra1"
|
||||
export NAME2="infra2"
|
||||
# Update NAME0, NAME1 and NAME2 with the hostnames of your hosts
|
||||
export NAME0="infra0"
|
||||
export NAME1="infra1"
|
||||
export NAME2="infra2"
|
||||
|
||||
# Create temp directories to store files that will end up on other hosts
|
||||
mkdir -p /tmp/${HOST0}/ /tmp/${HOST1}/ /tmp/${HOST2}/
|
||||
# Create temp directories to store files that will end up on other hosts
|
||||
mkdir -p /tmp/${HOST0}/ /tmp/${HOST1}/ /tmp/${HOST2}/
|
||||
|
||||
HOSTS=(${HOST0} ${HOST1} ${HOST2})
|
||||
NAMES=(${NAME0} ${NAME1} ${NAME2})
|
||||
HOSTS=(${HOST0} ${HOST1} ${HOST2})
|
||||
NAMES=(${NAME0} ${NAME1} ${NAME2})
|
||||
|
||||
for i in "${!HOSTS[@]}"; do
|
||||
HOST=${HOSTS[$i]}
|
||||
NAME=${NAMES[$i]}
|
||||
cat << EOF > /tmp/${HOST}/kubeadmcfg.yaml
|
||||
---
|
||||
apiVersion: "kubeadm.k8s.io/v1beta3"
|
||||
kind: InitConfiguration
|
||||
nodeRegistration:
|
||||
name: ${NAME}
|
||||
localAPIEndpoint:
|
||||
advertiseAddress: ${HOST}
|
||||
---
|
||||
apiVersion: "kubeadm.k8s.io/v1beta3"
|
||||
kind: ClusterConfiguration
|
||||
etcd:
|
||||
local:
|
||||
serverCertSANs:
|
||||
- "${HOST}"
|
||||
peerCertSANs:
|
||||
- "${HOST}"
|
||||
extraArgs:
|
||||
initial-cluster: ${NAMES[0]}=https://${HOSTS[0]}:2380,${NAMES[1]}=https://${HOSTS[1]}:2380,${NAMES[2]}=https://${HOSTS[2]}:2380
|
||||
initial-cluster-state: new
|
||||
name: ${NAME}
|
||||
listen-peer-urls: https://${HOST}:2380
|
||||
listen-client-urls: https://${HOST}:2379
|
||||
advertise-client-urls: https://${HOST}:2379
|
||||
initial-advertise-peer-urls: https://${HOST}:2380
|
||||
EOF
|
||||
done
|
||||
```
|
||||
for i in "${!HOSTS[@]}"; do
|
||||
HOST=${HOSTS[$i]}
|
||||
NAME=${NAMES[$i]}
|
||||
cat << EOF > /tmp/${HOST}/kubeadmcfg.yaml
|
||||
---
|
||||
apiVersion: "kubeadm.k8s.io/v1beta3"
|
||||
kind: InitConfiguration
|
||||
nodeRegistration:
|
||||
name: ${NAME}
|
||||
localAPIEndpoint:
|
||||
advertiseAddress: ${HOST}
|
||||
---
|
||||
apiVersion: "kubeadm.k8s.io/v1beta3"
|
||||
kind: ClusterConfiguration
|
||||
etcd:
|
||||
local:
|
||||
serverCertSANs:
|
||||
- "${HOST}"
|
||||
peerCertSANs:
|
||||
- "${HOST}"
|
||||
extraArgs:
|
||||
initial-cluster: ${NAMES[0]}=https://${HOSTS[0]}:2380,${NAMES[1]}=https://${HOSTS[1]}:2380,${NAMES[2]}=https://${HOSTS[2]}:2380
|
||||
initial-cluster-state: new
|
||||
name: ${NAME}
|
||||
listen-peer-urls: https://${HOST}:2380
|
||||
listen-client-urls: https://${HOST}:2379
|
||||
advertise-client-urls: https://${HOST}:2379
|
||||
initial-advertise-peer-urls: https://${HOST}:2380
|
||||
EOF
|
||||
done
|
||||
```
|
||||
|
||||
1. Generate the certificate authority.
|
||||
|
||||
If you already have a CA then the only action required is copying the CA's `crt` and
|
||||
`key` file to `/etc/kubernetes/pki/etcd/ca.crt` and
|
||||
`/etc/kubernetes/pki/etcd/ca.key`. After those files have been copied,
|
||||
proceed to the next step, "Create certificates for each member".
|
||||
If you already have a CA then the only action required is copying the CA's `crt` and
|
||||
`key` file to `/etc/kubernetes/pki/etcd/ca.crt` and
|
||||
`/etc/kubernetes/pki/etcd/ca.key`. After those files have been copied,
|
||||
proceed to the next step, "Create certificates for each member".
|
||||
|
||||
If you do not already have a CA then run this command on `$HOST0` (where you
|
||||
generated the configuration files for kubeadm).
|
||||
If you do not already have a CA then run this command on `$HOST0` (where you
|
||||
generated the configuration files for kubeadm).
|
||||
|
||||
```
|
||||
kubeadm init phase certs etcd-ca
|
||||
```
|
||||
```
|
||||
kubeadm init phase certs etcd-ca
|
||||
```
|
||||
|
||||
This creates two files:
|
||||
This creates two files:
|
||||
|
||||
- `/etc/kubernetes/pki/etcd/ca.crt`
|
||||
- `/etc/kubernetes/pki/etcd/ca.key`
|
||||
- `/etc/kubernetes/pki/etcd/ca.crt`
|
||||
- `/etc/kubernetes/pki/etcd/ca.key`
|
||||
|
||||
1. Create certificates for each member.
|
||||
|
||||
```sh
|
||||
kubeadm init phase certs etcd-server --config=/tmp/${HOST2}/kubeadmcfg.yaml
|
||||
kubeadm init phase certs etcd-peer --config=/tmp/${HOST2}/kubeadmcfg.yaml
|
||||
kubeadm init phase certs etcd-healthcheck-client --config=/tmp/${HOST2}/kubeadmcfg.yaml
|
||||
kubeadm init phase certs apiserver-etcd-client --config=/tmp/${HOST2}/kubeadmcfg.yaml
|
||||
cp -R /etc/kubernetes/pki /tmp/${HOST2}/
|
||||
# cleanup non-reusable certificates
|
||||
find /etc/kubernetes/pki -not -name ca.crt -not -name ca.key -type f -delete
|
||||
```sh
|
||||
kubeadm init phase certs etcd-server --config=/tmp/${HOST2}/kubeadmcfg.yaml
|
||||
kubeadm init phase certs etcd-peer --config=/tmp/${HOST2}/kubeadmcfg.yaml
|
||||
kubeadm init phase certs etcd-healthcheck-client --config=/tmp/${HOST2}/kubeadmcfg.yaml
|
||||
kubeadm init phase certs apiserver-etcd-client --config=/tmp/${HOST2}/kubeadmcfg.yaml
|
||||
cp -R /etc/kubernetes/pki /tmp/${HOST2}/
|
||||
# cleanup non-reusable certificates
|
||||
find /etc/kubernetes/pki -not -name ca.crt -not -name ca.key -type f -delete
|
||||
|
||||
kubeadm init phase certs etcd-server --config=/tmp/${HOST1}/kubeadmcfg.yaml
|
||||
kubeadm init phase certs etcd-peer --config=/tmp/${HOST1}/kubeadmcfg.yaml
|
||||
kubeadm init phase certs etcd-healthcheck-client --config=/tmp/${HOST1}/kubeadmcfg.yaml
|
||||
kubeadm init phase certs apiserver-etcd-client --config=/tmp/${HOST1}/kubeadmcfg.yaml
|
||||
cp -R /etc/kubernetes/pki /tmp/${HOST1}/
|
||||
find /etc/kubernetes/pki -not -name ca.crt -not -name ca.key -type f -delete
|
||||
kubeadm init phase certs etcd-server --config=/tmp/${HOST1}/kubeadmcfg.yaml
|
||||
kubeadm init phase certs etcd-peer --config=/tmp/${HOST1}/kubeadmcfg.yaml
|
||||
kubeadm init phase certs etcd-healthcheck-client --config=/tmp/${HOST1}/kubeadmcfg.yaml
|
||||
kubeadm init phase certs apiserver-etcd-client --config=/tmp/${HOST1}/kubeadmcfg.yaml
|
||||
cp -R /etc/kubernetes/pki /tmp/${HOST1}/
|
||||
find /etc/kubernetes/pki -not -name ca.crt -not -name ca.key -type f -delete
|
||||
|
||||
kubeadm init phase certs etcd-server --config=/tmp/${HOST0}/kubeadmcfg.yaml
|
||||
kubeadm init phase certs etcd-peer --config=/tmp/${HOST0}/kubeadmcfg.yaml
|
||||
kubeadm init phase certs etcd-healthcheck-client --config=/tmp/${HOST0}/kubeadmcfg.yaml
|
||||
kubeadm init phase certs apiserver-etcd-client --config=/tmp/${HOST0}/kubeadmcfg.yaml
|
||||
# No need to move the certs because they are for HOST0
|
||||
kubeadm init phase certs etcd-server --config=/tmp/${HOST0}/kubeadmcfg.yaml
|
||||
kubeadm init phase certs etcd-peer --config=/tmp/${HOST0}/kubeadmcfg.yaml
|
||||
kubeadm init phase certs etcd-healthcheck-client --config=/tmp/${HOST0}/kubeadmcfg.yaml
|
||||
kubeadm init phase certs apiserver-etcd-client --config=/tmp/${HOST0}/kubeadmcfg.yaml
|
||||
# No need to move the certs because they are for HOST0
|
||||
|
||||
# clean up certs that should not be copied off this host
|
||||
find /tmp/${HOST2} -name ca.key -type f -delete
|
||||
find /tmp/${HOST1} -name ca.key -type f -delete
|
||||
```
|
||||
# clean up certs that should not be copied off this host
|
||||
find /tmp/${HOST2} -name ca.key -type f -delete
|
||||
find /tmp/${HOST1} -name ca.key -type f -delete
|
||||
```
|
||||
|
||||
1. Copy certificates and kubeadm configs.
|
||||
|
||||
The certificates have been generated and now they must be moved to their
|
||||
respective hosts.
|
||||
The certificates have been generated and now they must be moved to their
|
||||
respective hosts.
|
||||
|
||||
```sh
|
||||
USER=ubuntu
|
||||
HOST=${HOST1}
|
||||
scp -r /tmp/${HOST}/* ${USER}@${HOST}:
|
||||
ssh ${USER}@${HOST}
|
||||
USER@HOST $ sudo -Es
|
||||
root@HOST $ chown -R root:root pki
|
||||
root@HOST $ mv pki /etc/kubernetes/
|
||||
```
|
||||
```sh
|
||||
USER=ubuntu
|
||||
HOST=${HOST1}
|
||||
scp -r /tmp/${HOST}/* ${USER}@${HOST}:
|
||||
ssh ${USER}@${HOST}
|
||||
USER@HOST $ sudo -Es
|
||||
root@HOST $ chown -R root:root pki
|
||||
root@HOST $ mv pki /etc/kubernetes/
|
||||
```
|
||||
|
||||
1. Ensure all expected files exist.
|
||||
|
||||
The complete list of required files on `$HOST0` is:
|
||||
The complete list of required files on `$HOST0` is:
|
||||
|
||||
```
|
||||
/tmp/${HOST0}
|
||||
└── kubeadmcfg.yaml
|
||||
---
|
||||
/etc/kubernetes/pki
|
||||
├── apiserver-etcd-client.crt
|
||||
├── apiserver-etcd-client.key
|
||||
└── etcd
|
||||
├── ca.crt
|
||||
├── ca.key
|
||||
├── healthcheck-client.crt
|
||||
├── healthcheck-client.key
|
||||
├── peer.crt
|
||||
├── peer.key
|
||||
├── server.crt
|
||||
└── server.key
|
||||
```
|
||||
```
|
||||
/tmp/${HOST0}
|
||||
└── kubeadmcfg.yaml
|
||||
---
|
||||
/etc/kubernetes/pki
|
||||
├── apiserver-etcd-client.crt
|
||||
├── apiserver-etcd-client.key
|
||||
└── etcd
|
||||
├── ca.crt
|
||||
├── ca.key
|
||||
├── healthcheck-client.crt
|
||||
├── healthcheck-client.key
|
||||
├── peer.crt
|
||||
├── peer.key
|
||||
├── server.crt
|
||||
└── server.key
|
||||
```
|
||||
|
||||
On `$HOST1`:
|
||||
On `$HOST1`:
|
||||
|
||||
```
|
||||
$HOME
|
||||
└── kubeadmcfg.yaml
|
||||
---
|
||||
/etc/kubernetes/pki
|
||||
├── apiserver-etcd-client.crt
|
||||
├── apiserver-etcd-client.key
|
||||
└── etcd
|
||||
├── ca.crt
|
||||
├── healthcheck-client.crt
|
||||
├── healthcheck-client.key
|
||||
├── peer.crt
|
||||
├── peer.key
|
||||
├── server.crt
|
||||
└── server.key
|
||||
```
|
||||
```
|
||||
$HOME
|
||||
└── kubeadmcfg.yaml
|
||||
---
|
||||
/etc/kubernetes/pki
|
||||
├── apiserver-etcd-client.crt
|
||||
├── apiserver-etcd-client.key
|
||||
└── etcd
|
||||
├── ca.crt
|
||||
├── healthcheck-client.crt
|
||||
├── healthcheck-client.key
|
||||
├── peer.crt
|
||||
├── peer.key
|
||||
├── server.crt
|
||||
└── server.key
|
||||
```
|
||||
|
||||
On `$HOST2`:
|
||||
On `$HOST2`:
|
||||
|
||||
```
|
||||
$HOME
|
||||
└── kubeadmcfg.yaml
|
||||
---
|
||||
/etc/kubernetes/pki
|
||||
├── apiserver-etcd-client.crt
|
||||
├── apiserver-etcd-client.key
|
||||
└── etcd
|
||||
├── ca.crt
|
||||
├── healthcheck-client.crt
|
||||
├── healthcheck-client.key
|
||||
├── peer.crt
|
||||
├── peer.key
|
||||
├── server.crt
|
||||
└── server.key
|
||||
```
|
||||
```
|
||||
$HOME
|
||||
└── kubeadmcfg.yaml
|
||||
---
|
||||
/etc/kubernetes/pki
|
||||
├── apiserver-etcd-client.crt
|
||||
├── apiserver-etcd-client.key
|
||||
└── etcd
|
||||
├── ca.crt
|
||||
├── healthcheck-client.crt
|
||||
├── healthcheck-client.key
|
||||
├── peer.crt
|
||||
├── peer.key
|
||||
├── server.crt
|
||||
└── server.key
|
||||
```
|
||||
|
||||
1. Create the static pod manifests.
|
||||
|
||||
Now that the certificates and configs are in place it's time to create the
|
||||
manifests. On each host run the `kubeadm` command to generate a static manifest
|
||||
for etcd.
|
||||
Now that the certificates and configs are in place it's time to create the
|
||||
manifests. On each host run the `kubeadm` command to generate a static manifest
|
||||
for etcd.
|
||||
|
||||
```sh
|
||||
root@HOST0 $ kubeadm init phase etcd local --config=/tmp/${HOST0}/kubeadmcfg.yaml
|
||||
root@HOST1 $ kubeadm init phase etcd local --config=$HOME/kubeadmcfg.yaml
|
||||
root@HOST2 $ kubeadm init phase etcd local --config=$HOME/kubeadmcfg.yaml
|
||||
```
|
||||
```sh
|
||||
root@HOST0 $ kubeadm init phase etcd local --config=/tmp/${HOST0}/kubeadmcfg.yaml
|
||||
root@HOST1 $ kubeadm init phase etcd local --config=$HOME/kubeadmcfg.yaml
|
||||
root@HOST2 $ kubeadm init phase etcd local --config=$HOME/kubeadmcfg.yaml
|
||||
```
|
||||
|
||||
1. Optional: Check the cluster health.
|
||||
|
||||
If `etcdctl` isn't available, you can run this tool inside a container image.
|
||||
You would do that directly with your container runtime using a tool such as
|
||||
`crictl run` and not through Kubernetes.
|
||||
|
||||
```sh
|
||||
docker run --rm -it \
|
||||
--net host \
|
||||
-v /etc/kubernetes:/etc/kubernetes registry.k8s.io/etcd:${ETCD_TAG} etcdctl \
|
||||
ETCDCTL_API=3 etcdctl \
|
||||
--cert /etc/kubernetes/pki/etcd/peer.crt \
|
||||
--key /etc/kubernetes/pki/etcd/peer.key \
|
||||
--cacert /etc/kubernetes/pki/etcd/ca.crt \
|
||||
--endpoints https://${HOST0}:2379 endpoint health --cluster
|
||||
--endpoints https://${HOST0}:2379 endpoint health
|
||||
...
|
||||
https://[HOST0 IP]:2379 is healthy: successfully committed proposal: took = 16.283339ms
|
||||
https://[HOST1 IP]:2379 is healthy: successfully committed proposal: took = 19.44402ms
|
||||
https://[HOST2 IP]:2379 is healthy: successfully committed proposal: took = 35.926451ms
|
||||
```
|
||||
- Set `${ETCD_TAG}` to the version tag of your etcd image, for example `3.4.3-0`. To see the etcd image and tag that kubeadm uses, execute `kubeadm config images list --kubernetes-version ${K8S_VERSION}`, where `${K8S_VERSION}` is, for example, `v1.17.0`.
|
||||
|
||||
- Set `${HOST0}` to the IP address of the host you are testing.
|
||||
|
||||
|
||||
|
||||
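In addition to the `etcdctl` check above, a lightweight per-host sanity check is sketched below. The paths assume the defaults used in this guide and a CRI-compatible runtime with `crictl` installed.

```shell
# The static Pod manifest kubeadm generated for this member.
ls /etc/kubernetes/manifests/etcd.yaml

# The running etcd container started by the kubelet from that manifest.
crictl ps --name etcd
```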
## {{% heading "whatsnext" %}}
|
||||
|
||||
|
||||
Once you have an etcd cluster with 3 working members, you can continue setting up a
|
||||
highly available control plane using the
|
||||
[external etcd method with kubeadm](/docs/setup/production-environment/tools/kubeadm/high-availability/).
|
||||
|
||||
|
|
|
@ -7,20 +7,18 @@ min-kubernetes-server-version: 1.19
|
|||
|
||||
<!-- overview -->
|
||||
|
||||
An [Ingress](/docs/concepts/services-networking/ingress/) is an API object that defines rules which allow external access
|
||||
to services in a cluster. An [Ingress controller](/docs/concepts/services-networking/ingress-controllers/) fulfills the rules set in the Ingress.
|
||||
|
||||
This page shows you how to set up a simple Ingress which routes requests to Service web or web2 depending on the HTTP URI.
|
||||
|
||||
An [Ingress](/docs/concepts/services-networking/ingress/) is an API object that defines rules
|
||||
which allow external access to services in a cluster. An
|
||||
[Ingress controller](/docs/concepts/services-networking/ingress-controllers/)
|
||||
fulfills the rules set in the Ingress.
|
||||
|
||||
This page shows you how to set up a simple Ingress which routes requests to Service 'web' or
|
||||
'web2' depending on the HTTP URI.
|
||||
|
||||
## {{% heading "prerequisites" %}}
|
||||
|
||||
|
||||
{{< include "task-tutorial-prereqs.md" >}} {{< version-check >}}
|
||||
If you are using an older Kubernetes version, switch to the documentation
|
||||
for that version.
|
||||
|
||||
If you are using an older Kubernetes version, switch to the documentation for that version.
|
||||
|
||||
### Create a Minikube cluster
|
||||
|
||||
|
@ -37,49 +35,60 @@ Locally
|
|||
|
||||
1. To enable the NGINX Ingress controller, run the following command:
|
||||
|
||||
```shell
|
||||
minikube addons enable ingress
|
||||
```
|
||||
```shell
|
||||
minikube addons enable ingress
|
||||
```
|
||||
|
||||
1. Verify that the NGINX Ingress controller is running
|
||||
|
||||
|
||||
{{< tabs name="tab_with_md" >}}
|
||||
{{% tab name="minikube v1.19 or later" %}}
|
||||
```shell
|
||||
kubectl get pods -n ingress-nginx
|
||||
```
|
||||
{{< note >}}It can take up to a minute before you see these pods running OK.{{< /note >}}
|
||||
|
||||
```shell
|
||||
kubectl get pods -n ingress-nginx
|
||||
```
|
||||
|
||||
{{< note >}}
|
||||
It can take up to a minute before you see these pods running OK.
|
||||
{{< /note >}}
|
||||
|
||||
The output is similar to:
|
||||
|
||||
```
|
||||
NAME READY STATUS RESTARTS AGE
|
||||
ingress-nginx-admission-create-g9g49 0/1 Completed 0 11m
|
||||
ingress-nginx-admission-patch-rqp78 0/1 Completed 1 11m
|
||||
ingress-nginx-controller-59b45fb494-26npt 1/1 Running 0 11m
|
||||
```
|
||||
```none
|
||||
NAME READY STATUS RESTARTS AGE
|
||||
ingress-nginx-admission-create-g9g49 0/1 Completed 0 11m
|
||||
ingress-nginx-admission-patch-rqp78 0/1 Completed 1 11m
|
||||
ingress-nginx-controller-59b45fb494-26npt 1/1 Running 0 11m
|
||||
```
|
||||
{{% /tab %}}
|
||||
|
||||
{{% tab name="minikube v1.18.1 or earlier" %}}
|
||||
```shell
|
||||
kubectl get pods -n kube-system
|
||||
```
|
||||
{{< note >}}It can take up to a minute before you see these pods running OK.{{< /note >}}
|
||||
|
||||
```shell
|
||||
kubectl get pods -n kube-system
|
||||
```
|
||||
|
||||
{{< note >}}
|
||||
It can take up to a minute before you see these pods running OK.
|
||||
{{< /note >}}
|
||||
|
||||
The output is similar to:
|
||||
|
||||
```
|
||||
NAME READY STATUS RESTARTS AGE
|
||||
default-http-backend-59868b7dd6-xb8tq 1/1 Running 0 1m
|
||||
kube-addon-manager-minikube 1/1 Running 0 3m
|
||||
kube-dns-6dcb57bcc8-n4xd4 3/3 Running 0 2m
|
||||
kubernetes-dashboard-5498ccf677-b8p5h 1/1 Running 0 2m
|
||||
nginx-ingress-controller-5984b97644-rnkrg 1/1 Running 0 1m
|
||||
storage-provisioner 1/1 Running 0 2m
|
||||
```
|
||||
```none
|
||||
NAME READY STATUS RESTARTS AGE
|
||||
default-http-backend-59868b7dd6-xb8tq 1/1 Running 0 1m
|
||||
kube-addon-manager-minikube 1/1 Running 0 3m
|
||||
kube-dns-6dcb57bcc8-n4xd4 3/3 Running 0 2m
|
||||
kubernetes-dashboard-5498ccf677-b8p5h 1/1 Running 0 2m
|
||||
nginx-ingress-controller-5984b97644-rnkrg 1/1 Running 0 1m
|
||||
storage-provisioner 1/1 Running 0 2m
|
||||
```
|
||||
|
||||
Make sure that you see a Pod with a name that starts with `nginx-ingress-controller-`.
|
||||
|
||||
Make sure that you see a Pod with a name that starts with `nginx-ingress-controller-`.
|
||||
{{% /tab %}}
|
||||
|
||||
{{< /tabs >}}
|
||||
|
||||
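As an alternative to polling `kubectl get pods`, you can block until the controller Pod reports ready. This is a hedged variant that assumes the upstream ingress-nginx labels used by the minikube addon.

```shell
kubectl wait --namespace ingress-nginx \
  --for=condition=ready pod \
  --selector=app.kubernetes.io/component=controller \
  --timeout=90s
```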
## Deploy a hello, world app
|
||||
|
@ -92,7 +101,7 @@ storage-provisioner 1/1 Running 0 2m
|
|||
|
||||
The output should be:
|
||||
|
||||
```
|
||||
```none
|
||||
deployment.apps/web created
|
||||
```
|
||||
|
||||
|
@ -104,19 +113,19 @@ storage-provisioner 1/1 Running 0 2m
|
|||
|
||||
The output should be:
|
||||
|
||||
```
|
||||
```none
|
||||
service/web exposed
|
||||
```
|
||||
|
||||
1. Verify the Service is created and is available on a node port:
|
||||
|
||||
```shell
|
||||
```shell
|
||||
kubectl get service web
|
||||
```
|
||||
|
||||
The output is similar to:
|
||||
|
||||
```
|
||||
```none
|
||||
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
|
||||
web NodePort 10.104.133.249 <none> 8080:31637/TCP 12m
|
||||
```
|
||||
|
@ -129,26 +138,31 @@ storage-provisioner 1/1 Running 0 2m
|
|||
|
||||
The output is similar to:
|
||||
|
||||
```
|
||||
```none
|
||||
http://172.17.0.15:31637
|
||||
```
|
||||
|
||||
{{< note >}}Katacoda environment only: at the top of the terminal panel, click the plus sign, and then click **Select port to view on Host 1**. Enter the NodePort, in this case `31637`, and then click **Display Port**.{{< /note >}}
|
||||
{{< note >}}
|
||||
Katacoda environment only: at the top of the terminal panel, click the plus sign,
|
||||
and then click **Select port to view on Host 1**. Enter the NodePort value,
|
||||
in this case `31637`, and then click **Display Port**.
|
||||
{{< /note >}}
|
||||
|
||||
The output is similar to:
|
||||
|
||||
```
|
||||
```none
|
||||
Hello, world!
|
||||
Version: 1.0.0
|
||||
Hostname: web-55b8c6998d-8k564
|
||||
```
|
||||
|
||||
You can now access the sample app via the Minikube IP address and NodePort. The next step lets you access
|
||||
the app using the Ingress resource.
|
||||
You can now access the sample application via the Minikube IP address and NodePort.
|
||||
The next step lets you access the application using the Ingress resource.
|
||||
|
||||
## Create an Ingress
|
||||
|
||||
The following manifest defines an Ingress that sends traffic to your Service via hello-world.info.
|
||||
The following manifest defines an Ingress that sends traffic to your Service via
|
||||
`hello-world.info`.
|
||||
|
||||
1. Create `example-ingress.yaml` from the following file:
|
||||
|
||||
|
@ -162,7 +176,7 @@ The following manifest defines an Ingress that sends traffic to your Service via
|
|||
|
||||
The output should be:
|
||||
|
||||
```
|
||||
```none
|
||||
ingress.networking.k8s.io/example-ingress created
|
||||
```
|
||||
|
||||
|
@ -172,11 +186,13 @@ The following manifest defines an Ingress that sends traffic to your Service via
|
|||
kubectl get ingress
|
||||
```
|
||||
|
||||
{{< note >}}This can take a couple of minutes.{{< /note >}}
|
||||
{{< note >}}
|
||||
This can take a couple of minutes.
|
||||
{{< /note >}}
|
||||
|
||||
You should see an IPv4 address in the ADDRESS column; for example:
|
||||
You should see an IPv4 address in the `ADDRESS` column; for example:
|
||||
|
||||
```
|
||||
```none
|
||||
NAME CLASS HOSTS ADDRESS PORTS AGE
|
||||
example-ingress <none> hello-world.info 172.17.0.15 80 38s
|
||||
```
|
||||
|
@ -184,30 +200,35 @@ The following manifest defines an Ingress that sends traffic to your Service via
|
|||
1. Add the following line to the bottom of the `/etc/hosts` file on
|
||||
your computer (you will need administrator access):
|
||||
|
||||
```
|
||||
```none
|
||||
172.17.0.15 hello-world.info
|
||||
```
|
||||
|
||||
{{< note >}}If you are running Minikube locally, use `minikube ip` to get the external IP. The IP address displayed within the ingress list will be the internal IP.{{< /note >}}
|
||||
{{< note >}}
|
||||
If you are running Minikube locally, use `minikube ip` to get the external IP.
|
||||
The IP address displayed within the ingress list will be the internal IP.
|
||||
{{< /note >}}
|
||||
|
||||
After you make this change, your web browser sends requests for
|
||||
hello-world.info URLs to Minikube.
|
||||
After you make this change, your web browser sends requests for
|
||||
`hello-world.info` URLs to Minikube.
|
||||
|
||||
1. Verify that the Ingress controller is directing traffic:
|
||||
|
||||
```shell
|
||||
curl hello-world.info
|
||||
```
|
||||
```shell
|
||||
curl hello-world.info
|
||||
```
|
||||
|
||||
You should see:
|
||||
You should see:
|
||||
|
||||
```
|
||||
Hello, world!
|
||||
Version: 1.0.0
|
||||
Hostname: web-55b8c6998d-8k564
|
||||
```
|
||||
```none
|
||||
Hello, world!
|
||||
Version: 1.0.0
|
||||
Hostname: web-55b8c6998d-8k564
|
||||
```
|
||||
|
||||
{{< note >}}If you are running Minikube locally, you can visit hello-world.info from your browser.{{< /note >}}
|
||||
{{< note >}}
|
||||
If you are running Minikube locally, you can visit `hello-world.info` from your browser.
|
||||
{{< /note >}}
|
||||
|
||||
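If you would rather not edit `/etc/hosts`, curl can map the host name for a single request. The address below is the one from the example output; replace it with the address your ingress reports.

```shell
curl --resolve "hello-world.info:80:172.17.0.15" -i http://hello-world.info
```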
## Create a second Deployment
|
||||
|
||||
|
@ -216,9 +237,10 @@ The following manifest defines an Ingress that sends traffic to your Service via
|
|||
```shell
|
||||
kubectl create deployment web2 --image=gcr.io/google-samples/hello-app:2.0
|
||||
```
|
||||
|
||||
The output should be:
|
||||
|
||||
```
|
||||
```none
|
||||
deployment.apps/web2 created
|
||||
```
|
||||
|
||||
|
@ -230,7 +252,7 @@ The following manifest defines an Ingress that sends traffic to your Service via
|
|||
|
||||
The output should be:
|
||||
|
||||
```
|
||||
```none
|
||||
service/web2 exposed
|
||||
```
|
||||
|
||||
|
@ -240,13 +262,13 @@ The following manifest defines an Ingress that sends traffic to your Service via
|
|||
following lines at the end:
|
||||
|
||||
```yaml
|
||||
- path: /v2
|
||||
pathType: Prefix
|
||||
backend:
|
||||
service:
|
||||
name: web2
|
||||
port:
|
||||
number: 8080
|
||||
- path: /v2
|
||||
pathType: Prefix
|
||||
backend:
|
||||
service:
|
||||
name: web2
|
||||
port:
|
||||
number: 8080
|
||||
```
|
||||
|
||||
1. Apply the changes:
|
||||
|
@ -257,7 +279,7 @@ The following manifest defines an Ingress that sends traffic to your Service via
|
|||
|
||||
You should see:
|
||||
|
||||
```
|
||||
```none
|
||||
ingress.networking/example-ingress configured
|
||||
```
|
||||
|
||||
|
@ -271,7 +293,7 @@ The following manifest defines an Ingress that sends traffic to your Service via
|
|||
|
||||
The output is similar to:
|
||||
|
||||
```
|
||||
```none
|
||||
Hello, world!
|
||||
Version: 1.0.0
|
||||
Hostname: web-55b8c6998d-8k564
|
||||
|
@ -285,16 +307,16 @@ The following manifest defines an Ingress that sends traffic to your Service via
|
|||
|
||||
The output is similar to:
|
||||
|
||||
```
|
||||
```none
|
||||
Hello, world!
|
||||
Version: 2.0.0
|
||||
Hostname: web2-75cd47646f-t8cjk
|
||||
```
|
||||
|
||||
{{< note >}}If you are running Minikube locally, you can visit hello-world.info and hello-world.info/v2 from your browser.{{< /note >}}
|
||||
|
||||
|
||||
|
||||
{{< note >}}
|
||||
If you are running Minikube locally, you can visit `hello-world.info` and
|
||||
`hello-world.info/v2` from your browser.
|
||||
{{< /note >}}
|
||||
|
||||
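To confirm that both paths are now configured on the same Ingress, you can also inspect it directly:

```shell
kubectl describe ingress example-ingress
```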
## {{% heading "whatsnext" %}}
|
||||
|
||||
|
|
|
@ -10,26 +10,15 @@ This page shows how to create a Kubernetes Service object that external
|
|||
clients can use to access an application running in a cluster. The Service
|
||||
provides load balancing for an application that has two running instances.
|
||||
|
||||
|
||||
|
||||
|
||||
## {{% heading "prerequisites" %}}
|
||||
|
||||
|
||||
{{< include "task-tutorial-prereqs.md" >}}
|
||||
|
||||
|
||||
|
||||
|
||||
## {{% heading "objectives" %}}
|
||||
|
||||
|
||||
* Run two instances of a Hello World application.
|
||||
* Create a Service object that exposes a node port.
|
||||
* Use the Service object to access the running application.
|
||||
|
||||
|
||||
|
||||
- Run two instances of a Hello World application.
|
||||
- Create a Service object that exposes a node port.
|
||||
- Use the Service object to access the running application.
|
||||
|
||||
<!-- lessoncontent -->
|
||||
|
||||
|
@ -41,9 +30,11 @@ Here is the configuration file for the application Deployment:
|
|||
|
||||
1. Run a Hello World application in your cluster:
|
||||
Create the application Deployment using the file above:
|
||||
|
||||
```shell
|
||||
kubectl apply -f https://k8s.io/examples/service/access/hello-application.yaml
|
||||
```
|
||||
|
||||
The preceding command creates a
|
||||
{{< glossary_tooltip text="Deployment" term_id="deployment" >}}
|
||||
and an associated
|
||||
|
@ -52,30 +43,35 @@ Here is the configuration file for the application Deployment:
|
|||
{{< glossary_tooltip text="Pods" term_id="pod" >}}
|
||||
each of which runs the Hello World application.
|
||||
|
||||
|
||||
1. Display information about the Deployment:
|
||||
|
||||
```shell
|
||||
kubectl get deployments hello-world
|
||||
kubectl describe deployments hello-world
|
||||
```
|
||||
|
||||
1. Display information about your ReplicaSet objects:
|
||||
|
||||
```shell
|
||||
kubectl get replicasets
|
||||
kubectl describe replicasets
|
||||
```
|
||||
|
||||
1. Create a Service object that exposes the deployment:
|
||||
|
||||
```shell
|
||||
kubectl expose deployment hello-world --type=NodePort --name=example-service
|
||||
```
|
||||
|
||||
1. Display information about the Service:
|
||||
|
||||
```shell
|
||||
kubectl describe services example-service
|
||||
```
|
||||
|
||||
The output is similar to this:
|
||||
```shell
|
||||
|
||||
```none
|
||||
Name: example-service
|
||||
Namespace: default
|
||||
Labels: run=load-balancer-example
|
||||
|
@ -90,19 +86,24 @@ Here is the configuration file for the application Deployment:
|
|||
Session Affinity: None
|
||||
Events: <none>
|
||||
```
|
||||
|
||||
Make a note of the NodePort value for the service. For example,
|
||||
in the preceding output, the NodePort value is 31496.
|
||||
|
||||
1. List the pods that are running the Hello World application:
|
||||
|
||||
```shell
|
||||
kubectl get pods --selector="run=load-balancer-example" --output=wide
|
||||
```
|
||||
|
||||
The output is similar to this:
|
||||
```shell
|
||||
|
||||
```none
|
||||
NAME READY STATUS ... IP NODE
|
||||
hello-world-2895499144-bsbk5 1/1 Running ... 10.200.1.4 worker1
|
||||
hello-world-2895499144-m1pwt 1/1 Running ... 10.200.2.5 worker2
|
||||
```
|
||||
|
||||
1. Get the public IP address of one of your nodes that is running
|
||||
a Hello World pod. How you get this address depends on how you set
|
||||
up your cluster. For example, if you are using Minikube, you can
|
||||
|
@ -117,13 +118,16 @@ Here is the configuration file for the application Deployment:
|
|||
cloud providers offer different ways of configuring firewall rules.
|
||||
|
||||
1. Use the node address and node port to access the Hello World application:
|
||||
|
||||
```shell
|
||||
curl http://<public-node-ip>:<node-port>
|
||||
```
|
||||
|
||||
where `<public-node-ip>` is the public IP address of your node,
|
||||
and `<node-port>` is the NodePort value for your service. The
|
||||
response to a successful request is a hello message:
|
||||
```shell
|
||||
|
||||
```none
|
||||
Hello Kubernetes!
|
||||
```
|
||||
|
||||
|
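If you are following along on Minikube, a hedged shortcut avoids copying the values by hand. It assumes the `example-service` created earlier and that `minikube ip` returns a reachable node address.

```shell
NODE_PORT=$(kubectl get service example-service -o jsonpath='{.spec.ports[0].nodePort}')
curl "http://$(minikube ip):${NODE_PORT}"
```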
@ -133,12 +137,8 @@ As an alternative to using `kubectl expose`, you can use a
|
|||
[service configuration file](/docs/concepts/services-networking/service/)
|
||||
to create a Service.
|
||||
|
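A minimal sketch of such a configuration file, roughly equivalent to the `kubectl expose` command used above, is shown here. The selector reuses the `run=load-balancer-example` label from the example Deployment, and the port numbers assume the hello-world container listening on 8080.

```shell
cat <<'EOF' | kubectl apply -f -
apiVersion: v1
kind: Service
metadata:
  name: example-service
spec:
  type: NodePort
  selector:
    run: load-balancer-example
  ports:
    - port: 8080
      targetPort: 8080
EOF
```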
||||
|
||||
|
||||
|
||||
## {{% heading "cleanup" %}}
|
||||
|
||||
|
||||
To delete the Service, enter this command:
|
||||
|
||||
kubectl delete services example-service
|
||||
|
@ -148,9 +148,6 @@ the Hello World application, enter this command:
|
|||
|
||||
kubectl delete deployment hello-world
|
||||
|
||||
|
||||
|
||||
|
||||
## {{% heading "whatsnext" %}}
|
||||
|
||||
Follow the
|
||||
|
|
|
@ -100,4 +100,4 @@ release with a newer device plugin API version, device plugins must be upgraded
|
|||
both version before the node is upgraded in order to guarantee that device allocations
|
||||
continue to complete successfully during the upgrade.
|
||||
|
||||
Refer to [API compatiblity](/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins.md/#api-compatibility) and [Kubelet Device Manager API Versions](/docs/reference/node/device-plugin-api-versions.md) for more details.
|
||||
Refer to [API compatibility](/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins/#api-compatibility) and [Kubelet Device Manager API Versions](/docs/reference/node/device-plugin-api-versions/) for more details.
|
|
@ -4,24 +4,16 @@ content_type: task
|
|||
weight: 70
|
||||
---
|
||||
|
||||
|
||||
<!-- overview -->
|
||||
|
||||
This page shows how to specify extended resources for a Node.
|
||||
Extended resources allow cluster administrators to advertise node-level
|
||||
resources that would otherwise be unknown to Kubernetes.
|
||||
|
||||
|
||||
|
||||
|
||||
## {{% heading "prerequisites" %}}
|
||||
|
||||
|
||||
{{< include "task-tutorial-prereqs.md" >}} {{< version-check >}}
|
||||
|
||||
|
||||
|
||||
|
||||
<!-- steps -->
|
||||
|
||||
## Get the names of your Nodes
|
||||
|
@ -39,7 +31,7 @@ the Kubernetes API server. For example, suppose one of your Nodes has four dongl
|
|||
attached. Here's an example of a PATCH request that advertises four dongle resources
|
||||
for your Node.
|
||||
|
||||
```shell
|
||||
```
|
||||
PATCH /api/v1/nodes/<your-node-name>/status HTTP/1.1
|
||||
Accept: application/json
|
||||
Content-Type: application/json-patch+json
|
||||
|
@ -69,9 +61,9 @@ Replace `<your-node-name>` with the name of your Node:
|
|||
|
||||
```shell
|
||||
curl --header "Content-Type: application/json-patch+json" \
|
||||
--request PATCH \
|
||||
--data '[{"op": "add", "path": "/status/capacity/example.com~1dongle", "value": "4"}]' \
|
||||
http://localhost:8001/api/v1/nodes/<your-node-name>/status
|
||||
--request PATCH \
|
||||
--data '[{"op": "add", "path": "/status/capacity/example.com~1dongle", "value": "4"}]' \
|
||||
http://localhost:8001/api/v1/nodes/<your-node-name>/status
|
||||
```
|
||||
|
||||
{{< note >}}
|
||||
|
@ -100,9 +92,9 @@ Once again, the output shows the dongle resource:
|
|||
|
||||
```yaml
|
||||
Capacity:
|
||||
cpu: 2
|
||||
memory: 2049008Ki
|
||||
example.com/dongle: 4
|
||||
cpu: 2
|
||||
memory: 2049008Ki
|
||||
example.com/dongle: 4
|
||||
```
|
||||
|
||||
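A narrower check, useful in scripts, reads just the advertised quantity; the backslash escapes the dot inside the `example.com/dongle` resource name for kubectl's JSONPath parser, and `<your-node-name>` is the same placeholder used in the steps above.

```shell
kubectl get node <your-node-name> \
  -o jsonpath='{.status.capacity.example\.com/dongle}'
```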
Now, application developers can create Pods that request a certain
|
||||
|
@ -178,9 +170,9 @@ Replace `<your-node-name>` with the name of your Node:
|
|||
|
||||
```shell
|
||||
curl --header "Content-Type: application/json-patch+json" \
|
||||
--request PATCH \
|
||||
--data '[{"op": "remove", "path": "/status/capacity/example.com~1dongle"}]' \
|
||||
http://localhost:8001/api/v1/nodes/<your-node-name>/status
|
||||
--request PATCH \
|
||||
--data '[{"op": "remove", "path": "/status/capacity/example.com~1dongle"}]' \
|
||||
http://localhost:8001/api/v1/nodes/<your-node-name>/status
|
||||
```
|
||||
|
||||
Verify that the dongle advertisement has been removed:
|
||||
|
@ -191,20 +183,13 @@ kubectl describe node <your-node-name> | grep dongle
|
|||
|
||||
(you should not see any output)
|
||||
|
||||
|
||||
|
||||
|
||||
## {{% heading "whatsnext" %}}
|
||||
|
||||
|
||||
### For application developers
|
||||
|
||||
* [Assign Extended Resources to a Container](/docs/tasks/configure-pod-container/extended-resource/)
|
||||
- [Assign Extended Resources to a Container](/docs/tasks/configure-pod-container/extended-resource/)
|
||||
|
||||
### For cluster administrators
|
||||
|
||||
* [Configure Minimum and Maximum Memory Constraints for a Namespace](/docs/tasks/administer-cluster/manage-resources/memory-constraint-namespace/)
|
||||
* [Configure Minimum and Maximum CPU Constraints for a Namespace](/docs/tasks/administer-cluster/manage-resources/cpu-constraint-namespace/)
|
||||
|
||||
|
||||
|
||||
- [Configure Minimum and Maximum Memory Constraints for a Namespace](/docs/tasks/administer-cluster/manage-resources/memory-constraint-namespace/)
|
||||
- [Configure Minimum and Maximum CPU Constraints for a Namespace](/docs/tasks/administer-cluster/manage-resources/cpu-constraint-namespace/)
|
||||
|
|
|
@ -83,8 +83,8 @@ providers:
|
|||
#
|
||||
# A match exists between an image and a matchImage when all of the below are true:
|
||||
# - Both contain the same number of domain parts and each part matches.
|
||||
# - The URL path of an imageMatch must be a prefix of the target image URL path.
|
||||
# - If the imageMatch contains a port, then the port must match in the image as well.
|
||||
# - The URL path of an matchImages must be a prefix of the target image URL path.
|
||||
# - If the matchImages contains a port, then the port must match in the image as well.
|
||||
#
|
||||
# Example values of matchImages:
|
||||
# - 123456789.dkr.ecr.us-east-1.amazonaws.com
|
||||
|
@ -143,7 +143,7 @@ A match exists between an image name and a `matchImage` entry when all of the be
|
|||
|
||||
* Both contain the same number of domain parts and each part matches.
|
||||
* The URL path of the match image must be a prefix of the target image URL path.
|
||||
* If the imageMatch contains a port, then the port must match in the image as well.
|
||||
* If the matchImages contains a port, then the port must match in the image as well.
|
||||
|
||||
Some example values of `matchImages` patterns are:
|
||||
|
||||
|
|
|
@ -111,11 +111,11 @@ This is the default policy and does not affect the memory allocation in any way.
|
|||
It acts the same as if the Memory Manager is not present at all.
|
||||
|
||||
The `None` policy returns default topology hint. This special hint denotes that Hint Provider
|
||||
(Memory Manger in this case) has no preference for NUMA affinity with any resource.
|
||||
(Memory Manager in this case) has no preference for NUMA affinity with any resource.
|
||||
|
||||
#### Static policy {#policy-static}
|
||||
|
||||
In the case of the `Guaranteed` pod, the `Static` Memory Manger policy returns topology hints
|
||||
In the case of the `Guaranteed` pod, the `Static` Memory Manager policy returns topology hints
|
||||
relating to the set of NUMA nodes where the memory can be guaranteed,
|
||||
and reserves the memory through updating the internal [NodeMap][2] object.
|
||||
|
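For readers who want to try the `Static` policy, a hedged sketch of the kubelet flags involved follows. The sizes are illustrative assumptions, and `--reserved-memory` must equal the sum of kube-reserved, system-reserved, and the hard eviction threshold for memory (here 1024Mi + 1024Mi + 100Mi = 2148Mi).

```shell
# Illustrative values only; adapt them to your node's NUMA topology and reservations.
KUBELET_EXTRA_ARGS="--memory-manager-policy=Static \
  --kube-reserved=memory=1Gi \
  --system-reserved=memory=1Gi \
  --eviction-hard=memory.available<100Mi \
  --reserved-memory=0:memory=2148Mi"
```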
||||
|
|
|
@ -43,7 +43,7 @@ Decide whether you want to deploy a [cloud](#creating-a-calico-cluster-with-goog
|
|||
## Creating a local Calico cluster with kubeadm
|
||||
|
||||
To get a local single-host Calico cluster in fifteen minutes using kubeadm, refer to the
|
||||
[Calico Quickstart](https://docs.projectcalico.org/latest/getting-started/kubernetes/).
|
||||
[Calico Quickstart](https://projectcalico.docs.tigera.io/getting-started/kubernetes/).
|
||||
|
||||
|
||||
|
||||
|
|
Some files were not shown because too many files have changed in this diff.