Merge main into dev-1.29 to keep in sync

pull/43347/head
Kat Cosgrove 2023-10-06 15:36:13 +01:00
commit ad943fce2f
No known key found for this signature in database
GPG Key ID: 9909DF75A59E4ABC
858 changed files with 85342 additions and 17492 deletions

View File

@ -32,17 +32,17 @@ aliases:
- annajung
- bradtopol
- divya-mohan0209
- katcosgrove # RT 1.29 Docs Lead
- kbhawkey
- natalisucks
- nate-double-u
- onlydole
- reylejano
- Rishit-dagli # 1.28 Release Team Docs Lead
- sftim
- tengqm
sig-docs-en-reviews: # PR reviews for English content
- bradtopol
- dipesh-rawat
- divya-mohan0209
- kbhawkey
- mehabhalodiya
@ -54,6 +54,7 @@ aliases:
- sftim
- shannonxtreme
- tengqm
- windsonsea
sig-docs-es-owners: # Admins for Spanish content
- 92nqb
- krol3
@ -209,13 +210,13 @@ aliases:
- Potapy4
# authoritative source: git.k8s.io/community/OWNERS_ALIASES
committee-steering: # provide PR approvals for announcements
- cblecker
- cpanato
- bentheelder
- justaugustus
- mrbobbytables
- pacoxu
- palnabarun
- tpepper
- pohly
- soltysh
# authoritative source: https://git.k8s.io/sig-release/OWNERS_ALIASES
sig-release-leads:
- cpanato # SIG Technical Lead

View File

@ -3,11 +3,11 @@
[![Build Status](https://api.travis-ci.org/kubernetes/website.svg?branch=master)](https://travis-ci.org/kubernetes/website)
[![GitHub release](https://img.shields.io/github/release/kubernetes/website.svg)](https://github.com/kubernetes/website/releases/latest)
स्वागत है! इस रिपॉजिटरी में [कुबरनेट्स वेबसाइट और दस्तावेज](https://kubernetes.io/) बनाने के लिए आवश्यक सभी संपत्तियां हैं। हम बहुत खुश हैं कि आप योगदान करना चाहते हैं!
## डॉक्स में योगदान देना
आप अपने GitHub खाते में इस रिपॉजिटरी की एक copy बनाने के लिए स्क्रीन के ऊपरी-दाएँ क्षेत्र में **Fork** बटन पर क्लिक करें। इस copy को *Fork* कहा जाता है। अपने fork में परिवर्तन करने के बाद जब आप उनको हमारे पास भेजने के लिए तैयार हों, तो अपने fork पर जाए और हमें इसके बारे में बताने के लिए एक नया pull request बनाएं।
एक बार जब आपका pull request बन जाता है, तो एक कुबरनेट्स समीक्षक स्पष्ट, कार्रवाई योग्य प्रतिक्रिया प्रदान करने की जिम्मेदारी लेगा। pull request के मालिक के रूप में, **यह आपकी जिम्मेदारी है कि आप कुबरनेट्स समीक्षक द्वारा प्रदान की गई प्रतिक्रिया को संबोधित करने के लिए अपने pull request को संशोधित करें।**

View File

@ -2,11 +2,11 @@
[![Netlify Status](https://api.netlify.com/api/v1/badges/be93b718-a6df-402a-b4a4-855ba186c97d/deploy-status)](https://app.netlify.com/sites/kubernetes-io-main-staging/deploys) [![GitHub release](https://img.shields.io/github/release/kubernetes/website.svg)](https://github.com/kubernetes/website/releases/latest)
Bem-vindos! Este repositório contém todos os recursos necessários para criar o [website e documentação do Kubernetes](https://kubernetes.io/). Estamos muito satisfeitos por você querer contribuir!
Este repositório contém todos os recursos necessários para criar o [website e documentação do Kubernetes](https://kubernetes.io/). Ficamos felizes por você querer contribuir!
# Utilizando este repositório
## Utilizando este repositório
Você pode executar o website localmente utilizando o Hugo (versão Extended), ou você pode executa-ló em um container runtime. É altamente recomendável utilizar um container runtime, pois garante a consistência na implantação do website real.
Você pode executar o website localmente utilizando o [Hugo (versão Extended)](https://gohugo.io/), ou você pode executá-lo em um agente de execução de contêiner. É altamente recomendável utilizar um agente de execução de contêiner, pois este fornece consistência de implantação em relação ao website real.
## Pré-requisitos
@ -24,22 +24,33 @@ git clone https://github.com/kubernetes/website.git
cd website
```
O website do Kubernetes utiliza o [tema Docsy Hugo](https://github.com/google/docsy#readme). Mesmo se você planeje executar o website em um container, é altamente recomendado baixar os submódulos e outras dependências executando o seguinte comando:
O website do Kubernetes utiliza o [tema Docsy Hugo](https://github.com/google/docsy#readme). Mesmo que você planeje executar o website em um contêiner, é altamente recomendado baixar os submódulos e outras dependências executando o seguinte comando:
```
# Baixar o submódulo Docsy
## Windows
```powershell
# Obter dependências e outros submódulos
git submodule update --init --recursive --depth 1
```
## Linux / outros Unix
```bash
# Obter dependências e outros submódulos
make module-init
```
## Executando o website usando um container
Para executar o build do website em um container, execute o comando abaixo para criar a imagem do container e executa-lá:
Para executar o build do website em um contêiner, execute o comando abaixo:
```
make container-image
```bash
# Você pode definir a variável $CONTAINER_ENGINE com o nome do agente de execução de contêiner utilizado.
make container-serve
```
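Por exemplo, um esboço mínimo assumindo que o Podman esteja instalado como agente de execução de contêiner (o nome `podman` é apenas ilustrativo):

```bash
# Executa o mesmo alvo do Makefile usando o Podman no lugar do Docker.
CONTAINER_ENGINE=podman make container-serve
```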
Caso ocorram erros, é provável que o contêiner que está executando o Hugo não tenha recursos suficientes. A solução é aumentar a quantidade de CPU e memória disponível para o Docker ([MacOSX](https://docs.docker.com/docker-for-mac/#resources) e [Windows](https://docs.docker.com/docker-for-windows/#resources)).
Abra seu navegador em http://localhost:1313 para visualizar o website. Conforme você faz alterações nos arquivos fontes, o Hugo atualiza o website e força a atualização do navegador.
## Executando o website localmente utilizando o Hugo
@ -54,7 +65,7 @@ npm ci
make serve
```
Isso iniciará localmente o Hugo na porta 1313. Abra o seu navegador em http://localhost:1313 para visualizar o website. Conforme você faz alterações nos arquivos fontes, o Hugo atualiza o website e força uma atualização no navegador.
O Hugo iniciará localmente na porta 1313. Abra o seu navegador em http://localhost:1313 para visualizar o website. Conforme você faz alterações nos arquivos fontes, o Hugo atualiza o website e força uma atualização no navegador.
## Construindo a página de referência da API
@ -62,31 +73,21 @@ A página de referência da API localizada em `content/en/docs/reference/kuberne
Siga os passos abaixo para atualizar a página de referência para uma nova versão do Kubernetes:
OBS: modifique o "v1.20" no exemplo a seguir pela versão a ser atualizada
1. Obter o submódulo `kubernetes-resources-reference`:
1. Obter o submódulo `api-ref-generator`:
```
git submodule update --init --recursive --depth 1
```
2. Criar a nova versão da API no submódulo e adicionar à especificação do Swagger:
2. Atualizar a especificação do Swagger:
```
mkdir api-ref-generator/gen-resourcesdocs/api/v1.20
curl 'https://raw.githubusercontent.com/kubernetes/kubernetes/master/api/openapi-spec/swagger.json' > api-ref-generator/gen-resourcesdocs/api/v1.20/swagger.json
```bash
curl 'https://raw.githubusercontent.com/kubernetes/kubernetes/master/api/openapi-spec/swagger.json' > api-ref-generator/
```
3. Copiar o sumário e os campos de configuração para a nova versão a partir da versão anterior:
3. Ajustar os arquivos `toc.yaml` e `fields.yaml` para refletir as mudanças entre as duas versões.
```
mkdir api-ref-generator/gen-resourcesdocs/api/v1.20
cp api-ref-generator/gen-resourcesdocs/api/v1.19/* api-ref-generator/gen-resourcesdocs/api/v1.20/
```
4. Ajustar os arquivos `toc.yaml` e `fields.yaml` para refletir as mudanças entre as duas versões.
5. Em seguida, gerar as páginas:
4. Em seguida, gerar as páginas:
```
make api-reference
@ -101,7 +102,7 @@ make container-serve
Abra o seu navegador em http://localhost:1313/docs/reference/kubernetes-api/ para visualizar a página de referência da API.
6. Quando todas as mudanças forem refletidas nos arquivos de configuração `toc.yaml` e `fields.yaml`, crie um pull request com a nova página de referência de API.
5. Quando todas as mudanças forem refletidas nos arquivos de configuração `toc.yaml` e `fields.yaml`, crie um pull request com a nova página de referência de API.
## Troubleshooting
### error: failed to transform resource: TOCSS: failed to transform "scss/main.scss" (text/x-scss): this feature is not available in your current Hugo version
@ -153,16 +154,17 @@ make: *** [container-serve] Error 137
Verifique a quantidade de memória disponível para o agente de execução de contêiner. No caso do Docker Desktop para macOS, abra o menu "Preferences..." -> "Resources..." e tente disponibilizar mais memória.
# Comunidade, discussão, contribuição e apoio
## Comunidade, discussão, contribuição e apoio
Saiba mais sobre a comunidade Kubernetes SIG Docs e reuniões na [página da comunidade](http://kubernetes.io/community/).
Você também pode entrar em contato com os mantenedores deste projeto em:
Você também pode entrar em contato com os mantenedores deste projeto utilizando:
- [Slack](https://kubernetes.slack.com/messages/sig-docs) ([Obter o convide para o este slack](https://slack.k8s.io/))
- [Slack](https://kubernetes.slack.com/messages/sig-docs)
- [Obter o convite para este Slack](https://slack.k8s.io/)
- [Mailing List](https://groups.google.com/forum/#!forum/kubernetes-sig-docs)
# Contribuindo com os documentos
## Contribuindo com a documentação
Você pode clicar no botão **Fork** na área superior direita da tela para criar uma cópia desse repositório na sua conta do GitHub. Esta cópia é chamada de *fork*. Faça as alterações desejadas no seu fork e, quando estiver pronto para enviar as alterações para nós, vá até o fork e crie um novo **pull request** para nos informar sobre isso.
@ -179,10 +181,27 @@ Para mais informações sobre como contribuir com a documentação do Kubernetes
* [Guia de Estilo da Documentação](http://kubernetes.io/docs/contribute/style/style-guide/)
* [Localizando documentação do Kubernetes](https://kubernetes.io/docs/contribute/localization/)
Você pode contatar os mantenedores da localização em Português em:
### Embaixadores para novos colaboradores
* Felipe ([GitHub - @femrtnz](https://github.com/femrtnz))
* [Slack channel](https://kubernetes.slack.com/messages/kubernetes-docs-pt)
Caso você precise de ajuda em algum momento ao contribuir, os [Embaixadores para novos colaboradores](https://kubernetes.io/docs/contribute/advanced/#serve-as-a-new-contributor-ambassador) são pontos de contato. São aprovadores do SIG Docs cujas responsabilidades incluem orientar e ajudar novos colaboradores em seus primeiros pull requests. O melhor canal para contato com embaixadores é o [Slack do Kubernetes](https://slack.k8s.io/). Atuais Embaixadores do SIG Docs:
| Nome | Slack | GitHub |
| -------------------------- | -------------------------- | -------------------------- |
| Arsh Sharma | @arsh | @RinkiyaKeDad |
## Traduções do `README.md`
| Idioma | Idioma |
| ------------------------- | -------------------------- |
| [Alemão](README-de.md) | [Italiano](README-it.md) |
| [Chinês](README-zh.md) | [Japonês](README-ja.md) |
| [Coreano](README-ko.md) | [Polonês](README-pl.md) |
| [Espanhol](README-es.md) | [Português](README-pt.md) |
| [Francês](README-fr.md) | [Russo](README-ru.md) |
| [Hindi](README-hi.md) | [Ucraniano](README-uk.md) |
| [Indonésio](README-id.md) | [Vietnamita](README-vi.md) |
Você pode contatar os mantenedores da localização em Português no canal do Slack [#kubernetes-docs-pt](https://kubernetes.slack.com/messages/kubernetes-docs-pt).
# Código de conduta

View File

@ -14,10 +14,10 @@ This repository contains the assets required to build the [Kubernetes website an
<!--
- [Contributing to the docs](#contributing-to-the-docs)
- [Localization ReadMes](#localization-readmemds)
- [Localization READMEs](#localization-readmemds)
-->
- [为文档做贡献](#为文档做贡献)
- [README.md 本地化](#readmemd-本地化)
- [README 本地化](#readme-本地化)
<!--
## Using this repository
@ -26,7 +26,8 @@ You can run the website locally using [Hugo (Extended version)](https://gohugo.i
-->
## 使用这个仓库
可以使用 [Hugo扩展版](https://gohugo.io/)在本地运行网站,也可以在容器中运行它。强烈建议使用容器,因为这样可以和在线网站的部署保持一致。
可以使用 [Hugo扩展版](https://gohugo.io/)在本地运行网站,也可以在容器中运行它。
强烈建议使用容器,因为这样可以和在线网站的部署保持一致。
<!--
## Prerequisites
@ -71,10 +72,11 @@ git submodule update --init --recursive --depth 1
```
-->
### Windows
```powershell
# 获取子模块依赖
git submodule update --init --recursive --depth 1
```
<!--
### Linux / other Unix
@ -84,10 +86,11 @@ make module-init
```
-->
### Linux / 其它 Unix
```bash
# 获取子模块依赖
make module-init
```
<!--
## Running the website using a container
@ -98,17 +101,23 @@ To build the site in a container, run the following:
要在容器中构建网站,请运行以下命令:
<!--
```bash
# You can set $CONTAINER_ENGINE to the name of any Docker-like container tool
make container-serve
```
-->
```bash
# 你可以将 $CONTAINER_ENGINE 设置为任何 Docker 类容器工具的名称
make container-serve
```
<!--
If you see errors, it probably means that the hugo container did not have enough computing resources available. To solve it, increase the amount of allowed CPU and memory usage for Docker on your machine ([MacOSX](https://docs.docker.com/docker-for-mac/#resources) and [Windows](https://docs.docker.com/docker-for-windows/#resources)).
If you see errors, it probably means that the hugo container did not have enough computing resources available. To solve it, increase the amount of allowed CPU and memory usage for Docker on your machine ([MacOS](https://docs.docker.com/desktop/settings/mac/) and [Windows](https://docs.docker.com/desktop/settings/windows/)).
-->
如果你看到错误,这可能意味着 Hugo 容器没有足够的可用计算资源。
要解决这个问题,请增加机器([MacOSX](https://docs.docker.com/docker-for-mac/#resources)
和 [Windows](https://docs.docker.com/docker-for-windows/#resources))上
要解决这个问题,请增加机器([MacOS](https://docs.docker.com/desktop/settings/mac/)
和 [Windows](https://docs.docker.com/desktop/settings/windows/))上
Docker 允许的 CPU 和内存使用量。
<!--
@ -120,22 +129,36 @@ Open up your browser to <http://localhost:1313> to view the website. As you make
<!--
## Running the website locally using Hugo
Make sure to install the Hugo extended version specified by the `HUGO_VERSION` environment variable in the [`netlify.toml`](netlify.toml#L10) file.
Make sure to install the Hugo extended version specified by the `HUGO_VERSION` environment variable in the [`netlify.toml`](netlify.toml#L11) file.
To build and test the site locally, run:
To install dependencies, deploy and test the site locally, run:
-->
## 在本地使用 Hugo 来运行网站
请确保安装的是 [`netlify.toml`](netlify.toml#L10) 文件中环境变量 `HUGO_VERSION` 所指定的
请确保安装的是 [`netlify.toml`](netlify.toml#L11) 文件中环境变量 `HUGO_VERSION` 所指定的
Hugo Extended 版本。
若要在本地构造和测试网站,请运行
若要在本地安装依赖,构建和测试网站,运行以下命令
```bash
# 安装依赖
npm ci
make serve
```
<!--
- For macOS and Linux
-->
- 对于 macOS 和 Linux
```bash
npm ci
make serve
```
<!--
- For Windows (PowerShell)
-->
- 对于 Windows (PowerShell)
```powershell
npm ci
hugo.exe server --buildFuture --environment development
```
<!--
This will start the local Hugo server on port 1313. Open up your browser to <http://localhost:1313> to view the website. As you make changes to the source files, Hugo updates the website and forces a browser refresh.
@ -154,7 +177,9 @@ The API reference pages located in `content/en/docs/reference/kubernetes-api` ar
To update the reference pages for a new Kubernetes release follow these steps:
-->
位于 `content/en/docs/reference/kubernetes-api` 的 API 参考页面是使用 <https://github.com/kubernetes-sigs/reference-docs/tree/master/gen-resourcesdocs> 根据 Swagger 规范(也称为 OpenAPI 规范)构建的。
位于 `content/en/docs/reference/kubernetes-api` 的 API 参考页面是使用
<https://github.com/kubernetes-sigs/reference-docs/tree/master/gen-resourcesdocs>
根据 Swagger 规范(也称为 OpenAPI 规范)构建的。
要更新 Kubernetes 新版本的参考页面,请执行以下步骤:
@ -193,7 +218,7 @@ To update the reference pages for a new Kubernetes release follow these steps:
<!--
You can test the results locally by making and serving the site from a container image:
-->
你可以通过从容器镜像创建和提供站点来在本地测试结果:
```bash
make container-image
@ -203,12 +228,13 @@ To update the reference pages for a new Kubernetes release follow these steps:
<!--
In a web browser, go to <http://localhost:1313/docs/reference/kubernetes-api/> to view the API reference.
-->
在 Web 浏览器中,打开 <http://localhost:1313/docs/reference/kubernetes-api/> 查看 API 参考。
在 Web 浏览器中,打开 <http://localhost:1313/docs/reference/kubernetes-api/> 查看 API 参考页面
<!--
5. When all changes of the new contract are reflected into the configuration files `toc.yaml` and `fields.yaml`, create a Pull Request with the newly generated API reference pages.
-->
5. 当所有新的更改都反映到配置文件 `toc.yaml` 和 `fields.yaml` 中时,使用新生成的 API 参考页面创建一个 Pull Request。
5. 当所有新的更改都反映到配置文件 `toc.yaml` 和 `fields.yaml` 中时,使用新生成的 API
参考页面创建一个 Pull Request。
<!--
## Troubleshooting
@ -252,10 +278,13 @@ Then run the following commands (adapted from <https://gist.github.com/tombigel/
-->
然后运行以下命令(参考 <https://gist.github.com/tombigel/d503800a282fcadbee14b537735d202c>
<!--
# These are the original gist links, linking to my gists now.
-->
```shell
#!/bin/sh
# 这些是原始的 gist 链接,立即链接到我的 gist
# curl -O https://gist.githubusercontent.com/a2ikm/761c2ab02b7b3935679e55af5d81786a/raw/ab644cb92f216c019a2f032bbf25e258b01d87f9/limit.maxfiles.plist
# curl -O https://gist.githubusercontent.com/a2ikm/761c2ab02b7b3935679e55af5d81786a/raw/ab644cb92f216c019a2f032bbf25e258b01d87f9/limit.maxproc.plist
@ -319,7 +348,7 @@ ARG HUGO_VERSION
将 "https://proxy.golang.org" 替换为本地可以使用的代理地址。
**注意:** 此部分仅适用于中国大陆
<!--
## Get involved with SIG Docs
@ -404,14 +433,14 @@ SIG Docs 的当前新贡献者大使:
| -------------------------- | -------------------------- | -------------------------- |
| Arsh Sharma | @arsh | @RinkiyaKeDad |
-->
| 姓名 | Slack | GitHub |
| -------------------------- | -------------------------- | -------------------------- |
| Arsh Sharma | @arsh | @RinkiyaKeDad |
<!--
## Localization `README.md`'s
## Localization READMEs
-->
## `README.md` 本地化
## README 本地化
<!--
| Language | Language |
@ -434,7 +463,7 @@ SIG Docs 的当前新贡献者大使:
| [意大利语](README-it.md) | [乌克兰语](README-uk.md) |
| [日语](README-ja.md) | [越南语](README-vi.md) |
# 中文本地化
## 中文本地化
可以通过以下方式联系中文本地化的维护人员:

View File

@ -5,7 +5,7 @@
This repository contains the assets required to build the [Kubernetes website and documentation](https://kubernetes.io/). We're glad that you want to contribute!
- [Contributing to the docs](#contributing-to-the-docs)
- [Localization ReadMes](#localization-readmemds)
- [Localization READMEs](#localization-readmemds)
## Using this repository
@ -50,7 +50,7 @@ To build the site in a container, run the following:
make container-serve
```
If you see errors, it probably means that the hugo container did not have enough computing resources available. To solve it, increase the amount of allowed CPU and memory usage for Docker on your machine ([MacOSX](https://docs.docker.com/docker-for-mac/#resources) and [Windows](https://docs.docker.com/docker-for-windows/#resources)).
If you see errors, it probably means that the hugo container did not have enough computing resources available. To solve it, increase the amount of allowed CPU and memory usage for Docker on your machine ([MacOS](https://docs.docker.com/desktop/settings/mac/) and [Windows](https://docs.docker.com/desktop/settings/windows/)).
Open up your browser to <http://localhost:1313> to view the website. As you make changes to the source files, Hugo updates the website and forces a browser refresh.
@ -58,13 +58,18 @@ Open up your browser to <http://localhost:1313> to view the website. As you make
Make sure to install the Hugo extended version specified by the `HUGO_VERSION` environment variable in the [`netlify.toml`](netlify.toml#L11) file.
To build and test the site locally, run:
To install dependencies, deploy and test the site locally, run:
```bash
# install dependencies
npm ci
make serve
```
- For macOS and Linux
```bash
npm ci
make serve
```
- For Windows (PowerShell)
```powershell
npm ci
hugo.exe server --buildFuture --environment development
```
This will start the local Hugo server on port 1313. Open up your browser to <http://localhost:1313> to view the website. As you make changes to the source files, Hugo updates the website and forces a browser refresh.
@ -182,7 +187,7 @@ If you need help at any point when contributing, the [New Contributor Ambassador
| -------------------------- | -------------------------- | -------------------------- |
| Arsh Sharma | @arsh | @RinkiyaKeDad |
## Localization `README.md`'s
## Localization READMEs
| Language | Language |
| -------------------------- | -------------------------- |

View File

@ -76,6 +76,11 @@ footer {
text-decoration: none;
font-size: 1rem;
border: 0px;
}
.button:hover {
background-color: darken($blue, 10%);
}
#cellophane {
@ -547,6 +552,12 @@ section#cncf {
padding: 20px 10px 20px 10px;
}
#desktopKCButton:hover{
background-color: #ffffff;
color: #3371e3;
transition: 150ms;
}
#desktopShowVideoButton {
position: relative;
font-size: 24px;
@ -566,6 +577,15 @@ section#cncf {
border-width: 10px 0 10px 20px;
border-color: transparent transparent transparent $blue;
}
&:hover::before {
border-color: transparent transparent transparent $dark-grey;
}
}
#desktopShowVideoButton:hover{
color: $dark-grey;
transition: 150ms;
}
#mobileShowVideoButton {

View File

@ -317,20 +317,63 @@ footer {
/* DOCS */
.launch-cards {
button {
cursor: pointer;
box-sizing: border-box;
background: none;
margin: 0;
border: 0;
}
padding: 0;
display: grid;
grid-template-columns: repeat(3, 1fr);
row-gap: 1em;
.launch-card {
display: flex;
padding: 0 30px 0 0;
.card-content{
width: fit-content;
display: flex;
flex-direction: column;
margin: 0;
row-gap: 1em;
h2 {
font-size: 1.75em;
padding: 0.5em 0;
margin: 0;
a {
display: none;
}
}
p {
margin: 0;
}
ul {
list-style: none;
height: fit-content;
line-height: 1.6;
padding: 0;
margin-block-end: auto;
}
br {
display: none;
}
button {
height: min-content;
width: auto;
padding: .5em 1em;
cursor: pointer;
box-sizing: border-box;
}
}
}
ul,
li {
list-style: none;
padding-left: 0;
}
@media only screen and (max-width: 1000px) {
grid-template-columns: 1fr;
.launch-card {
width: 100%;
}
}
}
// table of contents
.td-toc {
@ -349,52 +392,63 @@ footer {
}
main {
.td-content table code,
.td-content>table td {
word-break: break-word;
/* SCSS Related to the Metrics list */
div.metric:nth-of-type(odd) { // Look & Feel , Aesthetics
background-color: $light-grey;
}
/* SCSS Related to the Metrics Table */
div.metrics {
@media (max-width: 767px) { // for mobile devices, Display the names, Stability levels & types
table.metrics {
th:nth-child(n + 4),
td:nth-child(n + 4) {
.metric {
div:empty{
display: none;
}
td.metric_type{
min-width: 7em;
display: flex;
flex-direction: column;
flex-wrap: wrap;
gap: .75em;
padding:.75em .75em .75em .75em;
.metric_name{
font-size: large;
font-weight: bold;
word-break: break-word;
}
td.metric_stability_level{
min-width: 6em;
label{
font-weight: bold;
margin-right: .5em;
}
}
ul {
li:empty{
display: none;
}
display: flex;
flex-direction: column;
gap: .75em;
flex-wrap: wrap;
li.metric_labels_varying{
span{
display: inline-block;
background-color: rgb(240, 239, 239);
padding: 0 0.5em;
margin-right: .35em;
font-family: monospace;
border: 1px solid rgb(230 , 230 , 230);
border-radius: 5%;
margin-bottom: .35em;
}
}
}
}
table.metrics tbody{ // Tested dimensions to improve overall aesthetic of the table
tr {
td {
font-size: smaller;
}
td.metric_labels_varying{
min-width: 9em;
}
td.metric_type{
min-width: 9em;
}
td.metric_description{
min-width: 10em;
}
}
}
table.no-word-break td,
table.no-word-break code {
word-break: normal;
}
}
}
// blockquotes and callouts

View File

@ -76,7 +76,7 @@ Diese Container bilden eine einzelne zusammenhängende
Serviceeinheit, z. B. ein Container, der Daten in einem gemeinsam genutzten
Volume öffentlich verfügbar macht, während ein separater _Sidecar_-Container
die Daten aktualisiert. Der Pod fasst die Container, die Speicherressourcen
und eine kurzlebiges Netzwerk-Identität als eine Einheit zusammen.
und eine kurzlebige Netzwerk-Identität als eine Einheit zusammen.
{{< note >}}
Das Gruppieren mehrerer gemeinsam lokalisierter und gemeinsam verwalteter

View File

@ -14,8 +14,6 @@ card:
<body>
<link href="/docs/tutorials/kubernetes-basics/public/css/styles.css" rel="stylesheet">
<div class="layout" id="top">
<main class="content">

View File

@ -9,8 +9,6 @@ weight: 20
<body>
<link href="/docs/tutorials/kubernetes-basics/public/css/styles.css" rel="stylesheet">
<link href="/docs/tutorials/kubernetes-basics/public/css/overrides.css" rel="stylesheet">
<script src="https://katacoda.com/embed.js"></script>
<div class="layout" id="top">

View File

@ -9,8 +9,6 @@ weight: 10
<body>
<link href="/docs/tutorials/kubernetes-basics/public/css/styles.css" rel="stylesheet">
<div class="layout" id="top">
<main class="content">

View File

@ -9,8 +9,6 @@ weight: 20
<body>
<link href="/docs/tutorials/kubernetes-basics/public/css/styles.css" rel="stylesheet">
<link href="/docs/tutorials/kubernetes-basics/public/css/overrides.css" rel="stylesheet">
<script src="https://katacoda.com/embed.js"></script>
<div class="layout" id="top">

View File

@ -9,8 +9,6 @@ weight: 10
<body>
<link href="/docs/tutorials/kubernetes-basics/public/css/styles.css" rel="stylesheet">
<div class="layout" id="top">
<main class="content">

View File

@ -9,8 +9,6 @@ weight: 20
<body>
<link href="/docs/tutorials/kubernetes-basics/public/css/styles.css" rel="stylesheet">
<link href="/docs/tutorials/kubernetes-basics/public/css/overrides.css" rel="stylesheet">
<script src="https://katacoda.com/embed.js"></script>
<div class="layout" id="top">

View File

@ -9,9 +9,6 @@ weight: 10
<body>
<link href="/docs/tutorials/kubernetes-basics/public/css/styles.css" rel="stylesheet">
<div class="layout" id="top">
<main class="content">

View File

@ -9,8 +9,6 @@ weight: 20
<body>
<link href="/docs/tutorials/kubernetes-basics/public/css/styles.css" rel="stylesheet">
<link href="/docs/tutorials/kubernetes-basics/public/css/overrides.css" rel="stylesheet">
<script src="https://katacoda.com/embed.js"></script>
<div class="layout" id="top">

View File

@ -9,8 +9,6 @@ weight: 10
<body>
<link href="/docs/tutorials/kubernetes-basics/public/css/styles.css" rel="stylesheet">
<div class="layout" id="top">
<main class="content">

View File

@ -9,8 +9,6 @@ weight: 20
<body>
<link href="/docs/tutorials/kubernetes-basics/public/css/styles.css" rel="stylesheet">
<link href="/docs/tutorials/kubernetes-basics/public/css/overrides.css" rel="stylesheet">
<script src="https://katacoda.com/embed.js"></script>
<div class="layout" id="top">

View File

@ -9,8 +9,6 @@ weight: 10
<body>
<link href="/docs/tutorials/kubernetes-basics/public/css/styles.css" rel="stylesheet">
<div class="layout" id="top">
<main class="content">

View File

@ -9,8 +9,6 @@ weight: 20
<body>
<link href="/docs/tutorials/kubernetes-basics/public/css/styles.css" rel="stylesheet">
<link href="/docs/tutorials/kubernetes-basics/public/css/overrides.css" rel="stylesheet">
<script src="https://katacoda.com/embed.js"></script>
<div class="layout" id="top">

View File

@ -9,9 +9,6 @@ weight: 10
<body>
<link href="/docs/tutorials/kubernetes-basics/public/css/styles.css" rel="stylesheet">
<link href="https://fonts.googleapis.com/css?family=Roboto+Slab:300,400,700" rel="stylesheet">
<div class="layout" id="top">
<main class="content">

View File

@ -8,12 +8,11 @@ date: 2018-05-29
[**kustomize**]: https://github.com/kubernetes-sigs/kustomize
[hello world]: https://github.com/kubernetes-sigs/kustomize/blob/master/examples/helloWorld
[kustomization]: https://github.com/kubernetes-sigs/kustomize/blob/master/docs/glossary.md#kustomization
[mailing list]: https://groups.google.com/forum/#!forum/kustomize
[open an issue]: https://github.com/kubernetes-sigs/kustomize/issues/new
[subproject]: https://github.com/kubernetes/enhancements/blob/master/keps/sig-cli/0008-kustomize.md
[subproject]: https://github.com/kubernetes/enhancements/blob/master/keps/sig-cli/2377-Kustomize/README.md
[SIG-CLI]: https://github.com/kubernetes/community/tree/master/sig-cli
[workflow]: https://github.com/kubernetes-sigs/kustomize/blob/master/docs/workflows.md
[workflow]: https://github.com/kubernetes-sigs/kustomize/blob/1dd448e65c81aab9d09308b695691175ca6459cd/docs/workflows.md
If you run a Kubernetes environment, chances are you've
customized a Kubernetes configuration — you've copied

View File

@ -3,6 +3,13 @@ layout: blog
title: "Moving Forward From Beta"
date: 2020-08-21
slug: moving-forward-from-beta
# note to localizers: including this means you are marking
# the article as maintained. That should be fine, but if
# there is ever an update, you're committing to also updating
# the localized version.
# If unsure: omit this next field.
evergreen: true
---
**Author**: Tim Bannister, The Scale Factory
@ -12,7 +19,7 @@ In Kubernetes, features follow a defined
First, as the twinkle of an eye in an interested developer. Maybe, then,
sketched in online discussions, drawn on the online equivalent of a cafe
napkin. This rough work typically becomes a
[Kubernetes Enhancement Proposal](https://github.com/kubernetes/enhancements/blob/master/keps/0001-kubernetes-enhancement-proposal-process.md#kubernetes-enhancement-proposal-process) (KEP), and
[Kubernetes Enhancement Proposal](https://github.com/kubernetes/enhancements/blob/master/keps/sig-architecture/0000-kep-process/README.md#kubernetes-enhancement-proposal-process) (KEP), and
from there it usually turns into code.
For Kubernetes v1.20 and onwards, we're focusing on helping that code

View File

@ -42,7 +42,7 @@ controller container.
While this is not strictly true, to understand what was done here, it's good to understand how
Linux containers (and underlying mechanisms such as kernel namespaces) work.
You can read about cgroups in the Kubernetes glossary: [`cgroup`](https://kubernetes.io/docs/reference/glossary/?fundamental=true#term-cgroup) and learn more about cgroups interact with namespaces in the NGINX project article
You can read about cgroups in the Kubernetes glossary: [`cgroup`](/docs/reference/glossary/?fundamental=true#term-cgroup) and learn more about how cgroups interact with namespaces in the NGINX project article
[What Are Namespaces and cgroups, and How Do They Work?](https://www.nginx.com/blog/what-are-namespaces-cgroups-how-do-they-work/).
(As you read that, bear in mind that Linux kernel namespaces are a different thing from
[Kubernetes namespaces](/docs/concepts/overview/working-with-objects/namespaces/)).

View File

@ -41,7 +41,7 @@ gateways and service meshes and guides are available to start exploring quickly.
### Getting started
Gateway API is an official Kubernetes API like
[Ingress](https://kubernetes.io/docs/concepts/services-networking/ingress/).
[Ingress](/docs/concepts/services-networking/ingress/).
Gateway API represents a superset of Ingress functionality, enabling more
advanced concepts. Similar to Ingress, there is no default implementation of
Gateway API built into Kubernetes. Instead, there are many different

View File

@ -47,7 +47,7 @@ API.
Kubernetes 1.0 was released on 10 July 2015 without any mechanism to restrict the
security context and sensitive options of workloads, other than an alpha-quality
SecurityContextDeny admission plugin (then known as `scdeny`).
The [SecurityContextDeny plugin](https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#securitycontextdeny)
The [SecurityContextDeny plugin](/docs/reference/access-authn-authz/admission-controllers/#securitycontextdeny)
is still in Kubernetes today (as an alpha feature) and creates an admission controller that
prevents the usage of some fields in the security context.

View File

@ -169,7 +169,7 @@ JAMES LAVERACK: Not really. The cornerstone of a Kubernetes organization is the
**CRAIG BOX: Let's talk about some of the new features in 1.24. We have been hearing for many releases now about the impending doom which is the removal of Dockershim. [It is gone in 1.24](https://github.com/kubernetes/enhancements/issues/2221). Do we worry?**
JAMES LAVERACK: I don't think we worry. This is something that the community has been preparing for for a long time. [We've](https://kubernetes.io/blog/2022/01/07/kubernetes-is-moving-on-from-dockershim/) [published](https://kubernetes.io/blog/2022/02/17/dockershim-faq/) a [lot](https://kubernetes.io/blog/2021/11/12/are-you-ready-for-dockershim-removal/) of [documentation](https://kubernetes.io/blog/2022/03/31/ready-for-dockershim-removal/) [about](https://kubernetes.io/docs/tasks/administer-cluster/migrating-from-dockershim/check-if-dockershim-removal-affects-you/) [how](https://kubernetes.io/blog/2022/05/03/dockershim-historical-context/) you need to approach this. The honest truth is that most users, most application developers in Kubernetes, will simply not notice a difference or have to worry about it.
JAMES LAVERACK: I don't think we worry. This is something that the community has been preparing for for a long time. [We've](/blog/2022/01/07/kubernetes-is-moving-on-from-dockershim/) [published](/blog/2022/02/17/dockershim-faq/) a [lot](/blog/2021/11/12/are-you-ready-for-dockershim-removal/) of [documentation](/blog/2022/03/31/ready-for-dockershim-removal/) [about](/docs/tasks/administer-cluster/migrating-from-dockershim/check-if-dockershim-removal-affects-you/) [how](/blog/2022/05/03/dockershim-historical-context/) you need to approach this. The honest truth is that most users, most application developers in Kubernetes, will simply not notice a difference or have to worry about it.
It's only really platform teams that administer Kubernetes clusters and people in very specific circumstances that are using Docker directly, not through the Kubernetes API, that are going to experience any issue at all.
@ -203,7 +203,7 @@ JAMES LAVERACK: This is really about encouraging the use of stable APIs. There w
JAMES LAVERACK: That's correct. There's no breaking changes in beta APIs other than the ones we've documented this release. It's only new things.
**CRAIG BOX: Now in this release, [the artifacts are signed](https://github.com/kubernetes/enhancements/issues/3031) using Cosign signatures, and there is [experimental support for verification of those signatures](https://kubernetes.io/docs/tasks/administer-cluster/verify-signed-artifacts/). What needed to happen to make that process possible?**
**CRAIG BOX: Now in this release, [the artifacts are signed](https://github.com/kubernetes/enhancements/issues/3031) using Cosign signatures, and there is [experimental support for verification of those signatures](/docs/tasks/administer-cluster/verify-signed-artifacts/). What needed to happen to make that process possible?**
JAMES LAVERACK: This was a huge process from the other half of SIG Release. SIG Release has the release team, but it also has the release engineering team that handles the mechanics of actually pushing releases out. They have spent, and one of my friends over there, Adolfo, has spent a lot of time trying to bring us in line with [SLSA](https://slsa.dev/) compliance. I believe we're [looking now at Level 3 compliance](https://github.com/kubernetes/enhancements/issues/3027).
@ -251,7 +251,7 @@ With Kubernetes 1.24, we're enabling a beta feature that allows them to use gRPC
**CRAIG BOX: Are there any other enhancements that are particularly notable or relevant perhaps to the work you've been doing?**
JAMES LAVERACK: There's a really interesting one from SIG Network which is about [avoiding collisions in IP allocations to services](https://kubernetes.io/blog/2022/05/03/kubernetes-1-24-release-announcement/#avoiding-collisions-in-ip-allocation-to-services). In existing versions of Kubernetes, you can allocate a service to have a particular internal cluster IP, or you can leave it blank and it will generate its own IP.
JAMES LAVERACK: There's a really interesting one from SIG Network which is about [avoiding collisions in IP allocations to services](/blog/2022/05/03/kubernetes-1-24-release-announcement/#avoiding-collisions-in-ip-allocation-to-services). In existing versions of Kubernetes, you can allocate a service to have a particular internal cluster IP, or you can leave it blank and it will generate its own IP.
In Kubernetes 1.24, there's an opt-in feature, which allows you to specify a pool for dynamic IPs to be generated from. This means that you can statically allocate an IP to a service and know that IP can not be accidentally dynamically allocated. This is a problem I've actually had in my local Kubernetes cluster, where I use static IP addresses for a bunch of port forwarding rules. I've always worried that during server start-up, they're going to get dynamically allocated to one of the other services. Now, with 1.24, and this feature, I won't have to worry about it more.
@ -267,7 +267,7 @@ JAMES LAVERACK: That is a very deep question I don't think we have time for.
JAMES LAVERACK: [LAUGHING]
**CRAIG BOX: [The theme for Kubernetes 1.24 is Stargazer](https://kubernetes.io/blog/2022/05/03/kubernetes-1-24-release-announcement/#release-theme-and-logo). How did you pick that as the theme?**
**CRAIG BOX: [The theme for Kubernetes 1.24 is Stargazer](/blog/2022/05/03/kubernetes-1-24-release-announcement/#release-theme-and-logo). How did you pick that as the theme?**
JAMES LAVERACK: Every release lead gets to pick their theme, pretty much by themselves. When I started, I asked Rey, the previous release lead, how he picked his theme, because he picked the Next Frontier for Kubernetes 1.23. And he told me that he'd actually picked it before the release even started, which meant for the first couple of weeks and months of the release, I was really worried about it, because I hadn't picked one yet, and I wasn't sure what to pick.

View File

@ -18,7 +18,7 @@ In this SIG Storage spotlight, [Frederico Muñoz](https://twitter.com/fredericom
**Frederico (FSM)**: Hello, thank you for the opportunity of learning more about SIG Storage. Could you tell us a bit about yourself, your role, and how you got involved in SIG Storage.
**Xing Yang (XY)**: I am a Tech Lead at VMware, working on Cloud Native Storage. I am also a Co-Chair of SIG Storage. I started to get involved in K8s SIG Storage at the end of 2017, starting with contributing to the [VolumeSnapshot](https://kubernetes.io/docs/concepts/storage/volume-snapshots/) project. At that time, the VolumeSnapshot project was still in an experimental, pre-alpha stage. It needed contributors. So I volunteered to help. Then I worked with other community members to bring VolumeSnapshot to Alpha in K8s 1.12 release in 2018, Beta in K8s 1.17 in 2019, and eventually GA in 1.20 in 2020.
**Xing Yang (XY)**: I am a Tech Lead at VMware, working on Cloud Native Storage. I am also a Co-Chair of SIG Storage. I started to get involved in K8s SIG Storage at the end of 2017, starting with contributing to the [VolumeSnapshot](/docs/concepts/storage/volume-snapshots/) project. At that time, the VolumeSnapshot project was still in an experimental, pre-alpha stage. It needed contributors. So I volunteered to help. Then I worked with other community members to bring VolumeSnapshot to Alpha in K8s 1.12 release in 2018, Beta in K8s 1.17 in 2019, and eventually GA in 1.20 in 2020.
**FSM**: Reading the [SIG Storage charter](https://github.com/kubernetes/community/blob/master/sig-storage/charter.md) alone, it's clear that SIG Storage covers a lot of ground. Could you describe how the SIG is organised?
@ -34,7 +34,7 @@ We also have other regular meetings, i.e., CSI Implementation meeting, Object Bu
**XY**: In Kubernetes, there are multiple components involved for a volume operation. For example, creating a Pod to use a PVC has multiple components involved. There are the Attach Detach Controller and the external-attacher working on attaching the PVC to the pod. There's the Kubelet that works on mounting the PVC to the pod. Of course the CSI driver is involved as well. There could be race conditions sometimes when coordinating between multiple components.
Another challenge is regarding core vs [Custom Resource Definitions](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/) (CRD), not really storage specific. CRD is a great way to extend Kubernetes capabilities while not adding too much code to the Kubernetes core itself. However, this also means there are many external components that are needed when running a Kubernetes cluster.
Another challenge is regarding core vs [Custom Resource Definitions](/docs/concepts/extend-kubernetes/api-extension/custom-resources/) (CRD), not really storage specific. CRD is a great way to extend Kubernetes capabilities while not adding too much code to the Kubernetes core itself. However, this also means there are many external components that are needed when running a Kubernetes cluster.
From the SIG Storage side, the most notable example is Volume Snapshot. Volume Snapshot APIs are defined as CRDs. API definitions and controllers are out-of-tree. There is a common snapshot controller and a snapshot validation webhook that should be deployed on the control plane, similar to how kube-controller-manager is deployed. Although Volume Snapshot is a CRD, it is a core feature of SIG Storage. It is recommended for the K8s cluster distros to deploy Volume Snapshot CRDs, the snapshot controller, and the snapshot validation webhook; however, most of the time we don't see distros deploy them. So this becomes a problem for the storage vendors: now it becomes their responsibility to deploy these non-driver specific common components. This could cause conflicts if a customer wants to use more than one storage system and deploy more than one CSI driver.

View File

@ -37,7 +37,7 @@ PodSecurityPolicy was initially [deprecated in v1.21](/blog/2021/04/06/podsecuri
### Support for cgroups v2 Graduates to Stable
It has been more than two years since the Linux kernel cgroups v2 API was declared stable. With some distributions now defaulting to this API, Kubernetes must support it to continue operating on those distributions. cgroups v2 offers several improvements over cgroups v1, for more information see the [cgroups v2](https://kubernetes.io/docs/concepts/architecture/cgroups/) documentation. While cgroups v1 will continue to be supported, this enhancement puts us in a position to be ready for its eventual deprecation and replacement.
It has been more than two years since the Linux kernel cgroups v2 API was declared stable. With some distributions now defaulting to this API, Kubernetes must support it to continue operating on those distributions. cgroups v2 offers several improvements over cgroups v1; for more information, see the [cgroups v2](/docs/concepts/architecture/cgroups/) documentation. While cgroups v1 will continue to be supported, this enhancement puts us in a position to be ready for its eventual deprecation and replacement.
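As a quick sketch (not part of the release notes, and assuming a Linux node with GNU coreutils), you can check which cgroup version a node uses by inspecting the filesystem mounted at `/sys/fs/cgroup`:

```bash
# Prints "cgroup2fs" when the node uses cgroups v2, and "tmpfs" when it uses cgroups v1.
stat -fc %T /sys/fs/cgroup/
```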
### Improved Windows support
@ -53,11 +53,11 @@ It has been more than two years since the Linux kernel cgroups v2 API was declar
### Promoted SeccompDefault to Beta
SeccompDefault promoted to beta, see the tutorial [Restrict a Container's Syscalls with seccomp](https://kubernetes.io/docs/tutorials/security/seccomp/#enable-the-use-of-runtimedefault-as-the-default-seccomp-profile-for-all-workloads) for more details.
SeccompDefault was promoted to beta; see the tutorial [Restrict a Container's Syscalls with seccomp](/docs/tutorials/security/seccomp/#enable-the-use-of-runtimedefault-as-the-default-seccomp-profile-for-all-workloads) for more details.
### Promoted endPort in Network Policy to Stable
Promoted `endPort` in [Network Policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/#targeting-a-range-of-ports) to GA. Network Policy providers that support `endPort` field now can use it to specify a range of ports to apply a Network Policy. Previously, each Network Policy could only target a single port.
Promoted `endPort` in [Network Policy](/docs/concepts/services-networking/network-policies/#targeting-a-range-of-ports) to GA. Network Policy providers that support the `endPort` field can now use it to specify a range of ports to which a Network Policy applies. Previously, each Network Policy could only target a single port.
Please be aware that `endPort` field **must be supported** by the Network Policy provider. If your provider does not support `endPort`, and this field is specified in a Network Policy, the Network Policy will be created covering only the port field (single port).
@ -75,7 +75,7 @@ The [CSI Ephemeral Volume](https://github.com/kubernetes/enhancements/tree/maste
### Promoted CRD Validation Expression Language to Beta
[CRD Validation Expression Language](https://github.com/kubernetes/enhancements/blob/master/keps/sig-api-machinery/2876-crd-validation-expression-language/README.md) is promoted to beta, which makes it possible to declare how custom resources are validated using the [Common Expression Language (CEL)](https://github.com/google/cel-spec). Please see the [validation rules](https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#validation-rules) guide.
[CRD Validation Expression Language](https://github.com/kubernetes/enhancements/blob/master/keps/sig-api-machinery/2876-crd-validation-expression-language/README.md) is promoted to beta, which makes it possible to declare how custom resources are validated using the [Common Expression Language (CEL)](https://github.com/google/cel-spec). Please see the [validation rules](/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#validation-rules) guide.
### Promoted Server Side Unknown Field Validation to Beta
@ -83,7 +83,7 @@ Promoted the `ServerSideFieldValidation` feature gate to beta (on by default). T
### Introduced KMS v2 API
Introduce KMS v2alpha1 API to add performance, rotation, and observability improvements. Encrypt data at rest (ie Kubernetes `Secrets`) with DEK using AES-GCM instead of AES-CBC for kms data encryption. No user action is required. Reads with AES-GCM and AES-CBC will continue to be allowed. See the guide [Using a KMS provider for data encryption](https://kubernetes.io/docs/tasks/administer-cluster/kms-provider/) for more information.
Introduce KMS v2alpha1 API to add performance, rotation, and observability improvements. Encrypt data at rest (i.e. Kubernetes `Secrets`) with a DEK using AES-GCM instead of AES-CBC for KMS data encryption. No user action is required. Reads with AES-GCM and AES-CBC will continue to be allowed. See the guide [Using a KMS provider for data encryption](/docs/tasks/administer-cluster/kms-provider/) for more information.
### Kube-proxy images are now based on distroless images

View File

@ -10,11 +10,11 @@ slug: pod-security-admission-stable
The release of Kubernetes v1.25 marks a major milestone for Kubernetes out-of-the-box pod security
controls: Pod Security admission (PSA) graduated to stable, and Pod Security Policy (PSP) has been
removed.
[PSP was deprecated in Kubernetes v1.21](https://kubernetes.io/blog/2021/04/06/podsecuritypolicy-deprecation-past-present-and-future/),
[PSP was deprecated in Kubernetes v1.21](/blog/2021/04/06/podsecuritypolicy-deprecation-past-present-and-future/),
and no longer functions in Kubernetes v1.25 and later.
The Pod Security admission controller replaces PodSecurityPolicy, making it easier to enforce predefined
[Pod Security Standards](https://kubernetes.io/docs/concepts/security/pod-security-standards/) by
[Pod Security Standards](/docs/concepts/security/pod-security-standards/) by
simply adding a label to a namespace. The Pod Security Standards are maintained by the K8s
community, which means you automatically get updated security policies whenever new
security-impacting Kubernetes features are introduced.
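As a minimal, illustrative sketch (the namespace name is hypothetical), enforcing the `baseline` Pod Security Standard on a namespace looks like this:

```bash
# Every Pod created in my-namespace must now satisfy the "baseline" standard.
kubectl label --overwrite namespace my-namespace \
  pod-security.kubernetes.io/enforce=baseline
```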
@ -56,7 +56,7 @@ Warning: myjob-g342hj (and 6 other pods): host namespaces, allowPrivilegeEscalat
```
Additionally, when you apply a non-privileged label to a namespace that has been
[configured to be exempt](https://kubernetes.io/docs/concepts/security/pod-security-admission/#exemptions),
[configured to be exempt](/docs/concepts/security/pod-security-admission/#exemptions),
you will now get a warning alerting you to this fact:
```
@ -65,7 +65,7 @@ Warning: namespace 'kube-system' is exempt from Pod Security, and the policy (en
### Changes to the Pod Security Standards
The [Pod Security Standards](https://kubernetes.io/docs/concepts/security/pod-security-standards/),
The [Pod Security Standards](/docs/concepts/security/pod-security-standards/),
which Pod Security admission enforces, have been updated with support for the new Pod OS
field. In v1.25 and later, if you use the Restricted policy, the following Linux-specific restrictions will no
longer be required if you explicitly set the pod's `.spec.os.name` field to `windows`:
@ -76,14 +76,14 @@ longer be required if you explicitly set the pod's `.spec.os.name` field to `win
In Kubernetes v1.23 and earlier, the kubelet didn't enforce the Pod OS field.
If your cluster includes nodes running a v1.23 or older kubelet, you should explicitly
[pin Restricted policies](https://kubernetes.io/docs/concepts/security/pod-security-admission/#pod-security-admission-labels-for-namespaces)
[pin Restricted policies](/docs/concepts/security/pod-security-admission/#pod-security-admission-labels-for-namespaces)
to a version prior to v1.25.
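A hedged sketch of such pinning (the namespace name and pinned version are illustrative):

```bash
# Enforce the Restricted standard as it was defined in v1.24 for this namespace.
kubectl label --overwrite namespace my-namespace \
  pod-security.kubernetes.io/enforce=restricted \
  pod-security.kubernetes.io/enforce-version=v1.24
```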
## Migrating from PodSecurityPolicy to the Pod Security admission controller
For instructions to migrate from PodSecurityPolicy to the Pod Security admission controller, and
for help choosing a migration strategy, refer to the
[migration guide](https://kubernetes.io/docs/tasks/configure-pod-container/migrate-from-psp/).
[migration guide](/docs/tasks/configure-pod-container/migrate-from-psp/).
We're also developing a tool called
[pspmigrator](https://github.com/kubernetes-sigs/pspmigrator) to automate parts
of the migration process.

View File

@ -13,7 +13,7 @@ CSI Inline Volumes are similar to other ephemeral volume types, such as `configM
## What's new in 1.25?
There are a couple of new bug fixes related to this feature in 1.25, and the [CSIInlineVolume feature gate](https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/) has been locked to `True` with the graduation to GA. There are no new API changes, so users of this feature during beta should not notice any significant changes aside from these bug fixes.
There are a couple of new bug fixes related to this feature in 1.25, and the [CSIInlineVolume feature gate](/docs/reference/command-line-tools-reference/feature-gates/) has been locked to `True` with the graduation to GA. There are no new API changes, so users of this feature during beta should not notice any significant changes aside from these bug fixes.
- [#89290 - CSI inline volumes should support fsGroup](https://github.com/kubernetes/kubernetes/issues/89290)
- [#79980 - CSI volume reconstruction does not work for ephemeral volumes](https://github.com/kubernetes/kubernetes/issues/79980)
@ -95,8 +95,8 @@ Cluster administrators may choose to omit (or remove) `Ephemeral` from `volumeLi
For more information on this feature, see:
- [Kubernetes documentation](https://kubernetes.io/docs/concepts/storage/ephemeral-volumes/#csi-ephemeral-volumes)
- [Kubernetes documentation](/docs/concepts/storage/ephemeral-volumes/#csi-ephemeral-volumes)
- [CSI documentation](https://kubernetes-csi.github.io/docs/ephemeral-local-volumes.html)
- [KEP-596](https://github.com/kubernetes/enhancements/blob/master/keps/sig-storage/596-csi-inline-volumes/README.md)
- [Beta blog post for CSI Inline Volumes](https://kubernetes.io/blog/2020/01/21/csi-ephemeral-inline-volumes/)
- [Beta blog post for CSI Inline Volumes](/blog/2020/01/21/csi-ephemeral-inline-volumes/)

View File

@ -75,12 +75,10 @@ the CSI provisioner receives the credentials from the Secret as part of the Node
CSI volumes that require secrets for online expansion will have NodeExpandSecretRef
field set. If not set, the NodeExpandVolume CSI RPC call will be made without a secret.
## Trying it out
1. Enable the `CSINodeExpandSecret` feature gate (please refer to
[Feature Gates](https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/)).
[Feature Gates](/docs/reference/command-line-tools-reference/feature-gates/)).
1. Create a Secret, and then a StorageClass that uses that Secret.
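A minimal sketch of that second step (all names are hypothetical, and the exact Secret keys your CSI driver expects are driver-specific):

```bash
# Hypothetical Secret handed to the CSI driver during NodeExpandVolume.
kubectl create secret generic test-secret \
  --namespace default \
  --from-literal=username=admin \
  --from-literal=password=t0p-Secret
# The StorageClass then references it via the
# csi.storage.k8s.io/node-expand-secret-name and
# csi.storage.k8s.io/node-expand-secret-namespace parameters; see the
# feature documentation for the full StorageClass example.
```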

View File

@ -7,7 +7,7 @@ slug: crd-validation-rules-beta
**Authors:** Joe Betz (Google), Cici Huang (Google), Kermit Alexander (Google)
In Kubernetes 1.25, [Validation rules for CustomResourceDefinitions](https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#validation-rules) (CRDs) have graduated to Beta!
In Kubernetes 1.25, [Validation rules for CustomResourceDefinitions](/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#validation-rules) (CRDs) have graduated to Beta!
Validation rules make it possible to declare how custom resources are validated using the [Common Expression Language](https://github.com/google/cel-spec) (CEL). For example:

View File

@ -48,7 +48,7 @@ whenever this particular rule is not satisfied.
For more details about the capabilities and limitations of Validation Rules using
CEL, please refer to
[validation rules](https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#validation-rules).
[validation rules](/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#validation-rules).
The [CEL specification](https://github.com/google/cel-spec) is also a good
reference for information specifically related to the language.
@ -651,5 +651,5 @@ For native types, the same behavior can be achieved using kube-openapis marke
Usage of CEL within Kubernetes Validation Rules is so much more powerful than
what has been shown in this article. For more information please check out
[validation rules](https://kubernetes.io/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#validation-rules)
[validation rules](/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#validation-rules)
in the Kubernetes documentation and [CRD Validation Rules Beta](https://kubernetes.io/blog/2022/09/23/crd-validation-rules-beta/) blog post.

View File

@ -77,7 +77,7 @@ section](#ci-cd-systems), though!)
#### Controllers that use either a GET-modify-PUT sequence or a PATCH {#get-modify-put-patch-controllers}
This kind of controller GETs an object (possibly from a
[**watch**](https://kubernetes.io/docs/reference/using-api/api-concepts/#efficient-detection-of-changes)),
[**watch**](/docs/reference/using-api/api-concepts/#efficient-detection-of-changes)),
modifies it, and then PUTs it back to write its changes. Sometimes it constructs
a custom PATCH, but the semantics are the same. Most existing controllers
(especially those in-tree) work like this.
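As a rough illustration only (real controllers use client-go rather than the CLI), the same read-modify-write cycle expressed with kubectl might look like:

```bash
# GET the current object, edit it locally, then PUT the whole object back.
kubectl get configmap my-config -o json > /tmp/my-config.json   # GET
# ...modify /tmp/my-config.json with the tool of your choice...
kubectl replace -f /tmp/my-config.json                          # PUT
```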

View File

@ -189,7 +189,7 @@ set of annotations to control conflict resolution for CI/CD-related tooling.
On the other hand, non CI/CD-related controllers should ensure that they don't
cause unnecessary conflicts when modifying objects. As noted in
[the server-side apply documentation](https://kubernetes.io/docs/reference/using-api/server-side-apply/#using-server-side-apply-in-a-controller),
[the server-side apply documentation](/docs/reference/using-api/server-side-apply/#using-server-side-apply-in-a-controller),
it is strongly recommended for controllers to always perform force-applying. When
following this recommendation, controllers should really make sure that only
fields related to the controller are included in the applied object.

View File

@ -52,7 +52,7 @@ Setting the `--image-repository` flag.
kubeadm init --image-repository=k8s.gcr.io
```
Or in [kubeadm config](https://kubernetes.io/docs/reference/config-api/kubeadm-config.v1beta3/) `ClusterConfiguration`:
Or in [kubeadm config](/docs/reference/config-api/kubeadm-config.v1beta3/) `ClusterConfiguration`:
```yaml
apiVersion: kubeadm.k8s.io/v1beta3
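# (Illustrative sketch: the field that corresponds to the --image-repository flag;
# the rest of the ClusterConfiguration is omitted here.)
kind: ClusterConfiguration
imageRepository: k8s.gcr.io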

View File

@ -69,7 +69,7 @@ to be considered as an alpha level feature in CRI-O and Kubernetes and the
security implications are still under consideration.
Once containers and pods are running it is possible to create a checkpoint.
[Checkpointing](https://kubernetes.io/docs/reference/node/kubelet-checkpoint-api/)
[Checkpointing](/docs/reference/node/kubelet-checkpoint-api/)
is currently only exposed on the **kubelet** level. To checkpoint a container,
you can run `curl` on the node where that container is running, and trigger a
checkpoint:
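As a rough sketch of such a request (the namespace, Pod, and container names below are placeholders, and kubelet API authentication is omitted for brevity):

```shell
# POST to the kubelet checkpoint endpoint on the node that runs the container:
# /checkpoint/{namespace}/{pod}/{container}, served on the kubelet port (10250 by default)
curl -X POST "https://localhost:10250/checkpoint/default/my-pod/my-container"
```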

View File

@ -124,7 +124,7 @@ Introduced in Kubernetes v1.24, [this
feature](https://github.com/kubernetes/enhancements/issues/3031) constitutes a significant milestone
in improving the security of the Kubernetes release process. All release artifacts are signed
keyless using [cosign](https://github.com/sigstore/cosign/), and both binary artifacts and images
[can be verified](https://kubernetes.io/docs/tasks/administer-cluster/verify-signed-artifacts/).
[can be verified](/docs/tasks/administer-cluster/verify-signed-artifacts/).
### Support for Windows privileged containers graduates to stable
@ -223,8 +223,8 @@ This release includes a total of eleven enhancements promoted to Stable:
Kubernetes with this release.
* [CRI `v1alpha2` API is removed](https://github.com/kubernetes/kubernetes/pull/110618)
* [Removal of the `v1beta1` flow control API group](https://kubernetes.io/docs/reference/using-api/deprecation-guide/#flowcontrol-resources-v126)
* [Removal of the `v2beta2` HorizontalPodAutoscaler API](https://kubernetes.io/docs/reference/using-api/deprecation-guide/#horizontalpodautoscaler-v126)
* [Removal of the `v1beta1` flow control API group](/docs/reference/using-api/deprecation-guide/#flowcontrol-resources-v126)
* [Removal of the `v2beta2` HorizontalPodAutoscaler API](/docs/reference/using-api/deprecation-guide/#horizontalpodautoscaler-v126)
* [GlusterFS plugin removed from available in-tree drivers](https://github.com/kubernetes/enhancements/issues/3446)
* [Removal of legacy command line arguments relating to logging](https://github.com/kubernetes/kubernetes/pull/112120)
* [Removal of `kube-proxy` userspace modes](https://github.com/kubernetes/kubernetes/pull/112133)

View File

@ -5,7 +5,7 @@ date: 2022-12-15
slug: dynamic-resource-allocation
---
**Authors:** Patrick Ohly (Intel), Kevin Klues (NVIDIA)
**Authors:** Patrick Ohly (Intel), Kevin Klues (NVIDIA)
Dynamic resource allocation is a new API for requesting resources. It is a
generalization of the persistent volumes API for generic resources, making it possible to:
@ -19,11 +19,11 @@ Third-party resource drivers are responsible for interpreting these parameters
as well as tracking and allocating resources as requests come in.
Dynamic resource allocation is an *alpha feature* and only enabled when the
`DynamicResourceAllocation` [feature
gate](/docs/reference/command-line-tools-reference/feature-gates/) and the
`resource.k8s.io/v1alpha1` {{< glossary_tooltip text="API group"
term_id="api-group" >}} are enabled. For details, see the
`--feature-gates` and `--runtime-config` [kube-apiserver
`DynamicResourceAllocation`
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/) and the
`resource.k8s.io/v1alpha1`
{{< glossary_tooltip text="API group" term_id="api-group" >}} are enabled. For details,
see the `--feature-gates` and `--runtime-config` [kube-apiserver
parameters](/docs/reference/command-line-tools-reference/kube-apiserver/).
The kube-scheduler, kube-controller-manager and kubelet components all need
the feature gate enabled as well.
@ -39,8 +39,8 @@ for end-to-end testing, but also can be run manually. See
## API
The new `resource.k8s.io/v1alpha1` {{< glossary_tooltip text="API group"
term_id="api-group" >}} provides four new types:
The new `resource.k8s.io/v1alpha1` {{< glossary_tooltip text="API group" term_id="api-group" >}}
provides four new types:
ResourceClass
: Defines which resource driver handles a certain kind of
@ -77,7 +77,7 @@ this `.spec` (for example, inside a Deployment or StatefulSet) share the same
ResourceClaim instance. When referencing a ResourceClaimTemplate, each Pod gets
its own ResourceClaim instance.
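As a rough sketch (the claim, template, and image names below are made up, following the v1alpha1 API described in this post), a Pod that references a ResourceClaimTemplate and grants one of its containers access to the resulting claim could look like:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-pod
spec:
  resourceClaims:
  - name: gpu                          # Pod-level claim, created from the template below
    source:
      resourceClaimTemplateName: gpu-claim-template
  containers:
  - name: workload
    image: registry.example/workload:1.0
    resources:
      claims:
      - name: gpu                      # grants this container access to the claim
```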
For a container defined within a Pod, the `resources.claims` list
For a container defined within a Pod, the `resources.claims` list
defines whether that container gets
access to these resource instances, which makes it possible to share resources
between one or more containers inside the same Pod. For example, an init container could
@ -89,7 +89,7 @@ will get created for this Pod and each container gets access to one of them.
Assuming a resource driver called `resource-driver.example.com` was installed
together with the following resource class:
```
```yaml
apiVersion: resource.k8s.io/v1alpha1
kind: ResourceClass
name: resource.example.com
@ -151,8 +151,7 @@ spec:
In contrast to native resources (such as CPU or RAM) and
[extended resources](/docs/concepts/configuration/manage-resources-containers/#extended-resources)
(managed by a
device plugin, advertised by kubelet), the scheduler has no knowledge of what
(managed by a device plugin, advertised by kubelet), the scheduler has no knowledge of what
dynamic resources are available in a cluster or how they could be split up to
satisfy the requirements of a specific ResourceClaim. Resource drivers are
responsible for that. Drivers mark ResourceClaims as _allocated_ once resources
@ -227,8 +226,8 @@ It is up to the driver developer to decide how these two components
communicate. The [KEP](https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/3063-dynamic-resource-allocation/README.md) outlines an [approach using
CRDs](https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/3063-dynamic-resource-allocation#implementing-a-plugin-for-node-resources).
Within SIG Node, we also plan to provide a complete [example
driver](https://github.com/kubernetes-sigs/dra-example-driver) that can serve
Within SIG Node, we also plan to provide a complete
[example driver](https://github.com/kubernetes-sigs/dra-example-driver) that can serve
as a template for other drivers.
## Running the test driver
@ -236,7 +235,7 @@ as a template for other drivers.
The following steps bring up a local, one-node cluster directly from the
Kubernetes source code. As a prerequisite, your cluster must have nodes with a container
runtime that supports the
[Container Device Interface](https://github.com/container-orchestrated-devices/container-device-interface)
[Container Device Interface](https://github.com/container-orchestrated-devices/container-device-interface)
(CDI). For example, you can run CRI-O [v1.23.2](https://github.com/cri-o/cri-o/releases/tag/v1.23.2) or later.
Once containerd v1.7.0 is released, we expect that you can run that or any later version.
In the example below, we use CRI-O.
@ -259,15 +258,16 @@ $ RUNTIME_CONFIG=resource.k8s.io/v1alpha1 \
PATH=$(pwd)/third_party/etcd:$PATH \
./hack/local-up-cluster.sh -O
...
To start using your cluster, you can open up another terminal/tab and run:
export KUBECONFIG=/var/run/kubernetes/admin.kubeconfig
...
```
Once the cluster is up, in another
terminal run the test driver controller. `KUBECONFIG` must be set for all of
the following commands.
To start using your cluster, you can open up another terminal/tab and run:
```console
$ export KUBECONFIG=/var/run/kubernetes/admin.kubeconfig
```
Once the cluster is up, in another terminal run the test driver controller.
`KUBECONFIG` must be set for all of the following commands.
```console
$ go run ./test/e2e/dra/test-driver --feature-gates ContextualLogging=true -v=5 controller
@ -319,7 +319,7 @@ user_a='b'
## Next steps
- See the
[Dynamic Resource Allocation](https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/3063-dynamic-resource-allocation/README.md)
[Dynamic Resource Allocation](https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/3063-dynamic-resource-allocation/README.md)
KEP for more information on the design.
- Read [Dynamic Resource Allocation](/docs/concepts/scheduling-eviction/dynamic-resource-allocation/)
in the official Kubernetes documentation.
@ -328,6 +328,6 @@ user_a='b'
and / or the [CNCF Container Orchestrated Device Working Group](https://github.com/cncf/tag-runtime/blob/master/wg/COD.md).
- You can view or comment on the [project board](https://github.com/orgs/kubernetes/projects/95/views/1)
for dynamic resource allocation.
- In order to move this feature towards beta, we need feedback from hardware
vendors, so here's a call to action: try out this feature, consider how it can help
with problems that your users are having, and write resource drivers…
- In order to move this feature towards beta, we need feedback from hardware
vendors, so here's a call to action: try out this feature, consider how it can help
with problems that your users are having, and write resource drivers…

View File

@ -118,7 +118,7 @@ Below is an overview of how the Kubernetes project is using kubelet credential p
{{< figure src="kubelet-credential-providers-enabling.png" caption="Figure 4: Kubelet credential provider configuration used for Kubernetes e2e testing" >}}
For more configuration details, see [Kubelet Credential Providers](https://kubernetes.io/docs/tasks/kubelet-credential-provider/kubelet-credential-provider/).
For more configuration details, see [Kubelet Credential Providers](/docs/tasks/kubelet-credential-provider/kubelet-credential-provider/).
## Getting Involved

View File

@ -123,6 +123,6 @@ and scheduler. You're more than welcome to test it out and tell us (SIG Scheduli
## Additional resources
- [Pod Scheduling Readiness](https://kubernetes.io/docs/concepts/scheduling-eviction/pod-scheduling-readiness/)
- [Pod Scheduling Readiness](/docs/concepts/scheduling-eviction/pod-scheduling-readiness/)
in the Kubernetes documentation
- [Kubernetes Enhancement Proposal](https://github.com/kubernetes/enhancements/blob/master/keps/sig-scheduling/3521-pod-scheduling-readiness/README.md)

View File

@ -17,7 +17,7 @@ give application owners greater flexibility in managing disruptions.
## What problems does this solve?
API-initiated eviction of pods respects PodDisruptionBudgets (PDBs). This means that a requested [voluntary disruption](https://kubernetes.io/docs/concepts/scheduling-eviction/#pod-disruption)
API-initiated eviction of pods respects PodDisruptionBudgets (PDBs). This means that a requested [voluntary disruption](/docs/concepts/scheduling-eviction/#pod-disruption)
via an eviction to a Pod should not disrupt a guarded application and `.status.currentHealthy` of a PDB should not fall
below `.status.desiredHealthy`. Running pods that are [Unhealthy](/docs/tasks/run-application/configure-pdb/#healthiness-of-a-pod)
do not count towards the PDB status, but eviction of these is only possible in case the application

View File

@ -92,7 +92,7 @@ we can see that there are a relevant number of metrics, logs, and tracing
[KEPs](https://www.k8s.dev/resources/keps/) in the pipeline. Would you like to
point out important things for last release (maybe alpha & stable milestone candidates?)
**Han (HK)**: We can now generate [documentation](https://kubernetes.io/docs/reference/instrumentation/metrics/)
**Han (HK)**: We can now generate [documentation](/docs/reference/instrumentation/metrics/)
for every single metric in the main Kubernetes code base! We have a pretty fancy
static analysis pipeline that enables this functionality. Weve also added feature
metrics so that you can look at your metrics to determine which features are enabled

View File

@ -123,8 +123,8 @@ repository](https://github.com/aws/aws-eks-best-practices/tree/master/policies/k
that will block them from being pulled. You can use these third-party policies with any Kubernetes
cluster.
**Option 5**: As a **LAST** possible option, you can use a [Mutating
Admission Webhook](https://kubernetes.io/docs/reference/access-authn-authz/extensible-admission-controllers/#what-are-admission-webhooks)
**Option 5**: As a **LAST** possible option, you can use a
[Mutating Admission Webhook](/docs/reference/access-authn-authz/extensible-admission-controllers/#what-are-admission-webhooks)
to change the image address dynamically. This should only be
considered a stopgap till your manifests have been updated. You can
find a (third party) Mutating Webhook and Kyverno policy in

View File

@ -37,7 +37,7 @@ the information about this change and what to do if it impacts you.
## The Kubernetes API Removal and Deprecation process
The Kubernetes project has a well-documented
[deprecation policy](https://kubernetes.io/docs/reference/using-api/deprecation-policy/)
[deprecation policy](/docs/reference/using-api/deprecation-policy/)
for features. This policy states that stable APIs may only be deprecated when
a newer, stable version of that same API is available and that APIs have a
minimum lifetime for each stability level. A deprecated API has been marked
@ -214,7 +214,7 @@ that argument, which has been deprecated since the v1.24 release.
## Looking ahead
The official list of
[API removals](https://kubernetes.io/docs/reference/using-api/deprecation-guide/#v1-29)
[API removals](/docs/reference/using-api/deprecation-guide/#v1-29)
planned for Kubernetes v1.29 includes:
- The `flowcontrol.apiserver.k8s.io/v1beta2` API version of FlowSchema and

View File

@ -177,7 +177,7 @@ The complete details of the Kubernetes v1.27 release are available in our [relea
## Availability
Kubernetes v1.27 is available for download on [GitHub](https://github.com/kubernetes/kubernetes/releases/tag/v1.27.0). To get started with Kubernetes, you can run local Kubernetes clusters using [minikube](https://minikube.sigs.k8s.io/docs/), [kind](https://kind.sigs.k8s.io/), etc. You can also easily install v1.27 using [kubeadm](https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/).
Kubernetes v1.27 is available for download on [GitHub](https://github.com/kubernetes/kubernetes/releases/tag/v1.27.0). To get started with Kubernetes, you can run local Kubernetes clusters using [minikube](https://minikube.sigs.k8s.io/docs/), [kind](https://kind.sigs.k8s.io/), etc. You can also easily install v1.27 using [kubeadm](/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/).
## Release team

View File

@ -55,7 +55,7 @@ the issue while being more transparent and less disruptive to end-users.
## What's next?
In preparation to [graduate](https://kubernetes.io/docs/reference/command-line-tools-reference/feature-gates/#feature-stages) the feed
In preparation to [graduate](/docs/reference/command-line-tools-reference/feature-gates/#feature-stages) the feed
to stable, i.e. the `General Availability` stage, SIG Security is still gathering feedback from end users who are using the updated beta feed.
To help us continue to improve the feed in future Kubernetes Releases please share feedback by adding a comment to

View File

@ -10,18 +10,19 @@ slug: qos-memory-resources
Kubernetes v1.27, released in April 2023, introduced changes to
Memory QoS (alpha) to improve memory management capabilities in Linux nodes.
Support for Memory QoS was initially added in Kubernetes v1.22, and later some
[limitations](https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/2570-memory-qos#reasons-for-changing-the-formula-of-memoryhigh-calculation-in-alpha-v127)
around the formula for calculating `memory.high` were identified. These limitations are
Support for Memory QoS was initially added in Kubernetes v1.22, and later some
[limitations](https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/2570-memory-qos#reasons-for-changing-the-formula-of-memoryhigh-calculation-in-alpha-v127)
around the formula for calculating `memory.high` were identified. These limitations are
addressed in Kubernetes v1.27.
## Background
Kubernetes allows you to optionally specify how much of each resources a container needs
in the Pod specification. The most common resources to specify are CPU and Memory.
in the Pod specification. The most common resources to specify are CPU and Memory.
For example, a Pod manifest that defines container resource requirements could look like:
```
```yaml
apiVersion: v1
kind: Pod
metadata:
@ -40,19 +41,19 @@ spec:
* `spec.containers[].resources.requests`
When you specify the resource request for containers in a Pod, the
When you specify the resource request for containers in a Pod, the
[Kubernetes scheduler](/docs/concepts/scheduling-eviction/kube-scheduler/#kube-scheduler)
uses this information to decide which node to place the Pod on. The scheduler
ensures that for each resource type, the sum of the resource requests of the
ensures that for each resource type, the sum of the resource requests of the
scheduled containers is less than the total allocatable resources on the node.
* `spec.containers[].resources.limits`
When you specify the resource limit for containers in a Pod, the kubelet enforces
those limits so that the running containers are not allowed to use more of those
When you specify the resource limit for containers in a Pod, the kubelet enforces
those limits so that the running containers are not allowed to use more of those
resources than the limits you set.
When the kubelet starts a container as a part of a Pod, kubelet passes the
When the kubelet starts a container as a part of a Pod, kubelet passes the
container's requests and limits for CPU and memory to the container runtime.
The container runtime assigns both CPU request and CPU limit to a container.
Provided the system has free CPU time, the containers are guaranteed to be
@ -61,9 +62,9 @@ the configured limit i.e. containers CPU usage will be throttled if they
use more CPU than the specified limit within a given time slice.
Prior to Memory QoS feature, the container runtime only used the memory
limit and discarded the memory `request` (requests were, and still are,
limit and discarded the memory `request` (requests were, and still are,
also used to influence [scheduling](/docs/concepts/scheduling-eviction/#scheduling)).
If a container uses more memory than the configured limit,
If a container uses more memory than the configured limit,
the Linux Out Of Memory (OOM) killer will be invoked.
Let's compare how the container runtime on Linux typically configures memory
@ -71,26 +72,26 @@ request and limit in cgroups, with and without Memory QoS feature:
* **Memory request**
The memory request is mainly used by kube-scheduler during (Kubernetes) Pod
The memory request is mainly used by kube-scheduler during (Kubernetes) Pod
scheduling. In cgroups v1, there are no controls to specify the minimum amount
of memory the cgroups must always retain. Hence, the container runtime did not
use the value of requested memory set in the Pod spec.
cgroups v2 introduced a `memory.min` setting, used to specify the minimum
cgroups v2 introduced a `memory.min` setting, used to specify the minimum
amount of memory that should remain available to the processes within
a given cgroup. If the memory usage of a cgroup is within its effective
min boundary, the cgroups memory wont be reclaimed under any conditions.
If the kernel cannot maintain at least `memory.min` bytes of memory for the
If the kernel cannot maintain at least `memory.min` bytes of memory for the
processes within the cgroup, the kernel invokes its OOM killer. In other words,
the kernel guarantees at least this much memory is available or terminates
the kernel guarantees at least this much memory is available or terminates
processes (which may be outside the cgroup) in order to make memory more available.
Memory QoS maps `memory.min` to `spec.containers[].resources.requests.memory`
to ensure the availability of memory for containers in Kubernetes Pods.
to ensure the availability of memory for containers in Kubernetes Pods.
* **Memory limit**
The `memory.limit` specifies the memory limit, beyond which if the container tries
to allocate more memory, Linux kernel will terminate a process with an
to allocate more memory, Linux kernel will terminate a process with an
OOM (Out of Memory) kill. If the terminated process was the main (or only) process
inside the container, the container may exit.
@ -103,7 +104,7 @@ request and limit in cgroups, with and without Memory QoS feature:
specify the hard limit for memory usage. If the memory consumption goes above this
level, the kernel invokes its OOM Killer.
cgroups v2 also added `memory.high` configuration . Memory QoS uses `memory.high`
cgroups v2 also added `memory.high` configuration. Memory QoS uses `memory.high`
to set memory usage throttle limit. If the `memory.high` limit is breached,
the offending cgroups are throttled, and the kernel tries to reclaim memory
which may avoid an OOM kill.
@ -113,40 +114,49 @@ request and limit in cgroups, with and without Memory QoS feature:
### Cgroups v2 memory controller interfaces & Kubernetes container resources mapping
Memory QoS uses the memory controller of cgroups v2 to guarantee memory resources in
Kubernetes. cgroupv2 interfaces that this feature uses are:
Kubernetes. cgroupv2 interfaces that this feature uses are:
* `memory.max`
* `memory.min`
* `memory.high`.
{{< figure src="/blog/2023/05/05/qos-memory-resources/memory-qos-cal.svg" title="Memory QoS Levels" alt="Memory QoS Levels" >}}
`memory.max` is mapped to `limits.memory` specified in the Pod spec. The kubelet and
the container runtime configure the limit in the respective cgroup. The kernel
`memory.max` is mapped to `limits.memory` specified in the Pod spec. The kubelet and
the container runtime configure the limit in the respective cgroup. The kernel
enforces the limit to prevent the container from using more than the configured
resource limit. If a process in a container tries to consume more than the
specified limit, kernel terminates a process(es) with an out of
memory Out of Memory (OOM) error.
resource limit. If a process in a container tries to consume more than the
specified limit, kernel terminates a process(es) with an Out of Memory (OOM) error.
{{< figure src="/blog/2023/05/05/qos-memory-resources/container-memory-max.svg" title="memory.max maps to limits.memory" alt="memory.max maps to limits.memory" >}}
```formula
memory.max = pod.spec.containers[i].resources.limits[memory]
```
`memory.min` is mapped to `requests.memory`, which results in reservation of memory resources
that should never be reclaimed by the kernel. This is how Memory QoS ensures the availability of
memory for Kubernetes pods. If there's no unprotected reclaimable memory available, the OOM
that should never be reclaimed by the kernel. This is how Memory QoS ensures the availability of
memory for Kubernetes pods. If there's no unprotected reclaimable memory available, the OOM
killer is invoked to make more memory available.
{{< figure src="/blog/2023/05/05/qos-memory-resources/container-memory-min.svg" title="memory.min maps to requests.memory" alt="memory.min maps to requests.memory" >}}
```formula
memory.min = pod.spec.containers[i].resources.requests[memory]
```
For memory protection, in addition to the original way of limiting memory usage, Memory QoS
throttles workload approaching its memory limit, ensuring that the system is not overwhelmed
by sporadic increases in memory usage. A new field, `memoryThrottlingFactor`, is available in
the KubeletConfiguration when you enable MemoryQoS feature. It is set to 0.9 by default.
the KubeletConfiguration when you enable MemoryQoS feature. It is set to 0.9 by default.
`memory.high` is mapped to throttling limit calculated by using `memoryThrottlingFactor`,
`requests.memory` and `limits.memory` as in the formula below, and rounding down the
`requests.memory` and `limits.memory` as in the formula below, and rounding down the
value to the nearest page size:
{{< figure src="/blog/2023/05/05/qos-memory-resources/container-memory-high.svg" title="memory.high formula" alt="memory.high formula" >}}
```formula
memory.high = pod.spec.containers[i].resources.requests[memory] + MemoryThrottlingFactor *
{(pod.spec.containers[i].resources.limits[memory] or NodeAllocatableMemory) - pod.spec.containers[i].resources.requests[memory]}
```
**Note**: If a container has no memory limits specified, `limits.memory` is substituted for node allocatable memory.
{{< note >}}
If a container has no memory limits specified, `limits.memory` is substituted for node allocatable memory.
{{< /note >}}
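As a quick worked example (the numbers are illustrative and assume the default `memoryThrottlingFactor` of 0.9):

```formula
requests.memory = 1000Mi, limits.memory = 2000Mi, memoryThrottlingFactor = 0.9
memory.high = 1000Mi + 0.9 * (2000Mi - 1000Mi) = 1900Mi (rounded down to the nearest page size)
```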
**Summary:**
<table>
@ -158,8 +168,8 @@ value to the nearest page size:
<td>memory.max</td>
<td><code>memory.max</code> specifies the maximum memory limit,
a container is allowed to use. If a process within the container
tries to consume more memory than the configured limit,
the kernel terminates the process with an Out of Memory (OOM) error.
tries to consume more memory than the configured limit,
the kernel terminates the process with an Out of Memory (OOM) error.
<br>
<br>
<i>It is mapped to the container's memory limit specified in Pod manifest.</i>
@ -167,7 +177,7 @@ value to the nearest page size:
</tr>
<tr>
<td>memory.min</td>
<td><code>memory.min</code> specifies a minimum amount of memory
<td><code>memory.min</code> specifies a minimum amount of memory
the cgroups must always retain, i.e., memory that should never be
reclaimed by the system.
If there's no unprotected reclaimable memory available, OOM kill is invoked.
@ -178,8 +188,8 @@ value to the nearest page size:
</tr>
<tr>
<td>memory.high</td>
<td><code>memory.high</code> specifies the memory usage throttle limit.
This is the main mechanism to control a cgroup's memory use. If
<td><code>memory.high</code> specifies the memory usage throttle limit.
This is the main mechanism to control a cgroup's memory use. If
cgroups memory use goes over the high boundary specified here,
the cgroups processes are throttled and put under heavy reclaim pressure.
<br>
@ -193,66 +203,79 @@ value to the nearest page size:
</tr>
</table>
**Note** `memory.high` is set only on container level cgroups while `memory.min` is set on
{{< note >}}
`memory.high` is set only on container level cgroups while `memory.min` is set on
container, pod, and node level cgroups.
{{< /note >}}
### `memory.min` calculations for cgroups hierarchy
When container memory requests are made, kubelet passes `memory.min` to the back-end
CRI runtime (such as containerd or CRI-O) via the `Unified` field in CRI during
container creation. The `memory.min` in container level cgroups will be set to:
container creation. For every i<sup>th</sup> container in a pod, the `memory.min`
in container level cgroups will be set to:
$memory.min = pod.spec.containers[i].resources.requests[memory]$
<sub>for every i<sup>th</sup> container in a pod</sub>
<br>
<br>
Since the `memory.min` interface requires that the ancestor cgroups directories are all
set, the pod and node cgroups directories need to be set correctly.
```formula
memory.min = pod.spec.containers[i].resources.requests[memory]
```
`memory.min` in pod level cgroup:
$memory.min = \sum_{i=0}^{no. of pods}pod.spec.containers[i].resources.requests[memory]$
<sub>for every i<sup>th</sup> container in a pod</sub>
<br>
<br>
`memory.min` in node level cgroup:
$memory.min = \sum_{i}^{no. of nodes}\sum_{j}^{no. of pods}pod[i].spec.containers[j].resources.requests[memory]$
<sub>for every j<sup>th</sup> container in every i<sup>th</sup> pod on a node</sub>
<br>
<br>
Kubelet will manage the cgroups hierarchy of the pod level and node level cgroups
Since the `memory.min` interface requires that the ancestor cgroups directories are all
set, the pod and node cgroups directories need to be set correctly.
For every i<sup>th</sup> container in a pod, `memory.min` in pod level cgroup:
```formula
memory.min = \sum_{i=0}^{no. of pods}pod.spec.containers[i].resources.requests[memory]
```
For every j<sup>th</sup> container in every i<sup>th</sup> pod on a node, `memory.min` in node level cgroup:
```formula
memory.min = \sum_{i}^{no. of nodes}\sum_{j}^{no. of pods}pod[i].spec.containers[j].resources.requests[memory]
```
Kubelet will manage the cgroups hierarchy of the pod level and node level cgroups
directly using the libcontainer library (from the runc project), while container
cgroups limits are managed by the container runtime.
### Support for Pod QoS classes
Based on user feedback for the Alpha feature in Kubernetes v1.22, some users would like
Based on user feedback for the Alpha feature in Kubernetes v1.22, some users would like
to opt out of MemoryQoS on a per-pod basis to ensure there is no early memory throttling.
Therefore, in Kubernetes v1.27 Memory QOS also supports memory.high to be set as per
Therefore, in Kubernetes v1.27 Memory QOS also supports memory.high to be set as per
Quality of Service (QoS) for Pod classes. Following are the different cases for memory.high
as per QoS classes:
1. **Guaranteed pods** by their QoS definition require memory requests=memory limits and are
not overcommitted. Hence MemoryQoS feature is disabled on those pods by not setting
memory.high. This ensures that Guaranteed pods can fully use their memory requests up
to their set limit, and not hit any throttling.
1. **Guaranteed pods** by their QoS definition require memory requests=memory limits and are
not overcommitted. Hence MemoryQoS feature is disabled on those pods by not setting
memory.high. This ensures that Guaranteed pods can fully use their memory requests up
to their set limit, and not hit any throttling.
2. **Burstable pods** by their QoS definition require at least one container in the Pod with
CPU or memory request or limit set.
1. **Burstable pods** by their QoS definition require at least one container in the Pod with
CPU or memory request or limit set.
* When requests.memory and limits.memory are set, the formula is used as-is:
{{< figure src="/blog/2023/05/05/qos-memory-resources/container-memory-high-limit.svg" title="memory.high when requests and limits are set" alt="memory.high when requests and limits are set" >}}
```formula
memory.high = pod.spec.containers[i].resources.requests[memory] + MemoryThrottlingFactor *
{(pod.spec.containers[i].resources.limits[memory]) - pod.spec.containers[i].resources.requests[memory]}
```
* When requests.memory is set and limits.memory is not set, limits.memory is substituted
for node allocatable memory in the formula:
{{< figure src="/blog/2023/05/05/qos-memory-resources/container-memory-high-no-limits.svg" title="memory.high when requests and limits are not set" alt="memory.high when requests and limits are not set" >}}
```formula
memory.high = pod.spec.containers[i].resources.requests[memory] + MemoryThrottlingFactor *
{(NodeAllocatableMemory) - pod.spec.containers[i].resources.requests[memory]}
```
3. **BestEffort** by their QoS definition do not require any memory or CPU limits or requests.
1. **BestEffort** by their QoS definition do not require any memory or CPU limits or requests.
For this case, Kubernetes sets requests.memory = 0 and substitutes limits.memory for node allocatable
memory in the formula:
{{< figure src="/blog/2023/05/05/qos-memory-resources/container-memory-high-best-effort.svg" title="memory.high for BestEffort Pod" alt="memory.high for BestEffort Pod" >}}
```formula
memory.high = MemoryThrottlingFactor * NodeAllocatableMemory
```
**Summary**: Only Pods in Burstable and BestEffort QoS classes will set `memory.high`.
Guaranteed QoS pods do not set `memory.high` as their memory is guaranteed.
@ -261,10 +284,10 @@ Guaranteed QoS pods do not set `memory.high` as their memory is guaranteed.
The prerequisites for enabling Memory QoS feature on your Linux node are:
1. Verify the [requirements](/docs/concepts/architecture/cgroups/#requirements)
1. Verify the [requirements](/docs/concepts/architecture/cgroups/#requirements)
related to [Kubernetes support for cgroups v2](/docs/concepts/architecture/cgroups)
are met.
2. Ensure CRI Runtime supports Memory QoS. At the time of writing, only containerd
are met.
1. Ensure CRI Runtime supports Memory QoS. At the time of writing, only containerd
and CRI-O provide support compatible with Memory QoS (alpha). This was implemented
in the following PRs:
* Containerd: [Feature: containerd-cri support LinuxContainerResources.Unified #5627](https://github.com/containerd/containerd/pull/5627).
@ -291,8 +314,9 @@ and review of this feature:
* David Porter([bobbypage](https://github.com/bobbypage))
* Mrunal Patel([mrunalp](https://github.com/mrunalp))
For those interested in getting involved in future discussions on Memory QoS feature,
For those interested in getting involved in future discussions on Memory QoS feature,
you can reach out to SIG Node by several means:
- Slack: [#sig-node](https://kubernetes.slack.com/messages/sig-node)
- [Mailing list](https://groups.google.com/forum/#!forum/kubernetes-sig-node)
- [Open Community Issues/PRs](https://github.com/kubernetes/community/labels/sig%2Fnode)

View File

@ -184,7 +184,7 @@ server requests.
We conducted a test that created 12k secrets and measured the time taken for the API server to
encrypt the resources. The metric used was
[`apiserver_storage_transformation_duration_seconds`](https://kubernetes.io/docs/reference/instrumentation/metrics/).
[`apiserver_storage_transformation_duration_seconds`](/docs/reference/instrumentation/metrics/).
For KMS v1, the test was run on a managed Kubernetes v1.25 cluster with 2 nodes. There was no
additional load on the cluster during the test. For KMS v2, the test was run in the Kubernetes CI
environment with the following [cluster

View File

@ -129,7 +129,7 @@ signature of the memory contents, which can be sent to the VM's owner as an atte
the initial guest memory was not manipulated.
The second generation of SEV, known as
[Encrypted State](https://www.amd.com/system/files/TechDocs/Protecting%20VM%20Register%20State%20with%20SEV-ES.pdf)
[Encrypted State](https://www.amd.com/content/dam/amd/en/documents/epyc-business-docs/white-papers/Protecting-VM-Register-State-with-SEV-ES.pdf)
or SEV-ES, provides additional protection from the hypervisor by encrypting all
CPU register contents when a context switch occurs.

View File

@ -3,7 +3,7 @@ layout: blog
title: "Spotlight on SIG CLI"
date: 2023-07-20
slug: sig-cli-spotlight-2023
canonicalUrl: https://www.kubernetes.dev/blog/2023/07/13/sig-cli-spotlight-2023/
canonicalUrl: https://www.kubernetes.dev/blog/2023/07/20/sig-cli-spotlight-2023/
---
**Author**: Arpit Agrawal

View File

@ -26,19 +26,37 @@ Much like a garden, our release has ever-changing growth, challenges and opportu
# What's New (Major Themes)
## Changes to supported skew between control plane and node versions
This enables testing and expanding the supported skew between core node and control plane components by one version from n-2 to n-3, so that node components (kubelet and kube-proxy) for the oldest supported minor version work with control plane components (kube-apiserver, kube-scheduler, kube-controller-manager, cloud-controller-manager) for the newest supported minor version.
This is valuable for end users as control plane upgrade will be a little faster than node upgrade, which are almost always going to be the longer with running workloads.
Kubernetes v1.28 expands the supported skew between core node and control plane
components by one minor version, from _n-2_ to _n-3_, so that node components
(kubelet and kube-proxy) for the oldest supported minor version work with
control plane components (kube-apiserver, kube-scheduler, kube-controller-manager,
cloud-controller-manager) for the newest supported minor version.
The Kubernetes yearly support period already makes annual upgrades possible. Users can upgrade to the latest patch versions to pick up security fixes and do 3 sequential minor version upgrades once a year to "catch up" to the latest supported minor version.
Some cluster operators avoid node maintenance and especially changes to node
behavior, because nodes are where the workloads run. For minor version upgrades
to a kubelet, the supported process includes draining that node, and hence
disruption to any Pods that had been executing there. For Kubernetes end users
with very long running workloads, and where Pods should stay running wherever
possible, reducing the time lost to node maintenance is a benefit.
However, since the tested/supported skew between nodes and control planes is currently limited to 2 versions, a 3-version upgrade would have to update nodes twice to stay within the supported skew.
The Kubernetes yearly support period already made annual upgrades possible. Users can
upgrade to the latest patch versions to pick up security fixes and do 3 sequential
minor version upgrades once a year to "catch up" to the latest supported minor version.
Previously, to stay within the supported skew, a cluster operator planning an annual
upgrade would have needed to upgrade their nodes twice (perhaps only hours apart). Now,
with Kubernetes v1.28, you have the option of making a minor version upgrade to
nodes just once in each calendar year and still staying within upstream support.
If you'd like to stay current and upgrade your clusters more often, that's
fine and is still completely supported.
## Generally available: recovery from non-graceful node shutdown
If a node shuts down down unexpectedly or ends up in a non-recoverable state (perhaps due to hardware failure or unresponsive OS), Kubernetes allows you to clean up afterwards and allow stateful workloads to restart on a different node. For Kubernetes v1.28, that's now a stable feature.
If a node shuts down unexpectedly or ends up in a non-recoverable state (perhaps due to hardware failure or unresponsive OS), Kubernetes allows you to clean up afterward and allow stateful workloads to restart on a different node. For Kubernetes v1.28, that's now a stable feature.
This allows stateful workloads to failover to a different node successfully after the original node is shut down or in a non-recoverable state, such as the hardware failure or broken OS.
This allows stateful workloads to fail over to a different node successfully after the original node is shut down or in a non-recoverable state, such as the hardware failure or broken OS.
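To trigger this recovery, a cluster operator (or automation acting on their behalf) marks the stuck node with the out-of-service taint; roughly (the node name is a placeholder):

```shell
kubectl taint nodes <node-name> node.kubernetes.io/out-of-service=nodeshutdown:NoExecute
```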
Versions of Kubernetes earlier than v1.20 lacked handling for node shutdown on Linux. The kubelet integrates with systemd
and implements graceful node shutdown (beta, and enabled by default). However, even an intentional
@ -70,10 +88,12 @@ read [non-graceful node shutdown](/docs/concepts/architecture/nodes/#non-gracefu
## Improvements to CustomResourceDefinition validation rules
The [Common Expression Language (CEL)](https://github.com/google/cel-go) can be used to validate
[custom resources](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/). The primary goal is to allow the majority of the validation use cases that might once have needed you, as a CustomResourceDefinition (CRD) author, to design and implement a webhook. Instead, and as a beta feature, you can add _validation expressions_ directly into the schema of a CRD.
[custom resources](/docs/concepts/extend-kubernetes/api-extension/custom-resources/). The primary goal is to allow the majority of the validation use cases that might once have needed you, as a CustomResourceDefinition (CRD) author, to design and implement a webhook. Instead, and as a beta feature, you can add _validation expressions_ directly into the schema of a CRD.
CRDs need direct support for non-trivial validation. While admission webhooks do support CRD validation, they significantly complicate the development and operability of CRDs.
In 1.28, two optional fields `reason` and `fieldPath` were added to allow users to specify the failure reason and fieldPath when validation fails.
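As an illustrative sketch (the schema, field names, and values below are made up), a CRD author could write something like:

```yaml
# Excerpt from a CustomResourceDefinition's openAPIV3Schema (hypothetical resource):
openAPIV3Schema:
  type: object
  properties:
    spec:
      type: object
      x-kubernetes-validations:
      - rule: "self.minReplicas <= self.replicas"
        message: "replicas must be greater than or equal to minReplicas"
        reason: FieldValueInvalid   # optional field added in v1.28
        fieldPath: ".replicas"      # optional field added in v1.28
      properties:
        replicas:
          type: integer
        minReplicas:
          type: integer
```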
For more information, read [validation rules](/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/#validation-rules) in the CRD documentation.
## ValidatingAdmissionPolicies graduate to beta
@ -84,7 +104,7 @@ This builds on the capabilities of the CRD Validation Rules feature that graduat
This will lower the infrastructure barrier to enforcing customizable policies as well as providing primitives that help the community establish and adhere to the best practices of both K8s and its extensions.
To use [ValidatingAdmissionPolicies](/docs/reference/access-authn-authz/validating-admission-policy/), you need to enable the `admissionregistration.k8s.io/v1beta1` API group in your cluster's control plane.
To use [ValidatingAdmissionPolicies](/docs/reference/access-authn-authz/validating-admission-policy/), you need to enable both the `admissionregistration.k8s.io/v1beta1` API group and the `ValidatingAdmissionPolicy` feature gate in your cluster's control plane.
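On a control plane where you manage the kube-apiserver flags yourself, that could look roughly like this (how these settings are applied depends on how your cluster is deployed):

```shell
# Enable the v1beta1 API group and the feature gate on the kube-apiserver,
# in addition to whatever flags you already pass:
kube-apiserver \
  --feature-gates=ValidatingAdmissionPolicy=true \
  --runtime-config=admissionregistration.k8s.io/v1beta1=true
```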
## Match conditions for admission webhooks
@ -134,10 +154,17 @@ CDI provides a standardized way of injecting complex devices into a container (i
## API awareness of sidecar containers (alpha) {#sidecar-init-containers}
Kubernetes 1.28 introduces an alpha `restartPolicy` field for [init containers](https://github.com/kubernetes/website/blob/main/content/en/docs/concepts/workloads/pods/init-containers.md),
and uses that to indicate when an init container is also a _sidecar container_. The will start init containers with `restartPolicy: Always` in the order they are defined, along with other init containers. Instead of waiting for that sidecar container to complete before starting the main container(s) for the Pod, the kubelet only waits for
the sidecar init container to have started.
and uses that to indicate when an init container is also a _sidecar container_.
The kubelet will start init containers with `restartPolicy: Always` in the order
they are defined, along with other init containers.
Instead of waiting for that sidecar container to complete before starting the main
container(s) for the Pod, the kubelet only waits for the sidecar init container to have started.
The condition for startup completion will be that the startup probe succeeded (or if no startup probe is defined) and postStart handler is completed. This condition is represented with the field Started of ContainerStatus type. See the section "Pod startup completed condition" for considerations on picking this signal.
The kubelet will consider the startup for the sidecar container as being completed
if the startup probe succeeds and the postStart handler is completed.
This condition is represented with the field Started of ContainerStatus type.
If you do not define a startup probe, the kubelet will consider the container
startup to be completed immediately after the postStart handler completion.
For init containers, you can either omit the `restartPolicy` field, or set it to `Always`. Omitting the field
means that you want a true init container that runs to completion before application startup.
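As a minimal sketch (the image and container names here are made up), a Pod with a sidecar declared this way could look like:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app-with-sidecar
spec:
  initContainers:
  - name: log-shipper                  # sidecar: keeps running for the Pod's lifetime
    image: registry.example/log-shipper:1.0
    restartPolicy: Always              # marks this init container as a sidecar
  containers:
  - name: app                          # starts once the sidecar has started
    image: registry.example/app:1.0
```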
@ -145,11 +172,6 @@ means that you want a true init container that runs to completion before applica
Sidecar containers do not block Pod completion: if all regular containers are complete, sidecar
containers in that Pod will be terminated.
For sidecar containers, the restart behavior is more complex than for init containers. In a Pod with
`restartPolicy` set to `Never`, a sidecar container that fails during Pod startup will **not** be restarted
and the whole Pod is treated as having failed. If the Pod's `restartPolicy` is `Always` or `OnFailure`,
a sidecar that fails to start will be retried.
Once the sidecar container has started (process running, `postStart` was successful, and
any configured startup probe is passing), and then there's a failure, that sidecar container will be
restarted even when the Pod's overall `restartPolicy` is `Never` or `OnFailure`.
@ -163,7 +185,7 @@ To learn more, read [API for sidecar containers](/docs/concepts/workloads/pods/i
Kubernetes automatically sets a `storageClassName` for a PersistentVolumeClaim (PVC) if you don't provide
a value. The control plane also sets a StorageClass for any existing PVC that doesn't have a `storageClassName`
defined.
Previous versions of Kubernetes also had this behavior; for Kubernetes v1.28 is is automatic and always
Previous versions of Kubernetes also had this behavior; for Kubernetes v1.28 it is automatic and always
active; the feature has graduated to stable (general availability).
To learn more, read about [StorageClass](/docs/concepts/storage/storage-classes/) in the Kubernetes
@ -199,18 +221,12 @@ For instance, if indexed jobs were used as the basis for a suite of long-running
For more information, read [Handling Pod and container failures](/docs/concepts/workloads/controllers/job/#handling-pod-and-container-failures) in the Kubernetes documentation.
## CRI container and pod statistics without cAdvisor
This encompasses two related pieces of work (changes to the kubelet's `/metrics/cadvisor` endpoint and improvements to the replacement _summary_ API).
<hr />
<a id="cri-container-and-pod-statistics-without-cadvisor" />
There are two main APIs that consumers use to gather stats about running containers and pods: summary API and `/metrics/cadvisor`. The Kubelet is responsible for implementing the summary API, and cadvisor is responsible for fulfilling `/metrics/cadvisor`.
This enhances CRI implementations to be able to fulfill all the stats needs of Kubernetes. At a high level, there are two pieces of this:
- It enhances the CRI API with enough metrics to supplement the pod and container fields in the summary API directly from CRI.
- It enhances the CRI implementations to broadcast the required metrics to fulfill the pod and container fields in the `/metrics/cadvisor` endpoint.
**Correction**: the feature CRI container and pod statistics without cAdvisor has been removed as it did not make the release.
The original release announcement stated that Kubernetes 1.28 included the new feature.
## Feature graduations and deprecations in Kubernetes v1.28
### Graduations to stable
@ -247,7 +263,7 @@ The complete details of the Kubernetes v1.28 release are available in our [relea
## Availability
Kubernetes v1.28 is available for download on [GitHub](https://github.com/kubernetes/kubernetes/releases/tag/v1.28.0). To get started with Kubernetes, you can run local Kubernetes clusters using [minikube](https://minikube.sigs.k8s.io/docs/), [kind](https://kind.sigs.k8s.io/), etc. You can also easily install v1.28 using [kubeadm](https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/).
Kubernetes v1.28 is available for download on [GitHub](https://github.com/kubernetes/kubernetes/releases/tag/v1.28.0). To get started with Kubernetes, you can run local Kubernetes clusters using [minikube](https://minikube.sigs.k8s.io/docs/), [kind](https://kind.sigs.k8s.io/), etc. You can also easily install v1.28 using [kubeadm](/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/).
## Release Team
@ -270,7 +286,7 @@ In the v1.28 release cycle, which [ran for 14 weeks](https://github.com/kubernet
## Upcoming Release Webinar
Join members of the Kubernetes v1.28 release team on Friday, September 14, 2023, at 10 a.m. PDT to learn about the major features of this release, as well as deprecations and removals to help plan for upgrades. For more information and registration, visit the [event page](https://community.cncf.io/events/details/cncf-cncf-online-programs-presents-cncf-live-webinar-kubernetes-v128-release/) on the CNCF Online Programs site.
Join members of the Kubernetes v1.28 release team on Wednesday, September 6th, 2023, at 9 A.M. PDT to learn about the major features of this release, as well as deprecations and removals to help plan for upgrades. For more information and registration, visit the [event page](https://community.cncf.io/events/details/cncf-cncf-online-programs-presents-cncf-live-webinar-kubernetes-128-release/) on the CNCF Online Programs site.
## Get Involved

View File

@ -0,0 +1,225 @@
---
layout: blog
title: "pkgs.k8s.io: Introducing Kubernetes Community-Owned Package Repositories"
date: 2023-08-15T20:00:00+0000
slug: pkgs-k8s-io-introduction
---
**Author**: Marko Mudrinić (Kubermatic)
On behalf of Kubernetes SIG Release, I am very excited to introduce the
Kubernetes community-owned software
repositories for Debian and RPM packages: `pkgs.k8s.io`! The new package
repositories are a replacement for the Google-hosted package repositories
(`apt.kubernetes.io` and `yum.kubernetes.io`) that we've been using since
Kubernetes v1.5.
This blog post contains information about these new package repositories,
what they mean for you as an end user, and how to migrate to the new
repositories.
**Update (August 31, 2023):** the _**legacy Google-hosted repositories are deprecated
and will be frozen starting with September 13, 2023.**_
Check out [the deprecation announcement](/blog/2023/08/31/legacy-package-repository-deprecation/)
for more details about this change.
## What you need to know about the new package repositories
_(updated on August 31, 2023)_
- This is an **opt-in change**; you're required to manually migrate from the
Google-hosted repository to the Kubernetes community-owned repositories.
See [how to migrate](#how-to-migrate) later in this announcement for migration information
and instructions.
- The legacy Google-hosted repositories are **deprecated as of August 31, 2023**,
and will be **frozen approximately as of September 13, 2023**. The freeze will happen
immediately following the patch releases that are scheduled for September 2023.
Freezing the legacy repositories means that we will publish packages for the Kubernetes
project only to the community-owned repositories as of the September 13, 2023 cut-off point.
Check out the [deprecation announcement](/blog/2023/08/31/legacy-package-repository-deprecation/)
for more details about this change.
- The existing packages in the legacy repositories will be available for the foreseeable future.
However, the Kubernetes project can't provide any guarantees on how long that is going to be.
The deprecated legacy repositories, and their contents, might be removed at any time in the future
and without a further notice period.
- Given that no new releases will be published to the legacy repositories after
the September 13, 2023 cut-off point, you will not be able to upgrade to any patch or minor
release made from that date onwards if you don't migrate to the new Kubernetes package repositories.
That said, we recommend migrating to the new Kubernetes package repositories **as soon as possible**.
- The new Kubernetes package repositories contain packages beginning with those
Kubernetes versions that were still under support when the community took
over the package builds. This means that anything before v1.24.0 will only be
available in the Google-hosted repository.
- There's a dedicated package repository for each Kubernetes minor version.
When upgrading to a different minor release, you must bear in mind that
the package repository details also change.
## Why are we introducing new package repositories?
As the Kubernetes project is growing, we want to ensure the best possible
experience for the end users. The Google-hosted repository has been serving
us well for many years, but we started facing some problems that require
significant changes to how we publish packages. Another goal that we have is to
use community-owned infrastructure for all critical components and that
includes package repositories.
Publishing packages to the Google-hosted repository is a manual process that
can be done only by a team of Google employees called
[Google Build Admins](/releases/release-managers/#build-admins).
[The Kubernetes Release Managers team](/releases/release-managers/#release-managers)
is a very diverse team especially in terms of timezones that we work in.
Given this constraint, we have to do very careful planning for every release to
ensure that we have both Release Manager and Google Build Admin available to
carry out the release.
Another problem is that we only have a single package repository. Because of
this, we were not able to publish packages for prerelease versions (alpha,
beta, and rc). This made testing Kubernetes prereleases harder for anyone who
is interested to do so. The feedback that we receive from people testing these
releases is critical to ensure the best quality of releases, so we want to make
testing these releases as easy as possible. On top of that, having only one
repository limited us when it comes to publishing dependencies like `cri-tools`
and `kubernetes-cni`.
Regardless of all these issues, we're very thankful to Google and Google Build
Admins for their involvement, support, and help all these years!
## How do the new package repositories work?
The new package repositories are hosted at `pkgs.k8s.io` for both Debian and
RPM packages. At this time, this domain points to a CloudFront CDN backed by S3
bucket that contains repositories and packages. However, we plan on onboarding
additional mirrors in the future, giving other companies the possibility to
help us with serving packages.
Packages are built and published via the [OpenBuildService (OBS) platform](http://openbuildservice.org).
After a long period of evaluating different solutions, we made a decision to
use OpenBuildService as a platform to manage our repositories and packages.
First of all, OpenBuildService is an open source platform used by a large
number of open source projects and companies, like openSUSE, VideoLAN,
Dell, Intel, and more. OpenBuildService has many features making it very
flexible and easy to integrate with our existing release tooling. It also
allows us to build packages in a similar way as for the Google-hosted
repository, making the migration process as seamless as possible.
SUSE sponsors the Kubernetes project with access to their reference
OpenBuildService setup ([`build.opensuse.org`](http://build.opensuse.org)) and
with technical support to integrate OBS with our release processes.
We use SUSE's OBS instance for building and publishing packages. Upon building
a new release, our tooling automatically pushes needed artifacts and
package specifications to `build.opensuse.org`. That will trigger the build
process that's going to build packages for all supported architectures (AMD64,
ARM64, PPC64LE, S390X). At the end, generated packages will be automatically
pushed to our community-owned S3 bucket making them available to all users.
We want to take this opportunity to thank SUSE for allowing us to use
`build.opensuse.org` and their generous support to make this integration
possible!
## What are the significant differences between the Google-hosted and Kubernetes package repositories?
There are three significant differences that you should be aware of:
- There's a dedicated package repository for each Kubernetes minor release.
For example, a repository called `core:/stable:/v1.28` only hosts packages for
stable Kubernetes v1.28 releases. This means you can install v1.28.0 from
this repository, but you can't install v1.27.0 or any other minor release
other than v1.28. Upon upgrading to another minor version, you have to add a
new repository and optionally remove the old one
- There's a difference in what `cri-tools` and `kubernetes-cni` package
versions are available in each Kubernetes repository
- These two packages are dependencies for `kubelet` and `kubeadm`
- Kubernetes repositories for v1.24 to v1.27 have the same versions of these
packages as the Google-hosted repository
- Kubernetes repositories for v1.28 and onwards are only going to publish the
versions that are used by that Kubernetes minor release
- Speaking of v1.28, only kubernetes-cni 1.2.0 and cri-tools v1.28 are going
to be available in the repository for Kubernetes v1.28
- Similarly, for v1.29, we only plan on publishing cri-tools v1.29 and
whatever kubernetes-cni version is going to be used by Kubernetes v1.29
- The revision part of the package version (the `-00` part in `1.28.0-00`) is
now autogenerated by the OpenBuildService platform and has a different format.
The revision is now in the format of `-x.y`, e.g. `1.28.0-1.1`
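For example, pinning a specific build on a deb-based system with the new revision scheme could look like this (illustrative usage, not part of the original migration steps):

```shell
sudo apt-get install -y kubelet=1.28.0-1.1 kubeadm=1.28.0-1.1 kubectl=1.28.0-1.1
```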
## Does this in any way affect existing Google-hosted repositories?
The Google-hosted repository and all packages published to it will continue
working in the same way as before. There are no changes in how we build and
publish packages to the Google-hosted repository; all newly-introduced changes
only affect packages published to the community-owned repositories.
However, as mentioned at the beginning of this blog post, we plan to stop
publishing packages to the Google-hosted repository in the future.
## How to migrate to the Kubernetes community-owned repositories? {#how-to-migrate}
### Debian, Ubuntu, and operating systems using `apt`/`apt-get` {#how-to-migrate-deb}
1. Replace the `apt` repository definition so that `apt` points to the new
repository instead of the Google-hosted repository. Make sure to replace the
Kubernetes minor version in the command below with the minor version
that you're currently using:
```shell
echo "deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.28/deb/ /" | sudo tee /etc/apt/sources.list.d/kubernetes.list
```
2. Download the public signing key for the Kubernetes package repositories.
The same signing key is used for all repositories, so you can disregard the
version in the URL:
```shell
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.28/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
```
3. Update the `apt` package index:
```shell
sudo apt-get update
```
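As an optional sanity check after the switch, you can ask `apt` where candidate versions now come from; the Kubernetes packages should be listed as coming from `pkgs.k8s.io` rather than `apt.kubernetes.io` (shown here for `kubeadm`, assuming you use that package):

```shell
apt-cache policy kubeadm
```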
### CentOS, Fedora, RHEL, and operating systems using `rpm`/`dnf` {#how-to-migrate-rpm}
1. Replace the `yum` repository definition so that `yum` points to the new
repository instead of the Google-hosted repository. Make sure to replace the
Kubernetes minor version in the command below with the minor version
that you're currently using:
```shell
cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://pkgs.k8s.io/core:/stable:/v1.28/rpm/
enabled=1
gpgcheck=1
gpgkey=https://pkgs.k8s.io/core:/stable:/v1.28/rpm/repodata/repomd.xml.key
exclude=kubelet kubeadm kubectl cri-tools kubernetes-cni
EOF
```
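As an optional sanity check after the switch, refresh the repository metadata and list the available versions; they should now be served from `pkgs.k8s.io` (shown here for `kubeadm`, assuming you use that package):

```shell
sudo yum makecache
yum list --showduplicates kubeadm --disableexcludes=kubernetes
```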
## Can I rollback to the Google-hosted repository after migrating to the Kubernetes repositories?
In general, yes. Just do the same steps as when migrating, but use parameters
for the Google-hosted repository. You can find those parameters in a document
like ["Installing kubeadm"](/docs/setup/production-environment/tools/kubeadm/install-kubeadm).
## Why isn't there a stable list of domains/IPs? Why can't I restrict package downloads?
Our plan for `pkgs.k8s.io` is to make it work as a redirector to a set of
backends (package mirrors) based on the user's location. The nature of this change
means that a user downloading a package could be redirected to any mirror at
any time. Given the architecture and our plans to onboard additional mirrors in
the near future, we can't provide a list of IP addresses or domains that you
can add to an allow list.
Restrictive control mechanisms like man-in-the-middle proxies or network
policies that restrict access to a specific list of IPs/domains will break with
this change. For these scenarios, we encourage you to mirror the release
packages to a local package repository that you have strict control over.
## What should I do if I detect some abnormality with the new repositories?
If you encounter any issues with the new Kubernetes package repositories, please
file an issue in the
[`kubernetes/release` repository](https://github.com/kubernetes/release/issues/new/choose).

View File

@ -3,7 +3,6 @@ layout: blog
title: "Kubernetes 1.28: Non-Graceful Node Shutdown Moves to GA"
date: 2023-08-16T10:00:00-08:00
slug: kubernetes-1-28-non-graceful-node-shutdown-GA
draft: true
---
**Authors:** Xing Yang (VMware) and Ashutosh Kumar (Elastic)
@ -80,7 +79,7 @@ that are shutdown/failed and automatically failover workloads to another node.
## How can I learn more?
Check out additional documentation on this feature
[here](https://kubernetes.io/docs/concepts/architecture/nodes/#non-graceful-node-shutdown).
[here](/docs/concepts/architecture/nodes/#non-graceful-node-shutdown).
## How to get involved?

View File

@ -0,0 +1,39 @@
---
layout: blog
title: "Kubernetes v1.28: Retroactive Default StorageClass move to GA"
date: 2023-08-18
slug: retroactive-default-storage-class-ga
---
**Author:** Roman Bednář (Red Hat)
Announcing graduation to General Availability (GA) - Retroactive Default StorageClass Assignment in Kubernetes v1.28!
The Kubernetes SIG Storage team is thrilled to announce that the "Retroactive Default StorageClass Assignment" feature,
introduced as an alpha in Kubernetes v1.25, has now graduated to GA and is officially part of the Kubernetes v1.28 release.
This enhancement brings a significant improvement to how default
[StorageClasses](/docs/concepts/storage/storage-classes/) are assigned to PersistentVolumeClaims (PVCs).
With this feature enabled, you no longer need to create the default StorageClass before the PVC in order for the claim to get a class assigned.
Instead, any PVCs without a StorageClass assigned are now retroactively updated to reference the default StorageClass.
This enhancement ensures that PVCs no longer get stuck in an unbound state, and storage provisioning works seamlessly,
even when a default StorageClass is not defined at the time of PVC creation.
## What changed?
The PersistentVolume (PV) controller has been modified to automatically assign a default StorageClass to any unbound
PersistentVolumeClaim with the `storageClassName` not set. Additionally, the PersistentVolumeClaim
admission validation mechanism within
the API server has been adjusted to allow changing values from an unset state to an actual StorageClass name.
## How to use it?
As this feature has graduated to GA, there's no need to enable a feature gate anymore.
Simply make sure you are running Kubernetes v1.28 or later, and the feature will be available for use.
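As a minimal sketch of the behavior (the claim and class names below are hypothetical), you can create a PVC without a `storageClassName` while no default StorageClass exists, then mark a class as the default and watch the claim pick it up retroactively:

```shell
# Create a PVC without spec.storageClassName while the cluster has no default StorageClass.
kubectl apply -f - <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-claim
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 1Gi
EOF

# Later, mark an existing StorageClass (here called "my-sc") as the default.
kubectl patch storageclass my-sc -p '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}'

# The PV controller retroactively assigns the new default to the pending claim.
kubectl get pvc my-claim -o jsonpath='{.spec.storageClassName}{"\n"}'
```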
For more details, read about
[default StorageClass assignment](/docs/concepts/storage/persistent-volumes/#retroactive-default-storageclass-assignment) in the Kubernetes documentation.
You can also read the previous [blog post](/blog/2023/01/05/retroactive-default-storage-class/) announcing beta graduation in v1.26.
To provide feedback, join our [Kubernetes Storage Special-Interest-Group](https://github.com/kubernetes/community/tree/master/sig-storage) (SIG)
or participate in discussions on our [public Slack channel](https://app.slack.com/client/T09NY5SBT/C09QZFCE5).

View File

@ -0,0 +1,231 @@
---
layout: blog
title: "Kubernetes 1.28: Improved failure handling for Jobs"
date: 2023-08-21
slug: kubernetes-1-28-jobapi-update
---
**Authors:** Kevin Hannon (G-Research), Michał Woźniak (Google)
This blog discusses two new features in Kubernetes 1.28 to improve Jobs for batch
users: [Pod replacement policy](/docs/concepts/workloads/controllers/job/#pod-replacement-policy)
and [Backoff limit per index](/docs/concepts/workloads/controllers/job/#backoff-limit-per-index).
These features continue the effort started by the
[Pod failure policy](/docs/concepts/workloads/controllers/job/#pod-failure-policy)
to improve the handling of Pod failures in a Job.
## Pod replacement policy {#pod-replacement-policy}
By default, when a pod enters a terminating state (e.g. due to preemption or
eviction), Kubernetes immediately creates a replacement Pod. Therefore, both Pods are running
at the same time. In API terms, a pod is considered terminating when it has a
`deletionTimestamp` set and is in the `Pending` or `Running` phase.
The scenario when two Pods are running at a given time is problematic for
some popular machine learning frameworks, such as
TensorFlow and [JAX](https://jax.readthedocs.io/en/latest/), which require at most one Pod running at the same time,
for a given index.
TensorFlow gives the following error if two Pods are running for a given index:
```
/job:worker/task:4: Duplicate task registration with task_name=/job:worker/replica:0/task:4
```
See more details in this [issue](https://github.com/kubernetes/kubernetes/issues/115844).
Creating the replacement Pod before the previous one fully terminates can also
cause problems in clusters with scarce resources or with tight budgets, such as:
* cluster resources can be difficult to obtain for Pods pending to be scheduled,
as Kubernetes might take a long time to find available nodes until the existing
Pods are fully terminated.
* if the cluster autoscaler is enabled, the replacement Pods might produce undesired
  scale-ups.
### How can you use it? {#pod-replacement-policy-how-to-use}
This is an alpha feature, which you can enable by turning on `JobPodReplacementPolicy`
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/) in
your cluster.
Once the feature is enabled in your cluster, you can use it by creating a new Job that specifies a
`podReplacementPolicy` field as shown here:
```yaml
kind: Job
metadata:
name: new
...
spec:
podReplacementPolicy: Failed
...
```
In that Job, the Pods would only be replaced once they reached the `Failed` phase,
and not when they are terminating.
Additionally, you can inspect the `.status.terminating` field of a Job. The value
of the field is the number of Pods owned by the Job that are currently terminating.
```shell
kubectl get jobs/myjob -o=jsonpath='{.status.terminating}'
```
```
3 # three Pods are terminating and have not yet reached the Failed phase
```
This can be particularly useful for external queueing controllers, such as
[Kueue](https://github.com/kubernetes-sigs/kueue), that track quota
from the running Pods of a Job until the resources are reclaimed from
the currently terminating Job.
Note that `podReplacementPolicy: Failed` is the default when using a custom
[Pod failure policy](/docs/concepts/workloads/controllers/job/#pod-failure-policy).
## Backoff limit per index {#backoff-limit-per-index}
By default, Pod failures for [Indexed Jobs](/docs/concepts/workloads/controllers/job/#completion-mode)
are counted towards the global limit of retries, represented by `.spec.backoffLimit`.
This means that if there is a consistently failing index, it is restarted
repeatedly until it exhausts the limit. Once the limit is reached, the entire
Job is marked failed and some indexes may never even be started.
This is problematic for use cases where you want to handle Pod failures for
every index independently. For example, you might use Indexed Jobs for running
integration tests where each index corresponds to a testing suite. In that case,
you may want to account for possible flaky tests by allowing 1 or 2 retries per
suite. There might also be some buggy suites, making the corresponding
indexes fail consistently. In that case you may prefer to limit retries for
the buggy suites, yet allow other suites to complete.
The feature allows you to:
* complete execution of all indexes, despite some indexes failing.
* better utilize the computational resources by avoiding unnecessary retries of consistently failing indexes.
### How can you use it? {#backoff-limit-per-index-how-to-use}
This is an alpha feature, which you can enable by turning on the
`JobBackoffLimitPerIndex`
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
in your cluster.
Once the feature is enabled in your cluster, you can create an Indexed Job with the
`.spec.backoffLimitPerIndex` field specified.
#### Example
The following example demonstrates how to use this feature to make sure the
Job executes all indexes (provided there is no other reason for the early Job
termination, such as reaching the `activeDeadlineSeconds` timeout, or being
manually deleted by the user), and the number of failures is controlled per index.
```yaml
apiVersion: batch/v1
kind: Job
metadata:
name: job-backoff-limit-per-index-execute-all
spec:
completions: 8
parallelism: 2
completionMode: Indexed
backoffLimitPerIndex: 1
template:
spec:
restartPolicy: Never
containers:
- name: example # this example container returns an error, and fails,
# when it is run as the second or third index in any Job
# (even after a retry)
image: python
command:
- python3
- -c
- |
import os, sys, time
id = int(os.environ.get("JOB_COMPLETION_INDEX"))
if id == 1 or id == 2:
sys.exit(1)
time.sleep(1)
```
Now, inspect the Pods after the job is finished:
```sh
kubectl get pods -l job-name=job-backoff-limit-per-index-execute-all
```
Returns output similar to this:
```
NAME READY STATUS RESTARTS AGE
job-backoff-limit-per-index-execute-all-0-b26vc 0/1 Completed 0 49s
job-backoff-limit-per-index-execute-all-1-6j5gd 0/1 Error 0 49s
job-backoff-limit-per-index-execute-all-1-6wd82 0/1 Error 0 37s
job-backoff-limit-per-index-execute-all-2-c66hg 0/1 Error 0 32s
job-backoff-limit-per-index-execute-all-2-nf982 0/1 Error 0 43s
job-backoff-limit-per-index-execute-all-3-cxmhf 0/1 Completed 0 33s
job-backoff-limit-per-index-execute-all-4-9q6kq 0/1 Completed 0 28s
job-backoff-limit-per-index-execute-all-5-z9hqf 0/1 Completed 0 28s
job-backoff-limit-per-index-execute-all-6-tbkr8 0/1 Completed 0 23s
job-backoff-limit-per-index-execute-all-7-hxjsq 0/1 Completed 0 22s
```
Additionally, you can take a look at the status for that Job:
```sh
kubectl get jobs job-backoff-limit-per-index-execute-all -o yaml
```
The output ends with a `status` similar to:
```yaml
status:
completedIndexes: 0,3-7
failedIndexes: 1,2
succeeded: 6
failed: 4
conditions:
- message: Job has failed indexes
reason: FailedIndexes
status: "True"
type: Failed
```
Here, indexes `1` and `2` were each retried once. After the second failure
in each of them, the specified `.spec.backoffLimitPerIndex` was exceeded, so
the retries were stopped. For comparison, if the per-index backoff was disabled,
then the buggy indexes would have retried until the global `backoffLimit` was exceeded,
and then the entire Job would have been marked failed, before some of the higher
indexes were started.
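If you only care about which indexes failed or completed, a jsonpath query against the same Job extracts just those fields:

```sh
kubectl get job job-backoff-limit-per-index-execute-all \
  -o jsonpath='{.status.failedIndexes}{"\n"}{.status.completedIndexes}{"\n"}'
```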
## How can you learn more?
- Read the user-facing documentation for [Pod replacement policy](/docs/concepts/workloads/controllers/job/#pod-replacement-policy),
[Backoff limit per index](/docs/concepts/workloads/controllers/job/#backoff-limit-per-index), and
[Pod failure policy](/docs/concepts/workloads/controllers/job/#pod-failure-policy)
- Read the KEPs for [Pod Replacement Policy](https://github.com/kubernetes/enhancements/tree/master/keps/sig-apps/3939-allow-replacement-when-fully-terminated),
[Backoff limit per index](https://github.com/kubernetes/enhancements/tree/master/keps/sig-apps/3850-backoff-limits-per-index-for-indexed-jobs), and
[Pod failure policy](https://github.com/kubernetes/enhancements/tree/master/keps/sig-apps/3329-retriable-and-non-retriable-failures).
## Getting Involved
These features were sponsored by [SIG Apps](https://github.com/kubernetes/community/tree/master/sig-apps). Batch use cases are actively
being improved for Kubernetes users in the
[batch working group](https://github.com/kubernetes/community/tree/master/wg-batch).
Working groups are relatively short-lived initiatives focused on specific goals.
The goal of the WG Batch is to improve the experience for batch workload users, offer support for
batch processing use cases, and enhance the
Job API for common use cases. If that interests you, please join the working
group either by subscribing to our
[mailing list](https://groups.google.com/a/kubernetes.io/g/wg-batch) or on
[Slack](https://kubernetes.slack.com/messages/wg-batch).
## Acknowledgments
As with any Kubernetes feature, multiple people contributed to getting this
done, from testing and filing bugs to reviewing code.
We would not have been able to achieve either of these features without Aldo
Culquicondor (Google) providing excellent domain knowledge and expertise
throughout the Kubernetes ecosystem.

View File

@ -0,0 +1,127 @@
---
layout: blog
title: 'Kubernetes 1.28: Node podresources API Graduates to GA'
date: 2023-08-23
slug: kubelet-podresources-api-GA
---
**Author:** Francesco Romani (Red Hat)
The podresources API is an API served by the kubelet locally on the node, which exposes the compute resources exclusively
allocated to containers. With the release of Kubernetes 1.28, that API is now Generally Available.
## What problem does it solve?
The kubelet can allocate exclusive resources to containers, like
[CPUs, granting exclusive access to full cores](/docs/tasks/administer-cluster/cpu-management-policies/)
or [memory, either regions or hugepages](/docs/tasks/administer-cluster/memory-manager/).
Workloads which require high performance, or low latency (or both) leverage these features.
The kubelet also can assign [devices to containers](/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins/).
Collectively, these features which enable exclusive assignments are known as "resource managers".
Without an API like podresources, the only way to learn about resource assignment was to read the state files that the
resource managers use. While done out of necessity, the problem with this approach is that the path and the format of these files are
both internal implementation details. Albeit very stable, the project reserves the right to change them freely.
Consuming the content of the state files is thus fragile and unsupported, and projects doing this are recommended to consider
moving to the podresources API or to other supported APIs.
## Overview of the API
The podresources API was [initially proposed to enable device monitoring](/docs/concepts/extend-kubernetes/compute-storage-net/device-plugins/#monitoring-device-plugin-resources).
In order to enable monitoring agents, a key prerequisite is to enable introspection of device assignment, which is performed by the kubelet.
Serving this purpose was the initial goal of the API. The first iteration of the API only had a single function implemented, `List`,
to return information about the assignment of devices to containers.
The API is used by [multus CNI](https://github.com/k8snetworkplumbingwg/multus-cni) and by
[GPU monitoring tools](https://github.com/NVIDIA/dcgm-exporter).
Since its inception, the podresources API has increased its scope to cover resource managers other than the device manager.
Starting from Kubernetes 1.20, the `List` API also reports CPU cores and memory regions (including hugepages); the API also
reports the NUMA locality of the devices, while the locality of CPUs and memory can be inferred from the system.
In Kubernetes 1.21, the API [gained](https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/2403-pod-resources-allocatable-resources/README.md)
the `GetAllocatableResources` function.
This newer API complements the existing `List` API and enables monitoring agents to determine the unallocated resources,
thus enabling new features built on top of the podresources API like a
[NUMA-aware scheduler plugin](https://github.com/kubernetes-sigs/scheduler-plugins/blob/master/pkg/noderesourcetopology/README.md).
Finally, in Kubernetes 1.27, another function, `Get`, was introduced to be friendlier to CNI meta-plugins, making it simpler to access the resources
allocated to a specific pod, rather than having to filter through the resources for all pods on the node. The `Get` function is currently at alpha level.
## Consuming the API
The podresources API is served by the kubelet locally, on the same node on which it is running.
On Unix flavors, the endpoint is served over a Unix domain socket; the default path is `/var/lib/kubelet/pod-resources/kubelet.sock`.
On Windows, the endpoint is served over a named pipe; the default path is `npipe://\\.\pipe\kubelet-pod-resources`.
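If you are setting up a monitoring agent, a quick sanity check on a Linux node is to confirm that the socket actually exists (the path may differ if you have changed the kubelet's root directory):

```shell
ls -l /var/lib/kubelet/pod-resources/kubelet.sock
```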
In order for a containerized monitoring application to consume the API, the socket should be mounted inside the container.
A good practice is to mount the directory in which the podresources socket endpoint sits, rather than the socket directly.
This ensures that, after a kubelet restart, the containerized monitoring application will be able to reconnect to the socket.
An example manifest for a hypothetical monitoring agent consuming the podresources API and deployed as a DaemonSet could look like:
```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: podresources-monitoring-app
namespace: monitoring
spec:
selector:
matchLabels:
name: podresources-monitoring
template:
metadata:
labels:
name: podresources-monitoring
spec:
containers:
- args:
- --podresources-socket=unix:///host-podresources/kubelet.sock
command:
- /bin/podresources-monitor
image: podresources-monitor:latest # just for an example
volumeMounts:
- mountPath: /host-podresources
name: host-podresources
serviceAccountName: podresources-monitor
volumes:
- hostPath:
path: /var/lib/kubelet/pod-resources
type: Directory
name: host-podresources
```
I hope you find it straightforward to consume the podresources API programmatically.
The kubelet API package provides the protocol file and the Go type definitions; however, a client package is not yet available from the project,
and the existing code should not be used directly.
The [recommended](https://github.com/kubernetes/kubernetes/blob/v1.28.0-rc.0/pkg/kubelet/apis/podresources/client.go#L32)
approach is to reimplement the client in your projects, copying and pasting the related functions like for example
the multus project is [doing](https://github.com/k8snetworkplumbingwg/multus-cni/blob/v4.0.2/pkg/kubeletclient/kubeletclient.go).
When operating a containerized monitoring application that consumes the podresources API, a few points are worth highlighting to prevent "gotcha" moments:
- Even though the API only exposes data and, by design, doesn't allow clients to mutate the kubelet state, the gRPC request/response model requires
  read-write access to the podresources API socket. In other words, it is not possible to limit the container mount to `ReadOnly`.
- Multiple clients are allowed to connect to the podresources socket and consume the API, since it is stateless.
- The kubelet has [built-in rate limits](https://github.com/kubernetes/kubernetes/pull/116459) to mitigate local Denial of Service attacks from
misbehaving or malicious consumers. The consumers of the API must tolerate rate limit errors returned by the server. The rate limit is currently
hardcoded and global, so misbehaving clients can consume all the quota and potentially starve correctly behaving clients.
## Future enhancements
For historical reasons, the podresources API has a less precise specification than typical Kubernetes APIs (such as the Kubernetes HTTP API, or the container runtime interface).
This leads to unspecified behavior in corner cases.
An [effort](https://issues.k8s.io/119423) is ongoing to rectify this state and to have a more precise specification.
The [Dynamic Resource Allocation (DRA)](https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/3063-dynamic-resource-allocation) infrastructure
is a major overhaul of resource management.
The [integration](https://github.com/kubernetes/enhancements/tree/master/keps/sig-node/3695-pod-resources-for-dra) with the podresources API
is already ongoing.
An [effort](https://issues.k8s.io/119817) is ongoing to recommend or create a reference client package ready to be consumed.
## Getting involved
This feature is driven by [SIG Node](https://github.com/Kubernetes/community/blob/master/sig-node/README.md).
Please join us to connect with the community and share your ideas and feedback around the above feature and
beyond. We look forward to hearing from you!

View File

@ -0,0 +1,248 @@
---
layout: blog
title: "Kubernetes 1.28: Beta support for using swap on Linux"
date: 2023-08-24T10:00:00-08:00
slug: swap-linux-beta
---
**Author:** Itamar Holder (Red Hat)
The 1.22 release [introduced Alpha support](/blog/2021/08/09/run-nodes-with-swap-alpha/)
for configuring swap memory usage for Kubernetes workloads running on Linux on a per-node basis.
Now, in release 1.28, support for swap on Linux nodes has graduated to Beta, along with many
new improvements.
Prior to version 1.22, Kubernetes did not provide support for swap memory on Linux systems.
This was due to the inherent difficulty in guaranteeing and accounting for pod memory utilization
when swap memory was involved. As a result, swap support was deemed out of scope in the initial
design of Kubernetes, and the default behavior of a kubelet was to fail to start if swap memory
was detected on a node.
In version 1.22, the swap feature for Linux was initially introduced in its Alpha stage. This represented
a significant advancement, providing Linux users with the opportunity to experiment with the swap
feature for the first time. However, as an Alpha version, it was not fully developed and had
several issues, including inadequate support for cgroup v2, insufficient metrics and summary
API statistics, inadequate testing, and more.
Swap in Kubernetes has numerous [use cases](https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/2400-node-swap/README.md#user-stories)
for a wide range of users. As a result, the node special interest group within the Kubernetes project
has invested significant effort into supporting swap on Linux nodes for beta.
Compared to the alpha, the kubelet's support for running with swap enabled is more stable and
robust, more user-friendly, and addresses many known shortcomings. This graduation to beta
represents a crucial step towards achieving the goal of fully supporting swap in Kubernetes.
## How do I use it?
To use swap on a node where it has already been provisioned, enable the
`NodeSwap` feature gate on the kubelet.
Additionally, you must set the `failSwapOn` configuration setting, or the deprecated
`--fail-swap-on` command line flag, to false.
It is possible to configure the `memorySwap.swapBehavior` option to define the manner in which a node utilizes swap memory. For instance,
```yaml
# this fragment goes into the kubelet's configuration file
memorySwap:
swapBehavior: UnlimitedSwap
```
The available configuration options for `swapBehavior` are:
- `UnlimitedSwap` (default): Kubernetes workloads can use as much swap memory as they
request, up to the system limit.
- `LimitedSwap`: The utilization of swap memory by Kubernetes workloads is subject to limitations.
Only Pods of [Burstable](/docs/concepts/workloads/pods/pod-qos/#burstable) QoS are permitted to employ swap.
If configuration for `memorySwap` is not specified and the feature gate is
enabled, by default the kubelet will apply the same behaviour as the
`UnlimitedSwap` setting.
Note that `NodeSwap` is supported for **cgroup v2** only. For Kubernetes v1.28,
using swap along with cgroup v1 is no longer supported.
## Install a swap-enabled cluster with kubeadm
### Before you begin
This demo requires the kubeadm tool to be installed, following the steps outlined in the
[kubeadm installation guide](/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm).
If swap is already enabled on the node, cluster creation may
proceed. If swap is not enabled, please refer to the following instructions for enabling swap.
### Create a swap file and turn swap on
I'll demonstrate creating 4GiB of unencrypted swap.
```bash
dd if=/dev/zero of=/swapfile bs=128M count=32   # create a 4 GiB file
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile   # enable the swap file (only until this node is rebooted)
swapon -s          # verify that the swap file is active
```
To enable the swap file at boot time, add a line like `/swapfile swap swap defaults 0 0` to the `/etc/fstab` file.
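For example (adjust the path if your swap file lives elsewhere):

```bash
echo '/swapfile swap swap defaults 0 0' >> /etc/fstab
```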
### Set up a Kubernetes cluster that uses swap-enabled nodes
To make things clearer, here is an example kubeadm configuration file `kubeadm-config.yaml` for a swap-enabled cluster.
```yaml
---
apiVersion: "kubeadm.k8s.io/v1beta3"
kind: InitConfiguration
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
failSwapOn: false
featureGates:
NodeSwap: true
memorySwap:
swapBehavior: LimitedSwap
```
Then create a single-node cluster using `kubeadm init --config kubeadm-config.yaml`.
During init, there is a warning that swap is enabled on the node, and another warning
in case the kubelet's `failSwapOn` is set to true. We plan to remove this warning in a future release.
## How is the swap limit being determined with LimitedSwap?
The configuration of swap memory, including its limitations, presents a significant
challenge. Not only is it prone to misconfiguration, but as a system-level property, any
misconfiguration could potentially compromise the entire node rather than just a specific
workload. To mitigate this risk and ensure the health of the node, we have implemented
Swap in Beta with automatic configuration of limitations.
With `LimitedSwap`, Pods that do not fall under the Burstable QoS classification (i.e.
`BestEffort`/`Guaranteed` QoS Pods) are prohibited from utilizing swap memory.
`BestEffort` QoS Pods exhibit unpredictable memory consumption patterns and lack
information regarding their memory usage, making it difficult to determine a safe
allocation of swap memory. Conversely, `Guaranteed` QoS Pods are typically employed for
applications that rely on the precise allocation of resources specified by the workload,
with memory being immediately available. To maintain the aforementioned security and node
health guarantees, these Pods are not permitted to use swap memory when `LimitedSwap` is
in effect.
Prior to detailing the calculation of the swap limit, it is necessary to define the following terms:
* `nodeTotalMemory`: The total amount of physical memory available on the node.
* `totalPodsSwapAvailable`: The total amount of swap memory on the node that is available for use by Pods (some swap memory may be reserved for system use).
* `containerMemoryRequest`: The container's memory request.
Swap limitation is configured as:
`(containerMemoryRequest / nodeTotalMemory) × totalPodsSwapAvailable`
In other words, the amount of swap that a container is able to use is proportionate to its
memory request, the node's total physical memory and the total amount of swap memory on
the node that is available for use by Pods.
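As a worked example with made-up numbers: on a node with 64 GiB of physical memory and 4 GiB of swap available for use by Pods, a container in a Burstable QoS Pod that requests 16 GiB of memory would be limited to (16 GiB / 64 GiB) × 4 GiB = 1 GiB of swap.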
It is important to note that, for containers within Burstable QoS Pods, it is possible to
opt-out of swap usage by specifying memory requests that are equal to memory limits.
Containers configured in this manner will not have access to swap memory.
## How does it work?
There are a number of possible ways that one could envision swap use on a node.
When swap is already provisioned and available on a node,
SIG Node have [proposed](https://github.com/kubernetes/enhancements/blob/9d127347773ad19894ca488ee04f1cd3af5774fc/keps/sig-node/2400-node-swap/README.md#proposal)
the kubelet should be able to be configured so that:
- It can start with swap on.
- It will direct the Container Runtime Interface to allocate zero swap memory
to Kubernetes workloads by default.
Swap configuration on a node is exposed to a cluster admin via the
[`memorySwap` in the KubeletConfiguration](/docs/reference/config-api/kubelet-config.v1).
As a cluster administrator, you can specify the node's behaviour in the
presence of swap memory by setting `memorySwap.swapBehavior`.
The kubelet [employs the CRI](/docs/concepts/architecture/cri/)
(container runtime interface) API to direct the CRI to
configure specific cgroup v2 parameters (such as `memory.swap.max`) in a manner that will
enable the desired swap configuration for a container. The CRI is then responsible to
write these settings to the container-level cgroup.
## How can I monitor swap?
A notable deficiency in the Alpha version was the inability to monitor and introspect swap
usage. This issue has been addressed in the Beta version introduced in Kubernetes 1.28, which now
provides the capability to monitor swap usage through several different methods.
The beta version of kubelet now collects
[node-level metric statistics](/docs/reference/instrumentation/node-metrics/),
which can be accessed at the `/metrics/resource` and `/stats/summary` kubelet HTTP endpoints.
This allows clients who can directly interrogate the kubelet to
monitor swap usage and remaining swap memory when using LimitedSwap. Additionally, a
`machine_swap_bytes` metric has been added to cadvisor to show the total physical swap capacity of the
machine.
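For example, if you can reach the kubelet through the API server proxy, one quick way to peek at the node-level swap statistics is shown below (replace `NODE_NAME` with one of your node names; the exact field names may evolve while the feature is in beta):

```shell
kubectl get --raw "/api/v1/nodes/NODE_NAME/proxy/stats/summary" | grep -i swap
```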
## Caveats
Having swap available on a system reduces predictability. Swap's performance is
worse than regular memory, sometimes by many orders of magnitude, which can
cause unexpected performance regressions. Furthermore, swap changes a system's
behaviour under memory pressure. Since enabling swap permits
greater memory usage for workloads in Kubernetes that cannot be predictably
accounted for, it also increases the risk of noisy neighbours and unexpected
packing configurations, as the scheduler cannot account for swap memory usage.
The performance of a node with swap memory enabled depends on the underlying
physical storage. When swap memory is in use, performance will be significantly
worse in an I/O operations per second (IOPS) constrained environment, such as a
cloud VM with I/O throttling, when compared to faster storage mediums like
solid-state drives or NVMe.
As such, we do not advocate the utilization of swap memory for workloads or
environments that are subject to performance constraints. Furthermore, it is
recommended to employ `LimitedSwap`, as this significantly mitigates the risks
posed to the node.
Cluster administrators and developers should benchmark their nodes and applications
before using swap in production scenarios, and [we need your help](#how-do-i-get-involved) with that!
### Security risk
Enabling swap on a system without encryption poses a security risk, as critical information,
such as volumes that represent Kubernetes Secrets, [may be swapped out to the disk](/docs/concepts/configuration/secret/#information-security-for-secrets).
If an unauthorized individual gains
access to the disk, they could potentially obtain these confidential data. To mitigate this risk, the
Kubernetes project strongly recommends that you encrypt your swap space.
However, handling encrypted swap is not within the scope of
kubelet; rather, it is a general OS configuration concern and should be addressed at that level.
It is the administrator's responsibility to provision encrypted swap to mitigate this risk.
Furthermore, as previously mentioned, with `LimitedSwap` the user has the option to completely
disable swap usage for a container by specifying memory requests that are equal to memory limits.
This will prevent the corresponding containers from accessing swap memory.
## Looking ahead
The Kubernetes 1.28 release introduced Beta support for swap memory on Linux nodes,
and we will continue to work towards [general availability](/docs/reference/command-line-tools-reference/feature-gates/#feature-stages)
for this feature. I hope that this will include:
* Adding the ability to set a system-reserved quantity of swap from what the kubelet detects on the host.
* Adding support for controlling swap consumption at the Pod level via cgroups.
  * This point is still under discussion.
* Collecting feedback from user test cases.
* Considering new configuration modes for swap, such as a
  node-wide swap limit for workloads.
## How can I learn more?
You can review the current [documentation](/docs/concepts/architecture/nodes/#swap-memory)
for using swap with Kubernetes.
For more information, and to assist with testing and provide feedback, please
see [KEP-2400](https://github.com/kubernetes/enhancements/issues/4128) and its
[design proposal](https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/2400-node-swap/README.md).
## How do I get involved?
Your feedback is always welcome! SIG Node [meets regularly](https://github.com/kubernetes/community/tree/master/sig-node#meetings)
and [can be reached](https://github.com/kubernetes/community/tree/master/sig-node#contact)
via [Slack](https://slack.k8s.io/) (channel **#sig-node**), or the SIG's
[mailing list](https://groups.google.com/forum/#!forum/kubernetes-sig-node). A Slack
channel dedicated to swap is also available at **#sig-node-swap**.
Feel free to reach out to me, Itamar Holder (**@iholder101** on Slack and GitHub)
if you'd like to help or ask further questions.

View File

@ -0,0 +1,114 @@
---
layout: blog
title: "Kubernetes v1.28: Introducing native sidecar containers"
date: 2023-08-25
slug: native-sidecar-containers
---
***Authors:*** Todd Neal (AWS), Matthias Bertschy (ARMO), Sergey Kanzhelev (Google), Gunju Kim (NAVER), Shannon Kularathna (Google)
This post explains how to use the new sidecar feature, which enables restartable init containers and is available in alpha in Kubernetes 1.28. We want your feedback so that we can graduate this feature as soon as possible.
The concept of a “sidecar” has been part of Kubernetes since nearly the very beginning. In 2015, sidecars were described in a [blog post](/blog/2015/06/the-distributed-system-toolkit-patterns/) about composite containers as additional containers that “extend and enhance the main container”. Sidecar containers have become a common Kubernetes deployment pattern and are often used for network proxies or as part of a logging system. Until now, sidecars were a concept that Kubernetes users applied without native support. The lack of native support has caused some usage friction, which this enhancement aims to resolve.
## What are sidecar containers in 1.28?
Kubernetes 1.28 adds a new `restartPolicy` field to [init containers](/docs/concepts/workloads/pods/init-containers/) that is available when the `SidecarContainers` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/) is enabled.
```yaml
apiVersion: v1
kind: Pod
spec:
initContainers:
- name: secret-fetch
image: secret-fetch:1.0
- name: network-proxy
image: network-proxy:1.0
restartPolicy: Always
containers:
...
```
The field is optional and, if set, the only valid value is `Always`. Setting this field changes the behavior of init containers as follows:
- The container restarts if it exits
- Any subsequent init container starts immediately after the [startupProbe](/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#define-startup-probes) has successfully completed instead of waiting for the restartable init container to exit
- The resource usage calculation changes for the pod as restartable init container resources are now added to the sum of the resource requests by the main containers
[Pod termination](/docs/concepts/workloads/pods/pod-lifecycle/#pod-termination) continues to only depend on the main containers. An init container with a `restartPolicy` of `Always` (named a sidecar) won't prevent the pod from terminating after the main containers exit.
The following properties of restartable init containers make them ideal for the sidecar deployment pattern:
- Init containers have a well-defined startup order regardless of whether you set a `restartPolicy`, so you can ensure that your sidecar starts before any container declarations that come after the sidecar declaration in your manifest.
- Sidecar containers don't extend the lifetime of the Pod, so you can use them in short-lived Pods with no changes to the Pod lifecycle.
- Sidecar containers are restarted on exit, which improves resilience and lets you use sidecars to provide services that your main containers can more reliably consume.
## When to use sidecar containers
You might find built-in sidecar containers useful for workloads such as the following:
- **Batch or AI/ML workloads**, or other Pods that run to completion. These workloads will experience the most significant benefits.
- **Network proxies** that start up before any other container in the manifest. Every other container that runs can use the proxy container's services. For instructions, see the [Kubernetes Native sidecars in Istio blog post](https://istio.io/latest/blog/2023/native-sidecars/).
- **Log collection containers**, which can now start before any other container and run until the Pod terminates. This improves the reliability of log collection in your Pods.
- **Jobs**, which can use sidecars for any purpose without Job completion being blocked by the running sidecar. No additional configuration is required to ensure this behavior.
## How did users get sidecar behavior before 1.28?
Prior to the sidecar feature, the following options were available for implementing sidecar behavior depending on the desired lifetime of the sidecar container:
- **Lifetime of sidecar less than Pod lifetime**: Use an init container, which provides well-defined startup order. However, the sidecar has to exit for other init containers and main Pod containers to start.
- **Lifetime of sidecar equal to Pod lifetime**: Use a main container that runs alongside your workload containers in the Pod. This method doesn't give you control over startup order, and lets the sidecar container potentially block Pod termination after the workload containers exit.
The built-in sidecar feature solves for the use case of having a lifetime equal to the Pod lifetime and has the following additional benefits:
- Provides control over startup order
- Doesn't block Pod termination
## Transitioning existing sidecars to the new model
We recommend only using the sidecars feature gate in [short-lived testing clusters](/docs/reference/command-line-tools-reference/feature-gates/#feature-stages) at the alpha stage. If you have an existing sidecar that is configured as a main container so that it can run for the lifetime of the pod, it can be moved to the `initContainers` section of the pod spec and given a `restartPolicy` of `Always`. In many cases, the sidecar should work as before, with the added benefit of having a defined startup ordering and not prolonging the pod lifetime.
## Known issues
The alpha release of built-in sidecar containers has the following known issues, which we'll resolve before graduating the feature to beta:
- The CPU, memory, device, and topology manager are unaware of the sidecar container lifetime and additional resource usage, and will operate as if the Pod had lower resource requests than it actually does.
- The output of `kubectl describe node` is incorrect when sidecars are in use. The output shows resource usage that's lower than the actual usage because it doesn't use the new resource usage calculation for sidecar containers.
## We need your feedback!
In the alpha stage, we want you to try out sidecar containers in your environments and open issues if you encounter bugs or friction points. We're especially interested in feedback about the following:
- The shutdown sequence, especially with multiple sidecars running
- The backoff timeout adjustment for crashing sidecars
- The behavior of Pod readiness and liveness probes when sidecars are running
To open an issue, see the [Kubernetes GitHub repository](https://github.com/kubernetes/kubernetes/issues/new/choose).
## What's next?
In addition to the known issues that will be resolved, we're working on adding termination ordering for sidecar and main containers. This will ensure that sidecar containers only terminate after the Pod's main containers have exited.
We're excited to see the sidecar feature come to Kubernetes and are interested in feedback.
## Acknowledgements
Many years have passed since the original KEP was written, so we apologize if we've omitted anyone who worked on this feature over the years. This is a best-effort attempt to recognize the people involved in this effort.
- [mrunalp](https://github.com/mrunalp/) for design discussions and reviews
- [thockin](https://github.com/thockin/) for API discussions and support through the years
- [bobbypage](https://github.com/bobbypage) for reviews
- [smarterclayton](https://github.com/smarterclayton) for detailed review and feedback
- [howardjohn](https://github.com/howardjohn) for feedback over the years and for trying it early during implementation
- [derekwaynecarr](https://github.com/derekwaynecarr) and [dchen1107](https://github.com/dchen1107) for leadership
- [jpbetz](https://github.com/Jpbetz) for API and termination ordering designs as well as code reviews
- [Joseph-Irving](https://github.com/Joseph-Irving) and [rata](https://github.com/rata) for the early iterations of the design and reviews years back
- [swatisehgal](https://github.com/swatisehgal) and [ffromani](https://github.com/ffromani) for early feedback on resource managers impact
- [alculquicondor](https://github.com/Alculquicondor) for feedback on addressing the version skew of the scheduler
- [wojtek-t](https://github.com/Wojtek-t) for PRR review of a KEP
- [ahg-g](https://github.com/ahg-g) for reviewing the scheduler portion of a KEP
- [adisky](https://github.com/Adisky) for the Job completion issue
## More Information
- Read [API for sidecar containers](/docs/concepts/workloads/pods/init-containers/#api-for-sidecar-containers) in the Kubernetes documentation
- Read the [Sidecar KEP](https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/753-sidecar-containers/README.md)

View File

@ -0,0 +1,51 @@
---
layout: blog
title: "Kubernetes 1.28: A New (alpha) Mechanism For Safer Cluster Upgrades"
date: 2023-08-28
slug: kubernetes-1-28-feature-mixed-version-proxy-alpha
---
**Author:** Richa Banker (Google)
This blog describes the _mixed version proxy_, a new alpha feature in Kubernetes 1.28. The
mixed version proxy enables an HTTP request for a resource to be served by the correct API server
in cases where there are multiple API servers at varied versions in a cluster. For example,
this is useful during a cluster upgrade, or when you're rolling out the runtime configuration of
the cluster's control plane.
## What problem does this solve?
When a cluster undergoes an upgrade, the kube-apiservers existing at different versions in that scenario can serve different sets (groups, versions, resources) of built-in resources. A resource request made in this scenario may be served by any of the available apiservers, potentially resulting in the request ending up at an apiserver that may not be aware of the requested resource; consequently, the request would be served an incorrect 404 ("Not Found") error. Furthermore, incorrectly served 404 errors can lead to serious consequences such as namespace deletion being blocked incorrectly or objects being garbage collected mistakenly.
## How do we solve the problem?
{{< figure src="/images/blog/2023-08-28-a-new-alpha-mechanism-for-safer-cluster-upgrades/mvp-flow-diagram.svg" class="diagram-large" >}}
The new feature “Mixed Version Proxy” provides the kube-apiserver with the capability to proxy a request to a peer kube-apiserver which is aware of the requested resource and hence can serve the request. To do this, a new filter has been added to the handler chain in the API server's aggregation layer.
1. The new filter in the handler chain checks if the request is for a group/version/resource that the apiserver doesn't know about (using the existing [StorageVersion API](https://github.com/kubernetes/kubernetes/blob/release-1.28/pkg/apis/apiserverinternal/types.go#L25-L37)). If so, it proxies the request to one of the apiservers that is listed in the ServerStorageVersion object. If the identified peer apiserver fails to respond (due to reasons like network connectivity, or a race between the request being received and the controller registering the apiserver-resource info in the ServerStorageVersion object), then a 503 ("Service Unavailable") error is served.
2. To prevent indefinite proxying of the request, a (new for v1.28) HTTP header `X-Kubernetes-APIServer-Rerouted: true` is added to the original request once it is determined that the request cannot be served by the original API server. Setting that to true marks that the original API server couldn't handle the request and it should therefore be proxied. If a destination peer API server sees this header, it never proxies the request further.
3. To set the network location of a kube-apiserver that peers will use to proxy requests, the value passed in `--advertise-address` or (when `--advertise-address` is unspecified) the `--bind-address` flag is used. For users with network configurations that would not allow communication between peer kube-apiservers using the addresses specified in these flags, there is an option to pass in the correct peer address as `--peer-advertise-ip` and `--peer-advertise-port` flags that are introduced in this feature.
## How do I enable this feature?
Following are the required steps to enable the feature:
* Download the [latest Kubernetes project](/releases/download/) (version `v1.28.0` or later)
* Switch on the feature gate with the command line flag `--feature-gates=UnknownVersionInteroperabilityProxy=true` on the kube-apiservers
* Pass the CA bundle that will be used by the source kube-apiserver to authenticate the destination kube-apiserver's serving certificates, using the flag `--peer-ca-file` on the kube-apiservers. Note: this is a required flag for this feature to work; there is no default value for this flag.
* Pass the correct IP and port of the local kube-apiserver that will be used by peers to connect to this kube-apiserver while proxying a request. Use the flags `--peer-advertise-ip` and `--peer-advertise-port` on the kube-apiservers upon startup. If unset, the value passed to either `--advertise-address` or `--bind-address` is used. If those, too, are unset, the host's default interface will be used.
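Putting the steps above together, and combined with the rest of your usual kube-apiserver flags, the relevant flags might look like the following sketch (the CA file path, IP, and port are placeholders for your own environment):

```shell
kube-apiserver \
  --feature-gates=UnknownVersionInteroperabilityProxy=true \
  --peer-ca-file=/etc/kubernetes/pki/ca.crt \
  --peer-advertise-ip=192.0.2.10 \
  --peer-advertise-port=6443
```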
## What's missing?
Currently we only proxy resource requests to a peer kube-apiserver when it's determined necessary. Next we need to address how discovery requests should work in such scenarios. Right now, we are planning to have the following capabilities for beta:
* Merged discovery across all kube-apiservers
* Use an egress dialer for network connections made to peer kube-apiservers
## How can I learn more?
- Read the [Mixed Version Proxy documentation](/docs/concepts/architecture/mixed-version-proxy)
- Read [KEP-4020: Unknown Version Interoperability Proxy](https://github.com/kubernetes/enhancements/tree/master/keps/sig-api-machinery/4020-unknown-version-interoperability-proxy)
## How can I get involved?
Reach us on [Slack](https://slack.k8s.io/): [#sig-api-machinery](https://kubernetes.slack.com/messages/sig-api-machinery), or through the [mailing list](https://groups.google.com/forum/#!forum/kubernetes-sig-api-machinery).
Huge thanks to the contributors that have helped in the design, implementation, and review of this feature: Daniel Smith, Han Kang, Joe Betz, Jordan Liggitt, Antonio Ojea, David Eads and Ben Luddy!

View File

@ -0,0 +1,195 @@
---
layout: blog
title: "Gateway API v0.8.0: Introducing Service Mesh Support"
date: 2023-08-29T10:00:00-08:00
slug: gateway-api-v0-8
---
***Authors:*** Flynn (Buoyant), John Howard (Google), Keith Mattix (Microsoft), Michael Beaumont (Kong), Mike Morris (independent), Rob Scott (Google)
We are thrilled to announce the v0.8.0 release of Gateway API! With this
release, Gateway API support for service mesh has reached [Experimental
status][status]. We look forward to your feedback!
We're especially delighted to announce that Kuma 2.3+, Linkerd 2.14+, and Istio
1.16+ are all fully-conformant implementations of Gateway API service mesh
support.
## Service mesh support in Gateway API
While the initial focus of Gateway API was always ingress (north-south)
traffic, it was clear almost from the beginning that the same basic routing
concepts should also be applicable to service mesh (east-west) traffic. In
2022, the Gateway API subproject started the [GAMMA initiative][gamma], a
dedicated vendor-neutral workstream, specifically to examine how best to fit
service mesh support into the framework of the Gateway API resources, without
requiring users of Gateway API to relearn everything they understand about the
API.
Over the last year, GAMMA has dug deeply into the challenges and possible
solutions around using Gateway API for service mesh. The end result is a small
number of [enhancement proposals][geps] that subsume many hours of thought and
debate, and provide a minimum viable path to allow Gateway API to be used for
service mesh.
### How will mesh routing work when using Gateway API?
You can find all the details in the [Gateway API Mesh routing
documentation][mesh-routing] and [GEP-1426], but the short version for Gateway
API v0.8.0 is that an HTTPRoute can now have a `parentRef` that is a Service,
rather than just a Gateway. We anticipate future GEPs in this area as we gain
more experience with service mesh use cases -- binding to a Service makes it
possible to use the Gateway API with a service mesh, but there are several
interesting use cases that remain difficult to cover.
As an example, you might use an HTTPRoute to do an A-B test in the mesh as
follows:
```yaml
apiVersion: gateway.networking.k8s.io/v1beta1
kind: HTTPRoute
metadata:
name: bar-route
spec:
parentRefs:
- group: ""
kind: Service
name: demo-app
port: 5000
rules:
- matches:
- headers:
- type: Exact
name: env
value: v1
backendRefs:
- name: demo-app-v1
port: 5000
- backendRefs:
- name: demo-app-v2
port: 5000
```
Any request to port 5000 of the `demo-app` Service that has the header `env:
v1` will be routed to `demo-app-v1`, while any request without that header
will be routed to `demo-app-v2` -- and since this is being handled by the
service mesh, not the ingress controller, the A/B test can happen anywhere in
the application's call graph.
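For illustration only, assuming the `demo-app` Services shown above exist in your mesh and you have a workload with `curl` available, you could exercise both paths like this (the manifest filename is hypothetical):

```shell
kubectl apply -f bar-route.yaml           # the HTTPRoute shown above
curl -H 'env: v1' http://demo-app:5000/   # handled by demo-app-v1
curl http://demo-app:5000/                # handled by demo-app-v2
```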
### How do I know this will be truly portable?
Gateway API has been investing heavily in conformance tests across all
features it supports, and mesh is no exception. One of the challenges that the
GAMMA initiative ran into is that many of these tests were strongly tied to
the idea that a given implementation provides an ingress controller. Many
service meshes don't, and requiring a GAMMA-conformant mesh to also implement
an ingress controller seemed impractical at best. This resulted in work
restarting on Gateway API _conformance profiles_, as discussed in [GEP-1709].
The basic idea of conformance profiles is that we can define subsets of the
Gateway API, and allow implementations to choose (and document) which subsets
they conform to. GAMMA is adding a new profile, named `Mesh` and described in
[GEP-1686], which checks only the mesh functionality as defined by GAMMA. At
this point, Kuma 2.3+, Linkerd 2.14+, and Istio 1.16+ are all conformant with
the `Mesh` profile.
## What else is in Gateway API v0.8.0?
This release is all about preparing Gateway API for the upcoming v1.0 release
where HTTPRoute, Gateway, and GatewayClass will graduate to GA. There are two
main changes related to this: CEL validation and API version changes.
### CEL Validation
The first major change is that Gateway API v0.8.0 is the start of a transition
from webhook validation to [CEL validation][cel] using information built into
the CRDs. That will mean different things depending on the version of
Kubernetes you're using:
#### Kubernetes 1.25+
CEL validation is fully supported, and almost all validation is implemented in
CEL. (The sole exception is that header names in header modifier filters can
only do case-insensitive validation. There is more information in [issue
2277].)
We recommend _not_ using the validating webhook on these Kubernetes versions.
#### Kubernetes 1.23 and 1.24
CEL validation is not supported, but Gateway API v0.8.0 CRDs can still be
installed. When you upgrade to Kubernetes 1.25+, the validation included in
these CRDs will automatically take effect.
We recommend continuing to use the validating webhook on these Kubernetes
versions.
#### Kubernetes 1.22 and older
Gateway API only commits to support for [5 most recent versions of
Kubernetes][supported-versions]. As such, these versions are no longer
supported by Gateway API, and unfortunately Gateway API v0.8.0 cannot be
installed on them, since CRDs containing CEL validation will be rejected.
### API Version Changes
As we prepare for a v1.0 release that will graduate Gateway, GatewayClass, and
HTTPRoute to the `v1` API Version from `v1beta1`, we are continuing the process
of moving away from `v1alpha2` for resources that have graduated to `v1beta1`.
For more information on this change and everything else included in this
release, refer to the [v0.8.0 release notes][v0.8.0 release notes].
## How can I get started with Gateway API?
Gateway API represents the future of load balancing, routing, and service mesh
APIs in Kubernetes. There are already more than 20 [implementations][impl]
available (including both ingress controllers and service meshes) and the list
keeps growing.
If you're interested in getting started with Gateway API, take a look at the
[API concepts documentation][concepts] and check out some of the
[Guides][guides] to try it out. Because this is a CRD-based API, you can
install the latest version on any Kubernetes 1.23+ cluster.
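For example, one way to install the v0.8.0 CRDs from the standard release channel is shown below; check the getting started guide linked above for the current instructions:

```shell
kubectl apply -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v0.8.0/standard-install.yaml
```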
If you're specifically interested in helping to contribute to Gateway API, we
would love to have you! Please feel free to [open a new issue][issue] on the
repository, or join in the [discussions][disc]. Also check out the [community
page][community] which includes links to the Slack channel and community
meetings. We look forward to seeing you!!
## Further Reading:
- [GEP-1324] provides an overview of the GAMMA goals and some important
definitions. This GEP is well worth a read for its discussion of the problem
space.
- [GEP-1426] defines how to use Gateway API route resources, such as
HTTPRoute, to manage traffic within a service mesh.
- [GEP-1686] builds on the work of [GEP-1709] to define a _conformance
profile_ for service meshes to be declared conformant with Gateway API.
Although these are [Experimental][status] patterns, note that they are available
in the [`standard` release channel][ch], since the GAMMA initiative has not
needed to introduce new resources or fields to date.
[gamma]:https://gateway-api.sigs.k8s.io/concepts/gamma/
[status]:https://gateway-api.sigs.k8s.io/geps/overview/#status
[ch]:https://gateway-api.sigs.k8s.io/concepts/versioning/#release-channels-eg-experimental-standard
[cel]:/docs/reference/using-api/cel/
[crd]:/docs/tasks/extend-kubernetes/custom-resources/custom-resource-definitions/
[concepts]:https://gateway-api.sigs.k8s.io/concepts/api-overview/
[geps]:https://gateway-api.sigs.k8s.io/contributing/enhancement-requests/
[guides]:https://gateway-api.sigs.k8s.io/guides/getting-started/
[impl]:https://gateway-api.sigs.k8s.io/implementations/
[install-crds]:https://gateway-api.sigs.k8s.io/guides/getting-started/#install-the-crds
[issue]:https://github.com/kubernetes-sigs/gateway-api/issues/new/choose
[disc]:https://github.com/kubernetes-sigs/gateway-api/discussions
[community]:https://gateway-api.sigs.k8s.io/contributing/community/
[mesh-routing]:https://gateway-api.sigs.k8s.io/concepts/gamma/#how-the-gateway-api-works-for-service-mesh
[GEP-1426]:https://gateway-api.sigs.k8s.io/geps/gep-1426/
[GEP-1324]:https://gateway-api.sigs.k8s.io/geps/gep-1324/
[GEP-1686]:https://gateway-api.sigs.k8s.io/geps/gep-1686/
[GEP-1709]:https://gateway-api.sigs.k8s.io/geps/gep-1709/
[issue 2277]:https://github.com/kubernetes-sigs/gateway-api/issues/2277
[supported-versions]:https://gateway-api.sigs.k8s.io/concepts/versioning/#supported-versions
[v0.8.0 release notes]:https://github.com/kubernetes-sigs/gateway-api/releases/tag/v0.8.0
[versioning docs]:https://gateway-api.sigs.k8s.io/concepts/versioning/

@ -0,0 +1,214 @@
---
layout: blog
title: "Kubernetes Legacy Package Repositories Will Be Frozen On September 13, 2023"
date: 2023-08-31T15:30:00-07:00
slug: legacy-package-repository-deprecation
evergreen: true
---
**Authors**: Bob Killen (Google), Chris Short (AWS), Jeremy Rickard (Microsoft), Marko Mudrinić (Kubermatic), Tim Bannister (The Scale Factory)
On August 15, 2023, the Kubernetes project announced the general availability of
the community-owned package repositories for Debian and RPM packages available
at `pkgs.k8s.io`. The new package repositories are a replacement for the legacy
Google-hosted package repositories: `apt.kubernetes.io` and `yum.kubernetes.io`.
The
[announcement blog post for `pkgs.k8s.io`](/blog/2023/08/15/pkgs-k8s-io-introduction/)
highlighted that we will stop publishing packages to the legacy repositories in
the future.
Today, we're formally deprecating the legacy package repositories (`apt.kubernetes.io`
and `yum.kubernetes.io`), and we're announcing our plans to freeze the contents of
the repositories as of **September 13, 2023**.
Please continue reading to learn what this means for you as a user or
distributor, and what steps you may need to take.
## How does this affect me as a Kubernetes end user?
This change affects users **directly installing upstream versions of Kubernetes**,
either manually by following the official
[installation](/docs/setup/production-environment/tools/kubeadm/install-kubeadm/) and
[upgrade](/docs/tasks/administer-cluster/kubeadm/kubeadm-upgrade/) instructions, or
by **using a Kubernetes installer** that's using packages provided by the Kubernetes
project.
**This change also affects you if you run Linux on your own PC and have installed `kubectl` using the legacy package repositories**.
We'll explain later on how to [check](#check-if-affected) if you're affected.
If you use **fully managed** Kubernetes, for example through a service from a cloud
provider, you would only be affected by this change if you also installed `kubectl`
on your Linux PC using packages from the legacy repositories. Cloud providers are
generally using their own Kubernetes distributions and therefore they don't use
packages provided by the Kubernetes project; more importantly, if someone else is
managing Kubernetes for you, then they would usually take responsibility for that check.
If you have a managed [control plane](/docs/concepts/overview/components/#control-plane-components)
but you are responsible for **managing the nodes yourself**, and any of those nodes run Linux,
you should [check](#check-if-affected) whether you are affected.
If you're managing your clusters on your own by following the official installation
and upgrade instructions, please follow the instructions in this blog post to migrate
to the (new) community-owned package repositories.
If you're using a Kubernetes installer that's using packages provided by the
Kubernetes project, please check the installer tool's communication channels for
information about what steps you need to take, and eventually if needed, follow up
with maintainers to let them know about this change.
The following diagram shows who's affected by this change in a visual form
(click on diagram for the larger version):
{{< figure src="/blog/2023/08/31/legacy-package-repository-deprecation/flow.svg" alt="Visual explanation of who's affected by the legacy repositories being deprecated and frozen. Textual explanation is available above this diagram." class="diagram-large" link="/blog/2023/08/31/legacy-package-repository-deprecation/flow.svg" >}}
## How does this affect me as a Kubernetes distributor?
If you're using the legacy repositories as part of your project (e.g. a Kubernetes
installer tool), you should migrate to the community-owned repositories as soon as
possible and inform your users about this change and what steps they need to take.
## Timeline of changes
<!-- note to maintainers - the trailing whitespace is significant -->
- **15th August 2023:**
Kubernetes announces a new, community-managed source for Linux software packages of Kubernetes components
- **31st August 2023:**
_(this announcement)_ Kubernetes formally deprecates the legacy
package repositories
- **13th September 2023** (approximately):
Kubernetes will freeze the legacy package repositories,
(`apt.kubernetes.io` and `yum.kubernetes.io`).
The freeze will happen immediately following the patch releases that are scheduled for September 2023.
The Kubernetes patch releases scheduled for September 2023 (v1.28.2, v1.27.6,
v1.26.9, v1.25.14) will have packages published **both** to the community-owned and
the legacy repositories.
We'll freeze the legacy repositories after cutting the patch releases for September
which means that we'll completely stop publishing packages to the legacy repositories
at that point.
For the v1.28, v1.27, v1.26, and v1.25 patch releases from October 2023 and onwards,
we'll only publish packages to the new package repositories (`pkgs.k8s.io`).
### What about future minor releases?
Kubernetes 1.29 and onwards will have packages published **only** to the
community-owned repositories (`pkgs.k8s.io`).
## Can I continue to use the legacy package repositories?
The existing packages in the legacy repositories will be available for the foreseeable
future. However, the Kubernetes project can't provide _any_ guarantees on how long
that will be. The deprecated legacy repositories, and their contents, might
be removed at any time in the future and without further notice.
The Kubernetes project **strongly recommends** migrating to the new community-owned
repositories **as soon as possible**.
Given that no new releases will be published to the legacy repositories **after the September 13, 2023**
cut-off point, **you will not be able to upgrade to any patch or minor release made from that date onwards.**
Whilst the project makes every effort to release secure software, there may one
day be a high-severity vulnerability in Kubernetes, and consequently an important
release to upgrade to. The advice we're announcing will help you be as prepared as possible for
any future security update, whether trivial or urgent.
## How can I check if I'm using the legacy repositories? {#check-if-affected}
The steps to check if you're using the legacy repositories depend on whether you're
using Debian-based distributions (Debian, Ubuntu, and more) or RPM-based distributions
(CentOS, RHEL, Rocky Linux, and more) in your cluster.
Run these instructions on one of your nodes in the cluster.
### Debian-based Linux distributions
The repository definitions (sources) are located in `/etc/apt/sources.list` and `/etc/apt/sources.list.d/`
on Debian-based distributions. Inspect these two locations and try to locate a
package repository definition that looks like:
```
deb [signed-by=/etc/apt/keyrings/kubernetes-archive-keyring.gpg] https://apt.kubernetes.io/ kubernetes-xenial main
```
**If you find a repository definition that looks like this, you're using the legacy repository and you need to migrate.**
If the repository definition uses `pkgs.k8s.io`, you're already using the
community-hosted repositories and you don't need to take any action.
On most systems, this repository definition should be located in
`/etc/apt/sources.list.d/kubernetes.list` (as recommended by the Kubernetes
documentation), but on some systems it might be in a different location.
If you can't find a repository definition related to Kubernetes, it's likely that you
don't use package managers to install Kubernetes and you don't need to take any action.
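If you prefer a one-liner, something like the following (a rough sketch; adjust the paths to your system) will surface any legacy Kubernetes repository entries:
```
grep -rsE 'apt\.kubernetes\.io|packages\.cloud\.google\.com' /etc/apt/sources.list /etc/apt/sources.list.d/
```
No output means no legacy Debian-style repository definitions were found.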
### RPM-based Linux distributions
The repository definitions are located in `/etc/yum.repos.d` if you're using the
`yum` package manager, or `/etc/dnf/dnf.conf` and `/etc/dnf/repos.d/` if you're using
the `dnf` package manager. Inspect those locations and try to locate a package repository
definition that looks like this:
```
[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-\$basearch
enabled=1
gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
exclude=kubelet kubeadm kubectl
```
**If you find a repository definition that looks like this, you're using the legacy repository and you need to migrate.**
If the repository definition uses `pkgs.k8s.io`, you're already using the
community-hosted repositories and you don't need to take any action.
On most systems, that repository definition should be located in `/etc/yum.repos.d/kubernetes.repo`
(as recommended by the Kubernetes documentation), but on some systems it might be
in a different location.
If you can't find a repository definition related to Kubernetes, it's likely that you
don't use package managers to install Kubernetes and you don't need to take any action.
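Similarly to the Debian case, a quick (non-authoritative) way to scan for legacy entries is:
```
grep -rsE 'yum\.kubernetes\.io|packages\.cloud\.google\.com' /etc/yum.repos.d/
```
No output means no legacy yum/dnf repository definitions were found in that location.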
## How can I migrate to the new community-operated repositories?
For more information on how to migrate to the new community
managed packages, please refer to the
[announcement blog post for `pkgs.k8s.io`](/blog/2023/08/15/pkgs-k8s-io-introduction/).
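For orientation, a migrated Debian-style repository definition (here pinned to the v1.28 package stream as an example) looks like this; the RPM equivalent uses `https://pkgs.k8s.io/core:/stable:/v1.28/rpm/` as the `baseurl`:
```
deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.28/deb/ /
```
The announcement post linked above contains the full, authoritative migration steps, including downloading the new signing key.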
## Why is the Kubernetes project making this change?
Kubernetes has been publishing packages solely to the Google-hosted repository
since Kubernetes v1.5, or the past **seven** years! Following in the footsteps of
migrating to our community-managed registry, `registry.k8s.io`, we are now migrating the
Kubernetes package repositories to our own community-managed infrastructure. We're
thankful to Google for their continuous hosting and support all these years, but
this transition marks another big milestone for the project's goal of migrating
to complete community-owned infrastructure.
## Is there a Kubernetes tool to help me migrate?
We don't have any announcement to make about tooling there. As a Kubernetes user, you
have to manually modify your configuration to use the new repositories. Automating
the migration from the legacy to the community-owned repositories is technically
challenging and we want to avoid any potential risks associated with this.
## Acknowledgments
First of all, we want to acknowledge the contributions from Alphabet. Staff at Google
have provided their time; Google as a business has provided both the infrastructure
to serve packages, and the security context for giving those packages trustworthy
digital signatures.
These have been important to the adoption and growth of Kubernetes.
Releasing software might not be glamorous but it's important. Many people within
the Kubernetes contributor community have contributed to the new way that we, as a
project, have for building and publishing packages.
And finally, we want to once again acknowledge the help from SUSE. OpenBuildService,
from SUSE, is the technology that powers the new community-managed package repositories.


@ -0,0 +1,102 @@
---
layout: blog
title: 'Comparing Local Kubernetes Development Tools: Telepresence, Gefyra, and mirrord'
date: 2023-09-12
slug: local-k8s-development-tools
---
**Author:** Eyal Bukchin (MetalBear)
The Kubernetes development cycle is an evolving landscape with a myriad of tools seeking to streamline the process. Each tool has its unique approach, and the choice often comes down to individual project requirements, the team's expertise, and the preferred workflow.
Among the various solutions, a category we dubbed “Local K8S Development tools” has emerged, which seeks to enhance the Kubernetes development experience by connecting locally running components to the Kubernetes cluster. This facilitates rapid testing of new code in cloud conditions, circumventing the traditional cycle of Dockerization, CI, and deployment.
In this post, we compare three solutions in this category: Telepresence, Gefyra, and our own contender, mirrord.
## Telepresence
The oldest and most well-established solution in the category, [Telepresence](https://www.telepresence.io/) uses a VPN (or more specifically, a `tun` device) to connect the user's machine (or a locally running container) and the cluster's network. It then supports the interception of incoming traffic to a specific service in the cluster, and its redirection to a local port. The traffic being redirected can also be filtered to avoid completely disrupting the remote service. It also offers complementary features to support file access (by locally mounting a volume mounted to a pod) and importing environment variables.
Telepresence requires the installation of a local daemon on the user's machine (which requires root privileges) and a Traffic Manager component on the cluster. Additionally, it runs an Agent as a sidecar on the pod to intercept the desired traffic.
## Gefyra
[Gefyra](https://gefyra.dev/), similar to Telepresence, employs a VPN to connect to the cluster. However, it only supports connecting locally running Docker containers to the cluster. This approach enhances portability across different OSes and local setups. However, the downside is that it does not support natively run uncontainerized code.
Gefyra primarily focuses on network traffic, leaving file access and environment variables unsupported. Unlike Telepresence, it doesn't alter the workloads in the cluster, ensuring a straightforward clean-up process if things go awry.
## mirrord
The newest of the three tools, [mirrord](https://mirrord.dev/) adopts a different approach by injecting itself
into the local binary (utilizing `LD_PRELOAD` on Linux or `DYLD_INSERT_LIBRARIES` on macOS),
and overriding libc function calls, which it then proxies to a temporary agent it runs in the cluster.
For example, when the local process tries to read a file, mirrord intercepts that call and sends it
to the agent, which then reads the file from the remote pod. This method allows mirrord to cover
all inputs and outputs of the process uniformly: network access, file access, and
environment variables.
By working at the process level, mirrord supports running multiple local processes simultaneously, each in the context of their respective pod in the cluster, without requiring them to be containerized and without needing root permissions on the user's machine.
## Summary
<table>
<caption>Comparison of Telepresence, Gefyra, and mirrord</caption>
<thead>
<tr>
<td class="empty"></td>
<th>Telepresence</th>
<th>Gefyra</th>
<th>mirrord</th>
</tr>
</thead>
<tbody>
<tr>
<th scope="row">Cluster connection scope</th>
<td>Entire machine or container</td>
<td>Container</td>
<td>Process</td>
</tr>
<tr>
<th scope="row">Developer OS support</th>
<td>Linux, macOS, Windows</td>
<td>Linux, macOS, Windows</td>
<td>Linux, macOS, Windows (WSL)</td>
</tr>
<tr>
<th scope="row">Incoming traffic features</th>
<td>Interception</td>
<td>Interception</td>
<td>Interception or mirroring</td>
</tr>
<tr>
<th scope="row">File access</th>
<td>Supported</td>
<td>Unsupported</td>
<td>Supported</td>
</tr>
<tr>
<th scope="row">Environment variables</th>
<td>Supported</td>
<td>Unsupported</td>
<td>Supported</td>
</tr>
<tr>
<th scope="row">Requires local root</th>
<td>Yes</td>
<td>No</td>
<td>No</td>
</tr>
<tr>
<th scope="row">How to use</th>
<td><ul><li>CLI</li><li>Docker Desktop extension</li></ul></td>
<td><ul><li>CLI</li><li>Docker Desktop extension</li></ul></td>
<td><ul><li>CLI</li><li>Visual Studio Code extension</li><li>IntelliJ plugin</li></ul></td>
</tr>
</tbody>
</table>
## Conclusion
Telepresence, Gefyra, and mirrord each offer a unique approach to streamlining the Kubernetes development cycle, each with its own strengths and weaknesses. Telepresence is feature-rich but comes with complexities; mirrord offers a seamless experience and covers network, file, and environment variable access; Gefyra aims for simplicity and robustness.
Your choice between them should depend on the specific requirements of your project, your team's familiarity with the tools, and the desired development workflow. Whichever tool you choose, we believe the local Kubernetes development approach can provide an easy, effective, and cheap solution to the bottlenecks of the Kubernetes development cycle, and will become even more prevalent as these tools continue to innovate and evolve.


@ -0,0 +1,148 @@
---
layout: blog
title: "User Namespaces: Now Supports Running Stateful Pods in Alpha!"
date: 2023-09-13
slug: userns-alpha
---
**Authors:** Rodrigo Campos Catelin (Microsoft), Giuseppe Scrivano (Red Hat), Sascha Grunert (Red Hat)
Kubernetes v1.25 introduced support for user namespaces only for stateless
pods. Kubernetes 1.28 lifted that restriction, after some design changes were
made in 1.27.
The beauty of this feature is that:
* it is trivial to adopt (you just need to set a bool in the pod spec)
* doesn't need any changes for **most** applications
* improves security by _drastically_ enhancing the isolation of containers and
mitigating CVEs rated HIGH and CRITICAL.
This post explains the basics of user namespaces and also shows:
* the changes that arrived in the recent Kubernetes v1.28 release
* a **demo of a vulnerability rated as HIGH** that is not exploitable with user namespaces
* the runtime requirements to use this feature
* what you can expect in future releases regarding user namespaces.
## What is a user namespace?
A user namespace is a Linux feature that isolates the user and group identifiers
(UIDs and GIDs) of the containers from the ones on the host. The identifiers
in the container can be mapped to identifiers on the host in a way where the
host UID/GIDs used for different containers never overlap. Even more, the
identifiers can be mapped to *unprivileged* non-overlapping UIDs and GIDs on the
host. This basically means two things:
* As the UIDs and GIDs for different containers are mapped to different UIDs
and GIDs on the host, containers have a harder time to attack each other even
if they escape the container boundaries. For example, if container A is running
with different UIDs and GIDs on the host than container B, the operations it
can do on container B's files and process are limited: only read/write what a
file allows to others, as it will never have permission for the owner or
group (the UIDs/GIDs on the host are guaranteed to be different for
different containers).
* As the UIDs and GIDs are mapped to unprivileged users on the host, if a
container escapes the container boundaries, even if it is running as root
inside the container, it has no privileges on the host. This greatly
protects what host files it can read/write, which process it can send signals
to, etc.
Furthermore, capabilities granted are only valid inside the user namespace and
not on the host.
Without using a user namespace a container running as root, in the case of a
container breakout, has root privileges on the node. And if some capabilities
were granted to the container, the capabilities are valid on the host too. None
of this is true when using user namespaces (modulo bugs, of course 🙂).
## Changes in 1.28
As already mentioned, starting from 1.28, Kubernetes supports user namespaces
with stateful pods. This means that pods with user namespaces can use any type
of volume; they are no longer limited to only some volume types as before.
The feature gate to activate this feature was renamed: it is no longer
`UserNamespacesStatelessPodsSupport`; from 1.28 onwards you should use
`UserNamespacesSupport`. Many changes were made, and the requirements on
the node hosts changed, so with Kubernetes 1.28 the feature flag was renamed to
reflect this.
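As a minimal sketch of what opting in looks like (the image and the `demo-pvc` claim below are illustrative assumptions, and the cluster needs the `UserNamespacesSupport` feature gate plus a supported runtime, per the requirements later in this post):
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: userns-demo
spec:
  hostUsers: false            # the single boolean that opts this Pod into a user namespace
  containers:
  - name: app
    image: busybox:1.36       # illustrative image choice
    command: ["sleep", "infinity"]
    volumeMounts:
    - name: data
      mountPath: /data
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: demo-pvc     # assumed to exist; from 1.28 any volume type can be used
```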
## Demo
Rodrigo created a demo which exploits [CVE 2022-0492][cve-link] and shows how
the exploit can occur without user namespaces. He also shows how it is not
possible to use this exploit from a Pod where the containers are using this
feature.
This vulnerability is rated **HIGH** and allows **a container with no special
privileges to read/write to any path on the host** and launch processes as root
on the host too.
{{< youtube id="M4a2b4KkXN8" title="Mitigation of CVE-2022-0492 on Kubernetes by enabling User Namespace support">}}
Most applications in containers run as root today, or as a semi-predictable
non-root user (user ID 65534 is a somewhat popular choice). When you run a Pod
with containers using a userns, Kubernetes runs those containers as unprivileged
users, with no changes needed in your app.
This means two containers running as user 65534 will effectively be mapped to
different users on the host, limiting what they can do to each other in case of
an escape, and if they are running as root, the privileges on the host are
reduced to those of an unprivileged user.
[cve-link]: https://unit42.paloaltonetworks.com/cve-2022-0492-cgroups/
## Node system requirements
There are requirements on the Linux kernel version as well as the container
runtime to use this feature.
On Linux you need Linux 6.3 or greater. This is because the feature relies on a
kernel feature named idmap mounts, and support to use idmap mounts with tmpfs
was merged in Linux 6.3.
If you are using CRI-O with crun, this is [supported in CRI-O
1.28.1][CRIO-release] and crun 1.9 or greater. If you are using CRI-O with runc,
this is still not supported.
containerd support is currently targeted for containerd 2.0; it is likely that
it won't matter if you use it with crun or runc.
Please note that containerd 1.7 added _experimental_ support for user
namespaces as implemented in Kubernetes 1.25 and 1.26. The redesign done in 1.27
is not supported by containerd 1.7, therefore it only works, in terms of user
namespaces support, with Kubernetes 1.25 and 1.26.
One limitation present in containerd 1.7 is that it needs to change the
ownership of every file and directory inside the container image, during Pod
startup. This means it has a storage overhead and can significantly impact the
container startup latency. Containerd 2.0 will probably include an implementation
that will eliminate the added startup latency and the storage overhead. Take
this into account if you plan to use containerd 1.7 with user namespaces in
production.
None of these containerd limitations apply to [CRI-O 1.28][CRIO-release].
[CRIO-release]: https://github.com/cri-o/cri-o/releases/tag/v1.28.1
## Whats next?
Looking ahead to Kubernetes 1.29, the plan is to work with SIG Auth to integrate user
namespaces to Pod Security Standards (PSS) and the Pod Security Admission. For
the time being, the plan is to relax checks in PSS policies when user namespaces are
in use. This means that the fields `spec[.*].securityContext` `runAsUser`,
`runAsNonRoot`, `allowPrivilegeEscalation` and `capabilities` will not trigger a
violation if user namespaces are in use. The behavior will probably be controlled by
utilizing an API Server feature gate, like `UserNamespacesPodSecurityStandards`
or similar.
## How do I get involved?
You can reach SIG Node by several means:
- Slack: [#sig-node](https://kubernetes.slack.com/messages/sig-node)
- [Mailing list](https://groups.google.com/forum/#!forum/kubernetes-sig-node)
- [Open Community Issues/PRs](https://github.com/kubernetes/community/labels/sig%2Fnode)
You can also contact us directly:
- GitHub: @rata @giuseppe @saschagrunert
- Slack: @rata @giuseppe @sascha


@ -0,0 +1,111 @@
---
layout: blog
title: 'kubeadm: Use etcd Learner to Join a Control Plane Node Safely'
date: 2023-09-25
slug: kubeadm-use-etcd-learner-mode
---
**Author:** Paco Xu (DaoCloud)
The [`kubeadm`](/docs/reference/setup-tools/kubeadm/) tool now supports etcd learner mode, which
allows you to enhance the resilience and stability
of your Kubernetes clusters by leveraging the [learner mode](https://etcd.io/docs/v3.4/learning/design-learner/#appendix-learner-implementation-in-v34)
feature introduced in etcd version 3.4.
This guide will walk you through using etcd learner mode with kubeadm. By default, kubeadm runs
a local etcd instance on each control plane node.
In v1.27, kubeadm introduced a new feature gate `EtcdLearnerMode`. With this feature gate enabled,
when joining a new control plane node, a new etcd member will be created as a learner and
promoted to a voting member only after the etcd data are fully aligned.
## What are the advantages of using etcd learner mode?
etcd learner mode offers several compelling reasons to consider its adoption
in Kubernetes clusters:
1. **Enhanced Resilience**: etcd learner nodes are non-voting members that catch up with
the leader's logs before becoming fully operational. This prevents new cluster members
from disrupting the quorum or causing leader elections, making the cluster more resilient
during membership changes.
1. **Reduced Cluster Unavailability**: Traditional approaches to adding new members often
result in cluster unavailability periods, especially in slow infrastructure or misconfigurations.
etcd learner mode minimizes such disruptions.
1. **Simplified Maintenance**: Learner nodes provide a safer and reversible way to add or replace
cluster members. This reduces the risk of accidental cluster outages due to misconfigurations or
missteps during member additions.
1. **Improved Network Tolerance**: In scenarios involving network partitions, learner mode allows
for more graceful handling. Depending on which partition a new member lands in, it can seamlessly
integrate with the existing cluster without causing disruptions.
In summary, the etcd learner mode improves the reliability and manageability of Kubernetes clusters
during member additions and changes, making it a valuable feature for cluster operators.
## How nodes join a cluster that's using the new mode
### Create a Kubernetes cluster backed by etcd in learner mode {#create-K8s-cluster-etcd-learner-mode}
For a general explanation about creating highly available clusters with kubeadm, you can refer to
[Creating Highly Available Clusters with kubeadm](/docs/setup/production-environment/tools/kubeadm/high-availability/).
To create a Kubernetes cluster, backed by etcd in learner mode, using kubeadm, follow these steps:
```shell
# kubeadm init --feature-gates=EtcdLearnerMode=true ...
kubeadm init --config=kubeadm-config.yaml
```
The kubeadm configuration file is like below:
```yaml
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
featureGates:
EtcdLearnerMode: true
```
The kubeadm tool deploys a single-node Kubernetes cluster with etcd set to use learner mode.
### Join nodes to the Kubernetes cluster
Before joining a control-plane node to the new Kubernetes cluster, ensure that the existing control plane nodes
and all etcd members are healthy.
Check the cluster health with `etcdctl`. If `etcdctl` isn't available, you can run this tool inside a container image.
You would do that directly with your container runtime using a tool such as `crictl run`, not through Kubernetes.
Here is an example of a client command that uses secure communication to check the cluster health of the etcd cluster:
```shell
ETCDCTL_API=3 etcdctl --endpoints 127.0.0.1:2379 \
--cert=/etc/kubernetes/pki/etcd/server.crt \
--key=/etc/kubernetes/pki/etcd/server.key \
--cacert=/etc/kubernetes/pki/etcd/ca.crt \
member list
...
dc543c4d307fadb9, started, node1, https://10.6.177.40:2380, https://10.6.177.40:2379, false
```
To check if the Kubernetes control plane is healthy, run `kubectl get node -l node-role.kubernetes.io/control-plane=`
and check if the nodes are ready.
{{< note >}}
It is recommended to have an odd number of members in an etcd cluster.
{{< /note >}}
Before joining a worker node to the new Kubernetes cluster, ensure that the control plane nodes are healthy.
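For completeness, a join sketch follows; the endpoint, token, hash, and certificate key are placeholders that come from your own `kubeadm init` output (or from `kubeadm token create --print-join-command`):
```shell
# Join an additional control plane node; with EtcdLearnerMode enabled, its etcd
# member starts as a learner and is promoted once it has caught up.
kubeadm join <control-plane-endpoint>:6443 \
  --token <token> \
  --discovery-token-ca-cert-hash sha256:<hash> \
  --control-plane \
  --certificate-key <certificate-key>

# Join a worker node:
kubeadm join <control-plane-endpoint>:6443 \
  --token <token> \
  --discovery-token-ca-cert-hash sha256:<hash>
```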
## What's next
- The feature gate `EtcdLearnerMode` is alpha in v1.27 and we expect it to graduate to beta in the next
minor release of Kubernetes (v1.29).
- etcd has an open issue that may make the process more automatic:
[Support auto-promoting a learner member to a voting member](https://github.com/etcd-io/etcd/issues/15107).
- Learn more about the kubeadm [configuration format](/docs/reference/config-api/kubeadm-config.v1beta3/).
## Feedback
Was this guide helpful? If you have any feedback or encounter any issues, please let us know.
Your feedback is always welcome! Join the bi-weekly [SIG Cluster Lifecycle meeting](https://docs.google.com/document/d/1Gmc7LyCIL_148a9Tft7pdhdee0NBHdOfHS1SAF0duI4/edit)
or weekly [kubeadm office hours](https://docs.google.com/document/d/130_kiXjG7graFNSnIAgtMS1G8zPDwpkshgfRYS0nggo/edit).
Or reach us via [Slack](https://slack.k8s.io/) (channel **#kubeadm**), or the
[SIG's mailing list](https://groups.google.com/g/kubernetes-sig-cluster-lifecycle).


@ -0,0 +1,73 @@
---
layout: blog
title: 'Happy 7th Birthday kubeadm!'
date: 2023-09-26
slug: happy-7th-birthday-kubeadm
---
**Author:** Fabrizio Pandini (VMware)
What a journey so far!
Starting from the initial blog post [“How we made Kubernetes insanely easy to install”](/blog/2016/09/how-we-made-kubernetes-easy-to-install/) in September 2016, followed by exciting growth that led to general availability / [“Production-Ready Kubernetes Cluster Creation with kubeadm”](/blog/2018/12/04/production-ready-kubernetes-cluster-creation-with-kubeadm/) two years later.
And later on a continuous, steady and reliable flow of small improvements that is still going on as of today.
## What is kubeadm? (quick refresher)
kubeadm is focused on bootstrapping Kubernetes clusters on existing infrastructure and performing an essential set of maintenance tasks. The core of the kubeadm interface is quite simple: new control plane nodes
are created by running [`kubeadm init`](/docs/reference/setup-tools/kubeadm/kubeadm-init/) and
worker nodes are joined to the control plane by running
[`kubeadm join`](/docs/reference/setup-tools/kubeadm/kubeadm-join/).
Also included are utilities for managing already bootstrapped clusters, such as control plane upgrades
and token and certificate renewal.
To keep kubeadm lean, focused, and vendor/infrastructure agnostic, the following tasks are out of its scope:
- Infrastructure provisioning
- Third-party networking
- Non-critical add-ons, e.g. for monitoring, logging, and visualization
- Specific cloud provider integrations
Infrastructure provisioning, for example, is left to other SIG Cluster Lifecycle projects, such as the
[Cluster API](https://cluster-api.sigs.k8s.io/). Instead, kubeadm covers only the common denominator
in every Kubernetes cluster: the
[control plane](/docs/concepts/overview/components/#control-plane-components).
The user may install their preferred networking solution and other add-ons on top of Kubernetes
*after* cluster creation.
Behind the scenes, kubeadm does a lot. The tool makes sure you have all the key components:
etcd, the API server, the scheduler, the controller manager. You can join more control plane nodes
for improving resiliency or join worker nodes for running your workloads. You get cluster DNS
and kube-proxy set up for you. TLS between components is enabled and used for encryption in transit.
## Let's celebrate! Past, present and future of kubeadm
All in all, kubeadm's story is tightly coupled with Kubernetes' story, and with this amazing community.
Therefore celebrating kubeadm is first of all celebrating this community, a set of people, who joined forces in finding a common ground, a minimum viable tool, for bootstrapping Kubernetes clusters.
This tool was instrumental to Kubernetes' success back then, as it is today, and the silver line of kubeadm's value proposition can be summarized in two points:
- An obsession with making things dead simple for the majority of users: kubeadm init & kubeadm join, that's all you need!
- A sharp focus on a well-defined problem scope: bootstrapping Kubernetes clusters on existing infrastructure. As our slogan says: *keep it simple, keep it extensible!*
This silver line, this clear contract, is the foundation the entire kubeadm user base relies on, and this post is a celebration for kubeadm's users as well.
We are deeply thankful for any feedback from our users, for the enthusiasm that they are continuously showing for this tool via Slack, GitHub, social media, blogs, in person at every KubeCon or at the various meet ups around the world. Keep going!
What continues to amaze me after all those years is the great things people are building on top of kubeadm, and as of today there is a strong and very active list of projects doing so:
- [minikube](https://minikube.sigs.k8s.io/)
- [kind](https://kind.sigs.k8s.io/)
- [Cluster API](https://cluster-api.sigs.k8s.io/)
- [Kubespray](https://kubespray.io/)
- and many more; if you are using Kubernetes today, there is a good chance that you are using kubeadm even without knowing it 😜
This community, kubeadm's users, and the projects building on top of kubeadm are the highlights of kubeadm's 7th birthday celebration and the foundation for what will come next!
Stay tuned, and feel free to reach out to us!
- Try [kubeadm](/docs/setup/) to install Kubernetes today
- Get involved with the Kubernetes project on [GitHub](https://github.com/kubernetes/kubernetes)
- Connect with the community on [Slack](http://slack.k8s.io/)
- Follow us on Twitter [@Kubernetesio](https://twitter.com/kubernetesio) for the latest updates


@ -0,0 +1,60 @@
---
layout: blog
title: "Announcing the 2023 Steering Committee Election Results"
date: 2023-10-02
slug: steering-committee-results-2023
canonicalUrl: https://www.kubernetes.dev/blog/2023/10/02/steering-committee-results-2023/
---
**Author**: Kaslin Fields
The [2023 Steering Committee Election](https://github.com/kubernetes/community/tree/master/events/elections/2023) is now complete. The Kubernetes Steering Committee consists of 7 seats, 4 of which were up for election in 2023. Incoming committee members serve a term of 2 years, and all members are elected by the Kubernetes Community.
This community body is significant since it oversees the governance of the entire Kubernetes project. With that great power comes great responsibility. You can learn more about the steering committee's role in their [charter](https://github.com/kubernetes/steering/blob/master/charter.md).
Thank you to everyone who voted in the election; your participation helps support the community's continued health and success.
## Results
Congratulations to the elected committee members whose two year terms begin immediately (listed in alphabetical order by GitHub handle):
* **Stephen Augustus ([@justaugustus](https://github.com/justaugustus)), Cisco**
* **Paco Xu 徐俊杰 ([@pacoxu](https://github.com/pacoxu)), DaoCloud**
* **Patrick Ohly ([@pohly](https://github.com/pohly)), Intel**
* **Maciej Szulik ([@soltysh](https://github.com/soltysh)), Red Hat**
They join continuing members:
* **Benjamin Elder ([@bentheelder](https://github.com/bentheelder)), Google**
* **Bob Killen ([@mrbobbytables](https://github.com/mrbobbytables)), Google**
* **Nabarun Pal ([@palnabarun](https://github.com/palnabarun)), VMware**
Stephen Augustus is a returning Steering Committee Member.
## Big Thanks!
Thank you and congratulations on a successful election to this round's election officers:
* Bridget Kromhout ([@bridgetkromhout](https://github.com/bridgetkromhout))
* Davanum Srinivas ([@dims](https://github.com/dims))
* Kaslin Fields ([@kaslin](https://github.com/kaslin))
Thanks to the Emeritus Steering Committee Members. Your service is appreciated by the community:
* Christoph Blecker ([@cblecker](https://github.com/cblecker))
* Carlos Tadeu Panato Jr. ([@cpanato](https://github.com/cpanato))
* Tim Pepper ([@tpepper](https://github.com/tpepper))
And thank you to all the candidates who came forward to run for election.
## Get Involved with the Steering Committee
This governing body, like all of Kubernetes, is open to all. You can follow along with Steering Committee [backlog items](https://github.com/orgs/kubernetes/projects/40) and weigh in by filing an issue or creating a PR against their [repo](https://github.com/kubernetes/steering). They have an open meeting on [the first Monday at 9:30am PT of every month](https://github.com/kubernetes/steering). They can also be contacted at their public mailing list steering@kubernetes.io.
You can see what the Steering Committee meetings are all about by watching past meetings on the [YouTube Playlist](https://www.youtube.com/playlist?list=PL69nYSiGNLP1yP1B_nd9-drjoxp0Q14qM).
If you want to meet some of the newly elected Steering Committee members, join us for the Steering AMA at the [Kubernetes Contributor Summit in Chicago](https://k8s.dev/summit).
---
_This post was written by the [Contributor Comms Subproject](https://github.com/kubernetes/community/tree/master/communication/contributor-comms). If you want to write stories about the Kubernetes community, learn more about us._


@ -0,0 +1,197 @@
---
layout: blog
title: "Spotlight on SIG Architecture: Conformance"
slug: sig-architecture-conformance-spotlight-2023
date: 2023-10-05
canonicalUrl: https://www.k8s.dev/blog/2023/10/05/sig-architecture-conformance-spotlight-2023/
---
**Author**: Frederico Muñoz (SAS Institute)
_This is the first interview of a SIG Architecture Spotlight series
that will cover the different subprojects. We start with the SIG
Architecture: Conformance subproject._
In this [SIG
Architecture](https://github.com/kubernetes/community/blob/master/sig-architecture/README.md)
spotlight, we talked with [Riaan
Kleinhans](https://github.com/Riaankl) (ii-Team), Lead for the
[Conformance
sub-project](https://github.com/kubernetes/community/blob/master/sig-architecture/README.md#conformance-definition-1).
## About SIG Architecture and the Conformance subproject
**Frederico (FSM)**: Hello Riaan, and welcome! For starters, tell us a
bit about yourself, your role and how you got involved in Kubernetes.
**Riaan Kleinhans (RK)**: Hi! My name is Riaan Kleinhans and I live in
South Africa. I am the Project manager for the [ii-Team](https://ii.nz) in New
Zealand. When I joined ii the plan was to move to New Zealand in April
2020 and then Covid happened. Fortunately, being a flexible and
dynamic team we were able to make it work remotely and in very
different time zones.
The ii team have been tasked with managing the Kubernetes Conformance
testing technical debt and writing tests to clear the technical
debt. I stepped into the role of project manager to be the link
between monitoring, test writing and the community. Through that work
I had the privilege of meeting [Dan Kohn](https://github.com/dankohn)
in those first months, his enthusiasm about the work we were doing was
a great inspiration.
**FSM**: Thank you - so, your involvement in SIG Architecture started
because of the conformance work?
**RK**: SIG Architecture is the home for the Kubernetes Conformance
subproject. Initially, most of my interactions were directly with SIG
Architecture through the Conformance sub-project. However, as we
began organizing the work by SIG, we started engaging directly with
each individual SIG. These engagements with the SIGs that own the
untested APIs have helped us accelerate our work.
**FSM**: How would you describe the main goals and
areas of intervention of the Conformance sub-project?
**RK**: The Kubernetes Conformance sub-project focuses on guaranteeing
compatibility and adherence to the Kubernetes specification by
developing and maintaining a comprehensive conformance test suite. Its
main goals include assuring compatibility across different Kubernetes
implementations, verifying adherence to the API specification,
supporting the ecosystem by encouraging conformance certification, and
fostering collaboration within the Kubernetes community. By providing
standardised tests and promoting consistent behaviour and
functionality, the Conformance subproject ensures a reliable and
compatible Kubernetes ecosystem for developers and users alike.
## More on the Conformance Test Suite
**FSM**: A part of providing those standardised tests is, I believe,
the [Conformance Test
Suite](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md). Could
you explain what it is and its importance?
**RK**: The Kubernetes Conformance Test Suite checks if Kubernetes
distributions meet the project's specifications, ensuring
compatibility across different implementations. It covers various
features like APIs, networking, storage, scheduling, and
security. Passing the tests confirms proper implementation and
promotes a consistent and portable container orchestration platform.
**FSM**: Right, the tests are important in the way they define the
minimum features that any Kubernetes cluster must support. Could you
describe the process around determining which features are considered
for inclusion? Is there any tension between a more minimal approach,
and proposals from the other SIGs?
**RK**: The requirements for each endpoint that undergoes conformance
testing are clearly defined by SIG Architecture. Only API endpoints
that are generally available and non-optional features are eligible
for conformance. Over the years, there have been several discussions
regarding conformance profiles, exploring the possibility of including
optional endpoints like RBAC, which are widely used by most end users,
in specific profiles. However, this aspect is still a work in
progress.
Endpoints that do not meet the conformance criteria are listed in
[ineligible_endpoints.yaml](https://github.com/kubernetes/kubernetes/blob/master/test/conformance/testdata/ineligible_endpoints.yaml),
which is publicly accessible in the Kubernetes repo. This file can be
updated to add or remove endpoints as their status or requirements
change. These ineligible endpoints are also visible on
[APISnoop](https://apisnoop.cncf.io/).
Ensuring transparency and incorporating community input regarding the
eligibility or ineligibility of endpoints is of utmost importance to
SIG Architecture.
**FSM**: Writing tests for new features is something that generally
requires some kind of enforcement. How do you see the evolution of
this in Kubernetes? Was there a specific effort to improve the process
in a way that required tests would be a first-class citizen, or was
that never an issue?
**RK**: When discussions surrounding the Kubernetes conformance
programme began in 2018, only approximately 11% of endpoints were
covered by tests. At that time, the CNCF's governing board requested
that if funding were to be provided for the work to cover missing
conformance tests, the Kubernetes Community should adopt a policy of
not allowing new features to be added unless they include conformance
tests for their stable APIs.
SIG Architecture is responsible for stewarding this requirement, and
[APISnoop](https://apisnoop.cncf.io/) has proven to be an invaluable
tool in this regard. Through automation, APISnoop generates a pull
request every weekend to highlight any discrepancies in Conformance
coverage. If any endpoints are promoted to General Availability
without a conformance test, it will be promptly identified. This
approach helps prevent the accumulation of new technical debt.
Additionally, there are plans in the near future to create a release
informing job, which will add an additional layer to prevent any new
technical debt.
**FSM**: I see, tooling and automation play an important role
there. What are, in your opinion, the areas that, conformance-wise,
still require some work to be done? In other words, what are the
current priority areas marked for improvement?
**RK**: We have reached the “100% Conformance Tested” milestone in
release 1.27!
At that point, the community took another look at all the endpoints
that were listed as ineligible for conformance. The list was populated
through community input over several years. Several endpoints
that were previously deemed ineligible for conformance have been
identified and relocated to a new dedicated list, which is currently
receiving focused attention for conformance test development. Again,
that list can also be checked on apisnoop.cncf.io.
To ensure the avoidance of new technical debt in the conformance
project, there are upcoming plans to establish a release informing job
as an additional preventive measure.
While APISnoop is currently hosted on CNCF infrastructure, the project
has been generously donated to the Kubernetes community. Consequently,
it will be transferred to community-owned infrastructure before the
end of 2023.
**FSM**: That's great news! For anyone wanting to help, what are the
venues for collaboration that you would highlight? Do all of them
require solid knowledge of Kubernetes as a whole, or are there ways
someone newer to the project can contribute?
**RK**: Contributing to conformance testing is akin to the task of
"washing the dishes" it may not be highly visible, but it remains
incredibly important. It necessitates a strong understanding of
Kubernetes, particularly in the areas where the endpoints need to be
tested. This is why working with each SIG that owns the API endpoint
being tested is so important.
As part of our commitment to making test writing accessible to
everyone, the ii team is currently engaged in the development of a
"click and deploy" solution. This solution aims to enable anyone to
swiftly create a working environment on real hardware within
minutes. We will share updates regarding this development as soon as
we are ready.
**FSM**: That's very helpful, thank you. Any final comments you would
like to share with our readers?
**RK**: Conformance testing is a collaborative community endeavour that
involves extensive cooperation among SIGs. SIG Architecture has
spearheaded the initiative and provided guidance. However, the
progress of the work relies heavily on the support of all SIGs in
reviewing, enhancing, and endorsing the tests.
I would like to extend my sincere appreciation to the ii team for
their unwavering commitment to resolving technical debt over the
years. In particular, [Hippie Hacker](https://github.com/hh)'s
guidance and stewardship of the vision has been
invaluable. Additionally, I want to give special recognition to
Stephen Heywood for shouldering the majority of the test writing
workload in recent releases, as well as to Zach Mandeville for his
contributions to APISnoop.
**FSM**: Many thanks for your availability and insightful comments,
I've personally learned quite a bit with it and I'm sure our readers
will as well.


@ -0,0 +1,189 @@
---
layout: blog
title: "CRI-O is moving towards pkgs.k8s.io"
date: 2023-10-10
slug: cri-o-community-package-infrastructure
---
**Author:** Sascha Grunert
The Kubernetes community [recently announced](/blog/2023/08/31/legacy-package-repository-deprecation/)
that their legacy package repositories are frozen, and that they have moved to the newly
[introduced community-owned package repositories](/blog/2023/08/15/pkgs-k8s-io-introduction) powered by the
[OpenBuildService (OBS)](https://build.opensuse.org/project/subprojects/isv:kubernetes).
CRI-O has a long history of utilizing
[OBS for their package builds](https://github.com/cri-o/cri-o/blob/e292f17/install.md#install-packaged-versions-of-cri-o),
but all of the packaging efforts have been done manually so far.
The CRI-O community absolutely loves Kubernetes, which means that they're
delighted to announce that:
**All future CRI-O packages will be shipped as part of the officially supported
Kubernetes infrastructure hosted on pkgs.k8s.io!**
There will be a deprecation phase for the existing packages, which is currently
being [discussed in the CRI-O community](https://github.com/cri-o/cri-o/discussions/7315).
The new infrastructure will only support releases of CRI-O `>= v1.28.2` as well as
release branches newer than `release-1.28`.
## How to use the new packages
In the same way as the Kubernetes community, CRI-O provides `deb` and `rpm`
packages as part of a dedicated subproject in OBS, called
[`isv:kubernetes:addons:cri-o`](https://build.opensuse.org/project/show/isv:kubernetes:addons:cri-o).
This project acts as an umbrella and provides `stable` (for CRI-O tags) as well as
`prerelease` (for CRI-O `release-1.y` and `main` branches) package builds.
**Stable Releases:**
- [`isv:kubernetes:addons:cri-o:stable`](https://build.opensuse.org/project/show/isv:kubernetes:addons:cri-o:stable): Stable Packages
- [`isv:kubernetes:addons:cri-o:stable:v1.29`](https://build.opensuse.org/project/show/isv:kubernetes:addons:cri-o:stable:v1.29): `v1.29.z` tags
- [`isv:kubernetes:addons:cri-o:stable:v1.28`](https://build.opensuse.org/project/show/isv:kubernetes:addons:cri-o:stable:v1.28): `v1.28.z` tags
**Prereleases:**
- [`isv:kubernetes:addons:cri-o:prerelease`](https://build.opensuse.org/project/show/isv:kubernetes:addons:cri-o:prerelease): Prerelease Packages
- [`isv:kubernetes:addons:cri-o:prerelease:main`](https://build.opensuse.org/project/show/isv:kubernetes:addons:cri-o:prerelease:main): [`main`](https://github.com/cri-o/cri-o/commits/main) branch
- [`isv:kubernetes:addons:cri-o:prerelease:v1.29`](https://build.opensuse.org/project/show/isv:kubernetes:addons:cri-o:prerelease:v1.29): [`release-1.29`](https://github.com/cri-o/cri-o/commits/release-1.29) branch
- [`isv:kubernetes:addons:cri-o:prerelease:v1.28`](https://build.opensuse.org/project/show/isv:kubernetes:addons:cri-o:prerelease:v1.28): [`release-1.28`](https://github.com/cri-o/cri-o/commits/release-1.28) branch
There are no stable releases available in the v1.29 repository yet, because
v1.29.0 will be released in December. The CRI-O community will also **not**
support release branches older than `release-1.28`, because there have been CI
requirements merged into `main` which could only be backported to `release-1.28`
with appropriate effort.
For example, if an end-user would like to install the latest available version
of the CRI-O `main` branch, then they can add the repository in the same way as
they do for Kubernetes.
### `rpm` Based Distributions
For `rpm` based distributions, you can run the following commands as a `root` user
to install CRI-O together with Kubernetes:
#### Add the Kubernetes repo
```bash
cat <<EOF | tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://pkgs.k8s.io/core:/stable:/v1.28/rpm/
enabled=1
gpgcheck=1
gpgkey=https://pkgs.k8s.io/core:/stable:/v1.28/rpm/repodata/repomd.xml.key
EOF
```
#### Add the CRI-O repo
```bash
cat <<EOF | tee /etc/yum.repos.d/cri-o.repo
[cri-o]
name=CRI-O
baseurl=https://pkgs.k8s.io/addons:/cri-o:/prerelease:/main/rpm/
enabled=1
gpgcheck=1
gpgkey=https://pkgs.k8s.io/addons:/cri-o:/prerelease:/main/rpm/repodata/repomd.xml.key
EOF
```
#### Install official package dependencies
```bash
dnf install -y \
conntrack \
container-selinux \
ebtables \
ethtool \
iptables \
socat
```
#### Install the packages from the added repos
```bash
dnf install -y --repo cri-o --repo kubernetes \
cri-o \
kubeadm \
kubectl \
kubelet
```
### `deb` Based Distributions
For `deb` based distributions, you can run the following commands as a `root`
user:
#### Install dependencies for adding the repositories
```bash
apt-get update
apt-get install -y software-properties-common curl
```
#### Add the Kubernetes repository
```bash
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.28/deb/Release.key |
gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo "deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.28/deb/ /" |
tee /etc/apt/sources.list.d/kubernetes.list
```
#### Add the CRI-O repository
```bash
curl -fsSL https://pkgs.k8s.io/addons:/cri-o:/prerelease:/main/deb/Release.key |
gpg --dearmor -o /etc/apt/keyrings/cri-o-apt-keyring.gpg
echo "deb [signed-by=/etc/apt/keyrings/cri-o-apt-keyring.gpg] https://pkgs.k8s.io/addons:/cri-o:/prerelease:/main/deb/ /" |
tee /etc/apt/sources.list.d/cri-o.list
```
#### Install the packages
```bash
apt-get update
apt-get install -y cri-o kubelet kubeadm kubectl
```
#### Start CRI-O
```bash
systemctl start crio.service
```
The `prerelease:/main` prefix in the CRI-O package path can be replaced with
`stable:/v1.28`, `stable:/v1.29`, `prerelease:/v1.28` or `prerelease:/v1.29`
if another package stream is used.
Bootstrapping [a cluster using `kubeadm`](/docs/setup/production-environment/tools/kubeadm/install-kubeadm/)
can be done by running `kubeadm init` command, which automatically detects that
CRI-O is running in the background. There are also `Vagrantfile` examples
available for [Fedora 38](https://github.com/cri-o/packaging/blob/91df5f7/test/rpm/Vagrantfile)
as well as [Ubuntu 22.04](https://github.com/cri-o/packaging/blob/91df5f7/test/deb/Vagrantfile)
for testing the packages together with `kubeadm`.
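A quick way to sanity-check the result (a non-authoritative sketch; it only assumes the packages installed above are on the PATH and the service was started):
```bash
crio --version                     # CRI-O binary is installed
systemctl is-active crio.service   # service is running
kubeadm version -o short           # kubeadm from the Kubernetes repo is available
```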
## How it works under the hood
Everything related to these packages lives in the new
[CRI-O packaging repository](https://github.com/cri-o/packaging).
It contains a [daily reconciliation](https://github.com/cri-o/packaging/blob/91df5f7/.github/workflows/schedule.yml)
GitHub action workflow, for all supported release branches as well as tags of
CRI-O. A [test pipeline](https://github.com/cri-o/packaging/actions/workflows/obs.yml)
in the OBS workflow ensures that the packages can be correctly installed and
used before being published. All of the staging and publishing of the
packages is done with the help of the [Kubernetes Release Toolbox (krel)](https://github.com/kubernetes/release/blob/1f85912/docs/krel/README.md),
which is also used for the official Kubernetes `deb` and `rpm` packages.
The package build inputs will undergo daily reconciliation and will be supplied by
CRI-O's static binary bundles.
These bundles are built and signed for each commit in the CRI-O CI,
and contain everything CRI-O requires to run on a certain architecture.
The static builds are reproducible, powered by [nixpkgs](https://github.com/NixOS/nixpkgs)
and available only for `x86_64`, `aarch64` and `ppc64le` architecture.
The CRI-O maintainers will be happy to listen to any feedback or suggestions on the new
packaging efforts! Thank you for reading this blog post, feel free to reach out
to the maintainers via the Kubernetes [Slack channel #crio](https://kubernetes.slack.com/messages/CAZH62UR1)
or create an issue in the [packaging repository](https://github.com/cri-o/packaging/issues).


@ -0,0 +1,698 @@
---
layout: blog
title: "Bootstrap an Air Gapped Cluster With Kubeadm"
date: 2023-10-12
slug: bootstrap-an-air-gapped-cluster-with-kubeadm
---
**Author:** Rob Mengert (Defense Unicorns)
Ever wonder how software gets deployed onto a system that is deliberately disconnected from the Internet and other networks? These systems are typically disconnected due to their sensitive nature. Sensitive as in utilities (power/water), banking, healthcare, weapons systems, other government use cases, etc. Sometimes it's technically a water gap, if you're running Kubernetes on an underwater vessel. Still, these environments need software to operate. This concept of deployment in a disconnected state is what it means to deploy to the other side of an [air gap](https://en.wikipedia.org/wiki/Air_gap_(networking)).
Again, despite this posture, software still needs to run in these environments. Traditionally, software artifacts are physically carried across the air gap on hard drives, USB sticks, CDs, or floppy disks (for ancient systems, it still happens). Kubernetes lends itself particularly well to running software behind an air gap for several reasons, largely due to its declarative nature.
In this blog article, I will walk through the process of bootstrapping a Kubernetes
cluster in an air-gapped lab environment using Fedora Linux and kubeadm.
## The Air Gap VM Setup
A real air-gapped network can take some effort to set up, so for this post, I will use an example VM on a laptop and do some network modifications. Below is the topology:
{{< figure src="airgap-vm.svg" alt="Topology on the host/laptop which shows that connectivity to the internet from the air gap VM is not possible. However, connectivity between the host/laptop and the VM is possible" >}}
### Local topology
This VM will have its network connectivity disabled but in a way that doesn't shut down the VM's virtual NIC. Instead, its network will be downed by injecting a default route to a dummy interface, making anything internet-hosted unreachable. However, the VM still has a connected route to the bridge interface on the host, which means that network connectivity to the host is still working. This posture means that data can be transferred from the host/laptop to the VM via scp, even with the default route on the VM black-holing all traffic that isn't destined for the local bridge subnet. This type of transfer is analogous to carrying data across the air gap and will be used throughout this post.
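As a minimal sketch of that setup (assuming a dummy interface named `dummy0`; your interface and bridge names will differ), the default route can be pointed at a dummy device like this:

```bash
# Simulate the air gap: send the default route to a dummy interface so that
# only the local bridge subnet (host <-> VM) remains reachable.
sudo ip link add dummy0 type dummy
sudo ip link set dummy0 up
sudo ip route replace default dev dummy0
```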
Other details about the lab setup:
**VM OS:** Fedora 37
**Kubernetes Version:** v1.27.3
**CNI Plugins Version:** v1.3.0
**CNI Provider and Version:** Flannel v0.22.0
While this single VM lab is a simplified example, the below diagram more approximately shows what a real air-gapped environment could look like:
{{< figure src="example_production_topology.svg" alt="Example production topology which shows 3 control plane Kubernetes nodes and 'n' worker nodes along with a Docker registry in an air-gapped environment. Additionally shows two workstations, one on each side of the air gap and an IT admin which physically carries the artifacts across." >}}
Note, there is still intentional isolation between the environment and the internet. There are also some things that are not shown in order to keep the diagram simple, for example, malware scanning on the secure side of the air gap.
Back to the single VM lab environment.
## Identifying the required software artifacts
I have gone through the trouble of identifying all of the required software components that need to be carried across the air gap in order for this cluster to be stood up:
- Docker (to host an internal container image registry)
- Containerd
- libcgroup
- socat
- conntrack-tools
- CNI plugins
- crictl
- kubeadm
- kubelet
- kubectl and k9s (strictly speaking, these aren't required to bootstrap a cluster but they are handy to interact with one)
- kubelet.service systemd file
- kubeadm configuration file
- Docker registry container image
- Kubernetes component container images
- CNI network plugin container images ([Flannel](https://github.com/flannel-io/flannel) will be used for this lab)
- CNI network plugin manifests
- CNI tooling container images
The way I identified these was by trying to do the installation and working through all of the errors that are thrown around an additional dependency being required. In a real air-gapped scenario, each transport of artifacts across the air gap could represent anywhere from 20 minutes to several weeks of time spent by the installer. That is to say that the target system could be located in a data center on the same floor as your desk, at a satellite downlink facility in the middle of nowhere, or on a submarine that's out to sea. Knowing what is on that system at any given time is important so you know what you have to bring.
## Prepare the Node for K8s
Before downloading and moving the artifacts to the VM, let's first prep that VM to run Kubernetes.
### VM preparation
_Run these steps as a normal user_
**Make destination directory for software artifacts**
```bash
mkdir ~/tmp
```
_Run the following steps as the superuser_ (`root`)
Write to `/etc/sysctl.d/99-k8s-cri.conf`:
```bash
cat > /etc/sysctl.d/99-k8s-cri.conf << EOF
net.bridge.bridge-nf-call-iptables=1
net.ipv4.ip_forward=1
net.bridge.bridge-nf-call-ip6tables=1
EOF
```
Write to `/etc/modules-load.d/k8s.conf` (enable `overlay` and `br_netfilter`):
```bash
echo -e "overlay\nbr_netfilter" > /etc/modules-load.d/k8s.conf
```
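These settings are picked up automatically after the reboot at the end of this section; if you want them active immediately as well, something like the following should work:

```bash
# Load the kernel modules and apply the sysctl settings right away
modprobe overlay
modprobe br_netfilter
sysctl --system
```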
Install iptables:
```bash
dnf -y install iptables-legacy
```
Set iptables to use legacy mode (not `nft` emulating `iptables`):
```bash
update-alternatives --set iptables /usr/sbin/iptables-legacy
```
Turn off swap:
```bash
touch /etc/systemd/zram-generator.conf
systemctl mask systemd-zram-setup@.service
sed -i '/ swap / s/^\(.*\)$/#\1/g' /etc/fstab
```
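The commands above keep swap disabled across reboots; to also turn off any swap that is currently active, `swapoff` can be run (again, just a sketch):

```bash
# Disable any currently active swap devices immediately
swapoff -a
```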
Disable `firewalld` (this is OK in a demo context):
```bash
systemctl disable --now firewalld
```
Disable `systemd-resolved`:
```bash
systemctl disable --now systemd-resolved
```
Configure DNS defaults for NetworkManager:
```bash
sed -i '/\[main\]/a dns=default' /etc/NetworkManager/NetworkManager.conf
```
Blank the system-level DNS resolver configuration:
```bash
unlink /etc/resolv.conf || true
touch /etc/resolv.conf
```
Disable SELinux _(just for a demo - check before doing this in production!)_:
```bash
setenforce 0
```
**Make sure all changes survive a reboot**
```bash
reboot
```
## Download all the artifacts
On the laptop/host machine, download all of the artifacts enumerated in the previous section. Since the air gapped VM is running Fedora 37, all of the dependencies shown in this part are for Fedora 37. Note, this procedure will only work on AArch64 or AMD64 CPU architectures, as they are the most popular and widely available. You can execute this procedure anywhere you have write permissions; your home directory is a perfectly suitable choice.
Note, operating system packages for the Kubernetes artifacts that need to be carried across can now be found at [pkgs.k8s.io](https://kubernetes.io/blog/2023/08/15/pkgs-k8s-io-introduction/). This blog post will use a combination of Fedora repositories and GitHub in order to download all of the required artifacts. When you're doing this on your own cluster, you should decide whether to use the official Kubernetes packages, or the official packages from your operating system distribution - both are valid choices.
```bash
# Set architecture variables
UARCH=$(uname -m)
if [["$UARCH" == "arm64" || "$UARCH" == "aarch64"]]; then
ARCH="aarch64"
K8s_ARCH="arm64"
else
ARCH="x86_64"
K8s_ARCH="amd64"
fi
```
Set environment variables for software versions to use:
```bash
CNI_PLUGINS_VERSION="v1.3.0"
CRICTL_VERSION="v1.27.0"
KUBE_RELEASE="v1.27.3"
RELEASE_VERSION="v0.15.1"
K9S_VERSION="v0.27.4"
```
**Create a `download` directory, change into it, and download all of the RPMs and configuration files**
```bash
mkdir download && cd download
curl -O https://download.docker.com/linux/fedora/37/${ARCH}/stable/Packages/docker-ce-cli-23.0.2-1.fc37.${ARCH}.rpm
curl -O https://download.docker.com/linux/fedora/37/${ARCH}/stable/Packages/containerd.io-1.6.19-3.1.fc37.${ARCH}.rpm
curl -O https://download.docker.com/linux/fedora/37/${ARCH}/stable/Packages/docker-compose-plugin-2.17.2-1.fc37.${ARCH}.rpm
curl -O https://download.docker.com/linux/fedora/37/${ARCH}/stable/Packages/docker-ce-rootless-extras-23.0.2-1.fc37.${ARCH}.rpm
curl -O https://download.docker.com/linux/fedora/37/${ARCH}/stable/Packages/docker-ce-23.0.2-1.fc37.${ARCH}.rpm
curl -O https://download-ib01.fedoraproject.org/pub/fedora/linux/releases/37/Everything/${ARCH}/os/Packages/l/libcgroup-3.0-1.fc37.${ARCH}.rpm
echo -e "\nDownload Kubernetes Binaries"
curl -L -O "https://github.com/containernetworking/plugins/releases/download/${CNI_PLUGINS_VERSION}/cni-plugins-linux-${K8s_ARCH}-${CNI_PLUGINS_VERSION}.tgz"
curl -L -O "https://github.com/kubernetes-sigs/cri-tools/releases/download/${CRICTL_VERSION}/crictl-${CRICTL_VERSION}-linux-${K8s_ARCH}.tar.gz"
curl -L --remote-name-all https://dl.k8s.io/release/${KUBE_RELEASE}/bin/linux/${K8s_ARCH}/{kubeadm,kubelet}
curl -L -O "https://raw.githubusercontent.com/kubernetes/release/${RELEASE_VERSION}/cmd/kubepkg/templates/latest/deb/kubelet/lib/systemd/system/kubelet.service"
curl -L -O "https://raw.githubusercontent.com/kubernetes/release/${RELEASE_VERSION}/cmd/kubepkg/templates/latest/deb/kubeadm/10-kubeadm.conf"
curl -L -O "https://dl.k8s.io/release/${KUBE_RELEASE}/bin/linux/${K8s_ARCH}/kubectl"
echo -e "\nDownload dependencies"
curl -O "https://dl.fedoraproject.org/pub/fedora/linux/releases/37/Everything/${ARCH}/os/Packages/s/socat-1.7.4.2-3.fc37.${ARCH}.rpm"
curl -O "https://dl.fedoraproject.org/pub/fedora/linux/releases/37/Everything/${ARCH}/os/Packages/l/libcgroup-3.0-1.fc37.${ARCH}.rpm"
curl -O "https://dl.fedoraproject.org/pub/fedora/linux/releases/37/Everything/${ARCH}/os/Packages/c/conntrack-tools-1.4.6-4.fc37.${ARCH}.rpm"
curl -LO "https://github.com/derailed/k9s/releases/download/${K9S_VERSION}/k9s_Linux_${K8s_ARCH}.tar.gz"
curl -LO "https://raw.githubusercontent.com/flannel-io/flannel/master/Documentation/kube-flannel.yml"
```
**Download all of the necessary container images:**
```bash
images=(
"registry.k8s.io/kube-apiserver:${KUBE_RELEASE}"
"registry.k8s.io/kube-controller-manager:${KUBE_RELEASE}"
"registry.k8s.io/kube-scheduler:${KUBE_RELEASE}"
"registry.k8s.io/kube-proxy:${KUBE_RELEASE}"
"registry.k8s.io/pause:3.9"
"registry.k8s.io/etcd:3.5.7-0"
"registry.k8s.io/coredns/coredns:v1.10.1"
"registry:2.8.2"
"flannel/flannel:v0.22.0"
"flannel/flannel-cni-plugin:v1.1.2"
)
for image in "${images[@]}"; do
# Pull the image from the registry
docker pull "$image"
# Save the image to a tar file on the local disk
image_name=$(echo "$image" | sed 's|/|_|g' | sed 's/:/_/g')
docker save -o "${image_name}.tar" "$image"
done
```
The above commands will take a look at the CPU architecture for the current host/laptop, create and change into a directory called `download`, and finally download all of the dependencies. Each of these files must then be transported over the air gap via scp. The exact syntax of the command will vary depending on the user on the VM, whether you created an SSH key, and the IP address of your air gap VM. The rough syntax is:
```bash
scp -i <<SSH_KEY>> <<FILE>> <<AIRGAP_VM_USER>>@<<AIRGAP_VM_IP>>:~/tmp/
```
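For example, with a hypothetical SSH key `~/.ssh/airgap_key`, user `fedora`, and VM address `192.168.122.50` (all placeholders, substitute your own values), transferring everything in the `download` directory might look like:

```bash
# Copy every downloaded artifact into ~/tmp on the air gap VM
for f in ./*; do
  scp -i ~/.ssh/airgap_key "$f" fedora@192.168.122.50:~/tmp/
done
```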
Once all of the files have been transported to the air gapped VM, the rest of the blog post will take place from the VM. Open a terminal session to that system.
### Put the artifacts in place
Everything that is needed in order to bootstrap a Kubernetes cluster now exists on the air-gapped VM. This section is a lot more complicated since various types of artifacts are now on disk on the air-gapped VM. Get a root shell on the air gap VM as the rest of this section will be executed from there. Let's start by setting the same architecture and environment variables as were set on the host/laptop, and then install all of the RPM packages:
```bash
UARCH=$(uname -m)
# Set architecture variables
if [["$UARCH" == "arm64" || "$UARCH" == "aarch64"]]; then
ARCH="aarch64"
K8s_ARCH="arm64"
else
ARCH="x86_64"
K8s_ARCH="amd64"
fi
# Set environment variables
CNI_PLUGINS_VERSION="v1.3.0"
CRICTL_VERSION="v1.27.0"
KUBE_RELEASE="v1.27.3"
RELEASE_VERSION="v0.15.1"
K9S_VERSION="v0.27.4"
cd ~/tmp/
dnf -y install ./*.rpm
```
Next, install the CNI plugins and `crictl`:
```bash
mkdir -p /opt/cni/bin
tar -C /opt/cni/bin -xz -f "cni-plugins-linux-${K8s_ARCH}-v1.3.0.tgz"
tar -C /usr/local/bin -xz -f "crictl-v1.27.0-linux-${K8s_ARCH}.tar.gz"
```
Make kubeadm, kubelet and kubectl executable and move them from the `~/tmp`
directory to `/usr/local/bin`:
```bash
chmod +x kubeadm kubelet kubectl
mv kubeadm kubelet kubectl /usr/local/bin
```
Define an override for the systemd kubelet service file, and move it to the proper location:
```bash
mkdir -p /etc/systemd/system/kubelet.service.d
sed "s:/usr/bin:/usr/local/bin:g" 10-kubeadm.conf > /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
```
The CRI plugin for containerd is disabled by default; enable it:
```bash
sed -i 's/^disabled_plugins = \["cri"\]/#&/' /etc/containerd/config.toml
```
Put a custom `/etc/docker/daemon.json` file in place:
```bash
echo '{
"exec-opts": ["native.cgroupdriver=systemd"],
"insecure-registries" : ["localhost:5000"],
"allow-nondistributable-artifacts": ["localhost:5000"],
"log-driver": "json-file",
"log-opts": {
"max-size": "100m"
},
"group": "rnd",
"storage-driver": "overlay2",
"storage-opts": [
"overlay2.override_kernel_check=true"
]
}' > /etc/docker/daemon.json
```
There are two important items to highlight in the Docker `daemon.json` configuration file. The `insecure-registries` line means that the registry in brackets does not support TLS. Even inside an air gapped environment, this isn't a good practice but is fine for the purposes of this lab. The `allow-nondistributable-artifacts` line tells Docker to permit pushing nondistributable artifacts to this registry. Docker by default does not push these layers to avoid potential issues around licensing or distribution rights. A good example of this is the Windows base container image. This line will allow layers that Docker marks as "foreign" to be pushed to the registry. While not a big deal for this article, that line could be required for some air gapped environments. All layers have to exist locally since nothing inside the air gapped environment can reach out to a public container image registry to get what it needs.
(Re)start Docker and enable it so it starts at system boot:
```bash
systemctl restart docker
systemctl enable docker
```
Start, and enable, containerd and the kubelet:
```bash
systemctl enable --now containerd
systemctl enable --now kubelet
```
The container image registry that runs in Docker is only required for any CNI related containers and subsequent workload containers. This registry is **not** used to house the Kubernetes component containers. Note, nerdctl would have also worked here as an alternative to Docker and would have allowed for direct interaction with containerd. Docker was chosen for its familiarity.
Start a container image registry inside Docker:
```bash
docker load -i registry_2.8.2.tar
docker run -d -p 5000:5000 --restart=always --name registry registry:2.8.2
```
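As a quick sanity check (using the standard registry v2 API), a freshly started registry should respond with an empty repository list:

```bash
# Query the registry catalog; expect {"repositories":[]} for a new registry
curl http://localhost:5000/v2/_catalog
```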
### Load Flannel containers into the Docker registry
**Note**: _Flannel was chosen for this lab due to familiarity. Choose whatever CNI works best in your environment._
```bash
docker load -i flannel_flannel_v0.22.0.tar
docker load -i flannel_flannel-cni-plugin_v1.1.2.tar
docker tag flannel/flannel:v0.22.0 localhost:5000/flannel/flannel:v0.22.0
docker tag flannel/flannel-cni-plugin:v1.1.2 localhost:5000/flannel/flannel-cni-plugin:v1.1.2
docker push localhost:5000/flannel/flannel:v0.22.0
docker push localhost:5000/flannel/flannel-cni-plugin:v1.1.2
```
Load container images for Kubernetes components, via `ctr`:
```bash
images=(
  "registry.k8s.io/kube-apiserver:${KUBE_RELEASE}"
  "registry.k8s.io/kube-controller-manager:${KUBE_RELEASE}"
  "registry.k8s.io/kube-scheduler:${KUBE_RELEASE}"
  "registry.k8s.io/kube-proxy:${KUBE_RELEASE}"
  "registry.k8s.io/pause:3.9"
  "registry.k8s.io/etcd:3.5.7-0"
  "registry.k8s.io/coredns/coredns:v1.10.1"
)
for image in "${images[@]}"; do
  # Reconstruct the tar file name that `docker save` produced on the host/laptop
  image_file="$(echo "$image" | sed 's|/|_|g' | sed 's/:/_/g').tar"
  if [[ -f "$image_file" ]]; then
    # Load the image into containerd's k8s.io namespace so kubeadm can use it
    ctr -n k8s.io images import "$image_file"
  else
    echo "File $image_file not found!" 1>&2
  fi
done
```
A totally reasonable question here could be "Why not use the Docker registry that was just stood up to house the K8s component images?" This simply didn't work even with the proper modification to the configuration file that gets passed to kubeadm.
### Spin up the Kubernetes cluster
Check if a cluster is already running and tear it down if it is:
```bash
if systemctl is-active --quiet kubelet; then
# Reset the Kubernetes cluster
echo "A Kubernetes cluster is already running. Resetting the cluster..."
kubeadm reset -f
fi
```
Log into the Docker registry from inside the air-gapped VM:
```bash
# OK for a demo; use secure credentials in production!
DOCKER_USER=user
DOCKER_PASS=pass
echo ${DOCKER_PASS} | docker login --username=${DOCKER_USER} --password-stdin localhost:5000
```
Create a cluster configuration file and initialize the cluster:
```bash
echo "---
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
clusterName: kubernetes
kubernetesVersion: v1.27.3
networking:
dnsDomain: cluster.local
podSubnet: 10.244.0.0/16 # --pod-network-cidr
serviceSubnet: 10.96.0.0/12
---
apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
localAPIEndpoint:
advertiseAddress: 10.10.10.10 # Update to the IP address of the air gap VM
bindPort: 6443
nodeRegistration:
criSocket: unix:///run/containerd/containerd.sock # or rely on autodetection
name: airgap # this must match the hostname of the air gap VM
# Since this is a single node cluster, this taint has to be commented out,
# otherwise the coredns pods will not come up.
# taints:
# - effect: NoSchedule
# key: node-role.kubernetes.io/master" > kubeadm_cluster.yaml
kubeadm init --config kubeadm_config.yaml
```
Set `$KUBECONFIG` and use `kubectl` to wait until the API server is healthy:
```bash
export KUBECONFIG=/etc/kubernetes/admin.conf
until kubectl get nodes; do
echo -e "\nWaiting for API server to respond..." 1>&2
sleep 5
done
```
### Set up networking
Update Flannel image locations in the Flannel manifest, and apply it:
```bash
sed -i 's/image: docker\.io/image: localhost:5000/g' kube-flannel.yml
kubectl apply -f kube-flannel.yml
```
Run `kubectl get pods -A --watch` until all pods are up and running.
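If you prefer a non-interactive check, something along these lines should also work (assuming the upstream manifest's default namespace and DaemonSet names):

```bash
# Wait for the Flannel DaemonSet to roll out, then for the kube-system pods to be Ready
kubectl -n kube-flannel rollout status daemonset/kube-flannel-ds --timeout=120s
kubectl -n kube-system wait --for=condition=Ready pods --all --timeout=120s
```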
## Run an example Pod
With a cluster operational, the next step is a workload. For this simple demonstration, the [Podinfo](https://github.com/stefanprodan/podinfo) application will be deployed.
### Install Helm
This first part of the procedure must be executed from the host/laptop. If not already present, install Helm following [Installing Helm](https://helm.sh/docs/intro/install/).
Next, download the helm binary for Linux:
```bash
UARCH=$(uname -m)
# Reset the architecture variables if needed
if [["$UARCH" == "arm64" || "$UARCH" == "aarch64"]]; then
ARCH="aarch64"
K8s_ARCH="arm64"
else
ARCH="x86_64"
K8s_ARCH="amd64"
fi
curl -LO https://get.helm.sh/helm-v3.12.2-linux-${K8s_ARCH}.tar.gz
```
Add the Podinfo helm repository, download the Podinfo helm chart, download the Podinfo container image, and then finally save it to the local disk:
```bash
helm repo add podinfo https://stefanprodan.github.io/podinfo
helm fetch podinfo/podinfo --version 6.4.0
docker pull ghcr.io/stefanprodan/podinfo:6.4.0
```
### Save the podinfo image to a tar file on the local disk
```bash
docker save -o podinfo_podinfo-6.4.0.tar ghcr.io/stefanprodan/podinfo
```
### Transfer the image across the air gap
Reuse the `~/tmp` directory created on the air gapped VM to transport these artifacts across the air gap:
```bash
scp -i <<SSH_KEY>> <<FILE>> <<AIRGAP_VM_USER>>@<<AIRGAP_VM_IP>>:~/tmp/
```
### Continue on the isolated side
_Now pivot over to the air gap VM for the rest of the installation procedure._
Switch into `~/tmp`:
```bash
cd ~/tmp
```
Extract and move the `helm` binary:
```bash
tar -zxvf helm-v3.12.2-linux-${K8s_ARCH}.tar.gz
mv linux-${K8s_ARCH}/helm /usr/local/bin/helm
```
Load the Podinfo container image into the local Docker registry:
```bash
docker load -i podinfo_podinfo-6.4.0.tar
docker tag ghcr.io/stefanprodan/podinfo:6.4.0 localhost:5000/podinfo/podinfo:6.4.0
docker push localhost:5000/podinfo/podinfo:6.4.0
```
Ensure "$KUBECONFIG` is set correctly, then install the Podinfo Helm chart:
```
# Outside of a demo or lab environment, use lower (or even least) privilege
# credentials to manage your workloads.
export KUBECONFIG=/etc/kubernetes/admin.conf
helm install podinfo ./podinfo-6.4.0.tgz --set image.repository=localhost:5000/podinfo/podinfo
```
Verify that the Podinfo application comes up:
```bash
kubectl get pods -n default
```
Or run k9s (a terminal user interface for Kubernetes):
```bash
k9s
```
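As an optional check (assuming the chart's defaults: a Deployment named `podinfo` serving on port 9898), you could port-forward to the application and query it from the VM:

```bash
# Forward the podinfo HTTP port to localhost and request its health endpoint
kubectl port-forward deploy/podinfo 9898:9898 &
sleep 2
curl http://localhost:9898/healthz
```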
## Zarf
Zarf is an open-source tool that takes a declarative approach to software packaging and delivery, including air-gapped delivery. This same podinfo application will be installed onto the air gap VM using Zarf in this section. The first step is to install [Zarf](https://zarf.dev/install/) on the host/laptop.
Alternatively, a prebuilt binary can be downloaded onto the host/laptop from [GitHub](https://github.com/defenseunicorns/zarf/releases/) for various OS/CPU architectures.
A Linux binary also needs to be carried across the air gap to the VM, so download it now on the host/laptop:
```bash
UARCH=$(uname -m)
# Set the architecture variables if needed
if [["$UARCH" == "arm64" || "$UARCH" == "aarch64"]]; then
ARCH="aarch64"
K8s_ARCH="arm64"
else
ARCH="x86_64"
K8s_ARCH="amd64"
fi
export ZARF_VERSION=v0.28.3
curl -LO "https://github.com/defenseunicorns/zarf/releases/download/${ZARF_VERSION}/zarf_${ZARF_VERSION}_Linux_${K8s_ARCH}"
```
Zarf needs to bootstrap itself into a Kubernetes cluster through the use of an init package. That also needs to be transported across the air gap so let's download it onto the host/laptop:
```bash
curl -LO "https://github.com/defenseunicorns/zarf/releases/download/${ZARF_VERSION}/zarf-init-${K8s_ARCH}-${ZARF_VERSION}.tar.zst"
```
The way that Zarf is declarative is through the use of a `zarf.yaml` file. Here is the `zarf.yaml` file that will be used for this Podinfo installation. Write it to whatever directory you have write access to on your host/laptop; your home directory is fine:
```bash
echo 'kind: ZarfPackageConfig
metadata:
  name: podinfo
  description: "Deploy helm chart for the podinfo application in K8s via zarf"
components:
  - name: podinfo
    required: true
    charts:
      - name: podinfo
        version: 6.4.0
        namespace: podinfo-helm-namespace
        releaseName: podinfo
        url: https://stefanprodan.github.io/podinfo
    images:
      - ghcr.io/stefanprodan/podinfo:6.4.0' > zarf.yaml
```
The next step is to build the Podinfo package. This must be done from the same directory where the `zarf.yaml` file is located.
```bash
zarf package create --confirm
```
This command will download the defined helm chart and image and put them into a single file written to disk. This single file is all that needs to be carried across the air gap:
```bash
ls zarf-package-*
```
Sample output:
```bash
zarf-package-podinfo-arm64.tar.zst
```
Transport the Linux zarf binary, the Zarf init package, and the Podinfo package over to the air gapped VM:
```bash
scp -i <<SSH_KEY>> <<FILE>> <<AIRGAP_VM_USER>>@<<AIRGAP_VM_IP>>:~/tmp/
```
From the air gapped VM, switch into the `~/tmp` directory where all of the artifacts were placed:
```bash
cd ~/tmp
```
Make the `zarf` binary executable and (as `root`) move it to `/usr/bin`:
```bash
chmod +x zarf && sudo mv zarf /usr/bin
```
Set `$KUBECONFIG` to a file with credentials for the local cluster; also set the Zarf version (the `zarf` binary must already be on the `PATH`, which is why it was moved first):
```bash
export KUBECONFIG=/etc/kubernetes/admin.conf
export ZARF_VERSION=$(zarf version)
```
Likewise, move the Zarf init package to `/usr/bin`:
```bash
mv zarf-init-${K8s_ARCH}-${ZARF_VERSION}.tar.zst /usr/bin
```
Initialize Zarf into the cluster:
```bash
zarf init --confirm --components=git-server
```
When this command is done, a Zarf package is ready to be deployed.
```bash
zarf package deploy
```
This command will search the current directory for a Zarf package. Select the podinfo package (`zarf-package-podinfo-${K8s_ARCH}.tar.zst`) and continue. Once the package deployment is complete, run `zarf tools monitor` in order to bring up k9s to view the cluster.
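Alternatively (assuming the package file name shown earlier), the package can be passed to `zarf package deploy` directly and confirmed non-interactively:

```bash
zarf package deploy zarf-package-podinfo-${K8s_ARCH}.tar.zst --confirm
```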
## Conclusion
This post covered one method that can be used to spin up an air-gapped cluster and two methods to deploy
a mission application. Your mileage may vary on different operating systems regarding the
exact software artifacts that need to be carried across the air gap, but conceptually this procedure is still valid.
This demo also created an artificial air-gapped environment. In the real world, every missed dependency
could represent hours, days, or even weeks of lost time getting software running in the air-gapped environment.
This artificial air gap also obscured some common methods of air gap software delivery, such as using a
_data diode_. Depending on the environment, the diode can be very expensive to use.
Also, none of the artifacts were scanned before being carried across the air gap.
The presence of the air gap in general means that the workload running there is more sensitive, and nothing should be carried across unless it's known to be safe.

View File

@ -20,7 +20,7 @@ case_study_details:
<h2>Solution</h2>
<p>Mountain has been overseeing the company's migration to <a href="http://kubernetes.io/">Kubernetes</a>, using <a href="https://www.openshift.org/">OpenShift</a> Container Platform, <a href="https://www.redhat.com/en">Red Hat</a>'s enterprise container platform.</p>
<p>Mountain has been overseeing the company's migration to <a href="https://kubernetes.io/">Kubernetes</a>, using <a href="https://www.openshift.org/">OpenShift</a> Container Platform, <a href="https://www.redhat.com/en">Red Hat</a>'s enterprise container platform.</p>
<h2>Impact</h2>
@ -48,7 +48,7 @@ In his two decades at Amadeus, Eric Mountain has been the migrations guy.
<p>While mainly a C++ and Java shop, Amadeus also wanted to be able to adopt new technologies more easily. Some of its developers had started using languages like <a href="https://www.python.org/">Python</a> and databases like <a href="https://www.couchbase.com/">Couchbase</a>, but Mountain wanted still more options, he says, "in order to better adapt our technical solutions to the products we offer, and open up entirely new possibilities to our developers." Working with recent technologies and cool new things would also make it easier to attract new talent.</p>
<p>All of those needs led Mountain and his team on a search for a new platform. "We did a set of studies and proofs of concept over a fairly short period, and we considered many technologies," he says. "In the end, we were left with three choices: build everything on premise, build on top of <a href="http://kubernetes.io/">Kubernetes</a> whatever happens to be missing from our point of view, or go with <a href="https://www.openshift.com/">OpenShift</a> and build whatever remains there."</p>
<p>All of those needs led Mountain and his team on a search for a new platform. "We did a set of studies and proofs of concept over a fairly short period, and we considered many technologies," he says. "In the end, we were left with three choices: build everything on premise, build on top of <a href="https://kubernetes.io/">Kubernetes</a> whatever happens to be missing from our point of view, or go with <a href="https://www.openshift.com/">OpenShift</a> and build whatever remains there."</p>
<p>The team decided against building everything themselves—though they'd done that sort of thing in the past—because "people were already inventing things that looked good," says Mountain.</p>

View File

@ -10,7 +10,7 @@ community_styles_migrated: true
Kubernetes follows the
<a href="https://github.com/cncf/foundation/blob/main/code-of-conduct.md">CNCF Code of Conduct</a>.
The text of the CNCF CoC is replicated below, as of
<a href="https://github.com/cncf/foundation/blob/fff715fb000ba4d7422684eca1d50d80676be254/code-of-conduct.md">commit fff715fb0</a>.
<a href="https://github.com/cncf/foundation/blob/c79711b5127e2d963107bc1be4a41975c8791acc/code-of-conduct.md">commit c79711b51</a>.
If you notice that this is out of date, please
<a href="https://github.com/kubernetes/website/issues/new">file an issue</a>.
</p>

View File

@ -1,29 +1,31 @@
<!-- Do not edit this file directly. Get the latest from
https://github.com/cncf/foundation/blob/main/code-of-conduct.md -->
## CNCF Community Code of Conduct v1.2
## CNCF Community Code of Conduct v1.3
### Contributor Code of Conduct
### Community Code of Conduct
As contributors and maintainers in the CNCF community, and in the interest of fostering
an open and welcoming community, we pledge to respect all people who contribute
As contributors, maintainers, and participants in the CNCF community, and in the interest of fostering
an open and welcoming community, we pledge to respect all people who participate or contribute
through reporting issues, posting feature requests, updating documentation,
submitting pull requests or patches, and other activities.
submitting pull requests or patches, attending conferences or events, or engaging in other community or project activities.
We are committed to making participation in the CNCF community a harassment-free experience for everyone, regardless of level of experience, gender, gender identity and expression,
sexual orientation, disability, personal appearance, body size, race, ethnicity, age,
religion, or nationality.
We are committed to making participation in the CNCF community a harassment-free experience for everyone, regardless of age, body size, caste, disability, ethnicity, level of experience, family status, gender, gender identity and expression, marital status, military or veteran status, nationality, personal appearance, race, religion, sexual orientation, socioeconomic status, tribe, or any other dimension of diversity.
## Scope
This code of conduct applies both within project spaces and in public spaces when an individual is representing the project or its community.
This code of conduct applies:
* within project and community spaces,
* in other spaces when an individual CNCF community participant's words or actions are directed at or are about a CNCF project, the CNCF community, or another CNCF community participant.
### CNCF Events
CNCF events, or events run by the Linux Foundation with professional events staff, are governed by the Linux Foundation [Events Code of Conduct](https://events.linuxfoundation.org/code-of-conduct/) available on the event page. This is designed to be used in conjunction with the CNCF Code of Conduct.
CNCF events that are produced by the Linux Foundation with professional events staff are governed by the Linux Foundation [Events Code of Conduct](https://events.linuxfoundation.org/code-of-conduct/) available on the event page. This is designed to be used in conjunction with the CNCF Code of Conduct.
## Our Standards
Examples of behavior that contributes to a positive environment include:
The CNCF Community is open, inclusive and respectful. Every member of our community has the right to have their identity respected.
Examples of behavior that contributes to a positive environment include but are not limited to:
* Demonstrating empathy and kindness toward other people
* Being respectful of differing opinions, viewpoints, and experiences
@ -32,23 +34,31 @@ Examples of behavior that contributes to a positive environment include:
and learning from the experience
* Focusing on what is best not just for us as individuals, but for the
overall community
* Using welcoming and inclusive language
Examples of unacceptable behavior include:
Examples of unacceptable behavior include but are not limited to:
* The use of sexualized language or imagery, and sexual attention or
advances of any kind
* The use of sexualized language or imagery
* Trolling, insulting or derogatory comments, and personal or political attacks
* Public or private harassment
* Public or private harassment in any form
* Publishing others' private information, such as a physical or email
address, without their explicit permission
* Violence, threatening violence, or encouraging others to engage in violent behavior
* Stalking or following someone without their consent
* Unwelcome physical contact
* Unwelcome sexual or romantic attention or advances
* Other conduct which could reasonably be considered inappropriate in a
professional setting
The following behaviors are also prohibited:
* Providing knowingly false or misleading information in connection with a Code of Conduct investigation or otherwise intentionally tampering with an investigation.
* Retaliating against a person because they reported an incident or provided information about an incident as a witness.
Project maintainers have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct.
By adopting this Code of Conduct, project maintainers commit themselves to fairly and consistently applying these principles to every aspect
of managing this project.
Project maintainers who do not follow or enforce the Code of
Conduct may be permanently removed from the project team.
of managing a CNCF project.
Project maintainers who do not follow or enforce the Code of Conduct may be temporarily or permanently removed from the project team.
## Reporting
@ -56,16 +66,20 @@ For incidents occurring in the Kubernetes community, contact the [Kubernetes Cod
For other projects, or for incidents that are project-agnostic or impact multiple CNCF projects, please contact the [CNCF Code of Conduct Committee](https://www.cncf.io/conduct/committee/) via conduct@cncf.io. Alternatively, you can contact any of the individual members of the [CNCF Code of Conduct Committee](https://www.cncf.io/conduct/committee/) to submit your report. For more detailed instructions on how to submit a report, including how to submit a report anonymously, please see our [Incident Resolution Procedures](https://www.cncf.io/conduct/procedures/). You can expect a response within three business days.
For incidents occurring at a CNCF event that is produced by the Linux Foundation, please contact eventconduct@cncf.io.
## Enforcement
The Kubernetes project's [Code of Conduct Committee](https://github.com/kubernetes/community/tree/master/committee-code-of-conduct) enforces code of conduct issues for the Kubernetes project.
Upon review and investigation of a reported incident, the CoC response team that has jurisdiction will determine what action is appropriate based on this Code of Conduct and its related documentation.
For all projects that do not have their own Code of Conduct Committee or other Code of Conduct incident responders, and for incidents that are project-agnostic or impact multiple projects, the [CNCF Code of Conduct Committee](https://www.cncf.io/conduct/committee/) enforces code of conduct issues. For more information, see our [Jurisdiction and Escalation Policy](https://www.cncf.io/conduct/jurisdiction/)
For information about which Code of Conduct incidents are handled by project leadership, which incidents are handled by the CNCF Code of Conduct Committee, and which incidents are handled by the Linux Foundation (including its events team), see our [Jurisdiction Policy](https://www.cncf.io/conduct/jurisdiction/).
Both bodies try to resolve incidents without punishment, but may remove people from the project or CNCF communities at their discretion.
## Amendments
Consistent with the CNCF Charter, any substantive changes to this Code of Conduct must be approved by the Technical Oversight Committee.
## Acknowledgements
This Code of Conduct is adapted from the Contributor Covenant
(http://contributor-covenant.org), version 2.0 available at
http://contributor-covenant.org/version/2/0/code_of_conduct/
http://contributor-covenant.org/version/2/0/code_of_conduct/

View File

@ -7,23 +7,25 @@ weight: 220
---
<!-- overview -->
{{< feature-state state="alpha" for_k8s_version="v1.28" >}}
Kubernetes {{< skew currentVersion >}} includes an alpha feature that lets a
{{< feature-state state="alpha" for_k8s_version="v1.28" >}}
Kubernetes {{< skew currentVersion >}} includes an alpha feature that lets an
{{< glossary_tooltip text="API Server" term_id="kube-apiserver" >}}
proxy resource requests to other _peer_ API servers. This is useful when there are multiple
API servers running different versions of Kubernetes in one cluster (for example, during a long-lived
rollout to a new release of Kubernetes).
API servers running different versions of Kubernetes in one cluster
(for example, during a long-lived rollout to a new release of Kubernetes).
This enables cluster administrators to configure highly available clusters that can be upgraded
more safely, by directing resource requests (made during the upgrade) to the correct kube-apiserver.
That proxying prevents users from seeing unexpected 404 Not Found errors that stem
from the upgrade process.
This mechanism is called the _Mixed Version Proxy_.
This mechanism is called the _Mixed Version Proxy_.
## Enabling the Mixed Version Proxy
Ensure that `UnknownVersionInteroperabilityProxy` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
Ensure that `UnknownVersionInteroperabilityProxy` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
is enabled when you start the {{< glossary_tooltip text="API Server" term_id="kube-apiserver" >}}:
```shell
@ -46,39 +48,63 @@ kube-apiserver \
### Proxy transport and authentication between API servers {#transport-and-authn}
* The source kube-apiserver reuses the [existing APIserver client authentication flags](https://kubernetes.io/docs/tasks/extend-kubernetes/configure-aggregation-layer/#kubernetes-apiserver-client-authentication) `--proxy-client-cert-file` and `--proxy-client-key-file` to present its identity that will be verified by its peer (the destination kube-apiserver). The destination API server verifies that peer connection based on the configuration you specify using the `--requestheader-client-ca-file` command line argument.
* The source kube-apiserver reuses the
[existing APIserver client authentication flags](/docs/tasks/extend-kubernetes/configure-aggregation-layer/#kubernetes-apiserver-client-authentication)
`--proxy-client-cert-file` and `--proxy-client-key-file` to present its identity that
will be verified by its peer (the destination kube-apiserver). The destination API server
verifies that peer connection based on the configuration you specify using the
`--requestheader-client-ca-file` command line argument.
* To authenticate the destination server's serving certs, you must configure a certificate authority bundle by specifying the `--peer-ca-file` command line argument to the **source** API server.
* To authenticate the destination server's serving certs, you must configure a certificate
authority bundle by specifying the `--peer-ca-file` command line argument to the **source** API server.
### Configuration for peer API server connectivity
To set the network location of a kube-apiserver that peers will use to proxy requests, use the
`--peer-advertise-ip` and `--peer-advertise-port` command line arguments to kube-apiserver or specify
To set the network location of a kube-apiserver that peers will use to proxy requests, use the
`--peer-advertise-ip` and `--peer-advertise-port` command line arguments to kube-apiserver or specify
these fields in the API server configuration file.
If these flags are unspecified, peers will use the value from either `--advertise-address` or
`--bind-address` command line argument to the kube-apiserver. If those too, are unset, the host's default interface is used.
`--bind-address` command line argument to the kube-apiserver.
If those, too, are unset, the host's default interface is used.
## Mixed version proxying
When you enable mixed version proxying, the [aggregation layer](/docs/concepts/extend-kubernetes/api-extension/apiserver-aggregation/)
When you enable mixed version proxying, the [aggregation layer](/docs/concepts/extend-kubernetes/api-extension/apiserver-aggregation/)
loads a special filter that does the following:
* When a resource request reaches an API server that cannot serve that API (either because it is at a version pre-dating the introduction of the API or the API is turned off on the API server) the API server attempts to send the request to a peer API server that can serve the requested API. It does so by identifying API groups / versions / resources that the local server doesn't recognise, and tries to proxy those requests to a peer API server that is capable of handling the request.
* If the peer API server fails to respond, the _source_ API server responds with 503("Service Unavailable") error.
* When a resource request reaches an API server that cannot serve that API
(either because it is at a version pre-dating the introduction of the API or the API is turned off on the API server)
the API server attempts to send the request to a peer API server that can serve the requested API.
It does so by identifying API groups / versions / resources that the local server doesn't recognise,
and tries to proxy those requests to a peer API server that is capable of handling the request.
* If the peer API server fails to respond, the _source_ API server responds with 503 ("Service Unavailable") error.
### How it works under the hood
When an API Server receives a resource request, it first checks which API servers can serve the requested resource. This check happens using the internal [`StorageVersion` API].
When an API Server receives a resource request, it first checks which API servers can
serve the requested resource. This check happens using the internal
[`StorageVersion` API](/docs/reference/generated/kubernetes-api/v{{< skew currentVersion >}}/#storageversioncondition-v1alpha1-internal-apiserver-k8s-io).
* If the resource is known to the API server that received the request (ex: `GET /api/v1/pods/some-pod`), the request is handled locally.
* If the resource is known to the API server that received the request
(for example, `GET /api/v1/pods/some-pod`), the request is handled locally.
* If there is no internal `StorageVersion` object found for the requested resource (ex: `GET /my-api/v1/my-resource`) and the configured APIService specifies proxying to an extension API server, that proxying happens following the usual
[flow](/docs/tasks/extend-kubernetes/configure-aggregation-layer/) for
extension APIs.
* If there is no internal `StorageVersion` object found for the requested resource
(for example, `GET /my-api/v1/my-resource`) and the configured APIService specifies proxying
to an extension API server, that proxying happens following the usual
[flow](/docs/tasks/extend-kubernetes/configure-aggregation-layer/) for extension APIs.
* If a valid internal `StorageVersion` object is found for the requested resource
(for example, `GET /batch/v1/jobs`) and the API server trying to handle the request
(the _handling API server_) has the `batch` API disabled, then the _handling API server_
fetches the peer API servers that do serve the relevant API group / version / resource
(`api/v1/batch` in this case) using the information in the fetched `StorageVersion` object.
The _handling API server_ then proxies the request to one of the matching peer kube-apiservers
that are aware of the requested resource.
* If there is no peer known for that API group / version / resource, the handling API server
passes the request to its own handler chain which should eventually return a 404 ("Not Found") response.
* If a valid internal `StorageVersion` object is found for the requested resource (ex: `GET /batch/v1/jobs`) and the API server trying to handle the request (the _handling API server_) has the `batch` API disabled, then the _handling API server_fetches the peer API servers that do serve the relevant API group / version / resource (`api/v1/batch` in this case) using the information in the fetched `StorageVersion` object. The _handling API server_ then proxies the request to one of the matching peer kube-apiservers that are aware of the requested resource.
* If there is no peer known for that API group / version / resource, the handling API server passes the request to its own handler chain which should eventually return a 404("Not Found") response.
* If the handling API server has identified and selected a peer API server, but that peer fails
to respond (for reasons such as network connectivity issues, or a data race between the request
being received and a controller registering the peer's info into the control plane), then the handling
API server responds with a 503 (“Service Unavailable”) error.
API server responds with a 503 ("Service Unavailable") error.

View File

@ -9,7 +9,8 @@ weight: 10
<!-- overview -->
Kubernetes runs your {{< glossary_tooltip text="workload" term_id="workload" >}} by placing containers into Pods to run on _Nodes_.
Kubernetes runs your {{< glossary_tooltip text="workload" term_id="workload" >}}
by placing containers into Pods to run on _Nodes_.
A node may be a virtual or physical machine, depending on the cluster. Each node
is managed by the
{{< glossary_tooltip text="control plane" term_id="control-plane" >}}
@ -28,14 +29,15 @@ The [components](/docs/concepts/overview/components/#node-components) on a node
## Management
There are two main ways to have Nodes added to the {{< glossary_tooltip text="API server" term_id="kube-apiserver" >}}:
There are two main ways to have Nodes added to the
{{< glossary_tooltip text="API server" term_id="kube-apiserver" >}}:
1. The kubelet on a node self-registers to the control plane
2. You (or another human user) manually add a Node object
After you create a Node {{< glossary_tooltip text="object" term_id="object" >}},
or the kubelet on a node self-registers, the control plane checks whether the new Node object is
valid. For example, if you try to create a Node from the following JSON manifest:
or the kubelet on a node self-registers, the control plane checks whether the new Node object
is valid. For example, if you try to create a Node from the following JSON manifest:
```json
{
@ -72,11 +74,10 @@ The name of a Node object must be a valid
The [name](/docs/concepts/overview/working-with-objects/names#names) identifies a Node. Two Nodes
cannot have the same name at the same time. Kubernetes also assumes that a resource with the same
name is the same object. In case of a Node, it is implicitly assumed that an instance using the
same name will have the same state (e.g. network settings, root disk contents)
and attributes like node labels. This may lead to
inconsistencies if an instance was modified without changing its name. If the Node needs to be
replaced or updated significantly, the existing Node object needs to be removed from API server
first and re-added after the update.
same name will have the same state (e.g. network settings, root disk contents) and attributes like
node labels. This may lead to inconsistencies if an instance was modified without changing its name.
If the Node needs to be replaced or updated significantly, the existing Node object needs to be
removed from API server first and re-added after the update.
### Self-registration of Nodes
@ -163,10 +164,10 @@ that should run on the Node even if it is being drained of workload applications
A Node's status contains the following information:
* [Addresses](#addresses)
* [Conditions](#condition)
* [Capacity and Allocatable](#capacity)
* [Info](#info)
* [Addresses](/docs/reference/node/node-status/#addresses)
* [Conditions](/docs/reference/node/node-status/#condition)
* [Capacity and Allocatable](/docs/reference/node/node-status/#capacity)
* [Info](/docs/reference/node/node-status/#info)
You can use `kubectl` to view a Node's status and other details:
@ -174,121 +175,21 @@ You can use `kubectl` to view a Node's status and other details:
kubectl describe node <insert-node-name-here>
```
Each section of the output is described below.
See [Node Status](/docs/reference/node/node-status/) for more details.
### Addresses
The usage of these fields varies depending on your cloud provider or bare metal configuration.
* HostName: The hostname as reported by the node's kernel. Can be overridden via the kubelet
`--hostname-override` parameter.
* ExternalIP: Typically the IP address of the node that is externally routable (available from
outside the cluster).
* InternalIP: Typically the IP address of the node that is routable only within the cluster.
### Conditions {#condition}
The `conditions` field describes the status of all `Running` nodes. Examples of conditions include:
{{< table caption = "Node conditions, and a description of when each condition applies." >}}
| Node Condition | Description |
|----------------------|-------------|
| `Ready` | `True` if the node is healthy and ready to accept pods, `False` if the node is not healthy and is not accepting pods, and `Unknown` if the node controller has not heard from the node in the last `node-monitor-grace-period` (default is 40 seconds) |
| `DiskPressure` | `True` if pressure exists on the disk size—that is, if the disk capacity is low; otherwise `False` |
| `MemoryPressure` | `True` if pressure exists on the node memory—that is, if the node memory is low; otherwise `False` |
| `PIDPressure` | `True` if pressure exists on the processes—that is, if there are too many processes on the node; otherwise `False` |
| `NetworkUnavailable` | `True` if the network for the node is not correctly configured, otherwise `False` |
{{< /table >}}
{{< note >}}
If you use command-line tools to print details of a cordoned Node, the Condition includes
`SchedulingDisabled`. `SchedulingDisabled` is not a Condition in the Kubernetes API; instead,
cordoned nodes are marked Unschedulable in their spec.
{{< /note >}}
In the Kubernetes API, a node's condition is represented as part of the `.status`
of the Node resource. For example, the following JSON structure describes a healthy node:
```json
"conditions": [
{
"type": "Ready",
"status": "True",
"reason": "KubeletReady",
"message": "kubelet is posting ready status",
"lastHeartbeatTime": "2019-06-05T18:38:35Z",
"lastTransitionTime": "2019-06-05T11:41:27Z"
}
]
```
When problems occur on nodes, the Kubernetes control plane automatically creates
[taints](/docs/concepts/scheduling-eviction/taint-and-toleration/) that match the conditions
affecting the node. An example of this is when the `status` of the Ready condition
remains `Unknown` or `False` for longer than the kube-controller-manager's `NodeMonitorGracePeriod`,
which defaults to 40 seconds. This will cause either an `node.kubernetes.io/unreachable` taint, for an `Unknown` status,
or a `node.kubernetes.io/not-ready` taint, for a `False` status, to be added to the Node.
These taints affect pending pods as the scheduler takes the Node's taints into consideration when
assigning a pod to a Node. Existing pods scheduled to the node may be evicted due to the application
of `NoExecute` taints. Pods may also have {{< glossary_tooltip text="tolerations" term_id="toleration" >}} that let
them schedule to and continue running on a Node even though it has a specific taint.
See [Taint Based Evictions](/docs/concepts/scheduling-eviction/taint-and-toleration/#taint-based-evictions) and
[Taint Nodes by Condition](/docs/concepts/scheduling-eviction/taint-and-toleration/#taint-nodes-by-condition)
for more details.
### Capacity and Allocatable {#capacity}
Describes the resources available on the node: CPU, memory, and the maximum
number of pods that can be scheduled onto the node.
The fields in the capacity block indicate the total amount of resources that a
Node has. The allocatable block indicates the amount of resources on a
Node that is available to be consumed by normal Pods.
You may read more about capacity and allocatable resources while learning how
to [reserve compute resources](/docs/tasks/administer-cluster/reserve-compute-resources/#node-allocatable)
on a Node.
### Info
Describes general information about the node, such as kernel version, Kubernetes
version (kubelet and kube-proxy version), container runtime details, and which
operating system the node uses.
The kubelet gathers this information from the node and publishes it into
the Kubernetes API.
## Heartbeats
## Node heartbeats
Heartbeats, sent by Kubernetes nodes, help your cluster determine the
availability of each node, and to take action when failures are detected.
For nodes there are two forms of heartbeats:
* updates to the `.status` of a Node
* Updates to the [`.status`](/docs/reference/node/node-status/) of a Node.
* [Lease](/docs/concepts/architecture/leases/) objects
within the `kube-node-lease`
{{< glossary_tooltip term_id="namespace" text="namespace">}}.
Each Node has an associated Lease object.
Compared to updates to `.status` of a Node, a Lease is a lightweight resource.
Using Leases for heartbeats reduces the performance impact of these updates
for large clusters.
The kubelet is responsible for creating and updating the `.status` of Nodes,
and for updating their related Leases.
- The kubelet updates the node's `.status` either when there is change in status
or if there has been no update for a configured interval. The default interval
for `.status` updates to Nodes is 5 minutes, which is much longer than the 40
second default timeout for unreachable nodes.
- The kubelet creates and then updates its Lease object every 10 seconds
(the default update interval). Lease updates occur independently from
updates to the Node's `.status`. If the Lease update fails, the kubelet retries,
using exponential backoff that starts at 200 milliseconds and capped at 7 seconds.
## Node controller
The node {{< glossary_tooltip text="controller" term_id="controller" >}} is a
@ -327,8 +228,7 @@ from more than 1 node per 10 seconds.
The node eviction behavior changes when a node in a given availability zone
becomes unhealthy. The node controller checks what percentage of nodes in the zone
are unhealthy (the `Ready` condition is `Unknown` or `False`) at
the same time:
are unhealthy (the `Ready` condition is `Unknown` or `False`) at the same time:
- If the fraction of unhealthy nodes is at least `--unhealthy-zone-threshold`
(default 0.55), then the eviction rate is reduced.
@ -424,8 +324,8 @@ node shutdown has been detected, so that even Pods with a
{{< glossary_tooltip text="toleration" term_id="toleration" >}} for
`node.kubernetes.io/not-ready:NoSchedule` do not start there.
At the same time when kubelet is setting that condition on its Node via the API, the kubelet also begins
terminating any Pods that are running locally.
At the same time when kubelet is setting that condition on its Node via the API,
the kubelet also begins terminating any Pods that are running locally.
During a graceful shutdown, kubelet terminates pods in two phases:
@ -448,10 +348,9 @@ Graceful node shutdown feature is configured with two
{{< note >}}
There are cases when Node termination was cancelled by the system (or perhaps manually
by an administrator). In either of those situations the
Node will return to the `Ready` state. However Pods which already started the process
of termination
will not be restored by kubelet and will need to be re-scheduled.
by an administrator). In either of those situations the Node will return to the `Ready` state.
However, Pods which already started the process of termination will not be restored by kubelet
and will need to be re-scheduled.
{{< /note >}}
@ -544,7 +443,6 @@ example, you could instead use these settings:
| 1000 |120 seconds |
| 0 |60 seconds |
In the above case, the pods with `custom-class-b` will go into the same bucket
as `custom-class-c` for shutdown.
@ -556,8 +454,8 @@ If this feature is enabled and no configuration is provided, then no ordering
action will be taken.
Using this feature requires enabling the `GracefulNodeShutdownBasedOnPodPriority`
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
, and setting `ShutdownGracePeriodByPodPriority` in the
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/),
and setting `ShutdownGracePeriodByPodPriority` in the
[kubelet config](/docs/reference/config-api/kubelet-config.v1beta1/)
to the desired configuration containing the pod priority class values and
their respective shutdown periods.
@ -582,24 +480,25 @@ ShutdownGracePeriodCriticalPods are not configured properly. Please refer to abo
section [Graceful Node Shutdown](#graceful-node-shutdown) for more details.
When a node is shutdown but not detected by kubelet's Node Shutdown Manager, the pods
that are part of a {{< glossary_tooltip text="StatefulSet" term_id="statefulset" >}} will be stuck in terminating status on
the shutdown node and cannot move to a new running node. This is because kubelet on
the shutdown node is not available to delete the pods so the StatefulSet cannot
create a new pod with the same name. If there are volumes used by the pods, the
VolumeAttachments will not be deleted from the original shutdown node so the volumes
that are part of a {{< glossary_tooltip text="StatefulSet" term_id="statefulset" >}}
will be stuck in terminating status on the shutdown node and cannot move to a new running node.
This is because kubelet on the shutdown node is not available to delete the pods so
the StatefulSet cannot create a new pod with the same name. If there are volumes used by the pods,
the VolumeAttachments will not be deleted from the original shutdown node so the volumes
used by these pods cannot be attached to a new running node. As a result, the
application running on the StatefulSet cannot function properly. If the original
shutdown node comes up, the pods will be deleted by kubelet and new pods will be
created on a different running node. If the original shutdown node does not come up,
these pods will be stuck in terminating status on the shutdown node forever.
To mitigate the above situation, a user can manually add the taint `node.kubernetes.io/out-of-service` with either `NoExecute`
or `NoSchedule` effect to a Node marking it out-of-service.
To mitigate the above situation, a user can manually add the taint `node.kubernetes.io/out-of-service`
with either `NoExecute` or `NoSchedule` effect to a Node marking it out-of-service.
If the `NodeOutOfServiceVolumeDetach` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
is enabled on {{< glossary_tooltip text="kube-controller-manager" term_id="kube-controller-manager" >}}, and a Node is marked out-of-service with this taint, the
pods on the node will be forcefully deleted if there are no matching tolerations on it and volume
detach operations for the pods terminating on the node will happen immediately. This allows the
Pods on the out-of-service node to recover quickly on a different node.
is enabled on {{< glossary_tooltip text="kube-controller-manager" term_id="kube-controller-manager" >}},
and a Node is marked out-of-service with this taint, the pods on the node will be forcefully deleted
if there are no matching tolerations on it and volume detach operations for the pods terminating on
the node will happen immediately. This allows the Pods on the out-of-service node to recover quickly
on a different node.
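For example, you could mark a shut-down node as out-of-service like this (the node name is a
placeholder, and the taint value `nodeshutdown` is arbitrary):

```shell
# mark the node as out of service so that its Pods can be force-deleted and detached
kubectl taint nodes <node-name> node.kubernetes.io/out-of-service=nodeshutdown:NoExecute

# later, after the Pods have moved and the node has been verified as recovered,
# remove the taint again
kubectl taint nodes <node-name> node.kubernetes.io/out-of-service=nodeshutdown:NoExecute-
```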
During a non-graceful shutdown, Pods are terminated in the following two phases:
@ -607,9 +506,8 @@ During a non-graceful shutdown, Pods are terminated in the two phases:
2. Immediately perform detach volume operation for such pods.
{{< note >}}
- Before adding the taint `node.kubernetes.io/out-of-service` , it should be verified
that the node is already in shutdown or power off state (not in the middle of
restarting).
- Before adding the taint `node.kubernetes.io/out-of-service`, it should be verified
that the node is already in shutdown or power off state (not in the middle of restarting).
- The user is required to manually remove the out-of-service taint after the pods are
moved to a new node and the user has checked that the shutdown node has been
recovered since the user was the one who originally added the taint.
@ -639,7 +537,8 @@ memorySwap:
- `UnlimitedSwap` (default): Kubernetes workloads can use as much swap memory as they
request, up to the system limit.
- `LimitedSwap`: The utilization of swap memory by Kubernetes workloads is subject to limitations. Only Pods of Burstable QoS are permitted to employ swap.
- `LimitedSwap`: The utilization of swap memory by Kubernetes workloads is subject to limitations.
Only Pods of Burstable QoS are permitted to employ swap.
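A minimal kubelet configuration sketch for limited swap might look like this; it assumes the
`NodeSwap` feature gate and that the kubelet is allowed to start on a node with swap enabled:

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  NodeSwap: true
# do not fail kubelet startup just because swap is enabled on the node
failSwapOn: false
memorySwap:
  swapBehavior: LimitedSwap
```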
If configuration for `memorySwap` is not specified and the feature gate is
enabled, by default the kubelet will apply the same behaviour as the
@ -647,13 +546,14 @@ enabled, by default the kubelet will apply the same behaviour as the
With `LimitedSwap`, Pods that do not fall under the Burstable QoS classification (i.e.
`BestEffort`/`Guaranteed` Qos Pods) are prohibited from utilizing swap memory.
To maintain the aforementioned security and node
health guarantees, these Pods are not permitted to use swap memory when `LimitedSwap` is
in effect.
To maintain the aforementioned security and node health guarantees, these Pods
are not permitted to use swap memory when `LimitedSwap` is in effect.
Prior to detailing the calculation of the swap limit, it is necessary to define the following terms:
* `nodeTotalMemory`: The total amount of physical memory available on the node.
* `totalPodsSwapAvailable`: The total amount of swap memory on the node that is available for use by Pods (some swap memory may be reserved for system use).
* `totalPodsSwapAvailable`: The total amount of swap memory on the node that is available for use by Pods
(some swap memory may be reserved for system use).
* `containerMemoryRequest`: The container's memory request.
Swap limitation is configured as:
@ -663,19 +563,21 @@ It is important to note that, for containers within Burstable QoS Pods, it is po
opt-out of swap usage by specifying memory requests that are equal to memory limits.
Containers configured in this manner will not have access to swap memory.
Swap is supported only with **cgroup v2**; cgroup v1 is not supported.
For more information, and to assist with testing and provide feedback, please
see the blog-post about [Kubernetes 1.28: NodeSwap graduates to Beta1](/blog/2023/07/18/swap-beta1-1.28-2023/),
see the blog-post about [Kubernetes 1.28: NodeSwap graduates to Beta1](/blog/2023/08/24/swap-linux-beta/),
[KEP-2400](https://github.com/kubernetes/enhancements/issues/4128) and its
[design proposal](https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/2400-node-swap/README.md).
## {{% heading "whatsnext" %}}
Learn more about the following:
* [Components](/docs/concepts/overview/components/#node-components) that make up a node.
* [API definition for Node](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#node-v1-core).
* [Node](https://git.k8s.io/design-proposals-archive/architecture/architecture.md#the-kubernetes-node) section of the architecture design document.
* [Node](https://git.k8s.io/design-proposals-archive/architecture/architecture.md#the-kubernetes-node)
section of the architecture design document.
* [Taints and Tolerations](/docs/concepts/scheduling-eviction/taint-and-toleration/).
* [Node Resource Managers](/docs/concepts/policy/node-resource-managers/).
* [Resource Management for Windows nodes](/docs/concepts/configuration/windows-resource-management/).

View File

@ -8,6 +8,12 @@ content_type: concept
description: >
Lower-level detail relevant to creating or administering a Kubernetes cluster.
no_list: true
card:
name: setup
weight: 60
anchors:
- anchor: "#securing-a-cluster"
title: Securing a cluster
---
<!-- overview -->

View File

@ -470,24 +470,7 @@ traffic, you can configure rules to block any health check requests
that originate from outside your cluster.
{{< /caution >}}
{{% code file="priority-and-fairness/health-for-strangers.yaml" %}}
## Diagnostics
Every HTTP response from an API server with the priority and fairness feature
enabled has two extra headers: `X-Kubernetes-PF-FlowSchema-UID` and
`X-Kubernetes-PF-PriorityLevel-UID`, noting the flow schema that matched the request
and the priority level to which it was assigned, respectively. The API objects'
names are not included in these headers in case the requesting user does not
have permission to view them, so when debugging you can use a command like
```shell
kubectl get flowschemas -o custom-columns="uid:{metadata.uid},name:{metadata.name}"
kubectl get prioritylevelconfigurations -o custom-columns="uid:{metadata.uid},name:{metadata.name}"
```
to get a mapping of UIDs to names for both FlowSchemas and
PriorityLevelConfigurations.
{{% code_sample file="priority-and-fairness/health-for-strangers.yaml" %}}
## Observability
@ -678,114 +661,124 @@ poorly-behaved workloads that may be harming system health.
to a request being dispatched but did not, due to lack of available
concurrency, broken down by `flow_schema` and `priority_level`.
### Debug endpoints
## Good practices for using API Priority and Fairness
When you enable the API Priority and Fairness feature, the `kube-apiserver`
serves the following additional paths at its HTTP(S) ports.
When a given priority level exceeds its permitted concurrency, requests can
experience increased latency or be dropped with an HTTP 429 (Too Many Requests)
error. To prevent these side effects of APF, you can modify your workload or
tweak your APF settings to ensure there are sufficient seats available to serve
your requests.
- `/debug/api_priority_and_fairness/dump_priority_levels` - a listing of
all the priority levels and the current state of each. You can fetch like this:
To detect whether requests are being rejected due to APF, check the following
metrics:
```shell
kubectl get --raw /debug/api_priority_and_fairness/dump_priority_levels
```
- `apiserver_flowcontrol_rejected_requests_total`: the total number of requests
rejected per FlowSchema and PriorityLevelConfiguration.
- `apiserver_flowcontrol_current_inqueue_requests`: the current number of requests
queued per FlowSchema and PriorityLevelConfiguration.
- `apiserver_flowcontrol_request_wait_duration_seconds`: the latency added to
requests waiting in queues.
- `apiserver_flowcontrol_priority_level_seat_utilization`: the seat utilization
per PriorityLevelConfiguration.
The output is similar to this:
### Workload modifications {#good-practice-workload-modifications}
```none
PriorityLevelName, ActiveQueues, IsIdle, IsQuiescing, WaitingRequests, ExecutingRequests, DispatchedRequests, RejectedRequests, TimedoutRequests, CancelledRequests
catch-all, 0, true, false, 0, 0, 1, 0, 0, 0
exempt, <none>, <none>, <none>, <none>, <none>, <none>, <none>, <none>, <none>
global-default, 0, true, false, 0, 0, 46, 0, 0, 0
leader-election, 0, true, false, 0, 0, 4, 0, 0, 0
node-high, 0, true, false, 0, 0, 34, 0, 0, 0
system, 0, true, false, 0, 0, 48, 0, 0, 0
workload-high, 0, true, false, 0, 0, 500, 0, 0, 0
workload-low, 0, true, false, 0, 0, 0, 0, 0, 0
```
To prevent requests from queuing and adding latency or being dropped due to APF,
you can optimize your requests by:
- `/debug/api_priority_and_fairness/dump_queues` - a listing of all the
queues and their current state. You can fetch like this:
- Reducing the rate at which requests are executed. Fewer requests over a fixed
period will result in fewer seats being needed at a given time.
- Avoid issuing a large number of expensive requests concurrently. Requests can
be optimized to use fewer seats or have lower latency so that these requests
hold those seats for a shorter duration. List requests can occupy more than 1
seat depending on the number of objects fetched during the request. Restricting
the number of objects retrieved in a list request, for example by using
pagination, will use fewer total seats over a shorter period (see the example
after this list). Furthermore, replacing list requests with watch requests
will require lower total concurrency shares, as watch requests only occupy
1 seat during their initial burst of notifications. If using streaming lists
in versions 1.27 and later, watch requests will occupy the same number of seats
as a list request for their initial burst of notifications, because the entire
state of the collection has to be streamed. Note that in both cases, a watch
request will not hold any seats after this initial phase.
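For example, a client can page through a large collection instead of fetching it in one
request; with `kubectl`, the `--chunk-size` flag controls this (the value is illustrative):

```shell
# retrieve Pods in pages of 100 objects instead of one large list request
kubectl get pods --all-namespaces --chunk-size=100
```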
```shell
kubectl get --raw /debug/api_priority_and_fairness/dump_queues
```
Keep in mind that queuing or rejected requests from APF could be induced by
either an increase in the number of requests or an increase in latency for
existing requests. For example, if requests that normally take 1s to execute
start taking 60s, it is possible that APF will start rejecting requests because
requests are occupying seats for a longer duration than normal due to this
increase in latency. If APF starts rejecting requests across multiple priority
levels without a significant change in workload, it is possible there is an
underlying issue with control plane performance rather than the workload or APF
settings.
The output is similar to this:
### Priority and fairness settings {#good-practice-apf-settings}
```none
PriorityLevelName, Index, PendingRequests, ExecutingRequests, VirtualStart,
workload-high, 0, 0, 0, 0.0000,
workload-high, 1, 0, 0, 0.0000,
workload-high, 2, 0, 0, 0.0000,
...
leader-election, 14, 0, 0, 0.0000,
leader-election, 15, 0, 0, 0.0000,
```
You can also modify the default FlowSchema and PriorityLevelConfiguration
objects or create new objects of these types to better accommodate your
workload.
- `/debug/api_priority_and_fairness/dump_requests` - a listing of all the requests
that are currently waiting in a queue. You can fetch like this:
APF settings can be modified to:
```shell
kubectl get --raw /debug/api_priority_and_fairness/dump_requests
```
- Give more seats to high priority requests.
- Isolate non-essential or expensive requests that would starve a concurrency
level if it was shared with other flows.
The output is similar to this:
#### Give more seats to high priority requests
```none
PriorityLevelName, FlowSchemaName, QueueIndex, RequestIndexInQueue, FlowDistingsher, ArriveTime,
exempt, <none>, <none>, <none>, <none>, <none>,
system, system-nodes, 12, 0, system:node:127.0.0.1, 2020-07-23T15:26:57.179170694Z,
```
In addition to the queued requests, the output includes one phantom line
for each priority level that is exempt from limitation.
1. If possible, the number of seats available across all priority levels for a
particular `kube-apiserver` can be increased by increasing the values for the
`max-requests-inflight` and `max-mutating-requests-inflight` flags. Alternatively,
horizontally scaling the number of `kube-apiserver` instances will increase the
total concurrency per priority level across the cluster assuming there is
sufficient load balancing of requests.
1. You can create a new FlowSchema which references a PriorityLevelConfiguration
with a larger concurrency level. This new PriorityLevelConfiguration could be an
existing level or a new level with its own set of nominal concurrency shares.
For example, a new FlowSchema could be introduced to change the
PriorityLevelConfiguration for your requests from global-default to workload-low
to increase the number of seats available to your user. Creating a new
PriorityLevelConfiguration will reduce the number of seats designated for
existing levels. Recall that editing a default FlowSchema or
PriorityLevelConfiguration will require setting the
`apf.kubernetes.io/autoupdate-spec` annotation to false.
1. You can also increase the NominalConcurrencyShares for the
PriorityLevelConfiguration which is serving your high priority requests.
Alternatively, for versions 1.26 and later, you can increase the LendablePercent
for competing priority levels so that the given priority level has a higher pool
of seats it can borrow.
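For example, the second option above could look like the following sketch. The FlowSchema
name, the subject, and the resource rule are illustrative, and the API version
(`flowcontrol.apiserver.k8s.io/v1beta3` here) depends on your cluster version:

```yaml
apiVersion: flowcontrol.apiserver.k8s.io/v1beta3
kind: FlowSchema
metadata:
  name: batch-reporting-low    # illustrative name
spec:
  # lower than the precedence of the FlowSchema that currently matches these requests
  matchingPrecedence: 8000
  priorityLevelConfiguration:
    name: workload-low         # existing priority level with more seats to spare
  distinguisherMethod:
    type: ByUser
  rules:
    - subjects:
        - kind: User
          user:
            name: batch-reporting-user   # illustrative user
      resourceRules:
        - verbs: ["list", "get"]
          apiGroups: [""]
          resources: ["pods"]
          namespaces: ["*"]
```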
You can get a more detailed listing with a command like this:
#### Isolate non-essential requests from starving other flows
```shell
kubectl get --raw '/debug/api_priority_and_fairness/dump_requests?includeRequestDetails=1'
```
For request isolation, you can create a FlowSchema whose subject matches the
user making these requests or create a FlowSchema that matches what the request
is (corresponding to the resourceRules). Next, you can map this FlowSchema to a
PriorityLevelConfiguration with a low share of seats.
The output is similar to this:
For example, suppose list event requests from Pods running in the default namespace
are using 10 seats each and execute for 1 minute. To prevent these expensive
requests from impacting requests from other Pods using the existing service-accounts
FlowSchema, you can apply the following FlowSchema to isolate these list calls
from other requests.
```none
PriorityLevelName, FlowSchemaName, QueueIndex, RequestIndexInQueue, FlowDistingsher, ArriveTime, UserName, Verb, APIPath, Namespace, Name, APIVersion, Resource, SubResource,
system, system-nodes, 12, 0, system:node:127.0.0.1, 2020-07-23T15:31:03.583823404Z, system:node:127.0.0.1, create, /api/v1/namespaces/scaletest/configmaps,
system, system-nodes, 12, 1, system:node:127.0.0.1, 2020-07-23T15:31:03.594555947Z, system:node:127.0.0.1, create, /api/v1/namespaces/scaletest/configmaps,
```
Example FlowSchema object to isolate list event requests:
### Debug logging
{{% code_sample file="priority-and-fairness/list-events-default-service-account.yaml" %}}
At `-v=3` or more verbose the server outputs an httplog line for every
request, and it includes the following attributes.
- `apf_fs`: the name of the flow schema to which the request was classified.
- `apf_pl`: the name of the priority level for that flow schema.
- `apf_iseats`: the number of seats determined for the initial
(normal) stage of execution of the request.
- `apf_fseats`: the number of seats determined for the final stage of
execution (accounting for the associated WATCH notifications) of the
request.
- `apf_additionalLatency`: the duration of the final stage of
execution of the request.
At higher levels of verbosity there will be log lines exposing details
of how APF handled the request, primarily for debugging purposes.
### Response headers
APF adds the following two headers to each HTTP response message.
- `X-Kubernetes-PF-FlowSchema-UID` holds the UID of the FlowSchema
object to which the corresponding request was classified.
- `X-Kubernetes-PF-PriorityLevel-UID` holds the UID of the
PriorityLevelConfiguration object associated with that FlowSchema.
- This FlowSchema captures all list event calls made by the default service
account in the default namespace. The matching precedence 8000 is lower than the
value of 9000 used by the existing service-accounts FlowSchema so these list
event calls will match list-events-default-service-account rather than
service-accounts.
- The catch-all PriorityLevelConfiguration is used to isolate these requests.
The catch-all priority level has a very small concurrency share and does not
queue requests.
## {{% heading "whatsnext" %}}
For background information on design details for API priority and fairness, see
- You can visit flow control [reference doc](/docs/reference/debug-cluster/flow-control/) to learn more about troubleshooting.
- For background information on design details for API priority and fairness, see
the [enhancement proposal](https://github.com/kubernetes/enhancements/tree/master/keps/sig-api-machinery/1040-priority-and-fairness).
You can make suggestions and feature requests via [SIG API Machinery](https://github.com/kubernetes/community/tree/master/sig-api-machinery)
- You can make suggestions and feature requests via [SIG API Machinery](https://github.com/kubernetes/community/tree/master/sig-api-machinery)
or the feature's [slack channel](https://kubernetes.slack.com/messages/api-priority-and-fairness).

View File

@ -39,7 +39,7 @@ Kubernetes captures logs from each container in a running Pod.
This example uses a manifest for a `Pod` with a container
that writes text to the standard output stream, once per second.
{{% code file="debug/counter-pod.yaml" %}}
{{% code_sample file="debug/counter-pod.yaml" %}}
To run this pod, use the following command:
@ -255,7 +255,7 @@ For example, a pod runs a single container, and the container
writes to two different log files using two different formats. Here's a
manifest for the Pod:
{{% code file="admin/logging/two-files-counter-pod.yaml" %}}
{{% code_sample file="admin/logging/two-files-counter-pod.yaml" %}}
It is not recommended to write log entries with different formats to the same log
stream, even if you managed to redirect both components to the `stdout` stream of
@ -265,7 +265,7 @@ the logs to its own `stdout` stream.
Here's a manifest for a pod that has two sidecar containers:
{{% code file="admin/logging/two-files-counter-pod-streaming-sidecar.yaml" %}}
{{% code_sample file="admin/logging/two-files-counter-pod-streaming-sidecar.yaml" %}}
Now when you run this pod, you can access each log stream separately by
running the following commands:
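Assuming the Pod in that manifest is named `counter` and its streaming sidecar containers are
named `count-log-1` and `count-log-2` (names assumed here; check the referenced manifest), the
commands would look like:

```shell
kubectl logs counter count-log-1
kubectl logs counter count-log-2
```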
@ -332,7 +332,7 @@ Here are two example manifests that you can use to implement a sidecar container
The first manifest contains a [`ConfigMap`](/docs/tasks/configure-pod-container/configure-pod-configmap/)
to configure fluentd.
{{% code file="admin/logging/fluentd-sidecar-config.yaml" %}}
{{% code_sample file="admin/logging/fluentd-sidecar-config.yaml" %}}
{{< note >}}
In the sample configurations, you can replace fluentd with any logging agent, reading
@ -342,7 +342,7 @@ from any source inside an application container.
The second manifest describes a pod that has a sidecar container running fluentd.
The pod mounts a volume where fluentd can pick up its configuration data.
{{% code file="admin/logging/two-files-counter-pod-agent-sidecar.yaml" %}}
{{% code_sample file="admin/logging/two-files-counter-pod-agent-sidecar.yaml" %}}
### Exposing logs directly from the application

View File

@ -10,9 +10,6 @@ weight: 40
You've deployed your application and exposed it via a service. Now what? Kubernetes provides a
number of tools to help you manage your application deployment, including scaling and updating.
Among the features that we will discuss in more depth are
[configuration files](/docs/concepts/configuration/overview/) and
[labels](/docs/concepts/overview/working-with-objects/labels/).
<!-- body -->
@ -22,7 +19,7 @@ Many applications require multiple resources to be created, such as a Deployment
Management of multiple resources can be simplified by grouping them together in the same file
(separated by `---` in YAML). For example:
{{% code file="application/nginx-app.yaml" %}}
{{% code_sample file="application/nginx-app.yaml" %}}
Multiple resources can be created the same way as a single resource:
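For example, assuming the manifest above has been saved locally as `nginx-app.yaml`:

```shell
# creates every resource defined in the file, in the order they appear
kubectl apply -f nginx-app.yaml
```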
@ -176,71 +173,10 @@ persistentvolumeclaim/my-pvc created
If you're interested in learning more about `kubectl`, go ahead and read
[Command line tool (kubectl)](/docs/reference/kubectl/).
## Using labels effectively
The examples we've used so far apply at most a single label to any resource. There are many
scenarios where multiple labels should be used to distinguish sets from one another.
For instance, different applications would use different values for the `app` label, but a
multi-tier application, such as the [guestbook example](https://github.com/kubernetes/examples/tree/master/guestbook/),
would additionally need to distinguish each tier. The frontend could carry the following labels:
```yaml
labels:
app: guestbook
tier: frontend
```
while the Redis master and slave would have different `tier` labels, and perhaps even an
additional `role` label:
```yaml
labels:
app: guestbook
tier: backend
role: master
```
and
```yaml
labels:
app: guestbook
tier: backend
role: slave
```
The labels allow us to slice and dice our resources along any dimension specified by a label:
```shell
kubectl apply -f examples/guestbook/all-in-one/guestbook-all-in-one.yaml
kubectl get pods -Lapp -Ltier -Lrole
```
```none
NAME READY STATUS RESTARTS AGE APP TIER ROLE
guestbook-fe-4nlpb 1/1 Running 0 1m guestbook frontend <none>
guestbook-fe-ght6d 1/1 Running 0 1m guestbook frontend <none>
guestbook-fe-jpy62 1/1 Running 0 1m guestbook frontend <none>
guestbook-redis-master-5pg3b 1/1 Running 0 1m guestbook backend master
guestbook-redis-slave-2q2yf 1/1 Running 0 1m guestbook backend slave
guestbook-redis-slave-qgazl 1/1 Running 0 1m guestbook backend slave
my-nginx-divi2 1/1 Running 0 29m nginx <none> <none>
my-nginx-o0ef1 1/1 Running 0 29m nginx <none> <none>
```
```shell
kubectl get pods -lapp=guestbook,role=slave
```
```none
NAME READY STATUS RESTARTS AGE
guestbook-redis-slave-2q2yf 1/1 Running 0 3m
guestbook-redis-slave-qgazl 1/1 Running 0 3m
```
## Canary deployments
<!--TODO: make a task out of this for canary deployment, ref #42786-->
Another scenario where multiple labels are needed is to distinguish deployments of different
releases or configurations of the same component. It is common practice to deploy a *canary* of a
new application release (specified via image tag in the pod template) side by side with the
@ -296,42 +232,6 @@ the canary one.
For a more concrete example, check the
[tutorial of deploying Ghost](https://github.com/kelseyhightower/talks/tree/master/kubecon-eu-2016/demo#deploy-a-canary).
## Updating labels
Sometimes existing pods and other resources need to be relabeled before creating new resources.
This can be done with `kubectl label`.
For example, if you want to label all your nginx pods as frontend tier, run:
```shell
kubectl label pods -l app=nginx tier=fe
```
```none
pod/my-nginx-2035384211-j5fhi labeled
pod/my-nginx-2035384211-u2c7e labeled
pod/my-nginx-2035384211-u3t6x labeled
```
This first filters all pods with the label "app=nginx", and then labels them with the "tier=fe".
To see the pods you labeled, run:
```shell
kubectl get pods -l app=nginx -L tier
```
```none
NAME READY STATUS RESTARTS AGE TIER
my-nginx-2035384211-j5fhi 1/1 Running 0 23m fe
my-nginx-2035384211-u2c7e 1/1 Running 0 23m fe
my-nginx-2035384211-u3t6x 1/1 Running 0 23m fe
```
This outputs all "app=nginx" pods, with an additional label column of pods' tier (specified with
`-L` or `--label-columns`).
For more information, please see [labels](/docs/concepts/overview/working-with-objects/labels/)
and [kubectl label](/docs/reference/generated/kubectl/kubectl-commands/#label).
## Updating annotations
Sometimes you would want to attach annotations to resources. Annotations are arbitrary

View File

@ -34,12 +34,21 @@ To learn about the Kubernetes networking model, see [here](/docs/concepts/servic
## How to implement the Kubernetes network model
The network model is implemented by the container runtime on each node. The most common container runtimes use [Container Network Interface](https://github.com/containernetworking/cni) (CNI) plugins to manage their network and security capabilities. Many different CNI plugins exist from many different vendors. Some of these provide only basic features of adding and removing network interfaces, while others provide more sophisticated solutions, such as integration with other container orchestration systems, running multiple CNI plugins, advanced IPAM features etc.
The network model is implemented by the container runtime on each node. The most common container
runtimes use [Container Network Interface](https://github.com/containernetworking/cni) (CNI)
plugins to manage their network and security capabilities. Many different CNI plugins exist from
many different vendors. Some of these provide only basic features of adding and removing network
interfaces, while others provide more sophisticated solutions, such as integration with other
container orchestration systems, running multiple CNI plugins, advanced IPAM features etc.
See [this page](/docs/concepts/cluster-administration/addons/#networking-and-network-policy) for a non-exhaustive list of networking addons supported by Kubernetes.
See [this page](/docs/concepts/cluster-administration/addons/#networking-and-network-policy)
for a non-exhaustive list of networking addons supported by Kubernetes.
## {{% heading "whatsnext" %}}
The early design of the networking model and its rationale, and some future
plans are described in more detail in the
The early design of the networking model and its rationale are described in more detail in the
[networking design document](https://git.k8s.io/design-proposals-archive/network/networking.md).
For future plans and some on-going efforts that aim to improve Kubernetes networking, please
refer to the SIG-Network
[KEPs](https://github.com/kubernetes/enhancements/tree/master/keps/sig-network).

View File

@ -21,7 +21,7 @@ This format is structured plain text, designed so that people and machines can b
## Metrics in Kubernetes
In most cases metrics are available on `/metrics` endpoint of the HTTP server. For components that
doesn't expose endpoint by default it can be enabled using `--bind-address` flag.
don't expose the endpoint by default, it can be enabled using the `--bind-address` flag.
Examples of those components:
@ -119,21 +119,6 @@ If you're upgrading from release `1.12` to `1.13`, but still depend on a metric
`1.12`, you should set hidden metrics via command line: `--show-hidden-metrics=1.12` and remember
to remove this metric dependency before upgrading to `1.14`
## Disable accelerator metrics
The kubelet collects accelerator metrics through cAdvisor. To collect these metrics, for
accelerators like NVIDIA GPUs, kubelet held an open handle on the driver. This meant that in order
to perform infrastructure changes (for example, updating the driver), a cluster administrator
needed to stop the kubelet agent.
The responsibility for collecting accelerator metrics now belongs to the vendor rather than the
kubelet. Vendors must provide a container that collects metrics and exposes them to the metrics
service (for example, Prometheus).
The [`DisableAcceleratorUsageMetrics` feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
disables metrics collected by the kubelet, with a
[timeline for enabling this feature by default](https://github.com/kubernetes/enhancements/tree/411e51027db842355bd489691af897afc1a41a5e/keps/sig-node/1867-disable-accelerator-usage-metrics#graduation-criteria).
## Component metrics
### kube-controller-manager metrics

View File

@ -105,8 +105,12 @@ span will be sent to the exporter.
The kubelet in Kubernetes v{{< skew currentVersion >}} collects spans from
the garbage collection, pod synchronization routine as well as every gRPC
method. Connected container runtimes like CRI-O and containerd can link the
traces to their exported spans to provide additional context of information.
method. The kubelet propagates trace context with gRPC requests so that
container runtimes with trace instrumentation, such as CRI-O and containerd,
can associate their exported spans with the trace context from the kubelet.
The resulting traces will have parent-child links between kubelet and
container runtime spans, providing helpful context when debugging node
issues.
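A minimal sketch of turning this on in the kubelet configuration; the endpoint and sampling
rate are illustrative and assume an OpenTelemetry collector listening on the default OTLP
gRPC port:

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  KubeletTracing: true
tracing:
  # OTLP gRPC endpoint of your collector (illustrative)
  endpoint: localhost:4317
  # sample roughly 1 in 10,000 operations (illustrative)
  samplingRatePerMillion: 100
```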
Please note that exporting spans always comes with a small performance overhead
on the networking and CPU side, depending on the overall configuration of the

View File

@ -29,10 +29,12 @@ that exposes the database component to your cluster.
This lets you fetch a container image running in the cloud and
debug the exact same code locally if needed.
{{< note >}}
A ConfigMap is not designed to hold large chunks of data. The data stored in a
ConfigMap cannot exceed 1 MiB. If you need to store settings that are
larger than this limit, you may want to consider mounting a volume or use a
separate database or file service.
{{< /note >}}
## ConfigMap object
@ -111,7 +113,7 @@ technique also lets you access a ConfigMap in a different namespace.
Here's an example Pod that uses values from `game-demo` to configure a Pod:
{{% code file="configmap/configure-pod.yaml" %}}
{{% code_sample file="configmap/configure-pod.yaml" %}}
A ConfigMap doesn't differentiate between single line property values and
multi-line file-like values.

View File

@ -146,7 +146,7 @@ for a comprehensive list.
- Use label selectors for `get` and `delete` operations instead of specific object names. See the
sections on [label selectors](/docs/concepts/overview/working-with-objects/labels/#label-selectors)
and [using labels effectively](/docs/concepts/cluster-administration/manage-deployment/#using-labels-effectively).
and [using labels effectively](/docs/concepts/overview/working-with-objects/labels/#using-labels-effectively).
- Use `kubectl create deployment` and `kubectl expose` to quickly create single-container
Deployments and Services (see the example below).
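For instance, combining these recommendations (the name `hello-web` is illustrative):

```shell
# create a single-container Deployment and expose it
kubectl create deployment hello-web --image=nginx
kubectl expose deployment hello-web --port=80

# operate on the resulting objects by label selector rather than by name
kubectl get pods -l app=hello-web
kubectl delete deployment,services -l app=hello-web
```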

View File

@ -182,16 +182,16 @@ Kubernetes doesn't impose any constraints on the type name. However, if you
are using one of the built-in types, you must meet all the requirements defined
for that type.
If you are defining a type of secret that's for public use, follow the convention
and structure the secret type to have your domain name before the name, separated
If you are defining a type of Secret that's for public use, follow the convention
and structure the Secret type to have your domain name before the name, separated
by a `/`. For example: `cloud-hosting.example.net/cloud-api-credentials`.
### Opaque secrets
### Opaque Secrets
`Opaque` is the default Secret type if omitted from a Secret configuration file.
When you create a Secret using `kubectl`, you will use the `generic`
subcommand to indicate an `Opaque` Secret type. For example, the following
command creates an empty Secret of type `Opaque`.
`Opaque` is the default Secret type if you don't explicitly specify a type in
a Secret manifest. When you create a Secret using `kubectl`, you must use the
`generic` subcommand to indicate an `Opaque` Secret type. For example, the
following command creates an empty Secret of type `Opaque`:
```shell
kubectl create secret generic empty-secret
@ -208,50 +208,48 @@ empty-secret Opaque 0 2m6s
The `DATA` column shows the number of data items stored in the Secret.
In this case, `0` means you have created an empty Secret.
### Service account token Secrets
### ServiceAccount token Secrets
A `kubernetes.io/service-account-token` type of Secret is used to store a
token credential that identifies a
{{< glossary_tooltip text="service account" term_id="service-account" >}}.
{{< glossary_tooltip text="ServiceAccount" term_id="service-account" >}}. This
is a legacy mechanism that provides long-lived ServiceAccount credentials to
Pods.
In Kubernetes v1.22 and later, the recommended approach is to obtain a
short-lived, automatically rotating ServiceAccount token by using the
[`TokenRequest`](/docs/reference/kubernetes-api/authentication-resources/token-request-v1/)
API instead. You can get these short-lived tokens using the following methods:
* Call the `TokenRequest` API either directly or by using an API client like
`kubectl`. For example, you can use the
[`kubectl create token`](/docs/reference/generated/kubectl/kubectl-commands#-em-token-em-)
command, as shown in the example after this list.
* Request a mounted token in a
[projected volume](/docs/reference/access-authn-authz/service-accounts-admin/#bound-service-account-token-volume)
in your Pod manifest. Kubernetes creates the token and mounts it in the Pod.
The token is automatically invalidated when the Pod that it's mounted in is
deleted. For details, see
[Launch a Pod using service account token projection](/docs/tasks/configure-pod-container/configure-service-account/#launch-a-pod-using-service-account-token-projection).
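For example, to request a short-lived token for a ServiceAccount (the ServiceAccount name
`build-robot` and the duration are placeholders):

```shell
kubectl create token build-robot --duration=1h
```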
{{< note >}}
Versions of Kubernetes before v1.22 automatically created credentials for
accessing the Kubernetes API. This older mechanism was based on creating token
Secrets that could then be mounted into running Pods.
In more recent versions, including Kubernetes v{{< skew currentVersion >}}, API
credentials are obtained directly by using the
[TokenRequest](/docs/reference/kubernetes-api/authentication-resources/token-request-v1/)
API, and are mounted into Pods using a
[projected volume](/docs/reference/access-authn-authz/service-accounts-admin/#bound-service-account-token-volume).
The tokens obtained using this method have bounded lifetimes, and are
automatically invalidated when the Pod they are mounted into is deleted.
You can still
[manually create](/docs/tasks/configure-pod-container/configure-service-account/#manually-create-a-service-account-api-token)
a service account token Secret; for example, if you need a token that never
expires. However, using the
[TokenRequest](/docs/reference/kubernetes-api/authentication-resources/token-request-v1/)
subresource to obtain a token to access the API is recommended instead.
You can use the
[`kubectl create token`](/docs/reference/generated/kubectl/kubectl-commands#-em-token-em-)
command to obtain a token from the `TokenRequest` API.
{{< /note >}}
You should only create a service account token Secret object
You should only create a ServiceAccount token Secret
if you can't use the `TokenRequest` API to obtain a token,
and the security exposure of persisting a non-expiring token credential
in a readable API object is acceptable to you.
in a readable API object is acceptable to you. For instructions, see
[Manually create a long-lived API token for a ServiceAccount](/docs/tasks/configure-pod-container/configure-service-account/#manually-create-a-service-account-api-token).
{{< /note >}}
When using this Secret type, you need to ensure that the
`kubernetes.io/service-account.name` annotation is set to an existing
service account name. If you are creating both the ServiceAccount and
ServiceAccount name. If you are creating both the ServiceAccount and
the Secret objects, you should create the ServiceAccount object first.
After the Secret is created, a Kubernetes {{< glossary_tooltip text="controller" term_id="controller" >}}
fills in some other fields such as the `kubernetes.io/service-account.uid` annotation, and the
`token` key in the `data` field, which is populated with an authentication token.
The following example configuration declares a service account token Secret:
The following example configuration declares a ServiceAccount token Secret:
```yaml
apiVersion: v1
@ -268,33 +266,27 @@ data:
After creating the Secret, wait for Kubernetes to populate the `token` key in the `data` field.
See the [ServiceAccount](/docs/tasks/configure-pod-container/configure-service-account/)
documentation for more information on how service accounts work.
See the [ServiceAccount](/docs/concepts/security/service-accounts/)
documentation for more information on how ServiceAccounts work.
You can also check the `automountServiceAccountToken` field and the
`serviceAccountName` field of the
[`Pod`](/docs/reference/generated/kubernetes-api/{{< param "version" >}}/#pod-v1-core)
for information on referencing service account credentials from within Pods.
for information on referencing ServiceAccount credentials from within Pods.
### Docker config Secrets
You can use one of the following `type` values to create a Secret to
store the credentials for accessing a container image registry:
If you are creating a Secret to store credentials for accessing a container image registry,
you must use one of the following `type` values for that Secret:
- `kubernetes.io/dockercfg`
- `kubernetes.io/dockerconfigjson`
The `kubernetes.io/dockercfg` type is reserved to store a serialized
`~/.dockercfg` which is the legacy format for configuring Docker command line.
When using this Secret type, you have to ensure the Secret `data` field
contains a `.dockercfg` key whose value is content of a `~/.dockercfg` file
encoded in the base64 format.
The `kubernetes.io/dockerconfigjson` type is designed for storing a serialized
JSON that follows the same format rules as the `~/.docker/config.json` file
which is a new format for `~/.dockercfg`.
When using this Secret type, the `data` field of the Secret object must
contain a `.dockerconfigjson` key, in which the content for the
`~/.docker/config.json` file is provided as a base64 encoded string.
- `kubernetes.io/dockercfg`: store a serialized `~/.dockercfg` which is the
legacy format for configuring Docker command line. The Secret
`data` field contains a `.dockercfg` key whose value is the content of a
base64 encoded `~/.dockercfg` file.
- `kubernetes.io/dockerconfigjson`: store a serialized JSON that follows the
same format rules as the `~/.docker/config.json` file, which is a new format
for `~/.dockercfg`. The Secret `data` field must contain a
`.dockerconfigjson` key for which the value is the content of a base64
encoded `~/.docker/config.json` file.
Below is an example for a `kubernetes.io/dockercfg` type of Secret:
@ -314,13 +306,13 @@ If you do not want to perform the base64 encoding, you can choose to use the
`stringData` field instead.
{{< /note >}}
When you create these types of Secrets using a manifest, the API
When you create Docker config Secrets using a manifest, the API
server checks whether the expected key exists in the `data` field, and
it verifies if the value provided can be parsed as a valid JSON. The API
server doesn't validate if the JSON actually is a Docker config file.
When you do not have a Docker config file, or you want to use `kubectl`
to create a Secret for accessing a container registry, you can do:
You can also use `kubectl` to create a Secret for accessing a container
registry, such as when you don't have a Docker configuration file:
```shell
kubectl create secret docker-registry secret-tiger-docker \
@ -330,15 +322,16 @@ kubectl create secret docker-registry secret-tiger-docker \
--docker-server=my-registry.example:5000
```
That command creates a Secret of type `kubernetes.io/dockerconfigjson`.
If you dump the `.data.dockerconfigjson` field from that new Secret and then
decode it from base64:
This command creates a Secret of type `kubernetes.io/dockerconfigjson`.
Retrieve the `.data.dockerconfigjson` field from that new Secret and decode the
data:
```shell
kubectl get secret secret-tiger-docker -o jsonpath='{.data.*}' | base64 -d
```
then the output is equivalent to this JSON document (which is also a valid
The output is equivalent to the following JSON document (which is also a valid
Docker configuration file):
```json
@ -354,10 +347,12 @@ Docker configuration file):
}
```
{{< note >}}
{{< caution >}}
The `auth` value there is base64 encoded; it is obscured but not secret.
Anyone who can read that Secret can learn the registry access bearer token.
{{< /note >}}
It is suggested to use [credential providers](/docs/tasks/administer-cluster/kubelet-credential-provider/) to dynamically and securely provide pull secrets on-demand.
{{< /caution >}}
### Basic authentication Secret
@ -368,9 +363,9 @@ Secret must contain one of the following two keys:
- `username`: the user name for authentication
- `password`: the password or token for authentication
Both values for the above two keys are base64 encoded strings. You can, of
course, provide the clear text content using the `stringData` for Secret
creation.
Both values for the above two keys are base64 encoded strings. You can
alternatively provide the clear text content using the `stringData` field in the
Secret manifest.
The following manifest is an example of a basic authentication Secret:
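A sketch of such a manifest (the name and the credentials are placeholders):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: secret-basic-auth   # placeholder name
type: kubernetes.io/basic-auth
stringData:
  username: admin           # placeholder user name
  password: t0p-Secret      # placeholder password
```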
@ -392,7 +387,7 @@ people to understand the purpose of your Secret, and sets a convention for what
to expect.
The Kubernetes API verifies that the required keys are set for a Secret of this type.
### SSH authentication secrets
### SSH authentication Secrets
The builtin type `kubernetes.io/ssh-auth` is provided for storing data used in
SSH authentication. When using this Secret type, you will have to specify a
@ -414,12 +409,12 @@ data:
MIIEpQIBAAKCAQEAulqb/Y ...
```
The SSH authentication Secret type is provided only for user's convenience.
You could instead create an `Opaque` type Secret for credentials used for SSH authentication.
The SSH authentication Secret type is provided only for convenience.
You can create an `Opaque` type Secret for credentials used for SSH authentication.
However, using the defined and public Secret type (`kubernetes.io/ssh-auth`) helps other
people to understand the purpose of your Secret, and sets a convention for what key names
to expect.
and the API server does verify if the required keys are provided in a Secret configuration.
The Kubernetes API verifies that the required keys are set for a Secret of this type.
{{< caution >}}
SSH private keys do not establish trusted communication between an SSH client and
@ -427,18 +422,22 @@ host server on their own. A secondary means of establishing trust is needed to
mitigate "man in the middle" attacks, such as a `known_hosts` file added to a ConfigMap.
{{< /caution >}}
### TLS secrets
### TLS Secrets
Kubernetes provides a builtin Secret type `kubernetes.io/tls` for storing
The `kubernetes.io/tls` Secret type is for storing
a certificate and its associated key that are typically used for TLS.
One common use for TLS secrets is to configure encryption in transit for
One common use for TLS Secrets is to configure encryption in transit for
an [Ingress](/docs/concepts/services-networking/ingress/), but you can also use it
with other resources or directly in your workload.
When using this type of Secret, the `tls.key` and the `tls.crt` keys must be provided
in the `data` (or `stringData`) field of the Secret configuration, although the API
server doesn't actually validate the values for each key.
As an alternative to using `stringData`, you can use the `data` field to provide
the base64 encoded certificate and private key. For details, see
[Constraints on Secret names and data](#restriction-names-data).
The following YAML contains an example config for a TLS Secret:
```yaml
@ -447,21 +446,23 @@ kind: Secret
metadata:
name: secret-tls
type: kubernetes.io/tls
data:
stringData:
# the data is abbreviated in this example
tls.crt: |
-----BEGIN CERTIFICATE-----
MIIC2DCCAcCgAwIBAgIBATANBgkqh ...
tls.key: |
-----BEGIN RSA PRIVATE KEY-----
MIIEpgIBAAKCAQEA7yn3bRHQ5FHMQ ...
```
The TLS Secret type is provided for user's convenience. You can create an `Opaque`
for credentials used for TLS server and/or client. However, using the builtin Secret
type helps ensure the consistency of Secret format in your project; the API server
does verify if the required keys are provided in a Secret configuration.
The TLS Secret type is provided only for convenience.
You can create an `Opaque` type Secret for credentials used for TLS authentication.
However, using the defined and public Secret type (`kubernetes.io/tls`)
helps ensure the consistency of Secret format in your project. The API server
verifies if the required keys are set for a Secret of this type.
When creating a TLS Secret using `kubectl`, you can use the `tls` subcommand
as shown in the following example:
To create a TLS Secret using `kubectl`, use the `tls` subcommand:
```shell
kubectl create secret tls my-tls-secret \
@ -469,26 +470,12 @@ kubectl create secret tls my-tls-secret \
--key=path/to/key/file
```
The public/private key pair must exist before hand. The public key certificate
for `--cert` must be DER format as per
[Section 5.1 of RFC 7468](https://datatracker.ietf.org/doc/html/rfc7468#section-5.1),
and must match the given private key for `--key` (PKCS #8 in DER format;
[Section 11 of RFC 7468](https://datatracker.ietf.org/doc/html/rfc7468#section-11)).
{{< note >}}
A kubernetes.io/tls Secret stores the Base64-encoded DER data for keys and
certificates. If you're familiar with PEM format for private keys and for certificates,
the base64 data are the same as that format except that you omit
the initial and the last lines that are used in PEM.
For example, for a certificate, you do **not** include `--------BEGIN CERTIFICATE-----`
and `-------END CERTIFICATE----`.
{{< /note >}}
The public/private key pair must exist beforehand. The public key certificate for `--cert` must be PEM encoded
and must match the given private key for `--key`.
### Bootstrap token Secrets
A bootstrap token Secret can be created by explicitly specifying the Secret
`type` to `bootstrap.kubernetes.io/token`. This type of Secret is designed for
The `bootstrap.kubernetes.io/token` Secret type is for
tokens used during the node bootstrap process. It stores tokens used to sign
well-known ConfigMaps.
@ -515,7 +502,7 @@ data:
usage-bootstrap-signing: dHJ1ZQ==
```
A bootstrap type Secret has the following keys specified under `data`:
A bootstrap token Secret has the following keys specified under `data`:
- `token-id`: A random 6 character string as the token identifier. Required.
- `token-secret`: A random 16 character string as the actual token secret. Required.
@ -528,8 +515,8 @@ A bootstrap type Secret has the following keys specified under `data`:
- `auth-extra-groups`: A comma-separated list of group names that will be
authenticated as in addition to the `system:bootstrappers` group.
The above YAML may look confusing because the values are all in base64 encoded
strings. In fact, you can create an identical Secret using the following YAML:
You can alternatively provide the values in the `stringData` field of the Secret
without base64 encoding them:
```yaml
apiVersion: v1

View File

@ -6,6 +6,9 @@ reviewers:
- erictune
- thockin
content_type: concept
card:
name: concepts
weight: 50
---
<!-- overview -->

View File

@ -157,7 +157,7 @@ several types of extensions.
<!-- image source for flowchart: https://docs.google.com/drawings/d/1sdviU6lDz4BpnzJNHfNpQrqI9F19QZ07KnhnxVrp2yg/edit -->
{{< figure src="/docs/concepts/extend-kubernetes/flowchart.png"
{{< figure src="/docs/concepts/extend-kubernetes/flowchart.svg"
alt="Flowchart with questions about use cases and guidance for implementers. Green circles indicate yes; red circles indicate no."
class="diagram-large" caption="Flowchart guide to select an extension approach" >}}

View File

@ -10,9 +10,8 @@ weight: 20
<!-- overview -->
{{< feature-state for_k8s_version="v1.26" state="stable" >}}
Kubernetes provides a [device plugin framework](https://git.k8s.io/design-proposals-archive/resource-management/device-plugin.md)
that you can use to advertise system hardware resources to the
{{< glossary_tooltip term_id="kubelet" >}}.
Kubernetes provides a device plugin framework that you can use to advertise system hardware
resources to the {{< glossary_tooltip term_id="kubelet" >}}.
Instead of customizing the code for Kubernetes itself, vendors can implement a
device plugin that you deploy either manually or as a {{< glossary_tooltip term_id="daemonset" >}}.
@ -151,7 +150,7 @@ The general workflow of a device plugin includes the following steps:
device plugin defines modifications that must be made to a container's definition to provide
access to the device. These modifications include:
* annotations
* [Annotations](/docs/concepts/overview/working-with-objects/annotations/)
* device nodes
* environment variables
* mounts
@ -159,7 +158,8 @@ The general workflow of a device plugin includes the following steps:
{{< note >}}
The processing of the fully-qualified CDI device names by the Device Manager requires
the `DevicePluginCDIDevices` feature gate to be enabled. This was added as an alpha feature in
that the `DevicePluginCDIDevices` [feature gate](/docs/reference/command-line-tools-reference/feature-gates/)
is enabled for the kubelet and the kube-apiserver. This was added as an alpha feature in Kubernetes
v1.28.
{{< /note >}}
@ -245,7 +245,7 @@ of running pods allocated in `ResourceClaims` by the `DynamicResourceAllocation`
this feature `kubelet` must be started with the following flags:
```
--feature-gates=DynamicResourceAllocation=true,KubeletPodResourcesDynamiceResources=true
--feature-gates=DynamicResourceAllocation=true,KubeletPodResourcesDynamicResources=true
```
```gRPC
@ -383,10 +383,6 @@ will continue working.
{{< /note >}}
Support for the `PodResourcesLister service` requires `KubeletPodResources`
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/) to be enabled.
It is enabled by default starting with Kubernetes 1.15 and is v1 since Kubernetes 1.20.
### `Get` gRPC endpoint {#grpc-endpoint-get}
{{< feature-state state="alpha" for_k8s_version="v1.27" >}}
@ -414,7 +410,7 @@ allocated by the dynamic resource allocation API. To enable this feature, you mu
ensure your kubelet services are started with the following flags:
```
--feature-gates=KubeletPodResourcesGet=true,DynamicResourceAllocation=true,KubeletPodResourcesDynamiceResources=true
--feature-gates=KubeletPodResourcesGet=true,DynamicResourceAllocation=true,KubeletPodResourcesDynamicResources=true
```
## Device plugin integration with the Topology Manager
@ -459,6 +455,7 @@ pluginapi.Device{ID: "25102017", Health: pluginapi.Healthy, Topology:&pluginapi.
Here are some examples of device plugin implementations:
* The [AMD GPU device plugin](https://github.com/RadeonOpenCompute/k8s-device-plugin)
* The [generic device plugin](https://github.com/squat/generic-device-plugin) for generic Linux devices and USB devices
* The [Intel device plugins](https://github.com/intel/intel-device-plugins-for-kubernetes) for
Intel GPU, FPGA, QAT, VPU, SGX, DSA, DLB and IAA devices
* The [KubeVirt device plugins](https://github.com/kubevirt/kubernetes-device-plugins) for

Binary file not shown.

File diff suppressed because one or more lines are too long


View File

@ -10,6 +10,9 @@ weight: 20
card:
name: concepts
weight: 10
anchors:
- anchor: "#why-you-need-kubernetes-and-what-can-it-do"
title: Why Kubernetes?
no_list: true
---

View File

@ -8,6 +8,7 @@ description: >
plane and a set of machines called nodes.
weight: 30
card:
title: Components of a cluster
name: concepts
weight: 20
---

View File

@ -33,7 +33,7 @@ will constantly work to ensure that object exists. By creating an object, you're
telling the Kubernetes system what you want your cluster's workload to look like; this is your
cluster's *desired state*.
To work with Kubernetes objects--whether to create, modify, or delete them--you'll need to use the
To work with Kubernetes objects—whether to create, modify, or delete them—you'll need to use the
[Kubernetes API](/docs/concepts/overview/kubernetes-api/). When you use the `kubectl` command-line
interface, for example, the CLI makes the necessary Kubernetes API calls for you. You can also use
the Kubernetes API directly in your own programs using one of the
@ -71,15 +71,18 @@ For more information on the object spec, status, and metadata, see the
When you create an object in Kubernetes, you must provide the object spec that describes its
desired state, as well as some basic information about the object (such as a name). When you use
the Kubernetes API to create the object (either directly or via `kubectl`), that API request must
include that information as JSON in the request body. **Most often, you provide the information to
`kubectl` in a .yaml file.** `kubectl` converts the information to JSON when making the API
request.
include that information as JSON in the request body.
Most often, you provide the information to `kubectl` in a file known as a _manifest_.
By convention, manifests are YAML (you could also use JSON format).
Tools such as `kubectl` convert the information from a manifest into JSON or another supported
serialization format when making the API request over HTTP.
Here's an example `.yaml` file that shows the required fields and object spec for a Kubernetes Deployment:
Here's an example manifest that shows the required fields and object spec for a Kubernetes
Deployment:
{{% code file="application/deployment.yaml" %}}
{{% code_sample file="application/deployment.yaml" %}}
One way to create a Deployment using a `.yaml` file like the one above is to use the
One way to create a Deployment using a manifest file like the one above is to use the
[`kubectl apply`](/docs/reference/generated/kubectl/kubectl-commands#apply) command
in the `kubectl` command-line interface, passing the `.yaml` file as an argument. Here's an example:
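Assuming the manifest shown above is saved locally as `deployment.yaml`, that could be:

```shell
kubectl apply -f deployment.yaml
```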
@ -95,7 +98,8 @@ deployment.apps/nginx-deployment created
### Required fields
In the `.yaml` file for the Kubernetes object you want to create, you'll need to set values for the following fields:
In the manifest (YAML or JSON file) for the Kubernetes object you want to create, you'll need to set values for
the following fields:
* `apiVersion` - Which version of the Kubernetes API you're using to create this object
* `kind` - What kind of object you want to create
@ -159,6 +163,10 @@ If you're new to Kubernetes, read more about the following:
* [Controllers](/docs/concepts/architecture/controller/) in Kubernetes.
* [kubectl](/docs/reference/kubectl/) and [kubectl commands](/docs/reference/generated/kubectl/kubectl-commands).
[Kubernetes Object Management](/docs/concepts/overview/working-with-objects/object-management/)
explains how to use `kubectl` to manage objects.
You might need to [install kubectl](/docs/tasks/tools/#kubectl) if you don't already have it available.
To learn about the Kubernetes API in general, visit:
* [Kubernetes API overview](/docs/reference/using-api/)

View File

@ -17,7 +17,8 @@ objects. Labels can be used to select objects and to find
collections of objects that satisfy certain conditions. In contrast, annotations
are not used to identify and select objects. The metadata
in an annotation can be small or large, structured or unstructured, and can
include characters not permitted by labels.
include characters not permitted by labels. It is possible to use labels as
well as annotations in the metadata of the same object.
Annotations, like labels, are key/value maps:
@ -93,4 +94,5 @@ spec:
## {{% heading "whatsnext" %}}
Learn more about [Labels and Selectors](/docs/concepts/overview/working-with-objects/labels/).
- Learn more about [Labels and Selectors](/docs/concepts/overview/working-with-objects/labels/).
- Find [Well-known labels, Annotations and Taints](/docs/reference/labels-annotations-taints/)

View File

@ -39,6 +39,10 @@ You can use the `=`, `==`, and `!=` operators with field selectors (`=` and `==`
```shell
kubectl get services --all-namespaces --field-selector metadata.namespace!=default
```
{{< note >}}
[Set-based operators](/docs/concepts/overview/working-with-objects/labels/#set-based-requirement)
(`in`, `notin`, `exists`) are not supported for field selectors.
{{< /note >}}
## Chained selectors

View File

@ -26,7 +26,7 @@ and does the following:
* Modifies the object to add a `metadata.deletionTimestamp` field with the
time you started the deletion.
* Prevents the object from being removed until its `metadata.finalizers` field is empty.
* Prevents the object from being removed until all items are removed from its `metadata.finalizers` field
* Returns a `202` status code (HTTP "Accepted")
The controller managing that finalizer notices the update to the object setting the
@ -45,6 +45,16 @@ controller can't delete it because the finalizer exists. When the Pod stops
using the `PersistentVolume`, Kubernetes clears the `pv-protection` finalizer,
and the controller deletes the volume.
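As a rough sketch of what a finalizer looks like in a manifest (the finalizer name here is
hypothetical; real controllers, such as the one that sets `kubernetes.io/pv-protection`, manage
their own finalizers):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: mymap
  finalizers:
    # hypothetical finalizer; a controller that recognizes it must remove it
    # before the deletion can complete
    - example.com/block-deletion
data:
  key: value
```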
{{<note>}}
* When you `DELETE` an object, Kubernetes adds the deletion timestamp for that object and then
immediately starts to restrict changes to the `.metadata.finalizers` field for the object that is
now pending deletion. You can remove existing finalizers (deleting an entry from the `finalizers`
list) but you cannot add a new finalizer. You also cannot modify the `deletionTimestamp` for an
object once it is set.
* After the deletion is requested, you cannot resurrect this object; the only option is to let the deletion complete and then create a new, similar object.
{{</note>}}
## Owner references, labels, and finalizers {#owners-labels-finalizers}
Like {{<glossary_tooltip text="labels" term_id="label">}},

Some files were not shown because too many files have changed in this diff