Merge pull request #48249 from mengjiao-liu/resync-virtual-ips-zh

[zh-cn] Resync content/zh-cn/docs/reference/networking/virtual-ips.md
pull/48285/head
Kubernetes Prow Robot 2024-10-11 01:44:19 +01:00 committed by GitHub
commit ce7077cf25
1 changed file with 558 additions and 116 deletions


@ -31,6 +31,28 @@ of `type` other than
[`ExternalName`](/zh-cn/docs/concepts/services-networking/service/#externalname)
以外的{{< glossary_tooltip term_id="service" text="服务">}},实现**虚拟 IP** 机制。
<!--
Each instance of kube-proxy watches the Kubernetes {{< glossary_tooltip
term_id="control-plane" text="control plane" >}} for the addition and
removal of Service and EndpointSlice {{< glossary_tooltip
term_id="object" text="objects" >}}. For each Service, kube-proxy
calls appropriate APIs (depending on the kube-proxy mode) to configure
the node to capture traffic to the Service's `clusterIP` and `port`,
and redirect that traffic to one of the Service's endpoints
(usually a Pod, but possibly an arbitrary user-provided IP address). A control
loop ensures that the rules on each node are reliably synchronized with
the Service and EndpointSlice state as indicated by the API server.
{{< figure src="/images/docs/services-iptables-overview.svg" title="Virtual IP mechanism for Services, using iptables mode" class="diagram-medium" >}}
-->
kube-proxy 的每个实例都会监视 Kubernetes {{< glossary_tooltip text="控制平面" term_id="control-plane" >}} 中
Service 和 EndpointSlice {{< glossary_tooltip text="对象" term_id="object" >}} 的添加和删除。对于每个
Servicekube-proxy 调用适当的 API取决于 kube-proxy 模式)来配置节点,以捕获流向 Service 的 `clusterIP` 和 `port`
的流量,并将这些流量重定向到 Service 的某个端点(通常是 Pod但也可能是用户提供的任意 IP 地址)。一个控制回路确保每个节点上的规则与
API 服务器指示的 Service 和 EndpointSlice 状态可靠同步。
{{< figure src="/zh-cn/docs/images/services-iptables-overview.svg" title="iptables 模式下 Service 的虚拟 IP 机制" class="diagram-medium" >}}
<!--
A question that pops up every now and then is why Kubernetes relies on
proxying to forward inbound traffic to backends. What about other
@ -107,21 +129,27 @@ The kube-proxy starts up in different modes, which are determined by its configu
On Linux nodes, the available modes for kube-proxy are:
[`iptables`](#proxy-mode-iptables)
: A mode where the kube-proxy configures packet forwarding rules using iptables, on Linux.
: A mode where the kube-proxy configures packet forwarding rules using iptables.
[`ipvs`](#proxy-mode-ipvs)
: a mode where the kube-proxy configures packet forwarding rules using ipvs.
[`nftables`](#proxy-mode-nftables)
: a mode where the kube-proxy configures packet forwarding rules using nftables.
-->
kube-proxy 会根据不同配置以不同的模式启动。
在 Linux 节点上kube-proxy 的可用模式是:
[`iptables`](#proxy-mode-iptables)
: kube-proxy 在 Linux 上使用 iptables 配置数据包转发规则的一种模式。
: kube-proxy 使用 iptables 配置数据包转发规则的一种模式。
[`ipvs`](#proxy-mode-ipvs)
: kube-proxy 使用 ipvs 配置数据包转发规则的一种模式。
[`nftables`](#proxy-mode-nftables)
: kube-proxy 使用 nftables 配置数据包转发规则的一种模式。
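<!--
As a minimal sketch, assuming you manage the kube-proxy configuration file yourself,
the proxy mode is selected via the `mode` field of the KubeProxyConfiguration
(an empty value falls back to the platform default, which is iptables on Linux):
-->
下面是一个极简示意(假设你自行管理 kube-proxy 的配置文件):代理模式通过
KubeProxyConfiguration 的 `mode` 字段选择(留空时回退到平台默认值,在 Linux 上为 iptables

```yaml
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
# Linux 上可选的取值:"iptables"、"ipvs"、"nftables"
mode: "ipvs"
```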
<!--
There is only one mode available for kube-proxy on Windows:
@ -137,64 +165,19 @@ Windows 上的 kube-proxy 只有一种模式可用:
### `iptables` proxy mode {#proxy-mode-iptables}
_This proxy mode is only available on Linux nodes._
In this mode, kube-proxy configures packet forwarding rules using the
iptables API of the kernel netfilter subsystem. For each endpoint, it
installs iptables rules which, by default, select a backend Pod at
random.
-->
### `iptables` 代理模式 {#proxy-mode-iptables}
**此代理模式仅适用于 Linux 节点。**
<!--
In this mode, kube-proxy watches the Kubernetes
{{< glossary_tooltip term_id="control-plane" text="control plane" >}} for the addition and
removal of Service and EndpointSlice {{< glossary_tooltip term_id="object" text="objects." >}}
For each Service, it installs
iptables rules, which capture traffic to the Service's `clusterIP` and `port`,
and redirect that traffic to one of the Service's
backend sets. For each endpoint, it installs iptables rules which
select a backend Pod.
-->
在这种模式下kube-proxy 监视 Kubernetes
{{< glossary_tooltip text="控制平面" term_id="control-plane" >}},获知对 Service 和 EndpointSlice
{{< glossary_tooltip text="对象" term_id="object" >}}的添加和删除操作。
对于每个 Servicekube-proxy 会添加 iptables 规则,这些规则捕获流向 Service 的 `clusterIP` 和 `port` 的流量,
并将这些流量重定向到 Service 后端集合中的其中之一。
对于每个端点,它会添加指向一个特定后端 Pod 的 iptables 规则。
<!--
By default, kube-proxy in iptables mode chooses a backend at random.
Using iptables to handle traffic has a lower system overhead, because traffic
is handled by Linux netfilter without the need to switch between userspace and the
kernel space. This approach is also likely to be more reliable.
-->
默认情况下iptables 模式下的 kube-proxy 会随机选择一个后端。
使用 iptables 处理流量的系统开销较低,因为流量由 Linux netfilter 处理,
无需在用户空间和内核空间之间切换。这种方案也更为可靠。
<!--
If kube-proxy is running in iptables mode and the first Pod that's selected
does not respond, the connection fails. This is different from the old `userspace`
mode: in that scenario, kube-proxy would detect that the connection to the first
Pod had failed and would automatically retry with a different backend Pod.
-->
如果 kube-proxy 以 iptables 模式运行,并且它选择的第一个 Pod 没有响应,
那么连接会失败。这与用户空间模式不同:
在后者这种情况下kube-proxy 会检测到与第一个 Pod 的连接失败,
并会自动用不同的后端 Pod 重试。
<!--
You can use Pod [readiness probes](/docs/concepts/workloads/pods/pod-lifecycle/#container-probes)
to verify that backend Pods are working OK, so that kube-proxy in iptables mode
only sees backends that test out as healthy. Doing this means you avoid
having traffic sent via kube-proxy to a Pod that's known to have failed.
-->
你可以使用 Pod [就绪探针](/zh-cn/docs/concepts/workloads/pods/pod-lifecycle/#container-probes)来验证后端 Pod 是否健康。
这样可以避免 kube-proxy 将流量发送到已知失败的 Pod 上。
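<!--
A minimal sketch of a backend Pod with a readiness probe; the Pod name, image, path and
port here are assumptions. While the probe is failing, the endpoint is marked not-ready
and kube-proxy stops selecting it:
-->
下面是一个带就绪探针的后端 Pod 的极简示意,其中的 Pod 名称、镜像、路径和端口均为假设值。
当探针失败时,对应端点会被标记为未就绪kube-proxy 不再选择它:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: backend                         # 假设的名称
spec:
  containers:
  - name: app
    image: registry.example/app:1.0     # 假设的镜像
    ports:
    - containerPort: 8080
    readinessProbe:                     # 探针失败期间,该端点不会被 kube-proxy 选中
      httpGet:
        path: /healthz                  # 假设应用在此路径上提供健康检查
        port: 8080
      periodSeconds: 5
```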
<!--
{{< figure src="/images/docs/services-iptables-overview.svg" title="Virtual IP mechanism for Services, using iptables mode" class="diagram-medium" >}}
-->
{{< figure src="/zh-cn/docs/images/services-iptables-overview.svg" title="iptables 模式下 Service 的虚拟 IP 机制" class="diagram-medium" >}}
在这种模式下kube-proxy 使用内核 netfilter 子系统的 iptables API
配置数据包转发规则。对于每个端点kube-proxy 会添加 iptables
规则,这些规则默认情况下会随机选择一个后端 Pod。
<!--
#### Example {#packet-processing-iptables}
@ -244,8 +227,10 @@ through a load-balancer, though in those cases the client IP address does get al
<!--
#### Optimizing iptables mode performance
In large clusters (with tens of thousands of Pods and Services), the
iptables mode of kube-proxy may take a long time to update the rules
In iptables mode, kube-proxy creates a few iptables rules for every
Service, and a few iptables rules for each endpoint IP address. In
clusters with tens of thousands of Pods and Services, this means tens
of thousands of iptables rules, and kube-proxy may take a long time to update the rules
in the kernel when Services (or their EndpointSlices) change. You can adjust the syncing
behavior of kube-proxy via options in the [`iptables` section](/docs/reference/config-api/kube-proxy-config.v1alpha1/#kubeproxy-config-k8s-io-v1alpha1-KubeProxyIPTablesConfiguration)
of the
@ -253,13 +238,12 @@ kube-proxy [configuration file](/docs/reference/config-api/kube-proxy-config.v1a
(which you specify via `kube-proxy --config <path>`):
-->
#### 优化 iptables 模式性能 {#optimizing-iptables-mode-performance}
在大型集群(有数万个 Pod 和 Service当 Service或其 EndpointSlice发生变化时
iptables 模式的 kube-proxy 在更新内核中的规则时可能要用较长时间。
你可以通过(`kube-proxy --config <path>` 指定的kube-proxy
[配置文件](/zh-cn/docs/reference/config-api/kube-proxy-config.v1alpha1/)的
[`iptables` 节](/zh-cn/docs/reference/config-api/kube-proxy-config.v1alpha1/#kubeproxy-config-k8s-io-v1alpha1-KubeProxyIPTablesConfiguration)中的选项来调整
kube-proxy 的同步行为:
在 iptables 模式下kube-proxy 为每个 Service 创建一些 iptables 规则,并为每个端点
IP 地址创建一些 iptables 规则。在拥有数万个 Pod 和 Service 的集群中,这意味着数万个
iptables 规则,当 Service或其 EndpointSlice发生变化时kube-proxy
在更新内核中的规则时可能要用很长时间。你可以通过(`kube-proxy --config <path>` 指定的)
kube-proxy [配置文件](/zh-cn/docs/reference/config-api/kube-proxy-config.v1alpha1/)的
[`iptables` 章节](/zh-cn/docs/reference/config-api/kube-proxy-config.v1alpha1/#kubeproxy-config-k8s-io-v1alpha1-KubeProxyIPTablesConfiguration)中的选项来调整 kube-proxy 的同步行为:
```yaml
...
@ -389,17 +373,11 @@ _This proxy mode is only available on Linux nodes._
**此代理模式仅适用于 Linux 节点。**
<!--
In `ipvs` mode, kube-proxy watches Kubernetes Services and EndpointSlices,
calls `netlink` interface to create IPVS rules accordingly and synchronizes
IPVS rules with Kubernetes Services and EndpointSlices periodically.
This control loop ensures that IPVS status matches the desired state.
When accessing a Service, IPVS directs traffic to one of the backend Pods.
In `ipvs` mode, kube-proxy uses the kernel IPVS and iptables APIs to
create rules to redirect traffic from Service IPs to endpoint IPs.
-->
`ipvs` 模式下kube-proxy 监视 Kubernetes Service 和 EndpointSlice
然后调用 `netlink` 接口创建 IPVS 规则,
并定期与 Kubernetes Service 和 EndpointSlice 同步 IPVS 规则。
该控制回路确保 IPVS 状态与期望的状态保持一致。
访问 Service 时IPVS 会将流量导向到某一个后端 Pod。
`ipvs` 模式下kube-proxy 使用内核 IPVS 和 iptables API
创建规则,将流量从 Service IP 重定向到端点 IP。
<!--
The IPVS proxy mode is based on netfilter hook function that is similar to
@ -407,14 +385,14 @@ iptables mode, but uses a hash table as the underlying data structure and works
in the kernel space.
That means kube-proxy in IPVS mode redirects traffic with lower latency than
kube-proxy in iptables mode, with much better performance when synchronizing
proxy rules. Compared to the other proxy modes, IPVS mode also supports a
proxy rules. Compared to the iptables proxy mode, IPVS mode also supports a
higher throughput of network traffic.
-->
IPVS 代理模式基于 netfilter 回调函数,类似于 iptables 模式,
但它使用哈希表作为底层数据结构,在内核空间中生效。
这意味着 IPVS 模式下的 kube-proxy 比 iptables 模式下的 kube-proxy
重定向流量的延迟更低,同步代理规则时性能也更好。
其他代理模式相比IPVS 模式还支持更高的网络流量吞吐量。
iptables 代理模式相比IPVS 模式还支持更高的网络流量吞吐量。
<!--
IPVS provides more options for balancing traffic to backend Pods;
@ -497,12 +475,12 @@ the node before starting kube-proxy.
When kube-proxy starts in IPVS proxy mode, it verifies whether IPVS
kernel modules are available. If the IPVS kernel modules are not detected, then kube-proxy
falls back to running in iptables proxy mode.
exits with an error.
-->
要在 IPVS 模式下运行 kube-proxy必须在启动 kube-proxy 之前确保节点上的 IPVS 可用。
当 kube-proxy 以 IPVS 代理模式启动时,它会验证 IPVS 内核模块是否可用。
如果未检测到 IPVS 内核模块,则 kube-proxy 会退回到 iptables 代理模式运行
如果未检测到 IPVS 内核模块,则 kube-proxy 会退出并报错
{{< /note >}}
<!--
@ -510,6 +488,136 @@ falls back to running in iptables proxy mode.
-->
{{< figure src="/zh-cn/docs/images/services-ipvs-overview.svg" title="IPVS 模式下 Service 的虚拟 IP 地址机制" class="diagram-medium" >}}
<!--
### `nftables` proxy mode {#proxy-mode-nftables}
{{< feature-state feature_gate_name="NFTablesProxyMode" >}}
_This proxy mode is only available on Linux nodes, and requires kernel
5.13 or later._
-->
### `nftables` 代理模式 {#proxy-mode-nftables}
{{< feature-state feature_gate_name="NFTablesProxyMode" >}}
**此代理模式仅适用于 Linux 节点,并且需要 5.13 或更高的内核版本。**
<!--
In this mode, kube-proxy configures packet forwarding rules using the
nftables API of the kernel netfilter subsystem. For each endpoint, it
installs nftables rules which, by default, select a backend Pod at
random.
-->
在这种模式下kube-proxy 使用内核 netfilter 子系统的 nftables API
配置数据包转发规则。对于每个端点,它会添加 nftables
规则,这些规则默认情况下会随机选择一个后端 Pod。
<!--
The nftables API is the successor to the iptables API and is designed
to provide better performance and scalability than iptables. The
`nftables` proxy mode is able to process changes to service endpoints
faster and more efficiently than the `iptables` mode, and is also able
to more efficiently process packets in the kernel (though this only
becomes noticeable in clusters with tens of thousands of services).
-->
nftables API 是 iptables API 的后继,旨在提供比 iptables 更好的性能和可扩展性。
`nftables` 代理模式能够比 `iptables` 模式更快、更高效地处理服务端点的变化,
并且在内核中处理数据包的效率也更高(尽管这只有在拥有数万个服务的集群中才会比较明显)。
<!--
As of Kubernetes {{< skew currentVersion >}}, the `nftables` mode is
still relatively new, and may not be compatible with all network
plugins; consult the documentation for your network plugin.
-->
在 Kubernetes {{< skew currentVersion >}} 中,`nftables`
模式仍然相对较新,可能还不兼容所有的网络插件;请查阅你的网络插件文档。
<!--
#### Migrating from `iptables` mode to `nftables`
Users who want to switch from the default `iptables` mode to the
`nftables` mode should be aware that some features work slightly
differently in the `nftables` mode:
-->
#### 从 `iptables` 模式到 `nftables` 模式的迁移 {#migrating-from-iptables-mode-to-nftables}
想要从默认的 `iptables` 模式切换到 `nftables` 模式的用户应注意,在
`nftables` 模式下,一些特性的工作方式略有不同:
<!--
- **NodePort interfaces**: In `iptables` mode, by default,
[NodePort services](/docs/concepts/services-networking/service/#type-nodeport)
are reachable on all local IP addresses. This is usually not what
users want, so the `nftables` mode defaults to
`--nodeport-addresses primary`, meaning NodePort services are only
reachable on the node's primary IPv4 and/or IPv6 addresses. You can
override this by specifying an explicit value for that option:
e.g., `--nodeport-addresses 0.0.0.0/0` to listen on all (local)
IPv4 IPs.
-->
- **NodePort 接口**:在 `iptables` 模式下,默认情况下,
[NodePort 服务](/zh-cn/docs/concepts/services-networking/service/#type-nodeport) 可以在所有本地
IP 地址上访问。这通常不是用户想要的,因此 `nftables` 模式默认使用 `--nodeport-addresses primary`,这意味着
NodePort 服务只能在节点的主 IPv4 和/或 IPv6 地址上访问。你可以通过为该选项指定一个明确的值来覆盖此设置:例如,使用
`--nodeport-addresses 0.0.0.0/0` 以监听所有(本地IPv4 IP参见本小节末尾的配置示例。
<!--
- **NodePort services on `127.0.0.1`**: In `iptables` mode, if the
`--nodeport-addresses` range includes `127.0.0.1` (and the option
`--iptables-localhost-nodeports false` option is not passed), then
NodePort services are reachable even on "localhost" (`127.0.0.1`).
In `nftables` mode (and `ipvs` mode), this will not work. If you
are not sure if you are depending on this functionality, you can
check kube-proxy's
`iptables_localhost_nodeports_accepted_packets_total` metric; if it
is non-0, that means that some client has connected to a NodePort
service via `127.0.0.1`.
-->
- **`127.0.0.1` 上的 NodePort 服务**:在 `iptables` 模式下,如果
`--nodeport-addresses` 范围包括 `127.0.0.1`(且未传递 `--iptables-localhost-nodeports false` 选项),
则 NodePort 服务甚至可以在 "localhost" (`127.0.0.1`) 上访问。
`nftables` 模式(和 `ipvs` 模式)下,这将不起作用。如果你不确定是否依赖此功能,
可以检查 kube-proxy 的 `iptables_localhost_nodeports_accepted_packets_total` 指标;
如果该值非 0则表示某些客户端已通过 `127.0.0.1` 连接到 NodePort 服务。
<!--
- **NodePort interaction with firewalls**: The `iptables` mode of
kube-proxy tries to be compatible with overly-aggressive firewalls;
for each NodePort service, it will add rules to accept inbound
traffic on that port, in case that traffic would otherwise be
blocked by a firewall. This approach will not work with firewalls
based on nftables, so kube-proxy's `nftables` mode does not do
anything here; if you have a local firewall, you must ensure that
it is properly configured to allow Kubernetes traffic through
(e.g., by allowing inbound traffic on the entire NodePort range).
-->
- **NodePort 与防火墙的交互**kube-proxy 的 `iptables` 模式尝试与过于激进的防火墙兼容;
对于每个 NodePort 服务,它会添加规则以接受该端口的入站流量,以防该流量被防火墙阻止。
这种方法不适用于基于 nftables 的防火墙,因此 kube-proxy 的 `nftables` 模式在这里不会做任何事情;
如果你有本地防火墙,必须确保其配置正确以允许 Kubernetes 流量通过(例如,允许整个 NodePort 范围的入站流量)。
<!--
- **Conntrack bug workarounds**: Linux kernels prior to 6.1 have a
bug that can result in long-lived TCP connections to service IPs
being closed with the error "Connection reset by peer". The
`iptables` mode of kube-proxy installs a workaround for this bug,
but this workaround was later found to cause other problems in some
clusters. The `nftables` mode does not install any workaround by
default, but you can check kube-proxy's
`iptables_ct_state_invalid_dropped_packets_total` metric to see if
your cluster is depending on the workaround, and if so, you can run
kube-proxy with the option `--conntrack-tcp-be-liberal` to work
around the problem in `nftables` mode.
-->
- **Conntrack 错误规避**6.1 之前的 Linux 内核存在一个错误,可能导致与服务 IP 的长时间
TCP 连接被关闭并出现 “Connection reset by peer对方重置连接”的错误。kube-proxy 的 `iptables`
模式为此错误安装了一个规避措施,但后来发现该规避措施在某些集群中会导致其他问题。
`nftables` 模式默认不安装任何规避措施,但你可以检查 kube-proxy 的
`iptables_ct_state_invalid_dropped_packets_total`
指标,看看你的集群是否依赖于该规避措施;如果是,你可以使用 `--conntrack-tcp-be-liberal`
选项运行 kube-proxy以在 `nftables` 模式下解决该问题。
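<!--
A sketch of the configuration-file fields that (as an assumption) correspond to the
command-line flags mentioned above, for a cluster switching to the `nftables` mode:
-->
下面的示意给出与上述命令行标志相对应的配置文件字段(此对应关系是一种假设,请以
kube-proxy 配置 API 文档为准),适用于切换到 `nftables` 模式的集群:

```yaml
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "nftables"
# 对应 --nodeport-addresses覆盖 nftables 模式默认的 "primary"
# 使 NodePort 在所有本地 IPv4 地址上可达
nodePortAddresses: ["0.0.0.0/0"]
conntrack:
  # 对应 --conntrack-tcp-be-liberal仅在前述指标表明集群依赖该规避措施时才需要
  tcpBeLiberal: true
```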
<!--
### `kernelspace` proxy mode {#proxy-mode-kernelspace}
@ -665,17 +773,17 @@ Kubernetes 通过从为 {{< glossary_tooltip text="API 服务器" term_id="kube-
`service-cluster-ip-range` CIDR 范围内为每个 Service 分配自己的 IP 地址来实现这一点。
<!--
#### IP address allocation tracking
### IP address allocation tracking
To ensure each Service receives a unique IP, an internal allocator atomically
To ensure each Service receives a unique IP address, an internal allocator atomically
updates a global allocation map in {{< glossary_tooltip term_id="etcd" >}}
prior to creating each Service. The map object must exist in the registry for
Services to get IP address assignments, otherwise creations will
fail with a message indicating an IP address could not be allocated.
-->
#### IP 地址分配追踪
### IP 地址分配追踪 {#ip-address-allocation-tracking}
为了确保每个 Service 都获得唯一的 IP内部分配器在创建每个 Service
为了确保每个 Service 都获得唯一的 IP 地址,内部分配器在创建每个 Service
之前更新 {{< glossary_tooltip term_id="etcd" >}} 中的全局分配映射,这种更新操作具有原子性。
映射对象必须存在于数据库中,这样 Service 才能获得 IP 地址分配,
否则创建将失败,并显示无法分配 IP 地址。
@ -684,50 +792,67 @@ fail with a message indicating an IP address could not be allocated.
In the control plane, a background controller is responsible for creating that
map (needed to support migrating from older versions of Kubernetes that used
in-memory locking). Kubernetes also uses controllers to check for invalid
assignments (e.g. due to administrator intervention) and for cleaning up allocated
assignments (for example: due to administrator intervention) and for cleaning up allocated
IP addresses that are no longer used by any Services.
-->
在控制平面中,后台控制器负责创建该映射(从使用内存锁定的旧版本的 Kubernetes 迁移时需要这一映射)。
Kubernetes 还使用控制器来检查无效的分配(例如,因管理员干预而导致无效分配)
以及清理已分配但没有 Service 使用的 IP 地址。
{{< feature-state for_k8s_version="v1.27" state="alpha" >}}
<!--
#### IP address allocation tracking using the Kubernetes API {#ip-address-objects}
-->
#### 使用 Kubernetes API 跟踪 IP 地址分配 {#ip-address-objects}
{{< feature-state feature_gate_name="MultiCIDRServiceAllocator" >}}
<!--
If you enable the `MultiCIDRServiceAllocator`
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/) and the
[`networking.k8s.io/v1alpha1` API group](/docs/tasks/administer-cluster/enable-disable-api/),
the control plane replaces the existing etcd allocator with a new one, using IPAddress
objects instead of an internal global allocation map. The ClusterIP address
associated to each Service will have a referenced IPAddress object.
the control plane replaces the existing etcd allocator with a revised implementation
that uses IPAddress and ServiceCIDR objects instead of an internal global allocation map.
Each cluster IP address associated to a Service then references an IPAddress object.
-->
如果你启用 `MultiCIDRServiceAllocator` [特性门控](/zh-cn/docs/reference/command-line-tools-reference/feature-gates/)和
[`networking.k8s.io/v1alpha1` API 组](/zh-cn/docs/tasks/administer-cluster/enable-disable-api/)
控制平面将用一个新的分配器替换现有的 etcd 分配器,使用 IPAddress 对象而不是内部的全局分配映射。
与每个 Service 关联的 ClusterIP 地址将有一个对应的 IPAddress 对象。
控制平面用一个改进后的实现替换现有的 etcd 分配器,使用 IPAddress 和 ServiceCIDR
对象而不是内部的全局分配映射。与某 Service 关联的每个 ClusterIP 地址将有一个对应的
IPAddress 对象。
<!--
The background controller is also replaced by a new one to handle the new IPAddress
objects and the migration from the old allocator model.
Enabling the feature gate also replaces a background controller with an alternative
that handles the IPAddress objects and supports migration from the old allocator model.
Kubernetes {{< skew currentVersion >}} does not support migrating from IPAddress
objects to the internal allocation map.
-->
后台控制器也被一个新的控制器取代,来处理新的 IPAddress 对象和从旧的分配器模型的迁移。
启用该特性门控还会将后台控制器替换为一个替代实现,用来处理 IPAddress 对象并支持从旧的分配器模型迁移。
Kubernetes {{< skew currentVersion >}} 不支持从 IPAddress 对象迁移到内部分配映射。
<!--
One of the main benefits of the new allocator is that it removes the size limitations
for the `service-cluster-ip-range`, there is no limitations for IPv4 and for IPv6
users can use masks equal or larger than /64 (previously it was /108).
One of the main benefits of the revised allocator is that it removes the size limitations
for the IP address range that can be used for the cluster IP address of Services.
With `MultiCIDRServiceAllocator` enabled, there are no limitations for IPv4, and for IPv6
you can use IP address netmasks that are a /64 or smaller (as opposed to /108 with the
legacy implementation).
-->
新分配器的主要好处之一是它取消了对 `service-cluster-ip-range` 的大小限制,对 IPv4 没有大小限制,
对于 IPv6 用户可以使用等于或大于 /64 的掩码(以前是 /108
改进后的分配器的主要优点之一是它取消了对可用于 Service 的集群 IP 地址的范围大小限制。
启用 `MultiCIDRServiceAllocator` 后,对 IPv4 没有大小限制,而对于
IPv6你可以使用等于或小于 /64 的 IP 地址子网掩码(与旧实现中的 /108 相比)。
<!--
Users now will be able to inspect the IP addresses assigned to their Services, and
Kubernetes extensions such as the [Gateway](https://gateway-api.sigs.k8s.io/) API, can use this new
IPAddress object kind to enhance the Kubernetes networking capabilities, going beyond the limitations of
the built-in Service API.
Making IP address allocations available via the API means that you as a cluster administrator
can allow users to inspect the IP addresses assigned to their Services.
Kubernetes extensions, such as the [Gateway API](/docs/concepts/services-networking/gateway/),
can use the IPAddress API to extend Kubernetes' inherent networking capabilities.
-->
用户现在能够检查分配给他们的 Service 的 IP 地址Kubernetes 扩展,
如 [Gateway](https://gateway-api.sigs.k8s.io/) API
可以使用这个新的 IPAddress 对象类别来增强 Kubernetes 的网络能力,解除内置 Service API 的限制。
通过 API 提供 IP 地址分配,意味着作为集群管理员,你可以允许用户检查分配给他们的 Service 的 IP 地址。
Kubernetes 扩展(例如 [Gateway API](/zh-cn/docs/concepts/services-networking/gateway/)
可以使用 IPAddress API 来扩展 Kubernetes 的固有网络功能。
<!--
Here is a brief example of a user querying for IP addresses:
-->
以下是用户查询 IP 地址的简短示例:
```shell
kubectl get services
@ -747,11 +872,56 @@ NAME PARENTREF
2001:db8:1:2::1 services/default/kubernetes
2001:db8:1:2::a services/kube-system/kube-dns
```
<!--
Kubernetes also allow users to dynamically define the available IP ranges for Services using
ServiceCIDR objects. During bootstrap, a default ServiceCIDR object named `kubernetes` is created
from the value of the `--service-cluster-ip-range` command line argument to kube-apiserver:
-->
Kubernetes 还允许用户使用 ServiceCIDR 对象动态定义 Service 的可用 IP 范围。在引导过程中,集群会根据
kube-apiserver 的 `--service-cluster-ip-range` 命令行参数的值创建一个名为
`kubernetes` 的默认 ServiceCIDR 对象:
```shell
kubectl get servicecidrs
```
```
NAME CIDRS AGE
kubernetes 10.96.0.0/28 17m
```
<!--
#### IP address ranges for Service virtual IP addresses {#service-ip-static-sub-range}
Users can create or delete new ServiceCIDR objects to manage the available IP ranges for Services:
-->
#### Service 虚拟 IP 地址的地址段 {#service-ip-static-sub-range}
用户可以创建或删除新的 ServiceCIDR 对象来管理 Service 的可用 IP 范围:
```shell
cat <<'EOF' | kubectl apply -f -
apiVersion: networking.k8s.io/v1beta1
kind: ServiceCIDR
metadata:
name: newservicecidr
spec:
cidrs:
- 10.96.0.0/24
EOF
```
```
servicecidr.networking.k8s.io/newservicecidr created
```
```shell
kubectl get servicecidrs
```
```
NAME CIDRS AGE
kubernetes 10.96.0.0/28 17m
newservicecidr 10.96.0.0/24 7m
```
<!--
### IP address ranges for Service virtual IP addresses {#service-ip-static-sub-range}
-->
### Service 虚拟 IP 地址的地址段 {#service-ip-static-sub-range}
{{< feature-state for_k8s_version="v1.26" state="stable" >}}
@ -775,16 +945,6 @@ Kubernetes 优先通过从高段中选择来为 Service 分配动态 IP 地址
这意味着如果要将特定 IP 地址分配给 `type: ClusterIP` Service
则应手动从**低**段中分配 IP 地址。该方法降低了分配导致冲突的风险。
<!--
If you disable the `ServiceIPStaticSubrange`
[feature gate](/docs/reference/command-line-tools-reference/feature-gates/) then Kubernetes
uses a single shared pool for both manually and dynamically assigned IP addresses,
that are used for `type: ClusterIP` Services.
-->
如果你禁用 `ServiceIPStaticSubrange`
[特性门控](/zh-cn/docs/reference/command-line-tools-reference/feature-gates/)
则 Kubernetes 用于手动分配和动态分配的 IP 共享单个地址池,这适用于 `type: ClusterIP` 的 Service。
<!--
## Traffic policies
-->
@ -837,6 +997,122 @@ relevant Service.
如果流量策略为 `Local` 并且没有本地节点端点,
那么 kube-proxy 不会转发与相关 Service 相关的任何流量。
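<!--
A minimal sketch of a Service that sets both traffic policy fields; the name, labels and
ports are hypothetical:
-->
下面是一个同时设置两个流量策略字段的 Service 极简示意,其中的名称、标签和端口均为假设值:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service                    # 假设的名称
spec:
  type: LoadBalancer
  selector:
    app.kubernetes.io/name: my-app    # 假设的标签
  ports:
  - port: 80
    targetPort: 8080
  externalTrafficPolicy: Local    # 外部流量仅发往具有本地端点的节点
  internalTrafficPolicy: Cluster  # 集群内部流量仍可路由到任意端点(默认值)
```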
<!--
If `Cluster` is specified all nodes are eligible load balancing targets _as long as_
the node is not being deleted and kube-proxy is healthy. In this mode: load balancer
health checks are configured to target the service proxy's readiness port and path.
In the case of kube-proxy this evaluates to: `${NODE_IP}:10256/healthz`. kube-proxy
will return either an HTTP code 200 or 503. kube-proxy's load balancer health check
endpoint returns 200 if:
-->
如果指定了 `Cluster`,则所有节点都可以作为负载均衡目标,**只要**节点没有被删除且
kube-proxy 是健康的。在这种模式下:负载均衡器健康检查被配置为针对服务代理的就绪端口和路径。对于
kube-proxy这个健康检查端点为 `${NODE_IP}:10256/healthz`。kube-proxy 将返回 HTTP
状态码 200 或 503。如果满足以下条件kube-proxy 的负载均衡器健康检查端点将返回 200
<!--
1. kube-proxy is healthy, meaning:
- it's able to progress programming the network and isn't timing out while doing
so (the timeout is defined to be: **2 × `iptables.syncPeriod`**); and
2. the node is not being deleted (there is no deletion timestamp set for the Node).
-->
1. kube-proxy 是健康的,意味着:
- 它能够继续进行网络编程,并且在此过程中不会超时(超时时间定义为:**2 × `iptables.syncPeriod`**);并且
2. 节点没有被删除Node 对象上没有设置删除时间戳)。
<!--
The reason why kube-proxy returns 503 and marks the node as not
eligible when it's being deleted, is because kube-proxy supports connection
draining for terminating nodes. A couple of important things occur from the point
of view of a Kubernetes-managed load balancer when a node _is being_ / _is_ deleted.
-->
kube-proxy 在节点被删除时返回 503 并将节点标记为不符合条件的原因在于
kube-proxy 对处于终止过程中的节点支持连接排空。从 Kubernetes 管理的负载均衡器的角度来看,
当节点**正在被删除**或**已被删除**时,会发生一些重要的事情。
<!--
While deleting:
* kube-proxy will start failing its readiness probe and essentially mark the
node as not eligible for load balancer traffic. The load balancer health
check failing causes load balancers which support connection draining to
allow existing connections to terminate, and block new connections from
establishing.
-->
当节点被删除时:
* kube-proxy 的就绪探针将开始失败,并将该节点标记为不再适合接收负载均衡器流量。
负载均衡器健康检查失败会导致支持连接排空的负载均衡器允许现有连接终止,并阻止新连接建立。
<!--
When deleted:
* The service controller in the Kubernetes cloud controller manager removes the
node from the referenced set of eligible targets. Removing any instance from
the load balancer's set of backend targets immediately terminates all
connections. This is also the reason kube-proxy first fails the health check
while the node is deleting.
-->
当节点被删除后:
* Kubernetes 云控制器管理器中的服务控制器会将节点从所引用的候选目标集中移除。
从负载均衡器的后端目标集中移除任何实例会立即终止所有连接。
这也是 kube-proxy 在节点删除过程中首先使健康检查失败的原因。
<!--
It's important to note for Kubernetes vendors that if any vendor configures the
kube-proxy readiness probe as a liveness probe: that kube-proxy will start
restarting continuously when a node is deleting until it has been fully deleted.
kube-proxy exposes a `/livez` path which, as opposed to the `/healthz` one, does
**not** consider the Node's deleting state and only its progress programming the
network. `/livez` is therefore the recommended path for anyone looking to define
a livenessProbe for kube-proxy.
-->
需要注意的是,对于 Kubernetes 供应商:如果有供应商将 kube-proxy 的就绪探针配置为存活探针,
那么在节点删除期间(直到节点被完全删除kube-proxy 会不断重启。kube-proxy 公开了一个
`/livez` 路径,与 `/healthz` 路径不同,`/livez` 路径**不**考虑节点的删除状态,
仅考虑其网络规则编程的进度。因此,对于任何希望为
kube-proxy 定义存活探针的人来说,推荐使用 `/livez` 路径。
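<!--
A sketch of how a self-managed kube-proxy DaemonSet might wire its probes, assuming the
default health check port 10256; this is an illustrative fragment, not a complete manifest:
-->
下面的示意展示自行管理的 kube-proxy DaemonSet 可以如何配置探针(假设使用默认健康检查端口
10256这只是一个说明性片段并非完整清单

```yaml
# kube-proxy DaemonSet 中容器定义的片段(仅为示意,并非完整清单)
containers:
- name: kube-proxy
  image: registry.k8s.io/kube-proxy:v1.31.0   # 示例镜像标签,请按集群版本替换
  livenessProbe:
    httpGet:
      path: /livez     # 不考虑节点删除状态,仅反映网络规则编程是否正常
      port: 10256      # kube-proxy 默认的健康检查端口
    periodSeconds: 10
  readinessProbe:
    httpGet:
      path: /healthz   # 节点正在被删除时会返回 503
      port: 10256
    periodSeconds: 10
```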
<!--
Users deploying kube-proxy can inspect both the readiness / liveness state by
evaluating the metrics: `proxy_livez_total` / `proxy_healthz_total`. Both
metrics publish two series, one with the 200 label and one with the 503 one.
-->
部署 kube-proxy 的用户可以通过评估指标 `proxy_livez_total` / `proxy_healthz_total`
来检查就绪/存活状态。这两个指标都发布了两个序列,一个带有 200 标签,另一个带有 503 标签。
<!--
For `Local` Services: kube-proxy will return 200 if
1. kube-proxy is healthy/ready, and
2. has a local endpoint on the node in question.
Node deletion does **not** have an impact on kube-proxy's return
code for what concerns load balancer health checks. The reason for this is:
deleting nodes could end up causing an ingress outage should all endpoints
simultaneously be running on said nodes.
-->
对于 `Local` Service如果满足以下条件kube-proxy 将返回 200
1. kube-proxy 是健康/就绪的,并且
2. 在相关节点上有一个本地端点。
对于负载均衡器健康检查而言,节点删除**不会**对 kube-proxy
的返回代码产生影响。原因是:如果所有端点同时在上述节点上运行,则删除节点最终可能会导致入站流量中断。
<!--
The Kubernetes project recommends that cloud provider integration code
configures load balancer health checks that target the service proxy's healthz
port. If you are using or implementing your own virtual IP implementation,
that people can use instead of kube-proxy, you should set up a similar health
checking port with logic that matches the kube-proxy implementation.
-->
Kubernetes 项目建议云提供商集成代码配置负载均衡器健康检查,以针对服务代理的 healthz 端口。
如果你正在使用或实现自己的(可替代 kube-proxy 的)虚拟 IP 实现,你应该设置一个类似的健康检查端口,
其逻辑应与 kube-proxy 实现相匹配。
<!--
### Traffic to terminating endpoints
-->
@ -884,6 +1160,172 @@ pool.
并逐渐降低指向那些处于终止过程中的 Pod 的流量。
到 Pod 完成终止时,外部负载均衡器应该已经发现节点的健康检查失败并从后端池中完全移除该节点。
<!--
## Traffic Distribution
{{< feature-state feature_gate_name="ServiceTrafficDistribution" >}}
The `spec.trafficDistribution` field within a Kubernetes Service allows you to
express preferences for how traffic should be routed to Service endpoints.
Implementations like kube-proxy use the `spec.trafficDistribution` field as a
guideline. The behavior associated with a given preference may subtly differ
between implementations.
-->
## 流量分发 {#traffic-distribution}
{{< feature-state feature_gate_name="ServiceTrafficDistribution" >}}
Kubernetes Service 中的 `spec.trafficDistribution` 字段允许你定义流量应如何路由到
Service 端点的偏好。像 kube-proxy 这样的实现会将 `spec.trafficDistribution`
字段作为指导。不同实现之间,与给定偏好相关的行为可能会略有不同。
<!--
`PreferClose` with kube-proxy
: For kube-proxy, this means prioritizing sending traffic to endpoints within
the same zone as the client. The EndpointSlice controller updates
EndpointSlices with `hints` to communicate this preference, which kube-proxy
then uses for routing decisions. If a client's zone does not have any
available endpoints, traffic will be routed cluster-wide for that client.
In the absence of any value for `trafficDistribution`, the default routing
strategy for kube-proxy is to distribute traffic to any endpoint in the cluster.
-->
`PreferClose` 与 kube-proxy 结合
: 对于 kube-proxy这意味着优先将流量发送到与客户端位于同一区域的端点。
EndpointSlice 控制器使用 `hints` 来更新 EndpointSlices 以传达此偏好,
之后kube-proxy 会使用这些提示进行路由决策。如果客户端的区域没有可用的端点,
则流量将在整个集群范围内路由。
如果 `trafficDistribution` 没有任何值kube-proxy 的默认路由策略是将流量分配到集群中的任一端点。
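<!--
A minimal sketch of a Service using this field; the name and labels are hypothetical:
-->
下面是使用该字段的 Service 极简示意,其中的名称和标签均为假设值:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-service                    # 假设的名称
spec:
  selector:
    app.kubernetes.io/name: my-app    # 假设的标签
  ports:
  - port: 80
    targetPort: 8080
  trafficDistribution: PreferClose    # 优先路由到与客户端同区域的端点
```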
<!--
### Comparison with `service.kubernetes.io/topology-mode: Auto`
The `trafficDistribution` field with `PreferClose` and the
`service.kubernetes.io/topology-mode: Auto` annotation both aim to prioritize
same-zone traffic. However, there are key differences in their approaches:
-->
### 与 `service.kubernetes.io/topology-mode: Auto` 的比较 {#comparison-with-service-kubernetes-io-topology-mode-auto}
`trafficDistribution` 字段中的 `PreferClose`
`service.kubernetes.io/topology-mode: Auto` 注解都旨在优先处理同一区域的流量。
然而,它们的方法存在一些关键差异:
<!--
* `service.kubernetes.io/topology-mode: Auto`: Attempts to distribute traffic
proportionally across zones based on allocatable CPU resources. This heuristic
includes safeguards (such as the [fallback
behavior](/docs/concepts/services-networking/topology-aware-routing/#three-or-more-endpoints-per-zone)
for small numbers of endpoints) and could lead to the feature being disabled
in certain scenarios for load-balancing reasons. This approach sacrifices some
predictability in favor of potential load balancing.
-->
* `service.kubernetes.io/topology-mode: Auto`:尝试根据可分配的 CPU
资源在各区域之间按比例分配流量。此启发式方法包括一些保障措施
(例如针对少量端点的[回退行为](/zh-cn/docs/concepts/services-networking/topology-aware-routing/#three-or-more-endpoints-per-zone)
并在某些场景下可能因负载均衡原因导致该特性被禁用。这种方法在一定程度上牺牲了可预测性,
以换取潜在的负载均衡。
<!--
* `trafficDistribution: PreferClose`: This approach aims to be slightly simpler
and more predictable: "If there are endpoints in the zone, they will receive
all traffic for that zone, if there are no endpoints in a zone, the traffic
will be distributed to other zones". While the approach may offer more
predictability, it does mean that you are in control of managing a [potential
overload](#considerations-for-using-traffic-distribution-control).
-->
* `trafficDistribution: PreferClose`:这种方法旨在做到更简单、更可预测:
“如果区域内有端点,它们将接收该区域的所有流量;如果区域内没有端点,流量将分配到其他区域”。
虽然这种方法可能提供更多的可预测性,但这意味着你需要管理[潜在的过载](#considerations-for-using-traffic-distribution-control)。
<!--
If the `service.kubernetes.io/topology-mode` annotation is set to `Auto`, it
will take precedence over `trafficDistribution`. (The annotation may be deprecated
in the future in favour of the `trafficDistribution` field).
-->
如果 `service.kubernetes.io/topology-mode` 注解设置为 `Auto`,它将优先于
`trafficDistribution`。(该注解将来可能会被弃用,取而代之的是 `trafficDistribution` 字段)。
<!--
### Interaction with Traffic Policies
When compared to the `trafficDistribution` field, the traffic policy fields
(`externalTrafficPolicy` and `internalTrafficPolicy`) are meant to offer a
stricter traffic locality requirements. Here's how `trafficDistribution`
interacts with them:
-->
### 与流量策略的交互 {#interaction-with-traffic-policies}
`trafficDistribution` 字段相比,流量策略字段
`externalTrafficPolicy` 和 `internalTrafficPolicy`)旨在提供更严格的流量局域化要求。
以下是 `trafficDistribution` 与它们的交互方式:
<!--
* Precedence of Traffic Policies: For a given Service, if a traffic policy
(`externalTrafficPolicy` or `internalTrafficPolicy`) is set to `Local`, it
takes precedence over `trafficDistribution: PreferClose` for the corresponding
traffic type (external or internal, respectively).
-->
* 流量策略的优先序:对于给定的 Service如果流量策略
`externalTrafficPolicy` 或 `internalTrafficPolicy`)设置为 `Local`
则它优先于相应流量类型(分别为外部或内部)的 `trafficDistribution: PreferClose`
<!--
* `trafficDistribution` Influence: For a given Service, if a traffic policy
(`externalTrafficPolicy` or `internalTrafficPolicy`) is set to `Cluster` (the
default), or if the fields are not set, then `trafficDistribution:
PreferClose` guides the routing behavior for the corresponding traffic type
(external or internal, respectively). This means that an attempt will be made
to route traffic to an endpoint that is in the same zone as the client.
-->
* `trafficDistribution` 的影响:对于给定的 Service如果流量策略
`externalTrafficPolicy` 或 `internalTrafficPolicy`)设置为 `Cluster`(默认值),
或者这些字段未设置,那么 `trafficDistribution: PreferClose` 将指导相应流量类型
(分别为外部或内部)的路由行为。这意味着 kube-proxy 将尝试将流量路由到与客户端位于同一区域的端点。
<!--
### Considerations for using traffic distribution control
* **Increased Probability of Overloaded Endpoints:** The `PreferClose`
heuristic will attempt to route traffic to the closest healthy endpoints
instead of spreading that traffic evenly across all endpoints. If you do not
have a sufficient number of endpoints within a zone, they may become
overloaded. This is especially likely if incoming traffic is not
proportionally distributed across zones. To mitigate this, consider the
following strategies:
* [Pod Topology Spread
Constraints](/docs/concepts/scheduling-eviction/topology-spread-constraints/):
Use Pod Topology Spread Constraints to distribute your pods more evenly
across zones.
* Zone-specific Deployments: If you expect to see skewed traffic patterns,
create a separate Deployment for each zone. This approach allows the
separate workloads to scale independently. There are also workload
management addons available from the ecosystem, outside the Kubernetes
project itself, that can help here.
-->
### 使用流量分配控制的注意事项 {#considerations-for-using-traffic-distribution-control}
* **端点过载的概率增加:** `PreferClose` 启发式方法将尝试将流量路由到最近的健康端点,
而不是将流量均匀分布到所有端点。如果某个区域内的端点数量不足,它们可能会过载。
如果传入流量在各区域之间分布不均,这种情况更有可能发生。为减轻这种情况,请考虑以下策略:
* [Pod 拓扑分布约束](/zh-cn/docs/concepts/scheduling-eviction/topology-spread-constraints/)
使用 Pod 拓扑分布约束在各区域之间更均匀地分布你的 Pod参见此列表之后的示例
* 区域特定的 Deployment如果你预计会看到不均衡的流量模式
可以为每个区域创建一个单独的 Deployment。这种方法允许独立扩展各个工作负载。
生态系统中还有一些 Kubernetes 项目之外的工作负载管理插件,可以在这方面提供帮助。
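<!--
A sketch of a Pod topology spread constraint that spreads replicas across zones; this is an
illustrative fragment of a Deployment's Pod template, and the labels are assumptions:
-->
下面的示意展示一个按区域分布副本的 Pod 拓扑分布约束;这只是 Deployment Pod
模板中的说明性片段,其中的标签为假设值:

```yaml
# Deployment 的 Pod 模板片段(仅为示意)
spec:
  topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone   # 按可用区分布
    whenUnsatisfiable: ScheduleAnyway          # 无法满足时仍允许调度,只是尽量均衡
    labelSelector:
      matchLabels:
        app.kubernetes.io/name: my-app         # 假设的标签
```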
<!--
* **Implementation-specific behavior:** Each dataplane implementation may handle
this field slightly differently. If you're using an implementation other than
kube-proxy, refer the documentation specific to that implementation to
understand how this field is being handled.
-->
* **特定于具体实现的行为:** 各个数据平面实现处理此字段的方式可能会稍有不同。
如果你使用的是 kube-proxy 以外的实现,请参阅该实现的特定文档以了解该实现是如何处理此字段的。
## {{% heading "whatsnext" %}}
<!--