---
title: Using Source IP
content_template: templates/tutorial
---
{{% capture overview %}}

Applications running in a Kubernetes cluster find and communicate with each other, and the outside world, through the Service abstraction. This document explains what happens to the source IP of packets sent to different types of Services, and how you can toggle this behavior according to your needs.
{{% /capture %}}
{{% capture prerequisites %}}

{{< include "task-tutorial-prereqs.md" >}} {{< version-check >}}

## Terminology

This document makes use of the following terms:

* [NAT](https://en.wikipedia.org/wiki/Network_address_translation): network address translation
* [Source NAT](https://en.wikipedia.org/wiki/Network_address_translation#SNAT): replacing the source IP on a packet, usually with a node's IP
* [Destination NAT](https://en.wikipedia.org/wiki/Network_address_translation#DNAT): replacing the destination IP on a packet, usually with a pod's IP
* [VIP](/docs/concepts/services-networking/service/#virtual-ips-and-service-proxies): a virtual IP, such as the one assigned to every Kubernetes Service
* [Kube-proxy](/docs/concepts/services-networking/service/#virtual-ips-and-service-proxies): a network daemon that orchestrates Service VIP management on every node

## Prerequisites

You must have a working Kubernetes 1.5 cluster to run the examples in this document. The examples use a small nginx webserver that echoes back the source IP of requests it receives through an HTTP header. You can create it as follows:
```console
$ kubectl run source-ip-app --image=k8s.gcr.io/echoserver:1.4
deployment "source-ip-app" created
```
{{% /capture %}}
{{% capture objectives %}}

* Expose a simple application through various types of Services
* Understand how each Service type handles source IP NAT
* Understand the tradeoffs involved in preserving source IP
{{% /capture %}}
{{% capture lessoncontent %}}

## Source IP for Services with Type=ClusterIP

Packets sent to a ClusterIP from within the cluster are never source NAT'd if you're running kube-proxy in [iptables mode](/docs/user-guide/services/#proxy-mode-iptables), which is the default since Kubernetes 1.2. Kube-proxy exposes its mode through a `proxyMode` endpoint:
```console
$ kubectl get nodes
NAME STATUS AGE VERSION
kubernetes-minion-group-6jst Ready 2h v1.6.0+fff5156
kubernetes-minion-group-cx31 Ready 2h v1.6.0+fff5156
kubernetes-minion-group-jj1t Ready 2h v1.6.0+fff5156
kubernetes-minion-group-6jst $ curl localhost:10249/proxyMode
iptables
```

You can test source IP preservation by creating a Service over the source IP app:
```console
$ kubectl expose deployment source-ip-app --name=clusterip --port=80 --target-port=8080
service "clusterip" exposed
$ kubectl get svc clusterip
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
clusterip 10.0.170.92 <none> 80/TCP 51s
```

And hitting the `ClusterIP` from a pod in the same cluster:
```console
$ kubectl run busybox -it --image=busybox --restart=Never --rm
Waiting for pod default/busybox to be running, status is Pending, pod ready: false
If you don't see a command prompt, try pressing enter.
# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
3: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1460 qdisc noqueue
link/ether 0a:58:0a:f4:03:08 brd ff:ff:ff:ff:ff:ff
inet 10.244.3.8/24 scope global eth0
valid_lft forever preferred_lft forever
inet6 fe80::188a:84ff:feb0:26a5/64 scope link
valid_lft forever preferred_lft forever
# wget -qO - 10.0.170.92
CLIENT VALUES:
client_address=10.244.3.8
command=GET
...
```

If the client pod and the server pod are on the same node, the client_address is the client pod's IP address. However, if they are on different nodes, the client_address will be the flannel IP address of the node the client pod runs on.
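
The rule above can be sketched in plain shell; the node names and addresses below are hypothetical, not taken from a real cluster:

```shell
#!/bin/sh
# Sketch: which client_address the echo server reports for ClusterIP traffic,
# depending on whether the client and server pods share a node.
expected_client_address() {
  client_node=$1; server_node=$2; client_pod_ip=$3; client_node_ip=$4
  if [ "$client_node" = "$server_node" ]; then
    echo "$client_pod_ip"   # same node: the pod's source IP survives untouched
  else
    echo "$client_node_ip"  # different nodes: the overlay (flannel) node IP shows up
  fi
}

expected_client_address node-a node-a 10.244.3.8 10.244.3.0
expected_client_address node-a node-b 10.244.3.8 10.244.3.0
```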

## Source IP for Services with Type=NodePort

As of Kubernetes 1.5, packets sent to Services with [Type=NodePort](/docs/user-guide/services/#type-nodeport) are source NAT'd by default. You can test this by creating a `NodePort` Service:
```console
$ kubectl expose deployment source-ip-app --name=nodeport --port=80 --target-port=8080 --type=NodePort
service "nodeport" exposed
$ NODEPORT=$(kubectl get -o jsonpath="{.spec.ports[0].nodePort}" services nodeport)
$ NODES=$(kubectl get nodes -o jsonpath='{ $.items[*].status.addresses[?(@.type=="ExternalIP")].address }')
```

If you're running on a cloud provider, you may need to open up a firewall rule for the `nodes:nodeport` reported above.
Now you can try reaching the Service from outside the cluster through the node port allocated above:
```console
$ for node in $NODES; do curl -s $node:$NODEPORT | grep -i client_address; done
client_address=10.180.1.1
client_address=10.240.0.5
client_address=10.240.0.3
```

Note that these are not the correct client IPs; they're cluster-internal IPs. This is what happens:

* Client sends packet to `node2:nodePort`
* `node2` replaces the source IP address (SNAT) on the packet with its own IP address
* `node2` replaces the destination IP on the packet with the pod IP
* The packet is routed to node 1, and then to the endpoint
* The pod's reply is routed back to node2
* The pod's reply is sent back to the client

Visually:
```
client
\ ^
\ \
v \
node 1 <--- node 2
| ^ SNAT
| | --->
v |
endpoint
```
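
The address rewrites in those steps can be simulated with plain shell string handling; all addresses and the node port below are made up for illustration:

```shell
#!/bin/sh
# Simulate the NAT steps applied to a packet sent to node2:nodePort.
client_ip="203.0.113.7"
node2_ip="10.240.0.5"
pod_ip="10.180.1.136"

src="$client_ip"; dst="$node2_ip:30000"  # 1. client sends to node2:nodePort
src="$node2_ip"                          # 2. node2 SNATs the source to its own IP
dst="$pod_ip:8080"                       # 3. node2 DNATs the destination to the pod IP

# The endpoint sees the node IP, not the client IP -- which is why the
# curl loop above reported cluster-internal addresses.
echo "packet at endpoint: $src -> $dst"
```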

To avoid this, Kubernetes has a feature to preserve the client source IP [(check here for feature availability)](/docs/tasks/access-application-cluster/create-external-load-balancer/#preserving-the-client-source-ip). Setting `service.spec.externalTrafficPolicy` to the value `Local` will only proxy requests to local endpoints, never forwarding traffic to other nodes, thereby preserving the original source IP address. If there are no local endpoints, packets sent to the node are dropped, so you can rely on the correct source IP in any packet-processing rules you might apply to packets that make it through to the endpoint.

Set the `service.spec.externalTrafficPolicy` field as follows:
```console
$ kubectl patch svc nodeport -p '{"spec":{"externalTrafficPolicy":"Local"}}'
service "nodeport" patched
```

Now, re-run the test:
```console
$ for node in $NODES; do curl --connect-timeout 1 -s $node:$NODEPORT | grep -i client_address; done
client_address=104.132.1.79
```

Note that you only got one reply, with the *right* client IP, from the one node on which the endpoint pod is running.

This is what happens:

* Client sends packet to `node2:nodePort`, which doesn't have any endpoints
* Packet is dropped
* Client sends packet to `node1:nodePort`, which *does* have endpoints
* node1 routes the packet to the endpoint with the correct source IP

Visually:
```
client
^ / \
/ / \
/ v X
node 1 node 2
^ |
| |
| v
endpoint
```
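
The per-node decision those steps describe can be sketched as a small shell function; the node names and endpoint counts are hypothetical:

```shell
#!/bin/sh
# Sketch of the externalTrafficPolicy=Local decision made on each node:
# deliver only if the node hosts an endpoint of the Service, else drop.
route_nodeport_local() {
  node=$1; local_endpoints=$2  # number of Service endpoints on this node
  if [ "$local_endpoints" -gt 0 ]; then
    echo "deliver (client IP preserved)"
  else
    echo "drop"
  fi
}

route_nodeport_local node1 1
route_nodeport_local node2 0
```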

## Source IP for Services with Type=LoadBalancer

As of Kubernetes 1.5, packets sent to Services with [Type=LoadBalancer](/docs/user-guide/services/#type-loadbalancer) are source NAT'd by default, because all Kubernetes nodes in the `Ready` state are eligible for loadbalanced traffic. So if packets arrive at a node without an endpoint, the system proxies them to a node *with* an endpoint, replacing the source IP on the packet with the IP of the node (as described in the previous section).

You can test this by exposing the source-ip-app through a loadbalancer:
```console
$ kubectl expose deployment source-ip-app --name=loadbalancer --port=80 --target-port=8080 --type=LoadBalancer
service "loadbalancer" exposed
$ kubectl get svc loadbalancer
NAME CLUSTER-IP EXTERNAL-IP PORT(S) AGE
loadbalancer 10.0.65.118 104.198.149.140 80/TCP 5m
$ curl 104.198.149.140
CLIENT VALUES:
client_address=10.240.0.5
...
```
However, if you're running on Google Kubernetes Engine/GCE, setting the same `service.spec.externalTrafficPolicy` field to `Local` forces nodes *without* Service endpoints to remove themselves from the list of nodes eligible for loadbalanced traffic, by deliberately failing health checks.

Visually:
```
client
|
lb VIP
/ ^
v /
health check ---> node 1 node 2 <--- health check
200 <--- ^ | ---> 500
| V
endpoint
```

You can test this by patching the Service:
```console
$ kubectl patch svc loadbalancer -p '{"spec":{"externalTrafficPolicy":"Local"}}'
```

You should immediately see the `service.spec.healthCheckNodePort` field allocated by Kubernetes:
```console
$ kubectl get svc loadbalancer -o yaml | grep -i healthCheckNodePort
healthCheckNodePort: 32122
```

The `service.spec.healthCheckNodePort` field points to a port on every node serving the health check at `/healthz`. You can test this:

```console
$ kubectl get pod -o wide -l run=source-ip-app
NAME READY STATUS RESTARTS AGE IP NODE
source-ip-app-826191075-qehz4 1/1 Running 0 20h 10.180.1.136 kubernetes-minion-group-6jst
kubernetes-minion-group-6jst $ curl localhost:32122/healthz
1 Service Endpoints found
kubernetes-minion-group-jj1t $ curl localhost:32122/healthz
No Service Endpoints Found
```
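
A minimal sketch of what that health check handler returns per node, assuming a failing check answers with a non-2xx status (shown as 500 here, matching the diagram above; endpoint counts are illustrative):

```shell
#!/bin/sh
# Sketch: per-node healthCheckNodePort response. Nodes with local Service
# endpoints pass the loadbalancer's health check; nodes without them fail it.
healthz() {
  endpoints=$1  # number of local Service endpoints on the node
  if [ "$endpoints" -gt 0 ]; then
    echo "200: $endpoints Service Endpoints found"
  else
    echo "500: No Service Endpoints Found"
  fi
}

healthz 1
healthz 0
```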

A service controller running on the master is responsible for allocating the cloud loadbalancer. When it does so, it also allocates HTTP health check port/path pointing at this port/path on each node. Wait about 10 seconds for the 2 nodes without endpoints to fail health checks, then curl the loadbalancer IP:
```console
$ curl 104.198.149.140
CLIENT VALUES:
client_address=104.132.1.79
...
```

__Cross-platform support__

As of Kubernetes 1.5, support for source IP preservation through Services with Type=LoadBalancer is only implemented in a subset of cloudproviders (GCP and Azure). The cloudprovider you're running on might fulfill the request for a loadbalancer in a few different ways:

1. With a proxy that terminates the client connection and opens a new connection to your nodes/endpoints. In such cases the source IP will always be that of the cloud loadbalancer, not that of the client.

2. With a packet forwarder, such that requests from the client sent to the loadbalancer VIP end up at the node with the source IP of the client, not an intermediate proxy.

Loadbalancers in the first category must use an agreed-upon protocol between the loadbalancer and backend to communicate the true client IP, such as the HTTP [X-FORWARDED-FOR](https://en.wikipedia.org/wiki/X-Forwarded-For) header, or the [proxy protocol](http://www.haproxy.org/download/1.5/doc/proxy-protocol.txt).
Loadbalancers in the second category can leverage the feature described above by simply creating an HTTP health check pointing at the port stored in the `service.spec.healthCheckNodePort` field on the Service.
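
For the first category, recovering the original client IP means parsing the header the proxy appends. A minimal shell sketch, with an invented header value (when multiple proxies append entries, the left-most is the original client):

```shell
#!/bin/sh
# Recover the original client IP from an X-Forwarded-For header added by a
# terminating proxy. The header value here is fabricated for illustration.
header="X-Forwarded-For: 104.132.1.79, 10.0.65.118"

# Strip the header name, then take the first (left-most) comma-separated entry.
client_ip=$(printf '%s\n' "$header" | sed 's/^X-Forwarded-For: *//' | cut -d, -f1)
echo "$client_ip"
```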
{{% /capture %}}

{{% capture cleanup %}}

Delete the Services:
```console
$ kubectl delete svc -l run=source-ip-app
```

Delete the Deployment, ReplicaSet and Pod:
```console
$ kubectl delete deployment source-ip-app
```
{{% /capture %}}

{{% capture whatsnext %}}

* Learn more about [connecting applications via services](/docs/concepts/services-networking/connect-applications-service/)
* Learn more about [load balancing](/docs/user-guide/load-balancer)
{{% /capture %}}