Multizone: docs

Docs for multizone aka Ubernetes Lite
pull/140/head
Justin Santa Barbara 2016-03-16 01:18:42 -04:00
parent 54ee722793
commit cbbf1fa6a7
2 changed files with 315 additions and 0 deletions


@ -120,6 +120,8 @@ toc:
  path: /docs/admin/multi-cluster/
- title: Using Large Clusters
  path: /docs/admin/cluster-large/
- title: Running in Multiple Zones
  path: /docs/admin/multiple-zones/
- title: Building High-Availability Clusters
  path: /docs/admin/high-availability/
- title: Accessing Clusters


@ -0,0 +1,313 @@
---
---
## Introduction
Kubernetes 1.2 adds support for running a single cluster in multiple failure zones
(GCE calls them simply "zones", AWS calls them "availability zones", here we'll refer to them as "zones").
This is a lightweight version of a broader effort to federate multiple
Kubernetes clusters together (sometimes referred to by the affectionate
nickname ["Ubernetes"](https://github.com/kubernetes/kubernetes/blob/master/docs/proposals/federation.md)).
Full federation will allow combining separate
Kubernetes clusters running in different regions or clouds. However, many
users simply want to run a more available Kubernetes cluster in multiple zones
of their cloud provider, and this is what the multizone support in 1.2 allows
(we nickname this "Ubernetes Lite").
Multizone support is deliberately limited: a single Kubernetes cluster can run
in multiple zones, but only within the same region (and cloud provider). Only
GCE and AWS are currently supported automatically (though it is easy to
add similar support for other clouds or even bare metal, by simply arranging
for the appropriate labels to be added to nodes and volumes).
* TOC
{:toc}
## Functionality
When nodes are started, the kubelet automatically adds labels to them with
zone information.
Kubernetes will automatically spread the pods in a replication controller
or service across nodes in a single-zone cluster (to reduce the impact of
failures). With multiple-zone clusters, this spreading behaviour is
extended across zones (to reduce the impact of zone failures). (This is
achieved via `SelectorSpreadPriority`.) This is a best-effort
placement, and so if the zones in your cluster are heterogeneous
(e.g. different numbers of nodes, different types of nodes, or
different pod resource requirements), this might prevent perfectly
even spreading of your pods across zones. If desired, you can use
homogeneous zones (same number and types of nodes) to reduce the
probability of unequal spreading.
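Once you have pods running, a quick way to see how they landed across zones is to list the pods with the node each was scheduled onto, and the nodes with their zone label shown as a column (a rough check using standard kubectl output options; the label is the one added by the kubelet as described above):
```shell
# Show which node each pod was scheduled onto
kubectl get pods -o wide
# Show each node with its zone as an extra column
kubectl get nodes -L failure-domain.beta.kubernetes.io/zone
```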
When persistent volumes are created, the `PersistentVolumeLabel`
admission controller automatically adds zone labels to them. The scheduler (via the
`VolumeZonePredicate` predicate) will then ensure that pods that claim a
given volume are only placed into the same zone as that volume, as volumes
cannot be attached across zones.
## Limitations
There are some important limitations of the multizone support:
* We assume that the different zones are located close to each other in the
network, so we don't perform any zone-aware routing. In particular, traffic
that goes via services might cross zones (even if some pods backing that service
exist in the same zone as the client), and this may incur additional latency and cost.
* Volume zone-affinity will only work with a `PersistentVolume`, and will not
work if you directly specify, for example, an EBS volume in the pod spec (see the sketch after this list).
* Clusters cannot span clouds or regions (this functionality will require full
federation support).
* Although your nodes are in multiple zones, kube-up currently builds
a single master node by default. While services are highly
available and can tolerate the loss of a zone, the control plane is
located in a single zone. Users that want a highly available control
plane should follow the [high availability](/docs/admin/high-availability) instructions.
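For illustration, the kind of in-line volume definition that does not get zone affinity looks like the following (a hypothetical pod spec: the pod name, image, and volume ID are placeholders; `awsElasticBlockStore` is the in-line EBS volume source in the pod API):
```yaml
kind: Pod
apiVersion: v1
metadata:
  name: inline-ebs-pod         # hypothetical name
spec:
  containers:
  - name: app
    image: nginx
    volumeMounts:
    - mountPath: /data
      name: data
  volumes:
  - name: data
    # In-line EBS volume: no zone labels are added for it, so the scheduler
    # cannot restrict this pod to the volume's zone.
    awsElasticBlockStore:
      volumeID: vol-01234567   # placeholder volume ID
      fsType: ext4
```
A pod defined this way will still run, but nothing prevents the scheduler from placing it on a node in a zone where that EBS volume cannot be attached; use a `PersistentVolume` (as in the walkthrough below) to get zone-aware placement.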
## Walkthrough
We're now going to walk through setting up and using a multi-zone
cluster on both GCE & AWS. To do so, you bring up a full cluster
(specifying `MULTIZONE=1`), and then you add nodes in additional zones
by running `kube-up` again (specifying `KUBE_USE_EXISTING_MASTER=true`).
### Bringing up your cluster
Create the cluster as normal, but pass `MULTIZONE=1` to tell the cluster to manage multiple zones. This creates the initial nodes in us-central1-a (GCE) or us-west-2a (AWS).
GCE:
```shell
curl -sS https://get.k8s.io | MULTIZONE=1 KUBERNETES_PROVIDER=gce KUBE_GCE_ZONE=us-central1-a NUM_NODES=3 bash
```
AWS:
```shell
curl -sS https://get.k8s.io | MULTIZONE=1 KUBERNETES_PROVIDER=aws KUBE_AWS_ZONE=us-west-2a NUM_NODES=3 bash
```
This step brings up a cluster as normal, still running in a single zone
(but `MULTIZONE=1` has enabled multi-zone capabilities).
### Nodes are labeled
View the nodes; you can see that they are labeled with zone information.
They are all in `us-central1-a` (GCE) or `us-west-2a` (AWS) so far. The
labels are `failure-domain.beta.kubernetes.io/region` for the region,
and `failure-domain.beta.kubernetes.io/zone` for the zone:
```shell
> kubectl get nodes --show-labels
NAME STATUS AGE LABELS
kubernetes-master Ready,SchedulingDisabled 6m beta.kubernetes.io/instance-type=n1-standard-1,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-a,kubernetes.io/hostname=kubernetes-master
kubernetes-minion-87j9 Ready 6m beta.kubernetes.io/instance-type=n1-standard-2,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-a,kubernetes.io/hostname=kubernetes-minion-87j9
kubernetes-minion-9vlv Ready 6m beta.kubernetes.io/instance-type=n1-standard-2,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-a,kubernetes.io/hostname=kubernetes-minion-9vlv
kubernetes-minion-a12q Ready 6m beta.kubernetes.io/instance-type=n1-standard-2,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-a,kubernetes.io/hostname=kubernetes-minion-a12q
```
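If you only want to see the nodes in a particular zone, the zone label can also be used as a selector (a quick filter using the label value shown above):
```shell
kubectl get nodes -l failure-domain.beta.kubernetes.io/zone=us-central1-a
```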
### Add more nodes in a second zone
Let's add another set of nodes to the existing cluster, reusing the
existing master, running in a different zone (us-central1-b or us-west-2b).
We run kube-up again, but by specifying `KUBE_USE_EXISTING_MASTER=true`
kube-up will reuse the previously created master instead of creating a
new one.
GCE:
```shell
KUBE_USE_EXISTING_MASTER=true MULTIZONE=1 KUBERNETES_PROVIDER=gce KUBE_GCE_ZONE=us-central1-b NUM_NODES=3 kubernetes/cluster/kube-up.sh
```
On AWS we also need to specify the network CIDR for the additional
subnet, along with the master internal IP address:
```shell
KUBE_USE_EXISTING_MASTER=true MULTIZONE=1 KUBERNETES_PROVIDER=aws KUBE_AWS_ZONE=us-west-2b NUM_NODES=3 KUBE_SUBNET_CIDR=172.20.1.0/24 MASTER_INTERNAL_IP=172.20.0.9 kubernetes/cluster/kube-up.sh
```
View the nodes again; 3 more nodes should have launched and be tagged
in us-central1-b:
```shell
> kubectl get nodes --show-labels
NAME STATUS AGE LABELS
kubernetes-master Ready,SchedulingDisabled 16m beta.kubernetes.io/instance-type=n1-standard-1,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-a,kubernetes.io/hostname=kubernetes-master
kubernetes-minion-281d Ready 2m beta.kubernetes.io/instance-type=n1-standard-2,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-b,kubernetes.io/hostname=kubernetes-minion-281d
kubernetes-minion-87j9 Ready 16m beta.kubernetes.io/instance-type=n1-standard-2,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-a,kubernetes.io/hostname=kubernetes-minion-87j9
kubernetes-minion-9vlv Ready 16m beta.kubernetes.io/instance-type=n1-standard-2,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-a,kubernetes.io/hostname=kubernetes-minion-9vlv
kubernetes-minion-a12q Ready 17m beta.kubernetes.io/instance-type=n1-standard-2,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-a,kubernetes.io/hostname=kubernetes-minion-a12q
kubernetes-minion-pp2f Ready 2m beta.kubernetes.io/instance-type=n1-standard-2,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-b,kubernetes.io/hostname=kubernetes-minion-pp2f
kubernetes-minion-wf8i Ready 2m beta.kubernetes.io/instance-type=n1-standard-2,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-b,kubernetes.io/hostname=kubernetes-minion-wf8i
```
### Volume affinity
Create a volume (only PersistentVolumes are supported for zone
affinity), using the new dynamic volume creation:
```json
kubectl create -f - <<EOF
{
  "kind": "PersistentVolumeClaim",
  "apiVersion": "v1",
  "metadata": {
    "name": "claim1",
    "annotations": {
      "volume.alpha.kubernetes.io/storage-class": "foo"
    }
  },
  "spec": {
    "accessModes": [
      "ReadWriteOnce"
    ],
    "resources": {
      "requests": {
        "storage": "5Gi"
      }
    }
  }
}
EOF
```
The PV is also labeled with the zone & region it was created in. For
version 1.2, dynamic persistent volumes are always created in the zone
of the cluster master (here us-central1-a / us-west-2a); this will
be improved in a future version (issue [#23330](https://github.com/kubernetes/kubernetes/issues/23330)).
```shell
> kubectl get pv --show-labels
NAME CAPACITY ACCESSMODES STATUS CLAIM REASON AGE LABELS
pv-gce-mj4gm 5Gi RWO Bound default/claim1 46s failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-a
```
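You can also confirm that the claim created above is bound before using it in a pod (a quick check; `claim1` is the claim created earlier):
```shell
kubectl get pvc claim1
```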
Now we will create a pod that uses the persistent volume claim.
Because GCE PDs / AWS EBS volumes cannot be attached across zones,
this pod can only be created in the same zone as the volume:
```yaml
kubectl create -f - <<EOF
kind: Pod
apiVersion: v1
metadata:
  name: mypod
spec:
  containers:
    - name: myfrontend
      image: nginx
      volumeMounts:
      - mountPath: "/var/www/html"
        name: mypd
  volumes:
    - name: mypd
      persistentVolumeClaim:
        claimName: claim1
EOF
```
Note that the pod was automatically created in the same zone as the volume, as
cross-zone attachments are not generally permitted by cloud providers:
```shell
> kubectl describe pod mypod | grep Node
Node: kubernetes-minion-9vlv/10.240.0.5
> kubectl get node kubernetes-minion-9vlv --show-labels
NAME STATUS AGE LABELS
kubernetes-minion-9vlv Ready 22m beta.kubernetes.io/instance-type=n1-standard-2,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-a,kubernetes.io/hostname=kubernetes-minion-9vlv
```
### Pods are spread across zones
Pods in a replication controller or service are automatically spread
across zones. First, let's launch more nodes in a third zone:
GCE:
```shell
KUBE_USE_EXISTING_MASTER=true MULTIZONE=1 KUBERNETES_PROVIDER=gce KUBE_GCE_ZONE=us-central1-f NUM_NODES=3 kubernetes/cluster/kube-up.sh
```
AWS:
```shell
KUBE_USE_EXISTING_MASTER=true MULTIZONE=1 KUBERNETES_PROVIDER=aws KUBE_AWS_ZONE=us-west-2c NUM_NODES=3 KUBE_SUBNET_CIDR=172.20.2.0/24 MASTER_INTERNAL_IP=172.20.0.9 kubernetes/cluster/kube-up.sh
```
Verify that you now have nodes in 3 zones:
```shell
kubectl get nodes --show-labels
```
Create the guestbook-go example, which includes an RC of size 3, running a simple web app:
```shell
find kubernetes/examples/guestbook-go/ -name '*.json' | xargs -I {} kubectl create -f {}
```
The pods should be spread across all 3 zones:
```shell
> kubectl describe pod -l app=guestbook | grep Node
Node: kubernetes-minion-9vlv/10.240.0.5
Node: kubernetes-minion-281d/10.240.0.8
Node: kubernetes-minion-olsh/10.240.0.11
> kubectl get node kubernetes-minion-9vlv kubernetes-minion-281d kubernetes-minion-olsh --show-labels
NAME STATUS AGE LABELS
kubernetes-minion-9vlv Ready 34m beta.kubernetes.io/instance-type=n1-standard-2,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-a,kubernetes.io/hostname=kubernetes-minion-9vlv
kubernetes-minion-281d Ready 20m beta.kubernetes.io/instance-type=n1-standard-2,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-b,kubernetes.io/hostname=kubernetes-minion-281d
kubernetes-minion-olsh Ready 3m beta.kubernetes.io/instance-type=n1-standard-2,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-f,kubernetes.io/hostname=kubernetes-minion-olsh
```
Load balancers span all zones in a cluster; the guestbook-go example
includes an example load-balanced service:
```shell
> kubectl describe service guestbook | grep LoadBalancer.Ingress
LoadBalancer Ingress: 130.211.126.21
> ip=130.211.126.21
> curl -s http://${ip}:3000/env | grep HOSTNAME
"HOSTNAME": "guestbook-44sep",
> (for i in `seq 20`; do curl -s http://${ip}:3000/env | grep HOSTNAME; done) | sort | uniq
"HOSTNAME": "guestbook-44sep",
"HOSTNAME": "guestbook-hum5n",
"HOSTNAME": "guestbook-ppm40",
```
The load balancer correctly targets all the pods, even though they are in multiple zones.
### Shutting down the cluster
When you're done, clean up:
GCE:
```shell
KUBERNETES_PROVIDER=gce KUBE_USE_EXISTING_MASTER=true KUBE_GCE_ZONE=us-central1-f kubernetes/cluster/kube-down.sh
KUBERNETES_PROVIDER=gce KUBE_USE_EXISTING_MASTER=true KUBE_GCE_ZONE=us-central1-b kubernetes/cluster/kube-down.sh
KUBERNETES_PROVIDER=gce KUBE_GCE_ZONE=us-central1-a kubernetes/cluster/kube-down.sh
```
AWS:
```shell
KUBERNETES_PROVIDER=aws KUBE_USE_EXISTING_MASTER=true KUBE_AWS_ZONE=us-west-2c kubernetes/cluster/kube-down.sh
KUBERNETES_PROVIDER=aws KUBE_USE_EXISTING_MASTER=true KUBE_AWS_ZONE=us-west-2b kubernetes/cluster/kube-down.sh
KUBERNETES_PROVIDER=aws KUBE_AWS_ZONE=us-west-2a kubernetes/cluster/kube-down.sh
```