Multizone: docs

Docs for multizone aka Ubernetes Lite
pull/140/head
Justin Santa Barbara 2016-03-16 01:18:42 -04:00
parent 54ee722793
commit cbbf1fa6a7
2 changed files with 315 additions and 0 deletions


@ -120,6 +120,8 @@ toc:
  path: /docs/admin/multi-cluster/
- title: Using Large Clusters
  path: /docs/admin/cluster-large/
- title: Running in Multiple Zones
  path: /docs/admin/multiple-zones/
- title: Building High-Availability Clusters
  path: /docs/admin/high-availability/
- title: Accessing Clusters


@ -0,0 +1,313 @@
---
---
## Introduction
Kubernetes 1.2 adds support for running a single cluster in multiple failure zones
(GCE calls them simply "zones", AWS calls them "availability zones", here we'll refer to them as "zones").
This is a lightweight version of a broader effort to federate multiple
Kubernetes clusters together (sometimes referred to by the affectionate
nickname ["Ubernetes"](https://github.com/kubernetes/kubernetes/blob/master/docs/proposals/federation.md)).
Full federation will allow combining separate
Kubernetes clusters running in different regions or clouds. However, many
users simply want to run a more available Kubernetes cluster in multiple zones
of their cloud provider, and this is what the multizone support in 1.2 allows
(we nickname this "Ubernetes Lite").
Multizone support is deliberately limited: a single Kubernetes cluster can run
in multiple zones, but only within the same region (and cloud provider). Only
GCE and AWS are currently supported automatically (though it is easy to
add similar support for other clouds or even bare metal, by simply arranging
for the appropriate labels to be added to nodes and volumes).
* TOC
{:toc}
## Functionality
When nodes are started, the kubelet automatically adds labels to them with
zone information.
Kubernetes will automatically spread the pods in a replication controller
or service across nodes in a single-zone cluster (to reduce the impact of
failures). With multiple-zone clusters, this spreading behaviour is
extended across zones (to reduce the impact of zone failures). (This is
achieved via `SelectorSpreadPriority`.) This is a best-effort
placement, and so if the zones in your cluster are heterogeneous
(e.g. different numbers of nodes, different types of nodes, or
different pod resource requirements), this might prevent perfectly
even spreading of your pods across zones. If desired, you can use
homogeneous zones (same number and types of nodes) to reduce the
probability of unequal spreading.
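Once you have pods running, a quick way to see how they landed across zones is to list the pods with the node each was scheduled onto, and the nodes with their zone label shown as a column (a rough check using standard kubectl output options; the label is the one added by the kubelet as described above):
```shell
# Show which node each pod was scheduled onto
kubectl get pods -o wide
# Show each node with its zone as an extra column
kubectl get nodes -L failure-domain.beta.kubernetes.io/zone
```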
When persistent volumes are created, the `PersistentVolumeLabel`
admission controller automatically adds zone labels to them. The scheduler (via the
`VolumeZonePredicate` predicate) will then ensure that pods that claim a
given volume are only placed into the same zone as that volume, as volumes
cannot be attached across zones.
## Limitations
There are some important limitations of the multizone support:
* We assume that the different zones are located close to each other in the
network, so we don't perform any zone-aware routing. In particular, traffic
that goes via services might cross zones (even if some pods backing that service
exist in the same zone as the client), and this may incur additional latency and cost.
* Volume zone-affinity will only work with a `PersistentVolume`, and will not
work if you directly specify, for example, an EBS volume in the pod spec (see the sketch after this list).
* Clusters cannot span clouds or regions (this functionality will require full
federation support).
* Although your nodes are in multiple zones, kube-up currently builds
a single master node by default. While services are highly
available and can tolerate the loss of a zone, the control plane is
located in a single zone. Users that want a highly available control
plane should follow the [high availability](/docs/admin/high-availability) instructions.
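For illustration, the kind of in-line volume definition that does not get zone affinity looks like the following (a hypothetical pod spec: the pod name, image, and volume ID are placeholders; `awsElasticBlockStore` is the in-line EBS volume source in the pod API):
```yaml
kind: Pod
apiVersion: v1
metadata:
  name: inline-ebs-pod         # hypothetical name
spec:
  containers:
  - name: app
    image: nginx
    volumeMounts:
    - mountPath: /data
      name: data
  volumes:
  - name: data
    # In-line EBS volume: no zone labels are added for it, so the scheduler
    # cannot restrict this pod to the volume's zone.
    awsElasticBlockStore:
      volumeID: vol-01234567   # placeholder volume ID
      fsType: ext4
```
A pod defined this way will still run, but nothing prevents the scheduler from placing it on a node in a zone where that EBS volume cannot be attached; use a `PersistentVolume` (as in the walkthrough below) to get zone-aware placement.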
## Walkthrough
We're now going to walk through setting up and using a multi-zone
cluster on both GCE & AWS. To do so, you bring up a full cluster
(specifying `MULTIZONE=1`), and then you add nodes in additional zones
by running `kube-up` again (specifying `KUBE_USE_EXISTING_MASTER=true`).
### Bringing up your cluster
Create the cluster as normal, but pass `MULTIZONE=1` to tell the cluster to manage multiple zones. This creates the initial nodes in us-central1-a (GCE) or us-west-2a (AWS).
GCE:
```shell
curl -sS https://get.k8s.io | MULTIZONE=1 KUBERNETES_PROVIDER=gce KUBE_GCE_ZONE=us-central1-a NUM_NODES=3 bash
```
AWS:
```shell
curl -sS https://get.k8s.io | MULTIZONE=1 KUBERNETES_PROVIDER=aws KUBE_AWS_ZONE=us-west-2a NUM_NODES=3 bash
```
This step brings up a cluster as normal, still running in a single zone
(but `MULTIZONE=1` has enabled multi-zone capabilities).
### Nodes are labeled
View the nodes; you can see that they are labeled with zone information.
They are all in `us-central1-a` (GCE) or `us-west-2a` (AWS) so far. The
labels are `failure-domain.beta.kubernetes.io/region` for the region,
and `failure-domain.beta.kubernetes.io/zone` for the zone:
```shell
> kubectl get nodes --show-labels
NAME STATUS AGE LABELS
kubernetes-master Ready,SchedulingDisabled 6m beta.kubernetes.io/instance-type=n1-standard-1,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-a,kubernetes.io/hostname=kubernetes-master
kubernetes-minion-87j9 Ready 6m beta.kubernetes.io/instance-type=n1-standard-2,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-a,kubernetes.io/hostname=kubernetes-minion-87j9
kubernetes-minion-9vlv Ready 6m beta.kubernetes.io/instance-type=n1-standard-2,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-a,kubernetes.io/hostname=kubernetes-minion-9vlv
kubernetes-minion-a12q Ready 6m beta.kubernetes.io/instance-type=n1-standard-2,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-a,kubernetes.io/hostname=kubernetes-minion-a12q
```
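If you only want to see the nodes in a particular zone, the zone label can also be used as a selector (a quick filter using the label value shown above):
```shell
kubectl get nodes -l failure-domain.beta.kubernetes.io/zone=us-central1-a
```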
### Add more nodes in a second zone
Let's add another set of nodes to the existing cluster, reusing the
existing master, running in a different zone (us-central1-b or us-west-2b).
We run kube-up again, but by specifying `KUBE_USE_EXISTING_MASTER=true`
kube-up will reuse the previously created master instead of creating a
new one.
GCE:
```shell
KUBE_USE_EXISTING_MASTER=true MULTIZONE=1 KUBERNETES_PROVIDER=gce KUBE_GCE_ZONE=us-central1-b NUM_NODES=3 kubernetes/cluster/kube-up.sh
```
On AWS we also need to specify the network CIDR for the additional
subnet, along with the master internal IP address:
```shell
KUBE_USE_EXISTING_MASTER=true MULTIZONE=1 KUBERNETES_PROVIDER=aws KUBE_AWS_ZONE=us-west-2b NUM_NODES=3 KUBE_SUBNET_CIDR=172.20.1.0/24 MASTER_INTERNAL_IP=172.20.0.9 kubernetes/cluster/kube-up.sh
```
View the nodes again; 3 more nodes should have launched and be tagged
in us-central1-b:
```shell
> kubectl get nodes --show-labels
NAME STATUS AGE LABELS
kubernetes-master Ready,SchedulingDisabled 16m beta.kubernetes.io/instance-type=n1-standard-1,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-a,kubernetes.io/hostname=kubernetes-master
kubernetes-minion-281d Ready 2m beta.kubernetes.io/instance-type=n1-standard-2,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-b,kubernetes.io/hostname=kubernetes-minion-281d
kubernetes-minion-87j9 Ready 16m beta.kubernetes.io/instance-type=n1-standard-2,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-a,kubernetes.io/hostname=kubernetes-minion-87j9
kubernetes-minion-9vlv Ready 16m beta.kubernetes.io/instance-type=n1-standard-2,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-a,kubernetes.io/hostname=kubernetes-minion-9vlv
kubernetes-minion-a12q Ready 17m beta.kubernetes.io/instance-type=n1-standard-2,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-a,kubernetes.io/hostname=kubernetes-minion-a12q
kubernetes-minion-pp2f Ready 2m beta.kubernetes.io/instance-type=n1-standard-2,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-b,kubernetes.io/hostname=kubernetes-minion-pp2f
kubernetes-minion-wf8i Ready 2m beta.kubernetes.io/instance-type=n1-standard-2,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-b,kubernetes.io/hostname=kubernetes-minion-wf8i
```
### Volume affinity
Create a volume (only PersistentVolumes are supported for zone
affinity), using the new dynamic volume creation:
```json
kubectl create -f - <<EOF
{
  "kind": "PersistentVolumeClaim",
  "apiVersion": "v1",
  "metadata": {
    "name": "claim1",
    "annotations": {
      "volume.alpha.kubernetes.io/storage-class": "foo"
    }
  },
  "spec": {
    "accessModes": [
      "ReadWriteOnce"
    ],
    "resources": {
      "requests": {
        "storage": "5Gi"
      }
    }
  }
}
EOF
```
The PV is also labeled with the zone & region it was created in. For
version 1.2, dynamic persistent volumes are always created in the zone
of the cluster master (here us-central1-a / us-west-2a); this will
be improved in a future version (issue [#23330](https://github.com/kubernetes/kubernetes/issues/23330)).
```shell
> kubectl get pv --show-labels
NAME CAPACITY ACCESSMODES STATUS CLAIM REASON AGE LABELS
pv-gce-mj4gm 5Gi RWO Bound default/claim1 46s failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-a
```
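You can also confirm that the claim created above is bound before using it in a pod (a quick check; `claim1` is the claim created earlier):
```shell
kubectl get pvc claim1
```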
Now we will create a pod that uses the persistent volume claim.
Because GCE PDs / AWS EBS volumes cannot be attached across zones,
this pod can only be created in the same zone as the volume:
```yaml
kubectl create -f - <<EOF
kind: Pod
apiVersion: v1
metadata:
  name: mypod
spec:
  containers:
    - name: myfrontend
      image: nginx
      volumeMounts:
      - mountPath: "/var/www/html"
        name: mypd
  volumes:
    - name: mypd
      persistentVolumeClaim:
        claimName: claim1
EOF
```
Note that the pod was automatically created in the same zone as the volume, as
cross-zone attachments are not generally permitted by cloud providers:
```shell
> kubectl describe pod mypod | grep Node
Node: kubernetes-minion-9vlv/10.240.0.5
> kubectl get node kubernetes-minion-9vlv --show-labels
NAME STATUS AGE LABELS
kubernetes-minion-9vlv Ready 22m beta.kubernetes.io/instance-type=n1-standard-2,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-a,kubernetes.io/hostname=kubernetes-minion-9vlv
```
### Pods are spread across zones
Pods in a replication controller or service are automatically spread
across zones. First, let's launch more nodes in a third zone:
GCE:
```shell
KUBE_USE_EXISTING_MASTER=true MULTIZONE=1 KUBERNETES_PROVIDER=gce KUBE_GCE_ZONE=us-central1-f NUM_NODES=3 kubernetes/cluster/kube-up.sh
```
AWS:
```shell
KUBE_USE_EXISTING_MASTER=true MULTIZONE=1 KUBERNETES_PROVIDER=aws KUBE_AWS_ZONE=us-west-2c NUM_NODES=3 KUBE_SUBNET_CIDR=172.20.2.0/24 MASTER_INTERNAL_IP=172.20.0.9 kubernetes/cluster/kube-up.sh
```
Verify that you now have nodes in 3 zones:
```shell
kubectl get nodes --show-labels
```
Create the guestbook-go example, which includes an RC of size 3, running a simple web app:
```shell
find kubernetes/examples/guestbook-go/ -name '*.json' | xargs -I {} kubectl create -f {}
```
The pods should be spread across all 3 zones:
```shell
> kubectl describe pod -l app=guestbook | grep Node
Node: kubernetes-minion-9vlv/10.240.0.5
Node: kubernetes-minion-281d/10.240.0.8
Node: kubernetes-minion-olsh/10.240.0.11
> kubectl get node kubernetes-minion-9vlv kubernetes-minion-281d kubernetes-minion-olsh --show-labels
NAME STATUS AGE LABELS
kubernetes-minion-9vlv Ready 34m beta.kubernetes.io/instance-type=n1-standard-2,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-a,kubernetes.io/hostname=kubernetes-minion-9vlv
kubernetes-minion-281d Ready 20m beta.kubernetes.io/instance-type=n1-standard-2,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-b,kubernetes.io/hostname=kubernetes-minion-281d
kubernetes-minion-olsh Ready 3m beta.kubernetes.io/instance-type=n1-standard-2,failure-domain.beta.kubernetes.io/region=us-central1,failure-domain.beta.kubernetes.io/zone=us-central1-f,kubernetes.io/hostname=kubernetes-minion-olsh
```
Load balancers span all zones in a cluster; the guestbook-go example
includes an example load-balanced service:
```shell
> kubectl describe service guestbook | grep LoadBalancer.Ingress
LoadBalancer Ingress: 130.211.126.21
> ip=130.211.126.21
> curl -s http://${ip}:3000/env | grep HOSTNAME
"HOSTNAME": "guestbook-44sep",
> (for i in `seq 20`; do curl -s http://${ip}:3000/env | grep HOSTNAME; done) | sort | uniq
"HOSTNAME": "guestbook-44sep",
"HOSTNAME": "guestbook-hum5n",
"HOSTNAME": "guestbook-ppm40",
```
The load balancer correctly targets all the pods, even though they are in multiple zones.
### Shutting down the cluster
When you're done, clean up:
GCE:
```shell
KUBERNETES_PROVIDER=gce KUBE_USE_EXISTING_MASTER=true KUBE_GCE_ZONE=us-central1-f kubernetes/cluster/kube-down.sh
KUBERNETES_PROVIDER=gce KUBE_USE_EXISTING_MASTER=true KUBE_GCE_ZONE=us-central1-b kubernetes/cluster/kube-down.sh
KUBERNETES_PROVIDER=gce KUBE_GCE_ZONE=us-central1-a kubernetes/cluster/kube-down.sh
```
AWS:
```shell
KUBERNETES_PROVIDER=aws KUBE_USE_EXISTING_MASTER=true KUBE_AWS_ZONE=us-west-2c kubernetes/cluster/kube-down.sh
KUBERNETES_PROVIDER=aws KUBE_USE_EXISTING_MASTER=true KUBE_AWS_ZONE=us-west-2b kubernetes/cluster/kube-down.sh
KUBERNETES_PROVIDER=aws KUBE_AWS_ZONE=us-west-2a kubernetes/cluster/kube-down.sh
```