* Do not use a cluster with two master replicas. Consensus on a two-replica cluster requires both replicas to be running whenever persistent state changes.
As a result, both replicas are needed, and the failure of either replica puts the cluster into a majority failure state.
In terms of HA, a two-replica cluster is thus inferior to a single-replica cluster.
* When you add a master replica, the cluster state (etcd) is copied to the new instance.
If the cluster is large, it may take a long time to duplicate its state.
This operation may be sped up by migrating the etcd data directory, as described in the [etcd admin guide](https://coreos.com/etcd/docs/latest/admin_guide.html#member-migration)
(we are considering adding support for etcd data directory migration in the future).
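
The warning about two replicas follows from majority-based consensus arithmetic. The following is a minimal sketch, assuming the usual strict-majority quorum rule; the function names are illustrative and not part of any Kubernetes or etcd API.

```python
def quorum_size(replicas: int) -> int:
    """Smallest number of members that forms a strict majority."""
    return replicas // 2 + 1


def tolerated_failures(replicas: int) -> int:
    """How many members can fail while a majority is still reachable."""
    return replicas - quorum_size(replicas)


for n in (1, 2, 3, 5):
    print(f"{n} replicas: quorum={quorum_size(n)}, "
          f"tolerated failures={tolerated_failures(n)}")
```

Under this rule, one replica and two replicas both tolerate zero failures, but with two replicas there are two machines whose failure blocks writes instead of one. Three replicas is the smallest size that survives the loss of a member.
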
## Implementation notes
![ha-master-gce](/images/docs/ha-master-gce.png)
### Overview
Each master replica will run the following components in the following modes:
* etcd instance: all instances will be clustered together using consensus;
* API server: each server will talk to its local etcd instance, and all API servers in the cluster will be available;
* controllers, scheduler, and cluster autoscaler: these will use a lease mechanism, so only one instance of each will be active in the cluster;
* add-on manager: each manager will work independently, trying to keep add-ons in sync.
In addition, there will be a load balancer in front of the API servers that will route external and internal traffic to them.
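
To illustrate how a lease keeps only one instance of a component active, here is a simplified, single-process sketch. The `Lease` class, the `try_acquire` function, the replica names, and the 15-second duration are all hypothetical; a real implementation stores the lease in shared state and updates it atomically, which this sketch does not attempt.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Lease:
    """Hypothetical stand-in for the shared lease record."""
    holder: Optional[str] = None
    expires_at: float = 0.0


def try_acquire(lease: Lease, candidate: str, now: float,
                duration: float = 15.0) -> bool:
    """Acquire or renew the lease; True means `candidate` is the active instance."""
    if lease.holder in (None, candidate) or now >= lease.expires_at:
        lease.holder = candidate
        lease.expires_at = now + duration
        return True
    return False


lease = Lease()
print(try_acquire(lease, "master-a", now=0.0))   # master-a becomes active
print(try_acquire(lease, "master-b", now=5.0))   # lease still held, master-b waits
print(try_acquire(lease, "master-b", now=20.0))  # lease expired, master-b takes over
```

The active instance must keep renewing before the lease expires. If it stops, for example because its replica fails, a standby instance on another replica acquires the lease and becomes active.
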
### Load balancing
When the second master replica is started, a load balancer containing the two replicas will be created
and the IP address of the first replica will be promoted to the IP address of the load balancer.
Similarly, after removal of the penultimate master replica, the load balancer will be removed and its IP address will be assigned to the last remaining replica.
Note that creating and removing a load balancer are complex operations, and it may take some time (~20 minutes) for them to propagate.
### Master service & kubelets
Instead of trying to keep an up-to-date list of Kubernetes API servers in the Kubernetes service,
the system directs all traffic to the external IP:
* in a single-master cluster, the IP points to the single master,
* in a multi-master cluster, the IP points to the load balancer in front of the masters.
Similarly, kubelets will use the external IP to communicate with the master.
### Master certificates
Kubernetes generates master TLS certificates for the external public IP and the local IP of each replica.
There are no certificates for the ephemeral public IPs of replicas;
to access a replica via its ephemeral public IP, you must skip TLS verification.
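
As an illustration of what skipping TLS verification means for a client, here is a minimal sketch using Python's standard `ssl` module. The `context_for` helper and the addresses in the usage note are hypothetical (the addresses come from documentation-only ranges); only the `ssl` calls are real library API.

```python
import ssl


def context_for(address: str, covered_addresses: set) -> ssl.SSLContext:
    """Return a verifying context unless no certificate covers the address."""
    ctx = ssl.create_default_context()
    if address not in covered_addresses:
        # No certificate was issued for this address, so verification
        # would fail. Disabling it removes protection against impersonation.
        ctx.check_hostname = False  # must be turned off before CERT_NONE
        ctx.verify_mode = ssl.CERT_NONE
    return ctx
```

For example, `context_for("198.51.100.7", {"203.0.113.10", "10.0.0.2"})` returns a context with verification disabled, while a covered address such as `"203.0.113.10"` keeps the default verifying context. Prefer the external or local IP whenever possible so that verification stays on.
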
### Clustering etcd
To allow etcd clustering, the ports needed for communication between etcd instances will be opened (for communication inside the cluster only).
To make such a deployment secure, communication between etcd instances is authorized using SSL.
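
For orientation, the sketch below builds the command-line arguments that a TLS-secured etcd member typically needs for peer communication. It is not the configuration that the Kubernetes scripts generate: the helper name, member names, IP addresses, and certificate paths are assumptions, and port 2380 is etcd's default peer port rather than something this document specifies. The flags themselves are standard etcd options.

```python
def etcd_peer_args(name: str, peers: dict, cert_dir: str = "/etc/etcd/pki") -> list:
    """Build etcd arguments for one member of a TLS-secured static cluster."""
    # Every member must be given the same list of name=peer-URL pairs.
    initial_cluster = ",".join(
        f"{n}=https://{ip}:2380" for n, ip in sorted(peers.items())
    )
    return [
        "etcd",
        f"--name={name}",
        f"--listen-peer-urls=https://{peers[name]}:2380",
        f"--initial-advertise-peer-urls=https://{peers[name]}:2380",
        f"--initial-cluster={initial_cluster}",
        # Certificate and key this member presents to its peers (hypothetical paths).
        f"--peer-cert-file={cert_dir}/peer.crt",
        f"--peer-key-file={cert_dir}/peer.key",
        # CA used to verify peers; reject peers without a valid certificate.
        f"--peer-trusted-ca-file={cert_dir}/ca.crt",
        "--peer-client-cert-auth",
    ]


members = {"master-a": "10.0.0.2", "master-b": "10.0.0.3", "master-c": "10.0.0.4"}
print(" ".join(etcd_peer_args("master-a", members)))
```

With peer client certificate authentication enabled, an instance that cannot present a certificate signed by the trusted CA is refused, even though the peer port is reachable inside the cluster.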