Non-raft nodes need to be notifified when the metastore changes. For
example, a database could be dropped on node 1 (non-raft) and node 2
would not know. Since queries for that database would not be a cache
miss, node 2 would not get updated.
To propogate changes to non-raft nodes, each non-raft node maintains
a blocking connection to a raft node that blocks until a metadata
change occurs. When the change is triggered, the updated metadata
is returned to the client and the client idempotently updates its local
cache. It then reconnects and waits for another change. This is
similar watches in zookeeper or etcd. Since the blocking request is
always recreated, it also serves as a polling mechanism that will retry
another raft member if the current connection is lost.
Nodes that are not part of the raft cluster will not reliably know who the
current raft cluster leader is. To make communication simpler, proxy all
rpc and raft calls to the current raft leader if a non-leader receives one.
This adds some basic ability to join a node to an existing cluster. It
uses a rpc layer to initiate a join request to an existing memeber. The
response indicates whether the joining node should take part in the raft
cluster and who it's peers should be. If raft should not be started, the
peers are the addresses of the current raft members that it should delegate
consensus operations.
To keep the meta store implementation agnostic of whether it's running
a local raft or not, a consensusStrategy type was also added.
This adds some basic plumbing to make remote procedure calls to other cluster
members. This first implementation allows a node to contact the raft leader
and fetch a copy of the meta data. This will be used by non-raft members to
pull down the latest metadata.
With this change, the query engine code gathers information about
shards and tagsets by working with individual shards, collating the
information, and returning that to the client. It does not assume that any
particular shard is local, and accesses all shards through abstracted
Mappers, of which there are two types -- a Mapper type for Raw queries
and a second type for Aggregate queries. There are corresponding
Executors for each type of Mapper, but both types of Executors share the
same interface.
The way it was, shard groups that were deleted by retention policy
enforcement were being recreated again, just to be deleted in the
next enforcement run. This change will help keep raft log free from
this unnecessary creation and deletion.
This commit sets a cluster ID when the first node is initialized.
The ID is generated on every CreateNodeCommand so that it can be
applied consistently in the state machine of every server.