Commit Graph

147 Commits (db2bff682b0293e2592c7a32a6ba093f5350daf9)

Author SHA1 Message Date
Philip O'Toole 878d7fc5f5 Update shard retention time when policy changes
Fixes issue #3702.
2015-08-19 12:42:05 -07:00
Jason Wilder 0e59568825 Change how cluster is started in tests
Instead of trying to start all the nodes with dynamic peer addresses
set, always start one, then join the rest to this one.  The SetPeers
in the test may be causing leadership changes and sporadic failures.
2015-08-14 13:17:38 -06:00
Jason Wilder 3d203885f0 Remove extraneous join error logging 2015-08-13 16:58:25 -06:00
Jason Wilder 22550dce54 Fix starting a cluster with self in the join urls
If you started a 3 node cluster and passed the same join URLs to
all three nodes, the first node started would not bootstrap correctly.
2015-08-13 16:20:34 -06:00
Jason Wilder 9b16353893 Shutdown raft transport and layer first
The raft.Shutdown() call was sometimes deadlocking if operations were
still being applied to the log.  Change the shutdown behavior to
match how consul shuts down raft:

e37b5ecb69/consul/server.go (L471-L477)
2015-08-13 15:31:31 -06:00
Jason Wilder 5796aec703 Fix race when closing local raft state 2015-08-13 10:02:05 -06:00
Jason Wilder 668181d275 Make log statements more consistent
* Capitalize first letter of message
* Log all services starting consistently
* Remove some extraneous log statements in meta.Store
* Log data dirs for meta, data and hinted handoff
2015-08-13 10:01:42 -06:00
Jason Wilder 29c6094a54 Log raft leader state changes
Makes it much easier to determine when a cluster is in a healthy state
as well as who the current leader is.
2015-08-13 10:01:42 -06:00
Jason Wilder 5280b20e66 Make host rename on single node more seamless
Renaming a host that is a raft peer member is pretty difficult but
we can special case single-node renames since we know all the members
in the cluster and we can update the peer store directly on all nodes
(just one).

Fixes #3632
2015-08-13 10:01:42 -06:00
Jason Wilder ffcca1ceff Cap auto-created retention policy replica count at 3
Defaulting to the number of nodes in the cluster doesn't make
sense with larger clusters (e.g. 10 nodes = RF 10).
2015-08-12 14:18:02 -06:00
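The capping rule described in the commit above can be sketched in a few lines of Go. This is only an illustration, not the code from the commit: the names `maxAutoCreatedReplicaN` and `autoReplicaN` are hypothetical, and the lower bound of 1 is an added assumption.

```go
package main

// maxAutoCreatedReplicaN is the cap named in the commit title.
// The constant name is hypothetical.
const maxAutoCreatedReplicaN = 3

// autoReplicaN returns the replication factor for an auto-created
// retention policy: the cluster size, capped at maxAutoCreatedReplicaN.
// Clamping to a minimum of 1 is an assumption made for this sketch.
func autoReplicaN(nodeCount int) int {
	if nodeCount < 1 {
		return 1
	}
	if nodeCount > maxAutoCreatedReplicaN {
		return maxAutoCreatedReplicaN
	}
	return nodeCount
}
```

With this rule a 10-node cluster gets RF 3 rather than RF 10, while a 2-node cluster still gets RF 2.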
Jason Wilder 17583f7c5d Remove [meta].peers config option
Adding a new peer must happen via the -join flag.
2015-08-12 13:01:27 -06:00
Jason Wilder 3b0b227d31 Wait for raft to close before meta store close returns
Fixes #3516
2015-08-05 15:41:39 -06:00
Jason Wilder e0b25c723d Code review fixes 2015-08-05 14:48:30 -06:00
Jason Wilder b5b8754904 Fix comments and whitespace issues 2015-08-05 14:17:26 -06:00
Jason Wilder d521b625ed Update show servers output to show address, not url
The addresses listed from show servers are not http endpoints so just
show the address without http:// prefix.
2015-08-05 14:17:26 -06:00
Jason Wilder 13052e60f2 Sync hostname to metastore after startup
If the -hostname flag is passed, the node will start up and be accessible from
remote nodes using the specified hostname.  At startup, we attempt to update
the hostname if it's different.  For data-only nodes, this is pretty straightforward.
For nodes that are part of the raft cluster, it is much more complicated as the cluster
must be up and stable (with a leader) for the update to take place.  The main
complication in this case is that the node starting up will have a different
hostname and will fail to take part in the raft cluster because the other nodes
do not have this new name in their raft peers lists.  Since this is very problematic
and can very easily break a cluster, this PR just aborts startup and alerts the operator that
some manual actions must be taken to update the raft peer on all raft members before
the hostname can be fully updated.

Fixes #3421
2015-08-05 14:17:26 -06:00
Jason Wilder 2b76dac479 Don't resolve hostname when creating node
Hostnames were always being resolved to an IP address and the IP
address was used as the host address and raft peer address.  There
was no way to use an actual hostname instead of an IP address.
2015-08-05 14:17:26 -06:00
Jason Wilder ce26a3097a Add UpdateNode raft command 2015-08-05 14:17:26 -06:00
Jason Wilder 90c85cb933 Fix restart single node
Restarting a single node would not bootstrap its raft state
2015-07-28 13:17:59 -06:00
Jason Wilder 95c98d1ab7 Fix data race in WaitForDataChanged 2015-07-28 09:40:25 -06:00
Jason Wilder f5d86b95b3 Add raft column to show servers statement
Reports whether the node is part of the raft consensus cluster or not.
2015-07-28 09:40:25 -06:00
Jason Wilder 06d8ff7c13 Use config.Peers when passing -join flag
Removes the two separate variables in the meta.Config.  -join will
now override the Peers var.
2015-07-28 09:40:25 -06:00
Jason Wilder 2938601e9e Add more meta store cluster tests
* Test add new nodes that become raft peers
* Test restarting a cluster w/ 3 raft nodes and 3 non-raft nodes
2015-07-28 09:40:25 -06:00
Jason Wilder c93e46d569 Support add new raft nodes
This change adds the first 3 nodes to the cluster as raft peers. Other
nodes are data-only.
2015-07-28 09:40:25 -06:00
Jason Wilder f5705aebe1 Rename raftState.openRaft to open 2015-07-28 09:40:25 -06:00
Jason Wilder 9dd66fa4ad Make meta RPC private 2015-07-23 10:21:25 -06:00
Jason Wilder e9044166d6 Invalidate raft member by fetching from leader 2015-07-23 10:21:25 -06:00
Jason Wilder 47b8de7ce8 Hide Meta.Join from config command using toml skip annotation 2015-07-23 10:21:25 -06:00
Jason Wilder eb7d18125e Fix race in test code 2015-07-23 10:21:25 -06:00
Jason Wilder 29011c5cf2 Code review fixes 2015-07-23 10:21:25 -06:00
Jason Wilder b78ac4bf15 Add RPC tests 2015-07-23 10:21:24 -06:00
Jason Wilder 84a8d7d24b Add cluster-tracing option to meta config
Useful for troubleshooting but too verbose for regular use.
2015-07-23 10:21:24 -06:00
Jason Wilder c1fc83e3d5 Make join private so it does not show up in config command 2015-07-23 10:21:24 -06:00
Jason Wilder 29b11a20a2 Support multiple comma-separated join addresses
Will try each once until one succeeds
2015-07-23 10:21:24 -06:00
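The "try each once until one succeeds" behavior in the commit above might look roughly like the following Go sketch. It is not taken from the commit: `joinFunc`, `joinAny`, and `errNoJoinAddrs` are hypothetical names, and the actual join call is passed in as a callback so the example stays self-contained.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errNoJoinAddrs is returned when the list contains no usable address.
var errNoJoinAddrs = errors.New("no join addresses given")

// joinFunc attempts to join the cluster through a single peer address.
// It is a stand-in for whatever RPC the real store performs.
type joinFunc func(addr string) error

// joinAny splits a comma-separated address list and tries each address
// once, in order. It returns the first address that accepted the join,
// or the last error seen if none did.
func joinAny(joinAddrs string, join joinFunc) (string, error) {
	lastErr := errNoJoinAddrs
	for _, addr := range strings.Split(joinAddrs, ",") {
		addr = strings.TrimSpace(addr)
		if addr == "" {
			continue // tolerate stray commas and whitespace
		}
		if err := join(addr); err != nil {
			lastErr = fmt.Errorf("join %s: %w", addr, err)
			continue
		}
		return addr, nil
	}
	return "", lastErr
}
```

Trying each address only once keeps startup bounded; retrying with backoff would be a separate design choice that this sketch does not attempt.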
Jason Wilder 85db9c46e8 Move remaining raft impl details to local raft state 2015-07-23 10:21:24 -06:00
Jason Wilder 790733daad Move snapshot to raft state 2015-07-23 10:21:24 -06:00
Jason Wilder 54e116507f Move apply to raft state 2015-07-23 10:21:24 -06:00
Jason Wilder a9314d6bb7 Move raft index to raft state 2015-07-23 10:21:24 -06:00
Jason Wilder 17a9bb041b Remove raftEnabled func
Not needed since it was just used as a safeguard for seeing if we
are the leader.
2015-07-23 10:21:24 -06:00
Jason Wilder 72e2e1a6f2 Move addPeer to raft state 2015-07-23 10:21:24 -06:00
Jason Wilder 80248f9b53 Remove leaderCh
Not used
2015-07-23 10:21:24 -06:00
Jason Wilder b86fecfd80 Move setPeers to raft state 2015-07-23 10:21:24 -06:00
Jason Wilder 9e4339753f Move leaderCh() to raft state 2015-07-23 10:21:23 -06:00
Jason Wilder 33730da32b Move isLeader to raft state 2015-07-23 10:21:23 -06:00
Jason Wilder fb8a4db74f Move raft closing to localRaft state 2015-07-23 10:21:23 -06:00
Jason Wilder 5ea8342892 Move raft state to separate file
store.go is getting big.
2015-07-23 10:21:23 -06:00
Jason Wilder f3fcfebf83 Make raftState interface private 2015-07-23 10:21:23 -06:00
Jason Wilder a7fa5eb634 Propagate metadata changes from raft nodes to non-raft nodes
Non-raft nodes need to be notified when the metastore changes. For
example, a database could be dropped on node 1 (non-raft) and node 2
would not know.  Since queries for that database would not be a cache
miss, node 2 would not get updated.

To propagate changes to non-raft nodes, each non-raft node maintains
a blocking connection to a raft node that blocks until a metadata
change occurs.  When the change is triggered, the updated metadata
is returned to the client and the client idempotently updates its local
cache.  It then reconnects and waits for another change.  This is
similar to watches in zookeeper or etcd.  Since the blocking request is
always recreated, it also serves as a polling mechanism that will retry
another raft member if the current connection is lost.
2015-07-23 10:21:23 -06:00
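The blocking-wait pattern described in the commit above can be illustrated in-process with a channel that is closed on every change. This is a minimal sketch under stated assumptions, not the commit's implementation: `metaStore`, `localCache`, and all method names are hypothetical, and a real non-raft node would make the blocking call over RPC to a raft member rather than on a local struct.

```go
package main

import "sync"

// metaStore is a hypothetical stand-in for the metadata held by a raft node.
type metaStore struct {
	mu      sync.Mutex
	index   uint64        // incremented on every metadata change
	changed chan struct{} // closed and replaced on every change
}

func newMetaStore() *metaStore {
	return &metaStore{changed: make(chan struct{})}
}

// update records a metadata change and wakes every blocked waiter.
func (s *metaStore) update() {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.index++
	close(s.changed)
	s.changed = make(chan struct{})
}

// waitForChange blocks until the store's index is greater than seen,
// then returns the new index. A caller that is already behind returns
// immediately, so a change cannot be missed between two calls.
func (s *metaStore) waitForChange(seen uint64) uint64 {
	for {
		s.mu.Lock()
		idx, ch := s.index, s.changed
		s.mu.Unlock()
		if idx > seen {
			return idx
		}
		<-ch
	}
}

// localCache is the non-raft node's copy of the metadata version.
type localCache struct {
	index uint64
}

// apply is idempotent: an update at or below the cached index is ignored.
// It reports whether the cache actually changed.
func (c *localCache) apply(idx uint64) bool {
	if idx <= c.index {
		return false
	}
	c.index = idx
	return true
}
```

A client loop would call `waitForChange(cache.index)`, pass the result to `apply`, and call `waitForChange` again, which mirrors the "reconnect and wait for another change" step in the commit message.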
Jason Wilder ad8948b4a6 Fix up rpc error handling
Some errors would not be returned to the client because something
failed before we could create the appropriate response.  For these
cases, a general error response is returned.
2015-07-23 10:21:23 -06:00
Jason Wilder 5486d3e216 Move invalidate to raft state
The behavior is different depending on the state.  Local raft just
waits a second (?) and remote raft fetches the meta data from the leader.
2015-07-23 10:21:23 -06:00