121 Commits (856546820db6ad846b705132107390b1e85627f2)

Author SHA1 Message Date
Ben Johnson 00ce4a504e Wait for quorum write before returning from Log.Apply().
This commit ensures a commit is written to a quorum before returning
from Log.Apply().
2015-05-13 16:05:26 -06:00
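The quorum wait described in the commit above can be illustrated with a minimal Go sketch. It is not the code from raft.Log.Apply(): the names quorum and waitQuorum and the use of a channel of acknowledgements are assumptions made for illustration. Only the idea (block until a majority of nodes have written the entry) comes from the commit message.

```go
package main

import "fmt"

// quorum returns the size of a strict majority in a cluster of n nodes.
func quorum(n int) int { return n/2 + 1 }

// waitQuorum is a hypothetical helper: it blocks until a majority of the
// n nodes have acknowledged a write, or returns false if the acks channel
// is closed before a majority is reached.
func waitQuorum(acks <-chan struct{}, n int) bool {
	need := quorum(n)
	got := 0
	for range acks {
		got++
		if got >= need {
			return true
		}
	}
	return false
}

func main() {
	// Three-node cluster: two acknowledgements form a quorum.
	acks := make(chan struct{}, 3)
	acks <- struct{}{}
	acks <- struct{}{}
	close(acks)
	fmt.Println(waitQuorum(acks, 3))
}
```

A caller in the position of Apply() would return to its client only after a wait like this reports success.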
Philip O'Toole c47585b890 Server Broker diagnostics over HTTP 2015-05-08 16:53:14 -07:00
Philip O'Toole a2792e353c Add broker and topic diagnostic types 2015-05-08 16:53:14 -07:00
Philip O'Toole b038afa049 Don't truncate topic data unless fully replicated
Fix issue #2521
2015-05-08 12:41:56 -07:00
Philip O'Toole 2320241ada Merge pull request #2433 from influxdb/peer_shard
Peer shard replication -- broker side
2015-04-30 16:32:41 -07:00
Philip O'Toole cc3f8f1c90 Integrate PR #2433 review comments 2015-04-30 16:15:26 -07:00
Philip O'Toole f851f896db Lock the client when randomizing its broker URL 2015-04-30 13:40:41 -07:00
Philip O'Toole 6460d52440 Return list of data nodes with topic data
This is done using a custom HTTP header "X-Broker-Truncated".
2015-04-29 13:18:22 -07:00
Philip O'Toole 9635811efa Unit test TopicReader and truncation 2015-04-29 13:18:22 -07:00
Philip O'Toole 8ab44301b9 Return redirect to client if topic is truncated 2015-04-29 13:18:22 -07:00
Philip O'Toole 6eefb9ffdc Provide URLs as nodes for truncated topic data 2015-04-29 13:18:22 -07:00
Jason Wilder a3013009aa Fix data race in client setConfig()/randomizeURL() 2015-04-27 23:12:14 -07:00
Jason Wilder 6b36e419fd Use read locks for Topic 2015-04-27 22:26:04 -07:00
Jason Wilder fc90719261 Use read lock for messaging connections 2015-04-27 22:21:49 -07:00
Jason Wilder b8f7c24413 Use a read lock for retrieving client urls 2015-04-27 22:18:23 -07:00
Jason Wilder 026ee0f7c3 Use a read lock for messaging client URL 2015-04-27 22:15:50 -07:00
Philip O'Toole 72fefcd2fa Enhance topic truncation logging 2015-04-27 12:36:02 -07:00
Philip O'Toole 812a73cd11 RLock broker during topic truncation 2015-04-24 21:55:16 -07:00
Philip O'Toole 5a37e049b9 Merge pull request #2407 from influxdb/broker_truncation
Topic reports whether it has ever been truncated
2015-04-23 13:07:54 -07:00
Philip O'Toole 6a16f2d5a9 Topic reports whether it has ever been truncated
This function is needed by Brokers to decide if shard requests by data
nodes should be redirected to peers.
2015-04-23 12:49:11 -07:00
Jason Wilder 38628e540b Make drop database close and release resources
Drop database did not close any open shard files or close
any topic reader/heartbeats. In the tests, we create and drop new
databases during each test run, so these open files and connections
slowed things down and consumed a lot of RAM as the tests progressed.
2015-04-21 13:39:58 -06:00
Ben Johnson b54f81fcac Fix stream writer flushing to be thread safe.
This change fixes the raft streams so that Flush() is not called
asynchronously while the snapshot is being written.
2015-04-15 12:07:03 -06:00
Philip O'Toole dab100a3b1 Update sample config with topic truncation 2015-04-14 16:35:44 -07:00
Philip O'Toole 2002e64f20 Code review changes 2015-04-14 16:35:41 -07:00
Philip O'Toole f591150699 Update CHANGELOG 2015-04-14 15:56:43 -07:00
Philip O'Toole e9ce592e97 Create tombstone when topic is truncated
If a data node requests a topic index that is earlier than is present for
a topic, tombstones allow the broker to know that the data node should
be redirected to another node that has the topic's data already
replicated. If no tombstone exists, then the broker can simply restart
replaying the topic data it has.
2015-04-14 15:56:39 -07:00
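The tombstone rule in the commit above reduces to a three-way decision. The following Go sketch restates it under assumptions: the action type, its constants, and the decide function are hypothetical names, and the broker's real data structures are not shown in this log.

```go
package main

import "fmt"

// action is a hypothetical enumeration of what a broker could do with a
// data node's request for a topic index.
type action int

const (
	serveFromRequested action = iota // the requested index is still present
	redirectToPeer                   // data was truncated; a peer has it
	replayFromEarliest               // nothing was truncated; replay what exists
)

// decide applies the rule from the commit message: a request for an index
// earlier than the earliest one present is redirected only when a
// tombstone records that the topic was truncated.
func decide(requested, earliest uint64, tombstone bool) action {
	switch {
	case requested >= earliest:
		return serveFromRequested
	case tombstone:
		return redirectToPeer
	default:
		return replayFromEarliest
	}
}

func main() {
	fmt.Println(decide(5, 100, true) == redirectToPeer)
	fmt.Println(decide(5, 100, false) == replayFromEarliest)
}
```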
Philip O'Toole f6768e500f Don't delete unreplicated segments
A segment must now have been replicated by at least 1 node before it can
be deleted.
2015-04-14 15:15:27 -07:00
Philip O'Toole c9ea5f13de Correct out-of-date comment 2015-04-14 15:15:26 -07:00
Philip O'Toole b0ee5d0a78 Add broker truncation
Unit test same.
2015-04-14 15:15:26 -07:00
Philip O'Toole 9dbef70ff8 Fix typo in comments 2015-04-14 15:15:26 -07:00
Philip O'Toole 8470ee27b6 Make MaxTopicSize and MaxSegmentSize configurable 2015-04-14 15:15:26 -07:00
Ben Johnson 3404386a02 Merge pull request #2236 from influxdb/term-signal
Term signal
2015-04-10 17:02:13 -06:00
Ben Johnson eaf4bfca0a Fix term signal.
This commit changes raft so that term changes are made immediately and
term change signals are made afterward. Previously, election timeouts
were invalidated by incoming term changes which caused an election loop.

Stale term was also fixed and http/pprof was added too.
2015-04-10 13:52:20 -06:00
Jason Wilder a5e180ca31 Merge pull request #2229 from influxdb/jw-run
Close resources when stopping a node
2015-04-09 22:10:50 -06:00
Jason Wilder c22909f984 Shutdown idle http connections when closing client
When closing a node (and client) in the tests, idle connections
remain even after the client is closed.
2015-04-09 20:51:46 -06:00
Jason Wilder cb51ace768 Avoid spurious logging when stopping streamer
If the client is closed, we'd always log a reconnect error even
when trying to close.
2015-04-09 20:45:02 -06:00
Ben Johnson 47fa90fae2 Remove zero-length data panic from messaging.Conn.
This check is no longer necessary as the checksum added
to the messaging.Message will catch any data errors.
2015-04-09 10:53:43 -06:00
Ben Johnson cc83f2c39b Add checksum to message encoding
This commit changes the binary format of messaging.Message to encode
a 4-byte checksum at the beginning of it. This is used when reading
data back out to verify that it is not corrupt.

Corrupted messages are truncated on recovery so the broker can
restart from the previous message.
2015-04-07 14:24:22 -06:00
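A checksum-prefixed encoding like the one described above can be sketched in a few lines of Go. The commit does not say which checksum is used or how the rest of messaging.Message is laid out, so CRC-32 (IEEE), big-endian byte order, and the encode/decode helpers below are assumptions for illustration only.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"hash/crc32"
)

// ErrChecksum is a hypothetical error value for this sketch.
var ErrChecksum = errors.New("checksum mismatch")

// encode prepends a 4-byte checksum of data to data.
func encode(data []byte) []byte {
	buf := make([]byte, 4+len(data))
	binary.BigEndian.PutUint32(buf[:4], crc32.ChecksumIEEE(data))
	copy(buf[4:], data)
	return buf
}

// decode verifies the leading checksum and returns the payload.
func decode(buf []byte) ([]byte, error) {
	if len(buf) < 4 {
		return nil, errors.New("short buffer")
	}
	want := binary.BigEndian.Uint32(buf[:4])
	data := buf[4:]
	if crc32.ChecksumIEEE(data) != want {
		return nil, ErrChecksum
	}
	return data, nil
}

func main() {
	buf := encode([]byte("hello"))
	buf[len(buf)-1] ^= 0xFF // corrupt the last payload byte
	_, err := decode(buf)
	fmt.Println(err) // prints "checksum mismatch"
}
```

On recovery, a reader that hits ErrChecksum would stop at the previous good message, which matches the truncation behavior the commit describes.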
Ben Johnson 7aad1a5820 Remove dead code in messaging. 2015-04-07 12:36:07 -06:00
Jason Wilder 0f1fb3a5c3 Fix race on Topic.indexByUrl
applySetTopicMaxIndex() was updating the topics.indexByUrl w/o locking it.

WARNING: DATA RACE
Write by goroutine 1365:
  runtime.mapassign1()
      /usr/local/go/src/runtime/hashmap.go:376 +0x0
  github.com/influxdb/influxdb/messaging.(*Broker).applySetTopicMaxIndex()
      /home/ubuntu/.go_project/src/github.com/influxdb/influxdb/messaging/broker.go:496 +0x198
  github.com/influxdb/influxdb/messaging.(*Broker).Apply()
      /home/ubuntu/.go_project/src/github.com/influxdb/influxdb/messaging/broker.go:542 +0x33a
  github.com/influxdb/influxdb.(*Broker).Apply()
      <autogenerated>:1 +0x78
  github.com/influxdb/influxdb/messaging.(*RaftFSM).Apply()
      /home/ubuntu/.go_project/src/github.com/influxdb/influxdb/messaging/broker.go:614 +0x24f
  github.com/influxdb/influxdb/raft.(*Log).applyNextUnappliedEntry()
      /home/ubuntu/.go_project/src/github.com/influxdb/influxdb/raft/log.go:1431 +0x75c
  github.com/influxdb/influxdb/raft.(*Log).applier()
      /home/ubuntu/.go_project/src/github.com/influxdb/influxdb/raft/log.go:1369 +0x18f

Previous read by goroutine 1540:
  runtime.mapiterinit()
      /usr/local/go/src/runtime/hashmap.go:535 +0x0
  github.com/influxdb/influxdb/messaging.(*Topic).DataURLs()
      /home/ubuntu/.go_project/src/github.com/influxdb/influxdb/messaging/broker.go:681 +0x11d
  github.com/influxdb/influxdb/cmd/influxd.(*Handler).serveMetadata()
      /home/ubuntu/.go_project/src/github.com/influxdb/influxdb/cmd/influxd/handler.go:95 +0x3fd
  github.com/influxdb/influxdb/cmd/influxd.(*Handler).ServeHTTP()
      /home/ubuntu/.go_project/src/github.com/influxdb/influxdb/cmd/influxd/handler.go:45 +0x540
  net/http.serverHandler.ServeHTTP()
      /usr/local/go/src/net/http/server.go:1703 +0x1f6
  net/http.(*conn).serve()
      /usr/local/go/src/net/http/server.go:1204 +0x1087
2015-04-06 21:16:19 -06:00
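The race report above shows a map written by one goroutine while another iterates over it. A minimal sketch of the usual remedy follows, assuming a sync.RWMutex guards the map. The topic type and its methods here are hypothetical stand-ins, not the real messaging.Topic.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// topic is a hypothetical stand-in for a type that owns an index-by-URL map.
type topic struct {
	mu         sync.RWMutex
	indexByURL map[string]uint64
}

func newTopic() *topic {
	return &topic{indexByURL: make(map[string]uint64)}
}

// setMaxIndex takes the write lock before mutating the map.
func (t *topic) setMaxIndex(url string, index uint64) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.indexByURL[url] = index
}

// dataURLs takes the read lock, so concurrent readers do not block each
// other but never overlap with a writer.
func (t *topic) dataURLs() []string {
	t.mu.RLock()
	defer t.mu.RUnlock()
	urls := make([]string, 0, len(t.indexByURL))
	for u := range t.indexByURL {
		urls = append(urls, u)
	}
	sort.Strings(urls)
	return urls
}

func main() {
	t := newTopic()
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			t.setMaxIndex(fmt.Sprintf("http://node%d", i), uint64(i))
			_ = t.dataURLs()
		}(i)
	}
	wg.Wait()
	fmt.Println(len(t.dataURLs())) // prints 4
}
```

Running a program like this with `go run -race` is one way to check that the detector no longer reports the write/read pair.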
Ben Johnson fd96e245cb Merge branch 'master' of https://github.com/influxdb/influxdb into broker-recovery
Conflicts:
	CHANGELOG.md
2015-04-04 08:09:06 -06:00
Ben Johnson ba3026f400 Broker log recovery.
This pull request adds recovery to the messaging.Topic when opening. If
any partial messages are found then the file is truncated at that point
and started from there. This can occur when ungracefully shutting down
a server. It can leave half written messages at the end of segments.
2015-04-04 08:06:35 -06:00
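The recovery step described above (find the first partial message and truncate there) can be sketched as follows. The on-disk format is not given in this log, so the 4-byte length prefix and the frame and validLen helpers are assumptions. A real implementation would apply the resulting offset to the segment file, for example with os.Truncate.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// frame prepends a 4-byte big-endian length to payload (hypothetical framing).
func frame(payload []byte) []byte {
	b := make([]byte, 4+len(payload))
	binary.BigEndian.PutUint32(b, uint32(len(payload)))
	copy(b[4:], payload)
	return b
}

// validLen returns the byte offset just past the last complete message in
// seg. Anything after that offset is a partial write and can be cut off.
func validLen(seg []byte) int {
	off := 0
	for {
		if len(seg)-off < 4 {
			return off // not enough bytes left for a length header
		}
		n := int(binary.BigEndian.Uint32(seg[off : off+4]))
		if len(seg)-off-4 < n {
			return off // header present but payload is incomplete
		}
		off += 4 + n
	}
}

func main() {
	seg := append(frame([]byte("a")), frame([]byte("bc"))...)
	seg = append(seg, frame([]byte("partial"))[:6]...) // simulate a torn write
	n := validLen(seg)
	seg = seg[:n] // in-memory equivalent of truncating the file
	fmt.Println(n) // prints 11
}
```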
Jason Wilder 9ee0f6445e Fix broker connect race at startup
When a data node starts up, the broker URLs were not set before
they were actually being used.  The call to client.Open() in
turn triggers the raft streamer and heartbeat which try to connect
to the broker.   If those started before the subsequent client.SetURLs()
call, you would see the following error in the logs at startup:

[messaging] 2015/04/01 11:59:22 reconnecting to broker: url={  <nil>  /messaging/messages index=2&streaming=true&topicID=0 }, err=Get /messaging/messages?index=2&streaming=true&topicID=0: unsupported protocol scheme ""

Fixing this race uncovered another bug where the join urls would be
cleared the first time the broker was started.  In this case, the
join urls should be left alone since they were set properly w/ SetURLs.

Fixes #2152
2015-04-03 21:04:42 -06:00
Jason Wilder 6d4c7e9cd5 Handle broker and data node endpoints regardless of role
This is a pre-requisite for #1934.  When running separate
broker and data nodes, you currently need to know what role
a host is performing.  This complicates cluster setup in
that you must configure separate broker URLs and data node
URLs.

This change allows a broker-only node to redirect data node endpoints
to a valid data node and a data-only node to redirect broker
endpoints to a valid broker.
2015-04-03 21:00:43 -06:00
Todd Persen 82bf75c691 Merge pull request #2150 from runner-mei/connection_refused
Make InfluxDB win32 friendly: fix unit tests that fail on Windows with connection refused.
2015-04-03 16:31:51 -07:00
runner.mei e129c53f64 Fix unit test that failed on Windows 2015-04-03 13:50:28 +08:00
runner.mei 6a7cb61f6d Fix unit test that failed with connection refused on win32 2015-04-03 13:48:52 +08:00
Jason Wilder 64f6900ce2 Change 1000ms to 1s 2015-04-02 11:27:59 -06:00
Jason Wilder 91fb7e3756 Track data node urls on brokers
This sends data node urls via the broker heartbeat from each data
node.  The urls are tracked on the broker to support simpler
cluster setup as well as distributed queries.
2015-04-02 11:27:53 -06:00
Ben Johnson 057309fc8e Simplify raft snapshotting, entry apply. 2015-03-26 20:32:39 -06:00