416 Commits (ee01a5f66ffba7115939b72ac70530f387e51524)

Author SHA1 Message Date
Cory LaNou b922cdcb76 shard stat created vs opened 2015-04-22 17:20:16 -06:00
Cory LaNou 9fee13ce41 format show stats properly and add more shard stats 2015-04-22 17:14:37 -06:00
Philip O'Toole b2b60532f1 Merge pull request #2383 from influxdb/serve_shard
Add HTTP endpoint that serves a requested shard
2015-04-22 09:33:35 -07:00
Philip O'Toole 68ba7ba005 Merge pull request #2387 from influxdb/no_local_shard_stats
There are no stats for non-local shards
2015-04-22 09:30:49 -07:00
Philip O'Toole e75e6a9526 Add HTTP endpoint that serves a requested shard
With this change a datanode can stream the requested shard to the
client. An error is returned if the shard does not exist or the
shard is not local to that node.

A data node can hit this endpoint to request data for a given shard if
the data no longer resides on the broker.
2015-04-22 09:29:19 -07:00
Philip O'Toole 52f968fbc4 There are no stats for non-local shards 2015-04-22 08:47:08 -07:00
Jason Wilder efa87633fa Fix shard datanodes stats getting appended too many times 2015-04-21 23:48:12 -06:00
Philip O'Toole c855549973 Add shard path to first diag value
Fix issue #2369
2015-04-21 19:19:08 -07:00
Jason Wilder 38628e540b Make drop database close and release resources
Drop database did not close any open shard files or close
any topic readers/heartbeats.  In the tests, we create and drop new
databases during each test run, so these open files and connections
slowed things down and consumed a lot of RAM as the tests progressed.
2015-04-21 13:39:58 -06:00
Jason Wilder 90e3059a8b Fix processRawQuery from returning duplicate data 2015-04-21 13:39:58 -06:00
Philip O'Toole ec57f8c84f RLock shard during diagnostics 2015-04-20 14:03:55 -07:00
Philip O'Toole 16befaa834 SHOW DIAGNOSTICS must check if shards are local
Fix issue #2323.
2015-04-18 11:39:08 -07:00
David Norton a1790f2d0c fix #2337: panic if tag key isn't double quoted 2015-04-18 13:05:41 -04:00
Jason Wilder 8aa0d32b6f Add failover to other data nodes for distributed queries
Fixes #2190
2015-04-17 11:28:47 -06:00
Jason Wilder c52dfce897 Load balance distributed queries across data nodes
Adds a Balancer interface to allow RemoteMappers to send data node
requests to multiple nodes.  It also provides the ability for failed
requests to mark the data node as offline using exponential
backoff with a 5 min max wait time.

Fixes #2242
2015-04-17 11:28:47 -06:00
ben hockey dde380832a wire up drop CQ statements
fixes #2141
2015-04-15 11:14:30 -05:00
Philip O'Toole 37c42c9dd2 RLock server for SHOW RETENTION POLICIES 2015-04-14 17:13:35 -07:00
Ben Johnson c5bdb5af86 Fix cluster-wide restart issue. 2015-04-14 13:43:25 -06:00
Todd Persen eed570ea1b Fix merge conflict in CHANGELOG.md 2015-04-13 15:55:38 -07:00
Cory LaNou 3ab660fe28 fix error message from having invalid format \n 2015-04-13 15:56:49 -06:00
Jason Wilder e47ee66b07 Make top-level handler less brittle
Move the data node specific routes under a common /data prefix so
adding a new handler does not require updates to the top-level handler as well.
2015-04-13 15:38:42 -06:00
David Norton c94785780d fix #2251: fix panic when changing default RP 2015-04-12 13:04:10 -04:00
Philip O'Toole 9282a8ae6d Fix compilation errors after parser merge 2015-04-10 16:11:34 -07:00
Philip O'Toole bf1a8aa1e4 Use uint64 for Series IDs
Fixes issue #1649
2015-04-10 16:11:34 -07:00
Paul Dix 7661546a47 Finish up distributed queries. 2015-04-10 16:11:34 -07:00
Paul Dix d41b85a715 Remove the interval setting from NextInterval to make remote mappers work. 2015-04-10 16:11:34 -07:00
Paul Dix 113995032e WIP: Initial implementation of remote mapper for distributed queries. 2015-04-10 16:11:34 -07:00
Ben Johnson 3404386a02 Merge pull request #2236 from influxdb/term-signal
Term signal
2015-04-10 17:02:13 -06:00
Ben Johnson eaf4bfca0a Fix term signal.
This commit changes raft so that term changes are made immediately and
term change signals are made afterward. Previously, election timeouts
were invalidated by incoming term changes which caused an election loop.

Stale term was also fixed and http/pprof was added too.
2015-04-10 13:52:20 -06:00
Jason Wilder a5e180ca31 Merge pull request #2229 from influxdb/jw-run
Close resources when stopping a node
2015-04-09 22:10:50 -06:00
David Norton 25cea58635 refactor scanning & parsing of identifiers 2015-04-09 13:21:13 -04:00
Jason Wilder a12bd42330 Close the messaging client on a data node when closing
Just setting it to nil was leaking resources
2015-04-09 10:00:46 -06:00
Jason Wilder 94f9ad0624 Fix unable to join race
2015/04/08 22:27:01 no broker or server configured to handle messaging endpoints
2015/04/08 22:27:02 join: failed to connect data node: http://box296:9012: unable to join
2015/04/08 22:27:02 join: failed to connect data node to any specified server

There is a race when joining a data only node to a broker and another data only node between the
data node heartbeater and the join operation.  If the heartbeater
fires before the join attempt, it's possible for the booting data node
to be selected as the first data node for redirection by the broker.
The join attempt would request a data node endpoint on the broker "/data_nodes"
but since the broker cannot handle it, it would redirect to a valid broker.

During this race, the broker would redirect the request back to the same server.  If
this happens, the data node would get stuck and not be able to join because it's
still booting.

To work around this, the redirect is randomized and the join calls will not attempt
to call itself and instead re-request the original URL.  A better fix might be to
not start the heartbeater until after the datanode has joined or initialized.
2015-04-08 20:50:24 -06:00
Jason Wilder cad3f3c604 Set data node join retries higher
3 was fairly arbitrary and would cause errors such as:

2015/04/08 14:01:12 join: failed to connect data node: {http  <nil> influxdb.local:8191   }: unable to join
2015/04/08 14:01:12 join: failed to connect data node to any specified server

in the tests.  This can happen when the nodes are slow to start up. The limit is set
arbitrarily higher to avoid this error but still give up if it can't connect
after a minute.
2015-04-08 20:49:58 -06:00
Cory LaNou 2e6f28f4cd lock if you plan on writing 2015-04-08 17:30:46 -06:00
Cory LaNou 8e19d52359 Merge pull request #2200 from influxdb/renenable-cq
Re-enable Continuous Queries
2015-04-08 17:22:20 -06:00
Cory LaNou a67e88ceef fix nil writes to data for cq 2015-04-08 17:12:37 -06:00
Philip O'Toole 553c94e206 Merge pull request #2185 from influxdb/64_int_storage
Store Go memory stats as int64
2015-04-08 13:15:12 -07:00
Philip O'Toole 7258a4bc7c Lock server during Open()
Fix issue #2196
2015-04-08 09:18:40 -07:00
Paul Dix 5ed69589ea Merge pull request #2158 from n1tr0g/set_password
Add support for SET PASSWORD FOR user = 'PASSWORD'
2015-04-08 09:30:44 -04:00
Philip O'Toole 2755c261f5 Start move to 64-bit ints 2015-04-07 12:58:44 -07:00
Todd Persen 214b405c47 Merge pull request #2181 from influxdb/show_diags
Switch on "SHOW DIAGNOSTICS" statement
2015-04-07 11:18:22 -07:00
Philip O'Toole 6536beb549 Switch on "SHOW DIAGNOSTICS" statement
Fix issue #2179
2015-04-07 08:21:42 -07:00
Jason Wilder 9e109e847b Fix typos in comments 2015-04-06 21:39:18 -06:00
Jason Wilder 54821538d1 Ignore join urls if restarting a node
If a node is restarted and it had already joined the cluster,
ignore and log that the join urls are being ignored and existing
cluster state will be used.
2015-04-06 16:38:01 -06:00
Jason Wilder aa5696c10d Handle server unavailable response
When starting multiple servers concurrently, they can race to connect
to each other.  This change just has the join attempts retry to make
cluster setup easier.
2015-04-06 16:38:01 -06:00
Jason Wilder 8b5307f6e8 Remove all join URLs from config
This removes all join URLs from the config.  To join a node to a
cluster, the URL of another member of the cluster should be passed
on the command line w/ the -join flag.  The join URLs can now be
any node regardless of whether the node is a broker only or data
only node.  At join time, the receiving node will redirect the
request to a valid broker or data node if it cannot handle the request
itself.
2015-04-06 16:38:01 -06:00
Jason Wilder 60c66c8515 Add data node join URLs
To add a new data node, it currently needs a broker
and another data node to join.  Temporarily adding
a JoinURLs option to the Data node section so a
standalone data node can be created but the intent is
that this will be removed.

Ideally, the joinURL could point to either a data node
or a broker and it would get the required URLs from that
host but that is not possible currently.
2015-04-06 16:38:00 -06:00
Philip O'Toole 501b4ceedb Don't panic if presented with a field of unknown type
This can happen, though is very unlikely. If this node receives encoded
data, to be written to disk, and is queried for that data before its
metastore is updated, there will be no field mapping for the data during
decode. All this can happen because data is encoded by the node that first
received the write request, not the node that actually writes the data to
disk. So if this happens, skip the data.
2015-04-04 10:33:56 -07:00
Jari Sukanen 704691454d server: rename influxdb.Results type to influxdb.Response (issue: #2050)
Rename influxdb.Results to influxdb.Response as it already has Results
property itself. Renaming it to Response makes code look much less
ugly.
2015-04-04 12:17:33 +03:00