Philip O'Toole
878663e1e3
Periodic upload of stats to Enterprise
2015-10-19 15:25:07 -07:00
Philip O'Toole
667ad3342a
Refactor registration as a service
...
Registration also involves statistics and diagnostics upload, for the
purposes of remote management. This means there will be long-running
goroutines in effect. Therefore move the code to a service model.
2015-10-19 15:01:14 -07:00
Philip O'Toole
ef72c3c64d
Fix typo in retention service comment
...
[ci skip]
2015-10-19 14:24:25 -07:00
Philip O'Toole
ff18bf7213
Make Open() and Close() on Graphite sync'ed
...
This will ensure that these operations don't run concurrently. This
change also ensures nil batchers are not closed.
Fixes issue #4494.
2015-10-19 11:13:31 -07:00
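A minimal sketch of the idea in ff18bf7213, assuming a single mutex guards both operations. The `Service` and `batcher` types below are hypothetical stand-ins, not the actual Graphite service code.

```go
package main

import (
	"fmt"
	"sync"
)

// batcher is a hypothetical stand-in for the service's point batcher.
type batcher struct{ stopped bool }

func (b *batcher) Stop() { b.stopped = true }

// Service is an illustrative type, not the real Graphite service.
type Service struct {
	mu      sync.Mutex
	batcher *batcher
	open    bool
}

// Open holds the mutex for its whole duration, so Close cannot interleave.
func (s *Service) Open() error {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.batcher = &batcher{}
	s.open = true
	return nil
}

// Close takes the same mutex and only stops a batcher that exists.
func (s *Service) Close() error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.batcher != nil {
		s.batcher.Stop()
	}
	s.open = false
	return nil
}

func main() {
	var s Service
	// Closing a service that was never opened must not panic.
	fmt.Println(s.Close())
	fmt.Println(s.Open(), s.Close())
}
```

Because both methods take the same lock, a Close issued while Open is still running waits until Open has finished.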
David Norton
e73a8e423c
fix #4472: too many points in the GROUP BY interval
2015-10-16 07:17:14 -04:00
Nathaniel Cook
cb1aaa8e42
Merge pull request #4375 from influxdb/subscriptions
...
Feature add subscriber service for creating/dropping subscriptions
2015-10-15 09:17:26 -06:00
Philip O'Toole
485c446e98
Correct typos in UDP README
...
[ci skip]
2015-10-15 07:48:34 -07:00
Sean Beckett
82f104a8b1
Merge pull request #4436 from influxdb/tag-names-to-keys
...
WIP tag name --> tag key, field name --> field key
2015-10-14 16:02:46 -07:00
Nathaniel Cook
8b31007aa7
Adds subscriber service for creating/dropping subscriptions to the
...
InfluxDB data stream.
2015-10-14 15:23:45 -06:00
Philip O'Toole
25f957c5c6
Only call Stop on non-nil batchers
2015-10-14 08:55:06 -07:00
Philip O'Toole
a938cd3dee
openTSDB Open should complete before Close runs
2015-10-14 08:55:06 -07:00
Philip O'Toole
3907656cc2
Add README for UDP service
...
Fixes issue #4041.
2015-10-14 08:30:10 -07:00
Philip O'Toole
f298e88b39
Auto-create UDP service database
...
All other services operate like this, so make UDP service consistent.
2015-10-14 08:30:09 -07:00
Sean Beckett
5ab86f7578
Update README.md
2015-10-13 16:56:37 -07:00
Sean Beckett
ed7b9f7485
tag name --> tag key
2015-10-13 16:41:57 -07:00
Daniel Morsing
822af73f88
implement continuous queries as regular execs of into queries.
...
Now that we have into queries, we can implement them as regular
queries that are just run on a timer.
2015-10-13 15:51:19 +00:00
Sergey Kamardin
d25e264009
Update handler.go
...
Add `Access-Control-Expose-Headers` to the `ping` endpoint so that clients are able to retrieve `X-Influxdb-Version` and `Date` from the server.
2015-10-12 18:05:18 +03:00
Philip O'Toole
37cf9a1610
Deletion while iterating is OK in Go
2015-10-09 16:30:20 -07:00
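The claim in 37cf9a1610 follows from the Go specification, which allows map entries to be removed during a `range` loop. A small sketch, with `purgeEmpty` as a hypothetical helper name:

```go
package main

import "fmt"

// purgeEmpty removes entries with a zero count while ranging over the map.
// Deleting during iteration is permitted; an entry removed before it is
// reached is simply not produced by the loop.
func purgeEmpty(counts map[string]int) {
	for k, v := range counts {
		if v == 0 {
			delete(counts, k)
		}
	}
}

func main() {
	counts := map[string]int{"a": 0, "b": 2, "c": 0}
	purgeEmpty(counts)
	fmt.Println(len(counts), counts["b"]) // prints "1 2"
}
```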
Philip O'Toole
f12470a99e
If there are no HH segments, then nothing to purge
2015-10-09 14:29:21 -07:00
Philip O'Toole
c06ac8f94c
Don't add a new segment every purge check
...
Every time the purge check was running, a new segment was being added.
This meant the list of almost-empty files in the HH directories would
grow continually.
2015-10-09 14:26:47 -07:00
Philip O'Toole
657aa5a134
Add README for collectd
2015-10-09 09:15:22 -07:00
Philip O'Toole
b009f25e3d
Delete queues for inactive nodes
...
Deletion only takes place if all data in the queue is older than the
configured time.
2015-10-08 20:34:24 -07:00
Philip O'Toole
5b0a8ed306
HH should not process dropped nodes
2015-10-08 18:23:12 -07:00
Cameron Sparr
2add55107e
Fix graphite parser merge error, nargs
2015-10-08 11:15:03 -06:00
dgnorton
a9bf213076
Merge pull request #3484 from dawbs/dawbs-fix-3429
...
Bugfix for #3429 String representations of RegexLiterals generated in…
2015-10-08 13:12:10 -04:00
Cameron Sparr
3bea25b428
graphite parser: apply tags from the Parser on the template
2015-10-08 10:56:13 -06:00
Nick Dawbarn
136dbef0e7
Formatting fixes
2015-10-08 19:41:36 +10:00
Nick Dawbarn
26f6d00668
Bugfix for #3429: String representations of RegexLiterals generated in influxql/ast.go add the / char as a start and end delimiter, but do not escape any / characters that may exist within the regex
2015-10-08 19:41:36 +10:00
Cameron Sparr
73a630dfa6
graphite parser: apply tags from the Parser on the template
2015-10-07 23:19:29 -06:00
Rob Wilson
f3e3bf7a0e
typo
2015-10-07 21:25:49 +01:00
Rob Wilson
5815e0b0ee
updated documentation
2015-10-07 21:24:05 +01:00
Rob Wilson
d8ac746703
correct formatting
2015-10-07 20:35:05 +01:00
Rob Wilson
5fd8777c56
add tests
2015-10-07 20:32:10 +01:00
Rob Wilson
a27186fb7a
raise exception when field keyword is specified multiple times
2015-10-07 20:31:46 +01:00
Rob Wilson
bcd6c06173
Merge remote-tracking branch 'upstream/master' into graphite-template-custom-field
...
Conflicts:
services/graphite/parser.go
2015-10-07 17:48:34 +01:00
Philip O'Toole
44d52ac138
Fully lock HH node queue creation
...
I believe this change addresses the issues with hinted-handoff not fully replicating all data to nodes that come back online after an outage. A detailed explanation follows.
During testing of hinted-handoff (HH) under various scenarios, HH stats showed that the HH Processor was occasionally encountering errors while unmarshalling hinted data. This error was not handled completely correctly, and in clusters with more than 3 nodes, this could cause the HH service to stall until the node was restarted. This was the high-level reason why HH data was not being replicated.
Furthermore, by inspecting the hinted-handoff data at the byte level, it could be seen that HH segment block lengths were getting randomly set to 0, but the block data itself was fine (block data contains hinted writes). This was the root cause of the unmarshalling errors outlined above. This, in turn, was tracked down to the HH system opening each segment file multiple times concurrently, which was not file-level thread-safe, so these multiple open calls were corrupting the file.
Finally, the reason a segment file was being opened multiple times in parallel was that WriteShard on the HH Processor was checking for node queues in an unsafe manner. Since WriteShard can be called concurrently, this was adding queues for the same node more than once, and each queue addition results in opening segment files.
This change fixes the locking in WriteShard such that the check for an existing HH queue for a given node is performed in a synchronized manner.
2015-10-07 02:33:43 -07:00
Philip O'Toole
5b0767c30b
EOF is OK in HH processor
2015-10-07 01:56:55 -07:00
Philip O'Toole
8b49c37120
Count HH errors
2015-10-06 20:49:40 -07:00
Philip O'Toole
5d5515a497
If HH can't unmarshal a block, skip that block
2015-10-06 20:49:40 -07:00
Cameron Sparr
883d32cfd0
Add public function to graphite parser to apply template
2015-10-06 17:42:36 -06:00
Paul Dix
bb398daf75
Updates based on @otoolp's PR comments
2015-10-05 20:09:56 -04:00
Jason Wilder
5d9b89d601
Disable copier test
...
Not implemented for tsm1 engine
2015-10-05 20:09:56 -04:00
Paul Dix
7555ccbd70
WIP: engine work
2015-10-05 20:06:21 -04:00
Philip O'Toole
2ac0357406
Support dropping non-Raft nodes
2015-10-04 00:19:52 -07:00
Philip O'Toole
d74e0690c7
Revert "Merge pull request #4233 from influxdb/drop-server"
...
This reverts commit 0bdb36f6dc, reversing
changes made to 3085fbc138.
2015-10-02 08:39:57 -07:00
Cory LaNou
f50813460e
protobuf update.. :-(
2015-10-01 15:39:15 -05:00
Philip O'Toole
8a1e5a9e53
Clamp initial value of HH retry interval
...
This could happen due to misconfiguration, so do something sensible in
that case.
2015-10-01 12:04:33 -07:00
Philip O'Toole
878f776403
Exponential backoff if any hinted-handoff fails
2015-09-30 21:27:13 -07:00
Philip O'Toole
4eba2c1725
Add config support for max HH retry interval
2015-09-30 21:10:03 -07:00
Philip O'Toole
235714755c
HH processor-level stats
...
This change maintains stats on a per-shard and per-node basis.
2015-09-28 18:39:39 -07:00
Philip O'Toole
14db3ce9f5
Add service-level stats for hinted-handoff
2015-09-28 18:08:35 -07:00
Philip O'Toole
a196d3663a
Allow configuration of UDP retention policy
...
Fixes issue #4529
2015-09-28 15:17:56 -07:00
Philip O'Toole
49a70d0fca
Merge pull request #4238 from influxdb/hh_control
...
Fully disable hinted-handoff service if requested
2015-09-28 12:11:18 -07:00
Philip O'Toole
a4a8fa0ff0
Fully disable hinted-handoff service if requested
...
Without this change, if hinted-handoff was disabled the service would
correctly reject writes, but it would still process any data sitting in
hinted-handoff queues. With this change the service is completely
disabled.
2015-09-25 18:03:43 -07:00
Philip O'Toole
9de3125f6b
Graphite TCP should not block system shutdown
...
With this change Graphite TCP connections are tracked on a per-service
basis. This allows a closing Graphite service to first shutdown any
active connections, thereby unblocking the rest of shutdown.
This work exposed small shortcomings with the existing Diagnostics
system and that code has also been tweaked.
Fixes issue #4017
2015-09-24 14:08:38 -07:00
Antonio Murdaca
49c0b6ea73
Fix go vet warnings
...
This patch fixes the following go vet warnings:
```
services/continuous_querier/service.go:326: influxql.Statements
composite literal uses unkeyed fields
exit status 1
services/httpd/handler_test.go:145: models.Rows composite literal uses
unkeyed fields
services/httpd/handler_test.go:146: models.Rows composite literal uses
unkeyed fields
services/httpd/handler_test.go:165: models.Rows composite literal uses
unkeyed fields
services/httpd/handler_test.go:166: models.Rows composite literal uses
unkeyed fields
services/httpd/handler_test.go:187: models.Rows composite literal uses
unkeyed fields
services/httpd/handler_test.go:188: models.Rows composite literal uses
unkeyed fields
exit status 1
```
Signed-off-by: Antonio Murdaca <runcom@linux.com>
2015-09-21 15:28:54 +02:00
Rob Wilson
ef35d6dcc2
formatting
2015-09-21 12:26:43 +01:00
Rob Wilson
27c1cc23fd
Working prototype..
2015-09-21 12:18:19 +01:00
Rob Wilson
9121b422f8
comment out tests for now..
2015-09-21 10:47:42 +01:00
Rob Wilson
20e4fdfa9a
allow specifying fieldname in graphite template
2015-09-20 21:17:50 +01:00
Cory LaNou
72f6f7d268
Merge pull request #4134 from influxdb/issue-3447
...
Refactor Points and Rows to dedicated packages
2015-09-17 15:27:48 -05:00
Cory LaNou
38cb7b49de
Missing defer in httpd recovery. fixes #4124
2015-09-17 09:37:27 -05:00
Cory LaNou
ba830be3b9
actually move influxql.Row* -> models.Row*
2015-09-16 16:32:50 -05:00
Cory LaNou
d19a510ad2
refactor Points and Rows to dedicated packages
2015-09-16 15:33:08 -05:00
Philip O'Toole
d538829b4c
Enhance openTSDB logging and stats
2015-09-09 13:30:11 -07:00
Philip O'Toole
fef20c77b2
Cleanly terminate openTSDB connection on EOF
...
This is not really an error, so don't log it.
2015-09-09 13:01:13 -07:00
Philip O'Toole
02fcaf853d
Add note re Graphite configuration
...
[ci skip]
2015-09-08 23:22:34 -07:00
Philip O'Toole
519a30a463
Add note on openTSDB batching
...
[ci skip]
2015-09-08 23:19:17 -07:00
Philip O'Toole
24aca5611a
Add batch-pending control to openTSDB input
2015-09-08 19:35:42 -07:00
Philip O'Toole
95530e1623
Set UDP input defaults if not set
2015-09-08 19:32:20 -07:00
Philip O'Toole
5373f263a3
Add pending control to batcher
...
With this change, the generic batcher used by many inputs can now be
buffered. Testing shows that this improves performance of the Graphite
input by 10-100%, with the biggest improvements at lower numbers of connections.
2015-09-08 19:32:00 -07:00
Philip O'Toole
e38a204afc
Merge pull request #4043 from influxdb/opentsdb_batching
...
Add batching and stats to openTSDB input
2015-09-08 19:27:35 -07:00
Philip O'Toole
1ce5187b66
Merge pull request #4049 from influxdb/udp_stats
...
Add stats to the UDP input
2015-09-08 19:18:17 -07:00
Philip O'Toole
9677a0faab
Add collectd stats
2015-09-08 19:07:47 -07:00
Philip O'Toole
27932409b0
Add stats to the UDP input
2015-09-08 18:48:35 -07:00
Philip O'Toole
817328d378
Add basic stats to the CQ service
2015-09-08 18:17:20 -07:00
Philip O'Toole
349ba8b307
Add batching and stats to openTSDB input
2015-09-08 16:19:50 -07:00
Jason Wilder
73510a0a68
Fix panic caused by invalid timestamp in graphite metric
...
If a timestamp larger than the max epoch value was sent via
graphite, it would overflow when it was marshaled/unmarshaled
back from the raft log. The overflow caused the shard group to
get created with the wrong timestamp, which caused a panic when
writing the point. The panic occurred because the timestamps that
were supposed to exist in a map created by MapShards did not
actually exist, so a nil ShardGroup was used.
The change prevents creating the point with an invalid timestamp. Since
graphite uses a timestamp in seconds, the maximum range is known and
out-of-range values can be rejected. This also adds a check for the
minimum range.
Fixes #3785
2015-09-08 10:07:47 -06:00
Philip O'Toole
332ce6481d
Removed unused Graphite NewConfig
...
This function is not helpful for sections of the config that support
multiple instances.
2015-09-08 08:32:19 -07:00
Philip O'Toole
bbc103305b
Support multiple Graphite inputs
...
Fixes issue #3636
2015-09-06 21:33:46 -07:00
Philip O'Toole
fa29e12222
Shutdown UDP Graphite on SIGTERM
...
Service.Close() had no way of closing the UDP Conn. This change makes
the UDP Conn an attribute of the server, so Close() can access it.
2015-09-05 00:30:59 -07:00
Philip O'Toole
579e2a250c
Add stats to httpd package
2015-09-04 12:37:59 -07:00
Philip O'Toole
3df898bd90
Merge pull request #3987 from influxdb/global_expvar_hookup_diagnostics
...
Use expvar statistics directly
2015-09-04 11:13:17 -07:00
Philip O'Toole
89bc392ec4
Access expvar directly from monitor
...
expvar map is already global so access it directly. This simplifies the
code and makes it much easier to use from other modules.
2015-09-04 09:45:24 -07:00
Philip O'Toole
cf5a655249
Don't precreate shard groups entirely in past
...
Fixes issue #3722
2015-09-04 08:31:50 -07:00
Philip O'Toole
6ad35e23e9
Integrate code review feedback
2015-09-03 20:50:54 -07:00
Philip O'Toole
d58532d844
Add Graphite diagnostics
...
Graphite diagnostics currently show TCP connections.
2015-09-03 20:50:54 -07:00
Philip O'Toole
e07432c59f
Implement diagnostics support
...
This change adds support for diagnostics by decomposing the existing
interface into two interfaces -- one for stats, and the other for
diags. It also adds some basic monitoring of the system, network, and
the Go runtime.
2015-09-03 20:50:54 -07:00
David Norton
dce666e757
fix #3979: fix race in CQ service
2015-09-03 19:55:40 -04:00
Ben Johnson
deff06f850
add copier service
...
This commit adds the copier service which allows one server to
copy shards from another server. This will be used for moving
shards in the cluster.
2015-09-03 13:07:35 -06:00
David Norton
0cb9618d6d
fix CQ intoDB()
2015-09-03 09:07:57 -04:00
David Norton
d466b19388
update CQ service unit tests
2015-09-03 07:12:15 -04:00
David Norton
66001cfbb5
fix #2555: add integration tests for CQs
2015-09-03 07:12:15 -04:00
David Norton
021a6f5453
rename CQ tests
2015-09-03 07:12:15 -04:00
David Norton
99a22c174b
fix #2555: add backreference in CQs
...
Add new query syntax to allow the following in CQs:
INTO "1hPolicy".:MEASUREMENT
2015-09-03 07:12:15 -04:00
Philip O'Toole
4e2ee1ea70
Rename MonitorService to just Monitor
...
monitor is not a service, it has more in common with meta, since it
provides functionality to the query layer. This name makes this
clearer.
2015-09-02 15:07:30 -07:00
Philip O'Toole
366c0115f9
Serve expvar information from HTTP package
2015-09-01 15:22:37 -07:00
Philip O'Toole
9df17409d3
Use monitor service with Graphite
2015-09-01 15:21:36 -07:00
Philip O'Toole
d87e668c78
Remove obsolete monitoring code
2015-09-01 15:03:52 -07:00
Philip O'Toole
d771612718
Set default retention check interval to 30 minutes
...
Since the minimum retention period is 1 hour, checking every 10 minutes
seems excessive and generates noise in the logs.
2015-08-27 16:08:03 -07:00