Commit Graph

291 Commits (62dff895e227bc5cfcd93f04213a49871af5ff43)

Author SHA1 Message Date
Philip O'Toole 519a30a463 Add note on openTSDB batching
[ci skip]
2015-09-08 23:19:17 -07:00
Philip O'Toole 24aca5611a Add batch-pending control to openTSDB input 2015-09-08 19:35:42 -07:00
Philip O'Toole 95530e1623 Set UDP input defaults if not set 2015-09-08 19:32:20 -07:00
Philip O'Toole 5373f263a3 Add pending control to batcher
With this change, the generic batcher used by many inputs can now be
buffered. Testing shows that this improves the performance of the Graphite
input by 10-100%, with the biggest improvements at lower numbers of
connections.
2015-09-08 19:32:00 -07:00
Philip O'Toole e38a204afc Merge pull request #4043 from influxdb/opentsdb_batching
Add batching and stats to openTSDB input
2015-09-08 19:27:35 -07:00
Philip O'Toole 1ce5187b66 Merge pull request #4049 from influxdb/udp_stats
Add stats to the UDP input
2015-09-08 19:18:17 -07:00
Philip O'Toole 9677a0faab Add collectd stats 2015-09-08 19:07:47 -07:00
Philip O'Toole 27932409b0 Add stats to the UDP input 2015-09-08 18:48:35 -07:00
Philip O'Toole 817328d378 Add basic stats to the CQ service 2015-09-08 18:17:20 -07:00
Philip O'Toole 349ba8b307 Add batching and stats to openTSDB input 2015-09-08 16:19:50 -07:00
Jason Wilder 73510a0a68 Fix invalid time stamp in graphite metric causes panic
If a timestamp larger than the max epoch value was sent via graphite,
it would overflow when it was marshaled/unmarshaled back from the raft
log.  The overflow caused the shard group to be created with the wrong
timestamp, which caused a panic when writing the point.  The panic
occurred because the timestamps that were supposed to exist in a map
created by MapShards did not actually exist, so a nil ShardGroup was used.

The change prevents creating the point with an invalid timestamp.  Since
graphite uses a timestamp in seconds, the maximum range is known and
out-of-range values can be rejected.  This also adds a check for the
minimum range.

Fixes #3785
2015-09-08 10:07:47 -06:00
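
The range check described in the commit above can be sketched roughly as follows. This is an illustration only, not the code from the commit: the names `parseGraphiteTimestamp`, `maxUnixSeconds`, `minUnixSeconds`, and `errTimestampOutOfRange` are hypothetical, and the sketch assumes timestamps arrive as whole seconds and are later stored as int64 nanoseconds.

```go
package main

import (
	"errors"
	"fmt"
	"math"
	"strconv"
	"time"
)

// Largest and smallest Unix timestamps, in whole seconds, that still fit in
// an int64 once converted to nanoseconds. (Hypothetical names.)
const (
	maxUnixSeconds = math.MaxInt64 / int64(time.Second)
	minUnixSeconds = math.MinInt64 / int64(time.Second)
)

var errTimestampOutOfRange = errors.New("timestamp out of range")

// parseGraphiteTimestamp parses a timestamp given in seconds and rejects
// values that would overflow when expressed in nanoseconds.
func parseGraphiteTimestamp(field string) (time.Time, error) {
	secs, err := strconv.ParseInt(field, 10, 64)
	if err != nil {
		return time.Time{}, err
	}
	if secs > maxUnixSeconds || secs < minUnixSeconds {
		return time.Time{}, errTimestampOutOfRange
	}
	return time.Unix(secs, 0).UTC(), nil
}

func main() {
	for _, field := range []string{"1441728467", "9223372037", "-9223372037"} {
		ts, err := parseGraphiteTimestamp(field)
		if err != nil {
			fmt.Printf("%s: rejected (%v)\n", field, err)
			continue
		}
		fmt.Printf("%s: %s\n", field, ts.Format(time.RFC3339))
	}
}
```

Rejecting the value at parse time means the point is never created, so no later stage has to cope with an overflowed timestamp.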
Philip O'Toole 332ce6481d Removed unused Graphite NewConfig
This function is not helpful for sections of the config that support
multiple instances.
2015-09-08 08:32:19 -07:00
Philip O'Toole bbc103305b Support multiple Graphite inputs
Fixes issue #3636
2015-09-06 21:33:46 -07:00
Philip O'Toole fa29e12222 Shutdown UDP Graphite on SIGTERM
Service.Close() had no way of closing the UDP Conn. This change makes
the UDP Conn an attribute of the server, so Close() can access it.
2015-09-05 00:30:59 -07:00
Philip O'Toole 579e2a250c Add stats to httpd package 2015-09-04 12:37:59 -07:00
Philip O'Toole 3df898bd90 Merge pull request #3987 from influxdb/global_expvar_hookup_diagnostics
Use expvar statistics directly
2015-09-04 11:13:17 -07:00
Philip O'Toole 89bc392ec4 Access expvar directly from monitor
expvar map is already global so access it directly. This simplifies the
code and makes it much easier to use from other modules.
2015-09-04 09:45:24 -07:00
Philip O'Toole cf5a655249 Don't precreate shard groups entirely in past
Fixes issue #3722
2015-09-04 08:31:50 -07:00
Philip O'Toole 6ad35e23e9 Integrate code review feedback 2015-09-03 20:50:54 -07:00
Philip O'Toole d58532d844 Add Graphite diagnostics
Graphite diagnostics currently show TCP connections.
2015-09-03 20:50:54 -07:00
Philip O'Toole e07432c59f Implement diagnostics support
This change adds support for diagnostics by decomposing the existing
interface into two interfaces -- one for stats, and the other for
diags. It also adds some basic monitoring of the system, the network, and
the Go runtime.
2015-09-03 20:50:54 -07:00
David Norton dce666e757 fix #3979: fix race in CQ service 2015-09-03 19:55:40 -04:00
Ben Johnson deff06f850 add copier service
This commit adds the copier service which allows one server to
copy shards from another server. This will be used for moving
shards in the cluster.
2015-09-03 13:07:35 -06:00
David Norton 0cb9618d6d fix CQ intoDB() 2015-09-03 09:07:57 -04:00
David Norton d466b19388 update CQ service unit tests 2015-09-03 07:12:15 -04:00
David Norton 66001cfbb5 fix #2555: add integration tests for CQs 2015-09-03 07:12:15 -04:00
David Norton 021a6f5453 rename CQ tests 2015-09-03 07:12:15 -04:00
David Norton 99a22c174b fix #2555: add backreference in CQs
Add new query syntax to allow the following in CQs:

INTO "1hPolicy".:MEASUREMENT
2015-09-03 07:12:15 -04:00
Philip O'Toole 4e2ee1ea70 Rename MonitorService to just Monitor
monitor is not a service; it has more in common with meta, since it
provides functionality to the query layer. This name makes this
clearer.
2015-09-02 15:07:30 -07:00
Philip O'Toole 366c0115f9 Serve expvar information from HTTP package 2015-09-01 15:22:37 -07:00
Philip O'Toole 9df17409d3 Use monitor service with Graphite 2015-09-01 15:21:36 -07:00
Philip O'Toole d87e668c78 Remove obsolete monitoring code 2015-09-01 15:03:52 -07:00
Philip O'Toole d771612718 Set default retention check interval to 30 minutes
Since the minimum retention period is 1 hour, checking every 10 minutes
seems excessive and generates noise in the logs.
2015-08-27 16:08:03 -07:00
Philip O'Toole ae825fdf3d Correct typo in retention service logs 2015-08-27 16:08:03 -07:00
Cory LaNou 74dad8c68c fix collectd tests for float data 2015-08-25 09:14:38 -05:00
Philip O'Toole 6193226ce8 Revert "Merge pull request #3771 from influxdb/tcp_graphite_timeout"
This reverts commit d7f646f7a4, reversing
changes made to d6f9903f10.

Conflicts:
	CHANGELOG.md

Fixes issue #3809
2015-08-24 10:53:14 -07:00
Philip O'Toole d7f646f7a4 Merge pull request #3771 from influxdb/tcp_graphite_timeout
Close idle Graphite TCP connections
2015-08-20 17:08:17 -07:00
Philip O'Toole 50b0f67290 Add Graphite TCP timeout tests 2015-08-20 15:46:08 -07:00
Philip 4930a6d8bb Start adding timeouts to TCP Graphite input 2015-08-20 15:10:22 -07:00
Jason Wilder afe1f598ca Cache name and fields if requested
Through profiling of writes, point.Fields() and point.Name() were called
repeatedly in PointsWriter and the Shard.  These calls are somewhat expensive
when writing large batches so we can cache them to avoid wasting CPU cycles.

Using influx_stress with default settings

Before:
  Wrote 10000000 points at average rate of 202570
  Average response time:  235.450355ms

After:
  Wrote 10000000 points at average rate of 246120
  Average response time:  182.881008ms
2015-08-20 15:48:38 -06:00
Philip 8e51064db1 Log Graphite batch size and timeout 2015-08-20 11:23:09 -07:00
Gunnar cf5ac2603d Fix Graphite README typo
Fixes #3727
2015-08-19 07:53:29 -07:00
Gunnar 409fe0afe3 Merge pull request #3686 from jonseymour/secure-options
Prevent 'p' parameter of OPTIONS requests being logged.
2015-08-18 17:19:39 -07:00
Philip O'Toole 5bb699e9a9 Enhance precreation log messages 2015-08-18 16:20:55 -07:00
Jon Seymour bdce79fe57 Merge branch 'secure-options-minimal' into secure-options 2015-08-19 09:15:58 +10:00
Jon Seymour 1d5ff55d76 Remove redaction logic from parseCredentials.
We now redact the credentials in the logger, so the function implemented
by the deleted lines now seems redundant.

Signed-off-by: Jon Seymour <jon@wildducktheories.com>
2015-08-19 09:08:54 +10:00
Philip O'Toole 28a6b1f3fd Merge pull request #3697 from influxdb/chunking_10k
Merge same-series data if not chunking
2015-08-18 13:23:10 -07:00
Jon Seymour 2805c4a9b5 Ensure 'p' parameter is not logged, even on OPTIONS requests.
Previously password redaction only occurred inside the
authentication handler and the authentication handler is not on
the request path for OPTIONS requests and, in any case, would
not be invoked because of an early return on OPTIONS
requests by the CORS handler.

Now, we change the response logger to explicitly replace any
occurrence of the 'p' parameter in the query string with
'[REDACTED]' prior to logging the response.

Signed-off-by: Jon Seymour <jon@wildducktheories.com>
2015-08-18 09:41:16 +10:00
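
A minimal sketch of the redaction idea described in the commit above, using only the Go standard library. The function name `redactPassword` is hypothetical and this is not the logger code from the commit. It assumes the password is carried in the `p` query parameter, as the commit message states.

```go
package main

import (
	"fmt"
	"net/url"
)

// redactPassword returns rawURL with the value of the 'p' query parameter
// replaced by "[REDACTED]". URLs without a 'p' parameter are returned as-is.
func redactPassword(rawURL string) string {
	u, err := url.Parse(rawURL)
	if err != nil {
		// Never log a URL we could not inspect; it may contain a password.
		return "[unparseable URL]"
	}
	q := u.Query()
	if _, ok := q["p"]; !ok {
		return rawURL
	}
	q.Set("p", "[REDACTED]")
	u.RawQuery = q.Encode() // Encode sorts keys and percent-encodes values
	return u.String()
}

func main() {
	fmt.Println(redactPassword("/query?u=admin&p=secret&q=SHOW+DATABASES"))
	fmt.Println(redactPassword("/ping"))
}
```

Because the redaction happens where the line is logged rather than in the authentication handler, it applies to every request the logger sees, including OPTIONS requests. Note that `url.Values.Encode` reorders the parameters and percent-encodes the brackets, so the logged query string is not byte-for-byte the original.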
Philip O'Toole 6415944d01 Don't repeat retention policy log message 2015-08-17 16:15:51 -07:00
Philip O'Toole 487c336571 Correctly merge rows for identical series
If no chunking was requested by the user, the co-ordinating node buffers all
results in RAM before emitting a single result. However, buffering was not
merging results for rows which had data for the same series. This change fixes this.

Fixes issue #3242.
2015-08-17 13:43:17 -07:00
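
The merge described in the commit above might look roughly like the sketch below. The `Row` type and `mergeRows` function are hypothetical simplifications, not the types used in the repository. In particular, the sketch assumes each row carries its tags as a single canonical string so that two rows of the same series compare equal.

```go
package main

import "fmt"

// Row is a simplified, hypothetical result row: one series plus its values.
type Row struct {
	Name   string
	Tags   string // assumed to be a canonical, pre-sorted tag string
	Values [][]interface{}
}

// mergeRows combines rows that belong to the same series (same name and
// tags), preserving the order in which each series first appears.
func mergeRows(rows []Row) []Row {
	index := make(map[string]int) // series key -> position in merged
	var merged []Row
	for _, r := range rows {
		key := r.Name + "\x00" + r.Tags
		if i, ok := index[key]; ok {
			merged[i].Values = append(merged[i].Values, r.Values...)
			continue
		}
		index[key] = len(merged)
		merged = append(merged, Row{
			Name:   r.Name,
			Tags:   r.Tags,
			Values: append([][]interface{}(nil), r.Values...), // copy, don't alias
		})
	}
	return merged
}

func main() {
	rows := []Row{
		{Name: "cpu", Tags: "host=a", Values: [][]interface{}{{1, 0.5}}},
		{Name: "mem", Tags: "host=a", Values: [][]interface{}{{1, 100}}},
		{Name: "cpu", Tags: "host=a", Values: [][]interface{}{{2, 0.7}}},
	}
	for _, r := range mergeRows(rows) {
		fmt.Println(r.Name, r.Tags, len(r.Values))
	}
}
```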
Jason Wilder 7cf31a74cd Prevent out of memory range slices from being created
If the hinted handoff segment is corrupt, the size read could be
invalid and attempting to create a slice using that size causes
a panic.  Ideally, we'd have a checksum on the segment record but
for now just return an error when the size is larger than the
segment file.

Fixes #3687
2015-08-17 10:48:01 -06:00
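
The guard described in the commit above can be illustrated with a short sketch. The record layout used here (an 8-byte big-endian length followed by the payload) and the names `readBlock` and `headerSize` are assumptions made for the example, not the actual hinted handoff format.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// headerSize is the assumed size of the length prefix, in bytes.
const headerSize = 8

// readBlock reads one length-prefixed record from the start of segment.
// It validates the declared size before allocating, so a corrupt size
// yields an error instead of a huge or out-of-range slice.
func readBlock(segment []byte) ([]byte, error) {
	if len(segment) < headerSize {
		return nil, fmt.Errorf("short record: need %d header bytes, have %d",
			headerSize, len(segment))
	}
	size := binary.BigEndian.Uint64(segment[:headerSize])
	remaining := uint64(len(segment) - headerSize)
	if size > remaining {
		return nil, fmt.Errorf("record size %d exceeds remaining segment size %d",
			size, remaining)
	}
	payload := make([]byte, size)
	copy(payload, segment[headerSize:headerSize+int(size)])
	return payload, nil
}

func main() {
	good := []byte{0, 0, 0, 0, 0, 0, 0, 3, 'a', 'b', 'c'}
	corrupt := []byte{0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 'a'}

	payload, err := readBlock(good)
	fmt.Printf("%q %v\n", payload, err)

	_, err = readBlock(corrupt)
	fmt.Println(err)
}
```

Comparing the declared size against the bytes actually available is a cheap check that needs no checksum, which matches the stopgap the commit message describes.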
Jason Wilder e5e782d13d Merge pull request #3517 from dim/fix-cq-timeouts
Batch CQ writes to avoid timeouts
2015-08-14 10:52:17 -06:00
Jason Wilder bb6de3b8f3 Merge pull request #3522 from dim/fix-cq-timeout-bug
Consume CQ results on request timeouts
2015-08-14 10:52:04 -06:00
Cory LaNou 0b05980ae2 silence snapshotter logger for testing 2015-08-13 20:53:40 -05:00
Jason Wilder a7cb0df4af Fix typos/spacing 2015-08-13 10:02:05 -06:00
Jason Wilder 668181d275 Make log statements more consistent
* Capitalize first letter of message
* Log all services starting consistently
* Remove some extraneous log statements in meta.Store
* Log data dirs for meta, data and hinted handoff
2015-08-13 10:01:42 -06:00
Philip O'Toole 089d947bf3 Shutdown Graphite listener first during Close()
Without this the WaitGroup was not fully decremented as the Accept()
call on the listener never exited, and Wait() then never exited.
2015-08-12 12:49:58 -07:00
Philip O'Toole 966dee7559 Set sensible Graphite batching defaults 2015-08-11 18:34:36 -07:00
gunnaraasen 7dc7389e96 Remove dump from client and handler 2015-08-07 11:56:30 -07:00
Jason Wilder 398ffabab7 Fix panic in hinted handoff processor
A short write has occurred and we do not have enough bytes to determine
the size of the payload.  This is a corrupted record that we should drop.
Instead of panicking, log the error and advance the queue since the error
at this location is currently unrecoverable.

Fixes #3436
2015-08-06 14:06:41 -06:00
Jason Wilder 4f7df336f2 Fix go vet 2015-08-05 12:16:17 -06:00
Nathaniel Cook 2fac5471ce add TLS support to the OpenTSDB plugin 2015-07-31 12:00:22 -06:00
Dimitrij Denissenko 762a1c69d5 Consume CQ results on request timeouts 2015-07-30 16:59:47 +01:00
Dimitrij Denissenko 642d6eba85 Batch CQ writes to avoid timeouts 2015-07-30 15:24:51 +01:00
Gunnar 2a02605ef6 Merge pull request #3474 from influxdb/ga-options
Respond to OPTIONS requests on /query endpoint
2015-07-28 16:32:26 -07:00
Paul Dix fb76c34a79 Merge pull request #3426 from jhorwit2/jah/continuous-logging
Added additional logging to continuous queries
2015-07-27 17:43:55 -04:00
gunnaraasen 5ef0be2d71 Respond to OPTIONS requests on /query endpoint 2015-07-27 11:58:44 -07:00
Vorn Mom 38387ba6b7 Fixed typo. 2015-07-24 19:41:46 -04:00
Vorn Mom 5b50002728 Fixed typo in README 2015-07-24 17:07:41 -04:00
Josh Horwitz e722b4b4ad Added additional logging to continuous queries 2015-07-23 19:09:42 -04:00
Gunnar d9f16987fc Merge pull request #3439 from influxdb/ga-admin-https
Add HTTPS option and logger to admin service
2015-07-23 15:09:56 -07:00
Gunnar d1fc0a3cc9 Merge pull request #3375 from influxdb/https
First pass at re-enabling HTTPS.
2015-07-23 15:04:29 -07:00
gunnaraasen b30351f750 Remove redundant loggers and clean up logic 2015-07-23 15:01:48 -07:00
gunnaraasen 8ab9424295 Return error if HTTPS fails 2015-07-23 15:00:33 -07:00
gunnaraasen 614332bf17 Exit if HTTPS fails 2015-07-23 14:50:45 -07:00
Gunnar 96575e678a Merge pull request #3427 from influxdb/ga-pw-log
Logging tweaks, sanitize passwords and note if authentication is enabled
2015-07-23 14:12:41 -07:00
gunnaraasen e5ead383e5 Add HTTPS option and logger to admin service 2015-07-22 16:49:12 -07:00
Nathaniel Cook 17cc09259b change for better code clarity 2015-07-21 19:40:43 -06:00
gunnaraasen 785a8b4d9a Sanitize password from HTTP logs 2015-07-21 18:28:05 -07:00
Todd Persen 0780cf6599 Add a config test for HTTPS. 2015-07-21 18:20:08 -07:00
gunnaraasen 20de2bc914 Log authentication enabled message 2015-07-21 17:53:12 -07:00
Nathaniel Cook 3e29d8821a catch opentsdb malformed tags if they are missing keys or values 2015-07-21 16:11:58 -06:00
Philip O'Toole 425a65fca1 RemoteShard mapping now performed over TCP
With this change remote mapping no longer uses HTTP, as the HTTP ports
exposed by nodes on the cluster are not known cluster wide. The TCP
ports exposed by the cluster service are, so this change uses that
functionality. Each RemoteMapper has its own dedicated connection pool
for each node, and remote mapping TCP connections are in no way coupled
with query TCP connections.
2015-07-20 10:44:38 -07:00
Todd Persen 47d5c2d65f First pass at re-enabling HTTPS. 2015-07-17 16:57:31 -07:00
Philip O'Toole f549910a18 Merge pull request #3279 from LK4D4/fix_style_else
Fix style issues with else
2015-07-17 11:53:42 -07:00
gunnaraasen 9ba37325f6 Fixes authorization.
Adds GRANT and REVOKE statements for admin privilege. Adds authorization to the query endpoint.
2015-07-17 11:33:06 -07:00
Alexander Morozov 675eacbf2c Fix style issues with else
In Go, it's better to continue the flow without an "else" when the
"if" block ends in a return.

Signed-off-by: Alexander Morozov <lk4d4@docker.com>
2015-07-17 11:10:23 -07:00
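
For illustration, the style point in the commit above looks like this in practice. Both functions are hypothetical examples written for this note, not code from the repository. They behave identically; only the layout differs.

```go
package main

import (
	"errors"
	"fmt"
)

// halveWithElse uses the style the commit removes: the "if" block returns,
// so the "else" only adds a level of indentation.
func halveWithElse(n int) (int, error) {
	if n%2 != 0 {
		return 0, errors.New("odd input")
	} else {
		return n / 2, nil
	}
}

// halve is the preferred form: handle the error case, return early, and
// keep the normal path unindented.
func halve(n int) (int, error) {
	if n%2 != 0 {
		return 0, errors.New("odd input")
	}
	return n / 2, nil
}

func main() {
	fmt.Println(halveWithElse(10))
	fmt.Println(halve(10))
	fmt.Println(halve(7))
}
```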
Philip O'Toole a0dba108a0 Unit test shard mapper handler error handling 2015-07-15 16:08:26 -07:00
Philip O'Toole 65a580779b Shard mapping handler returns on errors 2015-07-15 16:08:10 -07:00
Philip O'Toole 7bdba556a1 Support pretty printing shard mapping 2015-07-15 13:06:16 -07:00
Philip O'Toole 74cb96646c Refactor query engine for distributed query support
With this change, the query engine code gathers information about
shards and tagsets by working with individual shards, collating the
information, and returning that to the client. It does not assume that any
particular shard is local, and accesses all shards through abstracted
Mappers, of which there are two types -- a Mapper type for Raw queries
and a second type for Aggregate queries. There are corresponding
Executors for each type of Mapper, but both types of Executors share the
same interface.
2015-07-15 12:54:55 -07:00
Josh Horwitz e4f2d8a6c4 Fixed httpd logger to get user from query params 2015-07-13 17:36:34 -04:00
Sean Beckett 72f52d44c9 making queries syntactically correct 2015-07-08 17:25:56 -06:00
Jason Wilder 6b8d3268e6 Fix code review comments 2015-07-07 11:41:12 -06:00
Jason Wilder db63ada7db Drop NaN and Inf values from graphite input
NaN is skipped by graphite.  Inf is not a supported value for InfluxDB.
2015-07-06 16:14:02 -06:00
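
A small sketch of the filtering described in the commit above, using the standard `strconv` and `math` packages. The function name `parseValue` and its `keep` return value are hypothetical. The sketch assumes the metric value arrives as a text field.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
)

// parseValue parses a metric value. It returns keep == false, with a nil
// error, for NaN and +/-Inf so the caller can silently drop the point;
// malformed input is reported through err.
func parseValue(field string) (value float64, keep bool, err error) {
	v, err := strconv.ParseFloat(field, 64)
	if err != nil {
		return 0, false, err
	}
	if math.IsNaN(v) || math.IsInf(v, 0) {
		return 0, false, nil
	}
	return v, true, nil
}

func main() {
	for _, field := range []string{"42.5", "NaN", "-Inf", "abc"} {
		v, keep, err := parseValue(field)
		fmt.Printf("%q: value=%v keep=%v err=%v\n", field, v, keep, err)
	}
}
```

Separating "drop this point" from "this input is malformed" lets the caller skip NaN and Inf quietly while still reporting real parse errors.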
Jason Wilder b58df5344c Use a single batcher for graphite service
Previously there was a batcher per connection and each batcher was
flushed when the connection was closed.  This didn't have much of an
effect when multiple clients connected and disconnected since it would
flush the batch immediately.  It also did not help UDP traffic.

Instead, there is now a shared batcher for the service so that multiple
connections will not cause frequent flushes.
2015-07-06 16:14:02 -06:00
Jason Wilder 4a71692b88 Allow extra tags using graphite default template
Fixes #3223
2015-07-06 16:14:02 -06:00
Jason Wilder 0b481d55f4 Fix graphite filter searching matching wrong template
Default template would get chosen in some cases when a matching filter
existed.

Fixes #3245
2015-07-06 16:14:02 -06:00
Jason Wilder a3ab093996 Parse NaN as float
Fixes #3230
2015-07-06 16:14:01 -06:00
Jason Wilder 87c962d62d Fix panics when collectd fails to start
If collectd fails to start, it can panic when close is called because
some variables have not been initialized yet.
2015-06-29 22:09:47 -06:00