Commit Graph

12102 Commits (643b2eb30cd4726e58a9052ee6f15ec9bea759cf)

Author SHA1 Message Date
Jason Wilder 9445ccbad3 Expose shard meta info on Shard 2017-05-16 11:18:02 -06:00
Mark Rushakoff 51b60e2e28 Merge pull request #7835 from influxdata/mr-rpc-bind
Bind RPC to localhost by default, add to sample config
2017-05-15 11:03:22 -07:00
Edd Robinson b4a427f9a2 Address PR feedback 2017-05-15 14:41:51 +01:00
Edd Robinson bbeb3e2f15 Update issue template 2017-05-15 14:12:00 +01:00
Edd Robinson 1cbbaa9317 Add support for shards, stats and diagnostics 2017-05-15 14:12:00 +01:00
Edd Robinson 8f8ff0ec61 Adds handler for returning a profile archive
Currently, when debugging issues with InfluxDB we often ask for the
following profiles:

  curl -o block.txt "http://localhost:8086/debug/pprof/block?debug=1"
  curl -o goroutine.txt "http://localhost:8086/debug/pprof/goroutine?debug=1"
  curl -o heap.txt "http://localhost:8086/debug/pprof/heap?debug=1"
  curl -o cpu.txt "http://localhost:8086/debug/pprof/profile"

This can be bothersome for users, or even difficult if they're
unfamiliar with cURL (or it's not on their system).

This commit adds a new endpoint: /debug/pprof/all which will return a
single compressed archive of all of the above profiles. The CPU profile
is optional, and not returned by default. To include a CPU profile the
URL to request should be: /debug/pprof/all?cpu=true. It's also possible
to vary the length of the CPU profile by adding a `seconds=x` parameter,
where x defaults to 30, if absent.

The new command for gathering profiles from users should now be:

  curl -o profiles.tar.gz "http://localhost:8086/debug/pprof/all"

Or, if we need to see a CPU profile:

  curl -o profiles.tar.gz "http://localhost:8086/debug/pprof/all?cpu=true"

It's important to remember that a CPU profile is a blocking operation
and by default it will take 30 seconds for the response to be returned
to the user.

Finally, if the user is unfamiliar with cURL, they will now be able to
visit http://localhost:8086/debug/pprof/all in a web browser, and the
archive will be downloaded to their machine.
2017-05-15 14:11:38 +01:00
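
The archive described in commit 8f8ff0ec61 above could be assembled roughly as follows. This is a minimal sketch, not the commit's actual handler: it only assumes that each named runtime profile is written as a text file into a gzip-compressed tar stream. The function names `writeProfileArchive` and `buildAndList` and the `<name>.txt` entry naming are hypothetical, and the optional CPU profile is left out.

```go
package main

import (
	"archive/tar"
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"runtime/pprof"
)

// writeProfileArchive writes a gzip-compressed tar archive to w that holds one
// text file per named runtime profile (hypothetical helper, not InfluxDB's API).
func writeProfileArchive(w io.Writer, names []string) error {
	gz := gzip.NewWriter(w)
	tw := tar.NewWriter(gz)
	for _, name := range names {
		p := pprof.Lookup(name)
		if p == nil {
			return fmt.Errorf("unknown profile %q", name)
		}
		// Buffer the profile first: a tar header needs the size up front.
		var buf bytes.Buffer
		if err := p.WriteTo(&buf, 1); err != nil {
			return err
		}
		hdr := &tar.Header{Name: name + ".txt", Mode: 0600, Size: int64(buf.Len())}
		if err := tw.WriteHeader(hdr); err != nil {
			return err
		}
		if _, err := tw.Write(buf.Bytes()); err != nil {
			return err
		}
	}
	if err := tw.Close(); err != nil {
		return err
	}
	return gz.Close()
}

// buildAndList builds an archive in memory and returns the entry names found
// when reading it back, which is a simple way to check the round trip.
func buildAndList(names []string) ([]string, error) {
	var buf bytes.Buffer
	if err := writeProfileArchive(&buf, names); err != nil {
		return nil, err
	}
	gz, err := gzip.NewReader(&buf)
	if err != nil {
		return nil, err
	}
	defer gz.Close()
	tr := tar.NewReader(gz)
	var entries []string
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			return entries, nil
		}
		if err != nil {
			return nil, err
		}
		entries = append(entries, hdr.Name)
	}
}

func main() {
	entries, err := buildAndList([]string{"block", "goroutine", "heap"})
	if err != nil {
		panic(err)
	}
	fmt.Println(entries)
}
```

In an HTTP handler the same writer would be pointed at the response body instead of an in-memory buffer.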
Mark Rushakoff 6f438ea467 Update CHANGELOG 2017-05-12 17:09:09 -07:00
Mark Rushakoff 3da9fded59 Clean up sample config file
Fixes #7767, #7760.
2017-05-12 17:03:24 -07:00
Mark Rushakoff c4f11afc90 Default RPC bind to localhost
Prior to this change, the default configuration would listen on all
interfaces, potentially exposing the RPC to the public internet.
2017-05-12 17:02:51 -07:00
Jason Wilder 3b700863ea Merge pull request #8384 from influxdata/jw-write-values
Write and compaction stability
2017-05-12 14:48:28 -06:00
Stuart Carnie c863923e68 cache MarshalSize 2017-05-12 14:05:25 -06:00
Stuart Carnie 0151afe31c check size and allocate once 2017-05-12 14:05:25 -06:00
Stuart Carnie 096d6f65b4 explicit sizes 2017-05-12 14:05:24 -06:00
Jason Wilder 0b7c0b680c Update changelog 2017-05-12 14:05:24 -06:00
Jason Wilder 4d002bb370 Limit concurrent compactions within a shard
This changes full compactions within a shard to run sequentially
instead of running all the compaction groups in parallel.  Normally,
there is only 1 full compaction group to run.  At times, there could
be several, which causes instability if they all run concurrently
because each one ties up a CPU for long periods of time.

Level compactions are also capped to a max of 4 concurrently running for each level
in a shard.  This prevents sudden spikes in CPU and disk usage due to a large backlog
of tsm files at a given level.
2017-05-12 14:05:24 -06:00
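
The cap on level compactions in commit 4d002bb370 above can be illustrated with a counting semaphore. This is only a sketch of the general technique, assuming a buffered channel as the limiter; the function `runLimited` is hypothetical and the commit's own mechanism may differ.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// runLimited runs every task, allowing at most limit tasks to execute at the
// same time, and returns the highest concurrency it observed.
func runLimited(limit int, tasks []func()) int64 {
	sem := make(chan struct{}, limit) // one slot per allowed concurrent task
	var wg sync.WaitGroup
	var running, peak int64

	for _, task := range tasks {
		wg.Add(1)
		sem <- struct{}{} // blocks while limit tasks are already in flight
		go func(task func()) {
			defer wg.Done()
			defer func() { <-sem }() // free the slot once the task is done

			n := atomic.AddInt64(&running, 1)
			// Record a new peak if this goroutine pushed concurrency higher.
			for {
				old := atomic.LoadInt64(&peak)
				if n <= old || atomic.CompareAndSwapInt64(&peak, old, n) {
					break
				}
			}
			task()
			atomic.AddInt64(&running, -1)
		}(task)
	}
	wg.Wait()
	return atomic.LoadInt64(&peak)
}

func main() {
	tasks := make([]func(), 16)
	for i := range tasks {
		tasks[i] = func() {} // stand-in for one compaction group
	}
	peak := runLimited(4, tasks)
	fmt.Println("peak concurrency within limit:", peak <= 4)
}
```

Setting the limit to 1 gives the sequential behaviour the commit describes for full compactions.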
Jason Wilder 2cac46ebbc Convert usage of strings to []byte
Measurement name and field were repeatedly converted between []byte and
string, causing lots of garbage.  This switches the code to use
[]byte in the write path.
2017-05-12 14:05:19 -06:00
Jason Wilder 503d41a08f Add LimitedBytePool for wal buffers
This pool was previously a pool.Bytes to avoid repetitive allocations.
It was recently switched to a sync.Pool because pool.Bytes held onto
very large buffers at times which were never released.  sync.Pool is
showing up in allocation profiles quite frequently.

This switches the pool to a new pool that limits how many buffers are
in the pool as well as the max size of each buffer in the pool.  This
provides better bounds on allocations.
2017-05-11 11:27:00 -06:00
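
A pool with both bounds described in commit 503d41a08f above might look like the sketch below. The type and method names here (`limitedBytePool`, `Get`, `Put`) are illustrative assumptions, not the names or signatures used in the InfluxDB source; the point is only that a buffered channel bounds the number of pooled buffers and a size check bounds each one.

```go
package main

import "fmt"

// limitedBytePool keeps at most cap(pool) buffers, none larger than maxSize.
type limitedBytePool struct {
	pool    chan []byte
	maxSize int
}

func newLimitedBytePool(capacity, maxSize int) *limitedBytePool {
	return &limitedBytePool{pool: make(chan []byte, capacity), maxSize: maxSize}
}

// Get returns a buffer of length sz. It reuses a pooled buffer when one is
// available and large enough; a pooled buffer that is too small is dropped.
// Reused buffers are not zeroed.
func (p *limitedBytePool) Get(sz int) []byte {
	select {
	case b := <-p.pool:
		if cap(b) >= sz {
			return b[:sz]
		}
	default:
	}
	return make([]byte, sz)
}

// Put offers b back to the pool. Oversized buffers are discarded so they can
// be garbage collected, and so is any buffer offered while the pool is full.
func (p *limitedBytePool) Put(b []byte) {
	if cap(b) > p.maxSize {
		return
	}
	select {
	case p.pool <- b:
	default:
	}
}

func main() {
	p := newLimitedBytePool(2, 1024)
	p.Put(make([]byte, 4096)) // too large, not retained
	p.Put(make([]byte, 512))  // retained
	b := p.Get(128)
	fmt.Println(len(b), cap(b))
}
```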
Jason Wilder e17be9f4ba Merge pull request #8377 from influxdata/jw-encoders
Speed up time encoding/decoding
2017-05-11 10:38:27 -06:00
Joe LeGasse 9ee63681b1 Merge pull request #8383 from influxdata/jl-tsm-test-fix
tsm: fixed test to not require sorted backup tarball
2017-05-11 12:17:59 -04:00
Joe LeGasse 087d9f4670 tsm: fixed test to not require sorted backup tarball 2017-05-11 12:00:19 -04:00
Jason Wilder b150a6293c Merge pull request #8380 from influxdata/jw-wal-buffer
Use buffer writer for wal segments
2017-05-11 08:34:44 -06:00
Jason Wilder 76428d168c Merge pull request #8373 from sebito91/influx_inspect_sort_tags
sort influx_inspect detailed report results
2017-05-10 12:24:49 -06:00
Jason Wilder b81ac21bcb Merge pull request #8378 from influxdata/jw-snapshot-disable
Don't disable snapshots when snapshot compactions are disabled
2017-05-10 12:00:27 -06:00
Jason Wilder e102fcca9c Use buffer writer for wal segments 2017-05-10 11:42:32 -06:00
Jason Wilder 39a829c1ae Speed up time encoding/decoding
This speeds up time encoding and decoding by skipping the divisor
scaling if scaling by 1.  Since division and multiplication are expensive
CPU operations and scaling by 1 has no effect, doing so only slows
encoding and decoding down.
2017-05-10 11:12:35 -06:00
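
The shortcut in commit 39a829c1ae above amounts to an early return when the divisor is 1. The sketch below is an illustration under assumptions: it supposes the divisor is the largest power of ten shared by all values, and the helpers `commonDivisor`, `scaleDown`, and `scaleUp` are hypothetical names, not the encoder's actual functions.

```go
package main

import "fmt"

// commonDivisor returns the largest power of ten (at most 1e12) that evenly
// divides every value. It returns 1 when no larger power of ten fits.
func commonDivisor(values []uint64) uint64 {
	div := uint64(1e12)
	for _, v := range values {
		for div > 1 && v%div != 0 {
			div /= 10
		}
	}
	return div
}

// scaleDown divides each value by div in place, skipping the loop entirely
// when div is 1 because the division would change nothing.
func scaleDown(values []uint64, div uint64) {
	if div == 1 {
		return
	}
	for i := range values {
		values[i] /= div
	}
}

// scaleUp reverses scaleDown, with the same shortcut for a divisor of 1.
func scaleUp(values []uint64, div uint64) {
	if div == 1 {
		return
	}
	for i := range values {
		values[i] *= div
	}
}

func main() {
	deltas := []uint64{1000, 2000, 3500}
	div := commonDivisor(deltas)
	scaleDown(deltas, div)
	fmt.Println(div, deltas)
	scaleUp(deltas, div)
	fmt.Println(deltas)
}
```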
Jason Wilder 4e3e707abc Fix packed time encoded benchmark 2017-05-10 10:35:44 -06:00
Jonathan A. Sternberg 75530bd0b0 Merge pull request #8376 from influxdata/js-8358-etc-config-sample
Small edits to the etc/config.sample.toml file
2017-05-10 11:29:48 -05:00
Jonathan A. Sternberg dea02009e0 Small edits to the etc/config.sample.toml file 2017-05-10 10:56:34 -05:00
Jonathan A. Sternberg 38735b24f6 Merge pull request #8350 from influxdata/js-request-tracker
Track HTTP client requests for /write and /query with /debug/requests
2017-05-09 13:52:58 -05:00
Jonathan A. Sternberg 2780630a5f Track HTTP client requests for /write and /query with /debug/requests
After using `/debug/requests`, the client will wait for 30 seconds
(configurable by specifying `seconds=` in the query parameters) and the
HTTP handler will track every incoming query and write to the system.
After that time period has passed, it will output a JSON blob that looks
very similar to `/debug/vars` that shows every IP address and user
account (if authentication is used) that connected to the host during
that time.

In the future, we can add more metrics to track. This is an initial
start to aid with debugging machines that connect too often by looking
at a sample of time (like `/debug/pprof`).
2017-05-09 10:18:33 -05:00
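
The bookkeeping behind commit 2780630a5f above could be sketched as a mutex-guarded counter map. This is an assumption-laden illustration: the type `requestTracker`, the `user:ip` key format, and the JSON field names are all hypothetical, and the 30 second sampling window and HTTP wiring are omitted.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sync"
)

// requestCounts holds the number of writes and queries seen for one client.
type requestCounts struct {
	Writes  int64 `json:"writes"`
	Queries int64 `json:"queries"`
}

// requestTracker counts requests per client while a sampling window is open.
type requestTracker struct {
	mu     sync.Mutex
	counts map[string]*requestCounts
}

func newRequestTracker() *requestTracker {
	return &requestTracker{counts: make(map[string]*requestCounts)}
}

// Add records one request. The key is the IP address, prefixed by the user
// name when authentication supplied one (hypothetical key format).
func (t *requestTracker) Add(user, ip string, isWrite bool) {
	key := ip
	if user != "" {
		key = user + ":" + ip
	}
	t.mu.Lock()
	defer t.mu.Unlock()
	c := t.counts[key]
	if c == nil {
		c = &requestCounts{}
		t.counts[key] = c
	}
	if isWrite {
		c.Writes++
	} else {
		c.Queries++
	}
}

// Snapshot returns a copy of the counters that is safe to serialize.
func (t *requestTracker) Snapshot() map[string]requestCounts {
	t.mu.Lock()
	defer t.mu.Unlock()
	out := make(map[string]requestCounts, len(t.counts))
	for k, c := range t.counts {
		out[k] = *c
	}
	return out
}

func main() {
	tracker := newRequestTracker()
	tracker.Add("", "10.0.0.1", true)
	tracker.Add("", "10.0.0.1", true)
	tracker.Add("admin", "10.0.0.2", false)

	blob, err := json.MarshalIndent(tracker.Snapshot(), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(blob))
}
```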
Sebastian Borza 6bb85f809a sort influx_inspect detailed report results 2017-05-08 23:30:40 -05:00
Jason Wilder e6f31c38b5 Merge pull request #8372 from influxdata/jw-tombstone-range
Fix deletes triggering unnecessary compactions
2017-05-08 16:52:59 -06:00
Jason Wilder a9920cd6a9 Merge pull request #8370 from influxdata/jw-races
Fixes races/memory usage
2017-05-08 15:13:11 -06:00
Jason Wilder 29c2b1958e Fix deletes triggering unnecessary compactions
Tombstone files would be written to all TSM files even if the deleted
keys or timerange did not exist in the TSM file.  This had the side
effect of causing shards to get recompacted back to the same state. If
many shards or large numbers of TSM files existed, disk usage and CPU
utilization would spike, causing issues.

This prevents tombstones from being written for TSM files that could not
possibly contain the series keys being deleted, or when the deleted time
range is outside the range of the file.
2017-05-08 14:52:28 -06:00
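
The check described in commit 29c2b1958e above can be pictured as two range tests per file. The sketch below assumes each file records its minimum and maximum key and time; the `tsmFileInfo` type and `needsTombstone` function are hypothetical and stand in for whatever metadata the real TSM file index exposes.

```go
package main

import "fmt"

// tsmFileInfo is a hypothetical summary of one TSM file: the range of series
// keys it holds (in lexical order) and the time range of its data.
type tsmFileInfo struct {
	minKey, maxKey   string
	minTime, maxTime int64
}

// overlapsTime reports whether [min, max] intersects the file's time range.
func (f tsmFileInfo) overlapsTime(min, max int64) bool {
	return f.minTime <= max && f.maxTime >= min
}

// mayContainKey reports whether key falls inside the file's key range. A true
// result is only a possibility; a false result rules the key out.
func (f tsmFileInfo) mayContainKey(key string) bool {
	return key >= f.minKey && key <= f.maxKey
}

// needsTombstone reports whether deleting keys over [min, max] could affect
// the file. Files for which it returns false can be skipped entirely.
func needsTombstone(f tsmFileInfo, keys []string, min, max int64) bool {
	if !f.overlapsTime(min, max) {
		return false
	}
	for _, k := range keys {
		if f.mayContainKey(k) {
			return true
		}
	}
	return false
}

func main() {
	f := tsmFileInfo{minKey: "cpu,host=a", maxKey: "cpu,host=m", minTime: 100, maxTime: 200}
	fmt.Println(needsTombstone(f, []string{"cpu,host=b"}, 150, 300)) // key and time overlap
	fmt.Println(needsTombstone(f, []string{"mem,host=b"}, 150, 300)) // key outside the file
	fmt.Println(needsTombstone(f, []string{"cpu,host=b"}, 201, 300)) // time outside the file
}
```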
Jonathan A. Sternberg 4df54aa86b Merge pull request #8357 from rw-influxdata/2017-05--fix-panic-in-AST-rewriter
Fix panic in AST rewriter when (*SelectStatement).Condition == nil
2017-05-08 15:21:29 -05:00
Jason Wilder 9374c4f513 Reduce allocations when monitoring shards
When monitoring shards, a slice of measurements is allocated for
each shard.  With many shards and measurements, these allocations
can be large.  Since inmem shards share the same index, we only
need to do this once since the resulting slices are all the same.
This reduces memory usage when monitoring shard cardinality.
2017-05-08 13:34:40 -06:00
Jason Wilder 00bdf62b83 Make sure shard is ready before returning index type
Shards can be created before they are opened and may not have an index
set up yet.  This can cause a panic if IndexType is called.
2017-05-08 12:48:35 -06:00
Jason Wilder 041262af0e Fix race in shard
engine was accessed outside of an RLock, which can cause a race when
monitoring goroutines access the shard while it's closed/closing.
2017-05-08 12:37:18 -06:00
Ben Johnson ef6b0e214b Merge pull request #8366 from benbjohnson/tsi-inspect
Add TSI support tooling.
2017-05-08 11:01:17 -06:00
Ben Johnson 489c89bea4 Add tsi support tooling. 2017-05-08 11:00:15 -06:00
Jason Wilder c0c6ad6880 Don't disable snapshots when snapshot compactions are disabled
Snapshot compactions can be disabled independently of snapshotting
capability.  Disabling both together prevented taking backups of shards
that have compactions disabled.
2017-05-05 14:15:45 -06:00
Jonathan A. Sternberg a4a902e3f2 Merge pull request #8344 from influxdata/js-8343-csv-output-null-values
Set the CSV output to an empty string for null values
2017-05-05 10:01:55 -05:00
Jonathan A. Sternberg 260bdef3d4 Set the CSV output to an empty string for null values 2017-05-04 20:51:58 -05:00
Jason Wilder 0b018caf87 Merge pull request #8359 from influxdata/jw-index-race
Fix race in SeriesN and CreateSeriesIfNotExists
2017-05-04 17:49:14 -06:00
rw-influxdata 67279ccc64 Fix AST rewriting panic due to a nil Condition. 2017-05-04 14:51:53 -07:00
Jason Wilder 73ddd4787b Fix race in SeriesN and CreateSeriesIfNotExists 2017-05-04 14:40:50 -06:00
Jason Wilder 23af70add4 Merge pull request #8348 from influxdata/jw-tsm-compaction-limit
Compaction limits
2017-05-04 11:08:11 -06:00
Jason Wilder fc34d30038 Uses SeriesN instead of copying sketches
Avoids some extra allocations.
2017-05-04 10:12:38 -06:00
Jason Wilder bc639c5982 Make disableLevelCompactions lighter weight
Since this is called more frequently now, the cleanup func was invoked
quite a bit which makes several syscalls per shard.  This should only
be called the first time compactions are disabled.
2017-05-04 09:56:15 -06:00
Jason Wilder 7371f1067b Fix deadlock in Index.ForEachMeasurementTagKey
Index.ForEachMeasurementTagKey held an RLock while calling the fn.
If the fn made another call into the index that acquired an RLock
after another goroutine had tried to acquire a Lock, it would deadlock.
2017-05-03 22:48:10 -06:00
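
The deadlock in commit 7371f1067b above follows from how Go's sync.RWMutex treats a waiting writer: once a Lock call is pending, new RLock calls block, so a nested RLock taken inside the callback can wait forever behind a writer that is itself waiting for the outer RLock to be released. One common way to avoid this, shown below as a sketch, is to copy the data under the read lock and call the callback after releasing it. The commit's actual fix may differ, and the `index` type, its `tagKeys` field, and `HasTagKey` are simplified stand-ins for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// index is a simplified stand-in holding tag keys per measurement.
type index struct {
	mu      sync.RWMutex
	tagKeys map[string][]string
}

// HasTagKey takes its own read lock, like many index methods would.
func (i *index) HasTagKey(name, key string) bool {
	i.mu.RLock()
	defer i.mu.RUnlock()
	for _, k := range i.tagKeys[name] {
		if k == key {
			return true
		}
	}
	return false
}

// ForEachMeasurementTagKey copies the keys while holding the read lock and
// invokes fn only after releasing it, so fn may call back into the index
// without ever nesting one RLock inside another.
func (i *index) ForEachMeasurementTagKey(name string, fn func(key string) error) error {
	i.mu.RLock()
	keys := append([]string(nil), i.tagKeys[name]...)
	i.mu.RUnlock()

	for _, k := range keys {
		if err := fn(k); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	idx := &index{tagKeys: map[string][]string{"cpu": {"host", "region"}}}
	err := idx.ForEachMeasurementTagKey("cpu", func(key string) error {
		// Calling back into the index is safe: no lock is held here.
		fmt.Println(key, idx.HasTagKey("cpu", key))
		return nil
	})
	if err != nil {
		panic(err)
	}
}
```

The trade-off is that fn sees a snapshot: keys added or removed while the callbacks run are not reflected in the iteration.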