Commit Graph

12926 Commits (31f1ec2947d0b7b09a29f809f5017adba1c7c876)

Author SHA1 Message Date
Jason Wilder 31f1ec2947
Merge pull request #9225 from influxdata/jw-snapshost-concurrency
Disk utilization fixes
2017-12-14 23:07:32 -07:00
Jason Wilder 2d85ff1d09 Adjust compaction planning
Increase level 1 min criteria, fix only fast compactions getting run,
and fix very large generations getting included in optimize plans.
2017-12-14 22:41:34 -07:00
Jason Wilder 749c9d2483 Rate limit disk IO when writing TSM files
This limits the disk IO for writing TSM files during compactions
and snapshots.  This helps reduce the spiky IO patterns on SSDs and
when compactions run very quickly.
2017-12-14 22:02:32 -07:00
Jason Wilder 6e3602c937 Revert "Increase cache-snapshot-memory-size default"
This reverts commit 171b427a1b.
2017-12-13 13:18:08 -07:00
Jason Wilder 7dc5327a0a Adjust snapshot concurrency by latency
This changes the approach to adjusting the amount of concurrency
used for snapshotting to be based on the snapshot latency vs
cardinality.  The cardinality approach could use too much concurrency
and increase the number of level 1 TSM files too quickly which incurs
more disk IO.

The latency model seems to adjust better to different workloads.
2017-12-13 13:17:56 -07:00
David Norton 44a06e2aa6
Merge pull request #9213 from influxdata/dn-generate-digests
feat #9212: add ability to generate shard digests
2017-12-13 09:54:34 -05:00
David Norton 253ea7cc5e feat #9212: fix file in use bug on Windows 2017-12-13 09:29:07 -05:00
David Norton 0ae7fc821d feat #9212: update CHANGELOG.md 2017-12-13 09:29:07 -05:00
David Norton 98ebad951f feat #9212: move reader/writer tests over 2017-12-13 09:28:34 -05:00
David Norton 4e13248d85 feat #9212: add ability to generate shard digests 2017-12-13 09:28:34 -05:00
Stuart Carnie cfc742885d
Merge pull request #9218 from influxdata/sgc-prometheus
add Prometheus metrics HTTP endpoint
2017-12-12 07:40:54 -07:00
Adam af2918a193
fix file_store path bug that affects windows users (#9219) 2017-12-11 17:31:33 -05:00
Stuart Carnie 44f0147f67 update CHANGELOG 2017-12-11 08:57:37 -07:00
Stuart Carnie 0d29dc1121 add Prometheus metrics HTTP endpoint 2017-12-11 08:51:40 -07:00
Stuart Carnie 0be32d5fd9
Merge pull request #9208 from influxdata/sgc-client
Go 1.10 omits Content-Type header if server response has no body
2017-12-08 10:32:38 -07:00
Stuart Carnie beef4e64d9 update CHANGELOG 2017-12-07 19:21:17 -07:00
Stuart Carnie 4f0c70591b Go 1.10 omits Content-Type if server response has no body
https://tip.golang.org/doc/go1.10#net/http

> The content-serving handlers also now omit the Content-Type header
when serving zero-length content.
2017-12-07 19:18:44 -07:00
Adam a0b2195d6b
Pulled in backup-relevant code for review (#9193)
for issue #8879
2017-12-07 11:35:20 -05:00
Jason Wilder f250b64721
Merge pull request #9204 from influxdata/jw-tsm-sync
Fix higher disk utilization regression
2017-12-07 07:57:01 -07:00
Jason Wilder 0b929fe669 Update changelog 2017-12-06 13:45:43 -07:00
Jason Wilder 9f2a422039 Use disk based TSM index more selectively
The disk based temp index for writing a TSM file was used for
compactions other than snapshot compactions.  That meant it was
used even for smaller compactions that would not use much memory.
An unintended side-effect of this is higher disk IO when copying
the index to the final file.

This switches when to use the index based on the estimated size of
the new index that will be written.  This isn't exact, but seems to
kick in at higher cardinality and larger compactions when it
is necessary to avoid OOMs.
2017-12-06 13:45:43 -07:00
Jason Wilder 0a85ce2b73 Schedule compactions less aggressively
This runs the scheduler every 5s instead of every 1s and reduces
the scope of a level 1 plan.
2017-12-06 13:45:43 -07:00
Jason Wilder 56d8f05f12 Cap concurrent compactions when large number of cores exists
The default max-concurrent-compactions setting allows up to 50%
of cores to be used for compactions.  When the number of cores is
high (>8), this can lead to high disk utilization.  Capping at 4,
combined with high snapshot sizes, seems to keep the compaction
backlog reasonable and not tax the disks as much.  Systems with lots
of IOPS, RAM and CPU cores may want to increase these.
2017-12-06 13:45:08 -07:00
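The rule described above (half the cores, capped at 4) can be sketched as follows. The function name `defaultMaxConcurrentCompactions` and the floor of 1 for single-core machines are assumptions made for illustration; they are not taken from the InfluxDB source.

```go
package main

// maxCompactionCap mirrors the cap of 4 mentioned in the commit message.
const maxCompactionCap = 4

// defaultMaxConcurrentCompactions is a hypothetical helper that returns
// half the available cores, never less than 1 (an assumption) and never
// more than maxCompactionCap.
func defaultMaxConcurrentCompactions(cores int) int {
	n := cores / 2
	if n < 1 {
		n = 1
	}
	if n > maxCompactionCap {
		n = maxCompactionCap
	}
	return n
}
```

Under this sketch a 4-core machine gets 2 concurrent compactions, while 8-core and 32-core machines both get 4.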
Jason Wilder e584cb6842 Increase cache-snapshot-memory-size default
With the recent changes to compactions and snapshotting, the current
default can create lots of small level 1 TSM files.  This increases
the default in order to create larger level 1 files and reduce disk
utilization.
2017-12-06 09:39:03 -07:00
Jason Wilder 9c1d7d00a9 Switch O_SYNC to periodic fsync
O_SYNC was added with writing TSM files to fix an issue where the
final fsync at the end caused the process to stall.  This ends up
increasing disk utilization too much, so this change switches to
using multiple fsyncs while writing the TSM file instead of O_SYNC
or one large one at the end.
2017-12-06 09:35:24 -07:00
Jason Wilder fd11e200c8
Merge pull request #9185 from influxdata/jw-deletes
Delete fixes
2017-11-30 15:41:50 -07:00
Jason Wilder 909a2fb6cc Fix deletes removing index for invalid time ranges
If a delete for a time that does not exist was run, we would not
remove the series key from the slice of series to remove from the
index.

This could be triggered by running something like "delete from cpu where
time = 0" and if there was no data at time 0, the series would still
be removed from the index.
2017-11-30 15:01:01 -07:00
Jason Wilder b6096414c2 Fix compactions aborting early
If there were many individual deletes to a series that ended up
deleting every value in the block and the tombstone timestamps
were not contiguous, it was possible for the TSMKeyIterator to
return false for Next incorrectly.  This causes the compaction to
drop any remaining data in the file.

Normally, if all the data is deleted via tombstones, we remove the
whole key from the TSM index.  In this case, we're not able to determine
that the key is fully deleted until the block is decoded and tombstones
are applied.

This changes the TSMKeyIterator to detect this condition and continue
to the next key instead of aborting.
2017-11-30 14:38:09 -07:00
Mark Rushakoff 48217793ac
Merge pull request #9182 from influxdata/mr-changelog-fix
Fix incorrect link in CHANGELOG
2017-11-30 08:44:02 -08:00
Jonathan A. Sternberg 95e1e3b332
Merge pull request #8015 from influxdata/js-code-coverage
Expand code coverage for undercovered packages
2017-11-29 19:30:47 -06:00
Mark Rushakoff d8d0d2440a Fix incorrect link in CHANGELOG 2017-11-29 16:25:01 -08:00
Andrew Hare 4531165ce2
Merge pull request #9181 from influxdata/amh-import-compaction
Schedule a full compaction after a successful import
2017-11-29 15:30:07 -07:00
Andrew Hare 0f937065c1 Update CHANGELOG 2017-11-29 13:52:04 -07:00
Andrew Hare 761a8f8bec Schedule a full compaction after a successful import 2017-11-29 13:50:38 -07:00
Jason Wilder a520ed99ee
Merge pull request #9179 from influxdata/jw-delete-range
Fix removing series from index
2017-11-29 11:58:10 -07:00
Jason Wilder 5cf7d52694 Ensure series keys are sorted
The Measurement added series keys from a map where the iteration
order is non-deterministic.  The keys should be returned in sorted
order.
2017-11-29 11:24:10 -07:00
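Go randomizes map iteration order, so keys collected from a map have to be sorted explicitly before they are returned. A minimal sketch follows; the function name `sortedSeriesKeys` and the `map[string]struct{}` shape are illustrative assumptions, not the actual Measurement type.

```go
package main

import "sort"

// sortedSeriesKeys is a hypothetical helper that collects the keys of a
// map and returns them in ascending lexical order, independent of the
// order in which the map happens to be iterated.
func sortedSeriesKeys(series map[string]struct{}) []string {
	keys := make([]string, 0, len(series))
	for k := range series {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}
```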
Jason Wilder 8633e38549 Fix removing series from index
The loop to check if a series still exists in a TSM file was wrong
in that it 1) exited early after one iteration and 2) had an
off-by-one error that caused the wrong series to be marked as existing.

This fixes both of these cases which can cause the index to become
inconsistent with the data store on disk.
2017-11-29 10:45:04 -07:00
Edd Robinson c2f7f0f430
Merge pull request #8491 from influxdata/er-tsi-restore
Add support for TSI shard streaming and shard size
2017-11-29 15:40:52 +00:00
Jason Wilder df96d3064a
Merge pull request #9173 from influxdata/jw-cache-delete-sort
Fix Cache.DeleteRange not deleting all data
2017-11-28 17:33:32 -07:00
Andrew Hare 7b705732b4
Merge pull request #8971 from influxdata/ah-truncate-shards
Create a command to truncate shard groups
2017-11-28 17:26:35 -07:00
Andrew Hare d7e328050c
Merge branch 'master' into ah-truncate-shards 2017-11-28 17:25:31 -07:00
Andrew Hare 1d9a765084
Merge pull request #8845 from influxdata/amh-8789
Fix CLI to allow quoted database names
2017-11-28 17:24:24 -07:00
Andrew Hare 28ec02a7c1
Merge branch 'master' into amh-8789 2017-11-28 17:05:42 -07:00
Jason Wilder 887bca752e Skip flaky test on windows 2017-11-28 16:43:45 -07:00
Jonathan A. Sternberg b775ad3d5d Expand unit test code coverage in services that were undercovered
This expands code coverage for the following packages:
* monitor (3.5% -> 86.9%)
* services/precreator (31.6% -> 83.8%)
* services/retention (83.0% -> 84.9%)
* services/snapshotter (0.0% -> 82.1%)
* tcp (48.7% -> 60.0%)
2017-11-28 15:44:35 -06:00
Jonathan A. Sternberg d83b123b4f
Merge pull request #9145 from influxdata/js-9144-multiple-nested-distinct-calls
Fix query compilation so multiple nested distinct calls is allowable
2017-11-28 12:46:47 -06:00
Edd Robinson 81976bca59 Refactor based on new design 2017-11-28 17:54:29 +00:00
Jason Wilder e62f6d7cdf Fix Cache.DeleteRange not deleting all data
This fixes a regression in the Cache introduced in ca40c1ad3c where
not all the values in the cache entry would be removed.  Previously,
calling Exclude did not require the values to be sorted.  The change
in ca40c1ad3c relies on the values being sorted so it was possible for
it to find the wrong indexes when calling FindRange and leave some
data that should be deleted.

Fixes #9161
2017-11-28 10:39:21 -07:00
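The dependency on sorted values can be illustrated with a small sketch. A binary search only locates the correct range boundaries on sorted input, so one way to satisfy that requirement is to sort before searching. The function `excludeRange` below is hypothetical and operates on bare timestamps; it is not the actual Exclude or FindRange code, and the commit does not say whether the real fix sorts at this point.

```go
package main

import "sort"

// excludeRange removes every timestamp in the closed interval [min, max]
// from ts and returns the remaining values in ascending order.
// It sorts ts in place first, because the binary searches below would pick
// the wrong boundaries on unsorted input. The input slice is modified.
func excludeRange(ts []int64, min, max int64) []int64 {
	sort.Slice(ts, func(i, j int) bool { return ts[i] < ts[j] })

	// lo is the first index with ts[lo] >= min; hi is the first with ts[hi] > max.
	lo := sort.Search(len(ts), func(i int) bool { return ts[i] >= min })
	hi := sort.Search(len(ts), func(i int) bool { return ts[i] > max })
	if lo >= hi {
		return ts // nothing falls inside the range
	}
	return append(ts[:lo], ts[hi:]...)
}
```

Calling `excludeRange([]int64{5, 1, 4, 2, 3}, 2, 4)` leaves only 1 and 5. Without the sort, the two searches could bracket the wrong span and leave values that should have been deleted.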
Jonathan A. Sternberg db60a83d5a Fix query compilation so multiple nested distinct calls is allowable
When refactoring the query engine, I thought calling
`count(distinct(value))` multiple times was disallowed and so the
refactor made it so that wasn't possible.

It turns out that this pattern is allowed: since the distinct is
nested, it is aggregated anyway and can be combined with other
aggregates.

This removes the erroneously placed restriction.
2017-11-28 11:09:32 -06:00
Edd Robinson b10249a9b3 Fix rebase 2017-11-28 15:58:35 +00:00