9034 Commits (c1d6c14c47dfb4d60878a979fbdd8b526281d3d6)

Author SHA1 Message Date
Jason Wilder 2f7a0090c1 Don't allocate a pre-sized buffer for each cursor
This is contributing to some of the high memory usage on queries and possibly
some OOMs.  This is slightly slower, but removing it allows some fairly large
count queries over 5M series to complete instead of crashing the process using
tsm1 engine.
2016-01-06 10:50:38 -07:00
Philip O'Toole 614a37cdcd Merge pull request #5281 from influxdata/c_fixes
Cleanup TSM files
2016-01-05 20:45:07 -08:00
Philip O'Toole 9c916b0b76 Best-effort cleanup of converter. 2016-01-05 19:27:27 -08:00
Philip O'Toole 2a547b0db3 Increment sequence, not generation
Avoid having all the new files getting picked up by the compaction
planner on startup.
2016-01-05 19:27:22 -08:00
Philip O'Toole 53afa0addc Merge pull request #5280 from influxdata/c_fixes
Gather conversion stats and filter NaN and Inf
2016-01-05 21:13:45 -05:00
Philip O'Toole fbb3e861ca Clearer database backup message 2016-01-05 17:51:52 -08:00
Philip O'Toole 075ef45ae1 Gather conversion stats 2016-01-05 16:55:22 -08:00
Philip O'Toole cac96113c0 Merge pull request #5278 from influxdata/c_fixes
Skip bz1 bolt files without any points
2016-01-05 19:09:49 -05:00
Jason Wilder 90292dd429 Merge pull request #5279 from influxdata/jw-compaction-memory
Reduce allocations during TSM compactions
2016-01-05 16:50:18 -07:00
Philip O'Toole 140f54a01d Skip bolt files without any points 2016-01-05 15:19:56 -08:00
Jason Wilder 6f577cfef5 Reduce allocations when compacting
Key() returned the key and the entries.  We did not always need the
entries so they would be allocated and ignored.  Added a KeyAt func
that just returns the key to avoid the unnecessary entries allocation.
2016-01-05 16:16:44 -07:00
Jason Wilder 9a9ccab560 Reduce allocation in wal encoder
Use sync.Pool for some temporary buffers used while encoding instead of
allocating new ones each time.  Also increased the default buffer size which
might be too small.  Probably need to make this a config var.
2016-01-05 16:12:25 -07:00
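
The buffer reuse described in the commit above can be sketched as follows. This is a minimal illustration of the general sync.Pool pattern, not the actual WAL encoder: the names bufPool and encodeEntry and the "key=value" layout are hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool hands out reusable buffers (hypothetical name, not from the commit).
var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

// encodeEntry is a hypothetical encoder that borrows a temporary buffer
// from the pool instead of allocating a new one on every call.
func encodeEntry(key string, value []byte) []byte {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset()            // a pooled buffer may still hold old data
	defer bufPool.Put(buf) // return it for the next caller

	buf.WriteString(key)
	buf.WriteByte('=')
	buf.Write(value)

	// Copy the result out so the caller never aliases pooled memory.
	out := make([]byte, buf.Len())
	copy(out, buf.Bytes())
	return out
}

func main() {
	fmt.Printf("%s\n", encodeEntry("cpu", []byte("0.5")))
}
```

The copy at the end is the price of safety: once the buffer goes back to the pool, another goroutine may overwrite it.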
Jason Wilder ee54a1e791 Write TSM data directly to writer
We were buffering up the data to write into byte slices to reduce
IO calls but at larger sizes, this causes memory to spike.  The TSMWriter
was switched to use a bufio.Writer internally so this byte slice buffering
is unnecessary and costly now.
2016-01-05 14:46:07 -07:00
Jason Wilder d2889ecd6a Avoid creating slices of all keys during compaction 2016-01-05 09:38:00 -07:00
Jason Wilder dd90824eb5 Fix go vet in restore.go 2016-01-05 09:37:44 -07:00
Philip O'Toole d9ed54ce37 More updates for new Github org 2016-01-04 15:48:47 -08:00
Philip O'Toole 7c77d63eab Update packaing and build for new github org 2016-01-04 15:47:04 -08:00
Philip O'Toole d53674e2cc Add note about new tsm1 directory permissions
[ci skip]
2016-01-04 15:23:05 -08:00
Philip O'Toole c8ab232ea5 Merge pull request #5267 from influxdb/converter_fixes
Various converter fixes
2016-01-04 17:16:45 -05:00
Philip O'Toole ed10978b71 Increase default max TSM file size 2016-01-04 13:58:43 -08:00
Philip O'Toole 250c10f126 Tweak influx_tsm help output 2016-01-04 13:58:12 -08:00
Philip O'Toole 8212bc82b9 Correct typo in influx_tsm help 2016-01-04 13:55:36 -08:00
Jason Wilder 3bd323d817 Merge pull request #5264 from influxdb/jw-5257
Fix panic: runtime error: slice bounds out of range
2016-01-04 13:51:47 -07:00
Jonathan A. Sternberg a2df2ff162 Merge pull request #5240 from influxdb/js-5204-unbalanced-quote-in-tag-value-fix
Fix scanLine to handle quotes properly
2016-01-04 15:01:50 -05:00
Jonathan A. Sternberg c825ff7bae Merge pull request #5203 from influxdb/js-fix-use-test-panic
Add a mock client to the cli test for Use
2016-01-04 13:54:03 -05:00
Jason Wilder 7794b9c5d4 Fix panic: runtime error: slice bounds out of range
The block count was a uint16 when incrementing the index location,
which was an int32.  This caused the uint16 value to overflow
before the index location was incremented causing the wrong location
to be read on the next iteration of the loop.  This triggers the slice
out of range errors.

Added a test that recreates the panic seen in #5257 and possibly #5202 which
is older code.

Fixes #5257
2016-01-04 11:20:24 -07:00
Paul Dix 6ccc416ef0 Update CHANGELOG.md 2015-12-31 09:13:56 -05:00
Paul Dix ee233c849a Merge pull request #5224 from influxdb/pd-backup-restore
Implement backup/restore for TSM.
2015-12-31 08:56:12 -05:00
Paul Dix 49d480cb0c Fix races in backup/restore 2015-12-31 08:42:01 -05:00
Paul Dix 5974d37649 Fix backup test to mock out compaction 2015-12-31 08:15:13 -05:00
Paul Dix 9cede5fb71 Address PR comments 2015-12-30 18:06:51 -05:00
Paul Dix 26e1c6464a Update backup to address PR comments 2015-12-30 18:06:51 -05:00
Paul Dix 59fbd371fc Implement backup/restore for TSM.
This changes backup and restore to work for TSM. It breaks it for b1 and bz1, but since those are getting removed it's ok.

The backup runs against any host that is specified and can back up either the metastore, a database, a specific retention policy, or a specific shard. It can also take incremental backups with the `since` flag, which will only back up TSM files that have been created since that timestamp.

The backup is safe to run online. However, for shards that are still hot for writes, they won't be able to create new TSM files while the backup for that single shard runs. If the backup isn't too large and the write throughput isn't too high this shouldn't be a problem since the writes will just go into the WAL cache.
2015-12-30 18:06:50 -05:00
Michael Desa bf1673f466 Merge pull request #5239 from influxdb/md-add-db
Add flag to specify db and clarify flag descriptions
2015-12-30 10:59:31 -08:00
Michael Desa 7c025d8497 Change db flag message 2015-12-29 13:12:05 -08:00
Philip O'Toole def0148584 Merge pull request #5226 from influxdb/b_converter
b*1 to TSM converter
2015-12-29 16:09:47 -05:00
Philip O'Toole eaec514ca0 b*1 to tsm1 shard converter 2015-12-29 15:31:07 -05:00
Jonathan A. Sternberg 6b546cb766 Remove calls of os.Exit from influx cli Run method and fix influx tests
One of the first unit tests in the cli tests called the Run method.
Since the Run method called os.Exit, it reported the unit tests as
succeeded. When parallel is set to 1, this skips _all_ unit tests after
the first one. When parallel is set to a higher value, unit tests run by
other processes still get run.

This changes the Run method to return an error (if one occurred). This
error can then be printed out and a bad exit status can be used to exit
the program from the main program instead.  That causes the unit tests
to run correctly regardless of how many parallel processes are running.

Also added an additional option to the CLI called `IgnoreSignals`. If
this is set to true, then signals are not registered with the process.
Setting signals doesn't really work in unit tests so it's good to ensure
they don't get set in the first place.

In addition to fixing the influx cli tests, this adds a mock client to
the cli test for Use. PR #5183 added a validation for `use` to only be
able to select public databases so `_internal` couldn't be chosen. To
implement this, the `SHOW DATABASES` command was used by the internal
client.

Some of the unit tests in `cli_test.go` don't set the client to
anything. `TestParseCommand_Use` previously didn't, but now it needs to
have a client in the unit test with an empty test server.
2015-12-29 14:58:54 -05:00
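
The restructuring described in the commit above follows a common Go pattern: the library method returns an error, and only main decides the exit status. The sketch below assumes a simplified CommandLine type with a Host field and a trivial validation; only the IgnoreSignals option name comes from the commit message.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// CommandLine is a simplified, hypothetical stand-in for the CLI type.
type CommandLine struct {
	Host          string
	IgnoreSignals bool // when true, no signal handlers are registered
}

// Run returns an error instead of calling os.Exit, so a unit test can
// call it without terminating the test process.
func (c *CommandLine) Run() error {
	if c.Host == "" {
		return errors.New("no host specified")
	}
	if !c.IgnoreSignals {
		// Signal handlers would be registered here.
	}
	return nil
}

func main() {
	c := &CommandLine{Host: "localhost", IgnoreSignals: true}
	if err := c.Run(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1) // the only place the process exits
	}
}
```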
Jonathan A. Sternberg 2994eafc9b Fix scanLine to handle quotes properly
Quotes are handled differently in the line protocol depending on when
they are encountered. Quotes in field values matter, quotes anywhere
else don't.

`scanLine()` didn't understand this difference and treated all quotes
the same as ones for tag values. This resulted in `scanLine()` reading
the wrong amount of data sometimes when quotes were involved.

This fixes #5204.
2015-12-29 14:35:00 -05:00
Michael Desa ebd9b9978e Add flag to specify db and clarify flag descriptions 2015-12-29 11:31:28 -08:00
Jonathan A. Sternberg 0931e30dd2 Merge pull request #5194 from influxdb/js-5136-per-cq-options
Custom continuous query options per query rather than per node
2015-12-29 14:00:39 -05:00
Jonathan A. Sternberg 5d4ecf853c Add continuous query option for customizing resampling
This makes the following syntax possible:

    CREATE CONTINUOUS QUERY mycq ON mydb
        RESAMPLE EVERY 1m FOR 1h
        BEGIN
          SELECT mean(value) INTO cpu_mean FROM cpu GROUP BY time(5m)
        END

The RESAMPLE option customizes how often an interval will be sampled and
the duration. The interval is customized with EVERY. Any intervals
within the resampling duration on a multiple of the resample interval
will be updated with the new results from the query.

The duration is customized with FOR. This determines how long an
interval will participate in resampling.

Both options are optional. If RESAMPLE is in the syntax, at least one of
the two needs to be given. The default for both is the interval of the
continuous query.

The service also improves tracking of the last run time and the logic of
when a query for an interval should be run. When determining the oldest
interval to run for a query, the continuous query service determines
what would have been the optimal time to perform the next query based on
the last run time. It then uses this time to determine the oldest
interval that should be run using the resample duration and will
resample all intervals between this time and the current time as opposed
to potentially forgetting about the last run in an interval if the
continuous query service gets delayed for some reason.

This removes the previous config options for customizing continuous
queries since they are no longer relevant and adds a new option of
customizing the run interval. The run interval determines how often the
continuous query service polls for when it should execute a query. This
option defaults to 1s, but can be set to 1m if the least common factor
of all continuous queries' intervals is a higher value (like 1m).
2015-12-28 16:43:49 -05:00
Jason Wilder 86f433b2ab Merge pull request #5221 from influxdb/jw-compactions
Compaction concurrency
2015-12-27 14:54:20 -07:00
Jason Wilder b6da176a4b Fix direct index size not calculated 2015-12-23 18:01:11 -07:00
Jason Wilder f9ae8077da Allow compactions to run when files have tombstones 2015-12-23 18:01:11 -07:00
Jason Wilder a38c95ec85 Update compactions to run concurrently
This has a few changes in it (unfortunately).  The main change is to run compactions
concurrently.  While implementing this, a few query and performance bugs showed up that
are also fixed by this commit.
2015-12-23 18:01:11 -07:00
Jason Wilder 48d4156eac Fix blocks not sorted correctly when chunking 2015-12-23 18:01:11 -07:00
Jason Wilder bb2562b2ab Return CompactionGroups from planning 2015-12-23 18:01:11 -07:00
Jason Wilder d0ec0a15e2 Fix wrong test data setup 2015-12-23 18:01:11 -07:00
Philip O'Toole cbbb01ce8e Merge pull request #5215 from influxdb/run_flags
Add profiling flags to help output
2015-12-23 15:49:17 -05:00