This was contributing to some of the high memory usage on queries and possibly
some OOMs. Removing it is slightly slower, but it allows some fairly large
count queries over 5M series to complete instead of crashing the process when
using the tsm1 engine.
Key() returned the key and the entries. We did not always need the
entries, so they would be allocated and then ignored. Added a KeyAt func
that returns just the key to avoid the unnecessary entries allocation.
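A minimal sketch of the shape of the change (the index layout and types here are illustrative, not the actual tsm1 implementation):

    package main

    import "fmt"

    // Illustrative stand-ins for the index types.
    type IndexEntry struct {
        MinTime, MaxTime int64
        Offset           int64
        Size             uint32
    }

    type index struct {
        keys    []string
        entries [][]IndexEntry
    }

    // Key returns the key and its entries; callers that only want the key
    // still pay for allocating the entries slice.
    func (ix *index) Key(i int) (string, []IndexEntry) {
        es := make([]IndexEntry, len(ix.entries[i]))
        copy(es, ix.entries[i])
        return ix.keys[i], es
    }

    // KeyAt returns just the key, skipping the entries allocation.
    func (ix *index) KeyAt(i int) string {
        return ix.keys[i]
    }

    func main() {
        ix := &index{keys: []string{"cpu"}, entries: [][]IndexEntry{{{Offset: 42}}}}
        fmt.Println(ix.KeyAt(0))
    }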
Use sync.Pool for some temporary buffers used while encoding instead of
allocating new ones each time. Also increased the default buffer size, which
may have been too small. This probably needs to become a config var.
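Roughly the pattern, as a self-contained sketch (the encoding, pool contents, and 1024-byte default are placeholders, not the engine's actual values):

    package main

    import (
        "encoding/binary"
        "fmt"
        "sync"
    )

    // Scratch buffers are reused across encode calls instead of being
    // allocated fresh each time.
    var bufPool = sync.Pool{
        New: func() interface{} {
            b := make([]byte, 0, 1024) // default size; may need to be configurable
            return &b
        },
    }

    func encode(values []uint64) []byte {
        bp := bufPool.Get().(*[]byte)
        buf := (*bp)[:0]
        for _, v := range values {
            buf = binary.AppendUvarint(buf, v)
        }
        out := append([]byte(nil), buf...) // copy out before recycling the buffer
        *bp = buf
        bufPool.Put(bp)
        return out
    }

    func main() {
        fmt.Println(len(encode([]uint64{1, 2, 300})))
    }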
We were buffering up the data to write into byte slices to reduce
IO calls, but at larger sizes this caused memory to spike. The TSMWriter
was switched to use a bufio.Writer internally, so this byte slice buffering
is now unnecessary and costly.
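A sketch of the new approach, with file names and buffer sizes made up for illustration:

    package main

    import (
        "bufio"
        "log"
        "os"
    )

    func main() {
        f, err := os.Create("example.tsm")
        if err != nil {
            log.Fatal(err)
        }
        defer f.Close()

        // bufio batches small writes into fewer syscalls using a fixed-size
        // buffer, instead of growing a byte slice holding the whole block.
        w := bufio.NewWriterSize(f, 1<<16)
        if _, err := w.Write([]byte("block data")); err != nil {
            log.Fatal(err)
        }
        if err := w.Flush(); err != nil { // flush before the file is closed
            log.Fatal(err)
        }
    }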
The block count was a uint16 used when incrementing the index location,
which was an int32. The uint16 value overflowed before the index location
was incremented, causing the wrong location to be read on the next
iteration of the loop. This triggered the slice out of range errors.
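A contrived reproduction of the overflow (the entry size and counts are made up, but the wraparound is the same):

    package main

    import "fmt"

    const entrySize = 28 // bytes per index entry (illustrative)

    func main() {
        var blockCount uint16 = 3000
        var pos int32 = 100

        // Buggy: blockCount*entrySize is computed in uint16 and wraps at
        // 65536 before it is converted and added to the int32 position.
        bad := pos + int32(blockCount*entrySize)

        // Fixed: widen to int32 first so the multiplication cannot wrap.
        good := pos + int32(blockCount)*entrySize

        fmt.Println(bad, good) // 18564 84100
    }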
Added a test that recreates the panic seen in #5257 and possibly #5202,
which is against older code.
Fixes #5257
This changes backup and restore to work for TSM. It breaks it for b1 and bz1, but since those are getting removed it's ok.
The backup runs against any host that is specified and can back up the metastore, a database, a specific retention policy, or a specific shard. It can also take incremental backups with the `since` flag, which will only back up TSM files that have been created since that timestamp.
The backup is safe to run online. However, shards that are still hot for writes won't be able to create new TSM files while the backup for that single shard runs. If the backup isn't too large and the write throughput isn't too high, this shouldn't be a problem since the writes will just go into the WAL cache.
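Assuming the flags described above, usage looks roughly like this (host, retention policy name, timestamp, and paths are all illustrative):

    influxd backup -host 127.0.0.1:8088 /tmp/metastore.backup
    influxd backup -database mydb -retention default /tmp/mydb.backup
    influxd backup -database mydb -since 2015-12-24T08:12:23Z /tmp/mydb.incremental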
One of the first unit tests in the cli tests called the Run method.
Since the Run method called os.Exit, the test process terminated with a
success status and reported the unit tests as succeeded. When parallel is
set to 1, this skips _all_ unit tests after the first one. When parallel
is set to a higher value, unit tests run by other processes still get run.
This changes the Run method to return an error (if one occurred). The
error can then be printed and a bad exit status used to exit from the
main program instead. That allows the unit tests to run correctly
regardless of how many parallel processes are running.
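The shape of the fix, sketched with a stand-in run function (the argument handling is illustrative):

    package main

    import (
        "errors"
        "fmt"
        "os"
    )

    // run reports failures to the caller instead of calling os.Exit, so unit
    // tests can invoke it directly and assert on the returned error.
    func run(args []string) error {
        if len(args) == 0 {
            return errors.New("no command given")
        }
        // ... execute the CLI ...
        return nil
    }

    func main() {
        if err := run(os.Args[1:]); err != nil {
            fmt.Fprintln(os.Stderr, err)
            os.Exit(1) // only main decides the process exit status
        }
    }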
Also added an additional option to the CLI called `IgnoreSignals`. If
this is set to true, then signals are not registered with the process.
Registering signal handlers doesn't really work in unit tests, so it's
best to make sure they never get registered in the first place.
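A sketch of how the option might gate signal registration (the struct and method names here are illustrative; only `IgnoreSignals` comes from the change itself):

    package main

    import (
        "os"
        "os/signal"
    )

    type CommandLine struct {
        IgnoreSignals bool
    }

    func (c *CommandLine) registerSignals() {
        if c.IgnoreSignals {
            return // tests set this so no handler is ever installed
        }
        ch := make(chan os.Signal, 1)
        signal.Notify(ch, os.Interrupt)
        go func() {
            <-ch
            os.Exit(0)
        }()
    }

    func main() {
        c := &CommandLine{IgnoreSignals: true}
        c.registerSignals()
    }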
In addition to fixing the influx cli tests, this adds a mock client to
the cli test for Use. PR #5183 added a validation for `use` to only be
able to select public databases so `_internal` couldn't be chosen. To
implement this, the `SHOW DATABASES` command was used by the internal
client.
Some of the unit tests in `cli_test.go` don't set the client to
anything. `TestParseCommand_Use` previously didn't either, but it now
needs a client in the unit test pointed at an empty test server.
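Illustratively, the test can stand up an empty `httptest` server and point the client at it (the handler and response body below are assumptions, not the actual test code):

    package main

    import (
        "fmt"
        "net/http"
        "net/http/httptest"
    )

    func main() {
        // An empty server is enough: the CLI's `use` path issues SHOW DATABASES
        // through the client, and an empty result set satisfies the validation.
        ts := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            w.Header().Set("Content-Type", "application/json")
            fmt.Fprint(w, `{"results":[{}]}`) // response shape is an assumption
        }))
        defer ts.Close()

        fmt.Println("point the CLI client at", ts.URL)
    }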
Quotes are handled differently in the line protocol depending on where
they are encountered. Quotes in field values matter; quotes anywhere
else don't.
`scanLine()` didn't understand this difference and treated all quotes
the same as the ones for tag values. This caused `scanLine()` to sometimes
read the wrong amount of data when quotes were involved.
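For example, in a line like

    cpu,host=server01 msg="a, b c" 1000000000

the quotes around the field value keep the comma and the spaces from being treated as delimiters, while a quote anywhere else in the line is just another literal character.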
This fixes #5204.
This makes the following syntax possible:
CREATE CONTINUOUS QUERY mycq ON mydb
RESAMPLE EVERY 1m FOR 1h
BEGIN
SELECT mean(value) INTO cpu_mean FROM cpu GROUP BY time(5m)
END
The RESAMPLE option customizes how often an interval will be sampled and
for what duration. The sampling frequency is customized with EVERY: any
interval within the resampling duration that falls on a multiple of the
resample interval will be updated with the new results from the query.
The duration is customized with FOR. This determines how long an
interval will participate in resampling.
Both options are optional. If RESAMPLE is present, at least one of the
two must be given. Each defaults to the interval of the continuous
query. In the example above, the query runs every minute and recomputes
each 5m interval that began within the past hour.
The service also improves tracking of the last run time and the logic for
deciding when a query for an interval should be run. When determining the
oldest interval to run for a query, the continuous query service computes
what would have been the optimal time to perform the next query based on
the last run time. It then uses this time, together with the resample
duration, to determine the oldest interval that should be run, and
resamples all intervals between that time and the current time. This
avoids forgetting about the last run in an interval when the continuous
query service gets delayed for some reason.
This removes the previous config options for customizing continuous
queries, since they are no longer relevant, and adds a new option for
customizing the run interval. The run interval determines how often the
continuous query service polls to see whether a query should be executed.
This option defaults to 1s, but can be raised to 1m if every continuous
query's interval is a multiple of 1m.
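Assuming the option lives in the `[continuous_queries]` section of the config, it would look something like this (the key name is an assumption):

    [continuous_queries]
      run-interval = "1s"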
This has a few changes in it (unfortunately). The main change is to run compactions
concurrently. While implementing this, a few query and performance bugs showed up,
and they are also fixed by this commit.