top() and bottom() point ordering was incorrect and using an inefficient
method of sorting. It has now been updated to use a heap and ordering is
being done by value first and time second (with earlier times always
taking priority).
Removed unit tests that test using `time` inside of the query to get the
real time instead of the interval time and only allowing the default
behavior. We will have another mechanism to get the real time during an
interval, but the current method is deprecated.
The top() and bottom() methods now have integer support.
It had the time values for the selectors being returned equal the actual
points time. We have decided to have the time always be the interval
time and adding another feature later that can return the selected
point's time in the future.
The history file is cleared before WriteHistory is called after each
command/exit() to prevent exponential file growth.
This commit addresses issue #5436, please see PR for full explanation.
Under highly conncurrent write load, the coordinating node would
create a connection to any other node that is part of the replica
group. Since each connection can be expensive, OOM sitations could
occur because there was no bounds on the number of new connections
that would be created. If writes on a remote node were slow, connections
could pile up an exacerbate the problem.
This switches the pool to be bounded and has a checkout that is blocking
with a timeout. If a connection is available, it's returned immediately.
If the pool still has room for more connections, it will create one if needed.
Otherwise, the call will block until a connection becomes available or
the timeout expires. In the case of a timeout, it is propogated back up
to the PointsWriter that determine what do return to the client.
If a bind-address of :8088 is used, cluster nodes cannot
connect to those nodes because there is no hostname portion
of the address. When we see a bind-address without a hostname,
use the os hostname or localhost if that fails if it is not specified
in the config already.
* Improve the ping endpoint so that it can optionally check for leader agreement across all meta servers
* Add Ping method to the meta client
* Fix ClusterID tests
* Remove WaitForLeader from meta client and remove unnecessary references to it
* Updated CreateShardGroup to not return an error if it already exists so it's idempotent
* Removed old test making sure you can't delete the default RP. You can delete it now, there was no reason to disallow it.
* Wired up the UpdateRetentionPolicy functionality
* Add dir, hostname, and bind address to top level config since it applies to services other than meta
* Add enabled flags to example toml for data and meta services
* Wire up add/remove raft peers and meta servers to meta service
* Update DROP SERVER to be either DROP META SERVER or DROP DATA SERVER
* Bring over statement executor from old meta package
* Start meta service client implementation
* Update meta service test to use the client
* Wire up node ID/meta server storage information
This changes backup and restore to work for TSM. It breaks it for b1 and bz1, but since those are getting removed it's ok.
The backup runs against any host that is specified and can backup either the metasstore, a database, specific retention policy, or a specific shard. It can also take incremental backups with the `since` flag, which will only backup TSM files that have been created since that timestamp.
The backup is safe to run online. However, for shards that are still hot for writes, they won't be able to create new TSM files while the backup for that single shard runs. If the backup isn't too large and the write throughput isn't too high this shouldn't be a problem since the writes will just go into the WAL cache.
One of the first unit tests in the cli tests called the Run method.
Since the Run method called os.Exit, it reported the unit tests as
succeeded. When parallel is set to 1, this skips _all_ unit tests after
the first one. When parallel is set to a higher value, unit tests run by
other processes still get run.
This changes the Run method to return an error (if one occurred). This
error can then be printed out and a bad exit status can be used to exit
the program from the main program instead. That causes the unit tests
to run correctly regardless of how many parallel processes are running.
Also added an additional option to the CLI called `IgnoreSignals`. If
this is set to true, then signals are not registered with the process.
Setting signals doesn't really work in unit tests so it's good to ensure
they don't get set in the first place.
In addition to fixing the influx cli tests, this adds a mock client to
the cli test for Use. PR #5183 added a validation for `use` to only be
able to select public databases so `_internal` couldn't be chosen. To
implement this, the `SHOW DATABASES` command was used by the internal
client.
Some of the unit tests in `cli_test.go` don't set the client to
anything. `TestParseCommand_Use` previously didn't, but now it needs to
have a client in the unit test with an empty test server.
This has a few changes in it (unfortuantely). The main change is to run compactions
concurrently. While implementing this, a few query and performance bugs showed up that
are also fixed by this commit.