Fix for #5804.
The commit for #5789 rendered the semantics of snapshotCount statistic
useless. This commit restores semantics that have diagnostic value to
this statistic.
Signed-off-by: Jon Seymour <jon@wildducktheories.com>
The Cache had support for taking multiple snapshots to support writing
multiple snapshots to TSM files concurrently if that happened to be
a bottleneck. In practice, this is never a bottleneck and we only
run one snappshoting goroutine continously per shard which has worked
well for all workloads.
The multiple snapshot support introduces some unhandled failure scenarios
where wal segments could be removed without writing them to TSM files. If
a snapshot compaction fails to write due to transient disk errors, subsequent
snapshots will continue, but the failed one will not be retried. When the
subsequent ones succeeded, all closed wal segments are removed causing data
loss.
This change simplifies the snapshotting capability to ensure that there is only
ever one snapshot. If one fails, the next snapshot will update the existing
snapshot and retry all of old and new data.
Fixes#5686
This fixes a regression introduced in #5757 due to the node.ID getting
assigned by both the meta and data services. When both roles are active,
the data CreateDataNode path was not getting called because a node ID was
already assigned.
This fixes the issue by seeing if a DataNode already exists for our node
ID, and if it does not, we create one.
The dimensions array in `RewriteWildcards` gets emptied by an earlier
section of the code and then tries to iterate over that empty slice to
append it to the list of dimensions.
That makes the loop dead code that can't ever be hit.
Also improve the efficiency of this method by not creating a new slice
when there are no wildcards. We already check at the beginning of the
function if there is a wildcard out of necessity. There's no point in
making a new slice and copying the contents if we know that there will
be no wildcards to expand.
It also improves memory efficiency by assuming that if a wildcard
exists, there is only one and the pre-allocated slice can take advantage
of that. If there are multiple wildcards, then a new slice will have to
be created in the middle of the loop to raise the capacity.
Fixes#5680.
When dropping a data node, the following will now happen on the
Meta Store.
1) If any shards no longer have any owners (because the data node
being dropped is the only owner), they will be reassigned a
new owner from within their respective shard group.
2) If a shard group no longer has any shards/data nodes, they will
be marked as deleted.
When a shard is being assigned a new owner a data node with the fewest
number of shards in the shard group will be selected as the new owner.
Finally, checking the validity of a data node's ID now happens in the
Meta store, rather than in the state machine.
When a wildcard is specified for the field but not the dimensions, the
dimensions get added to the list of fields as part of
`RewriteWildcards()`.
But when a dimension was given with no wildcard, the dimension didn't
get removed from the wildcard in the fields section. This teaches the
rewriter to disclude dimensions explicitly included from being expanded
as a field. Now this statement when a measurement has one tag named host
and a field named value:
SELECT * FROM cpu GROUP BY host
Would expand to this:
SELECT value FROM cpu GROUP BY host
Instead of this:
SELECT host, value FROM cpu GROUP BY host
If you want the latter behavior, you can include it like this:
SELECT host, * FROM cpu GROUP BY host
Fixes#5770.
This fixes a couple of issues with starting meta-only nodes.
1. We were always calling CreateDataNode regardless of whether the the
node is running data services. We only call that now when node is
data enabled.
2. The node.json was created along-side creating the data node. Since
we are not creatinga a data node, this didn't happen anymore. There
wasn't a simple way to do this in one place so it's actually handle
for when creating a meta or a data node now. Since the ID assigned
to the node is the same regardless of role this works in all combinations
of roles.
3. The JoinMetaServer didn't return the ID of the joining node which
created some races when multiple nodes were joining. The join call now
returns that information to the caller.
Fixes#5754
The join option was incorrectly exposed on the meta config. It should
be at the top-level as a string and propogate down to the meta config
as a slice.
The cache had some incorrect logic for determine when a series needed
to be deduplicated. The logic was checking for unsorted points and
not considering duplicate points. This would manifest itself as many
points (duplicate) points being returned from the cache and after a
snapshot compaction run, the points would disappear because snapshot
compaction always deduplicates and sorts the points.
Added a test that reproduces the issue.
Fixes#5719
The intent of this change is to avoid writing caches created for
snapshot cache instances into the tsm1_cache measurement. We can do
this by avoiding use of the NewCache constructor. All other methods
are only intended to be called from on the engine cache - never
on a snapshot.
Signed-off-by: Jon Seymour <jon@wildducktheories.com>
Since we are not locking but relying on atomic arithmetic,
use Add rather than Set. Will also result in slightly less garbage
being created.
Signed-off-by: Jon Seymour <jon@wildducktheories.com>
The intent of this change is to ensure that all statistic fields of the
resulting tsm1_cache measurement are initialized on initialization of
the cache. That way, any consumer of those measurements doesn't
have to deal with the null case.
Signed-off-by: Jon Seymour <jon@wildducktheories.com>
Influx does not support fields with empty names or points
with no fields.
NewPoint is changed to validate that all field names are non-empty.
AddField is removed because we now require that all fields are
specified on construction.
NewPointFromByte is changed to return an error if a unmarshaled
binary point does not have any fields.
newFieldsFromBinary is changed to prevent an infinite loop that
can arise while attempting to parse corrupt binary point data.
TestNewPointsWithBytesWithCorruptData is changed to reflect the
change in the behaviour of NewPointFromByte.
Signed-off-by: Jon Seymour <jon@wildducktheories.com>
Complementing and extending the changes in #5758.
Add 2 level statistics:
* snapshotCount
* cacheAgeMs
Add 2 counter statistics
* cachedBytes
* WALCompactionTimeMs
snapshotCount can be used to measure transient write errors that are causing snapshots to accumulate
cacheAgeMs can be used to guage the level of write activity into the cache
The differences between cachedBytes stats sampled at different times can be used to calculate cache throughput rates
The ratio (cachedBytes-diskBytes)/WALCompactionTimeMs can be used calculate WAL compaction throughput.
The ratio of difference between first and last WAL compaction time over the interval
length is an estimate of percentage of cache throughput consumed.
Signed-off-by: Jon Seymour <jon@wildducktheories.com>