* feat(query): hyper log log counting in query engine
In addition to helping with normal queries, this can improve the 'SHOW CARDINALITY'
meta-queries:
time influx -database mydb -execute 'select count_hll(sum_hll(_seriesKey)) from big'
name: big
time count_hll
---- ---------
0 200767781
influx -database mydb -execute 0.06s user 0.12s system 0% cpu 8:49.99 total
Extending the context instead of fixing the API breaks type safety.
For tracking the number of points / values written, it is much clearer
to pass an explicit tracker.
When using queries like 'select count(_seriesKey) from bigmeasurement`, we
should iterate over the tsi structures to serve the query instead of loading
all the series into memory up front.
Closes#20543
fields.idx frequent writes cause lock contention and fields.idx is recreated
when a field or measurement is added in a WritePointsWithContext()
This eliminates locking during the actual file rewrite, and limits it to
the times when the MeasurementFieldSet is actually being read or written
in memory and when the new file is being renamed.
Test verification of correct behavior by checking the fields.idx
file matches the in-memory copy after heavily parallel measurement addition.
Fixes https://github.com/influxdata/influxdb/issues/20500
When a SELECT INTO query generates an illegal value that cannot be inserted,
like +/- Inf, it should return an error, rather than failing silently.
This adds a boolean parameter to the [data] section of influxdb.conf:
* strict-error-handling
When false, the default, the old behavior is preserved. When true,
unsupported values will return an error from SELECT INTO queries
Fixes https://github.com/influxdata/influxdb/issues/20426
A customer has seen a rash of "connection refused" errors to the meta node.
This fix ensures that when net.Listener instances are closed because of an
error in Accept(), influxdb logs the error which caused the closures, as well
as any errors in closing the Listeners.
Fixes https://github.com/influxdata/influxdb/issues/20256
JSON marshalling errors should be returned properly formatted in JSON
like other errors. This fix formats marshalling errors the same way
influxdb formats other query errors.
Fixes https://github.com/influxdata/influxdb/issues/20249
* fix(query): Group By queries with offset that crosses a DST boundary can fail
Customer reported that a GROUP BY query with an offset that caused an interval
to cross a daylight savings change inserted an extra output row off by one hour.
This fix ensured that the start time for the interval of a GROUP BY operator is
correctly set before calculating the time zone offset for that date and time.
Add TestGroupByIterator_DST() in query/iterator_test.go
for regression testing of this bug.
Fixes https://github.com/influxdata/influxdb/issues/20238
Loop with backoff in (*Engine).CreateSnapshot() to retry
(*Engine).WriteSnapshot() up to 3 times if
ErrSnapshotInPrgress is returned. Then continue
on no error or on SnapshotInProgress if skipCacheOk is
true.
https://github.com/influxdata/plutonium/issues/3227
(cherry picked from commit dfa6aa8cea)
Test the skipCacheOk flag to tsdb.Shard.CreateSnapshot() and
tsdb.Engine.CreateSnapshot()
A value of true allows the backup to proceed even if a cache
snapshot cannot be taken.
https://github.com/influxdata/plutonium/issues/3227
This fix adds a skipCacheOk flag to
tsdb.Store.CreateShardSnapshot() and tsdb.Shard.CreateSnapshot()
to pass to tsdb.Engine.CreateSnapshot()
A value of true allows the backup to proceed even if a cache snapshot
cannot be taken.
This flag is set to true in tsm1.Engine.Backup(), the OSS backup code path
This flag is set to false in tsm1.Engine.Export()
https://github.com/influxdata/plutonium/issues/3227
* fix: Upgrade version of jwt-go package to v4.0.0
This commit updates the dependencies for influxdb to require v4.0.0-preview1 of
the jwt-go package. This required updating the go.mod and go.sum files as well
as any source file that directly imported that package.
Prior to this commit, the TestHandler_Query_Auth() tests would fail as it
checked for specific error strigns returned by the jwt-go package.
Version 4.0.0-preview1 of the package changed the verbiage of those errors a
bit. This patch updates the test to detect the new error string.
When an InfluxDB database is very busy writing new points the backup
the process can fail because it can not write a new snapshot.
The error is: operation timed out with error: create snapshot: snapshot in progress.
This happens because InfluxDB takes almost "continuously" a snapshot
from the cache caused by the high number of points ingested.
The fix for this was https://github.com/influxdata/influxdb/pull/16627
but it was for OSS only, and was not in the code path for backups
in clusters.
This fix adds a skipCacheOk flag to tsdb.Engine.CreateSnapshot().
A value of true allows the backup to proceed even if a cache snapshot
cannot be taken.
This flag is set to true in tsm1.Engine.Backup(), the OSS backup code path
and in tsdb.Shard.CreateSnapshot(), the cluster backup code path.
This flag is set to false in tsm1.Engine.Export()
https://github.com/influxdata/plutonium/issues/3227
Prior to this commit, we had our own recursive file walker which
required a condition based on if s.Config.TypesDB pointed to a directory
or a regular file.
This commit replaces our own readdir() with filepath.Walk() and treats
recursing directories and loading one file as a single case. This
simplifies the code quite a bit.
* feat: generate modern profiles
Prior to this commit, influxd was writing legacy profiling data which
often (always?) required an accompanying executable to use.
This commit instructs influxd to write profiles in the new format which
can be examined without a binary.
While we're at it, this commit also adds the allocs and threadcreate
profiles.
Finally, this patch also changes the format of the downloaded tar in the
following ways:
* The profiles are added to the profile/ directory -- so instead of
extracting the profiles into your current directory, they're placed in
a "profiles" directory.
* This commit adds the .pb.gz extension to each of the files since
they're gzipped protobuf files and not .txt.