* feat(query): hyper log log counting in query engine
In addition to helping with normal queries, this can improve the 'SHOW CARDINALITY'
meta-queries:
time influx -database mydb -execute 'select count_hll(sum_hll(_seriesKey)) from big'
name: big
time count_hll
---- ---------
0 200767781
influx -database mydb -execute 0.06s user 0.12s system 0% cpu 8:49.99 total
When comparing strings in a case-insensitive way, strings.EqualFold() is
(almost?) always faster than comparing the results of strings.ToLower().
In addition, strings.EqualFold() never causes an allocation.
This patch replaces case-insensitive string comparisons that use
strings.ToLower() with a strings.EqualFold() call.
This commit quiets staticcheck's warnings about "unnecessary use of
fmt.Sprintf" and "unnecessary use of fmt.Sprint".
Prior to this commit we were wrapping simple constant strings without
any formatting verbs with fmt.Sprintf().
This integrates the influxdb 1.x series to the latest version of Flux
and updates the code to use it. It also removes the dependency on
platform and copies the necessary code from storage into the 1.x series
so the dependency is unneeded.
The flux functions specific to 1.x have been moved to the same structure
that flux changed to with having a `stdlib` directory instead of a
`functions` directory. It also adds a `databases()` function that
returns the databases from the meta client.
Specifically:
* renamed files for consistency between versions
* added `time-interval` schema option
* updated schema example documentation
Back port of improvements from #12710
If 120th or 240th value is not a 1, k still passes the check in the
switch, causing the last value to be lost. If this value occurs at
the boundary of a block, the max time will be incorrect, resulting in
compaction failing to make forward progress.
it's slightly slower, but the safety is worth it i think.
```
name old time/op new time/op delta
Next-8 30.0ns ± 2% 31.0ns ± 3% +3.56% (p=0.002 n=7+8)
NextParallel-8 79.4ns ± 1% 92.5ns ± 1% +16.58% (p=0.000 n=8+8)
```
use atomics rather than mutexes to synchronize state between calls.
```
name old time/op new time/op delta
Next-8 244ns ± 0% 30ns ± 2% -87.70% (p=0.000 n=8+7)
NextParallel-8 215µs ±60% 0µs ± 1% -99.96% (p=0.000 n=8+8)
```
The results for NextParallel are around ~80ns/op, but that doesn't
show up in the benchstat output.
multiple users have attempted to run influxdb in a docker container
with a windows host and a volume mounted from windows. that causes
problems because it apparently uses samba/cifs which does not
support fsync on directories. this patchset will, if it receives an EINVAL
on directory fsync, as is what appears to happen on samba/cifs, then it
will ignore it. this should help.
fixes#9833.
fixes#9630.
- reduce allocations by making leaf a value type with a bool
- make longestPrefix inlineable and have no bounds checks
- delete any code for functions we don't plan to use
- operate on []byte and only copy when necessary
- inline calls to sort.Search to avoid allocations and indirections
- insert directly in the correct location for addEdge
- reduce allocations during copying with a buffer helper
results:
name old time/op new time/op delta
Tree_Insert-8 1.10ms ± 4% 0.73ms ± 4% -33.54% (p=0.000 n=10+10)
Tree_InsertNew-8 3.18ms ± 2% 1.91ms ± 6% -39.90% (p=0.000 n=10+10)
name old speed new speed delta
Tree_Insert-8 9.12MB/s ± 4% 13.72MB/s ± 4% +50.46% (p=0.000 n=10+10)
Tree_InsertNew-8 3.15MB/s ± 2% 5.24MB/s ± 6% +66.42% (p=0.000 n=10+10)
name old alloc/op new alloc/op delta
Tree_InsertNew-8 1.62MB ± 0% 1.60MB ± 0% -1.28% (p=0.000 n=10+9)
name old allocs/op new allocs/op delta
Tree_InsertNew-8 35.0k ± 0% 15.0k ± 0% -57.04% (p=0.000 n=10+10)
MB/sec in this case is 1 byte per key inserted, so it's really millions
of keys inserted per second.
This is a fork of https://github.com/armon/go-radix that changes
a few things from the original:
* Does not allow updates to nodes
* Typed for int values only
* Is concurrent using a big lock on Tree
* Fix stream package to allow for renaming the file before writing it to the stream
* updated test to make sure that the final tsm file has more than one block
This commit adds initial empty sketches back to the tsi1 index, as well
as ensuring that ephemeral sketches in the index `LogFile` are updated
accordingly.
The commit also adds a test that verifies that the merged sketches at
the store level produce the correct results under writes, deletions and
re-opening of the store.
This commit does not provide working sketches for post-compaction on the
tsi1 index.