Commit Graph

24 Commits (db/6263/compaction-debug-logging)

Author SHA1 Message Date
WeblWabl 45a8227ad6
fix(influxd): update xxhash, avoid stringtoslicebyte in cache (#578) (#25622) (#25624)
* fix(influxd): update xxhash, avoid stringtoslicebyte in cache (#578)

* fix(influxd): update xxhash, avoid stringtoslicebyte in cache

This commit does 3 things:

* it updates xxhash from v1 to v2; v2 includes a assembly arm version of
  Sum64
* it changes the cache storer to write with a string key instead of a
  byte slice. The cache only reads the key which WriteMulti already has
as a string so we can avoid a host of allocations when converting back
and forth from immutable strings to mutable byte slices. This includes
updating the cache ring and ring partition to write with a string key
* it updates the xxhash for finding the cache ring partition to use
Sum64String which uses unsafe pointers to directly use a string as a
byte slice since it only reads the string. Note: this now uses an
assembly version because of the v2 xxhash update. Go 1.22 included new
compiler ability to recognize calls of Method([]byte(myString)) and not
make a copy but from looking at the call sites, I'm not sure the
compiler would recognize it as the conversion to a byte slice was
happening several calls earlier.

That's what this change set does. If we are uncomfortable with any of
these, we can do fewer of them (for example, not upgrade xxhash; and/or
not use the specialized Sum64String, etc).

For the performance issue in maz-rr, I see converting string keys to
byte slices taking between 3-5% of cpu usage on both the primary and
secondary. So while this pr doesn't address directly the increased cpu
usage on the secondary, it makes cpu usage less on both which still
feels like a win. I believe these changes are easier to review that
switching to a byte slice pool that is likely needed in other places as
the compiler provides nearly all of the correctness checks we need (we
are relying also on xxhash v2 being correct).

* helps #550

* chore: fix tests/lint

* chore: don't use assembly version; should inline

This 2 line change causes xxhash to use a purego Sum64 implementation
which allows the compiler to see that Sum64 only read the byte slice
input which them means is can skip the string to byte slice allocation
and since it can skip that, it should inline all the calls to
getPartitionStringKey and Sum64 avoiding 1 call to Sum64String which
isn't inlined.

* chore: update ci build file

the ci build doesn't use the make file!!!

* chore: revert "chore: update ci build file"

This reverts commit 94be66fde03e0bbe18004aab25c0e19051406de2.

* chore: revert "chore: don't use assembly version; should inline"

This reverts commit 67d8d06c02e17e91ba643a2991e30a49308a5283.

(cherry picked from commit 1d334c679ca025645ed93518b7832ae676499cd2)

* feat: need to update go sum

---------

Co-authored-by: Phil Bracikowski <13472206+philjb@users.noreply.github.com>
(cherry picked from commit 06ab224516)
2024-12-06 16:05:03 -06:00
Brandon Pfeifer e484c4d871
chore: upgrade Go to v1.19.3 (1.x) (#23941)
* chore: upgrade Go to 1.19.3

This re-runs ./generate.sh and ./checkfmt.sh to format and update
source code (this is primarily responsible for the huge diff.)

* fix: update tests to reflect sorting algorithm change
2022-11-28 12:15:47 -05:00
Sam Arnold c240c7906e
fix: a few suddenly flaky tests involving randomness (#21818)
Closes #21817
2021-07-09 12:17:23 -04:00
Sam Arnold 903b8cd0ea
feat(query): Hyper log log operators in influxql (#20603)
* feat(query): hyper log log counting in query engine

In addition to helping with normal queries, this can improve the 'SHOW CARDINALITY'
meta-queries:

time influx -database mydb -execute 'select count_hll(sum_hll(_seriesKey)) from big'
name: big
time count_hll
---- ---------
0    200767781
influx -database mydb -execute   0.06s user 0.12s system 0% cpu 8:49.99 total
2021-02-08 08:38:14 -05:00
Ben Wells e9bada090f Fix misspelling identified by misspell 2019-02-03 20:27:43 +00:00
Jacob Marble 2dc2b97fb9
tsdb/index: Add Bytes() methods (#9794) 2018-05-04 08:47:05 -07:00
Jacob Marble 63f0eee472 HLL++: estimate memory footprint bytes 2018-04-24 09:28:25 -07:00
Mark Rushakoff 426a9a0b8b Use math/bits exclusively instead of go-bits
We won't be rolling back to pre-Go1.9, so prefer the standard library
over a dependency that provides backwards compatibility.
2018-03-15 12:03:24 -07:00
Edd Robinson e5c8fd9dc5 Ensure nil sketches never returned 2018-02-09 15:29:42 +00:00
Edd Robinson 0d164f3164
WIP - tsi integration sketches 2018-02-07 14:52:13 -07:00
Edd Robinson 544329380f
Add empty series sketches back to tsi1 index
This commit adds initial empty sketches back to the tsi1 index, as well
as ensuring that ephemeral sketches in the index `LogFile` are updated
accordingly.

The commit also adds a test that verifies that the merged sketches at
the store level produce the correct results under writes, deletions and
re-opening of the store.

This commit does not provide working sketches for post-compaction on the
tsi1 index.
2018-02-07 14:52:13 -07:00
Edd Robinson a174f65595 use math/bits in HLL implementation 2017-09-26 12:51:08 +01:00
Seif Lotfy 4cb01c1768 change beta constants for the hll cardinality bias estimator 2017-06-30 07:47:16 -07:00
Seif Lotfy 643b2eb30c Switch to LogLog-Beta Cardinality estimation
The new algorithm uses only one formula and needs no additional bias corrections for the entire range of cardinalities,
therefore, it is more efficient and simpler to implement. Our simulations show that the accuracy provided by the new
algorithm is as good as or better than the accuracy provided by either of HyperLogLog or HyperLogLog++. The sparse
representation was kept in to provide better low cardinality accuracy. However the linear counting and range estimations
are replaced.
2017-06-20 15:25:01 +02:00
Ben Johnson 623ff67221
Fix HLL variableLengthList size decoding. 2017-05-19 11:44:25 -06:00
Edd Robinson 1c4ecb12c1 Don't panic on nil engine 2017-03-22 10:07:29 -06:00
Stuart Carnie 0ebbfb8f77 hll: skip recalc of sparseSet if tmpSet is empty
```
benchmark                                 old ns/op     new ns/op     delta
BenchmarkSet_Count/set_size_1000-8        38095         28.3          -99.93%
BenchmarkSet_Count/set_size_5000-8        152052        30.1          -99.98%
BenchmarkSet_Count/set_size_10000-8       50953         54978         +7.90%
BenchmarkSet_Count/set_size_50000-8       32495         31222         -3.92%
BenchmarkSet_Count/set_size_1000000-8     32603         30800         -5.53%

benchmark                                 old allocs     new allocs     delta
BenchmarkSet_Count/set_size_1000-8        4              0              -100.00%
BenchmarkSet_Count/set_size_5000-8        4              0              -100.00%
BenchmarkSet_Count/set_size_10000-8       0              0              +0.00%
BenchmarkSet_Count/set_size_50000-8       0              0              +0.00%
BenchmarkSet_Count/set_size_1000000-8     0              0              +0.00%

benchmark                                 old bytes     new bytes     delta
BenchmarkSet_Count/set_size_1000-8        16496         0             -100.00%
BenchmarkSet_Count/set_size_5000-8        16497         0             -100.00%
BenchmarkSet_Count/set_size_10000-8       0             0             +0.00%
BenchmarkSet_Count/set_size_50000-8       0             0             +0.00%
BenchmarkSet_Count/set_size_1000000-8     0             0             +0.00%
```
2017-01-31 08:51:05 -07:00
Edd Robinson ab94c1b743 Fixes #7882 2017-01-30 19:12:24 +00:00
Edd Robinson 695adafc00
Add measurement sketches 2017-01-05 10:15:37 -07:00
Edd Robinson 1339c7b146
Initialise HLL with error 2017-01-05 10:15:37 -07:00
Ben Johnson cb93f10120
Remove per-shard in-memory index. 2017-01-05 10:11:09 -07:00
Edd Robinson e2c3b52ca4
Adds a custom HyperLogLog++ implementation 2017-01-05 10:00:14 -07:00
Edd Robinson bd8dd9a291
Sketches working 2017-01-05 09:54:04 -07:00
Edd Robinson d19fbf5ab4
Wire in HLL estimator 2017-01-05 09:54:03 -07:00