Commit Graph

45 Commits (db/6263/compaction-debug-logging)

Author SHA1 Message Date
WeblWabl 45a8227ad6
fix(influxd): update xxhash, avoid stringtoslicebyte in cache (#578) (#25622) (#25624)
* fix(influxd): update xxhash, avoid stringtoslicebyte in cache (#578)

* fix(influxd): update xxhash, avoid stringtoslicebyte in cache

This commit does 3 things:

* it updates xxhash from v1 to v2; v2 includes a assembly arm version of
  Sum64
* it changes the cache storer to write with a string key instead of a
  byte slice. The cache only reads the key which WriteMulti already has
as a string so we can avoid a host of allocations when converting back
and forth from immutable strings to mutable byte slices. This includes
updating the cache ring and ring partition to write with a string key
* it updates the xxhash for finding the cache ring partition to use
Sum64String which uses unsafe pointers to directly use a string as a
byte slice since it only reads the string. Note: this now uses an
assembly version because of the v2 xxhash update. Go 1.22 included new
compiler ability to recognize calls of Method([]byte(myString)) and not
make a copy but from looking at the call sites, I'm not sure the
compiler would recognize it as the conversion to a byte slice was
happening several calls earlier.

That's what this change set does. If we are uncomfortable with any of
these, we can do fewer of them (for example, not upgrade xxhash; and/or
not use the specialized Sum64String, etc).

For the performance issue in maz-rr, I see converting string keys to
byte slices taking between 3-5% of cpu usage on both the primary and
secondary. So while this pr doesn't address directly the increased cpu
usage on the secondary, it makes cpu usage less on both which still
feels like a win. I believe these changes are easier to review that
switching to a byte slice pool that is likely needed in other places as
the compiler provides nearly all of the correctness checks we need (we
are relying also on xxhash v2 being correct).

* helps #550

* chore: fix tests/lint

* chore: don't use assembly version; should inline

This 2 line change causes xxhash to use a purego Sum64 implementation
which allows the compiler to see that Sum64 only read the byte slice
input which them means is can skip the string to byte slice allocation
and since it can skip that, it should inline all the calls to
getPartitionStringKey and Sum64 avoiding 1 call to Sum64String which
isn't inlined.

* chore: update ci build file

the ci build doesn't use the make file!!!

* chore: revert "chore: update ci build file"

This reverts commit 94be66fde03e0bbe18004aab25c0e19051406de2.

* chore: revert "chore: don't use assembly version; should inline"

This reverts commit 67d8d06c02e17e91ba643a2991e30a49308a5283.

(cherry picked from commit 1d334c679ca025645ed93518b7832ae676499cd2)

* feat: need to update go sum

---------

Co-authored-by: Phil Bracikowski <13472206+philjb@users.noreply.github.com>
(cherry picked from commit 06ab224516)
2024-12-06 16:05:03 -06:00
Jack 6af0be9234
fix: panic index out of range for invalid series keys (#24565)
* chore: add scaffolding for naive solution

* feat: test case scaffolding

* fix: implement check for series key before proceeding

* fix: add validation for ReadSeriesKeyMeasurement usage

* refactor: explicit use of series key len

* feat: add remaining check to index

* feat: add check to remaining files

As the Len function is used as part of the parseSeriesKey, this also needs to be accounted for on the nil return from this function as it is used in different contexts

* feat: expand test cases

* chore: go fmt

* chore: update test failure message

* chore: impl feedback on unnecessary sz checks

* feat: expand test cases

* fix: nil series key check

In both sections for index.go there is a pre-existing length check against the series key which should catch invalid values, perhaps this explains why it hasn't cropped up in the reported panics. For even more safety, we can also skip a nil key because we know that subsequent calls will cause a panic where this key is attempted to be used

* fix: remove nil tags check

A key with no tags is valid, so we should not check for BOTH nil key and tags as a key could be nil, which is invalid, yet still have tags and therefore cause the check to pass which we do not want

* feat: extend test cases from feedback

* fix: extend checks for CompareSeriesKeys

* feat: add nilKeyHandler for shared key checking logic

* fix: logical error in nilKeyHandler

Prior to this, the else was always defaulted to at the end of the conditional branch, which causes unexpected behaviour and a failure of a bunch of tests.

* fix: return tags keep nil data

In a recent change to this, we agreed on a simple name == nil check for the actual data. As a follow on to this, I just realised that we don't actually want to nil back the tags, even if they're not checked, because having no tags is a valid input so we can simply return whatever we were passed unchanged.

* fix: use len == 0 for extra safety

* feat: extra test for blank series key
2024-01-23 09:44:29 +00:00
davidby-influx 969abf3da2
fix: avoid SIGBUS when reading non-std series segment files (#24509)
Some series files which are smaller than the standard
sizes cause SIGBUS in influx_inspect and influxd, because
entry iteration walks onto mapped memory not backed by the
the file.  Avoid walking off the end of the file while
iterating series entries in oddly sized files.

closes https://github.com/influxdata/influxdb/issues/24508

Co-authored-by: Geoffrey Wossum <gwossum@influxdata.com>
2023-12-08 15:46:11 -08:00
Sam Arnold 21823db00b
feat: series creation ingress metrics (#20700)
After turning this on and testing locally, note the 'seriesCreated' metric

"localStore": {"name":"localStore","tags":null,"values":{"pointsWritten":2987,"seriesCreated":58,"valuesWritten":23754}},
"ingress": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"cq","rp":"monitor"},"values":{"pointsWritten":2,"seriesCreated":1,"valuesWritten":4}},
"ingress:1": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"database","rp":"monitor"},"values":{"pointsWritten":2,"seriesCreated":2,"valuesWritten":4}},
"ingress:2": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"httpd","rp":"monitor"},"values":{"pointsWritten":2,"seriesCreated":1,"valuesWritten":46}},
"ingress:3": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"ingress","rp":"monitor"},"values":{"pointsWritten":14,"seriesCreated":14,"valuesWritten":42}},
"ingress:4": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"localStore","rp":"monitor"},"values":{"pointsWritten":2,"seriesCreated":1,"valuesWritten":6}},
"ingress:5": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"queryExecutor","rp":"monitor"},"values":{"pointsWritten":2,"seriesCreated":1,"valuesWritten":10}},
"ingress:6": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"runtime","rp":"monitor"},"values":{"pointsWritten":2,"seriesCreated":1,"valuesWritten":30}},
"ingress:7": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"shard","rp":"monitor"},"values":{"pointsWritten":2,"seriesCreated":2,"valuesWritten":22}},
"ingress:8": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"subscriber","rp":"monitor"},"values":{"pointsWritten":2,"seriesCreated":1,"valuesWritten":6}},
"ingress:9": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"tsm1_cache","rp":"monitor"},"values":{"pointsWritten":2,"seriesCreated":2,"valuesWritten":18}},
"ingress:10": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"tsm1_engine","rp":"monitor"},"values":{"pointsWritten":2,"seriesCreated":2,"valuesWritten":58}},
"ingress:11": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"tsm1_filestore","rp":"monitor"},"values":{"pointsWritten":2,"seriesCreated":2,"valuesWritten":4}},
"ingress:12": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"tsm1_wal","rp":"monitor"},"values":{"pointsWritten":2,"seriesCreated":2,"valuesWritten":8}},
"ingress:13": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"write","rp":"monitor"},"values":{"pointsWritten":2,"seriesCreated":1,"valuesWritten":18}},
"ingress:14": {"name":"ingress","tags":{"db":"telegraf","login":"_systemuser_unknown","measurement":"cpu","rp":"autogen"},"values":{"pointsWritten":1342,"seriesCreated":13,"valuesWritten":13420}},
"ingress:15": {"name":"ingress","tags":{"db":"telegraf","login":"_systemuser_unknown","measurement":"disk","rp":"autogen"},"values":{"pointsWritten":642,"seriesCreated":6,"valuesWritten":4494}},
"ingress:16": {"name":"ingress","tags":{"db":"telegraf","login":"_systemuser_unknown","measurement":"diskio","rp":"autogen"},"values":{"pointsWritten":214,"seriesCreated":2,"valuesWritten":2354}},
"ingress:17": {"name":"ingress","tags":{"db":"telegraf","login":"_systemuser_unknown","measurement":"mem","rp":"autogen"},"values":{"pointsWritten":107,"seriesCreated":1,"valuesWritten":963}},
"ingress:18": {"name":"ingress","tags":{"db":"telegraf","login":"_systemuser_unknown","measurement":"processes","rp":"autogen"},"values":{"pointsWritten":107,"seriesCreated":1,"valuesWritten":856}},
"ingress:19": {"name":"ingress","tags":{"db":"telegraf","login":"_systemuser_unknown","measurement":"swap","rp":"autogen"},"values":{"pointsWritten":214,"seriesCreated":1,"valuesWritten":642}},
"ingress:20": {"name":"ingress","tags":{"db":"telegraf","login":"_systemuser_unknown","measurement":"system","rp":"autogen"},"values":{"pointsWritten":321,"seriesCreated":1,"valuesWritten":749}},

Closes: https://github.com/influxdata/influxdb/issues/20613
2021-02-05 14:52:43 -04:00
Sam Arnold 98a76a11a0 feat(tsi): optimize series iteration
When using queries like 'select count(_seriesKey) from bigmeasurement`, we
should iterate over the tsi structures to serve the query instead of loading
all the series into memory up front.

Closes #20543
2021-01-25 14:27:31 -05:00
Edd Robinson 34c0fdafc0 feat(storage): Offline series file compaction 2020-02-03 13:57:31 -07:00
Jacob Marble b731cc5e60
feat(storage): Limit concurrent series partition compaction (#14240)
* feat(storage): Limit concurrent series partition snapshots

* feat: make concurrency configurable

* fix: integrate review feedback

* refactor: rename config value
2019-07-30 10:34:06 -07:00
Grzegorz Pomykala 8448cf4a9c build fixed 2019-02-06 16:27:59 +01:00
Grzegorz Pomykala fb3c837de9 code review sugestions applied 2019-02-06 09:10:51 +01:00
Grzegorz Pomykala be97f36e66 trace on SeriesFile.Open() failure 2018-12-06 15:58:05 +01:00
Grzegorz Pomykala a346109198 do not acquire a lock upon closing a SeriesFile when called from Open() method 2018-12-06 11:59:03 +01:00
Ben Johnson e4f8637234 Fix ParseSeriesKeyInto() buffer shrinkage. 2018-09-18 15:58:38 -07:00
Edd Robinson 6b3860e9a1 Reduce allocations in TSI TagSets implementation
Since all tag sets are materialised to strings before this method
returns, a large number of allocations can be avoided by carefully
resuing buffers and containers.

This commit reduces allocations by about 75%, which can be very
significant for high cardinality workloads.

The benchmark results shown below are for a benchmark that asks for all
series keys matching `tag5=value0'.

name                                               old time/op    new time/op    delta
Index_ConcurrentWriteQuery/inmem/queries_100000-8     5.66s ± 4%     5.70s ± 5%     ~     (p=0.739 n=10+10)
Index_ConcurrentWriteQuery/tsi1/queries_100000-8      26.5s ± 8%     26.8s ±12%     ~     (p=0.579 n=10+10)
IndexSet_TagSets/1M_series/inmem-8                   11.9ms ±18%    10.4ms ± 2%  -12.81%  (p=0.000 n=10+10)
IndexSet_TagSets/1M_series/tsi1-8                    23.4ms ± 5%    18.9ms ± 1%  -19.07%  (p=0.000 n=10+9)

name                                               old alloc/op   new alloc/op   delta
Index_ConcurrentWriteQuery/inmem/queries_100000-8    2.50GB ± 0%    2.50GB ± 0%     ~     (p=0.315 n=10+10)
Index_ConcurrentWriteQuery/tsi1/queries_100000-8     32.6GB ± 0%    32.6GB ± 0%     ~     (p=0.247 n=10+10)
IndexSet_TagSets/1M_series/inmem-8                   3.56MB ± 0%    3.56MB ± 0%     ~     (all equal)
IndexSet_TagSets/1M_series/tsi1-8                    12.7MB ± 0%     5.2MB ± 0%  -59.02%  (p=0.000 n=10+10)

name                                               old allocs/op  new allocs/op  delta
Index_ConcurrentWriteQuery/inmem/queries_100000-8     24.0M ± 0%     24.0M ± 0%     ~     (p=0.353 n=10+10)
Index_ConcurrentWriteQuery/tsi1/queries_100000-8      96.6M ± 0%     96.7M ± 0%     ~     (p=0.579 n=10+10)
IndexSet_TagSets/1M_series/inmem-8                     51.0 ± 0%      51.0 ± 0%     ~     (all equal)
IndexSet_TagSets/1M_series/tsi1-8                     80.4k ± 0%     20.4k ± 0%  -74.65%  (p=0.000 n=10+10)
2018-08-10 16:01:49 +01:00
Jacob Marble 5e4085a9df Correct godoc for SeriesFile.CreateSeriesListIfNotExists 2018-05-10 17:03:09 -07:00
Jacob Marble 87d73d405c tsdb/SeriesFile: remove unused function param 2018-05-04 11:22:12 -07:00
Jason Wilder 97f61e0ff4 Allow SeriesFile compaction to be disabled 2018-01-18 15:54:52 -07:00
Jason Wilder 28edf1392a Use full 32bits for series IDs
This reworks the series ID allocation to prevent an overflow issue.
2018-01-18 09:45:36 -07:00
Jason Wilder 5d1f76192a Ensure series file is not closed while in use 2018-01-12 16:58:33 -07:00
Jason Wilder 72910b6bf0 Don't clear partitions on close
Since partitions slice is not protected under a lock, setting it to
nil while closing causes a race if concurrent calls are accessing
the partitions slice elsewhere.  Since partitions do handle
concurrency and Open resets the slice, just leave it as it after
closing.
2018-01-11 15:46:38 -07:00
Ben Johnson 9bf45fcae0
Improve inmem insert performance with non-sequential series ids. 2018-01-10 13:08:16 -07:00
Ben Johnson ac4dc91c64
Partition series file. 2018-01-10 08:33:25 -07:00
Ben Johnson c0a46d2d3d
Fix series file compaction stall.
The series file compaction previously did not snapshot the max
offset before compacting and would keep compacting until it reached
the end of segment file. This caused more entries than expected into
the RHH map and this map gets exponentially slower as it gets close
to full.
2018-01-08 09:53:01 -07:00
Ben Johnson 6f32f15fd3
Add closed series file checks. 2018-01-08 09:11:29 -07:00
Ben Johnson 28e6a1a7c2
Fix series file key/id map compaction. 2018-01-04 09:16:29 -07:00
Ben Johnson 3ac428920c
Merge pull request #9276 from benbjohnson/fix-series-index-compaction
Fix series index compaction.
2018-01-03 15:13:19 -07:00
Ben Johnson 48648e828d
Fix series index compaction. 2018-01-03 12:47:07 -07:00
Ben Johnson 31c50532db
Add series existence check in tsi1. 2018-01-03 12:20:35 -07:00
Ben Johnson 3900c948a2
Fix requested changes. 2018-01-03 10:04:12 -07:00
Ben Johnson 52630e69d7
Integrate SeriesFileCompactor 2018-01-02 12:20:03 -07:00
Ben Johnson 56980b0d24
Segment series file 2017-12-29 11:57:45 -07:00
Ben Johnson 4ab1542cfc
Series file compactor. 2017-12-29 11:57:45 -07:00
Ben Johnson f240a930c7
Preserve original mmap in series file. 2017-12-21 20:00:05 -07:00
Ben Johnson 895ca7a04b
Adjust 386 max series file size. 2017-12-21 14:50:07 -07:00
Ben Johnson 4fd48cfcd1
Replace in-memory series file map with rhh. 2017-12-21 12:57:21 -07:00
Ben Johnson 4d7426ebbd
Fix race bug. 2017-12-21 10:12:21 -07:00
Ben Johnson d8b1d208c0
rebase 2017-12-20 15:13:34 -07:00
Ben Johnson 8b2dbf4d83
Merge branch 'er-tsi-index-part' of https://github.com/influxdata/influxdb into er-tsi-index-part 2017-12-19 10:33:02 -07:00
Ben Johnson 107291c6b0
series file refactor 2017-12-19 10:31:33 -07:00
Edd Robinson d59f79338b Update series map threshold 2017-12-15 15:58:01 +00:00
Edd Robinson 3bfe525705 Add 32-bit support to series file
This commit ensures that the series file should work appropriately on
32-bit architecturs. It does this by reducing the maximum size of a
series file to 512MB on 32-bit systems, which should be fully
addressable.

It further updates tests so that the series file size can be reduced
further when running many tests in parallel on 32-bit architectures.
2017-12-15 15:47:26 +00:00
Edd Robinson 68de5ca24f 🔥 little endianness 2017-11-30 17:57:16 +00:00
Edd Robinson a46f186118 Tweak constants 2017-11-30 17:23:03 +00:00
Ben Johnson 01491ca4f4
intermediate 2017-11-27 07:52:18 -07:00
Ben Johnson fc966a1b67
Add series file backup/restore. 2017-11-22 08:55:54 -07:00
Ben Johnson ede3fcf98e
intermediate 2017-11-15 16:09:25 -07:00