Commit Graph

135 Commits (1.11)

Author SHA1 Message Date
Geoffrey Wossum 733faa40d5
chore: improve error messages and logging during shard opening (#25323)
(cherry picked from commit 23008e5286)

Closes: #25320
2024-09-12 15:59:05 -05:00
WeblWabl a709074fb3
feat: add hook for optimizing series reads based on authorizer (#25207) (#25283) 2024-09-04 18:40:00 -05:00
Jack 8b9a9c63c5
fix: panic index out of range for invalid series keys (#24565) (#24595)
* chore: add scaffolding for naive solution

* feat: test case scaffolding

* fix: implement check for series key before proceeding

* fix: add validation for ReadSeriesKeyMeasurement usage

* refactor: explicit use of series key len

* feat: add remaining check to index

* feat: add check to remaining files

As the Len function is used as part of the parseSeriesKey, this also needs to be accounted for on the nil return from this function as it is used in different contexts

* feat: expand test cases

* chore: go fmt

* chore: update test failure message

* chore: impl feedback on unnecessary sz checks

* feat: expand test cases

* fix: nil series key check

In both sections for index.go there is a pre-existing length check against the series key which should catch invalid values, perhaps this explains why it hasn't cropped up in the reported panics. For even more safety, we can also skip a nil key because we know that subsequent calls will cause a panic where this key is attempted to be used

* fix: remove nil tags check

A key with no tags is valid, so we should not check for BOTH nil key and tags as a key could be nil, which is invalid, yet still have tags and therefore cause the check to pass which we do not want

* feat: extend test cases from feedback

* fix: extend checks for CompareSeriesKeys

* feat: add nilKeyHandler for shared key checking logic

* fix: logical error in nilKeyHandler

Prior to this, the else was always defaulted to at the end of the conditional branch, which causes unexpected behaviour and a failure of a bunch of tests.

* fix: return tags keep nil data

In a recent change to this, we agreed on a simple name == nil check for the actual data. As a follow on to this, I just realised that we don't actually want to nil back the tags, even if they're not checked, because having no tags is a valid input so we can simply return whatever we were passed unchanged.

* fix: use len == 0 for extra safety

* feat: extra test for blank series key
2024-01-23 17:07:52 +00:00
davidby-influx 926020e331
fix: correct error return shadowing (#22353) 2021-08-31 11:46:21 -07:00
davidby-influx a989f8f8b6
fix: copy names from mmapped memory before closing iterator (#22040)
This fix ensures that memory-mapped files are not released
before pointers into them are copied into heap memory.
MeasurementNamesByExpr() and MeasurementNamesByPredicate() can
cause panics by copying memory from mmapped files that have been
released. The functions they call use iterators to files which
are closed (releasing the mmapped files) before the memory is
safely copied to the heap.

closes https://github.com/influxdata/influxdb/issues/22000
2021-08-04 13:16:00 -07:00
Sam Arnold b7e7de24d6
refactor: separate coarse and fine permission interfaces (#20996) 2021-03-22 09:52:33 -04:00
Sam Arnold 04f4817aae
fix(services/storage): multi measurement queries return all applicable series (#19592) (#20934)
This fixes multi measurement queries that go through the storage service
to correctly pick up all series that apply with the filter. Previously,
negative queries such as `!=`, `!~`, and predicates attempting to match
empty tags did not work correctly with the storage service when multiple
measurements or `OR` conditions were included.

This was because these predicates would be categorized as "multiple
measurements" and then it would attempt to use the field keys iterator
to find the fields for each measurement. The meta queries for these did
not correctly account for negative equality operators or empty tags when
finding appropriate measurements and those could not be changed because
it would cause a breaking change to influxql too.

This modifies the storage service to use new methods that correctly
account for the above situations rather than the field keys iterator.

Some queries that appeared to be single measurement queries also get
considered as multiple measurement queries. Any query with an `OR`
condition will be considered a multiple measurement query.

This bug did not apply to single measurement queries where one
measurement was selected and all of the logical operators were `AND`
values. This is because it used a different code path that correctly
handled these situations.

Backport of #19566.

(cherry picked from commit ceead88bd5)

Co-authored-by: Jonathan A. Sternberg <jonathan@influxdata.com>
2021-03-12 16:34:14 -05:00
Sam Arnold 17b9ea8723
feat: Add WITH KEY to show tag keys (#20793)
* fix: Change from RewriteExpr to PartitionExpr

Also remove some dead code

* feat: WITH KEY implementation

* feat: query rewriting for WITH KEY in SHOW TAG KEYS
2021-02-25 08:38:29 -05:00
Sam Arnold de1a0eb2a9
feat: use count_hll for 'show series cardinality' queries (#20745)
Closes: https://github.com/influxdata/influxdb/issues/20614

Also fix nil pointer for seriesKey iterator

Fix for bug in: https://github.com/influxdata/influxdb/issues/20543

Also add a test for ingress metrics
2021-02-10 16:00:16 -05:00
Sam Arnold 21823db00b
feat: series creation ingress metrics (#20700)
After turning this on and testing locally, note the 'seriesCreated' metric

"localStore": {"name":"localStore","tags":null,"values":{"pointsWritten":2987,"seriesCreated":58,"valuesWritten":23754}},
"ingress": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"cq","rp":"monitor"},"values":{"pointsWritten":2,"seriesCreated":1,"valuesWritten":4}},
"ingress:1": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"database","rp":"monitor"},"values":{"pointsWritten":2,"seriesCreated":2,"valuesWritten":4}},
"ingress:2": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"httpd","rp":"monitor"},"values":{"pointsWritten":2,"seriesCreated":1,"valuesWritten":46}},
"ingress:3": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"ingress","rp":"monitor"},"values":{"pointsWritten":14,"seriesCreated":14,"valuesWritten":42}},
"ingress:4": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"localStore","rp":"monitor"},"values":{"pointsWritten":2,"seriesCreated":1,"valuesWritten":6}},
"ingress:5": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"queryExecutor","rp":"monitor"},"values":{"pointsWritten":2,"seriesCreated":1,"valuesWritten":10}},
"ingress:6": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"runtime","rp":"monitor"},"values":{"pointsWritten":2,"seriesCreated":1,"valuesWritten":30}},
"ingress:7": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"shard","rp":"monitor"},"values":{"pointsWritten":2,"seriesCreated":2,"valuesWritten":22}},
"ingress:8": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"subscriber","rp":"monitor"},"values":{"pointsWritten":2,"seriesCreated":1,"valuesWritten":6}},
"ingress:9": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"tsm1_cache","rp":"monitor"},"values":{"pointsWritten":2,"seriesCreated":2,"valuesWritten":18}},
"ingress:10": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"tsm1_engine","rp":"monitor"},"values":{"pointsWritten":2,"seriesCreated":2,"valuesWritten":58}},
"ingress:11": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"tsm1_filestore","rp":"monitor"},"values":{"pointsWritten":2,"seriesCreated":2,"valuesWritten":4}},
"ingress:12": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"tsm1_wal","rp":"monitor"},"values":{"pointsWritten":2,"seriesCreated":2,"valuesWritten":8}},
"ingress:13": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"write","rp":"monitor"},"values":{"pointsWritten":2,"seriesCreated":1,"valuesWritten":18}},
"ingress:14": {"name":"ingress","tags":{"db":"telegraf","login":"_systemuser_unknown","measurement":"cpu","rp":"autogen"},"values":{"pointsWritten":1342,"seriesCreated":13,"valuesWritten":13420}},
"ingress:15": {"name":"ingress","tags":{"db":"telegraf","login":"_systemuser_unknown","measurement":"disk","rp":"autogen"},"values":{"pointsWritten":642,"seriesCreated":6,"valuesWritten":4494}},
"ingress:16": {"name":"ingress","tags":{"db":"telegraf","login":"_systemuser_unknown","measurement":"diskio","rp":"autogen"},"values":{"pointsWritten":214,"seriesCreated":2,"valuesWritten":2354}},
"ingress:17": {"name":"ingress","tags":{"db":"telegraf","login":"_systemuser_unknown","measurement":"mem","rp":"autogen"},"values":{"pointsWritten":107,"seriesCreated":1,"valuesWritten":963}},
"ingress:18": {"name":"ingress","tags":{"db":"telegraf","login":"_systemuser_unknown","measurement":"processes","rp":"autogen"},"values":{"pointsWritten":107,"seriesCreated":1,"valuesWritten":856}},
"ingress:19": {"name":"ingress","tags":{"db":"telegraf","login":"_systemuser_unknown","measurement":"swap","rp":"autogen"},"values":{"pointsWritten":214,"seriesCreated":1,"valuesWritten":642}},
"ingress:20": {"name":"ingress","tags":{"db":"telegraf","login":"_systemuser_unknown","measurement":"system","rp":"autogen"},"values":{"pointsWritten":321,"seriesCreated":1,"valuesWritten":749}},

Closes: https://github.com/influxdata/influxdb/issues/20613
2021-02-05 14:52:43 -04:00
Sam Arnold 98a76a11a0 feat(tsi): optimize series iteration
When using queries like 'select count(_seriesKey) from bigmeasurement`, we
should iterate over the tsi structures to serve the query instead of loading
all the series into memory up front.

Closes #20543
2021-01-25 14:27:31 -05:00
dengzhi.ldz 331569bc11 perf(tsi1): batch write tombstone entries when dropping/deleting 2020-06-24 09:26:09 -06:00
Ben Johnson 51f647d763 fix(tsdb): Defer closing of underlying SeriesIDSetIterators
This commit changes the SeriesIDSet merge/union/intersect functions
to attach the underlying iterators as closers so that files can be
retained until the data is no longer in use. The roaring operations
can leave containers pointing at mmap data in the resulting bitmap
so we have to track underlying file usage until the data is finished
with.
2020-05-22 10:46:05 -06:00
Ben Wells e9bada090f Fix misspelling identified by misspell 2019-02-03 20:27:43 +00:00
Jeff Wendling 0a2f6191a6 tsdb: clean up fields index for every kind of delete
Before this, if you deleted everything with `delete where true`
for example, then you would be left with all of your measurements
in the fields index. That would cause ghost fields to reappear
if someone reinserted to the measurement.

This fixes that by making it so the deepest most delete code
checks if the measurement was removed from the index, and if so
cleaning it up out of the fields index.

Additionally, it fixes bugs in that cleanup code where if you had
a measurement like "m1" and "m10", when iterating over the cache
or file store, "m1" would match "m10" due to it only checking the
prefix. This also has it check the character right after the
measurement to be either a comma because tags started, or the first
character of the field separator.
2018-11-27 16:12:06 -07:00
Edd Robinson 42827219f3
Merge pull request #10423 from influxdata/er-nil-shard
Fix panic in IndexSet
2018-10-26 19:05:08 +01:00
Jeff Wendling 5c2d36225d fix(tsdb): copy measurement names when expression is provided
We already make copies when no expression is provided, because
the backing slices may go away if the shard they came from is
closed. This fixes the other spot where some backing slices
would be returned.
2018-10-26 11:25:25 -06:00
Edd Robinson cade59e253 Fix panic in IndexSet
This commit fixes a panic where a concurrent removal of a shard and meta
query could cause a `nil` index to be added to the IndexSet`.
2018-10-26 18:23:54 +01:00
Jonathan A. Sternberg af8bf99256
Do not panic when a series id iterator is nil 2018-10-11 15:16:59 -05:00
Edd Robinson 52b5640a4a Add test for TagValueSeriesIDIterator 2018-09-18 15:58:38 -07:00
Edd Robinson 7d00a45ebf Don't allocate when reading tombstone SeriesID set 2018-09-18 15:58:38 -07:00
Edd Robinson dece5b847f Refactor index names 2018-08-21 14:32:30 +01:00
Edd Robinson a67f15fad4 Promote DropSeriesGlobal to Index interface 2018-08-20 17:57:16 +01:00
Edd Robinson 6b3860e9a1 Reduce allocations in TSI TagSets implementation
Since all tag sets are materialised to strings before this method
returns, a large number of allocations can be avoided by carefully
resuing buffers and containers.

This commit reduces allocations by about 75%, which can be very
significant for high cardinality workloads.

The benchmark results shown below are for a benchmark that asks for all
series keys matching `tag5=value0'.

name                                               old time/op    new time/op    delta
Index_ConcurrentWriteQuery/inmem/queries_100000-8     5.66s ± 4%     5.70s ± 5%     ~     (p=0.739 n=10+10)
Index_ConcurrentWriteQuery/tsi1/queries_100000-8      26.5s ± 8%     26.8s ±12%     ~     (p=0.579 n=10+10)
IndexSet_TagSets/1M_series/inmem-8                   11.9ms ±18%    10.4ms ± 2%  -12.81%  (p=0.000 n=10+10)
IndexSet_TagSets/1M_series/tsi1-8                    23.4ms ± 5%    18.9ms ± 1%  -19.07%  (p=0.000 n=10+9)

name                                               old alloc/op   new alloc/op   delta
Index_ConcurrentWriteQuery/inmem/queries_100000-8    2.50GB ± 0%    2.50GB ± 0%     ~     (p=0.315 n=10+10)
Index_ConcurrentWriteQuery/tsi1/queries_100000-8     32.6GB ± 0%    32.6GB ± 0%     ~     (p=0.247 n=10+10)
IndexSet_TagSets/1M_series/inmem-8                   3.56MB ± 0%    3.56MB ± 0%     ~     (all equal)
IndexSet_TagSets/1M_series/tsi1-8                    12.7MB ± 0%     5.2MB ± 0%  -59.02%  (p=0.000 n=10+10)

name                                               old allocs/op  new allocs/op  delta
Index_ConcurrentWriteQuery/inmem/queries_100000-8     24.0M ± 0%     24.0M ± 0%     ~     (p=0.353 n=10+10)
Index_ConcurrentWriteQuery/tsi1/queries_100000-8      96.6M ± 0%     96.7M ± 0%     ~     (p=0.579 n=10+10)
IndexSet_TagSets/1M_series/inmem-8                     51.0 ± 0%      51.0 ± 0%     ~     (all equal)
IndexSet_TagSets/1M_series/tsi1-8                     80.4k ± 0%     20.4k ± 0%  -74.65%  (p=0.000 n=10+10)
2018-08-10 16:01:49 +01:00
Jacob Marble f1fc1b0264
Merge pull request #10175 from influxdata/jgm-copy-byte-slices
tsdb: Copy return value of IndexSet.MeasurementNamesByExpr
2018-08-09 11:20:45 -07:00
Stuart Carnie 990824ceca fix(tsdb): Fix panic, don't add nil iterator to slice
fixes #10171
2018-08-09 10:12:49 -07:00
Jacob Marble 7bd9b2a627 tsdb: Copy return value of IndexSet.MeasurementNamesByExpr 2018-08-08 23:48:06 -07:00
Ben Johnson 979d790154
Implement bitset iterator 2018-07-05 09:01:22 -06:00
Edd Robinson 6059db3d3a Filter series IDs at the last possible moment 2018-07-02 16:48:40 +01:00
Edd Robinson 609b980671 Don't filter at low-level 2018-07-02 16:47:44 +01:00
Ben Johnson 8be85c154a
Allow value filtering on SHOW TAG VALUES
This commit allows users to filter on the `value` field in the
`SHOW TAG VALUES` command:

	SHOW TAG VALUES WITH KEY = "mytag" WHERE "value" = 'myvalue'

Previously this command would return all values.
2018-06-28 09:50:03 -06:00
David Norton aa61f5016e
Merge pull request #9970 from influxdata/dn-show-tag-keys-perf
fix SHOW TAG KEYS perfomance regression
2018-06-15 17:35:06 -04:00
David Norton 57f97a72e6 fix SHOW TAG KEYS perfomance regression 2018-06-15 11:26:43 -04:00
Edd Robinson 28b6df7afb Ensure remote read can handle no data in time 2018-06-12 23:10:18 +01:00
Jacob Marble 735aa2d7dc Add SeriesIDSet() to Index interface 2018-05-18 09:22:43 -07:00
Jacob Marble 7f8b7af61e
Cleanup index memory footprint counting code (#9828)
* Fix IndexSet.DedupeInmemIndexes

* Cleanup index memory footprint code
2018-05-15 11:25:19 -07:00
Jacob Marble 0763d1789e Get inmem index bytes without double-counting 2018-05-10 11:33:52 -07:00
Jacob Marble 2dc2b97fb9
tsdb/index: Add Bytes() methods (#9794) 2018-05-04 08:47:05 -07:00
Jonathan A. Sternberg 6607c29a02
Merge pull request #9649 from influxdata/js-eval-functions-in-where
Allow math functions to be used in the condition
2018-05-02 08:29:08 -05:00
Stuart Carnie 7ebfc9c544 add default to avoid blocking 2018-04-12 15:42:33 -07:00
Jonathan A. Sternberg 1f9227e20c Allow math functions to be used in the condition 2018-04-10 10:55:34 -05:00
Ben Johnson f6fdba2590
Allow SHOW SERIES kill. 2018-03-15 11:22:34 -06:00
Stuart Carnie 6cf6ae7af4 Use combined IndexSet when executing meta queries
* removed unused fieldset field
2018-03-15 09:59:11 -07:00
Edd Robinson ec93b0eb0c Ensure all shards checked for fields within an IndexSet 2018-03-12 15:25:45 +00:00
Jonathan A. Sternberg 87ac8ad385
Merge pull request #9491 from influxdata/js-9290-index-boolean-literals
Evaluate a true boolean literal when calculating tag sets
2018-02-28 09:14:24 -06:00
Jonathan A. Sternberg 6baf354818 Evaluate a true boolean literal when calculating tag sets 2018-02-28 08:08:21 -06:00
Stuart Carnie 48fb2a4cc5
Merge pull request #9487 from influxdata/sgc-tagsets
fallback to inmem TagSets implementation
2018-02-27 09:06:54 -07:00
Stuart Carnie b72e0c5941 fallback to inmem TagSets implementation 2018-02-27 07:49:51 -07:00
Edd Robinson 96c0ecf618 Improve startup time of `inmem` index
This commit improves the startup time when using the `inmem` index by
ensuring that the series are created in the index and series file in
batches of 10000, rather than individually.

Fixes #9486.
2018-02-27 13:33:00 +00:00
Edd Robinson 544329380f
Add empty series sketches back to tsi1 index
This commit adds initial empty sketches back to the tsi1 index, as well
as ensuring that ephemeral sketches in the index `LogFile` are updated
accordingly.

The commit also adds a test that verifies that the merged sketches at
the store level produce the correct results under writes, deletions and
re-opening of the store.

This commit does not provide working sketches for post-compaction on the
tsi1 index.
2018-02-07 14:52:13 -07:00