Commit Graph

319 Commits (522c32754c95da79c3811b397ded4ff4f75c33e0)

Author SHA1 Message Date
Geoffrey Wossum 160cf678d5
fix: MeasurementsCardinality should not be less than 0 (#23286)
Clamp the value of Store.MeasurementsCardinality so that it cannot be less
than 0. The bug primarily shows up as a negative `numMeasurements` value in
/debug/vars under some circumstances.

refs #23285
2022-04-21 13:32:12 -05:00
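
A minimal sketch of the clamp described above, with hypothetical names (the real Store.MeasurementsCardinality derives its estimate from cardinality sketches across shards):

```go
package tsdb

// measurementsCardinality is a hypothetical helper illustrating the
// clamp: sketch-based estimates can transiently go negative when the
// tombstone count exceeds the live count, so the result is floored at 0.
func measurementsCardinality(estimated, tombstoned int64) int64 {
	if n := estimated - tombstoned; n > 0 {
		return n
	}
	return 0
}
```
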
Dane Strandboge 0574163566
build: upgrade to go1.18 (#23250) 2022-03-31 16:17:57 -05:00
davidby-influx eb3bc7069f
feat: configurable DELETE concurrency (#23055)
Currently, deletions of series or measurements are
serialized. This new feature adds
max-concurrent-deletes to the [data] section of the
configuration file. Legal values are any positive
number; the default of 1 preserves the current behavior.

closes https://github.com/influxdata/influxdb/issues/23054
2022-01-13 11:04:57 -08:00
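
A sketch of how a max-concurrent-deletes value could bound delete concurrency with a counting semaphore; the names here are hypothetical, not the shipped implementation:

```go
package tsdb

// deleteSem is a counting semaphore sized from a hypothetical
// max-concurrent-deletes setting; a capacity of 1 (the default)
// serializes deletes, matching the previous behavior.
type deleteSem chan struct{}

func newDeleteSem(maxConcurrentDeletes int) deleteSem {
	return make(deleteSem, maxConcurrentDeletes)
}

// do runs deleteFn once a slot is free, blocking while the configured
// number of deletes are already in flight.
func (s deleteSem) do(deleteFn func() error) error {
	s <- struct{}{}
	defer func() { <-s }()
	return deleteFn()
}
```
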
davidby-influx af9e89a4d4
fix: detect misquoted tag values and return an error (#22754)
SHOW TAG KEYS FROM "foo" where bar="misquoted" is
erroneous, because the tag value must be enclosed
in single, not double, quotes. Although this
correctly returns no tag keys, it is very
inefficient and has caused out-of-memory failures
at a customer site. This fix short-circuits the query.

closes https://github.com/influxdata/influxdb/issues/22755
2021-10-27 11:26:20 -07:00
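
The short-circuit rests on an InfluxQL parsing detail: a double-quoted value parses as an identifier (influxql.VarRef), not a string literal, so a tag compared against one can never match. A hedged sketch of such a detection using the influxql AST (the actual fix is more thorough):

```go
package main

import "github.com/influxdata/influxql"

// comparesTagToIdentifier reports whether an equality expression compares
// two identifiers. In InfluxQL, bar="misquoted" parses both sides as
// *influxql.VarRef (double quotes denote identifiers, single quotes denote
// strings), so the predicate can never match a tag value and the query can
// return empty results immediately.
func comparesTagToIdentifier(expr influxql.Expr) bool {
	be, ok := expr.(*influxql.BinaryExpr)
	if !ok || be.Op != influxql.EQ {
		return false
	}
	_, lhsRef := be.LHS.(*influxql.VarRef)
	_, rhsRef := be.RHS.(*influxql.VarRef)
	return lhsRef && rhsRef
}
```
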
Sam Arnold 611a4370a2
feat: show measurements database and retention policy wildcards (#22388)
* feat: show measurements database and retention policy wildcards

Closes #3318

* chore: run formatter
2021-10-05 09:07:25 -04:00
davidby-influx c8da9bafbf
chore(ae): add more logging (#21381) (#21452)
tsdb.Engine.IsIdle and tsdb.Engine.Digest now return a reason string for why the engine & shard are not idle.
Callers can then use this string for logging, if desired. The returned reason does not allocate memory, so the
caller may want to add the shard ID and path for more information in the log. This is intended to be used in
calls from the anti-entropy service in Enterprise.

(cherry picked from commit bf45841359)

fixes https://github.com/influxdata/influxdb/issues/21448
2021-05-11 09:46:45 -07:00
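
A sketch of the two-value return shape described above; the fields and conditions are hypothetical, and the reason strings are constants so returning them does not allocate:

```go
package tsdb

// isIdle sketches the verdict-plus-reason shape. Because the reason is
// a constant string, returning it allocates nothing; callers append the
// shard ID and path themselves when logging.
func isIdle(cacheSize int64, compactionsRunning bool) (bool, string) {
	if cacheSize > 0 {
		return false, "cache is not empty"
	}
	if compactionsRunning {
		return false, "compactions are running"
	}
	return true, ""
}
```
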
Sam Arnold b7e7de24d6
refactor: separate coarse and fine permission interfaces (#20996) 2021-03-22 09:52:33 -04:00
Sam Arnold 17b9ea8723
feat: Add WITH KEY to show tag keys (#20793)
* fix: Change from RewriteExpr to PartitionExpr

Also remove some dead code

* feat: WITH KEY implementation

* feat: query rewriting for WITH KEY in SHOW TAG KEYS
2021-02-25 08:38:29 -05:00
davidby-influx 092c7a9976 feat: Make meta queries respect QueryTimeout values
Meta queries (SHOW TAG VALUES, SHOW TAG KEYS, SHOW SERIES CARDINALITY, etc.) do not respect
the QueryTimeout config parameter. Meta queries should check the query context when possible
to allow cancellation and timeout. The checks will not be as frequent as in regular queries, which
use iterators, because meta queries return data in batches.

Add a context.Context to
(*Store).MeasurementNames()
(*Store).MeasurementsCardinality()
(*Store).SeriesCardinality()
(*Store).TagValues()
(*Store).TagKeys()
(*Store).SeriesSketches()
(*Store).MeasurementsSketches()
which is checked for timeout or cancellation
to limit the time spent in meta queries (a cancellation-check sketch follows this entry)

https://github.com/influxdata/influxdb/issues/20736
2021-02-23 12:52:44 -08:00
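
A hedged sketch of the per-batch check such a meta query can run; the helper name is hypothetical:

```go
package tsdb

import "context"

// checkContext is the kind of guard a batch-oriented meta query can run
// between batches: it returns promptly once the deadline passes or the
// query is cancelled, and otherwise lets the query continue.
func checkContext(ctx context.Context) error {
	select {
	case <-ctx.Done():
		return ctx.Err()
	default:
		return nil
	}
}
```
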
Sam Arnold 21823db00b
feat: series creation ingress metrics (#20700)
After turning this on and testing locally, note the 'seriesCreated' metric:

"localStore": {"name":"localStore","tags":null,"values":{"pointsWritten":2987,"seriesCreated":58,"valuesWritten":23754}},
"ingress": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"cq","rp":"monitor"},"values":{"pointsWritten":2,"seriesCreated":1,"valuesWritten":4}},
"ingress:1": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"database","rp":"monitor"},"values":{"pointsWritten":2,"seriesCreated":2,"valuesWritten":4}},
"ingress:2": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"httpd","rp":"monitor"},"values":{"pointsWritten":2,"seriesCreated":1,"valuesWritten":46}},
"ingress:3": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"ingress","rp":"monitor"},"values":{"pointsWritten":14,"seriesCreated":14,"valuesWritten":42}},
"ingress:4": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"localStore","rp":"monitor"},"values":{"pointsWritten":2,"seriesCreated":1,"valuesWritten":6}},
"ingress:5": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"queryExecutor","rp":"monitor"},"values":{"pointsWritten":2,"seriesCreated":1,"valuesWritten":10}},
"ingress:6": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"runtime","rp":"monitor"},"values":{"pointsWritten":2,"seriesCreated":1,"valuesWritten":30}},
"ingress:7": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"shard","rp":"monitor"},"values":{"pointsWritten":2,"seriesCreated":2,"valuesWritten":22}},
"ingress:8": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"subscriber","rp":"monitor"},"values":{"pointsWritten":2,"seriesCreated":1,"valuesWritten":6}},
"ingress:9": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"tsm1_cache","rp":"monitor"},"values":{"pointsWritten":2,"seriesCreated":2,"valuesWritten":18}},
"ingress:10": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"tsm1_engine","rp":"monitor"},"values":{"pointsWritten":2,"seriesCreated":2,"valuesWritten":58}},
"ingress:11": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"tsm1_filestore","rp":"monitor"},"values":{"pointsWritten":2,"seriesCreated":2,"valuesWritten":4}},
"ingress:12": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"tsm1_wal","rp":"monitor"},"values":{"pointsWritten":2,"seriesCreated":2,"valuesWritten":8}},
"ingress:13": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"write","rp":"monitor"},"values":{"pointsWritten":2,"seriesCreated":1,"valuesWritten":18}},
"ingress:14": {"name":"ingress","tags":{"db":"telegraf","login":"_systemuser_unknown","measurement":"cpu","rp":"autogen"},"values":{"pointsWritten":1342,"seriesCreated":13,"valuesWritten":13420}},
"ingress:15": {"name":"ingress","tags":{"db":"telegraf","login":"_systemuser_unknown","measurement":"disk","rp":"autogen"},"values":{"pointsWritten":642,"seriesCreated":6,"valuesWritten":4494}},
"ingress:16": {"name":"ingress","tags":{"db":"telegraf","login":"_systemuser_unknown","measurement":"diskio","rp":"autogen"},"values":{"pointsWritten":214,"seriesCreated":2,"valuesWritten":2354}},
"ingress:17": {"name":"ingress","tags":{"db":"telegraf","login":"_systemuser_unknown","measurement":"mem","rp":"autogen"},"values":{"pointsWritten":107,"seriesCreated":1,"valuesWritten":963}},
"ingress:18": {"name":"ingress","tags":{"db":"telegraf","login":"_systemuser_unknown","measurement":"processes","rp":"autogen"},"values":{"pointsWritten":107,"seriesCreated":1,"valuesWritten":856}},
"ingress:19": {"name":"ingress","tags":{"db":"telegraf","login":"_systemuser_unknown","measurement":"swap","rp":"autogen"},"values":{"pointsWritten":214,"seriesCreated":1,"valuesWritten":642}},
"ingress:20": {"name":"ingress","tags":{"db":"telegraf","login":"_systemuser_unknown","measurement":"system","rp":"autogen"},"values":{"pointsWritten":321,"seriesCreated":1,"valuesWritten":749}},

Closes: https://github.com/influxdata/influxdb/issues/20613
2021-02-05 14:52:43 -04:00
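
A sketch of how counters keyed by (db, rp, measurement, login) might be kept, producing entries like the sample output above; all types and names here are hypothetical:

```go
package tsdb

import "sync"

// ingressKey is a hypothetical composite key matching the tag sets in
// the sample output above.
type ingressKey struct {
	db, rp, measurement, login string
}

type ingressCounters struct {
	pointsWritten, valuesWritten, seriesCreated int64
}

type ingressStats struct {
	mu sync.Mutex
	m  map[ingressKey]*ingressCounters
}

func newIngressStats() *ingressStats {
	return &ingressStats{m: make(map[ingressKey]*ingressCounters)}
}

// add accumulates one write's contribution under its (db, rp,
// measurement, login) key, creating the counter set on first use.
func (s *ingressStats) add(k ingressKey, points, values, series int64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	c := s.m[k]
	if c == nil {
		c = &ingressCounters{}
		s.m[k] = c
	}
	c.pointsWritten += points
	c.valuesWritten += values
	c.seriesCreated += series
}
```
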
Sam Arnold dd3baf6d4a
feat: measurement metrics by login (#20687)
After turning on authentication and both forms of ingress metrics:

"ingress": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"cq","rp":"monitor"},"values":{"pointsWritten":38,"valuesWritten":76}},
"ingress:1": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"database","rp":"monitor"},"values":{"pointsWritten":76,"valuesWritten":152}},
"ingress:2": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"httpd","rp":"monitor"},"values":{"pointsWritten":38,"valuesWritten":874}},
"ingress:3": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"ingress","rp":"monitor"},"values":{"pointsWritten":534,"valuesWritten":1068}},
"ingress:4": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"localStore","rp":"monitor"},"values":{"pointsWritten":38,"valuesWritten":76}},
"ingress:5": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"queryExecutor","rp":"monitor"},"values":{"pointsWritten":38,"valuesWritten":190}},
"ingress:6": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"runtime","rp":"monitor"},"values":{"pointsWritten":38,"valuesWritten":570}},
"ingress:7": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"shard","rp":"monitor"},"values":{"pointsWritten":76,"valuesWritten":836}},
"ingress:8": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"subscriber","rp":"monitor"},"values":{"pointsWritten":38,"valuesWritten":114}},
"ingress:9": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"tsm1_cache","rp":"monitor"},"values":{"pointsWritten":76,"valuesWritten":684}},
"ingress:10": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"tsm1_engine","rp":"monitor"},"values":{"pointsWritten":76,"valuesWritten":2204}},
"ingress:11": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"tsm1_filestore","rp":"monitor"},"values":{"pointsWritten":76,"valuesWritten":152}},
"ingress:12": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"tsm1_wal","rp":"monitor"},"values":{"pointsWritten":76,"valuesWritten":304}},
"ingress:13": {"name":"ingress","tags":{"db":"_internal","login":"_systemuser_monitor","measurement":"write","rp":"monitor"},"values":{"pointsWritten":38,"valuesWritten":342}},
"ingress:14": {"name":"ingress","tags":{"db":"telegraf","login":"admin","measurement":"cpu","rp":"autogen"},"values":{"pointsWritten":1,"valuesWritten":1}},
"ingress:15": {"name":"ingress","tags":{"db":"telegraf","login":"telegraf","measurement":"cpu","rp":"autogen"},"values":{"pointsWritten":1316,"valuesWritten":13160}},
"ingress:16": {"name":"ingress","tags":{"db":"telegraf","login":"telegraf","measurement":"disk","rp":"autogen"},"values":{"pointsWritten":642,"valuesWritten":4494}},
"ingress:17": {"name":"ingress","tags":{"db":"telegraf","login":"telegraf","measurement":"diskio","rp":"autogen"},"values":{"pointsWritten":214,"valuesWritten":2354}},
"ingress:18": {"name":"ingress","tags":{"db":"telegraf","login":"telegraf","measurement":"mem","rp":"autogen"},"values":{"pointsWritten":107,"valuesWritten":963}},
"ingress:19": {"name":"ingress","tags":{"db":"telegraf","login":"telegraf","measurement":"processes","rp":"autogen"},"values":{"pointsWritten":107,"valuesWritten":856}},
"ingress:20": {"name":"ingress","tags":{"db":"telegraf","login":"telegraf","measurement":"swap","rp":"autogen"},"values":{"pointsWritten":214,"valuesWritten":642}},
"ingress:21": {"name":"ingress","tags":{"db":"telegraf","login":"telegraf","measurement":"system","rp":"autogen"},"values":{"pointsWritten":321,"valuesWritten":749}},

Only by login:

"ingress": {"name":"ingress","tags":{"login":"_systemuser_monitor"},"values":{"pointsWritten":42,"valuesWritten":354}},
"ingress:1": {"name":"ingress","tags":{"login":"admin"},"values":{"pointsWritten":1,"valuesWritten":1}},
"ingress:2": {"name":"ingress","tags":{"login":"telegraf"},"values":{"pointsWritten":3547,"valuesWritten":28246}},

Notice writes by users 'telegraf', '_systemuser_monitor', and 'admin'.
2021-02-04 11:52:53 -05:00
Sam Arnold b3e763d96f
fix: consistent error for missing shard (#20694) 2021-02-04 09:49:14 -05:00
Sam Arnold eb92c997cd feat: Ingress metrics by measurement
Partial implementation of https://github.com/influxdata/influxdb/issues/20612

Implements a per-measurement points-written metric. Next step: also support per-login.
2021-02-02 15:58:28 -05:00
Sam Arnold 117341fb0f fix: Move value metric down to tsdb store
Previously we tracked values on the http ingress, but the tsdb store is the correct
place to track total values written for the instance.
2021-02-02 10:58:47 -05:00
Sam Arnold 6795ec6c01 refactor: do not use context value anti-pattern
Extending the context instead of fixing the API breaks type safety.
For tracking the number of points / values written, it is much clearer
to pass an explicit tracker.
2021-02-01 14:34:11 -05:00
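
A sketch contrasting the two approaches with hypothetical signatures: the explicit tracker is checked at compile time, while the context value is an untyped lookup that can silently be absent:

```go
package tsdb

import "context"

// WriteTracker accumulates write statistics through an explicit,
// type-checked parameter (hypothetical shape).
type WriteTracker struct {
	Points, Values int64
}

// Explicit tracker: the compiler guarantees one was passed.
func writePoints(ctx context.Context, t *WriteTracker, values int) error {
	t.Points++
	t.Values += int64(values)
	return nil
}

// The context-value anti-pattern: the lookup is untyped, the assertion
// can fail at runtime, and a caller who forgets to attach a tracker
// silently loses the statistics.
func writePointsViaContext(ctx context.Context, values int) error {
	if t, ok := ctx.Value("tracker").(*WriteTracker); ok {
		t.Points++
		t.Values += int64(values)
	}
	return nil
}
```
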
Sam Arnold d1a1e4b667 chore: restore ImportShard
This reverts commit d14acea44d.
2020-12-07 11:01:00 -04:00
davidby-influx 6ec446f422 fix(tsm1): "snapshot in progress" error during backup
This fix adds a skipCacheOk flag to
tsdb.Store.CreateShardSnapshot() and tsdb.Shard.CreateSnapshot(),
which is passed through to tsdb.Engine.CreateSnapshot().
A value of true allows the backup to proceed even if a cache snapshot
cannot be taken.
The flag is set to true in tsm1.Engine.Backup() (the OSS backup code path)
and to false in tsm1.Engine.Export().

https://github.com/influxdata/plutonium/issues/3227
2020-11-05 11:08:08 -08:00
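
A sketch of the flag's effect with simplified, hypothetical signatures:

```go
package tsm1

import "errors"

var errSnapshotInProgress = errors.New("snapshot in progress")

// createSnapshot sketches the flag's effect: backups pass
// skipCacheOk=true and tolerate a busy cache; exports pass false and
// fail instead.
func createSnapshot(skipCacheOk bool, snapshotCache func() error) error {
	if err := snapshotCache(); err != nil {
		if skipCacheOk && errors.Is(err, errSnapshotInProgress) {
			return nil // back up existing TSM files without a cache snapshot
		}
		return err
	}
	return nil
}
```
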
Ayan George 6297ede3d9
fix(tsdb): return error on nonexistent shard id (#17060)
Have Store.DeleteShard() return a useful error if it cannot find the
requested shard.

Fixes #17059
2020-08-24 14:34:44 +00:00
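
A minimal sketch of the fix's shape; the map and error text are hypothetical:

```go
package tsdb

import "fmt"

// deleteShard returns a descriptive error instead of silently doing
// nothing when the shard ID is unknown.
func deleteShard(shards map[uint64]struct{}, id uint64) error {
	if _, ok := shards[id]; !ok {
		return fmt.Errorf("shard %d does not exist", id)
	}
	delete(shards, id)
	return nil
}
```
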
Ayan George 6ce0e11738
feat: Collect values written stats (#19187)
* feat(engine/tsm1): Add WritePointsWithContext()

Add WritePointsWithContext() and make WritePoints() a thin wrapper for
it.

The purpose is to add statistics context values that we'll use to
propagate the number of fields and points written to calls up the call
chain.

* feat(tsdb): Add WriteToShardWithContext()

When applied, this patch adds WriteToShardWithContext() and wraps it
with WriteToShard() to preserve the API.

The purpose of this addition is to propagate a context.Context value
to Shard.WritePointsWithContext().

* feat(tsdb/shard): Add WritePointsWithContext()

The purpose of adding WritePointsWithContext() is to propagate context
values down to engine code and propagate statistics via context.Value
up to callers.

This patch also adds values written statistics to the shard.

* feat(http): Gather values written stats

WritePointsWithContext() was added to propagate context values down to
the engine and communicate stats to the caller.

* refactor: Change MetricKey to ContextKey

This patch gives the type we're using for context keys a better name.
2020-08-12 11:26:12 -04:00
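
A hedged sketch of the context-key mechanism these sub-commits describe, with hypothetical key and helper names: the caller seeds the context with a counter pointer, engine code writes through it, and the caller reads the total afterwards:

```go
package tsdb

import (
	"context"
	"sync/atomic"
)

// ContextKey is a dedicated key type (the rename from MetricKey in the
// last sub-commit); a typed key avoids collisions with values stored by
// other packages.
type ContextKey int

const StatPointsWritten ContextKey = iota

// recordPointsWritten is what engine-level code can do: if the caller
// attached a counter to the context, add to it; otherwise do nothing.
func recordPointsWritten(ctx context.Context, n int64) {
	if p, ok := ctx.Value(StatPointsWritten).(*int64); ok {
		atomic.AddInt64(p, n)
	}
}

// Caller side: seed the context, perform the write, read the total back.
func writeWithStats(write func(context.Context) error) (int64, error) {
	var points int64
	ctx := context.WithValue(context.Background(), StatPointsWritten, &points)
	err := write(ctx)
	return atomic.LoadInt64(&points), err
}
```
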
dengzhi.ldz 42dba6487a fix(tsi1): wait for the delete epoch before dropping a shard 2020-06-24 09:37:13 -06:00
Tristan Su d14acea44d chore: clean up unused functions 2020-05-08 13:45:34 +08:00
Jacob Marble b731cc5e60
feat(storage): Limit concurrent series partition compaction (#14240)
* feat(storage): Limit concurrent series partition snapshots

* feat: make concurrency configurable

* fix: integrate review feedback

* refactor: rename config value
2019-07-30 10:34:06 -07:00
Jeff Wendling 1d9ce868e2 Fix some more shard epoch races
We're not allowed to access the s.epochs map without holding the
mutex against shard creation and deletion, so create a copy of
all of the epoch trackers we will need while we hold the mutex.
2019-02-19 08:59:13 -07:00
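
A sketch of the copy-under-mutex pattern, with hypothetical field names:

```go
package tsdb

import "sync"

type epochTracker struct{} // per-shard write/delete coordination

type store struct {
	mu     sync.RWMutex
	epochs map[uint64]*epochTracker
}

// epochsForShards copies the trackers we need while holding the mutex,
// so that using the copies afterwards cannot race with concurrent shard
// creation or deletion mutating s.epochs.
func (s *store) epochsForShards(ids []uint64) map[uint64]*epochTracker {
	s.mu.RLock()
	defer s.mu.RUnlock()
	out := make(map[uint64]*epochTracker, len(ids))
	for _, id := range ids {
		if e, ok := s.epochs[id]; ok {
			out[id] = e
		}
	}
	return out
}
```
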
Ben Johnson b87605f521
Fix shard epoch race. 2019-02-11 12:15:46 -07:00
Jeff Wendling 9f0cd683b9
Merge pull request #10516 from influxdata/jmw-conflict-concurrency
tsdb: conflict based concurrency resolution
2018-11-29 14:14:24 -07:00
Ben Johnson 298eddb82c
Skip and warn about series files in retention policy directory. 2018-11-28 11:20:18 -07:00
Jeff Wendling 4cad51a604 tsdb: conflict based concurrency resolution
There are some problematic races that occur when deletes happen
against writes to the same points at the same time. This change
introduces guards and an epoch based system to coordinate these
modifications.

A guard matches a point based on the time, measurement name, and
some conditions loaded from an influxql expression. The intent
is to be as precise as possible without allowing any false
negatives: if a point would be deleted, the guard must match it.
We are allowed to match more points than necessary, at the cost
of slowing down writes.

The epoch based system keeps track of outstanding writes and
deletes and their associated guards. When a delete operation
is going to start, it waits until all current writes are
done, and installs its guard, blocking all future writes that
contain points that may conflict with the delete. This allows
writes to disjoint points to proceed uncontended, and the
implementation is optimized for assuming there are few
outstanding deletes. For example, in the case that there are no
deletes, a write just has to take a mutex, bump a counter, and
compare a value against zero. The epoch trackers are per shard,
so that different shards never have to contend with one another.
2018-11-21 19:19:53 -07:00
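
A much-simplified sketch of the epoch/guard coordination described above, using a condition variable; all names are hypothetical and the real implementation is considerably more refined:

```go
package tsdb

import "sync"

// guard is a stand-in: a real guard matches on time range, measurement
// name, and an influxql condition, erring toward matching too much.
type guard struct{}

func (g *guard) matches(point string) bool { return true } // placeholder

// epoch coordinates writes and deletes for one shard.
type epoch struct {
	mu     sync.Mutex
	cond   *sync.Cond
	writes int      // in-flight writes
	guards []*guard // guards installed by active deletes
}

func newEpoch() *epoch {
	e := &epoch{}
	e.cond = sync.NewCond(&e.mu)
	return e
}

// startWrite is the fast path: take the mutex, bump a counter, and only
// consult guards when deletes are outstanding.
func (e *epoch) startWrite(point string) {
	e.mu.Lock()
	for len(e.guards) > 0 && e.conflicts(point) {
		e.cond.Wait() // block until conflicting deletes complete
	}
	e.writes++
	e.mu.Unlock()
}

func (e *epoch) conflicts(point string) bool {
	for _, g := range e.guards {
		if g.matches(point) {
			return true
		}
	}
	return false
}

func (e *epoch) endWrite() {
	e.mu.Lock()
	e.writes--
	e.cond.Broadcast()
	e.mu.Unlock()
}

// startDelete waits for in-flight writes to drain, then installs its
// guard so future conflicting writes block until endDelete runs.
func (e *epoch) startDelete(g *guard) {
	e.mu.Lock()
	for e.writes > 0 {
		e.cond.Wait()
	}
	e.guards = append(e.guards, g)
	e.mu.Unlock()
}

func (e *epoch) endDelete(g *guard) {
	e.mu.Lock()
	for i := range e.guards {
		if e.guards[i] == g {
			e.guards = append(e.guards[:i], e.guards[i+1:]...)
			break
		}
	}
	e.cond.Broadcast()
	e.mu.Unlock()
}
```
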
Jeff Wendling 030adf4bd5 tsdb: don't allow deletes to a database in mixed index mode
TSI1 and inmem indexes have different properties during deletes.
Specifically, inmem shares a global index across all shards, whereas
every tsi1 index is scoped to a specific shard. When deleting
a series, it may cause the last reference to the series across all
shards to be dropped, necessitating a removal from the series file.
Since the inmem index shares the index across all shards, removing
the series when it's removed from the series file is sufficient.

However, in the case of a mixed index database, if the last shard
is a TSI1 shard, the other inmem indexes are not available when we
discover that it was the last reference to the series. This ends
up leaving the series in the inmem index without a series id in
the series file, causing all sorts of misbehavior.

Rather than continue curling ourselves into a ball to try to fix
this unsupported mode, give a helpful error message to the user
that they must run their database in a non-mixed index mode to
allow deletes.
2018-11-21 18:18:38 -07:00
Edd Robinson cade59e253 Fix panic in IndexSet
This commit fixes a panic where a concurrent removal of a shard and meta
query could cause a `nil` index to be added to the IndexSet`.
2018-10-26 18:23:54 +01:00
Stuart Carnie 9520b8d956 fix(tsdb): Fix race calling filterShards outside a lock
Move filterShards inside the lock, as it enumerates the shards map,
which can result in a data race when the map is written concurrently.
2018-10-17 14:14:53 -07:00
Edd Robinson f52de2d1e7 Ensure orphaned series removed from inmem index
This commit ensures that any orphaned series (series that are to be
removed and no longer are referenced anywhere in the database) are
removed from the `inmem` index when a shard is dropped.
2018-08-21 15:00:35 +01:00
Edd Robinson dece5b847f Refactor index names 2018-08-21 14:32:30 +01:00
Jacob Marble 786d637780 tsdb: Cleanup compaction throughput code 2018-08-07 11:12:41 -07:00
Zach Goldstein 0ef3752a1a Add configuration parameter to expose rate limit for TSM compaction.
Closes: #9938
2018-08-07 10:05:36 -04:00
Edd Robinson 9eece563b1 Simplify loops 2018-08-05 15:16:33 +01:00
Jeff Wendling 63fbf53699
Merge pull request #10063 from influxdata/jmw-extra-log-context
Make store include context in logs
2018-07-18 11:53:22 -06:00
Edd Robinson 95db829631 Remove default max concurrent compaction limit
PR #9204 introduced a maximum default concurrent compaction limit of 4.
The idea was to reduce IO utilisation on large systems with many cores,
and high write load. Often on these systems, disks were not scaled
appropriately to the write volume, and while the write path could
keep up, compactions would saturate disks.

In #9225 work was done to reduce IO saturation by limiting the
compaction throughput. To some extent, both #9204 and #9225 work towards
solving the same problem.

We have recently begun to notice larger clusters suffering from
situations where compactions are not keeping up, because the clusters
have been scaled up while the limit of 4 has stayed in place. Users can
manually override the setting, but it seems more user-friendly to remove
the limit by default and set it manually in cases where compactions
cause too much IO on large boxes.
2018-07-18 17:27:49 +01:00
Edd Robinson 55ffeb563a Tidy up logging of compaction settings 2018-07-18 17:26:34 +01:00
Jeff Wendling 7bdbe26534 Make store include context in logs
If an error or message is in the context of some shard or database,
be sure to include that context in the message.
2018-07-18 10:22:53 -06:00
David Norton 6016a80997 allow tag keys to contain underscores 2018-07-17 09:39:08 -04:00
Stuart Carnie 88cd9f3fcf pr(influx-tools): Improvements per PR review 2018-06-13 10:29:59 -07:00
Stuart Carnie 7e998779e6 feat(tsdb/store): Option to disable compactions for offline tools
Allows an offline tool to open the tsdb.Store with compactions disabled.
2018-06-13 10:29:59 -07:00
Stuart Carnie 7abf3ec048 fix(tsdb/store): Fix hang when closing Store if monitor is disabled 2018-06-13 10:29:59 -07:00
Ben Johnson d3e3b05a49
Add tsm1 open limiter
This commit restricts the number of TSM1 files that can be opened
concurrently across the entire `tsdb.Store`. There is currently
a limit on the number of shards that can be opened concurrently;
however, that limit does not help when the number of CPU cores
is higher than the number of shards. Because TSM1 files have a 2GB
limit and there is no limit on the number of files per shard,
extremely large shards (1TB+) can load 1,000s of files simultaneously.
2018-05-29 10:21:53 -06:00
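
A sketch of a store-wide open limiter of the kind described, with hypothetical names; sharing one limiter across all shards bounds file opens regardless of shard count:

```go
package tsdb

// openLimiter bounds how many TSM files may be opened concurrently
// across the entire store; one shared limiter means a handful of huge
// shards cannot open thousands of files at once.
type openLimiter chan struct{}

func newOpenLimiter(n int) openLimiter { return make(openLimiter, n) }

func (l openLimiter) Take()    { l <- struct{}{} }
func (l openLimiter) Release() { <-l }

// openTSMFile holds a slot for the duration of a single file open.
func openTSMFile(l openLimiter, open func() error) error {
	l.Take()
	defer l.Release()
	return open()
}
```
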
Jacob Marble 735aa2d7dc Add SeriesIDSet() to Index interface 2018-05-18 09:22:43 -07:00
Jacob Marble 7f8b7af61e
Cleanup index memory footprint counting code (#9828)
* Fix IndexSet.DedupeInmemIndexes

* Cleanup index memory footprint code
2018-05-15 11:25:19 -07:00
Jacob Marble 0763d1789e Get inmem index bytes without double-counting 2018-05-10 11:33:52 -07:00
Jacob Marble 7de2dcd3d9 TSM: TSMReader.Close blocks until reads complete 2018-04-30 13:46:03 -07:00
Edd Robinson 0b4a403679 Provide warning when mixed index used on db 2018-04-25 13:57:08 +01:00
Edd Robinson 32e195860b Log index type when opening shard 2018-04-25 13:02:09 +01:00