influxdb

Commit Graph

Author	SHA1	Message	Date
Carol (Nichols || Goulding)	952a3ea498	fix: Return querier sharding to use sequencer ID	2022-08-29 14:06:44 -04:00
Carol (Nichols || Goulding)	698f1a47ff	refactor: Rename test structures from sequencer to shard where appropriate	2022-08-29 14:06:44 -04:00
Jake Goulding	4abf21c724	refactor: Rename Sequencer (and its entourage) to Shard	2022-08-29 14:06:43 -04:00
Sam Arnold	05657ea068	fix: optimizations for metadata fetch and chunk pruning (#5467 ) * fix: hoist repeated computation out of chunk creation We have hundreds of chunks per table, so it is beneficial to only do common work once. * chore: remove TableCache as it is no longer used * fix: prune chunks both before and after metadata fetch Fetching the metadata for all the chunks in a table is expensive, especially when we have a narrow time range query that only needs a few chunks. * chore: fix clippy * fix: fix up some last tests * fix: review comments Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-08-29 14:59:05 +00:00
Marco Neumann	3a4a17a48e	feat: refresh namespace cache before expiration (#5449 ) Closes #5318.	2022-08-29 11:52:18 +00:00
Dom Dwyer	abf26767c1	refactor: infallible JumpHash initialisation This doesn't really need to be fallible but forces propagation of a ton of error handling - no shards is always a sign of something being very wrong, and can be caught in the caller if it's for some reason an acceptable state / can be recovered from.	2022-08-24 13:18:57 +02:00
Marco Neumann	f34f99c5ed	refactor: port LRU cache backend to policy framework (#5406 ) * refactor: port LRU cache backend to policy framework Closes #5320. * test: extend `test_oversized_entries` Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-08-17 14:43:24 +00:00
Andrew Lamb	7f0ae53d6f	chore: Update to (almost) released object_store 0.4.0 (#5419 ) * chore: update object_store * chore: update hakari config * chore: Run cargo hakari tasks Co-authored-by: CircleCI[bot] <circleci@influxdata.com>	2022-08-17 13:44:48 +00:00
Marco Neumann	49ab568ca8	refactor: convert `remove_if` feature to policy framework (#5398 ) * refactor: allow `ChangeRequest` to carry a lifetime Let's not restrict our change functions to `'static` because this would require us to clone loads of data to achieve predicate-based `remove_if`. * refactor: convert `remove_if` feature to policy framework Decided to drop the "shared" functionality. We only use the small `remove_if` bit which is way easier to reason about. For #5320. * refactor: address review comments Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-08-16 08:23:27 +00:00
Marco Neumann	0ccefa0d0c	refactor: port TTL backend to policy framework (#5396 ) * refactor: port TTL backend to policy framework Note that this is "just" a port, it does NOT change how TTL works. This will be done in #5318. Helps with #5320. * fix: ensure inner backend is empty * test: add some smoke test	2022-08-15 16:48:16 +00:00
Carol (Nichols || Goulding)	b982bdaf2f	fix: Derive Eq when we derive PartialEq and members can derive Eq Allow this in generated code that we don't control, though. Recommended by clippy now. https://rust-lang.github.io/rust-clippy/master/index.html#derive_partial_eq_without_eq	2022-08-11 15:04:06 -04:00
Andrew Lamb	b834bc630c	chore: more readability improvements to sort keys (#5366 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-08-10 17:59:25 +00:00
Andrew Lamb	16ddc5efc6	chore: Update datafusion / arrow/parquet/arrow-flight and prost/tonic ecosystem (#5360 ) * chore: Update datafusion and arrow * chore: Update Cargo.lock * chore: update to Decimal128 * chore: Update tonic/prost/pbjson/etc * chore: Run cargo hakari tasks * fix: doctest in generated types Co-authored-by: CircleCI[bot] <circleci@influxdata.com>	2022-08-09 17:30:44 +00:00
Andrew Lamb	172f893368	fix: fix logging typo in querier (#5345 ) * fix: fix logging typo * fix: fix type in typo fix ;(	2022-08-09 06:34:06 +00:00
Marco Neumann	cd0dc42b4a	refactor: use a single chunk filter/pruning step in querier (#5338 ) We already prune all chunks in the query-access layer. There's no need to do that another time (which is actually the first time) in `QuerierTable::chunks`. The time savings we get from feeding less chunks into the state reconciling should be negligible. On the pro-side however we get a more streamlined data flow and actually correct chunk pruning metrics. Also see #5336. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-08-08 12:55:14 +00:00
Marco Neumann	fc1870ff76	fix: chunk pruning stats (#5319 ) - emit a warning if we cannot even attempt to prune chunks due to an error. This is always either a missing feature or a bug (even though it does not impact correctness but _only_ performance). Also see https://github.com/influxdata/conductor/issues/1107 - change metrics to clearly differentiate between "could not prune" and "not pruned" - add new "not pruned" observer hook (this was missing for some reason, the "pruned" hook existed though)	2022-08-05 10:50:31 +00:00
Marco Neumann	0d714878ca	feat: chunk pruning metrics (#5273 ) * refactor: make could-not-prune reason a static string * refactor: introduce `QuerierTableArgs` * feat: chunk pruning metrics Closes #4974. * refactor: address review comments * refactor: use static typing for not-pruned reason * refactor: pass chunk to not-pruned observer and use it for some metrics Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-08-04 15:29:21 +00:00
Nga Tran	34ccc9c7f5	chore: Revert "chore: Revert "refactor: bump batch size (#5251 )" (#5288 )" (#5300 ) This reverts commit `471b8be92f`. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-08-04 13:19:46 +00:00
Marco Neumann	840e4801b8	feat: make querier RAM pool split a proper feature (#5283 ) * feat: make querier RAM pool split a proper feature - use proper pool names - expose sizing via CLI/env Closes https://github.com/influxdata/conductor/issues/1102. * refactor: improve naming and docs Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-08-03 15:27:23 +00:00
Marco Neumann	663a20d743	refactor: remove `--ingster-address` (#5255 ) Closes #5002.	2022-08-03 15:05:01 +00:00
Nga Tran	471b8be92f	chore: Revert "refactor: bump batch size (#5251 )" (#5288 ) This reverts commit `bb172f8fa8`.	2022-08-03 14:23:45 +00:00
Marco Neumann	8e2443d879	feat: use two RAM pools in querier (#5271 ) Quick&Dirty implementation of a RAM-pool split to see if this has any effect. I expect the querier performance to improve due to this because large read buffers can no longer evict precious metadata. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-08-02 15:14:26 +00:00
Marco Neumann	ee491cbbfc	fix: re-enable querier read buffer cache (#5268 ) This reverts commit `82913743f1` / #5252. I misjudged the cache hit ratio for the RB, see https://github.com/influxdata/k8s-infra/pull/4548 So let's bring back the RB cache until we have some form of parquet cache in place.	2022-08-02 08:37:30 +00:00
Marco Neumann	a8f6d579c8	feat: add metric for predicate-based cache entry removal (#5257 )	2022-08-02 07:44:53 +00:00
Marco Neumann	fec6b18d80	feat: add metric for TTL cache expiration (#5256 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-08-02 07:00:30 +00:00
Marco Neumann	82913743f1	refactor: disable querier read buffer cache (#5252 ) Let's try and see how this performs in prod. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-08-01 15:43:22 +00:00
Marco Neumann	bb172f8fa8	refactor: bump batch size (#5251 ) This is what DataFusion uses by default and I don't see a reason why we should use such small batch sizes. The effect is probably only visible in certain filter-aggregate queries that don't focus on a single series (because there we likely end up with 1 or 2 batches only, esp. after #5250) for coarse-grained filters, esp. when the filter key is not the first sort key. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-08-01 13:49:58 +00:00
dependabot[bot]	fbd39844d8	chore(deps): Bump async-trait from 0.1.56 to 0.1.57 (#5247 ) Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.56 to 0.1.57. - [Release notes](https://github.com/dtolnay/async-trait/releases) - [Commits](https://github.com/dtolnay/async-trait/compare/0.1.56...0.1.57) --- updated-dependencies: - dependency-name: async-trait dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-08-01 08:30:33 +00:00
Andrew Lamb	9215a534d0	chore: Update datafusion and `arrow`/`parquet`/`arrow-flight` to `19.0.0` (#5229 ) * chore: Update datafusion and `arrow`/`parquet`/`arrow-flight` to `19.0.0` * chore: Run cargo hakari tasks * fix: Update for API changes * fix: clippy Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-07-28 08:10:47 +00:00
Marco Neumann	9a9a1a4777	feat: limit per-table chunk data for every query (#5223 ) * feat: `QueryChunk::as_any` * feat: allow `ChunkPruner::prune_chunks` to fail * feat: limit per-table chunk data for every query Closes #5211. * fix: address review comments Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>	2022-07-27 13:20:05 +00:00
Marco Neumann	85c186f5b8	feat: cache projected chunk schemas in querier (#5213 ) * feat: cache projected chunk schemas in querier Ref #5202. * refactor: simplify size calculations Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-07-27 08:23:20 +00:00
Andrew Lamb	495bbe48f2	refactor: Reduce boiler plate calling `SpanRecorder::child` (#5180 ) * refactor: call SpanRecorder::child * refactor: update more locations Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-07-22 11:11:45 +00:00
Marco Neumann	0f54281d24	feat: trace namespace cache For #5129.	2022-07-21 16:10:06 +02:00
Marco Neumann	9031ed390b	feat: trace parquet_file cache For #5129.	2022-07-21 16:10:06 +02:00
Marco Neumann	4c5227292f	feat: trace partition cache For #5129.	2022-07-21 16:10:06 +02:00
Marco Neumann	ff88702749	feat: wire up cache tracing (1/2) (#5170 ) * feat: trace tombstone cache For #5129. * feat: trace table cache For #5129. * feat: trace read buffer cache For #5129. * feat: trace processed_tombstones cache For #5129. * refactor: improve span name Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-07-21 13:59:55 +00:00
Nga Tran	69cb3f2b19	refactor: remove min_sequence_number from Compactor and Querier, add `count_by_overlaps_with_level_0` and `count_by_overlaps_with_level_1` to catalog (#5151 ) * refactor: remove min_sequence_number * fix: typos * fix: remove min_sequencer_number from new files from merging main * fix: add back throwing error if the compactor compacts files persisted by the ingester after the ingester sends max seq_num back to querier * test: add test_compactor_collision back but modify the input to make it work with new changes Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-07-21 13:51:54 +00:00
Marco Neumann	b35502ce61	feat: cache tracing (#5164 ) * feat: cache tracing Add tracing to the metrics cache wrapper. The extra arguments for GET and PEEK make this quite simple, because the wrapper can just extend the inner args with the trace information. We currently terminate the span in `querier::cache` (i.e. only pass in `None`, so no tracing will occur) to keep this PR rather small. This will be changed in subsequent PRs. For #5129. * fix: typo Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-07-21 11:54:22 +00:00
Marco Neumann	0561423475	refactor: enforce proper `IOxSessionContext` (#5158 ) - remove `IOxSessionContext::default()` because untracked contexts should only be created by tests - remove `Option<IOxSessionContext>` because it is a typed workaround for `IOxSessionContext::default` Tests should use `IOxSessionContext::testing` and all _normal_ users should create proper contexts. I suspect this will help tracing or at least prevent silent regressions. See #5129. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-07-20 16:25:43 +00:00
Marco Neumann	3b8f98c7b8	feat: allow passing for extra arguments to `Cache::peek` (#5161 ) This will be used to pass spans down to `CacheWithMetrics` (or a new wrapper specific to tracing) and will help with #5129. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-07-20 13:51:21 +00:00
Marco Neumann	8b9119a0c6	feat: trace querier->ingester, stopping at gRPC layer (#5159 ) This adds tracing of querier->ingester request up to the point where we perform the network request, i.e. the trace will only appear on the querier side. We may extend this at some point to carry the tracing information to the ingester as well. Ref #5129. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-07-20 11:48:52 +00:00
Marco Neumann	b8d9799a26	feat: wire span all the way to `QuerierTable::chunks` (#5134 ) * feat: pass context to `QueryDatabase::chunks` * feat: wire span all the way to `QuerierTable::chunks` This is required for #5129.	2022-07-19 14:12:55 +00:00
Andrew Lamb	e2d871b00b	chore: Update datafusion and arrow/parquet/arrow-flight to `18.0.0` (#5079 ) * chore: Update datafusion to 10.0.0, arrow/parquet/arrow-flight to 18 * chore: Run cargo hakari tasks * fix: update cargo pin Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-07-18 15:01:03 +00:00
Marco Neumann	f0bd278652	feat: add tracing to instrumented semaphores (#5130 ) This will allow us to easily see how much time we spend during query processing waiting for the query semaphore. Ref #5129.	2022-07-15 07:50:28 +00:00
dependabot[bot]	9b67de2f43	chore(deps): Bump tokio from 1.19.2 to 1.20.0 Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.19.2 to 1.20.0. - [Release notes](https://github.com/tokio-rs/tokio/releases) - [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.19.2...tokio-1.20.0) --- updated-dependencies: - dependency-name: tokio dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com>	2022-07-14 01:21:43 +00:00
Carol (Nichols || Goulding)	61c023139b	refactor: Switch compaction levels to an enum with values rather than separate consts Bonuses: - Type checking - Validation - Less casting - Exhaustiveness checking - Less use of the numerical value	2022-07-13 11:30:36 -04:00
Marco Neumann	89c24dfec0	fix: do not force-load chunks into read buffer (#5112 ) I forgot to address a TODO in #5091. Extends the test to actually check the chunk stage and removes the function for manual force-loads. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-07-13 14:46:24 +00:00
Marco Neumann	b1b2cb5d4a	feat: load read buffer on demand (#5091 ) * refactor: extract `select_schema` * refactor: improve `InternalLostInputField` error message * test: improve SQL runner output * feat: load read buffer on demand Closes #5032. * refactor: move `[Half]OwnedSelection` to `schema` crate	2022-07-13 08:51:40 +00:00
Nga Tran	bce8924b4c	refactor: use max_sequence_number to sort chunks for deduplication (#5101 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-07-12 16:23:53 +00:00
Marco Neumann	96da584139	test: do NOT create expensive bloom filters when we do not need them (#5089 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-07-11 16:29:53 +00:00


303 Commits (952a3ea4983629a8589087d07d168ec8f77aba04)