influxdb

Commit Graph

Author	SHA1	Message	Date
Andrew Lamb	786711f244	chore: Update datafusion (#5672 ) * chore: Update datafusion pin * chore: Update expected results Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-09-20 10:19:43 +00:00
Marco Neumann	7e00426d49	refactor: concurrent table scan for "tag values" (#5671 ) Ref #5668.	2022-09-19 14:11:51 +00:00
Marco Neumann	274bd80ecd	refactor: concurrent table scan for "tag keys" (#5670 ) * refactor: concurrent table scan for "tag keys" Ref #5668. * feat: add table name to context metadata	2022-09-19 13:27:18 +00:00
Marco Neumann	ef09573255	refactor: concurrent table scan in "field columns" (#5651 ) * refactor: concurrent table scan in "field columns" Similar to #5647 and #5649. * docs: improve Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-09-19 10:50:25 +00:00
Dom Dwyer	b0eb85ddd5	refactor: store ShardId in child nodes Instead of passing the ShardId into each function for child nodes of the Shard, store it. This avoids the possibility of mistakenly passing the wrong value.	2022-09-16 18:00:11 +02:00
Marco Neumann	e346433914	refactor: concurrent table scan for "table names" (#5649 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-09-15 15:39:00 +00:00
Marco Neumann	159250e776	refactor: concurrent table planning in InfluxRPC (#5647 ) * refactor: concurrent table planning in InfluxRPC Some InfluxRPC can scan multiple tables. Prior to this PR we were always scanning the tables in sequence, adding up potential latencies (catalog, ingester, object store). There is no reason we need to do this, "ordinary" SQL queries would not serialize this way either. So let's scan tables concurrently. This add concurrency to: - read filter - read group - read window aggregate There are other query types that could benefit from a similar treatment. They will be changed in a follow-up. * docs: improve Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> * test: explain `Send` assertion * refactor: change `CONCURRENT_TABLE_JOBS` to 10 Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>	2022-09-15 13:55:22 +00:00
Dom Dwyer	074722eb3e	refactor(ingester): split data.rs into modules Breaks the gigantic data.rs file into sub-modules for Shard, Namespace, Table, Partition, and finally the actual data buffer used to store writes.	2022-09-14 14:20:19 +02:00
Andrew Lamb	45d795055a	feat: Support calling influxql/flux selector aggregates from IOx SQL (#5628 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-09-14 10:37:17 +00:00
Andrew Lamb	1fd31ee3bf	chore: Update datafusion / `arrow` / `arrow-flight` / `parquet` to version 22.0.0 (#5591 ) * chore: Update datafusion / `arrow` / `arrow-flight` / `parquet` to version 22.0.0 * fix: enable dynamic comparison flag * chore: derive Eq for clippy * chore: update explain plans * chore: Update sizes for ReadBuffer encoding * chore: update more tests Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-09-12 17:45:03 +00:00
Marco Neumann	df5ef875b4	revert: disable read buffer usage in querier (#5579 ) (#5603 ) This results in a 2x-3x slow down. It's not horrible, but also not good.	2022-09-09 11:26:09 +00:00
Marco Neumann	c3b47dfe59	refactor: disable read buffer usage in querier (#5579 ) * refactor: read querier parquet files from cache * refactor: only use parquet files in querier (no RB) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-09-08 13:18:22 +00:00
Marco Neumann	adeacf416c	ci: fix (#5569 ) * ci: use same feature set in `build_dev` and `build_release` * ci: also enable unstable tokio for `build_dev` * chore: update tokio to 1.21 (to fix console-subscriber 0.1.8 * fix: "must use"	2022-09-06 14:13:28 +00:00
dependabot[bot]	366c4d9965	chore(deps): Bump once_cell from 1.13.1 to 1.14.0 (#5555 ) Bumps [once_cell](https://github.com/matklad/once_cell) from 1.13.1 to 1.14.0. - [Release notes](https://github.com/matklad/once_cell/releases) - [Changelog](https://github.com/matklad/once_cell/blob/master/CHANGELOG.md) - [Commits](https://github.com/matklad/once_cell/compare/v1.13.1...v1.14.0) --- updated-dependencies: - dependency-name: once_cell dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-09-06 09:02:28 +00:00
Marco Neumann	0a0b3bd95b	feat: querier object store cache (#5527 ) * feat: querier object store cache * docs: improve Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com> Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>	2022-09-02 09:48:53 +00:00
Dom	ed2490deb2	Merge branch 'main' into dom/ingester-row-limit	2022-08-31 14:56:42 +01:00
Dom Dwyer	2a19606456	feat(ingester): restrict partition row count This limit restricts a single partition to containing at most N rows before it is marked for persistence (note: being marked for persistence does not currently prevent further ingest for that partition.)	2022-08-31 15:48:18 +02:00
Andrew Lamb	6669d85fb4	chore: Update datafusion + arrow/parquet to `21.0.0` (#5519 ) * chore: Update arrow/arrow-flight/parquet to 21.0.0 * chore: Update datafusion pin * chore: Fix arrow update script * chore: Update Cargo.lock * chore: Update for new API	2022-08-31 13:30:47 +00:00
Marco Neumann	fecbbd9fa1	refactor: improve namespace caching in querier (#5492 ) 1. Cache converted schema instead of catalog schema. This saves a bunch of memcopies during conversion. 2. Simplify creation of new chunks, we now only need a `CachedTable` instead of a namespace and a table schema. In an artificial benchmark, this removed around 10ms from the query (although that was prior to #5467 which moved schema conversion one level up). Still I think it is the cleaner cache design. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-08-30 11:42:21 +00:00
Andrew Lamb	de47f5605b	chore: Update datafusion (with new sqlparser release) - option 1 (#5433 ) * chore: Update datafusion pin * chore: Update now that user is a reserved word * chore: Update cargo.lock * fix: update query for user function Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-08-29 19:10:00 +00:00
Carol (Nichols || Goulding)	74c9529062	fix: Rename KafkaPartition to ShardIndex	2022-08-29 14:07:18 -04:00
Carol (Nichols || Goulding)	95b7529079	fix: Rename more test values to shard	2022-08-29 14:06:45 -04:00
Carol (Nichols || Goulding)	fe9c474620	fix: rustfmt	2022-08-29 14:06:45 -04:00
Carol (Nichols || Goulding)	698f1a47ff	refactor: Rename test structures from sequencer to shard where appropriate	2022-08-29 14:06:44 -04:00
Jake Goulding	4abf21c724	refactor: Rename Sequencer (and its entourage) to Shard	2022-08-29 14:06:43 -04:00
Marco Neumann	3a4a17a48e	feat: refresh namespace cache before expiration (#5449 ) Closes #5318.	2022-08-29 11:52:18 +00:00
Nga Tran	283e908132	test: workaround for time > a number (#5477 ) * test: workaround for time > a number * chore: cargo update * chore: Revert "chore: cargo update" This reverts commit 0798e4e14674267ddd2308b12a25031fc35de8b6.	2022-08-26 20:52:12 +00:00
Dom Dwyer	abf26767c1	refactor: infallible JumpHash initialisation This doesn't really need to be fallible but forces propagation of a ton of error handling - no shards is always a sign of something being very wrong, and can be caught in the caller if it's for some reason an acceptable state / can be recovered from.	2022-08-24 13:18:57 +02:00
kodiakhq[bot]	2b3ca54168	Merge branch 'main' into cn/upgrade-l0-metrics	2022-08-17 16:01:42 +00:00
dependabot[bot]	78665d3092	chore(deps): Bump once_cell from 1.13.0 to 1.13.1 (#5413 ) Bumps [once_cell](https://github.com/matklad/once_cell) from 1.13.0 to 1.13.1. - [Release notes](https://github.com/matklad/once_cell/releases) - [Changelog](https://github.com/matklad/once_cell/blob/master/CHANGELOG.md) - [Commits](https://github.com/matklad/once_cell/compare/v1.13.0...v1.13.1) --- updated-dependencies: - dependency-name: once_cell dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-08-17 08:31:46 +00:00
Carol (Nichols || Goulding)	ed44817ed1	feat: Add a histogram of ingested (new L0) Parquet file sizes Connects to #5348.	2022-08-15 10:13:54 -04:00
Carol (Nichols || Goulding)	d4a472f775	fix: Remove let bindings that are immediately returned As now caught by clippy. https://rust-lang.github.io/rust-clippy/master/index.html#let_and_return	2022-08-11 15:21:02 -04:00
Carol (Nichols || Goulding)	b982bdaf2f	fix: Derive Eq when we derive PartialEq and members can derive Eq Allow this in generated code that we don't control, though. Recommended by clippy now. https://rust-lang.github.io/rust-clippy/master/index.html#derive_partial_eq_without_eq	2022-08-11 15:04:06 -04:00
Andrew Lamb	16ddc5efc6	chore: Update datafusion / arrow/parquet/arrow-flight and prost/tonic ecosystem (#5360 ) * chore: Update datafusion and arrow * chore: Update Cargo.lock * chore: update to Decimal128 * chore: Update tonic/prost/pbjson/etc * chore: Run cargo hakari tasks * fix: doctest in generated types Co-authored-by: CircleCI[bot] <circleci@influxdata.com>	2022-08-09 17:30:44 +00:00
Andrew Lamb	e0ea335b70	fix: Support RegExMatch and RegExNotMatch predicates on `_field` (#5301 ) * test: add tests for regex_match_on_field * feat: more general `_field` predicate handling * fix: remove old comment * fix: update tests * fix: improve test a little more * fix: fmt * fix: Update predicate/src/rpc_predicate/field_rewrite.rs Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com> * fix: Handle predicates that can not be evaluated Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-08-04 19:42:16 +00:00
Nga Tran	34ccc9c7f5	chore: Revert "chore: Revert "refactor: bump batch size (#5251 )" (#5288 )" (#5300 ) This reverts commit `471b8be92f`. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-08-04 13:19:46 +00:00
Marco Neumann	840e4801b8	feat: make querier RAM pool split a proper feature (#5283 ) * feat: make querier RAM pool split a proper feature - use proper pool names - expose sizing via CLI/env Closes https://github.com/influxdata/conductor/issues/1102. * refactor: improve naming and docs Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-08-03 15:27:23 +00:00
Nga Tran	471b8be92f	chore: Revert "refactor: bump batch size (#5251 )" (#5288 ) This reverts commit `bb172f8fa8`.	2022-08-03 14:23:45 +00:00
Marco Neumann	bb172f8fa8	refactor: bump batch size (#5251 ) This is what DataFusion uses by default and I don't see a reason why we should use such small batch sizes. The effect is probably only visible in certain filter-aggregate queries that don't focus on a single series (because there we likely end up with 1 or 2 batches only, esp. after #5250) for coarse-grained filters, esp. when the filter key is not the first sort key. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-08-01 13:49:58 +00:00
Sam Arnold	3fbe860bb9	fix: interpret [MIN_NANO_TIME, MAX_NANO_TIME) range as all time for optimization (#5231 ) InfluxQL queries can send (technically incorrect) ranges like this, meaning all time but excluding the max nanosecond time. Since this is an important case, we should handle it specially and use the optimized 'all time' handling for meta queries even though this is technically wrong in that it does not filter out column names / measurement names at MAX_NANO_TIME exactly. Closes: https://github.com/influxdata/conductor/issues/1072 Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-07-28 12:24:26 +00:00
Andrew Lamb	9215a534d0	chore: Update datafusion and `arrow`/`parquet`/`arrow-flight` to `19.0.0` (#5229 ) * chore: Update datafusion and `arrow`/`parquet`/`arrow-flight` to `19.0.0` * chore: Run cargo hakari tasks * fix: Update for API changes * fix: clippy Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-07-28 08:10:47 +00:00
Marco Neumann	9a9a1a4777	feat: limit per-table chunk data for every query (#5223 ) * feat: `QueryChunk::as_any` * feat: allow `ChunkPruner::prune_chunks` to fail * feat: limit per-table chunk data for every query Closes #5211. * fix: address review comments Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>	2022-07-27 13:20:05 +00:00
Andrew Lamb	66af2bdd88	refactor: Split up `delete_three_delete_three_chunks.sql` test case (#5197 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-07-22 20:57:31 +00:00
Andrew Lamb	9fed013848	chore: Update datafusion pin (#5162 ) * chore: Update datafusion pin * fix: Update expected output Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-07-20 14:34:08 +00:00
Marko Mikulicic	b8236e2b9d	fix: Fix SeriesKey sort order for special _measurement and _field (#5150 ) * fix: Fix SeriesKey sort order for special _measurement and _field * fix: Update expected test output * fix: Update more tests * fix: Re-sort tag key when using binary encoding Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>	2022-07-20 08:45:17 +00:00
Marco Neumann	b8d9799a26	feat: wire span all the way to `QuerierTable::chunks` (#5134 ) * feat: pass context to `QueryDatabase::chunks` * feat: wire span all the way to `QuerierTable::chunks` This is required for #5129.	2022-07-19 14:12:55 +00:00
Andrew Lamb	e2d871b00b	chore: Update datafusion and arrow/parquet/arrow-flight to `18.0.0` (#5079 ) * chore: Update datafusion to 10.0.0, arrow/parquet/arrow-flight to 18 * chore: Run cargo hakari tasks * fix: update cargo pin Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-07-18 15:01:03 +00:00
Marco Neumann	9c2b6cd96c	fix: always pass proper context to `InfluxRpcPlanner` (#5144 ) There were some instances where we forgot to pass context (and therefore tracing) information to `InfluxRpcPlanner`. This removes the `Default` implementation and requires always passing a context when creating `InfluxRpcPlanner` to prevent this type of bug. Ref #5129. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-07-18 14:45:22 +00:00
dependabot[bot]	9b67de2f43	chore(deps): Bump tokio from 1.19.2 to 1.20.0 Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.19.2 to 1.20.0. - [Release notes](https://github.com/tokio-rs/tokio/releases) - [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.19.2...tokio-1.20.0) --- updated-dependencies: - dependency-name: tokio dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com>	2022-07-14 01:21:43 +00:00
Marco Neumann	b1b2cb5d4a	feat: load read buffer on demand (#5091 ) * refactor: extract `select_schema` * refactor: improve `InternalLostInputField` error message * test: improve SQL runner output * feat: load read buffer on demand Closes #5032. * refactor: move `[Half]OwnedSelection` to `schema` crate`	2022-07-13 08:51:40 +00:00

435 Commits (e0ad5e4c204b6828684bd762cb4bea2edcd005b4)