influxdb

Commit Graph

Author	SHA1	Message	Date
Marco Neumann	a5d693eba2	feat: lower Influx regex expressions to DF regex expressions (#6394 ) * feat: lower Influx regex experessions to DF regex expressions For #6388. * refactor: address review comments	2022-12-15 09:33:28 +00:00
kodiakhq[bot]	66c610f7b1	Merge branch 'main' into cn/ingester-persisted-file-count	2022-12-14 14:58:31 +00:00
Marco Neumann	65687bf0fa	test: regex baseline test (#6389 ) For #6388. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-12-13 17:42:31 +00:00
Carol (Nichols \|\| Goulding)	1c7f322a4e	feat: Keep track of and report number of Parquet files persisted Per partition and starting over each time the ingester restarts. Fixes #6334.	2022-12-12 11:45:00 -05:00
kodiakhq[bot]	727efcbdee	Merge branch 'main' into cn/ingester2-uuid	2022-12-12 16:21:15 +00:00
Andrew Lamb	e0ecacf6cc	chore: Update DataFusion (get median fix and automatic string to timestamp coercion) (#6363 ) * chore: Update DataFusion pin to get median fix * chore: Update for new Expr node * test: add test for median * test: add test for coercion of strings to timestamps * chore: Run cargo hakari tasks Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-12-12 12:14:00 +00:00
Carol (Nichols \|\| Goulding)	2fd2d05ef6	feat: Identify each run of an ingester with a Uuid And send that UUID in the Flight response for queries to that ingester run. Fixes #6333.	2022-12-08 17:22:52 -05:00
Andrew Lamb	9175f4a0b5	chore: Upgrade datafusion to get correct support for multi-part identifiers (#6349 ) * test: add tests for periods in measurement names * chore: Update Datafusion * chore: Update for changed APIs * chore: Update expected plan output * chore: Run cargo hakari tasks Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-12-08 11:27:13 +00:00
Marco Neumann	080aff8f71	fix: account for memory allocations in InfluxRPC group outputs (#6345 ) * fix: account for memory allocations in InfluxRPC group outputs This should prevent the querier from OOMing. See https://github.com/influxdata/idpe/issues/16614 . * docs: improve Co-authored-by: Andrew Lamb <alamb@influxdata.com> * refactor: pull out constant Co-authored-by: Andrew Lamb <alamb@influxdata.com>	2022-12-08 09:55:31 +00:00
Marco Neumann	cd6a8a1a82	refactor: DF-driven on-demand mem limit instead of ahead-of-time heuristics (#6313 ) * refactor: DF-driven on-demand mem limit instead of ahead-of-time heuristics Closes #6310. * refactor: rename and tune default exec mem limits * fix: ingester2 bits after rebase	2022-12-05 12:38:28 +00:00
Marco Neumann	514aa60f91	refactor: stream-based(TM) `to_series_and_groups`, part 1 (#6261 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-11-29 14:16:22 +00:00
Andrew Lamb	fc5697b8e7	chore: Update datafusion again (N of N) (#6218 ) * chore: Update datafusion again (4 of N) * fix: Update plans * fix: Update for renamed API * fix: Update more plans * chore: Update to datafusion @ d355f69aae2cc951cfd021e5c0b690861ba0c4ac * fix: update explain plan tests * fix: update test after schema error * chore: Update datafusion again * fix: Add size() calculation to selectors * chore: Run cargo hakari tasks * fix: Update newly added test Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-11-28 17:09:40 +00:00
Nga Tran	45d25b0af2	refactor: remove duplicate tests (#6243 )	2022-11-28 16:39:57 +00:00
Christopher M. Wolff	aa7a3a7721	fix: ignore fields when considering tag predicates (#6212 ) * fix: ignore fields when considering tag predicates * chore: update test to not use time column in predicate * chore: update with review feedback * chore: update tests to avoid fields refs in RPC preds This is more like what would be coming off the wire from Influx RPC. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-11-28 15:16:55 +00:00
Nga Tran	52d70b060a	test: retention test for querier inthe query_tests (#6220 )	2022-11-23 17:04:14 +00:00
Andrew Lamb	9fb1de0428	chore: Update datafusion (2 of N) right before arrow 27 upgrade (#6207 ) * chore: Update datafusion (2 of N) right before arrow 27 upgrade * fix: Update tests for better unsigned pushdown * chore: Run cargo hakari tasks Co-authored-by: CircleCI[bot] <circleci@influxdata.com>	2022-11-23 11:04:14 +00:00
Andrew Lamb	1a1ea74cb7	chore: Upgrade datafusion again (#6160 ) * Revert "Revert "chore: Update datafusion again (#6108)"" This reverts commit 766b3bbeb440618cfe332f6ee7d4f8a8217acc48. * fix: Respect the partition sort key * chore: update plans Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-11-22 19:28:26 +00:00
Nga Tran	dd1755b23a	feat: querier filters data outsude retnetion period (#6209 )	2022-11-22 15:41:00 +00:00
dependabot[bot]	a9db7581cd	chore(deps): Bump tokio from 1.21.2 to 1.22.0 (#6183 ) Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.21.2 to 1.22.0. - [Release notes](https://github.com/tokio-rs/tokio/releases) - [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.21.2...tokio-1.22.0) --- updated-dependencies: - dependency-name: tokio dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-11-21 10:21:24 +00:00
Andrew Lamb	4630bbb956	feat: push down all predicates (#6042 ) * feat: push down all predicates * fix: fmt * fix: fmt Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-11-18 16:22:01 +00:00
Carol (Nichols \|\| Goulding)	02c3083192	fix: Remove table names from Dml operations	2022-11-18 10:40:38 -05:00
Nga Tran	49a9565240	feat: gRPC that creates namespace (#6103 ) * feat: create namespace API call in router Co-authored-by: Nga Tran <nga-tran@live.com> * chore: treat retention as ns except in CLI * fix: overflow in nanosecond calc * fix: retention test after changing it from hours to ns * chore: comment clarification in cli; better response type for error in ns API * fix: correct some rebase mistakes * chore: merge namespace create & create_with_retention; renamed ns create test helper fn & const * fix: ns autocreation test was wrong after rebase * fix: mem catalog has default 1hr retention, accidently removed in rebase * chore: remove mem catalogs default 1hr retention; make it settable in sets & router Co-authored-by: Luke Bond <luke.n.bond@gmail.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-11-18 13:02:12 +00:00
Marco Neumann	71ffc92559	fix: only push safe select expression through de-dup (#6156 ) * fix: only push safe select expression through de-dup Fixes #6066. * docs: improve Co-authored-by: Andrew Lamb <alamb@influxdata.com> * fix: rebase * test: ensure we do not split ORs Co-authored-by: Andrew Lamb <alamb@influxdata.com>	2022-11-18 09:56:11 +00:00
Andrew Lamb	67712b595c	Revert "chore: Update datafusion again (#6108 )" (#6159 ) This reverts commit `fbe9f27f10`.	2022-11-16 21:14:55 +00:00
Andrew Lamb	fbe9f27f10	chore: Update datafusion again (#6108 ) * chore: Update datafusion pin + api code * chore: Run cargo hakari tasks * refactor: combine_sort_key is more idomatic and add rationale comments * refactor: satisfy borrow checker and updated comments * fix: Add test case for combine_sort_key * fix: Apply suggestions from code review Co-authored-by: Marco Neumann <marco@crepererum.net> * fix: Add back test for deeply nested expression * fix: Update output ordering Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: Marco Neumann <marco@crepererum.net> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-11-16 14:41:52 +00:00
Andrew Lamb	20f1ae1c8f	test: tests in the reorg planner and query tests for merging parquet files (#6137 ) * test: tests in the reorg planner and query tests for merging parquet files * fix: use 20 files Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-11-15 20:29:44 +00:00
Carol (Nichols \|\| Goulding)	3943faf998	fix: Remove namespace from DmlWrite and DmlDelete constructors	2022-11-14 16:46:04 -05:00
Dom Dwyer	9e97866b48	refactor: internalise PartitionProvider Removes the need to leak the PartitionProvider outside of the ingester crate. This will allow the PartitionProvider to utilise a DeferredLoad<TableName> without having to make the DeferredLoad and TableName pub.	2022-11-14 10:50:05 +01:00
Carol (Nichols \|\| Goulding)	0657ad9600	fix: Rename QueryDatabase to QueryNamespace	2022-11-11 16:14:12 -05:00
Carol (Nichols \|\| Goulding)	fa46951524	fix: Remove needless deref done by auto deref, thanks Clippy!	2022-11-09 10:54:18 -05:00
Marco Neumann	1a5fc3d772	test: use `EXPLAIN ANALYZE` for SQL metric tests (#6084 ) * test: use `EXPLAIN ANALYZE` for SQL metric tests Needs a bit more infra (due to normalization), but this seems to be worth it so we can easily hook up more metrics in the future. * docs: explain regexes	2022-11-09 09:00:27 +00:00
Marco Neumann	903f7bafa7	refactor: expose `ParquetExec` directly to DataFusion phys. plan (#6072 ) * refactor: expose `ParquetExec` directly to DataFusion phys. plan Closes #5897. * fix: update tracing tests * refactor: use `EmptyExec` * refactor: use `target_partitions` * refactor: improve UUID normalization in query tests Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>	2022-11-08 12:19:28 +00:00
Andrew Lamb	034d9b371d	chore: Update datafusion and arrow/arrow-flight/parquet to `26.0.0` (#6061 ) * chore: Update datafusion and arrow/arrow-flight/parquet to `26.0.0` * fix: Update query_functions * fix: update for TimestampNanosecondArray API changes * fix: update for TimestampNanosecondArray API changes * chore: Update flatbuffers and remove rustsec warning * chore: Update text * fix: update more test * fix: Lock ahash to exactly 0.8.0 * fix: Update datafusion pin * chore: Run cargo hakari tasks Co-authored-by: Carol (Nichols \|\| Goulding) <carol.nichols@gmail.com> Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-11-07 11:01:58 +00:00
Marco Neumann	f511db380c	refactor: remove table name from chunks (#6063 ) It should be always clear from the context to which table a chunk belongs. I think having a table name bound to a chunk goes back to a time where chunks had multiple tables. Helps with #6049. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-11-07 10:42:57 +00:00
Carol (Nichols \|\| Goulding)	09e9b69b85	Merge remote-tracking branch 'origin/main' into dom/dml-delete-namespace-id	2022-11-04 14:56:10 -04:00
Andrew Lamb	8c8e607dca	chore: Update datafusion pin (#6054 ) * chore: Update datafusion pin * chore: Run cargo hakari tasks * chore: Update expected error Co-authored-by: CircleCI[bot] <circleci@influxdata.com>	2022-11-03 19:41:31 +00:00
Dom Dwyer	6fa48731aa	feat: NamespaceId in DmlDelete Changes the DmlDelete to contain the NamespaceId for which it should be applied, propagating this value over the wire. Like the existing IDs within the DmlWrite, these values are marked unsafe to use due to avoid the consumers utilising them accidentally during deployment. Unlike DmlWrite, the DmlDelete is completely unused, so this is less of an issue.	2022-11-03 13:57:40 +01:00
Andrew Lamb	4fb2843d05	refactor: Rename `schema::selection::Selection` to `schema::projection::Projection` (#6037 ) * chore: Rename `schema::selection::Selection` to `schema::projection::Projection` * fix: docs Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-11-02 18:15:04 +00:00
Andrew Lamb	58838e214e	feat: enable parquet predicate pushdown in IOx (#5930 )	2022-11-02 18:00:47 +00:00
Dom Dwyer	ddd6ab0ba4	refactor(write_buffer): pass IDs in wire format This commit is part of a two-part change in order to add the table & namespace IDs to the write buffer wire format. This commit forms the first half; changing the producer to send the IDs. In this commit the new ID values are never read on the consumer side, ensuring there is no consumer dependency on them. This ensures they remain operational during a rollout, where the consumer may be updated to the latest code dependent on the IDs before the producer is updated to send them. This also ensures we have a window of time where where the consumers can be rolled back after being updated, and still handle replaying messages in Kafka.	2022-11-02 13:28:56 +01:00
dependabot[bot]	b1572c50a6	chore(deps): Bump once_cell from 1.15.0 to 1.16.0 (#6009 ) Bumps [once_cell](https://github.com/matklad/once_cell) from 1.15.0 to 1.16.0. - [Release notes](https://github.com/matklad/once_cell/releases) - [Changelog](https://github.com/matklad/once_cell/blob/master/CHANGELOG.md) - [Commits](https://github.com/matklad/once_cell/compare/v1.15.0...v1.16.0) --- updated-dependencies: - dependency-name: once_cell dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-10-31 16:23:40 +00:00
Dom Dwyer	72a358e52f	refactor(dml): PartitionKey required for writes Changes the DmlWrite type to require a PartitionKey be specified, instead of accepting an Option. This requirement was already in place - the write buffer upheld an invariant that all writes contained a partition key value (was not "None") or it panicked at runtime when attempting to enqueue the write. It is now possible to encode this invariant in the type system, which is what this change does.	2022-10-28 10:57:30 +02:00
Carol (Nichols \|\| Goulding)	3145e2c05b	feat: Use workspace dep inheritance for the arrow crate	2022-10-26 10:34:29 -04:00
Carol (Nichols \|\| Goulding)	44936f661a	feat: Use workspace dep inheritance for datafusion instead of shim crate	2022-10-26 10:33:56 -04:00
Andrew Lamb	474620f4a7	chore: Update datafusion and other dependencies (#5976 ) * chore: Update datafusion and other dependencies * chore: Update expected plan * chore: Run cargo hakari tasks Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-10-26 14:14:13 +00:00
Marco Neumann	9b48437711	refactor: make influx column type mandatory (#5978 ) We basically assume everywhere that a column falls into one of the three known categories (time, tag, field), so lets encode this in our type system instead of defining "unknown" as "undefined behavior, may or may not crash". Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-10-26 11:20:29 +00:00
Carol (Nichols \|\| Goulding)	2e83e04eab	feat: Use workspace package metadata to reduce differences and repetition	2022-10-24 13:04:09 -04:00
Marco Neumann	3e4db81bc6	refactor: make `SchemaBuilder::field` fallible It would be nice if the IOx data type would not be optional and this is a prep clean-up to achieve that.	2022-10-24 18:12:42 +02:00
Marco Neumann	1d440ddb2d	refactor: `IOxReadFilterNode` can always accumulate statistics (#5954 ) * refactor: `IOxReadFilterNode` can always accumulate statistics `IOxReadFilterNode` used to not emit statistics if one chunk has duplicates or delete predicates. This is wrong (or at least overly conservative), because the node itself (or the chunks themselves) do NOT perform dedup or delete predicate filtering. Instead this is done is done by parent nodes (`DeduplicateExec` and `FilterExec`) and its their job to propagate statistics correctly. Helps w/ #5897. * test: explain setup Co-authored-by: Andrew Lamb <alamb@influxdata.com> Co-authored-by: Andrew Lamb <alamb@influxdata.com>	2022-10-24 13:34:22 +00:00
Marco Neumann	e0062f2d40	refactor: do NOT use fake DF context for parquet reading (#5942 ) Use the proper top-level DataFusion context and register the object store there. Note that we still hide the `ParquetExec` behind an opaque record batch stream. Fixing that is next on my list. Helps with #5897. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-10-24 08:20:26 +00:00

513 Commits (07772e8d2254fb734e7f826298559658a4964015)