influxdb

Commit Graph

Author	SHA1	Message	Date
Marco Neumann	7d06a61b5f	fix: use `create_at` to order querier chunks under kafkaless (#6554 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-01-10 17:15:08 +00:00
Marco Neumann	042b7c4521	refactor: invalidate querier cache if ingester is gone (#6550 ) * refactor: invalidate querier cache if ingester is gone For #6549 but I think even w/o the plan illustrated there, this is the right thing to do. Also changes the cache system to use flat sorted vectors instead of costly hash maps. * refactor: simplify code Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-01-10 13:46:18 +00:00
Marco Neumann	2bb6db3f37	fix: ensure ingester state tracked in querier cache is always in-sync (#6512 ) Fixes #6510. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-01-10 12:00:05 +00:00
dependabot[bot]	c68049c37a	chore(deps): Bump regex from 1.7.0 to 1.7.1 (#6546 ) Bumps [regex](https://github.com/rust-lang/regex) from 1.7.0 to 1.7.1. - [Release notes](https://github.com/rust-lang/regex/releases) - [Changelog](https://github.com/rust-lang/regex/blob/master/CHANGELOG.md) - [Commits](https://github.com/rust-lang/regex/compare/1.7.0...1.7.1) --- updated-dependencies: - dependency-name: regex dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-01-10 09:55:41 +00:00
dependabot[bot]	b49cc2e35e	chore(deps): Bump tokio from 1.24.0 to 1.24.1 (#6545 ) Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.24.0 to 1.24.1. - [Release notes](https://github.com/tokio-rs/tokio/releases) - [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.24.0...tokio-1.24.1) --- updated-dependencies: - dependency-name: tokio dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-01-10 09:48:44 +00:00
dependabot[bot]	e31c84a794	chore(deps): Bump async-trait from 0.1.60 to 0.1.61 (#6533 ) Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.60 to 0.1.61. - [Release notes](https://github.com/dtolnay/async-trait/releases) - [Commits](https://github.com/dtolnay/async-trait/compare/0.1.60...0.1.61) --- updated-dependencies: - dependency-name: async-trait dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2023-01-09 07:44:35 +00:00
Raphael Taylor-Davies	e1036a0c63	refactor: cleanup schema boxing (#6511 ) * refactor: cleanup Schema boxing * chore: clippy	2023-01-06 10:57:39 +00:00
Marco Neumann	6f4b128285	refactor: improve "Parquet files after filtering" dbg log (#6502 ) - Place IDs last because they may hit the "max line length" limit and be truncated. The other information should NOT be truncated with it. - Unpack IDs to integer to remove useless `ParquetFileID(...)` wrappers in output. - Print number of files in addition to the actual list to simplify debugging.	2023-01-05 11:13:33 +00:00
Carol (Nichols \|\| Goulding)	f121d395cc	refactor: Extract a constructor for PolicyBackend using a HashMap	2022-12-21 14:32:35 -05:00
Carol (Nichols \|\| Goulding)	7c6ccdb6d7	fix: Use keys and values functions. Thanks clippy!	2022-12-21 14:32:35 -05:00
Carol (Nichols \|\| Goulding)	56ba3b17de	fix: Allow partitions from ingesters to overlap in RPC write mode This was added in c82d0d8ca6dc02dcdd40a4c656a1ee51f3f9bfee with the comment: > Right now this would clearly indicate a bug and before I am trying to > understand some prod issues, I wanna rule that one out. In the RPC write path, this isn't a bug, it's quite expected.	2022-12-21 11:32:58 -05:00
Carol (Nichols \|\| Goulding)	257c155d1e	chore: Line wrapping at 100 cols	2022-12-21 11:18:47 -05:00
Dom Dwyer	adc6fcfb04	feat(catalog): linearise sort key updates Updating the sort key is not commutative and MUST be serialised. The correctness of the current catalog interface relies on the caller serialising updates globally, something it cannot reasonably assert in a distributed system. This change of the catalog interface pushes this responsibility to the catalog itself where it can be effectively enforced, and allows a caller to detect parallel updates to the sort key.	2022-12-20 12:31:00 +01:00
Carol (Nichols \|\| Goulding)	200f4fe9bd	fix: Disable parquet file filtering in the querier based on max seq num in RPC write mode (#6443 ) Connects to #6421. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-12-19 18:01:21 +00:00
Andrew Lamb	9b22ede3f0	refactor: Make arrow flight client return `futures::Streams` (#6438 ) * refactor: Make arrow flight client use futures::Streams * refactor: concision	2022-12-19 17:09:26 +00:00
Andrew Lamb	94c2f94ea1	refactor: Extract common ArrowFlight client into iox_arrow_flight (#6427 ) * refactor: Extract common ArrowFlight client into iox_arrow_flight * chore: Run cargo hakari tasks * fix: clarify intent of iox_arrow_flight crate * refactor: Apply suggestions from code review Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com> * fix: loop --> while let * fix: Remove make_tonic_error in favor of From impl Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-12-19 11:35:20 +00:00
dependabot[bot]	c72734473c	chore(deps): Bump async-trait from 0.1.59 to 0.1.60 (#6433 ) Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.59 to 0.1.60. - [Release notes](https://github.com/dtolnay/async-trait/releases) - [Commits](https://github.com/dtolnay/async-trait/compare/0.1.59...0.1.60) --- updated-dependencies: - dependency-name: async-trait dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-12-19 10:09:23 +00:00
Marco Neumann	ffe8b98f47	refactor: clean up querier code base (#6404 ) * refactor: `s/QuerierChunk/QuerierParquetChunk/g` * refactor: isolate parquet chunk creation code * refactor: fuse `chunk` and `chunk_parts` * refactor: pass catalog cache instead of chunk adapter to state reconciler * refactor: move parquet chunks creation into its own method Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-12-15 07:01:11 +00:00
kodiakhq[bot]	d6afc9eee1	Merge branch 'main' into cn/ingester-persisted-file-count	2022-12-14 15:48:59 +00:00
Marco Neumann	4e36c590af	refactor: speed up partition sort key syncing (#6400 ) * refactor: speed up partition sort key syncing Prior to syncing, all chunks have a "locally correct" partition sort key, i.e. one that at least covers all chunk columns (this is ensured during chunk creation, both for parquet chunks as well as ingester chunks). However due to the timing, some chunks may have a newer (= longer) partition sort key. All we need to do to fix this is to pick the longest partition sort key, there is no need to go through the whole cache system again. For #6358. * docs: improve Co-authored-by: Andrew Lamb <alamb@influxdata.com> Co-authored-by: Andrew Lamb <alamb@influxdata.com>	2022-12-14 15:48:08 +00:00
kodiakhq[bot]	66c610f7b1	Merge branch 'main' into cn/ingester-persisted-file-count	2022-12-14 14:58:31 +00:00
Marco Neumann	c51548f28b	refactor: improve concurrency during parquet chunk creation (#6376 ) * refactor: de-correlate parquet file processing * refactor: increase concurrent chunk creation jobs to 100 (from 10) * docs: improve Co-authored-by: Andrew Lamb <alamb@influxdata.com> * refactor: use deterministic RNG Co-authored-by: Andrew Lamb <alamb@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-12-13 16:16:09 +00:00
Carol (Nichols \|\| Goulding)	44c3486db0	feat: Expire the querier's cache using info from ingester2 Fixes #6335. For each table, keep track of the ingester UUIDs and associated persisted Parquet file counts that we've seen from previous requests to ingesters. When doing a query, determine if we should expire the Parquet file catalog cache by looking at the new information from the ingesters. If we see a new ingester UUID or if the number of persisted files for a known ingester UUID is different than what we've stored, then we should expire this table's Parquet file cache. Either way, incorporate the new information into the saved values for comparing with the next request.	2022-12-12 15:53:39 -05:00
Carol (Nichols \|\| Goulding)	b4b50d7dc1	feat: Collect the ingester UUIDs and persistence counts in the table And pass them to the parquet file cache, which doesn't use them yet.	2022-12-12 15:52:56 -05:00
Carol (Nichols \|\| Goulding)	b0ba171742	feat: Keep track of ingester UUIDs and counts in IngesterPartition	2022-12-12 15:52:08 -05:00
Carol (Nichols \|\| Goulding)	9c8b55c5be	docs: Fix some wrapping/typos in comments	2022-12-12 14:30:52 -05:00
Carol (Nichols \|\| Goulding)	1c7f322a4e	feat: Keep track of and report number of Parquet files persisted Per partition and starting over each time the ingester restarts. Fixes #6334.	2022-12-12 11:45:00 -05:00
Carol (Nichols \|\| Goulding)	33886970ef	refactor: Extract a helper fn for test messages Reduces duplication, makes it easier to see what's different between the tests, will make it easier to add another field in the next commit	2022-12-12 11:45:00 -05:00
kodiakhq[bot]	727efcbdee	Merge branch 'main' into cn/ingester2-uuid	2022-12-12 16:21:15 +00:00
Marco Neumann	e49ffc02f8	refactor: faster sort key calculation (#6375 ) Avoid nasty string lookups to determine which columns make a parquet's sort key. For #6358. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-12-12 15:32:04 +00:00
Marco Neumann	6b1c43f01e	refactor: use column IDs for partition cache invalidation (#6374 ) This shall avoid a bunch of string hashing during query planning. For #6358. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-12-12 14:22:28 +00:00
Marco Neumann	db933c44b6	refactor: store reverse column ID map for cached tables (#6360 )	2022-12-09 11:58:24 +00:00
Marco Neumann	450b452148	refactor: avoid string-hashing of parquet file column names (#6359 )	2022-12-09 11:51:18 +00:00
Carol (Nichols \|\| Goulding)	2fd2d05ef6	feat: Identify each run of an ingester with a Uuid And send that UUID in the Flight response for queries to that ingester run. Fixes #6333.	2022-12-08 17:22:52 -05:00
kodiakhq[bot]	6f7cb5ccf0	Merge branch 'main' into cn/ingester2-querier	2022-12-08 14:00:49 +00:00
Marco Neumann	d4e321a2bd	refactor: add additional span around chunk spans (#6353 ) * refactor: add additional span around chunk spans * docs: improve Co-authored-by: Andrew Lamb <alamb@influxdata.com> Co-authored-by: Andrew Lamb <alamb@influxdata.com>	2022-12-08 13:57:32 +00:00
Andrew Lamb	9175f4a0b5	chore: Upgrade datafusion to get correct support for multi-part identifiers (#6349 ) * test: add tests for periods in measurement names * chore: Update Datafusion * chore: Update for changed APIs * chore: Update expected plan output * chore: Run cargo hakari tasks Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-12-08 11:27:13 +00:00
Carol (Nichols \|\| Goulding)	e13e668d26	refactor: Share more code in the querier in the RPC write path mode	2022-12-07 13:54:08 -05:00
Carol (Nichols \|\| Goulding)	b1c5ec4dee	fix: Correct compiler errors in places I missed while running crate tests	2022-12-07 10:25:36 -05:00
Carol (Nichols \|\| Goulding)	9166ace796	feat: Make a mode for the querier to use ingester2 instead, behind the rpc_write feature flag	2022-12-07 09:56:50 -05:00
dependabot[bot]	1d38d400f0	chore(deps): Bump object_store from 0.5.1 to 0.5.2 (#6339 ) * chore(deps): Bump object_store from 0.5.1 to 0.5.2 Bumps [object_store](https://github.com/apache/arrow-rs) from 0.5.1 to 0.5.2. - [Release notes](https://github.com/apache/arrow-rs/releases) - [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG-old.md) - [Commits](https://github.com/apache/arrow-rs/compare/object_store_0.5.1...object_store_0.5.2) --- updated-dependencies: - dependency-name: object_store dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> * chore: Run cargo hakari tasks Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-12-06 07:53:54 +00:00
Marco Neumann	cd6a8a1a82	refactor: DF-driven on-demand mem limit instead of ahead-of-time heuristics (#6313 ) * refactor: DF-driven on-demand mem limit instead of ahead-of-time heuristics Closes #6310. * refactor: rename and tune default exec mem limits * fix: ingester2 bits after rebase	2022-12-05 12:38:28 +00:00
Marco Neumann	ec2e72d223	test: simplify test executors (#6312 ) Have a single global test executor w/ reasonable defaults. Also don't require tests to join/await executor shutdowns (most tests forget this anyways and will get a runtime warning). Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-12-02 11:38:18 +00:00
Marco Neumann	befc6d668b	fix: avoid user error for unsupported querier<>ingester preds (#6238 ) Fixes #6195. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-11-28 16:51:41 +00:00
Nga Tran	45d25b0af2	refactor: remove duplicate tests (#6243 )	2022-11-28 16:39:57 +00:00
Andrew Lamb	1a1ea74cb7	chore: Upgrade datafusion again (#6160 ) * Revert "Revert "chore: Update datafusion again (#6108)"" This reverts commit 766b3bbeb440618cfe332f6ee7d4f8a8217acc48. * fix: Respect the partition sort key * chore: update plans Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-11-22 19:28:26 +00:00
Andrew Lamb	f89d542715	refactor: Minor cleanup of retention predicate code (#6211 ) * refactor: Minor cleanup of retention predicate code * fix: use cow	2022-11-22 18:28:54 +00:00
Nga Tran	dd1755b23a	feat: querier filters data outside retention period (#6209 )	2022-11-22 15:41:00 +00:00
Marco Neumann	0c6afd7dbe	refactor: tune circuit breaker config (#6202 ) At the moment it takes way too long to half-open and close circuits once they were opened. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-11-22 09:03:04 +00:00
dependabot[bot]	04c00bbb62	chore(deps): Bump bytes from 1.2.1 to 1.3.0 (#6199 ) Bumps [bytes](https://github.com/tokio-rs/bytes) from 1.2.1 to 1.3.0. - [Release notes](https://github.com/tokio-rs/bytes/releases) - [Changelog](https://github.com/tokio-rs/bytes/blob/master/CHANGELOG.md) - [Commits](https://github.com/tokio-rs/bytes/commits) --- updated-dependencies: - dependency-name: bytes dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-11-22 08:23:24 +00:00
dependabot[bot]	a9db7581cd	chore(deps): Bump tokio from 1.21.2 to 1.22.0 (#6183 ) Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.21.2 to 1.22.0. - [Release notes](https://github.com/tokio-rs/tokio/releases) - [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.21.2...tokio-1.22.0) --- updated-dependencies: - dependency-name: tokio dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-11-21 10:21:24 +00:00
Nga Tran	49a9565240	feat: gRPC that creates namespace (#6103 ) * feat: create namespace API call in router Co-authored-by: Nga Tran <nga-tran@live.com> * chore: treat retention as ns except in CLI * fix: overflow in nanosecond calc * fix: retention test after changing it from hours to ns * chore: comment clarification in cli; better response type for error in ns API * fix: correct some rebase mistakes * chore: merge namespace create & create_with_retention; renamed ns create test helper fn & const * fix: ns autocreation test was wrong after rebase * fix: mem catalog has default 1hr retention, accidentally removed in rebase * chore: remove mem catalogs default 1hr retention; make it settable in sets & router Co-authored-by: Luke Bond <luke.n.bond@gmail.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-11-18 13:02:12 +00:00
Andrew Lamb	67712b595c	Revert "chore: Update datafusion again (#6108 )" (#6159 ) This reverts commit `fbe9f27f10`.	2022-11-16 21:14:55 +00:00
Andrew Lamb	fbe9f27f10	chore: Update datafusion again (#6108 ) * chore: Update datafusion pin + api code * chore: Run cargo hakari tasks * refactor: combine_sort_key is more idiomatic and add rationale comments * refactor: satisfy borrow checker and updated comments * fix: Add test case for combine_sort_key * fix: Apply suggestions from code review Co-authored-by: Marco Neumann <marco@crepererum.net> * fix: Add back test for deeply nested expression * fix: Update output ordering Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: Marco Neumann <marco@crepererum.net> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-11-16 14:41:52 +00:00
Marco Neumann	62851afc27	feat: add querier->ingester circuit breaker (#6147 ) * feat: add log ingester memory pressure persist * feat: add querier->ingester circuit breaker Closes #4608. * docs: explain high-level circuit breaker * docs: improve Co-authored-by: Andrew Lamb <alamb@influxdata.com> * test: add additional test assertion * refactor: upgrade info to warning log Co-authored-by: Andrew Lamb <alamb@influxdata.com>	2022-11-16 10:50:33 +00:00
dependabot[bot]	a969754819	chore(deps): Bump chrono from 0.4.22 to 0.4.23 (#6129 ) * chore(deps): Bump chrono from 0.4.22 to 0.4.23 Bumps [chrono](https://github.com/chronotope/chrono) from 0.4.22 to 0.4.23. - [Release notes](https://github.com/chronotope/chrono/releases) - [Changelog](https://github.com/chronotope/chrono/blob/main/CHANGELOG.md) - [Commits](https://github.com/chronotope/chrono/compare/v0.4.22...v0.4.23) --- updated-dependencies: - dependency-name: chrono dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> * refactor: chrono future compat Integer->timestamp conversions should not silently panic. Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Marco Neumann <marco@crepererum.net> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-11-14 13:34:09 +00:00
Carol (Nichols \|\| Goulding)	3dde82b3b9	fix: Rename QueryDatabaseProvider to QueryNamespaceProvider	2022-11-11 16:14:12 -05:00
Carol (Nichols \|\| Goulding)	0657ad9600	fix: Rename QueryDatabase to QueryNamespace	2022-11-11 16:14:12 -05:00
Carol (Nichols \|\| Goulding)	621560a0dc	fix: Rename QueryDatabaseMeta to QueryNamespaceMeta	2022-11-11 16:14:12 -05:00
Carol (Nichols \|\| Goulding)	bdff4e8848	fix: Consistently use 'namespace' instead of 'database' in comments and other internal text	2022-11-11 15:46:04 -05:00
Dom	d9c97795fc	feat: use IDs in ingester query API (#6093 ) * refactor: NS+table ID (instead of name) in querier<>ingester * feat(ingester): use IDs for query API Changes the ingester to utilise the ID fields (instead of names) sent over the query wire message wrapped within the Flight API. BREAKING: this changes the "query-ingester" CLI command arguments which now expects the namespace & table IDs, rather than their names. * refactor(ingester): add more query logging context Updates the log messages during query execution to include more context fields. * style: remove unused import Co-authored-by: Marco Neumann <marco@crepererum.net>	2022-11-09 11:25:13 +00:00
Marco Neumann	903f7bafa7	refactor: expose `ParquetExec` directly to DataFusion phys. plan (#6072 ) * refactor: expose `ParquetExec` directly to DataFusion phys. plan Closes #5897. * fix: update tracing tests * refactor: use `EmptyExec` * refactor: use `target_partitions` * refactor: improve UUID normalization in query tests Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>	2022-11-08 12:19:28 +00:00
Marco Neumann	f511db380c	refactor: remove table name from chunks (#6063 ) It should be always clear from the context to which table a chunk belongs. I think having a table name bound to a chunk goes back to a time where chunks had multiple tables. Helps with #6049. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-11-07 10:42:57 +00:00
Andrew Lamb	4fb2843d05	refactor: Rename `schema::selection::Selection` to `schema::projection::Projection` (#6037 ) * chore: Rename `schema::selection::Selection` to `schema::projection::Projection` * fix: docs Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-11-02 18:15:04 +00:00
Andrew Lamb	3ba0458653	feat: Add object_store handler to querier so `remote get-table` works (#6014 ) * feat: Add object_store handler to querier * test: end to end test for get-table from querier * fix: doc links Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-11-02 14:20:26 +00:00
Marco Neumann	2e74727baf	fix: handle recursing limit in querier<>ingester comm (#6020 ) * test: check server exit status on `TestServer` drop * fix: handle recursing limit in querier<>ingester comm Fixes #5974. * docs: improve Co-authored-by: Andrew Lamb <alamb@influxdata.com> * test: simplify Co-authored-by: Andrew Lamb <alamb@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-11-02 09:16:34 +00:00
Marco Neumann	45b3984aa3	refactor: simplify `QueryChunk` data access (#6015 ) * refactor: simplify `QueryChunk` data access We have only two types for chunks (now that the RUB is gone): 1. In-memory RecordBatches 2. Parquet files Loads of logic is duplicated in the different `read_filter` implementations. Also `read_filter` hides a solid amount of logic from DataFusion, which will prevent certain (future) optimizations. To enable #5897 and to simplify the interface, let the chunks return the data (batches or metadata for parquet files) directly and let `iox_query` perform the actual heavy-lifting. * docs: improve Co-authored-by: Andrew Lamb <alamb@influxdata.com> * docs: improve Co-authored-by: Andrew Lamb <alamb@influxdata.com> Co-authored-by: Andrew Lamb <alamb@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-11-02 08:18:33 +00:00
Andrew Lamb	9c1f0a3644	refactor: move SessionConfig creation into datafusion_utils (#6011 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-10-31 20:04:49 +00:00
Marco Neumann	072439e428	refactor: mandatory `QueryChunkMeta::summary` (#5997 ) With #5963 merged, all chunks now provide a summary (even though it may not contain data for all columns). So let's make it mandatory, which also removes a few 🙈-style `.except(...)` calls. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-10-31 16:38:02 +00:00
Carol (Nichols \|\| Goulding)	dad1ad1318	feat: Add the catalog service to ingester, querier, and compactor So that `remote get` that uses the catalog service can work no matter what kind of server you contact.	2022-10-28 10:49:26 -04:00
Carol (Nichols \|\| Goulding)	53445af25d	chore: Alphabetize some dependencies I can't handle not knowing where to look for a dependency or knowing where to add a new dependency.	2022-10-28 10:34:25 -04:00
Marco Neumann	8447d46093	refactor: remove `QueryChunkMeta::timestamp_min_max` (#5963 ) Use the table summary instead. This allows us to have a single mechanism that both IOx and DataFusion understand. This basically lifts the "basic table summary" mechanism that the querier uses to `iox_query` and let the compactor and ingester use the same mechanism. While not strictly necessary, simplifying the `QueryChunk[Meta]` interface helps with #5897. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-10-28 10:29:16 +00:00
Carol (Nichols \|\| Goulding)	3145e2c05b	feat: Use workspace dep inheritance for the arrow crate	2022-10-26 10:34:29 -04:00
Carol (Nichols \|\| Goulding)	44936f661a	feat: Use workspace dep inheritance for datafusion instead of shim crate	2022-10-26 10:33:56 -04:00
Marco Neumann	9b48437711	refactor: make influx column type mandatory (#5978 ) We basically assume everywhere that a column falls into one of the three known categories (time, tag, field), so lets encode this in our type system instead of defining "unknown" as "undefined behavior, may or may not crash". Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-10-26 11:20:29 +00:00
Carol (Nichols \|\| Goulding)	2e83e04eab	feat: Use workspace package metadata to reduce differences and repetition	2022-10-24 13:04:09 -04:00
Marco Neumann	3e4db81bc6	refactor: make `SchemaBuilder::field` fallible It would be nice if the IOx data type would not be optional and this is a prep clean-up to achieve that.	2022-10-24 18:12:42 +02:00
Marco Neumann	e0062f2d40	refactor: do NOT use fake DF context for parquet reading (#5942 ) Use the proper top-level DataFusion context and register the object store there. Note that we still hide the `ParquetExec` behind an opaque record batch stream. Fixing that is next on my list. Helps with #5897. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-10-24 08:20:26 +00:00
Carol (Nichols \|\| Goulding)	712cfc3f38	fix: Use span rather than child_span	2022-10-20 09:14:28 -04:00
Carol (Nichols \|\| Goulding)	59e1c1d5b9	feat: Pass trace id through Flight requests from querier to ingester Fixes #5723.	2022-10-20 08:55:30 -04:00
Marco Neumann	21e8fcad25	feat: rework cache refresh logic (#5886 ) * feat: rework cache refresh logic Instead of issuing a single refresh when a GET request for a cached key comes in, start a background job (using some efficient logic to not overload tokio) per key that refreshes the key using some exponential backoff. The timer is reset when a new GET request comes in. This has the following advantages: - our backoff logic decorrelates the requests - the longer a key was not used, the less often it will be updated All tests (esp. integration tests) are adjusted accordingly, mostly to account for the fact that no extra GET is required to start the refresh timer. Closes #5720. * docs: improve Co-authored-by: Andrew Lamb <alamb@influxdata.com> * refactor: simplify rng overwrite Co-authored-by: Andrew Lamb <alamb@influxdata.com>	2022-10-19 16:01:39 +00:00
Andrew Lamb	cd88e72f88	fix: reduce verbosity from `INFO querier::ingester: Time spent in ...` to `DEBUG` (#5913 ) * fix: reduce verbosity from `INFO querier::ingester: Time spent in ingester` * fix: clippy	2022-10-19 15:09:28 +00:00
Marco Neumann	eb5a661ab3	refactor: prep work for #5897 (#5907 ) * refactor: add ID to `ParquetStorage` * refactor: remove duplicate code * refactor: use dedicated `StorageId`	2022-10-19 11:54:42 +00:00
dependabot[bot]	b5574c07b7	chore(deps): Bump async-trait from 0.1.57 to 0.1.58 (#5904 ) Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.57 to 0.1.58. - [Release notes](https://github.com/dtolnay/async-trait/releases) - [Commits](https://github.com/dtolnay/async-trait/compare/0.1.57...0.1.58) --- updated-dependencies: - dependency-name: async-trait dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-10-19 09:40:26 +00:00
Marco Neumann	e1b50227f8	refactor: avoid some clones while caching ns schema (#5896 ) Found while reviewing the code. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-10-19 06:28:15 +00:00
Andrew Lamb	d706f8221d	chore: Update datafusion and arrow / parquet / arrow-flight 25.0.0 (#5900 ) * chore: Update datafusion and `arrow` / `parquet` / `arrow-flight` 25.0.0 * chore: Update for structure changes * chore: Update for new projection pushdown * chore: Run cargo hakari tasks * fix: fmt Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-10-18 20:58:47 +00:00
Marco Neumann	9310d26b92	refactor: remove querier dual chunk stage (#5890 )	2022-10-18 12:38:30 +00:00
Marco Neumann	d89aae88eb	refactor: remove querier read buffer cache (#5889 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-10-18 10:55:04 +00:00
Marco Neumann	819dbe9e0c	refactor: remove querier chunk load settings (#5888 ) We no longer use dual-state ReadBuffer/Parquet chunks.	2022-10-18 10:22:46 +00:00
Andrew Lamb	6f931411f3	feat: read from parquet and only parquet (#5879 ) * feat: query only from parquet * Revert "feat: query only from parquet" This reverts commit 5ce3c3449c0b9c90154c8c6ece4a40a9c083b7ba. * Revert "revert: disable read buffer usage in querier (#5579) (#5603)" This reverts commit `df5ef875b4`. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-10-18 10:09:48 +00:00
Andrew Lamb	9134ccd6c3	chore: Update datafusion again (#5855 ) * chore: Update datafusion * chore: Updates for changes in datafusion * chore: more updates * fix: update doc example Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-10-13 19:18:57 +00:00
Andrew Lamb	d57c99638c	chore: Update datafusion + `arrow`, `arrow-flight`, and `parquet` to 24.0.0.0 (#5792 ) * chore: Update datafusion + `arrow`, `arrow-flight`, and `parquet` to 24.0.0.0 * fix: Update for coercion, fix explain plans for change in column name display * chore: Update datafusion lock * fix: Update for other API changes * chore: Update to latest datafusion pin * chore: Run cargo hakari tasks Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-10-12 16:19:14 +00:00
Dom Dwyer	c4f542bbe2	refactor(ingester): remove tombstone support This commit removes tombstone support from the ingester, and deletes associated code/helpers/tests. This commit does NOT remove tombstone support from any other service, but MAY include removing overlapping test coverage. This also removes the tombstone support from the Ingester -> Querier RPC response message. This has the nice side effect of removing a whole lot of thread spawning in the ingester tests for the Executor, speeding everything up!	2022-10-11 13:10:04 +02:00
dependabot[bot]	933493fab3	chore(deps): Bump object_store from 0.5.0 to 0.5.1 Bumps [object_store](https://github.com/apache/arrow-rs) from 0.5.0 to 0.5.1. - [Release notes](https://github.com/apache/arrow-rs/releases) - [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG-old.md) - [Commits](https://github.com/apache/arrow-rs/compare/object_store_0.5.0...object_store_0.5.1) --- updated-dependencies: - dependency-name: object_store dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com>	2022-10-11 01:19:10 +00:00
Nga Tran	95ed41f140	feat: Projection pushdown for querier -> ingester for rpc queries (#5782 ) * feat: initial step to identify where the projection should be provided * feat: start getting columns of all expressions * chore: format * test: test for the table_chunk_stream * fix: fix a compile error. Thanks @alamb * test: full tests for table_chunk_stream * chore: cleanup * fix: do not cut any columns in case all fields are needed * test: add one more test case of reading all columns * refactor: move code that identifies columns to push down to a function. Add the use of field_columns * chore: cleanup * refactor: make sream_from_batch support empty batches * chore: cleanup * chore: fix clippy after auto merge Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-10-06 17:21:23 +00:00
Marco Neumann	c4c83e0840	fix: query error propagation (#5801 ) - treat OOM protection as "resource exhausted" - use `DataFusionError` in more places instead of opaque `Box<dyn Error>` - improve conversion from/into `DataFusionError` to preserve more semantics Overall, this improves our error handling. DF can now return errors like "resource exhausted" and gRPC should now automatically generate a sensible status code for it. Fixes #5799.	2022-10-06 08:54:01 +00:00
Dom Dwyer	cd4087e00d	style: add no todo!() or dbg!() lints Some crates had them, some not - let's be consistent and have the compiler spot dbg!() and todo!() macro calls - they should never be in prod code!	2022-09-29 13:10:07 +02:00
Andrew Lamb	66dbb9541f	chore: Update datafusion and `arrow`/`parquet`/`arrow-flight` to 23.0.0, `thrift` to 0.16.0 (#5694 ) * chore: Update datafusion and `arrow`/`parquet`/`arrow-flight` to 23.0.0 * chore: Update thrift / remove parquet_format * fix: Update APIs * chore: Update lock + Run cargo hakari tasks * fix: use patched version of arrow-rs to work around https://github.com/apache/arrow-rs/issues/2779 * chore: Run cargo hakari tasks Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-09-27 12:50:54 +00:00
Nga Tran	84b10b28b2	feat: send only needed projection columns from querier to ingester in… (#5678 ) * feat: send only needed projection columns from querier to ingester in case of normal SQL queries * refactor: push column index down until we need to convert them strings * fix: make the test deterministic * test: test for the projection pushdown * test: add asserts for the proj pushdown test * test: implement projection pushdown for partitions of MockIngesterConnection * chore: cleanup * chore: address review comments * chore: Apply suggestions from code review Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> * refactor: address review comments Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-09-26 17:19:20 +00:00
Carol (Nichols || Goulding)	c8108f01e7	chore: Upgrade to Rust 1.64 (#5727 ) * chore: Upgrade to Rust 1.64 * fix: Use iter find instead of a for loop, thanks clippy * fix: Remove some needless borrows, thanks clippy * fix: Use then_some rather than then with a closure, thanks clippy * fix: Use iter retain rather than filter collect, thanks clippy Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-09-22 18:04:00 +00:00

489 Commits (63b51fdd50f26d08ba1934217d6ffa3d7c2bebe7)