influxdb

Commit Graph

Author	SHA1	Message	Date
Andrew Lamb	f89d542715	refactor: Minor cleanup of retention predicate code (#6211 ) * refactor: Minor cleanup of retention predicate code * fix: use cow	2022-11-22 18:28:54 +00:00
Nga Tran	dd1755b23a	feat: querier filters data outsude retnetion period (#6209 )	2022-11-22 15:41:00 +00:00
Marco Neumann	0c6afd7dbe	refactor: tune circuit breaker config (#6202 ) At the moment it takes way to long to half-open and close circuits ones they were opened. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-11-22 09:03:04 +00:00
dependabot[bot]	04c00bbb62	chore(deps): Bump bytes from 1.2.1 to 1.3.0 (#6199 ) Bumps [bytes](https://github.com/tokio-rs/bytes) from 1.2.1 to 1.3.0. - [Release notes](https://github.com/tokio-rs/bytes/releases) - [Changelog](https://github.com/tokio-rs/bytes/blob/master/CHANGELOG.md) - [Commits](https://github.com/tokio-rs/bytes/commits) --- updated-dependencies: - dependency-name: bytes dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-11-22 08:23:24 +00:00
dependabot[bot]	a9db7581cd	chore(deps): Bump tokio from 1.21.2 to 1.22.0 (#6183 ) Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.21.2 to 1.22.0. - [Release notes](https://github.com/tokio-rs/tokio/releases) - [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.21.2...tokio-1.22.0) --- updated-dependencies: - dependency-name: tokio dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-11-21 10:21:24 +00:00
Nga Tran	49a9565240	feat: gRPC that creates namespace (#6103 ) * feat: create namespace API call in router Co-authored-by: Nga Tran <nga-tran@live.com> * chore: treat retention as ns except in CLI * fix: overflow in nanosecond calc * fix: retention test after changing it from hours to ns * chore: comment clarification in cli; better response type for error in ns API * fix: correct some rebase mistakes * chore: merge namespace create & create_with_retention; renamed ns create test helper fn & const * fix: ns autocreation test was wrong after rebase * fix: mem catalog has default 1hr retention, accidently removed in rebase * chore: remove mem catalogs default 1hr retention; make it settable in sets & router Co-authored-by: Luke Bond <luke.n.bond@gmail.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-11-18 13:02:12 +00:00
Andrew Lamb	67712b595c	Revert "chore: Update datafusion again (#6108 )" (#6159 ) This reverts commit `fbe9f27f10`.	2022-11-16 21:14:55 +00:00
Andrew Lamb	fbe9f27f10	chore: Update datafusion again (#6108 ) * chore: Update datafusion pin + api code * chore: Run cargo hakari tasks * refactor: combine_sort_key is more idomatic and add rationale comments * refactor: satisfy borrow checker and updated comments * fix: Add test case for combine_sort_key * fix: Apply suggestions from code review Co-authored-by: Marco Neumann <marco@crepererum.net> * fix: Add back test for deeply nested expression * fix: Update output ordering Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: Marco Neumann <marco@crepererum.net> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-11-16 14:41:52 +00:00
Marco Neumann	62851afc27	feat: add querier->ingester circuit breaker (#6147 ) * feat: add log ingester memory pressure persist * feat: add querier->ingester circuit breaker Closes #4608. * docs: explain high-level circuit breaker * docs: improve Co-authored-by: Andrew Lamb <alamb@influxdata.com> * test: add additional test assertion * refactor: upgrade info to warning log Co-authored-by: Andrew Lamb <alamb@influxdata.com>	2022-11-16 10:50:33 +00:00
dependabot[bot]	a969754819	chore(deps): Bump chrono from 0.4.22 to 0.4.23 (#6129 ) * chore(deps): Bump chrono from 0.4.22 to 0.4.23 Bumps [chrono](https://github.com/chronotope/chrono) from 0.4.22 to 0.4.23. - [Release notes](https://github.com/chronotope/chrono/releases) - [Changelog](https://github.com/chronotope/chrono/blob/main/CHANGELOG.md) - [Commits](https://github.com/chronotope/chrono/compare/v0.4.22...v0.4.23) --- updated-dependencies: - dependency-name: chrono dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> * refactor: chrono future compat Integer->timstamp conversions should not silently panic. Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Marco Neumann <marco@crepererum.net> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-11-14 13:34:09 +00:00
Carol (Nichols \|\| Goulding)	3dde82b3b9	fix: Rename QueryDatabaseProvider to QueryNamespaceProvider	2022-11-11 16:14:12 -05:00
Carol (Nichols \|\| Goulding)	0657ad9600	fix: Rename QueryDatabase to QueryNamespace	2022-11-11 16:14:12 -05:00
Carol (Nichols \|\| Goulding)	621560a0dc	fix: Rename QueryDatabaseMeta to QueryNamespaceMeta	2022-11-11 16:14:12 -05:00
Carol (Nichols \|\| Goulding)	bdff4e8848	fix: Consistently use 'namespace' instead of 'database' in comments and other internal text	2022-11-11 15:46:04 -05:00
Dom	d9c97795fc	feat: use IDs in ingester query API (#6093 ) * refactor: NS+table ID (instead of name) in querier<>ingester * feat(ingester): use IDs for query API Changes the ingester to utilise the ID fields (instead of names) sent over the query wire message wrapped within the Flight API. BREAKING: this changes the "query-ingester" CLI command arguments which now expects the namespace & table IDs, rather than their names. * refactor(ingester): add more query logging context Updates the log messages during query execution to include more context fields. * style: remove unused import Co-authored-by: Marco Neumann <marco@crepererum.net>	2022-11-09 11:25:13 +00:00
Marco Neumann	903f7bafa7	refactor: expose `ParquetExec` directly to DataFusion phys. plan (#6072 ) * refactor: expose `ParquetExec` directly to DataFusion phys. plan Closes #5897. * fix: update tracing tests * refactor: use `EmptyExec` * refactor: use `target_partitions` * refactor: improve UUID normalization in query tests Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>	2022-11-08 12:19:28 +00:00
Marco Neumann	f511db380c	refactor: remove table name from chunks (#6063 ) It should be always clear from the context to which table a chunk belongs. I think having a table name bound to a chunk goes back to a time where chunks had multiple tables. Helps with #6049. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-11-07 10:42:57 +00:00
Andrew Lamb	4fb2843d05	refactor: Rename `schema::selection::Selection` to `schema::projection::Projection` (#6037 ) * chore: Rename `schema::selection::Selection` to `schema::projection::Projection` * fix: docs Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-11-02 18:15:04 +00:00
Andrew Lamb	3ba0458653	feat: Add object_store handler to querier so `remote get-table` works (#6014 ) * feat: Add object_store handler to querier * test: end to end test for get-table from querier * fix: doc links Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-11-02 14:20:26 +00:00
Marco Neumann	2e74727baf	fix: handle recursing limit in querier<>ingester comm (#6020 ) * test: check server exit status on `TestServer` drop * fix: handle recursing limit in querier<>ingester comm Fixes #5974. * docs: improve Co-authored-by: Andrew Lamb <alamb@influxdata.com> * test: simplify Co-authored-by: Andrew Lamb <alamb@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-11-02 09:16:34 +00:00
Marco Neumann	45b3984aa3	refactor: simplify `QueryChunk` data access (#6015 ) * refactor: simplify `QueryChunk` data access We have only two types for chunks (now that the RUB is gone): 1. In-memory RecordBatches 2. Parquet files Loads of logic is duplicated in the different `read_filter` implementations. Also `read_filter` hides a solid amount of logic from DataFusion, which will prevent certain (future) optimizations. To enable #5897 and to simplify the interface, let the chunks return the data (batches or metadata for parquet files) directly and let `iox_query` perform the actual heavy-lifting. * docs: improve Co-authored-by: Andrew Lamb <alamb@influxdata.com> * docs: improve Co-authored-by: Andrew Lamb <alamb@influxdata.com> Co-authored-by: Andrew Lamb <alamb@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-11-02 08:18:33 +00:00
Andrew Lamb	9c1f0a3644	refactor: move SessionConfig creation into datafusion_utils (#6011 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-10-31 20:04:49 +00:00
Marco Neumann	072439e428	refactor: mandatory `QueryChunkMeta::summary` (#5997 ) With #5963 merged, all chunks now provide a summary (even though it may not contain data for all columns). So let's make it mandatory, which also removes a few 🙈-style `.except(...)` calls. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-10-31 16:38:02 +00:00
Carol (Nichols \|\| Goulding)	dad1ad1318	feat: Add the catalog service to ingester, querier, and compactor So that `remote get` that uses the catalog service can work no matter what kind of server you contact.	2022-10-28 10:49:26 -04:00
Carol (Nichols \|\| Goulding)	53445af25d	chore: Alphabetize some dependencies I can't handle not knowing where to look for a dependency or knowing where to add a new dependency.	2022-10-28 10:34:25 -04:00
Marco Neumann	8447d46093	refactor: remove `QueryChunkMeta::timestamp_min_max` (#5963 ) Use the table summary instead. This allows us to have a single mechanism that both IOx and DataFusion understand. This basically lifts the "basic table summary" mechanism that the querier uses to `iox_query` and let the compactor and ingester use the same mechanism. While not strictly necessary, simplifying the `QueryChunk[Meta]` interface helps with #5897. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-10-28 10:29:16 +00:00
Carol (Nichols \|\| Goulding)	3145e2c05b	feat: Use workspace dep inheritance for the arrow crate	2022-10-26 10:34:29 -04:00
Carol (Nichols \|\| Goulding)	44936f661a	feat: Use workspace dep inheritance for datafusion instead of shim crate	2022-10-26 10:33:56 -04:00
Marco Neumann	9b48437711	refactor: make influx column type mandatory (#5978 ) We basically assume everywhere that a column falls into one of the three known categories (time, tag, field), so lets encode this in our type system instead of defining "unknown" as "undefined behavior, may or may not crash". Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-10-26 11:20:29 +00:00
Carol (Nichols \|\| Goulding)	2e83e04eab	feat: Use workspace package metadata to reduce differences and repetition	2022-10-24 13:04:09 -04:00
Marco Neumann	3e4db81bc6	refactor: make `SchemaBuilder::field` fallible It would be nice if the IOx data type would not be optional and this is a prep clean-up to achieve that.	2022-10-24 18:12:42 +02:00
Marco Neumann	e0062f2d40	refactor: do NOT use fake DF context for parquet reading (#5942 ) Use the proper top-level DataFusion context and register the object store there. Note that we still hide the `ParquetExec` behind an opaque record batch stream. Fixing that is next on my list. Helps with #5897. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-10-24 08:20:26 +00:00
Carol (Nichols \|\| Goulding)	712cfc3f38	fix: Use span rather than child_span	2022-10-20 09:14:28 -04:00
Carol (Nichols \|\| Goulding)	59e1c1d5b9	feat: Pass trace id through Flight requests from querier to ingester Fixes #5723.	2022-10-20 08:55:30 -04:00
Marco Neumann	21e8fcad25	feat: rework cache refresh logic (#5886 ) * feat: rework cache refresh logic Instead of issuing a single refresh when a GET request for a cached key comes in, start a background job (using some efficient logic to not overload tokio) per key that refreshes the key using some exponential backoff. The timer is reset a new GET request comes in. This has the following advantages: - our backoff logic decorrelates the requests - the longer a key was not used, the less often it will be updated All test (esp. integration tests) as adjusted accordingly, mostly to account for the fact that no extra GET is required to start the refresh timer. Closes #5720. * docs: improve Co-authored-by: Andrew Lamb <alamb@influxdata.com> * refactor: simplify rng overwrite Co-authored-by: Andrew Lamb <alamb@influxdata.com>	2022-10-19 16:01:39 +00:00
Andrew Lamb	cd88e72f88	fix: reduce verbosity from `INFO querier::ingester: Time spent in ...` to `DEBUG` (#5913 ) * fix: reduce verbosity from `INFO querier::ingester: Time spent in ingester` * fix: clippy	2022-10-19 15:09:28 +00:00
Marco Neumann	eb5a661ab3	refactor: prep work for #5897 (#5907 ) * refactor: add ID to `ParquetStorage` * refactor: remove duplicate code * refactor: use dedicated `StorageId`	2022-10-19 11:54:42 +00:00
dependabot[bot]	b5574c07b7	chore(deps): Bump async-trait from 0.1.57 to 0.1.58 (#5904 ) Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.57 to 0.1.58. - [Release notes](https://github.com/dtolnay/async-trait/releases) - [Commits](https://github.com/dtolnay/async-trait/compare/0.1.57...0.1.58) --- updated-dependencies: - dependency-name: async-trait dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-10-19 09:40:26 +00:00
Marco Neumann	e1b50227f8	refactor: avoid some clones while caching ns schema (#5896 ) Found while reviewing the code. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-10-19 06:28:15 +00:00
Andrew Lamb	d706f8221d	chore: Update datafusion and arrow / parquet / arrow-flight 25.0.0 (#5900 ) * chore: Update datafusion and `arrow` / `parquet` / `arrow-flight` 25.0.0 * chore: Update for structure changes * chore: Update for new projection pushdown * chore: Run cargo hakari tasks * fix: fmt Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-10-18 20:58:47 +00:00
Marco Neumann	9310d26b92	refactor: remove querier dual chunk stage (#5890 )	2022-10-18 12:38:30 +00:00
Marco Neumann	d89aae88eb	refactor: remove querier read buffer cache (#5889 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-10-18 10:55:04 +00:00
Marco Neumann	819dbe9e0c	refactor: remove querier chunk load settings (#5888 ) We no longer use dual-state ReadBuffer/Parquet chunks.	2022-10-18 10:22:46 +00:00
Andrew Lamb	6f931411f3	feat: read from parquet and only parquet (#5879 ) * feat: query only from parquet * Revert "feat: query only from parquet" This reverts commit 5ce3c3449c0b9c90154c8c6ece4a40a9c083b7ba. * Revert "revert: disable read buffer usage in querier (#5579) (#5603)" This reverts commit `df5ef875b4`. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-10-18 10:09:48 +00:00
Andrew Lamb	9134ccd6c3	chore: Update datafusion again (#5855 ) * chore: Update datafusion * chore: Updates for changes in datafusion * chore: more updates * fix: update doc example Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-10-13 19:18:57 +00:00
Andrew Lamb	d57c99638c	chore: Update datafusion + `arrow`, `arrow-flight`, and `parquet` to 24.0.0.0 (#5792 ) * chore: Update datafusion + `arrow`, `arrow-flight`, and `parquet` to 24.0.0.0 * fix: Update for coercion, fix explain plans for change in column name display * chore: Update datafusion lock * fix: Update for other API changes * chore: Update to latest datafusion pin * chore: Run cargo hakari tasks Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-10-12 16:19:14 +00:00
Dom Dwyer	c4f542bbe2	refactor(ingester): remove tombstone support This commit removes tombstone support from the ingester, and deletes associated code/helpers/tests. This commit does NOT remove tombstone support from any other service, but MAY include removing overlapping test coverage. This also removes the tombstone support from the Ingester -> Querier RPC response message. This has the nice side effect of removing a whole lot of thread spawning in the ingester tests for the Executor, speeding everything up!	2022-10-11 13:10:04 +02:00
dependabot[bot]	933493fab3	chore(deps): Bump object_store from 0.5.0 to 0.5.1 Bumps [object_store](https://github.com/apache/arrow-rs) from 0.5.0 to 0.5.1. - [Release notes](https://github.com/apache/arrow-rs/releases) - [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG-old.md) - [Commits](https://github.com/apache/arrow-rs/compare/object_store_0.5.0...object_store_0.5.1) --- updated-dependencies: - dependency-name: object_store dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com>	2022-10-11 01:19:10 +00:00
Nga Tran	95ed41f140	feat: Projection pushdown for querier -> ingester for rpc queries (#5782 ) * feat: initial step to identify where the projection should be provided * feat: start getting columns of all expressions * chore: format * test: test for the table_chunk_stream * fix: fix a compile error. Thanks @alamb * test: full tests for table_chunk_stream * chore: cleanup * fix: do not cut any columns in case all fields are needed * test: add one more test case of reading all columns * refactor: move code that identify columbs ot push down to a function. Add the use of field_columns * chore: cleanup * refactor: make sream_from_batch support empty batches * chore: cleanup * chore: fix clippy after auto merge Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-10-06 17:21:23 +00:00
Marco Neumann	c4c83e0840	fix: query error propagation (#5801 ) - treat OOM protection as "resource exhausted" - use `DataFusionError` in more places instead of opaque `Box<dyn Error>` - improve conversion from/into `DataFusionError` to preserve more semantics Overall, this improves our error handling. DF can now return errors like "resource exhausted" and gRPC should now automatically generate a sensible status code for it. Fixes #5799.	2022-10-06 08:54:01 +00:00
Dom Dwyer	cd4087e00d	style: add no todo!() or dbg!() lints Some crates had theme, some not - lets be consistent and have the compiler spot dbg!() and todo!() macro calls - they should never be in prod code!	2022-09-29 13:10:07 +02:00
Andrew Lamb	66dbb9541f	chore: Update datafusion and `arrow`/`parquet`/`arrow-flight` to 23.0.0, `thrift` to 0.16.0 (#5694 ) * chore: Update datafusion and `arrow`/`parquet`/`arrow-flight` to 23.0.0 * chore: Update thrift / remove parquet_format * fix: Update APIs * chore: Update lock + Run cargo hakari tasks * fix: use patched version of arrow-rs to work around https://github.com/apache/arrow-rs/issues/2779 * chore: Run cargo hakari tasks Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-09-27 12:50:54 +00:00
Nga Tran	84b10b28b2	feat: send only needed projection columns from querier to ingester in… (#5678 ) * feat: send only needed projection columns from querier to ingester in case of normal SQL queries * refactor: push column index down until we need to convert them strings * fix: make the test deterministic * test: test for the projection pushdown * test: add asserts for the proj pushdown test * test: implement projection pushdown for partitions of MockIngesterConnection * chore: cleanup * chore: address review comments * chore: Apply suggestions from code review Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> * refactor: address review comments Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-09-26 17:19:20 +00:00
Carol (Nichols \|\| Goulding)	c8108f01e7	chore: Upgrade to Rust 1.64 (#5727 ) * chore: Upgrade to Rust 1.64 * fix: Use iter find instead of a for loop, thanks clippy * fix: Remove some needless borrows, thanks clippy * fix: Use then_some rather than then with a closure, thanks clippy * fix: Use iter retain rather than filter collect, thanks clippy Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-09-22 18:04:00 +00:00
Marco Neumann	365a246f8d	refactor: do not run de-dup in ingester for querier requests (#5626 ) * refactor: do not run de-dup in ingester for querier requests This removes the entire de-dup logic from the inegster for querier requests. Furthermore, it even removes the entire datafusion execution from the querier and just dumps the in-memory record batches as quickly as possible. No filters are applied. Note that even prior to this PR, we've never applied projections (tracked by #5624). Pros: - speed up query planning within the querier (since we need the ingester response for state reconciling) - lowered ingester CPU load Cons: - more querier<>ingester network traffic Closes #5602. * test: extend query test case * fix: ingester tests Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-09-22 07:33:54 +00:00
Marco Neumann	fd45fbc9ab	refactor: use cheaper hash keys for projected schemas (#5713 ) * refactor: arc the cached table * refactor: use cheaper hash keys for projected schemas Instead of using the column names to address projected schemas, let's use the column IDs. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-09-22 05:31:02 +00:00
Marco Neumann	c66f16e4af	fix: ingester retries (#5708 ) * fix: retry ingester requests faster The retries introduced in #5695 are too slow and block the entire querier for minutes (until the very long gRPC timeout kicks in). * fix: add error details on why the query planning failed Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-09-21 09:27:47 +00:00
Marco Neumann	5e7fd55a42	refactor: retry querier->ingester requests (#5695 ) * refactor: retry querier->ingester requests Esp. for InfluxRPC requests that scan multiple tables, it may be that one ingester requests fails. We shall retry that request instead of failing the entire query. * refactor: improve docs Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> * fix: less foo * docs: remove outdated TODO * test: assert that panic happened Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>	2022-09-20 15:51:02 +00:00
Marco Neumann	fced536ebd	refactor: improve consistent access under "remove if" (#5693 ) * refactor: improve consistent access under "remove if" With all the concurrency introduced in #5668, we should be a bit more careful with our "remove if" handling, esp. if a removal is triggered while a load is running concurrently. This change introduces as `remove_if_and_get` helper that ensures this and the querier over to use it. The parquet file and tombstone caches required a bit of a larger change because there the invalidation and the actual GET were kinda separate. We had this separation for the other caches as well at some point and decided that this easily leads to API misuse, so I took this opportunity to "fix" the parquet file and tombstone cache as well. * docs: improve	2022-09-20 14:03:11 +00:00
Marco Neumann	513fdf1e26	feat: split "pruned" metric into "early" and "late" (#5645 ) * feat: split "pruned" metric into "early" and "late" * docs: improve Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> * docs: explain `PruningMetrics` * test: try to test pruning Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>	2022-09-15 13:42:00 +00:00
Marco Neumann	f7b6f81fe1	feat: concurrent chunk creation (#5646 ) Create chunks in querier concurrently after we've pre-filtered them. Chunk creation still may require a bit of cached information (e.g. the partition sort key) and we can easily fetch these concurrently instead of in order. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-09-15 12:30:02 +00:00
Nga Tran	7c4c918636	chore: add parttion id into panic message (#5641 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-09-15 02:21:13 +00:00
Marco Neumann	2332e5de10	refactor: slightly increase querier namespace cache TTLs (#5635 ) This should lower catalog load and eliminate a few costly cache misses. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-09-14 13:54:51 +00:00
Andrew Lamb	f86d3e31da	chore: Update datafusion + object_store (#5619 ) * chore: Update datafusion pin * chore: update object_store to 0.5.0 * chore: Run cargo hakari tasks Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-09-13 12:34:54 +00:00
Andrew Lamb	1fd31ee3bf	chore: Update datafusion / `arrow` / `arrow-flight` / `parquet` to version 22.0.0 (#5591 ) * chore: Update datafusion / `arrow` / `arrow-flight` / `parquet` to version 22.0.0 * fix: enable dynamic comparison flag * chore: derive Eq for clippy * chore: update explain plans * chore: Update sizes for ReadBuffer encoding * chore: update more tests Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-09-12 17:45:03 +00:00
Marco Neumann	8933f47ec1	refactor: make `QueryChunk::partition_id` non-optional (#5614 ) In our data model, a chunk always belongs to a partition[^1], so let's not make this attribute optional. The optional value only leads to -- mostly surprising -- conditional behavior, ranging from "do not equalize the partition sort key" (querier) to "always consider the chunk overlapping" (iox_query when dealing with ingester chunks). [^1]: This is even true when the chunk belongs to a parquet file that is not yet added to the catalog, contrary to what a comment in the ingester stated. The catalog and data model used by the querier are two totally different things.	2022-09-12 13:52:51 +00:00
Marco Neumann	df5ef875b4	revert: disable read buffer usage in querier (#5579 ) (#5603 ) This results in a 2x-3x slow down. It's not horrible, but also not good.	2022-09-09 11:26:09 +00:00
dependabot[bot]	786ce75e26	chore(deps): Bump tokio-util from 0.7.3 to 0.7.4 (#5596 ) Bumps [tokio-util](https://github.com/tokio-rs/tokio) from 0.7.3 to 0.7.4. - [Release notes](https://github.com/tokio-rs/tokio/releases) - [Commits](https://github.com/tokio-rs/tokio/compare/tokio-util-0.7.3...tokio-util-0.7.4) --- updated-dependencies: - dependency-name: tokio-util dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-09-09 07:40:16 +00:00
Marco Neumann	c3b47dfe59	refactor: disable read buffer usage in querier (#5579 ) * refactor: read querier parquet files from cache * refactor: only use parquet files in querier (no RB) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-09-08 13:18:22 +00:00
YIXIAO SHI	52ae60bf2e	chore: fix comment typo (#5551 ) Co-authored-by: Dom <dom@itsallbroken.com>	2022-09-07 08:49:29 +00:00
Luke Bond	a280acb860	Merge branch 'main' into alamb/guilio-python-main	2022-09-06 16:57:00 +01:00
Marco Neumann	adeacf416c	ci: fix (#5569 ) * ci: use same feature set in `build_dev` and `build_release` * ci: also enable unstable tokio for `build_dev` * chore: update tokio to 1.21 (to fix console-subscriber 0.1.8 * fix: "must use"	2022-09-06 14:13:28 +00:00
Marco Neumann	87772a6aec	refactor: debug log improvements (#5553 ) * feat: extend log output for ingester responses * feat: add debug log for parquet `read_filter` calls * feat: add debug log to `get_write_info` * feat: add debug log parquet cache invalidation	2022-09-05 13:54:13 +00:00
Marco Neumann	064f0e9b29	refactor: use DataFusion to read parquet files (#5531 ) Remove our own hand-rolled logic and let DataFusion read the parquet files. As a bonus, this now supports predicate pushdown to the deserialization step, so we can use parquets as in in-mem buffer. Note that this currently uses some "nested" DataFusion hack due to the way the `QueryChunk` interface works. Midterm I'll change the interface so that the `ParquetExec` nodes are directly visible to DataFusion instead of some opaque `SendableRecordBatchStream`.	2022-09-05 09:25:04 +00:00
Marco Neumann	f45cbfb88d	refactor: fine-grained file size mocking (#5541 ) * refactor: do not override parquet file size in querier This is going to be an issue when we actually rely on the size for reading, see #5531. * refactor: use selected file size mocking in compactor Do not blindly override parquet file sizes for all subsystems. This is going to be an issue when we actually rely on the size for reading, see #5531. * refactor: remove ability to override file sizes in catalog Blindly overriding data for all subsystems is dangerous, because some parts of our stack actually rely on the actual file size. See #5531. * docs: explain `size_overrides`	2022-09-05 08:50:04 +00:00
Andrew Lamb	1e1d964fdb	fix: Some other stragglers	2022-09-04 07:59:07 -04:00
Marco Neumann	0a0b3bd95b	feat: querier object store cache (#5527 ) * feat: querier object store cache * docs: improve Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com> Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>	2022-09-02 09:48:53 +00:00
Marco Neumann	5e187ae1c0	refactor: use concrete type in `MetricsLoader` (#5525 ) The API user may still use a `Box<dyn ...>` if they want, but they technically don't have to. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-09-01 12:22:12 +00:00
Marco Neumann	c59dd01742	refactor: use concrete inner type in `CacheWithMetrics` (#5522 ) The API user still CAN use dynamic dispatch but doesn't have to. This also simplifies the generics a bit. This is similar to #5520. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-09-01 06:05:59 +00:00
Marco Neumann	c0dda14cef	refactor: use concrete backend type in `CacheDriver` (#5520 ) This removes some `Box<dyn ...>` indirection when the user doesn't want it (you still can, but don't have to) and makes the whole type handling easier to understand. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-08-31 14:58:25 +00:00
Andrew Lamb	6669d85fb4	chore: Update datafusion + arrow/parquet to `21.0.0` (#5519 ) * chore: Update arrow/arrow-flight/parquet to 21.0.0 * chore: Update datafusion pin * chore: Fix arrow update script * chore: Update Cargo.lock * chore: Update for new API	2022-08-31 13:30:47 +00:00
Marco Neumann	fecbbd9fa1	refactor: improve namespace caching in querier (#5492 ) 1. Cache converted schema instead of catalog schema. This safes a buch of memcopies during conversion. 2. Simplify creation of new chunks, we now only need a `CachedTable` instead of a namespace and a table schema. In an artificial benchmark, this removed around 10ms from the query (although that was prior to #5467 which moved schema conversion one level up). Still I think it is the cleaner cache design. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-08-30 11:42:21 +00:00
Marco Neumann	430536f05f	refactor: use a single timestamp in policy backend (#5508 ) * refactor: use a single timestamp in policy backend Prior to this PR we had at least 1 `TimeProvider::now` calls per GET request (for caches that only used LRU) and up to 3 calls (caches with LRU + refresh + TTL). Let's instead use a single timestamp that is created by the policy backend itself (instead of the policies). This has the following consequences: - efficiency: `SystemProvider::now` is not free, even though under Linux this doesn't result in a syscall, it uses the stdlib time system which also checks for monotonicity - consistency: All changes for a single trigger (e.g. a GET cache call) now use a single timestamp instead of slightly increasing ones. I argue this is the better semantic, simpler to understand and better to debug. For some (slightly artificial) local performance experiment, this shaves off around 2ms per single-table SQL query. However I expect that there might be more degenerated cases (e.g. multi-table SQL queries or some InfluxRPC requests that hit multiple tables). The majority of this patch is moving the `TimeProvider` from the policies into the policy backend. * docs: explain `now` parameter	2022-08-30 11:23:25 +00:00
Carol (Nichols \|\| Goulding)	1b49ad25f7	refactor: Rename KafkaTopicId to TopicId	2022-08-29 14:27:02 -04:00
Carol (Nichols \|\| Goulding)	58f0b63cdc	refactor: Rename KafkaTopic to Topic or TopicMetadata or topic name as appropriate	2022-08-29 14:27:02 -04:00
Carol (Nichols \|\| Goulding)	cb52683a1a	fix: Redo uses after rebase	2022-08-29 14:08:33 -04:00
Carol (Nichols \|\| Goulding)	74c9529062	fix: Rename KafkaPartition to ShardIndex	2022-08-29 14:07:18 -04:00
Carol (Nichols \|\| Goulding)	6443858870	fix: Rename compactor option from sequencer to shard	2022-08-29 14:06:45 -04:00
Carol (Nichols \|\| Goulding)	95b7529079	fix: Rename more test values to shard	2022-08-29 14:06:45 -04:00
Carol (Nichols \|\| Goulding)	fe9c474620	fix: rustfmt	2022-08-29 14:06:45 -04:00
Carol (Nichols \|\| Goulding)	952a3ea498	fix: Return querier sharding to use sequencer ID	2022-08-29 14:06:44 -04:00
Carol (Nichols \|\| Goulding)	698f1a47ff	refactor: Rename test structures from sequencer to shard where appropriate	2022-08-29 14:06:44 -04:00
Jake Goulding	4abf21c724	refactor: Rename Sequencer (and its entourage) to Shard	2022-08-29 14:06:43 -04:00
Sam Arnold	05657ea068	fix: optimizations for metadata fetch and chunk pruning (#5467 ) * fix: hoist repeated computation out of chunk creation We have hundreds of chunks per table, so it is beneficial to only do common work once. * chore: remove TableCache as it is no longer used * fix: prune chunks both before and after metadata fetch Fetching the metadata for all the chunks in a table is expensive, especially when we have a narrow time range query that only needs a few chunks. * chore: fix clippy * fix: fix up some last tests * fix: review comments Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-08-29 14:59:05 +00:00
Marco Neumann	3a4a17a48e	feat: refresh namespace cache before expiration (#5449 ) Closes #5318.	2022-08-29 11:52:18 +00:00
Dom Dwyer	abf26767c1	refactor: infallible JumpHash initialisation This doesn't really need to be fallible but forces propagation of a ton of error handling - no shards is always a sign of something being very wrong, and can be caught in the caller if it's for some reason an acceptable state / can be recovered from.	2022-08-24 13:18:57 +02:00
Marco Neumann	f34f99c5ed	refactor: port LRU cache backend to policy framework (#5406 ) * refactor: port LRU cache backend to policy framework Closes #5320. * test: extend `test_oversized_entries` Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-08-17 14:43:24 +00:00
Andrew Lamb	7f0ae53d6f	chore: Update to (almost) released object_store 0.4.0 (#5419 ) * chore: update object_store * chore: update hakari config * chore: Run cargo hakari tasks Co-authored-by: CircleCI[bot] <circleci@influxdata.com>	2022-08-17 13:44:48 +00:00
Marco Neumann	49ab568ca8	refactor: convert `remove_if` feature to policy framework (#5398 ) * refactor: allow `ChangeRequest` to carry a lifetime Let's not restrict our change functions to `'static` because this would require us to clone loads of data to achieve predicate-based `remove_if`. * refactor: convert `remove_if` feature to policy framework Decided to drop the "shared" functionality. We only use the small `remove_if` bit which is way easier to reason about. For #5320. * refactor: address review comments Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-08-16 08:23:27 +00:00
Marco Neumann	0ccefa0d0c	refactor: port TTL backend to policy framework (#5396 ) * refactor: port TTL backend to policy framework Note that this is "just" a port, it does NOT change how TTL works. This will be done in #5318. Helps with #5320. * fix: ensure inner backend is empty * test: add some smoke test	2022-08-15 16:48:16 +00:00
Carol (Nichols \|\| Goulding)	b982bdaf2f	fix: Derive Eq when we derive PartialEq and members can derive Eq Allow this in generated code that we don't control, though. Recommended by clippy now. https://rust-lang.github.io/rust-clippy/master/index.html#derive_partial_eq_without_eq	2022-08-11 15:04:06 -04:00
Andrew Lamb	b834bc630c	chore: more readability improvements to sort keys (#5366 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-08-10 17:59:25 +00:00
Andrew Lamb	16ddc5efc6	chore: Update datafusion / arrow/parquet/arrow-flight and prost/tonic ecosystem (#5360 ) * chore: Update datafusion and arrow * chore: Update Cargo.lock * chore: update to Decimal128 * chore: Update tonic/prost/pbjson/etc * chore: Run cargo hakari tasks * fix: doctest in generated types Co-authored-by: CircleCI[bot] <circleci@influxdata.com>	2022-08-09 17:30:44 +00:00
Andrew Lamb	172f893368	fix: fix logging typo in querier (#5345 ) * fix: fix logging typo * fix: fix type in typo fix ;(	2022-08-09 06:34:06 +00:00
Marco Neumann	cd0dc42b4a	refactor: use a single chunk filter/pruning step in querier (#5338 ) We already prune all chunks in the query-access layer. There's no need to do that another time (which is actually the first time) in `QuerierTable::chunks`. The time savings we get from feeding less chunks into the state reconciling should be negligible. On the pro-side however we get a more streamlined data flow and actually correct chunk pruning metrics. Also see #5336. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-08-08 12:55:14 +00:00
Marco Neumann	fc1870ff76	fix: chunk pruning stats (#5319 ) - emit a warning if we cannot even attempt to prune chunks due to an error. This is always either a missing feature or a bug (even though it does not impact correctness but _only_ performance). Also see https://github.com/influxdata/conductor/issues/1107 - change metrics to clearly differentiate between "could not prune" and "not pruned" - add new "not pruned" observer hook (this was missing for some reason, the "pruned" hook existed though)	2022-08-05 10:50:31 +00:00
Marco Neumann	0d714878ca	feat: chunk pruning metrics (#5273 ) * refactor: make could-not-prune reason a static string * refactor: introduce `QuerierTableArgs` * feat: chunk pruning metrics Closes #4974. * refactor: address review comments * refactor: use static typing for not-pruned reason * refactor: pass chunk to not-pruned observer and use it for some metrics Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-08-04 15:29:21 +00:00
Nga Tran	34ccc9c7f5	chore: Revert "chore: Revert "refactor: bump batch size (#5251 )" (#5288 )" (#5300 ) This reverts commit `471b8be92f`. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-08-04 13:19:46 +00:00
Marco Neumann	840e4801b8	feat: make querier RAM pool split a proper feature (#5283 ) * feat: make querier RAM pool split a proper feature - use propre pool names - expose sizing via CLI/env Closes https://github.com/influxdata/conductor/issues/1102. * refactor: improve naming and docs Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-08-03 15:27:23 +00:00
Marco Neumann	663a20d743	refactor: remove `--ingster-address` (#5255 ) Closes #5002.	2022-08-03 15:05:01 +00:00
Nga Tran	471b8be92f	chore: Revert "refactor: bump batch size (#5251 )" (#5288 ) This reverts commit `bb172f8fa8`.	2022-08-03 14:23:45 +00:00
Marco Neumann	8e2443d879	feat: use two RAM pools in querier (#5271 ) Quick&Dirty implementation of a RAM-pool split to see if this has any effect. I expect the querier performance to improve due to this because large read buffers can no longer evict precious metadata. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-08-02 15:14:26 +00:00
Marco Neumann	ee491cbbfc	fix: re-enable querier read buffer cache (#5268 ) This reverts commit `82913743f1` / #5252. I misjudged the cache hit ratio for the RB, see https://github.com/influxdata/k8s-infra/pull/4548 So let's bring back the RB cache until we have some form of parquet cache in place.	2022-08-02 08:37:30 +00:00
Marco Neumann	a8f6d579c8	feat: add metric for predicate-based cache entry removal (#5257 )	2022-08-02 07:44:53 +00:00
Marco Neumann	fec6b18d80	feat: add metric for TTL cache expiration (#5256 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-08-02 07:00:30 +00:00
Marco Neumann	82913743f1	refactor: disable querier read buffer cache (#5252 ) Let's try and see how this performs in prod. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-08-01 15:43:22 +00:00
Marco Neumann	bb172f8fa8	refactor: bump batch size (#5251 ) This is what DataFusion uses by default and I don't see a reason why we should use such small batch sizes. The affect is probably only visible in certain filter-aggregate queries that don't focus on a single series (because there we likely end up with 1 or 2 batches only, esp. after #5250) for coarse-grained filters, esp. when the filter key is not the first sort key. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-08-01 13:49:58 +00:00
dependabot[bot]	fbd39844d8	chore(deps): Bump async-trait from 0.1.56 to 0.1.57 (#5247 ) Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.56 to 0.1.57. - [Release notes](https://github.com/dtolnay/async-trait/releases) - [Commits](https://github.com/dtolnay/async-trait/compare/0.1.56...0.1.57) --- updated-dependencies: - dependency-name: async-trait dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-08-01 08:30:33 +00:00
Andrew Lamb	9215a534d0	chore: Update datafusion and `arrow`/`parquet`/`arrow-flight` to `19.0.0` (#5229 ) * chore: Update datafusion and `arrow`/`parquet`/`arrow-flight` to `19.0.0` * chore: Run cargo hakari tasks * fix: Update for API changes * fix: clippy Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-07-28 08:10:47 +00:00
Marco Neumann	9a9a1a4777	feat: limit per-table chunk data for every query (#5223 ) * feat: `QueryChunk::as_any` * feat: allo `ChunkPruner::prune_chunks` to fail * feat: limit per-table chunk data for every query Closes #5211. * fix: address review comments Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>	2022-07-27 13:20:05 +00:00
Marco Neumann	85c186f5b8	feat: cache projected chunk schemas in querier (#5213 ) * feat: cache projected chunk schemas in querier Ref #5202. * refactor: simplify size calculations Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-07-27 08:23:20 +00:00
Andrew Lamb	495bbe48f2	refactor: Reduce boiler plate calling `SpanRecorder::child` (#5180 ) * refactor: call SpanRecorder::child * refactor: update more locations Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-07-22 11:11:45 +00:00
Marco Neumann	0f54281d24	feat: trace namespace cache For #5129.	2022-07-21 16:10:06 +02:00
Marco Neumann	9031ed390b	feat: trace parquet_file cache For #5129.	2022-07-21 16:10:06 +02:00
Marco Neumann	4c5227292f	feat: trace partition cache For #5129.	2022-07-21 16:10:06 +02:00
Marco Neumann	ff88702749	feat: wire up cache tracing (1/2) (#5170 ) * feat: trace tombstone cache For #5129. * feat: trace table cache For #5129. * feat: trace read buffer cache For #5129. * feat: trace processed_tombstones cache For #5129. * refactor: improve span name Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-07-21 13:59:55 +00:00
Nga Tran	69cb3f2b19	refactor: remove min_sequence_number from Compactor and Querier, add `count_by_overlaps_with_level_0` and `count_by_overlaps_with_level_1` to catalog (#5151 ) * refactor: remove min_sequnce_number * fix: typos * fix: remove min_sequencer_number from new files from merging main * fix: add back throwing error if the compactor compacts files persisted by the ingester after the ingester sends max seq_num back to querier * test: add test_compactor_collision back but modify the input to make it work woth new changes Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-07-21 13:51:54 +00:00
Marco Neumann	b35502ce61	feat: cache tracing (#5164 ) * feat: cache tracing Add tracing to the metrics cache wrapper. The extra arguments for GET and PEEK make this quite simple, because the wrapper can just extend the inner args with the trace information. We currently terminate the span in `querier::cache` (i.e. only pass in `None`, so no tracing will occur) to keep this PR rather small. This will be changed in subsequent PRs. For #5129. * fix: typo Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-07-21 11:54:22 +00:00
Marco Neumann	0561423475	refactor: enforce proper `IOxSessionContext` (#5158 ) - remove `IOxSessionContext::default()` because untracked contexts should only be created by tests - remove `Option<IOxSessionContext>` because it is a typed workaround for `IOxSessionContext::default` Tests should use `IOxSessionContext::testing` and all _normal_ users should create proper contexts. I suspect this will help tracing or at least prevent silent regressions. See #5129. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-07-20 16:25:43 +00:00
Marco Neumann	3b8f98c7b8	feat: allow passing for extra arguments to `Cache::peek` (#5161 ) This will be used to pass spans down to `CacheWithMetrics` (or a new wrapper specific to tracing) and will help with #5129. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-07-20 13:51:21 +00:00
Marco Neumann	8b9119a0c6	feat: trace querier->ingester, stopping at gRPC layer (#5159 ) This adds tracing of querire->ingester request up to the point where we perform the network request, i.e. the trace will only appear on the querier side. We may extend this at some point to carry the tracing information to the ingester as well. Ref #5129. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-07-20 11:48:52 +00:00
Marco Neumann	b8d9799a26	feat: wire span all the way to `QuerierTable::chunks` (#5134 ) * feat: pass context to `QueryDatabase::chunks` * feat: wire span all the way to `QuerierTable::chunks` This is required for #5129.	2022-07-19 14:12:55 +00:00
Andrew Lamb	e2d871b00b	chore: Update datafusion and arrow/parquet/arrow-flight to `18.0.0` (#5079 ) * chore: Update datafusion to 10.0.0, arrow/parquet/arrow-flight to 18 * chore: Run cargo hakari tasks * fix: update cargo pin Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-07-18 15:01:03 +00:00
Marco Neumann	f0bd278652	feat: add tracing to instrumented semaphores (#5130 ) This will allow us to easily see how much time we spend during query processing waiting for the query semaphore. Ref #5129.	2022-07-15 07:50:28 +00:00
dependabot[bot]	9b67de2f43	chore(deps): Bump tokio from 1.19.2 to 1.20.0 Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.19.2 to 1.20.0. - [Release notes](https://github.com/tokio-rs/tokio/releases) - [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.19.2...tokio-1.20.0) --- updated-dependencies: - dependency-name: tokio dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com>	2022-07-14 01:21:43 +00:00
Carol (Nichols \|\| Goulding)	61c023139b	refactor: Switch compaction levels to an enum with values rather than separate consts Bonuses: - Type checking - Validation - Less casting - Exhaustiveness checking - Less use of the numerical value	2022-07-13 11:30:36 -04:00
Marco Neumann	89c24dfec0	fix: do not force-load chunks into read buffer (#5112 ) I forgot to address a TODO in #5091. Extends to test to actually check the chunk stage and removes the function for manual force-loads. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-07-13 14:46:24 +00:00
Marco Neumann	b1b2cb5d4a	feat: load read buffer on demand (#5091 ) * refactor: extract `select_schema` * refactor: improve `InternalLostInputField` error message * test: improve SQL runner output * feat: load read buffer on demand Closes #5032. * refactor: move `[Half]OwnedSelection` to `schema` crate`	2022-07-13 08:51:40 +00:00
Nga Tran	bce8924b4c	refactor: use max_sequence_number to sort chunks for deduplication (#5101 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-07-12 16:23:53 +00:00
Marco Neumann	96da584139	test: do NOT create expensive bloom filters when we do not need them (#5089 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-07-11 16:29:53 +00:00
Marco Neumann	0a61989df8	refactor: `QuerierParquet` + `QuerierRBChunk` = ❤️ (merge them together) (#5063 ) * refactor: `QuerierParquet` + `QuerierRBChunk` = ❤️ * refactor: address review comments Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-07-08 08:06:53 +00:00
Marco Neumann	41c8a8428f	feat: `ReadBufferCache::peek` (#5064 ) For #5032. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-07-08 07:35:24 +00:00
Andrew Lamb	c46e1c6347	chore: Update datafusion + arrow/parquet/arrow-flight to `17.0.0` (#5021 ) * fix: correct nullability declaration of system tables * chore: Update datafusion and arrow/parquet/arrow-flight * chore: Run cargo hakari tasks * fix: Update tests * fix: Update tests * fix: predicate pruning * fix: add some tests * fix: query_functions * fix: fix read_buffer test * fix: fix clippy Co-authored-by: CircleCI[bot] <circleci@influxdata.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-07-07 19:22:15 +00:00
Marco Neumann	aacdeaca52	refactor: prep work for #5032 (#5060 ) * refactor: remove parquet chunk ID to `ChunkMeta` * refactor: return `Arc` from `QueryChunk::summary` This is similar to how we handle other chunk data like schemas. This allows a chunk to change/refine its "believe" over its own payload while it is passed around in the query stack. Helps w/ #5032.	2022-07-07 13:21:48 +00:00
Marco Neumann	2e5366a62a	refactor: disable TTL (caching) for non-existing namespaces (#5053 ) This is not relevant at the moment for prod since other layers prevent/filter queries for non-existing namespaces. However this messes up the flux integration tests, see https://github.com/influxdata/conductor/issues/997 So let's disable this specific cache case until #4617 is implemented which may be used by the flux tests. Fixes https://github.com/influxdata/conductor/issues/997 Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-07-06 15:22:58 +00:00
Marco Neumann	16bd3e67c0	refactor: unify `apply_predicate_to_metadata` (#5030 ) Instead of using some hand-rolled timestamp-based logic (or just "unknown") all over the place, just use logic introduced in #5017. This requires slightly improved table summaries within the querier that at least has min/max for the timestamp column. For that, the former `IngesterChunk`-specific `calculate_summary` method was extended to `create_basic_summary` to include that data and is now also used by `QuerierParquetChunk`. Note: `QuerierRBChunk` already has detailled metrics that are provided by the read buffer implementation. Should we ever need even better pruning for `QuerierParquetChunk` (or `IngesterChunk`) then we _only_ need add extra data to the table summaries. Closes #4976. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-07-05 12:51:59 +00:00
Andrew Lamb	c4c251129e	chore: Update datafusion (#5020 ) * chore: Update datafusion * fix: Update plan * fix: update explain plans Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-07-01 19:59:41 +00:00
kodiakhq[bot]	84d2573ab6	Merge branch 'main' into cn/move-sharding-logic	2022-07-01 17:46:33 +00:00
Marco Neumann	016dd93d9c	feat: filter chunks before requesting read buffers (#4996 ) Fixes #4976.	2022-07-01 08:59:07 +00:00
Marco Neumann	87a8579742	refactor: `ChunkOrder::new` cannot fail (#5004 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-30 22:26:20 +00:00
Carol (Nichols \|\| Goulding)	380166f4c0	refactor: Move sharding query from namespace to table Better supports future work. Fixes #5003.	2022-06-30 11:32:23 -04:00
Marco Neumann	be53716e4d	refactor: use IDs for `parquet_file.column_set` (#4965 ) * feat: `ColumnRepo::list_by_table_id` * refactor: use IDs for `parquet_file.column_set` Closes #4959. * refactor: introduce `TableSchema::column_id_map`	2022-06-30 15:08:41 +00:00
Carol (Nichols \|\| Goulding)	3049479b78	feat: Implement new querier to ingester config design	2022-06-30 08:26:50 -04:00
Carol (Nichols \|\| Goulding)	59da2dccb8	feat: Assert if no ingester addresses are found Temporarily support `--ingester-addresses` (and always return all ingesters) so that this PR can be deployed during the switchover.	2022-06-30 08:22:47 -04:00
Carol (Nichols \|\| Goulding)	0e450deca8	feat: Support a sequencer being mapped to multiple ingesters	2022-06-30 08:22:47 -04:00
Carol (Nichols \|\| Goulding)	44bce8e3ec	fix: Don't assume one ingester per shard/table	2022-06-30 08:22:47 -04:00
Carol (Nichols \|\| Goulding)	4e91121e29	feat: Allow specification of sequencer to ingester mappings in a JSON file	2022-06-30 08:22:46 -04:00
Carol (Nichols \|\| Goulding)	f37f8013ec	feat: Assign a sequencer id to QuerierTables to know which ingester to query	2022-06-30 08:22:46 -04:00
Carol (Nichols \|\| Goulding)	1824dbdebd	feat: Create IngesterConnection optionally using a map of sequencer IDs to ingester addresses	2022-06-30 08:22:46 -04:00
Raphael Taylor-Davies	835e1c91c7	chore: update object_store to 0.3.0 (#4707 ) * chore: update object_store to 0.3.0 * chore: review feedback Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-29 21:44:03 +00:00
Andrew Lamb	01fb2e132d	chore: Update datafusion pin (#4969 ) * chore: Update datafusion pin * fix: Update for api * fix: Explicitly set coalsce batch size * fix: Update batch size as well * fix: update tests for new explain plan, and improved coercion	2022-06-29 17:52:37 +00:00
Marco Neumann	1eac304305	refactor: fetch RB chunks in parallel (#4952 ) Currently the querier fetches RB in a serial manner, which is probably not good since each cache miss takes between 10ms and 250ms. Let's try to fetch 2 in parallel and if that works well, make this a proper config. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-28 07:54:58 +00:00
Marco Neumann	9b8086df74	fix: size estimates (#4950 ) * fix: `Tombstone::size` must include serialized predicate * fix: `CachedPartition::size` must include `Arc` heap allocation Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-27 15:25:32 +00:00
Marco Neumann	1a74f84494	refactor: remove `ParquetFileWithMetadata` usage outside the catalog (#4948 ) * refactor: remove `DecodedParquetFile` from `iox_tests` * refactor: remove `DecodedParquetFile` from querier Also pull out all the chunk schema and sort key handling into a function so that RB chunks and parquet chunks mostly use the same code path. * refactor: remove `DecodedParquetFile` * refactor: remove `ParquetFileWithMetadata` usage * fix: test data consistency	2022-06-27 15:19:29 +00:00
Marco Neumann	3b78bf1c48	refactor: remove binary parquet file MD from compactor (#4938 ) * refactor: simplify sort key calculation * refactor: use schema from catalog instead from file * refactor: do not request parquet file MD in compactor * test: ensure that `QueryableParquetChunk` works correctly	2022-06-27 15:11:15 +00:00
Marco Neumann	b9cbb3dfca	refactor: do not use in-parquet IOx metadata in compactor () (#4935 ) refactor: avoid feeding sort key from struct into same struct * feat: allow namespace schema query by ID * refactor: do not use binary parquet file MD in compactor tests * refactor: do not use in-parquet IOx metadata * refactor: reduce number of catalog queries	2022-06-27 08:06:11 +00:00
Marco Neumann	bd6c4659af	refactor: slim down parquet chunk (remove Metadata) (#4934 ) * feat: conversion from `ParquetFile` to `ParquetFilePath` * refactor: slim down parquet chunk - ensure it works without binary parquet metadata - timestamp range is no longer optional (ensured by the NG type system) - remove table summary: this is only needed for SOME API users. The compactor can perfectly work without statistics since has the timestamp range which is sufficient for the current overlap check (we don't use any other primary key stats at the moment). The querier currently does NOT use parquet chunks (was replaced by read buffer) but if it will again in some future it will likely need to find a way to fetch and cache the statistics. - the schema is now provided by the API user since it can be reconstructed using the NG catalog only (and "wrong" column orders are tolerated as of #4921) Ref #4124 Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-23 10:55:16 +00:00
Marco Neumann	463d430d43	refactor: do not fetch parquet MD from catalog in querier (#4926 ) Ref #4124	2022-06-23 09:03:19 +00:00
Marco Neumann	4b7d02fad1	feat: do not rely on encoded parquet metadata for RB chunks (#4924 ) * fix: use proper sort key in tests * feat: do not rely on encoded parquet metadata for RB chunks Ref #4124. * refactor: allocate less strings * refactor: use upstream PK calculation * fix: cache expiration w/o a good reason * refactor: make namespace cache safer to use * refactor: make partition cache safer to use	2022-06-23 08:55:52 +00:00
Marco Neumann	0534b80886	fix: `ParquetFile::size` must include column set (#4925 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-22 13:06:02 +00:00
Marco Neumann	9591bed696	refactor: make querier internals private (#4922 ) Queries internals are not meant to be used by other crates. Only a handful selected interfaces should be used by IOxD and the query tests. The compactor only used a very small subset just to read parquet files back into memory. It shall rather use the official `parquet_file` interface instead. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-22 13:00:08 +00:00
Marco Neumann	59accfe862	refactor: assorted fixes and prep work for #4124 (#4912 ) * refactor: `TestPartition::update_sort_key` should return an `Arc` The whole test framework is built around `Arc`s, so let's fix this consistency issue. * fix: actually calculate correct column set in test framework * feat: check expected parquet file schema While working on the querier I made some mistakes regarding schemas and such a check would have greatly improved the debugging experience. * feat: namespace cache expiration * fix: improve parquet schema check * fix: remove clone	2022-06-21 16:08:28 +00:00
Marco Neumann	70337087a8	refactor: do not require parquet metadata for RB cache (#4911 ) * test: add `TestParquetFile::schema` * refactor: do not require parquet metadata for RB cache Ref #4124.	2022-06-21 12:59:23 +00:00
Marco Neumann	db24838221	refactor: remove table name from read buffer (#4910 ) The low-level chunk storage shouldn't care about the table name (this is also true for parquet chunks btw). In fact, the table name is already only a partial information since it misses the namespace. If we need a table name, then the high-level chunk/data management is responsible for that. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-21 11:57:28 +00:00
Marco Neumann	0f63be26c3	refactor: pass path instead of metadata around to load parquet files (#4909 )	2022-06-21 10:57:10 +00:00
Marco Neumann	c3912e34e9	refactor: store per-file column set in catalog (#4908 ) * refactor: store per-file column set in catalog Together with the table-wide schema and the partition-wide sort key, this should be everything we need to read a parquet file directly into memory without peeking any file-level metadata. The querier will use this to directly load parquet files into the read buffer. WARNING: This requires a catalog wipe! Ref #4124. * refactor: use proper `ColumnSet` type	2022-06-21 10:26:12 +00:00
Marco Neumann	730f85a619	refactor(querier): split ingester partitions into chunks (#4893 ) * refactor(querier): split ingester partitions into chunks With the new wire protocol the ingester can now transmit multiple snapshots per partition with different schemas. This changes the querier to reflect this and and splits uses the individual snapshots as chunks for the query engine instead of a single partition. The schema handling was changed so that instead of a table-wide schema enforcement, we now use the snapshot-specific projections. This means we do not need to create all-NULL columns any longer because the batches within the chunks now always have the correct schema. * refactor: "disassembler" -> "decoder"	2022-06-20 08:58:58 +00:00
Nga Tran	72c8cfa6ed	fix: make ChunkOrder i64 data type to accept min sequence number 0 and match with data type of sequence number (#4888 ) * fix: make ChunkOrder u64 data type to accept min sequence number 0 * fix: make ChunkOrder i64 to match with sequence number type Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-17 13:45:17 +00:00
Marco Neumann	0fbff981ec	chore(deps): Bump sqlx to 0.6.0 and uuid to 1 (#4894 ) Closes #4889. Closes #4890. Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-06-17 10:28:28 +00:00
Marco Neumann	c6bffac5d3	refactor: make querier->ingester request metrics per-ingester (#4879 ) The metrics and logs introduced in #4806 will be emitted once for all ingesters instead of per request. The accumulated view makes it pretty hard to judge the actual request-response timings and the number of requests. Instead we now measure the data per request. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-16 15:09:47 +00:00
Marco Neumann	66c7d95312	refactor: use new ingester<>querier wire protocol (#4867 ) * refactor: use new ingester<>querier wire protocol Use and document the new and more flexible ingester<>querier wire protocol. Note that the ingester does NOT stream the response data yet, but the internal data structures would allow that. A follow-up change will adjust the ingester code to stream the data. Ref #4849. * fix: typos Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> * refactor: clarify naming and public interface * test: add schema assertion to `ingester_response_to_record_batches` Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>	2022-06-16 08:02:28 +00:00
kodiakhq[bot]	fa9a094068	Merge branch 'main' into cn/talk-to-ingesters-less	2022-06-15 17:42:40 +00:00
Carol (Nichols \|\| Goulding)	8331cb1afe	fix: Add retry to querying of catalog for sequencers in querier startup	2022-06-15 12:09:42 -04:00
Carol (Nichols \|\| Goulding)	03f6f59a9b	fix: Change the sharder to return error instead of panicking for no shards	2022-06-15 11:23:31 -04:00
Marco Neumann	7c60edd38c	refactor: prepare new ingester<>querier protocol on the querier side (#4863 ) * refactor: prepare new ingester<>querier protocol on the querier side This changes the querier internals to work with the new protocol. The wire protocol stays the same (for now). There's a (somewhat hackish) adapter in place on the querier side that converts the old to the new protocol on-the-fly. This is an intermediate step before we actually change the wire protocol (and in a step after that also take advantage of the new possibilites on the ingester side). Ref #4849. * docs: explain adapter	2022-06-15 14:32:24 +00:00
Carol (Nichols \|\| Goulding)	e9cdaffe74	fix: Create querier sharder from catalog sequencer info Panic if there are no sharders in the catalog.	2022-06-15 10:18:54 -04:00
Carol (Nichols \|\| Goulding)	874ef89daa	feat: Make specifying the write buffer, and thus getting a sharder, optional in querier	2022-06-15 10:01:45 -04:00
Marco Neumann	3bd24b67ba	feat: extend flight client to accept multiple (changing) schemas (#4853 ) * feat: extend flight client to accept multiple (changing) schemas See #4849. Originally I intended not to use Flight at all for the new ingester<>querier protocol. However since flight also deals with dictionary batches and multiple batches and the gRPC protocol that I would write would look very similar, I will use Flight with a bit more flexible message types. The rough idea for the protocol is the following stream: - for each partition: 1. "none" message with partition metadata 2. for each chunk (can have different schemas under certain circumstances): 1. "schema" message (resets dictionary state) 2. (optional) dictionary batch messages 3. one or more "record batch" message The nice thing about it is that the same arrow client works also for the existing client<>querier protocol since there we just send: 1. "schema" message (no app metadata) 2. (optional) dictionary batch messages 3. zero, one or more "record batch" message (no app metadata) * refactor: separate high- and low-level flight client It is very unlikely that a user will use the high-level batch-producing functionality and the low-level stuff within the same session. So let's split this into to clients (high-level uses the low-level one internally) to avoid confusion. Also add documentation on our protocol handling. * refactor: enumerate all variants in match statement to better catch errors in the future	2022-06-15 11:38:08 +00:00
Carol (Nichols \|\| Goulding)	e875a92cf8	feat: Log time spent requesting ingester partitions (#4806 ) * feat: Log time spent requesting ingester partitions Fixes #4558. * feat: Record a metric for the duration queriers wait on ingesters * fix: Use DurationHistogram instead of U64 Histogram * test: Add a test for the ingester ms metric * feat: Add back the logging to provide both logging and metrics for ingester duration * refactor: Use sample_count method on metrics * feat: Record ingester duration separately for success or failure * fix: Create a separate test for the ingester metrics Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-14 17:58:19 +00:00
Andrew Lamb	e91d00b10c	chore: Update datafusion + `arrow`/`parquet`/`arrow-flight` to `16.0.0 (#4851 ) * chore: TEMP Update DataFusion to pre-release * chore: update arrow et al to 16.0.0 * chore: Run cargo hakari tasks * fix: update reader read_dictionary API * chore: Update to real Datafusion release * fix: Update parquet API * fix: update test Co-authored-by: CircleCI[bot] <circleci@influxdata.com>	2022-06-14 16:31:40 +00:00
Dom Dwyer	b41ea1d718	refactor: PartitionKey type This commit changes the code base to use a new reference-counted PartitionKey type wrapper, instead of passing a bare String around. This allows the compiler to type check & verify usage of the partition key, instead of passing a bare string around. By reference counting the underlying string, we reduce memory usage for some use cases.	2022-06-14 14:47:56 +01:00
Marco Neumann	2b84e5c087	feat: measure "probably reloaded" cache loads (#4813 ) To roughly gauge how much data we re-load into cached (i.e. data that was already loaded but was later evicted due to LRU pressure or TTL eviction) this change introduces a new metric that estimates if a cache entry that is requested from the loader was already seen before (using a probabilistic filter).	2022-06-13 13:51:45 +00:00
Marco Neumann	66623fe0cd	feat: expose query semaphore metrics (#4836 ) The groundwork for that was already done, just needed a bit of wiring. This might help us to judge timeouts.	2022-06-13 09:36:50 +00:00
Andrew Lamb	ddf61c5e98	refactor: Consolidate `Selection` creation, add tests (#4832 ) * refactor: Consolidate Selection --> DataFusion projection * fix: remove now unused function	2022-06-10 18:30:43 +00:00
kodiakhq[bot]	dd8d44e24f	Merge branch 'main' into cn/duration	2022-06-10 14:23:09 +00:00
Nga Tran	13c57d524a	feat: Change data type of catalog partition's sort_key from a string to an array of string (#4801 ) * feat: Change data type of catalog Postgres partition's sort_key from a string to an array of string * test: add column with comma * fix: use new protonuf field to avoid incompactible * fix: ensure sort_key is an empty array rather than NULL * refactor: address review comments * refactor: address more comments * chore: clearer comments * chore: Update iox_catalog/migrations/20220607102200_change_sort_key_type_to_array.sql * chore: Update iox_catalog/migrations/20220607102200_change_sort_key_type_to_array.sql * fix: Rename migration so it will be applied after Co-authored-by: Marko Mikulicic <mkm@influxdata.com>	2022-06-10 13:31:31 +00:00
Carol (Nichols \|\| Goulding)	1c7cbaf5ae	refactor: Use DurationHistogram in more places	2022-06-09 14:20:51 -04:00
Marco Neumann	4e5842dec7	feat: expose hit-miss metrics for querier caches (#4811 ) * feat: `MetricsCache` * feat: expose hit-miss metrics for querier caches * refactor: `MetricsCache` -> `CacheWithMetrics` Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-09 13:07:40 +00:00
Andrew Lamb	2ec7764fdd	refactor: rename builder like predicate methods to be `with_` (#4808 ) * refactor: rename builder like predicate methods to be `with_` * fix: merge conflict Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-09 11:26:03 +00:00
Andrew Lamb	5e4fcfaa4d	refactor: reduce mut usage in Predicate (#4807 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-09 10:46:01 +00:00

... 2 3 4 5 6 ...

543 Commits (71625043e2b393eecc803e7c30fb3554c7a7881c)