* refactor: `TestPartition::update_sort_key` should return an `Arc`
The whole test framework is built around `Arc`s, so let's fix this
consistency issue.
* fix: actually calculate correct column set in test framework
* feat: check expected parquet file schema
While working on the querier I made some mistakes regarding schemas and
such a check would have greatly improved the debugging experience.
* feat: namespace cache expiration
* fix: improve parquet schema check
* fix: remove clone
The low-level chunk storage shouldn't care about the table name (this is
also true for parquet chunks, btw). In fact, the table name is only
partial information anyway, since it lacks the namespace.
If we need a table name, then the high-level chunk/data management is
responsible for that.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* refactor: store per-file column set in catalog
Together with the table-wide schema and the partition-wide sort key, this should
be everything we need to read a parquet file directly into memory
without peeking at any file-level metadata.
The querier will use this to directly load parquet files into the read
buffer.
**WARNING: This requires a catalog wipe!**
Ref #4124.
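To make the data flow concrete, here is a minimal self-contained sketch (type names such as `ParquetReadInfo` are illustrative, not the actual IOx types) of how the table-wide schema, the partition-wide sort key, and the new per-file column set together describe a file well enough to load it without reading its embedded metadata:

```rust
use std::collections::BTreeSet;
use std::sync::Arc;

/// Illustrative stand-ins for the real catalog/schema types.
#[derive(Debug, Clone)]
struct Column {
    name: String,
    // column type omitted for brevity
}

#[derive(Debug)]
struct TableSchema {
    columns: Vec<Column>,
}

/// Per-file column set as stored in the catalog.
#[derive(Debug, Clone)]
struct ColumnSet(BTreeSet<String>);

/// Everything needed to read one parquet file directly into memory,
/// without peeking at the file-level parquet metadata.
struct ParquetReadInfo {
    table_schema: Arc<TableSchema>, // table-wide
    sort_key: Vec<String>,          // partition-wide
    column_set: ColumnSet,          // per-file
}

impl ParquetReadInfo {
    /// The schema of this particular file is the table schema projected
    /// down to the columns recorded for the file.
    fn file_columns(&self) -> Vec<Column> {
        self.table_schema
            .columns
            .iter()
            .filter(|c| self.column_set.0.contains(&c.name))
            .cloned()
            .collect()
    }
}
```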
* refactor: use proper `ColumnSet` type
* refactor(querier): split ingester partitions into chunks
With the new wire protocol the ingester can now transmit multiple
snapshots per partition with different schemas. This changes the querier
to reflect this: the individual snapshots are used as chunks for the
query engine instead of a single partition.
The schema handling was changed so that instead of enforcing a
table-wide schema, we now use the snapshot-specific projections. This means we
do not need to create all-NULL columns any longer because the batches
within the chunks now always have the correct schema.
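A rough sketch of that splitting step; the types are made up for illustration and the real querier code is more involved:

```rust
use arrow::datatypes::SchemaRef;
use arrow::record_batch::RecordBatch;

/// One snapshot as transmitted by the ingester (illustrative).
struct IngesterSnapshot {
    batches: Vec<RecordBatch>,
}

/// One chunk handed to the query engine; carries its own schema.
struct IngesterChunk {
    /// Schema of this chunk only; may differ between chunks of the same
    /// partition, so no all-NULL padding to a table-wide schema is needed.
    schema: SchemaRef,
    batches: Vec<RecordBatch>,
}

fn snapshots_to_chunks(snapshots: Vec<IngesterSnapshot>) -> Vec<IngesterChunk> {
    snapshots
        .into_iter()
        .filter(|s| !s.batches.is_empty())
        .map(|s| IngesterChunk {
            // all batches within one snapshot share the same schema
            schema: s.batches[0].schema(),
            batches: s.batches,
        })
        .collect()
}
```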
* refactor: "disassembler" -> "decoder"
* fix: make ChunkOrder a u64 data type to accept the min sequence number 0
* fix: make ChunkOrder an i64 to match the sequence number type
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
The metrics and logs introduced in #4806 were emitted once for all
ingesters instead of per request. That accumulated view makes it pretty
hard to judge the actual request-response timings and the number of
requests.
Instead, we now measure the data per request.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* refactor: use new ingester<>querier wire protocol
Use and document the new and more flexible ingester<>querier wire
protocol.
Note that the ingester does NOT stream the response data yet, but the
internal data structures would allow that. A follow-up change will
adjust the ingester code to stream the data.
Ref #4849.
* fix: typos
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* refactor: clarify naming and public interface
* test: add schema assertion to `ingester_response_to_record_batches`
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* refactor: prepare new ingester<>querier protocol on the querier side
This changes the querier internals to work with the new protocol. The
wire protocol stays the same (for now). There's a (somewhat hackish)
adapter in place on the querier side that converts the old to the new
protocol on-the-fly. This is an intermediate step before we actually
change the wire protocol (and in a step after that also take advantage
of the new possibilities on the ingester side).
Ref #4849.
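A simplified sketch of what such an adapter can look like; the message and type names here are illustrative and do not mirror the real querier code:

```rust
use arrow::record_batch::RecordBatch;

/// Shape the querier internals are now written against (illustrative).
enum NewProtocolMessage {
    /// Partition metadata, followed by the chunks of that partition.
    Partition { partition_id: i64 },
    RecordBatch(RecordBatch),
}

/// Old wire protocol: one flat list of batches per partition.
struct OldResponse {
    partition_id: i64,
    batches: Vec<RecordBatch>,
}

/// The adapter converts an old-style response into the message sequence
/// that the new-protocol code paths expect.
fn adapt(old: OldResponse) -> Vec<NewProtocolMessage> {
    std::iter::once(NewProtocolMessage::Partition {
        partition_id: old.partition_id,
    })
    .chain(old.batches.into_iter().map(NewProtocolMessage::RecordBatch))
    .collect()
}
```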
* docs: explain adapter
* feat: extend flight client to accept multiple (changing) schemas
See #4849.
Originally I intended not to use Flight at all for the new
ingester<>querier protocol. However, since Flight already deals with
dictionary batches and multiple batches, and the gRPC protocol I would
write myself would look very similar, I will use Flight with slightly
more flexible message types.
The rough idea for the protocol is the following stream:
- for each partition:
  1. "none" message with partition metadata
  2. for each chunk (can have different schemas under certain
     circumstances):
     1. "schema" message (resets dictionary state)
     2. (optional) dictionary batch messages
     3. one or more "record batch" messages
The nice thing about it is that the same arrow client also works for the
existing client<>querier protocol since there we just send:
1. "schema" message (no app metadata)
2. (optional) dictionary batch messages
3. zero, one, or more "record batch" messages (no app metadata)
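The following sketch shows how a client might consume such a stream; the message enum and chunk type are illustrative, not the actual flight client API:

```rust
use arrow::datatypes::SchemaRef;
use arrow::record_batch::RecordBatch;

/// Illustrative message kinds; app metadata distinguishes them on the wire.
enum FlightMessage {
    /// "none" message: no payload, app metadata carries the partition info.
    None { partition_meta: Vec<u8> },
    /// "schema" message: opens a new chunk and resets dictionary state.
    Schema(SchemaRef),
    /// Dictionary batch: updates dictionary state for subsequent batches.
    Dictionary(Vec<u8>),
    /// Record batch encoded against the current schema + dictionaries.
    RecordBatch(RecordBatch),
}

struct Chunk {
    schema: SchemaRef,
    batches: Vec<RecordBatch>,
}

fn decode(stream: Vec<FlightMessage>) -> Vec<Chunk> {
    let mut chunks = Vec::new();
    for msg in stream {
        match msg {
            FlightMessage::None { .. } => {
                // new partition; its metadata is handled elsewhere
            }
            FlightMessage::Schema(schema) => {
                // a schema message resets dictionary state and opens a new chunk
                chunks.push(Chunk { schema, batches: Vec::new() });
            }
            FlightMessage::Dictionary(_) => {
                // would update the dictionary tracker of the current chunk
            }
            FlightMessage::RecordBatch(batch) => {
                if let Some(chunk) = chunks.last_mut() {
                    chunk.batches.push(batch);
                }
            }
        }
    }
    chunks
}
```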
* refactor: separate high- and low-level flight client
It is very unlikely that a user will use the high-level batch-producing
functionality and the low-level stuff within the same session. So let's
split this into two clients (the high-level client uses the low-level one
internally) to avoid confusion.
Also add documentation on our protocol handling.
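A minimal sketch of the split, with made-up type names: the high-level client is a thin layer that drives the low-level message stream and only surfaces decoded record batches:

```rust
use arrow::datatypes::SchemaRef;
use arrow::record_batch::RecordBatch;

/// Minimal message shape for this sketch (see the protocol sketch above).
enum Message {
    Schema(SchemaRef),
    Dictionary(Vec<u8>),
    RecordBatch(RecordBatch),
}

/// Low-level client: exposes the raw message stream.
struct LowLevelFlightClient;

impl LowLevelFlightClient {
    fn next_message(&mut self) -> Option<Message> {
        // raw gRPC/IPC decoding would live here
        None
    }
}

/// High-level client: hides message handling and yields decoded batches.
/// It is built on top of the low-level client rather than duplicating it.
struct HighLevelFlightClient {
    inner: LowLevelFlightClient,
}

impl HighLevelFlightClient {
    fn next_batch(&mut self) -> Option<RecordBatch> {
        loop {
            match self.inner.next_message()? {
                Message::RecordBatch(batch) => return Some(batch),
                // schema and dictionary messages only update internal state
                Message::Schema(_) | Message::Dictionary(_) => continue,
            }
        }
    }
}
```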
* refactor: enumerate all variants in match statement to better catch errors in the future
* feat: Log time spent requesting ingester partitions
Fixes #4558.
* feat: Record a metric for the duration queriers wait on ingesters
* fix: Use DurationHistogram instead of U64 Histogram
* test: Add a test for the ingester ms metric
* feat: Add back the logging to provide both logging and metrics for ingester duration
* refactor: Use sample_count method on metrics
* feat: Record ingester duration separately for success or failure
* fix: Create a separate test for the ingester metrics
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
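A hedged sketch of the per-request measurement pattern from the ingester-duration commits above; the histogram type is a simple stand-in, not the actual `metric` crate API:

```rust
use std::time::{Duration, Instant};

/// Stand-in for a duration histogram recorder.
#[derive(Default)]
struct DurationHistogram {
    samples: Vec<Duration>,
}

impl DurationHistogram {
    fn record(&mut self, d: Duration) {
        self.samples.push(d);
    }

    /// Mirrors the `sample_count` used in the tests.
    fn sample_count(&self) -> usize {
        self.samples.len()
    }
}

/// Separate histograms so success and failure timings don't mix.
#[derive(Default)]
struct IngesterDurationMetrics {
    success: DurationHistogram,
    error: DurationHistogram,
}

impl IngesterDurationMetrics {
    /// Time a single ingester request and record the duration into the
    /// histogram matching its outcome.
    fn observe<T, E>(&mut self, request: impl FnOnce() -> Result<T, E>) -> Result<T, E> {
        let start = Instant::now();
        let res = request();
        let elapsed = start.elapsed();
        match &res {
            Ok(_) => self.success.record(elapsed),
            Err(_) => self.error.record(elapsed),
        }
        res
    }
}
```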
* chore: TEMP Update DataFusion to pre-release
* chore: update arrow et al to 16.0.0
* chore: Run cargo hakari tasks
* fix: update reader read_dictionary API
* chore: Update to real Datafusion release
* fix: Update parquet API
* fix: update test
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
This commit changes the code base to use a new reference-counted
PartitionKey type wrapper instead of passing a bare String around. This
allows the compiler to type-check and verify usage of the partition key,
and by reference counting the underlying string we reduce memory usage
for some use cases.
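A minimal sketch of the idea (the real `PartitionKey` lives in the IOx data types crate and carries more trait impls):

```rust
use std::sync::Arc;

/// Reference-counted, type-checked partition key (illustrative version).
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct PartitionKey(Arc<str>);

impl From<&str> for PartitionKey {
    fn from(s: &str) -> Self {
        Self(Arc::from(s))
    }
}

impl std::fmt::Display for PartitionKey {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        f.write_str(&self.0)
    }
}

/// A function taking `PartitionKey` can no longer be handed an arbitrary
/// `String` by accident, and cloning the key only bumps a refcount.
fn lookup_partition(_key: &PartitionKey) { /* ... */ }
```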
To roughly gauge how much data we re-load into the cache (i.e. data that
was already loaded but was later evicted due to LRU pressure or TTL
eviction), this change introduces a new metric that estimates whether a
cache entry requested from the loader was already seen before (using a
probabilistic filter).
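As a sketch of the idea, a tiny Bloom-style filter can provide the "probably seen before" signal; names and sizing here are illustrative, and the real implementation may use a different probabilistic structure:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Tiny Bloom-style filter: answers "was this key probably seen before?".
struct SeenFilter {
    bits: Vec<bool>,
    /// Exposed as the new metric in the real code: loader requests for
    /// keys that were (probably) already loaded once.
    probably_reloaded: u64,
}

impl SeenFilter {
    fn new(size: usize) -> Self {
        Self {
            bits: vec![false; size],
            probably_reloaded: 0,
        }
    }

    /// Derive two bit positions for the key.
    fn positions(&self, key: &impl Hash) -> [usize; 2] {
        let mut out = [0usize; 2];
        for (i, slot) in out.iter_mut().enumerate() {
            let mut hasher = DefaultHasher::new();
            (i as u64, key).hash(&mut hasher);
            *slot = (hasher.finish() as usize) % self.bits.len();
        }
        out
    }

    /// Record the key; returns `true` if it was probably seen before.
    fn observe(&mut self, key: &impl Hash) -> bool {
        let pos = self.positions(key);
        let seen_before = pos.iter().all(|&p| self.bits[p]);
        for &p in pos.iter() {
            self.bits[p] = true;
        }
        if seen_before {
            self.probably_reloaded += 1;
        }
        seen_before
    }
}
```

Running `observe` on the loader path makes the counter an estimate of how much data is re-loaded after eviction; false positives are possible, which is why the metric is only an approximation.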
* feat: Change data type of catalog Postgres partition's sort_key from a string to an array of strings (a round-trip sketch follows after this commit list)
* test: add column with comma
* fix: use new protobuf field to avoid incompatibility
* fix: ensure sort_key is an empty array rather than NULL
* refactor: address review comments
* refactor: address more comments
* chore: clearer comments
* chore: Update iox_catalog/migrations/20220607102200_change_sort_key_type_to_array.sql
* chore: Update iox_catalog/migrations/20220607102200_change_sort_key_type_to_array.sql
* fix: Rename migration so it will be applied after
Co-authored-by: Marko Mikulicic <mkm@influxdata.com>
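A hedged sketch, assuming a `partition` table with a `sort_key` column and using `sqlx`, of how the sort key round-trips once it is stored as a Postgres array:

```rust
use sqlx::PgPool;

/// With the migration applied, the sort key round-trips as a Postgres
/// `TEXT[]`; an empty sort key is an empty array, never NULL. Array
/// elements may safely contain commas, unlike a single comma-separated
/// string.
async fn get_sort_key(pool: &PgPool, partition_id: i64) -> sqlx::Result<Vec<String>> {
    let sort_key: Vec<String> =
        sqlx::query_scalar("SELECT sort_key FROM partition WHERE id = $1")
            .bind(partition_id)
            .fetch_one(pool)
            .await?;
    Ok(sort_key)
}
```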
* refactor: make `Cache` a trait
To insert more high-level metrics (e.g. cache misses/hits) it would be
helpful if we could easily instrument the layer right above the cache
driver (that combines the backend and the loader). To do that without
polluting the types too much, let's introduce a trait that describes the
driver interface and that we could later wrap with instrumentation (sketched below).
This also pulls out the test into a generic setup, similar to how this
is done for the cache storage backends.
This does NOT include any functionality changes.
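A sketch of the trait shape and an instrumentation wrapper; the signatures are simplified relative to the real querier cache system:

```rust
use std::sync::atomic::{AtomicU64, Ordering};

use async_trait::async_trait;

/// The driver interface: combines a backend and a loader behind `get`.
#[async_trait]
trait Cache: Send + Sync {
    type K: Send;
    type V: Send;

    /// Get the value for `key`, loading it on a miss.
    async fn get(&self, key: Self::K) -> Self::V;
}

/// Wraps any cache driver and counts requests, without the driver
/// knowing anything about metrics.
struct InstrumentedCache<C> {
    inner: C,
    requests: AtomicU64,
}

#[async_trait]
impl<C> Cache for InstrumentedCache<C>
where
    C: Cache,
{
    type K = C::K;
    type V = C::V;

    async fn get(&self, key: Self::K) -> Self::V {
        self.requests.fetch_add(1, Ordering::Relaxed);
        self.inner.get(key).await
    }
}
```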
* fix: typo
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* refactor: rework querier concurrency limiting
With #4752 we introduced a concurrency limit into the querier. It works
by drawing permits from a central semaphore whenever we create a
`QuerierNamespace`. This however only limits concurrency during query
planning and not query execution, because the objects contained within
the plan (chunks and some metadata) neither reference the permit nor the
`QuerierNamespace`.
Now one approach to fix that would be to wire up the permit all the way down
into all the query-related data structures. This however is very fiddly
and potentially will get lost at some point, because as soon as we
transform these data structures -- e.g. into streams -- the permit might
get lost again. This will be potentially query-dependent and very hard
to debug.
So instead we reverse the approach and track the permits at the upper
layer of the stack: the gRPC service entry points. There we also need to
be careful -- e.g. when we return streams to tonic -- but it's way
easier to review that than the deeply nested object hierarchy that is
involved with queries. Also the separation of concerns is a bit clearer:
why would a "chunk" care about "query concurrency" as a whole?
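A sketch of the pattern with a plain `tokio` semaphore (the real code differs in detail): the permit is acquired at the gRPC entry point and moved into the response, so it is held for the whole query execution rather than only for planning.

```rust
use std::sync::Arc;

use arrow::record_batch::RecordBatch;
use tokio::sync::{OwnedSemaphorePermit, Semaphore};

/// The permit lives inside the response, so the concurrency slot is held
/// for as long as the result is alive (e.g. while it is streamed out),
/// not just during query planning.
struct QueryResponse {
    batches: Vec<RecordBatch>,
    _permit: OwnedSemaphorePermit,
}

/// Sketch of a gRPC entry point: acquire the permit first, then plan and
/// execute, and hand the permit to the response.
async fn handle_query(
    semaphore: Arc<Semaphore>,
    run_query: impl std::future::Future<Output = Vec<RecordBatch>>,
) -> QueryResponse {
    let permit = semaphore
        .acquire_owned()
        .await
        .expect("semaphore is never closed");

    let batches = run_query.await;

    QueryResponse {
        batches,
        _permit: permit,
    }
}
```

Because the permit is owned by the response, dropping the response (including dropping a stream halfway through) releases the concurrency slot automatically.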
* refactor: improve gRPC permit keeping and prepare tests
This is a rather quick fix for prod. In the mid-term we probably want to
rethink our deployment strategy, e.g. by using "one query per pod" and
by deploying queryd w/ IOx into the same pod.