influxdb

Commit Graph

Author	SHA1	Message	Date
kodiakhq[bot]	aff5e6d69a	Merge branch 'main' into dom/ingester-uses-partition-key	2022-06-22 09:47:49 +00:00
Andrew Lamb	087dbd3eca	fix: fix heappy + update docs (#4917 ) * docs: Update heap profiling documentation * fix: fix heappy builds * fix: do not run cli tests with heappy Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-21 19:53:28 +00:00
Marco Neumann	59accfe862	refactor: assorted fixes and prep work for #4124 (#4912 ) * refactor: `TestPartition::update_sort_key` should return an `Arc` The whole test framework is built around `Arc`s, so let's fix this consistency issue. * fix: actually calculate correct column set in test framework * feat: check expected parquet file schema While working on the querier I made some mistakes regarding schemas and such a check would have greatly improved the debugging experience. * feat: namespace cache expiration * fix: improve parquet schema check * fix: remove clone	2022-06-21 16:08:28 +00:00
Dom Dwyer	75a3fd5e1e	refactor: use propagated partition key in ingester Changes the ingester to use the partition key derived in the router, and transmitted over through the kafka API boundary. This should have no observable behavioural change, but be more resilient as we're no longer assuming the partitioning algorithm produces the same value in both the router (where data is partitioned) and the ingester (where data is persisted, segregated by partition key). This is a pre-requisite to allowing the user to specify partitioning schemes.	2022-06-21 15:57:30 +01:00
Marco Neumann	70337087a8	refactor: do not require parquet metadata for RB cache (#4911 ) * test: add `TestParquetFile::schema` * refactor: do not require parquet metadata for RB cache Ref #4124.	2022-06-21 12:59:23 +00:00
Marco Neumann	db24838221	refactor: remove table name from read buffer (#4910 ) The low-level chunk storage shouldn't care about the table name (this is also true for parquet chunks btw). In fact, the table name is already only a partial information since it misses the namespace. If we need a table name, then the high-level chunk/data management is responsible for that. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-21 11:57:28 +00:00
Marco Neumann	0f63be26c3	refactor: pass path instead of metadata around to load parquet files (#4909 )	2022-06-21 10:57:10 +00:00
Marco Neumann	c3912e34e9	refactor: store per-file column set in catalog (#4908 ) * refactor: store per-file column set in catalog Together with the table-wide schema and the partition-wide sort key, this should be everything we need to read a parquet file directly into memory without peeking any file-level metadata. The querier will use this to directly load parquet files into the read buffer. WARNING: This requires a catalog wipe! Ref #4124. * refactor: use proper `ColumnSet` type	2022-06-21 10:26:12 +00:00
Dom	4df710a205	Merge pull request #4907 from influxdata/dom/propagate-partition-key feat: propagate partition key through kafka	2022-06-20 14:07:50 +01:00
Dom Dwyer	c1f7154031	feat: propagate partition key through kafka Changes the kafka message wire format to include the partition key for serialised DML writes on the wire. After this commit, the kafka messages will contain the partition key for each op, but this information will go unused in the ingester - this enables us to roll out the producer side, before making the value's presence necessary on the consumer side. A follow-up PR will change the ingester to utilise this embedded partition key. This has the unfortunate side effect of making the partition key part of the public gRPC write API: https://github.com/influxdata/influxdb_iox/issues/4866	2022-06-20 13:42:51 +01:00
Marco Neumann	1962fcc229	chore: reduce dependencies and run `cargo update` (#4906 ) * chore: reduce proptest features * chore: remove `grpc-router` This crate is currently unused and we don't have immediate plans to use it. And there's GIT, so it can always be restored. * chore: `cargo update`	2022-06-20 12:18:28 +00:00
Marco Neumann	730f85a619	refactor(querier): split ingester partitions into chunks (#4893 ) * refactor(querier): split ingester partitions into chunks With the new wire protocol the ingester can now transmit multiple snapshots per partition with different schemas. This changes the querier to reflect this and uses the individual snapshots as chunks for the query engine instead of a single partition. The schema handling was changed so that instead of a table-wide schema enforcement, we now use the snapshot-specific projections. This means we do not need to create all-NULL columns any longer because the batches within the chunks now always have the correct schema. * refactor: "disassembler" -> "decoder"	2022-06-20 08:58:58 +00:00
dependabot[bot]	cc324613cc	chore(deps): Bump syn from 1.0.96 to 1.0.98 (#4905 ) Bumps [syn](https://github.com/dtolnay/syn) from 1.0.96 to 1.0.98. - [Release notes](https://github.com/dtolnay/syn/releases) - [Commits](https://github.com/dtolnay/syn/compare/1.0.96...1.0.98) --- updated-dependencies: - dependency-name: syn dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-20 08:49:02 +00:00
Andrew Lamb	f151b1e89f	fix: categorize `NamespaceNotFound` as ingester not found errors as well (#4899 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-20 08:40:31 +00:00
dependabot[bot]	2ddc78bc41	chore(deps): Bump uuid from 1.0.0 to 1.1.2 (#4904 ) Bumps [uuid](https://github.com/uuid-rs/uuid) from 1.0.0 to 1.1.2. - [Release notes](https://github.com/uuid-rs/uuid/releases) - [Commits](https://github.com/uuid-rs/uuid/compare/1.0.0...1.1.2) --- updated-dependencies: - dependency-name: uuid dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-06-20 08:34:33 +00:00
dependabot[bot]	8b28721627	chore(deps): Bump tower from 0.4.12 to 0.4.13 (#4903 ) Bumps [tower](https://github.com/tower-rs/tower) from 0.4.12 to 0.4.13. - [Release notes](https://github.com/tower-rs/tower/releases) - [Commits](https://github.com/tower-rs/tower/compare/tower-0.4.12...tower-0.4.13) --- updated-dependencies: - dependency-name: tower dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-06-20 08:25:29 +00:00
kodiakhq[bot]	bf6fec2447	Merge pull request #4898 from influxdata/dom/timestamps test: assert backfill partitioning in router	2022-06-17 13:51:14 +00:00
kodiakhq[bot]	5786245698	Merge branch 'main' into dom/timestamps	2022-06-17 13:45:30 +00:00
Dom	a8c9638e89	Merge branch 'main' into dom/timestamps	2022-06-17 14:45:27 +01:00
Nga Tran	72c8cfa6ed	fix: make ChunkOrder i64 data type to accept min sequence number 0 and match with data type of sequence number (#4888 ) * fix: make ChunkOrder u64 data type to accept min sequence number 0 * fix: make ChunkOrder i64 to match with sequence number type Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-17 13:45:17 +00:00
Dom Dwyer	2cc2ad6887	test: assert backfill partitioning in router Assert writes in the router that cover multiple partitions are correctly split up.	2022-06-17 14:40:53 +01:00
Marco Neumann	0fbff981ec	chore(deps): Bump sqlx to 0.6.0 and uuid to 1 (#4894 ) Closes #4889. Closes #4890. Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-06-17 10:28:28 +00:00
dependabot[bot]	d9ab157797	chore(deps): Bump indexmap from 1.8.2 to 1.9.0 (#4891 ) * chore(deps): Bump indexmap from 1.8.2 to 1.9.0 Bumps [indexmap](https://github.com/bluss/indexmap) from 1.8.2 to 1.9.0. - [Release notes](https://github.com/bluss/indexmap/releases) - [Changelog](https://github.com/bluss/indexmap/blob/master/RELEASES.md) - [Commits](https://github.com/bluss/indexmap/compare/1.8.2...1.9.0) --- updated-dependencies: - dependency-name: indexmap dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> * chore: Run cargo hakari tasks Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: CircleCI[bot] <circleci@influxdata.com>	2022-06-17 07:42:36 +00:00
Nga Tran	3ca74744bf	chore: debug info about sequence number while it gets converted into ChunkOrder (#4884 )	2022-06-16 18:40:55 +00:00
Nga Tran	d57b0eb1fa	chore: more info for i64-to-u128 panic message (#4881 ) * chore: more info for i64-to-u128 panic message * chore: Apply suggestions from code review Co-authored-by: Dom <dom@itsallbroken.com> * chore: fix fmt Co-authored-by: Dom <dom@itsallbroken.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-16 15:49:43 +00:00
Marco Neumann	c6bffac5d3	refactor: make querier->ingester request metrics per-ingester (#4879 ) The metrics and logs introduced in #4806 will be emitted once for all ingesters instead of per request. The accumulated view makes it pretty hard to judge the actual request-response timings and the number of requests. Instead we now measure the data per request. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-16 15:09:47 +00:00
Dom	9122850fe5	Merge pull request #4878 from influxdata/dom/fix-dml-aggregation fix: respect partition key when batching dml ops	2022-06-16 16:00:09 +01:00
Dom Dwyer	43b3f22411	fix: respect partition key when batching dml ops This commit changes the kafka write aggregator to only merge DML ops destined for the same partition. Prior to this commit the aggregator was merging DML ops that had different partition keys, causing data to be persisted in incorrect partitions: https://github.com/influxdata/influxdb_iox/issues/4787	2022-06-16 14:05:32 +01:00
Marco Neumann	743c1692ea	refactor: stream query results from ingester to querier (#4875 ) * refactor: stream partitions from ingester Ref #4849. * refactor: do not collect record batched on the ingester side Ref #4849. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-16 12:58:50 +00:00
Andrew Lamb	d67336fd69	fix(ingester): ensure all ingester metrics are prefixed with `ingester_` (#4871 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-16 12:52:35 +00:00
Andrew Lamb	74f4006580	fix(ingester): make ingester metrics start with `ingester` (#4870 ) * fix(ingester): make ingester metrics start with `ingester` * fix: Update ingester/src/stream_handler/handler.rs Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-16 12:46:37 +00:00
Andrew Lamb	8c56909218	fix(ingester): Distinguish between "not found" and other flight errors (#4874 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-16 12:39:37 +00:00
Marco Neumann	827d869658	feat: instrument semaphore "cancelled while pending" requests (#4876 ) This is useful to see how many requests timed out while waiting for a semaphore. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-16 12:33:39 +00:00
Marco Neumann	4b945493be	test: test gRPC and stream flattening (#4873 ) Ref #4849. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-16 11:44:59 +00:00
dependabot[bot]	73e1b3a47e	chore(deps): Bump clap from 3.2.4 to 3.2.5 (#4872 ) Bumps [clap](https://github.com/clap-rs/clap) from 3.2.4 to 3.2.5. - [Release notes](https://github.com/clap-rs/clap/releases) - [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md) - [Commits](https://github.com/clap-rs/clap/compare/v3.2.4...v3.2.5) --- updated-dependencies: - dependency-name: clap dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-06-16 09:27:04 +00:00
Marco Neumann	66c7d95312	refactor: use new ingester<>querier wire protocol (#4867 ) * refactor: use new ingester<>querier wire protocol Use and document the new and more flexible ingester<>querier wire protocol. Note that the ingester does NOT stream the response data yet, but the internal data structures would allow that. A follow-up change will adjust the ingester code to stream the data. Ref #4849. * fix: typos Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> * refactor: clarify naming and public interface * test: add schema assertion to `ingester_response_to_record_batches` Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>	2022-06-16 08:02:28 +00:00
Andrew Lamb	6b771375bf	feat: log when partitions are written due to going over size (#4868 )	2022-06-15 20:12:43 +00:00
kodiakhq[bot]	bf39649d64	Merge pull request #4835 from influxdata/cn/talk-to-ingesters-less feat: Make a sharder available in the querier	2022-06-15 17:48:40 +00:00
kodiakhq[bot]	fa9a094068	Merge branch 'main' into cn/talk-to-ingesters-less	2022-06-15 17:42:40 +00:00
Dom	9a774abc70	Merge pull request #4864 from influxdata/dom/partition-key-dml refactor: add partition key to DmlWrite	2022-06-15 17:44:56 +01:00
Carol (Nichols || Goulding)	8331cb1afe	fix: Add retry to querying of catalog for sequencers in querier startup	2022-06-15 12:09:42 -04:00
Carol (Nichols || Goulding)	03f6f59a9b	fix: Change the sharder to return error instead of panicking for no shards	2022-06-15 11:23:31 -04:00
Dom Dwyer	4df2964566	refactor: store PartitionKey in DmlWrite Carry the PartitionKey in the DmlWrite, allowing the batch to be associated with a specific partition key.	2022-06-15 15:48:54 +01:00
Dom Dwyer	0da8ec87d5	refactor: always generate a partition key Changes the partitioner to always generate a partition key, even if the column being used to partition doesn't exist. This doesn't functionally change the batch partitioning output, but ensures we always have a non-empty string for the partition key.	2022-06-15 15:38:02 +01:00
Dom Dwyer	61182f506b	refactor: emit PartitionKey from partitioner Changes the partitioning code to emit a PartitionKey, instead of a bare String.	2022-06-15 15:38:02 +01:00
Marco Neumann	7c60edd38c	refactor: prepare new ingester<>querier protocol on the querier side (#4863 ) * refactor: prepare new ingester<>querier protocol on the querier side This changes the querier internals to work with the new protocol. The wire protocol stays the same (for now). There's a (somewhat hackish) adapter in place on the querier side that converts the old to the new protocol on-the-fly. This is an intermediate step before we actually change the wire protocol (and in a step after that also take advantage of the new possibilities on the ingester side). Ref #4849. * docs: explain adapter	2022-06-15 14:32:24 +00:00
Carol (Nichols || Goulding)	e9cdaffe74	fix: Create querier sharder from catalog sequencer info Panic if there are no sharders in the catalog.	2022-06-15 10:18:54 -04:00
Carol (Nichols || Goulding)	553590fb23	fix: Move deps to dev-deps	2022-06-15 10:01:45 -04:00
Carol (Nichols || Goulding)	874ef89daa	feat: Make specifying the write buffer, and thus getting a sharder, optional in querier	2022-06-15 10:01:45 -04:00
Carol (Nichols || Goulding)	127467b5c4	feat: Create a sharder in the querier	2022-06-15 10:01:45 -04:00

8274 Commits (aff5e6d69a839e99eb553bad04152acb57998e2c)