influxdb

Commit Graph

Author	SHA1	Message	Date
Andrew Lamb	776c34e03d	chore: Update datafusion (#4927 ) * chore: Update datafusion * fix: Update for API changes Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-23 09:30:43 +00:00
Andrew Lamb	47def89670	docs: Update tracing.md for NG (#4916 ) tracing instructions referred to OG -- update for NG Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-23 09:24:38 +00:00
Marco Neumann	463d430d43	refactor: do not fetch parquet MD from catalog in querier (#4926 ) Ref #4124	2022-06-23 09:03:19 +00:00
Marco Neumann	4b7d02fad1	feat: do not rely on encoded parquet metadata for RB chunks (#4924 ) * fix: use proper sort key in tests * feat: do not rely on encoded parquet metadata for RB chunks Ref #4124. * refactor: allocate less strings * refactor: use upstream PK calculation * fix: cache expiration w/o a good reason * refactor: make namespace cache safer to use * refactor: make partition cache safer to use	2022-06-23 08:55:52 +00:00
Marco Neumann	c899c3a0f4	fix: column handling when reading parquet files (#4921 ) * fix: column handling when reading parquet files This improves/fixes/tests a few aspects when reading parquet files: - fix usage of `Selection::Some(...)`. This was broken since #4912 but apparently no test caught that. - ensure that the order of `Selection::Some(...)` is preserved - ensure that schema metadata is attached to output batches - ignore parquet columns that we don't care about (i.e. do not select) - allow parquet file to have a different column order than our internal bookkeeping, this makes it way simpler to read parquet files w/o scanning the metadata first - extend the test coverage Ref #4124. * test: even more tests for parquet reader	2022-06-22 13:51:30 +00:00
Marco Neumann	0534b80886	fix: `ParquetFile::size` must include column set (#4925 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-22 13:06:02 +00:00
Marco Neumann	9591bed696	refactor: make querier internals private (#4922 ) Queries internals are not meant to be used by other crates. Only a handful selected interfaces should be used by IOxD and the query tests. The compactor only used a very small subset just to read parquet files back into memory. It shall rather use the official `parquet_file` interface instead. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-22 13:00:08 +00:00
Marco Neumann	751bdce88a	fix: pass write buffer tests w/o Kafka (#4923 ) Fixes interaction of `maybe_skip_kafka_integration!` and `should_panic` by ensuring that `maybe_skip_kafka_integration!` panics to skip `should_panic` tests. Without that it is not possible to just run `cargo test -p write_buffer`. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-22 10:41:40 +00:00
dependabot[bot]	f7d83ea581	chore(deps): Bump clap from 3.2.5 to 3.2.6 (#4920 ) Bumps [clap](https://github.com/clap-rs/clap) from 3.2.5 to 3.2.6. - [Release notes](https://github.com/clap-rs/clap/releases) - [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md) - [Commits](https://github.com/clap-rs/clap/compare/v3.2.5...v3.2.6) --- updated-dependencies: - dependency-name: clap dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-06-22 10:28:44 +00:00
dependabot[bot]	50ffca791a	chore(deps): Bump indexmap from 1.9.0 to 1.9.1 (#4919 ) Bumps [indexmap](https://github.com/bluss/indexmap) from 1.9.0 to 1.9.1. - [Release notes](https://github.com/bluss/indexmap/releases) - [Changelog](https://github.com/bluss/indexmap/blob/master/RELEASES.md) - [Commits](https://github.com/bluss/indexmap/compare/1.9.0...1.9.1) --- updated-dependencies: - dependency-name: indexmap dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-22 10:18:06 +00:00
kodiakhq[bot]	95bd8b41e4	Merge pull request #4914 from influxdata/dom/ingester-uses-partition-key refactor: use propagated partition key in ingester	2022-06-22 09:53:44 +00:00
kodiakhq[bot]	aff5e6d69a	Merge branch 'main' into dom/ingester-uses-partition-key	2022-06-22 09:47:49 +00:00
Andrew Lamb	087dbd3eca	fix: fix heappy + update docs (#4917 ) * docs: Update heap profiling documentation * fix: fix heappy builds * fix: do not run cli tests with heappy Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-21 19:53:28 +00:00
Marco Neumann	59accfe862	refactor: assorted fixes and prep work for #4124 (#4912 ) * refactor: `TestPartition::update_sort_key` should return an `Arc` The whole test framework is built around `Arc`s, so let's fix this consistency issue. * fix: actually calculate correct column set in test framework * feat: check expected parquet file schema While working on the querier I made some mistakes regarding schemas and such a check would have greatly improved the debugging experience. * feat: namespace cache expiration * fix: improve parquet schema check * fix: remove clone	2022-06-21 16:08:28 +00:00
Dom Dwyer	75a3fd5e1e	refactor: use propagated partition key in ingester Changes the ingester to use the partition key derived in the router, and transmitted over through the kafka API boundary. This should have no observable behavioural change, but be more resilient as we're no longer assuming the partitioning algorithm produces the same value in both the router (where data is partitioned) and the ingester (where data is persisted, segregated by partition key). This is a pre-requisite to allowing the user to specify partitioning schemes.	2022-06-21 15:57:30 +01:00
Marco Neumann	70337087a8	refactor: do not require parquet metadata for RB cache (#4911 ) * test: add `TestParquetFile::schema` * refactor: do not require parquet metadata for RB cache Ref #4124.	2022-06-21 12:59:23 +00:00
Marco Neumann	db24838221	refactor: remove table name from read buffer (#4910 ) The low-level chunk storage shouldn't care about the table name (this is also true for parquet chunks btw). In fact, the table name is already only a partial information since it misses the namespace. If we need a table name, then the high-level chunk/data management is responsible for that. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-21 11:57:28 +00:00
Marco Neumann	0f63be26c3	refactor: pass path instead of metadata around to load parquet files (#4909 )	2022-06-21 10:57:10 +00:00
Marco Neumann	c3912e34e9	refactor: store per-file column set in catalog (#4908 ) * refactor: store per-file column set in catalog Together with the table-wide schema and the partition-wide sort key, this should be everything we need to read a parquet file directly into memory without peeking any file-level metadata. The querier will use this to directly load parquet files into the read buffer. WARNING: This requires a catalog wipe! Ref #4124. * refactor: use proper `ColumnSet` type	2022-06-21 10:26:12 +00:00
Dom	4df710a205	Merge pull request #4907 from influxdata/dom/propagate-partition-key feat: propagate partition key through kafka	2022-06-20 14:07:50 +01:00
Dom Dwyer	c1f7154031	feat: propagate partition key through kafka Changes the kafka message wire format to include the partition key for serialised DML writes on the wire. After this commit, the kafka messages will contain the partition key for each op, but this information will go unused in the ingester - this enables us to roll out the producer side, before making the value's presence necessary on the consumer side. A follow-up PR will change the ingester to utilise this embedded partition key. This has the unfortunate side effect of making the partition key part of the public gRPC write API: https://github.com/influxdata/influxdb_iox/issues/4866	2022-06-20 13:42:51 +01:00
Marco Neumann	1962fcc229	chore: reduce dependencies and run `cargo update` (#4906 ) * chore: reduce proptest features * chore: remove `grpc-router` This crate is currently unused and we don't have immediate plans to use it. And there's GIT, so it can always be restored. * chore: `cargo update`	2022-06-20 12:18:28 +00:00
Marco Neumann	730f85a619	refactor(querier): split ingester partitions into chunks (#4893 ) * refactor(querier): split ingester partitions into chunks With the new wire protocol the ingester can now transmit multiple snapshots per partition with different schemas. This changes the querier to reflect this and and splits uses the individual snapshots as chunks for the query engine instead of a single partition. The schema handling was changed so that instead of a table-wide schema enforcement, we now use the snapshot-specific projections. This means we do not need to create all-NULL columns any longer because the batches within the chunks now always have the correct schema. * refactor: "disassembler" -> "decoder"	2022-06-20 08:58:58 +00:00
dependabot[bot]	cc324613cc	chore(deps): Bump syn from 1.0.96 to 1.0.98 (#4905 ) Bumps [syn](https://github.com/dtolnay/syn) from 1.0.96 to 1.0.98. - [Release notes](https://github.com/dtolnay/syn/releases) - [Commits](https://github.com/dtolnay/syn/compare/1.0.96...1.0.98) --- updated-dependencies: - dependency-name: syn dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-20 08:49:02 +00:00
Andrew Lamb	f151b1e89f	fix: categorize `NamespaceNotFound` as ingester not found errors as well (#4899 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-20 08:40:31 +00:00
dependabot[bot]	2ddc78bc41	chore(deps): Bump uuid from 1.0.0 to 1.1.2 (#4904 ) Bumps [uuid](https://github.com/uuid-rs/uuid) from 1.0.0 to 1.1.2. - [Release notes](https://github.com/uuid-rs/uuid/releases) - [Commits](https://github.com/uuid-rs/uuid/compare/1.0.0...1.1.2) --- updated-dependencies: - dependency-name: uuid dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-06-20 08:34:33 +00:00
dependabot[bot]	8b28721627	chore(deps): Bump tower from 0.4.12 to 0.4.13 (#4903 ) Bumps [tower](https://github.com/tower-rs/tower) from 0.4.12 to 0.4.13. - [Release notes](https://github.com/tower-rs/tower/releases) - [Commits](https://github.com/tower-rs/tower/compare/tower-0.4.12...tower-0.4.13) --- updated-dependencies: - dependency-name: tower dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-06-20 08:25:29 +00:00
kodiakhq[bot]	bf6fec2447	Merge pull request #4898 from influxdata/dom/timestamps test: assert backfill partitioning in router	2022-06-17 13:51:14 +00:00
kodiakhq[bot]	5786245698	Merge branch 'main' into dom/timestamps	2022-06-17 13:45:30 +00:00
Dom	a8c9638e89	Merge branch 'main' into dom/timestamps	2022-06-17 14:45:27 +01:00
Nga Tran	72c8cfa6ed	fix: make ChunkOrder i64 data type to accept min sequence number 0 and match with data type of sequence number (#4888 ) * fix: make ChunkOrder u64 data type to accept min sequence number 0 * fix: make ChunkOrder i64 to match with sequence number type Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-17 13:45:17 +00:00
Dom Dwyer	2cc2ad6887	test: assert backfill partitioning in router Assert writes in the router that cover multiple partitions are correctly split up.	2022-06-17 14:40:53 +01:00
Marco Neumann	0fbff981ec	chore(deps): Bump sqlx to 0.6.0 and uuid to 1 (#4894 ) Closes #4889. Closes #4890. Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-06-17 10:28:28 +00:00
dependabot[bot]	d9ab157797	chore(deps): Bump indexmap from 1.8.2 to 1.9.0 (#4891 ) * chore(deps): Bump indexmap from 1.8.2 to 1.9.0 Bumps [indexmap](https://github.com/bluss/indexmap) from 1.8.2 to 1.9.0. - [Release notes](https://github.com/bluss/indexmap/releases) - [Changelog](https://github.com/bluss/indexmap/blob/master/RELEASES.md) - [Commits](https://github.com/bluss/indexmap/compare/1.8.2...1.9.0) --- updated-dependencies: - dependency-name: indexmap dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> * chore: Run cargo hakari tasks Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: CircleCI[bot] <circleci@influxdata.com>	2022-06-17 07:42:36 +00:00
Nga Tran	3ca74744bf	chore: debug info about sequence number while it gets converted into ChunkOrder (#4884 )	2022-06-16 18:40:55 +00:00
Nga Tran	d57b0eb1fa	chore: more info for i64-to-u128 panic message (#4881 ) * chore: more info for i64-to-u128 panic message * chore: Apply suggestions from code review Co-authored-by: Dom <dom@itsallbroken.com> * chore: fix fmt Co-authored-by: Dom <dom@itsallbroken.com> Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-16 15:49:43 +00:00
Marco Neumann	c6bffac5d3	refactor: make querier->ingester request metrics per-ingester (#4879 ) The metrics and logs introduced in #4806 will be emitted once for all ingesters instead of per request. The accumulated view makes it pretty hard to judge the actual request-response timings and the number of requests. Instead we now measure the data per request. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-16 15:09:47 +00:00
Dom	9122850fe5	Merge pull request #4878 from influxdata/dom/fix-dml-aggregation fix: respect partition key when batching dml ops	2022-06-16 16:00:09 +01:00
Dom Dwyer	43b3f22411	fix: respect partition key when batching dml ops This commit changes the kafka write aggregator to only merge DML ops destined for the same partition. Prior to this commit the aggregator was merging DML ops that had different partition keys, causing data to be persisted in incorrect partitions: https://github.com/influxdata/influxdb_iox/issues/4787	2022-06-16 14:05:32 +01:00
Marco Neumann	743c1692ea	refactor: stream query results from ingester to querier (#4875 ) * refactor: stream partitions from ingester Ref #4849. * refactor: do not collect record batched on the ingester side Ref #4849. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-16 12:58:50 +00:00
Andrew Lamb	d67336fd69	fix(ingester): ensure all ingester metrics are prefixed with `ingester_` (#4871 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-16 12:52:35 +00:00
Andrew Lamb	74f4006580	fix(ingester): make ingester metrics start with `ingester` (#4870 ) * fix(ingester): make ingester metrics start with `ingester` * fix: Update ingester/src/stream_handler/handler.rs Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-16 12:46:37 +00:00
Andrew Lamb	8c56909218	fix(ingester): Distinguish between "not found" and other flight errors (#4874 ) Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-16 12:39:37 +00:00
Marco Neumann	827d869658	feat: instrument semaphore "cancelled while pending" requests (#4876 ) This is useful to see how many requests timed out while waiting for a semaphore. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-16 12:33:39 +00:00
Marco Neumann	4b945493be	test: test gRPC and stream flattening (#4873 ) Ref #4849. Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>	2022-06-16 11:44:59 +00:00
dependabot[bot]	73e1b3a47e	chore(deps): Bump clap from 3.2.4 to 3.2.5 (#4872 ) Bumps [clap](https://github.com/clap-rs/clap) from 3.2.4 to 3.2.5. - [Release notes](https://github.com/clap-rs/clap/releases) - [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md) - [Commits](https://github.com/clap-rs/clap/compare/v3.2.4...v3.2.5) --- updated-dependencies: - dependency-name: clap dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2022-06-16 09:27:04 +00:00
Marco Neumann	66c7d95312	refactor: use new ingester<>querier wire protocol (#4867 ) * refactor: use new ingester<>querier wire protocol Use and document the new and more flexible ingester<>querier wire protocol. Note that the ingester does NOT stream the response data yet, but the internal data structures would allow that. A follow-up change will adjust the ingester code to stream the data. Ref #4849. * fix: typos Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> * refactor: clarify naming and public interface * test: add schema assertion to `ingester_response_to_record_batches` Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>	2022-06-16 08:02:28 +00:00
Andrew Lamb	6b771375bf	feat: log when partitions are written due to going over size (#4868 )	2022-06-15 20:12:43 +00:00
kodiakhq[bot]	bf39649d64	Merge pull request #4835 from influxdata/cn/talk-to-ingesters-less feat: Make a sharder available in the querier	2022-06-15 17:48:40 +00:00
kodiakhq[bot]	fa9a094068	Merge branch 'main' into cn/talk-to-ingesters-less	2022-06-15 17:42:40 +00:00

1 2 3 4 5 ...

8285 Commits (776c34e03d36ebcc0d39a9493116675904e3a46e) All Branches Search

8285 Commits (776c34e03d36ebcc0d39a9493116675904e3a46e)

All Branches