8510 Commits (d7838e357f7c1a5135ddf0e9d9d83dd037b4a466)

Author SHA1 Message Date
dependabot[bot] 2ddc78bc41
chore(deps): Bump uuid from 1.0.0 to 1.1.2 (#4904)
Bumps [uuid](https://github.com/uuid-rs/uuid) from 1.0.0 to 1.1.2.
- [Release notes](https://github.com/uuid-rs/uuid/releases)
- [Commits](https://github.com/uuid-rs/uuid/compare/1.0.0...1.1.2)

---
updated-dependencies:
- dependency-name: uuid
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-06-20 08:34:33 +00:00
dependabot[bot] 8b28721627
chore(deps): Bump tower from 0.4.12 to 0.4.13 (#4903)
Bumps [tower](https://github.com/tower-rs/tower) from 0.4.12 to 0.4.13.
- [Release notes](https://github.com/tower-rs/tower/releases)
- [Commits](https://github.com/tower-rs/tower/compare/tower-0.4.12...tower-0.4.13)

---
updated-dependencies:
- dependency-name: tower
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-06-20 08:25:29 +00:00
kodiakhq[bot] bf6fec2447
Merge pull request #4898 from influxdata/dom/timestamps
test: assert backfill partitioning in router
2022-06-17 13:51:14 +00:00
kodiakhq[bot] 5786245698
Merge branch 'main' into dom/timestamps 2022-06-17 13:45:30 +00:00
Dom a8c9638e89
Merge branch 'main' into dom/timestamps 2022-06-17 14:45:27 +01:00
Nga Tran 72c8cfa6ed
fix: make ChunkOrder i64 data type to accept min sequence number 0 and match with data type of sequence number (#4888)
* fix: make ChunkOrder u64 data type to accept min sequence number 0

* fix: make ChunkOrder i64 to match with sequence number type

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-06-17 13:45:17 +00:00
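The ChunkOrder change in 72c8cfa6ed can be pictured with a small newtype. This is only a sketch under assumptions: the commit message confirms that `ChunkOrder` now wraps an `i64` and accepts 0, but the `SequenceNumber` wrapper, the `new`/`get` methods and the `From` impl below are hypothetical names chosen for illustration, not the crate's actual API.

```rust
/// Hypothetical sequence number wrapper; the commit only states that
/// sequence numbers are i64 values.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct SequenceNumber(pub i64);

/// Sketch of a chunk order backed by an i64, so that the minimum
/// sequence number 0 is representable and no lossy cast is needed.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct ChunkOrder(i64);

impl ChunkOrder {
    /// Wrap a raw order value (hypothetical constructor).
    pub fn new(order: i64) -> Self {
        Self(order)
    }

    /// Return the raw order value.
    pub fn get(&self) -> i64 {
        self.0
    }
}

impl From<SequenceNumber> for ChunkOrder {
    /// Same underlying type on both sides, so the conversion cannot fail.
    fn from(seq: SequenceNumber) -> Self {
        Self(seq.0)
    }
}
```

With matching types, `ChunkOrder::from(SequenceNumber(0))` is a plain copy of the value, and the derived ordering compares chunk orders the same way as the underlying sequence numbers.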
Dom Dwyer 2cc2ad6887 test: assert backfill partitioning in router
Assert writes in the router that cover multiple partitions are correctly
split up.
2022-06-17 14:40:53 +01:00
Marco Neumann 0fbff981ec
chore(deps): Bump sqlx to 0.6.0 and uuid to 1 (#4894)
Closes #4889.
Closes #4890.

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-06-17 10:28:28 +00:00
dependabot[bot] d9ab157797
chore(deps): Bump indexmap from 1.8.2 to 1.9.0 (#4891)
* chore(deps): Bump indexmap from 1.8.2 to 1.9.0

Bumps [indexmap](https://github.com/bluss/indexmap) from 1.8.2 to 1.9.0.
- [Release notes](https://github.com/bluss/indexmap/releases)
- [Changelog](https://github.com/bluss/indexmap/blob/master/RELEASES.md)
- [Commits](https://github.com/bluss/indexmap/compare/1.8.2...1.9.0)

---
updated-dependencies:
- dependency-name: indexmap
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore: Run cargo hakari tasks

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2022-06-17 07:42:36 +00:00
Nga Tran 3ca74744bf
chore: debug info about sequence number while it gets converted into ChunkOrder (#4884) 2022-06-16 18:40:55 +00:00
Nga Tran d57b0eb1fa
chore: more info for i64-to-u128 panic message (#4881)
* chore: more info for i64-to-u128 panic message

* chore: Apply suggestions from code review

Co-authored-by: Dom <dom@itsallbroken.com>

* chore: fix fmt

Co-authored-by: Dom <dom@itsallbroken.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-06-16 15:49:43 +00:00
Marco Neumann c6bffac5d3
refactor: make querier->ingester request metrics per-ingester (#4879)
The metrics and logs introduced in #4806 are emitted once for all
ingesters instead of per request. The accumulated view makes it pretty
hard to judge the actual request-response timings and the number of
requests.

Instead we now measure the data per request.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-06-16 15:09:47 +00:00
Dom 9122850fe5
Merge pull request #4878 from influxdata/dom/fix-dml-aggregation
fix: respect partition key when batching dml ops
2022-06-16 16:00:09 +01:00
Dom Dwyer 43b3f22411 fix: respect partition key when batching dml ops
This commit changes the kafka write aggregator to only merge DML ops
destined for the same partition.

Prior to this commit the aggregator was merging DML ops that had
different partition keys, causing data to be persisted in incorrect
partitions:

    https://github.com/influxdata/influxdb_iox/issues/4787
2022-06-16 14:05:32 +01:00
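The behaviour described in 43b3f22411 (merge only DML ops destined for the same partition) can be sketched roughly as follows. The `Write` struct and the `aggregate` function are hypothetical stand-ins for illustration; the real aggregator and its types are not shown in this log.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a DML write: a partition key plus some rows.
#[derive(Debug, Clone, PartialEq)]
pub struct Write {
    pub partition_key: String,
    pub rows: Vec<String>,
}

/// Merge writes, but only those that share a partition key.
/// Writes with different keys are never combined, so rows cannot end up
/// in the wrong partition.
pub fn aggregate(writes: Vec<Write>) -> Vec<Write> {
    // Group rows by partition key; BTreeMap keeps the output order stable.
    let mut merged: BTreeMap<String, Vec<String>> = BTreeMap::new();
    for w in writes {
        merged.entry(w.partition_key).or_default().extend(w.rows);
    }
    merged
        .into_iter()
        .map(|(partition_key, rows)| Write { partition_key, rows })
        .collect()
}
```

Three writes with the keys `2022-06-16`, `2022-06-15` and `2022-06-16` would come out as two merged writes, one per key, with row order preserved within each key.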
Marco Neumann 743c1692ea
refactor: stream query results from ingester to querier (#4875)
* refactor: stream partitions from ingester

Ref #4849.

* refactor: do not collect record batches on the ingester side

Ref #4849.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-06-16 12:58:50 +00:00
Andrew Lamb d67336fd69
fix(ingester): ensure all ingester metrics are prefixed with `ingester_` (#4871)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-06-16 12:52:35 +00:00
Andrew Lamb 74f4006580
fix(ingester): make ingester metrics start with `ingester` (#4870)
* fix(ingester): make ingester metrics start with `ingester`

* fix: Update ingester/src/stream_handler/handler.rs

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-06-16 12:46:37 +00:00
Andrew Lamb 8c56909218
fix(ingester): Distinguish between "not found" and other flight errors (#4874)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-06-16 12:39:37 +00:00
Marco Neumann 827d869658
feat: instrument semaphore "cancelled while pending" requests (#4876)
This is useful to see how many requests timed out while waiting for a
semaphore.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-06-16 12:33:39 +00:00
Marco Neumann 4b945493be
test: test gRPC and stream flattening (#4873)
Ref #4849.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-06-16 11:44:59 +00:00
dependabot[bot] 73e1b3a47e
chore(deps): Bump clap from 3.2.4 to 3.2.5 (#4872)
Bumps [clap](https://github.com/clap-rs/clap) from 3.2.4 to 3.2.5.
- [Release notes](https://github.com/clap-rs/clap/releases)
- [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md)
- [Commits](https://github.com/clap-rs/clap/compare/v3.2.4...v3.2.5)

---
updated-dependencies:
- dependency-name: clap
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-06-16 09:27:04 +00:00
Marco Neumann 66c7d95312
refactor: use new ingester<>querier wire protocol (#4867)
* refactor: use new ingester<>querier wire protocol

Use and document the new and more flexible ingester<>querier wire
protocol.

Note that the ingester does NOT stream the response data yet, but the
internal data structures would allow that. A follow-up change will
adjust the ingester code to stream the data.

Ref #4849.

* fix: typos

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* refactor: clarify naming and public interface

* test: add schema assertion to `ingester_response_to_record_batches`

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2022-06-16 08:02:28 +00:00
Andrew Lamb 6b771375bf
feat: log when partitions are written due to going over size (#4868) 2022-06-15 20:12:43 +00:00
kodiakhq[bot] bf39649d64
Merge pull request #4835 from influxdata/cn/talk-to-ingesters-less
feat: Make a sharder available in the querier
2022-06-15 17:48:40 +00:00
kodiakhq[bot] fa9a094068
Merge branch 'main' into cn/talk-to-ingesters-less 2022-06-15 17:42:40 +00:00
Dom 9a774abc70
Merge pull request #4864 from influxdata/dom/partition-key-dml
refactor: add partition key to DmlWrite
2022-06-15 17:44:56 +01:00
Carol (Nichols || Goulding) 8331cb1afe
fix: Add retry to querying of catalog for sequencers in querier startup 2022-06-15 12:09:42 -04:00
Carol (Nichols || Goulding) 03f6f59a9b
fix: Change the sharder to return error instead of panicking for no shards 2022-06-15 11:23:31 -04:00
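The idea behind 03f6f59a9b, returning an error rather than panicking when no shards exist, might look something like the sketch below. `ShardError`, `shard_for` and the hash-modulo mapping are all assumptions made for illustration; the log does not show the sharder's real interface or hashing scheme.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical error type for the sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum ShardError {
    /// The sharder was asked to map a key but has no shards configured.
    NoShards,
}

/// Map a table name onto one of `shards`, returning an error instead of
/// panicking when the slice is empty.
/// Simple hash-modulo is used here purely as a placeholder strategy.
pub fn shard_for<'a, T>(shards: &'a [T], table: &str) -> Result<&'a T, ShardError> {
    if shards.is_empty() {
        return Err(ShardError::NoShards);
    }
    let mut hasher = DefaultHasher::new();
    table.hash(&mut hasher);
    let idx = (hasher.finish() % shards.len() as u64) as usize;
    Ok(&shards[idx])
}
```

Callers such as a server startup routine can then decide whether an empty shard list is fatal or worth retrying, instead of the process aborting inside the sharder.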
Dom Dwyer 4df2964566 refactor: store PartitionKey in DmlWrite
Carry the PartitionKey in the DmlWrite, allowing the batch to be
associated with a specific partition key.
2022-06-15 15:48:54 +01:00
Dom Dwyer 0da8ec87d5 refactor: always generate a partition key
Changes the partitioner to always generate a partition key, even if the
column being used to partition doesn't exist. This doesn't functionally
change the batch partitioning output, but ensures we always have a
non-empty string for the partition key.
2022-06-15 15:38:02 +01:00
Dom Dwyer 61182f506b refactor: emit PartitionKey from partitioner
Changes the partitioning code to emit a PartitionKey, instead of a bare
String.
2022-06-15 15:38:02 +01:00
Marco Neumann 7c60edd38c
refactor: prepare new ingester<>querier protocol on the querier side (#4863)
* refactor: prepare new ingester<>querier protocol on the querier side

This changes the querier internals to work with the new protocol. The
wire protocol stays the same (for now). There's a (somewhat hackish)
adapter in place on the querier side that converts the old to the new
protocol on-the-fly. This is an intermediate step before we actually
change the wire protocol (and in a step after that also take advantage
of the new possibilities on the ingester side).

Ref #4849.

* docs: explain adapter
2022-06-15 14:32:24 +00:00
Carol (Nichols || Goulding) e9cdaffe74
fix: Create querier sharder from catalog sequencer info
Panic if there are no sharders in the catalog.
2022-06-15 10:18:54 -04:00
Carol (Nichols || Goulding) 553590fb23
fix: Move deps to dev-deps 2022-06-15 10:01:45 -04:00
Carol (Nichols || Goulding) 874ef89daa
feat: Make specifying the write buffer, and thus getting a sharder, optional in querier 2022-06-15 10:01:45 -04:00
Carol (Nichols || Goulding) 127467b5c4
feat: Create a sharder in the querier 2022-06-15 10:01:45 -04:00
Carol (Nichols || Goulding) 148bc57e7b
refactor: Make the querier server constructor more like other server constructors 2022-06-15 10:01:45 -04:00
Carol (Nichols || Goulding) 6417e7dc2a
feat: Extract sharder to its own crate 2022-06-15 10:01:45 -04:00
Marco Neumann 3bd24b67ba
feat: extend flight client to accept multiple (changing) schemas (#4853)
* feat: extend flight client to accept multiple (changing) schemas

See #4849.

Originally I intended not to use Flight at all for the new
ingester<>querier protocol. However since flight also deals with
dictionary batches and multiple batches and the gRPC protocol that I
would write would look very similar, I will use Flight with a bit more
flexible message types.

The rough idea for the protocol is the following stream:

- for each partition:
  1. "none" message with partition metadata
  2. for each chunk (can have different schemas under certain
     circumstances):
     1. "schema" message (resets dictionary state)
     2. (optional) dictionary batch messages
     3. one or more "record batch" messages

The nice thing about it is that the same arrow client works also for the
existing client<>querier protocol since there we just send:

1. "schema" message (no app metadata)
2. (optional) dictionary batch messages
3. zero, one or more "record batch" messages (no app metadata)

* refactor: separate high- and low-level flight client

It is very unlikely that a user will use the high-level batch-producing
functionality and the low-level stuff within the same session. So let's
split this into two clients (high-level uses the low-level one
internally) to avoid confusion.

Also add documentation on our protocol handling.

* refactor: enumerate all variants in match statement to better catch errors in the future
2022-06-15 11:38:08 +00:00
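The ingester<>querier stream layout outlined in 3bd24b67ba can be expressed as a small state machine that checks message ordering. This is a sketch, not the project's client code: the `Msg` enum, the `State` enum and `is_valid_stream` are hypothetical names. It also assumes that dictionary batches only appear before the first record batch of a chunk and that a partition may contain zero chunks, neither of which the commit message states explicitly.

```rust
/// Hypothetical message kinds, named after the stream description.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Msg {
    /// "none" message carrying partition metadata.
    PartitionMeta,
    /// "schema" message; starts a chunk and resets dictionary state.
    Schema,
    /// Optional dictionary batch for the current chunk.
    DictionaryBatch,
    /// Record batch belonging to the current chunk.
    RecordBatch,
}

#[derive(Clone, Copy, PartialEq, Eq)]
enum State {
    /// Nothing received yet.
    Start,
    /// Partition metadata seen, no open chunk.
    InPartition,
    /// Schema seen, no record batch yet.
    InChunk,
    /// Schema seen and at least one record batch received.
    InChunkWithData,
}

/// Check that `msgs` follows the per-partition, per-chunk ordering.
pub fn is_valid_stream(msgs: &[Msg]) -> bool {
    let mut state = State::Start;
    for m in msgs {
        state = match (state, *m) {
            // A new partition may follow the start, an empty partition,
            // or a chunk that has delivered at least one record batch.
            (State::Start | State::InPartition | State::InChunkWithData, Msg::PartitionMeta) => {
                State::InPartition
            }
            // A new chunk needs an enclosing partition.
            (State::InPartition | State::InChunkWithData, Msg::Schema) => State::InChunk,
            // Dictionaries are only accepted before the first record batch.
            (State::InChunk, Msg::DictionaryBatch) => State::InChunk,
            (State::InChunk | State::InChunkWithData, Msg::RecordBatch) => State::InChunkWithData,
            _ => return false,
        };
    }
    // A chunk must contain one or more record batches before the stream ends.
    matches!(
        state,
        State::Start | State::InPartition | State::InChunkWithData
    )
}
```

A stream that opens with a "schema" message and no partition metadata is rejected by this checker, which is the main difference from the simpler client<>querier sequence the commit also mentions.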
Andrew Lamb 005610b172
refactor: remove some `&` use in iox_catalog (#4862)
* refactor: remove some `&` use in iox_catalog

* fix: Update data_types/src/lib.rs
2022-06-15 11:31:49 +00:00
Andrew Lamb 394c84f3e8
chore: Update CI checks to verify data generator build (#4857)
* chore: Update CI checks to verify data generator build

* fix: bench verify test

* docs: Update .circleci/config.yml

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-06-15 10:31:14 +00:00
Andrew Lamb 164e75f328
refactor: Remove unused `Option` (#4839)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-06-15 10:24:51 +00:00
dependabot[bot] 232dc897df
chore(deps): Bump clap from 3.2.1 to 3.2.4 (#4860)
Bumps [clap](https://github.com/clap-rs/clap) from 3.2.1 to 3.2.4.
- [Release notes](https://github.com/clap-rs/clap/releases)
- [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md)
- [Commits](https://github.com/clap-rs/clap/compare/clap_complete-v3.2.1...v3.2.4)

---
updated-dependencies:
- dependency-name: clap
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-06-15 07:28:53 +00:00
Nga Tran b682dbbc2e
chore: Add debug info of sort_key for ingester (#4859)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-06-14 20:39:17 +00:00
Andrew Lamb 7eed3ba0b7
fix: fix feature flags for iox_data_generator build (#4858) 2022-06-14 19:43:22 +00:00
Andrew Lamb c8f70b8933
feat: log query from querier to ingester at `info` level (#4856)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-06-14 18:35:50 +00:00
Andrew Lamb eca3b6b9a1
fix: reduce memory usage in ingester with less buffering prior to query engine (#4830)
* refactor: remove another buffer copy in ingester

* docs: Update arrow_util/src/util.rs

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-06-14 18:22:55 +00:00
Carol (Nichols || Goulding) e875a92cf8
feat: Log time spent requesting ingester partitions (#4806)
* feat: Log time spent requesting ingester partitions

Fixes #4558.

* feat: Record a metric for the duration queriers wait on ingesters

* fix: Use DurationHistogram instead of U64 Histogram

* test: Add a test for the ingester ms metric

* feat: Add back the logging to provide both logging and metrics for ingester duration

* refactor: Use sample_count method on metrics

* feat: Record ingester duration separately for success or failure

* fix: Create a separate test for the ingester metrics

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-06-14 17:58:19 +00:00
Andrew Lamb 7d2a5c299f
refactor: remove one buffer copy in the ingester (#4855)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-06-14 17:15:36 +00:00
Andrew Lamb e91d00b10c
chore: Update datafusion + `arrow`/`parquet`/`arrow-flight` to `16.0.0` (#4851)
* chore: TEMP Update DataFusion to pre-release

* chore: update arrow et al to 16.0.0

* chore: Run cargo hakari tasks

* fix: update reader read_dictionary API

* chore: Update to real Datafusion release

* fix: Update parquet API

* fix: update test

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2022-06-14 16:31:40 +00:00