Commit Graph

543 Commits (71625043e2b393eecc803e7c30fb3554c7a7881c)

Author SHA1 Message Date
Carol (Nichols || Goulding) 038f8e9ce0
fix: Move shard concepts into only the catalog
This still inserts the shard id into the database, always set to the
TRANSITION_SHARD_ID, but never reads it back out again.
2023-04-26 11:42:32 -04:00
dependabot[bot] bdf7f316d7
chore(deps): Bump tokio from 1.27.0 to 1.28.0 (#7667)
* chore(deps): Bump tokio from 1.27.0 to 1.28.0

Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.27.0 to 1.28.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.27.0...tokio-1.28.0)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore: Run cargo hakari tasks

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: Dom <dom@itsallbroken.com>
2023-04-26 12:53:26 +00:00
dependabot[bot] 0b9240cbbe
chore(deps): Bump tokio-util from 0.7.7 to 0.7.8 (#7665)
Bumps [tokio-util](https://github.com/tokio-rs/tokio) from 0.7.7 to 0.7.8.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-util-0.7.7...tokio-util-0.7.8)

---
updated-dependencies:
- dependency-name: tokio-util
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-04-26 09:24:39 +00:00
Carol (Nichols || Goulding) 8d4c2bfabb
fix: Only ever create the transition shard in the in-memory catalog
Tests that use the in-memory catalog create different shards, which
then produce old-style Parquet file paths, but in production, everything
uses the transition shard now. To make the tests more like production,
only ever create and use the transition shard, and stop checking for
different shard IDs.
2023-04-24 10:08:00 -04:00
Marco Neumann d7dc305972
feat: allow overwriting DataFusion's default config (#7586)
This is helpful for trying out changes to our defaults and also for testing.

Required for https://github.com/influxdata/idpe/issues/17474 .

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-04-18 11:28:45 +00:00
Dom Dwyer c5bb88e173
chore: remove unused dependencies
Some crates import dependencies they never use.
2023-04-18 12:07:13 +02:00
kodiakhq[bot] da96239605
Merge branch 'main' into cn/delete-tombstones 2023-04-17 13:59:49 +00:00
Carol (Nichols || Goulding) 5f2d82fbc6
fix: Remove tombstones from querier; they're unused 2023-04-14 13:20:39 -04:00
Andrew Lamb f46d06d56f
chore: Update DataFusion + arrow ecosystem to 37 (#7544)
* chore: Update datafusion and arrow/parquet to 37, tonic to 0.9.1

* refactor: Update for FieldRef and other API changes

* fix: Update field size calculation

* fix: Use `NullBuffer` directly

* fix: remove outdated comment

* chore: Update test for tonic

* chore: Run cargo hakari tasks

* chore: cargo update

---------

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-04-14 12:43:01 +00:00
Carol (Nichols || Goulding) f0f74bae02
fix: Treat empty ingester info differently than not having ingester info
When pre-warming the catalog cache before the ingester responses have
returned, we don't have any ingester parquet file counts. This is
different than asking the ingesters for the parquet file counts and not
getting any. So keep the Option to be able to treat "not present"
differently from "present but empty".
2023-04-12 14:50:18 -04:00
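The distinction this commit describes ("not present" versus "present but empty") can be illustrated with a small sketch. The type alias `FileCounts`, the enum `CacheAction`, and the function `decide` are hypothetical names chosen for illustration; they are not taken from the querier code.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for per-ingester Parquet file counts:
/// key = ingester identifier, value = number of persisted files.
type FileCounts = HashMap<String, u64>;

/// What a cache could do with a piece of ingester information (illustrative only).
#[derive(Debug, PartialEq)]
enum CacheAction {
    /// No ingester response yet (e.g. during cache pre-warming): nothing to compare.
    SkipComparison,
    /// Ingesters answered; compare this many reported entries (possibly zero).
    Compare(usize),
}

/// Keeping the `Option` lets `None` and `Some(empty map)` take different paths.
fn decide(ingester_counts: Option<&FileCounts>) -> CacheAction {
    match ingester_counts {
        None => CacheAction::SkipComparison,
        Some(counts) => CacheAction::Compare(counts.len()),
    }
}
```

Collapsing the `Option` into an empty map would make the pre-warming case indistinguishable from an ingester that reported no files, which is the situation the commit message warns about.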
Carol (Nichols || Goulding) acf857816e
fix: Remove old querier 2023-04-12 13:18:23 -04:00
Carol (Nichols || Goulding) 6387a9576a
fix: Remove the write_summary crate and write info service 2023-04-12 11:31:23 -04:00
Marco Neumann b29bdf73ab
feat: improve querier->ingester tracing (#7501)
* feat: improve querier->ingester tracing

- add more hierarchy items on the querier side
- ensure that streaming is correctly traced by the querier

* refactor: improve span name

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

* docs: `QueryDataTracer`

---------

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2023-04-11 13:43:42 +00:00
Andrew Lamb 1a80b8073c
fix: Improve span names for query access (#7476)
* fix: Improve span names for query access

* fix: update test

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-04-11 10:34:09 +00:00
Marco Neumann 5f43f2a719
refactor: remove old query planning code (#7449)
Closes #7406.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-04-06 16:05:08 +00:00
Marco Neumann c03a5c7c14
fix: tracing span hierarchy in querier (#7469)
The span for the individual chunk creations should be under
"create individual chunks".
2023-04-06 10:01:39 +00:00
dependabot[bot] 66982f988b
chore(deps): Bump object_store from 0.5.5 to 0.5.6 (#7433)
Bumps [object_store](https://github.com/apache/arrow-rs) from 0.5.5 to 0.5.6.
- [Release notes](https://github.com/apache/arrow-rs/releases)
- [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG-old.md)
- [Commits](https://github.com/apache/arrow-rs/commits)

---
updated-dependencies:
- dependency-name: object_store
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Dom <dom@itsallbroken.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-04-04 08:43:34 +00:00
Marco Neumann f04962d630
feat: new query planning (#7250)
Closes #6098.
2023-04-03 10:31:03 +00:00
dependabot[bot] 4eedb7ea77
chore(deps): Bump async-trait from 0.1.66 to 0.1.68 (#7374)
* chore(deps): Bump async-trait from 0.1.66 to 0.1.68

Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.66 to 0.1.68.
- [Release notes](https://github.com/dtolnay/async-trait/releases)
- [Commits](https://github.com/dtolnay/async-trait/compare/0.1.66...0.1.68)

---
updated-dependencies:
- dependency-name: async-trait
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore: Run cargo hakari tasks

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2023-03-30 10:14:36 +00:00
dependabot[bot] 9cbcdc7672
chore(deps): Bump tokio from 1.26.0 to 1.27.0 (#7373)
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.26.0 to 1.27.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.26.0...tokio-1.27.0)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-03-30 09:36:04 +00:00
Marco Neumann 75dba43ced
test: extend retention policy query test (#7352)
Add an ingester chunk to the parquet chunks.

Helpful for #6098.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-03-29 05:32:22 +00:00
dependabot[bot] 4b888c7255
chore(deps): Bump insta from 1.28.0 to 1.29.0 (#7322)
Bumps [insta](https://github.com/mitsuhiko/insta) from 1.28.0 to 1.29.0.
- [Release notes](https://github.com/mitsuhiko/insta/releases)
- [Changelog](https://github.com/mitsuhiko/insta/blob/master/CHANGELOG.md)
- [Commits](https://github.com/mitsuhiko/insta/compare/1.28.0...1.29.0)

---
updated-dependencies:
- dependency-name: insta
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-03-24 18:25:01 +00:00
Andrew Lamb 5dd71998a1
chore: Update datafusion (#7318)
* chore: Update datafusion

* chore: Update for API change

* chore: Run cargo hakari tasks

---------

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-03-24 15:07:23 +00:00
Marco Neumann 07b7107f9a
feat: sub-traces for `create_chunks` (#7148)
In one prod case the majority of the time was NOT spent on creating the
child chunks. I suspect that the summary creation and the string cloning
involved in there are quite slow. So let's have slightly more detailed
tracing and see.
2023-03-07 15:06:37 +00:00
dependabot[bot] 8f3a9396d0
chore(deps): Bump async-trait from 0.1.64 to 0.1.66 (#7129)
Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.64 to 0.1.66.
- [Release notes](https://github.com/dtolnay/async-trait/releases)
- [Commits](https://github.com/dtolnay/async-trait/compare/0.1.64...0.1.66)

---
updated-dependencies:
- dependency-name: async-trait
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-03-06 10:13:29 +00:00
dependabot[bot] 3256fcc72e
chore(deps): Bump object_store from 0.5.4 to 0.5.5
Bumps [object_store](https://github.com/apache/arrow-rs) from 0.5.4 to 0.5.5.
- [Release notes](https://github.com/apache/arrow-rs/releases)
- [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG-old.md)
- [Commits](https://github.com/apache/arrow-rs/compare/object_store_0.5.4...object_store_0.5.5)

---
updated-dependencies:
- dependency-name: object_store
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-03-03 02:00:51 +00:00
dependabot[bot] c538cac4ef
chore(deps): Bump tokio from 1.25.0 to 1.26.0 (#7107)
* chore(deps): Bump tokio from 1.25.0 to 1.26.0

Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.25.0 to 1.26.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.25.0...tokio-1.26.0)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore: Run cargo hakari tasks

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: Dom <dom@itsallbroken.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-03-02 09:50:39 +00:00
Carol (Nichols || Goulding) 3bf0f2779e
refactor: Move query plan normalizer to arrow_util 2023-03-01 15:44:22 -05:00
Carol (Nichols || Goulding) bbfff8699c
fix: Use the same normalization code for explain tests as e2e tests do
The regex for replacing UUIDs needed to be changed like the normalizer's
regex did, so keep them in sync by using the same code.

This might point to the normalizer needing to be moved somewhere else,
or changing these tests to be e2e?
2023-03-01 13:00:04 -05:00
kodiakhq[bot] b7170e41fb
Merge branch 'main' into cn/more-querier-tests-to-kafkaless 2023-03-01 16:05:41 +00:00
Andrew Lamb e19ce98407
chore: Update datafusion + arrow/arrow-flight/parquet to 34.0.0 (#7084)
* chore: Update datafusion + arrow/arrow-flight/parquet to 34.0.0

* chore: Run cargo hakari tasks

* chore: Update plans

* chore: Update querier expected output

* chore: Update querier tests to use insta

* fix: sort output too

---------

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-03-01 11:25:01 +00:00
Carol (Nichols || Goulding) 312a9bb56b
test: Change more querier tests to only use Kafkaless 2023-02-27 14:20:46 -05:00
kodiakhq[bot] 731a131a85
Merge branch 'main' into cn/test-rpc-write-in-querier 2023-02-27 15:17:51 +00:00
Carol (Nichols || Goulding) faae5eb438
chore: Rerun cargo hakari manage-deps 2023-02-27 11:56:15 +01:00
Carol (Nichols || Goulding) 5e9e08a86d
test: Switch most querier tests to use the RPC write path 2023-02-23 14:48:51 -05:00
Andrew Lamb f93baf7693
chore: Update DataFusion and `arrow` / `arrow-flight` / `parquet` to `33.0.0` (#7045)
* chore: Update DataFusion and arrow/arrow-flight/parquet to 33.0.0

* fix: Update test output

* fix: update more test output

* fix: Update querier test output

* chore: Run cargo hakari tasks

* test: fix formatting

Fix formatting of batch pretty printing.

* test: fix formatting for selector tests

---------

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: Dom Dwyer <dom@itsallbroken.com>
Co-authored-by: Christopher Wolff <chris.wolff@influxdata.com>
2023-02-22 21:24:20 +00:00
Andrew Lamb 27890b313f
chore: Update datafusion (#6997)
* chore: Update datafusion

* chore: update the plans

* fix: update some plans

* chore: Update plans and port some explain plans to use insta snapshots

* fix: another plan

* chore: Run cargo hakari tasks

---------

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-02-16 17:03:25 +00:00
Dom Dwyer 2d46a364dc
feat: namespace soft-delete support
This commit adds initial support for "soft" namespace deletion, where
the actual records & data remain, but are no longer queryable /
writeable.

Soft deletion is eventually consistent - users can expect to continue
writing to and reading from a bucket after issuing a soft delete call,
until the various components either restart, or have their caches
flushed.

The components treat soft-deleted namespaces differently:

    * router: ignore soft deleted namespaces
    * ingester: accept soft deleted namespaces
    * compactor: accept soft deleted namespaces
    * querier: ignore soft deleted namespaces
    * various gRPC services: ignore soft deleted namespaces

This ensures that the ingester & compactor do not see rows "vanishing"
from the database, and continue to make forward progress.

Writes for the deleted namespace that are buffered in the ingester will
be persisted as normal, allowing us to support "un-delete" operations
where the system is restored to the state at which the delete was
issued (rather than losing the buffered data).

Follow-on work is required to ensure GC drops the orphaned parquet files
after the configured GC time, and optimisations such as not compacting
parquet from soft-deleted namespaces seem like a trivial win.
2023-02-13 12:01:35 +01:00
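The per-component behaviour listed in this commit can be written down as a small lookup. This is only a sketch: the `Component` and `SoftDeletedPolicy` enums, the `Namespace` struct with its `deleted_at` field, and the `is_visible` helper are hypothetical names, not the catalog's actual types.

```rust
/// Components named in the commit message.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Component {
    Router,
    Ingester,
    Compactor,
    Querier,
    GrpcService,
}

/// How a component treats a soft-deleted namespace.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SoftDeletedPolicy {
    Ignore,
    Accept,
}

/// Mapping taken from the list in the commit message.
fn policy_for(component: Component) -> SoftDeletedPolicy {
    match component {
        Component::Router | Component::Querier | Component::GrpcService => {
            SoftDeletedPolicy::Ignore
        }
        Component::Ingester | Component::Compactor => SoftDeletedPolicy::Accept,
    }
}

/// Hypothetical namespace record: `deleted_at` is `Some(timestamp)` once soft-deleted.
struct Namespace {
    deleted_at: Option<i64>,
}

/// A namespace is visible to a component if it is live, or if the component
/// accepts soft-deleted namespaces.
fn is_visible(ns: &Namespace, component: Component) -> bool {
    match (ns.deleted_at, policy_for(component)) {
        (None, _) => true,
        (Some(_), SoftDeletedPolicy::Accept) => true,
        (Some(_), SoftDeletedPolicy::Ignore) => false,
    }
}
```

Because the record itself is kept and only a marker is set, clearing the marker is enough to sketch the "un-delete" operation the message mentions.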
dependabot[bot] 0cbd9f6a82
chore(deps): Bump tokio-util from 0.7.5 to 0.7.7 (#6964)
---
updated-dependencies:
- dependency-name: tokio-util
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-02-13 10:10:53 +00:00
Andrew Lamb 779fb93ce7
refactor: move test builders out of compactor2 code (#6953)
* refactor: move test builders out of compactor2 code

* fix: docs
2023-02-10 18:28:09 +00:00
dependabot[bot] c0c9b51b9e
chore(deps): Bump tokio-util from 0.7.4 to 0.7.5 (#6941)
Bumps [tokio-util](https://github.com/tokio-rs/tokio) from 0.7.4 to 0.7.5.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-util-0.7.4...tokio-util-0.7.5)

---
updated-dependencies:
- dependency-name: tokio-util
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-02-10 09:42:00 +00:00
dependabot[bot] 0ecde75af5
chore(deps): Bump object_store from 0.5.3 to 0.5.4 (#6900)
Bumps [object_store](https://github.com/apache/arrow-rs) from 0.5.3 to 0.5.4.
- [Release notes](https://github.com/apache/arrow-rs/releases)
- [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG-old.md)
- [Commits](https://github.com/apache/arrow-rs/compare/object_store_0.5.3...object_store_0.5.4)

---
updated-dependencies:
- dependency-name: object_store
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-02-08 09:40:11 +00:00
Raphael Taylor-Davies d3601a59f8
chore: update DataFusion, upgrade `arrow` `arrow-flight` and `parquet` to `32.0.0` (#6756)
* chore: update DataFusion

* fix: test

* chore: format

* chore: clippy

* chore: update arrow

* chore: arrow upgrade fallout

* chore: Run cargo hakari tasks

* chore: remove failing warm compaction test

* fix: flight error propagation

* chore: update parquet size

* fix: Update error message

* chore: Update parquet metadata test

---------

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-02-06 11:35:39 +00:00
Carol (Nichols || Goulding) b3aa52700b
fix: Box an error containing an error to reduce size
As suggested by
<https://rust-lang.github.io/rust-clippy/master/index.html#result_large_err>.
2023-02-03 13:06:20 -05:00
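A minimal sketch of the boxing technique suggested by the `result_large_err` lint. The types `InnerError`, `UnboxedError`, and `BoxedError` are hypothetical; the 256-byte payload exists only to make the size difference visible and does not reflect the actual error type changed in this commit.

```rust
use std::mem::size_of;

/// Hypothetical "fat" inner error.
#[derive(Debug)]
struct InnerError {
    context: [u8; 256],
}

/// Stores the inner error inline, so every `Result<_, UnboxedError>` is large.
#[derive(Debug)]
enum UnboxedError {
    Inner(InnerError),
}

/// Stores the inner error behind a pointer, keeping the `Err` variant small.
#[derive(Debug)]
enum BoxedError {
    Inner(Box<InnerError>),
}

/// Sizes in bytes of `Result<u8, _>` for the unboxed and the boxed error.
fn result_sizes() -> (usize, usize) {
    (
        size_of::<Result<u8, UnboxedError>>(),
        size_of::<Result<u8, BoxedError>>(),
    )
}

/// A fallible function that only allocates on the error path.
fn fallible(fail: bool) -> Result<u8, BoxedError> {
    if fail {
        Err(BoxedError::Inner(Box::new(InnerError { context: [0; 256] })))
    } else {
        Ok(1)
    }
}
```

The trade-off is one heap allocation on the (presumably rare) error path in exchange for a smaller `Result` on every call.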
Carol (Nichols || Goulding) 30fea67701
fix: Move variables within format strings. Thanks clippy!
Changes made automatically using `cargo clippy --fix`.
2023-02-03 13:06:17 -05:00
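For illustration, the kind of rewrite this commit describes looks roughly like the following, assuming the change corresponds to inlining format arguments (available since Rust 1.58). The function names and the message text are made up for the example.

```rust
/// Before: positional arguments passed after the format string.
fn describe_before(table: &str, files: usize) -> String {
    format!("table {} has {} files", table, files)
}

/// After: variables captured directly inside the format string.
fn describe_after(table: &str, files: usize) -> String {
    format!("table {table} has {files} files")
}
```

Both forms produce the same string; only plain identifiers can be captured this way, so expressions such as `x.len()` still need to be passed as explicit arguments.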
dependabot[bot] d0e6b16450
chore(deps): Bump bytes from 1.3.0 to 1.4.0
Bumps [bytes](https://github.com/tokio-rs/bytes) from 1.3.0 to 1.4.0.
- [Release notes](https://github.com/tokio-rs/bytes/releases)
- [Changelog](https://github.com/tokio-rs/bytes/blob/master/CHANGELOG.md)
- [Commits](https://github.com/tokio-rs/bytes/compare/v1.3.0...v1.4.0)

---
updated-dependencies:
- dependency-name: bytes
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-02-01 00:30:56 +00:00
Marco Neumann 7cadd38a3c
fix: do not panic when partition was removed from catalog (#6773)
Fixes https://github.com/influxdata/idpe/issues/17040 .
2023-01-31 11:54:34 +00:00
dependabot[bot] 6f032b1d57
chore(deps): Bump async-trait from 0.1.63 to 0.1.64 (#6769)
Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.63 to 0.1.64.
- [Release notes](https://github.com/dtolnay/async-trait/releases)
- [Commits](https://github.com/dtolnay/async-trait/compare/0.1.63...0.1.64)

---
updated-dependencies:
- dependency-name: async-trait
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-01-31 10:18:27 +00:00
Marco Neumann d707709cce
fix: invalidate querier->ingester conn on error (#6747)
It seems that tonic is caching DNS results for too long and clings to an
old ingester that no longer exists.

See https://github.com/influxdata/idpe/issues/17022 (not sure though if
this fix is sufficient, let's see).

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-01-30 11:06:12 +00:00
dependabot[bot] ed7d02a225
chore(deps): Bump tokio from 1.24.2 to 1.25.0
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.24.2 to 1.25.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/commits/tokio-1.25.0)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-01-30 01:57:27 +00:00
Nga Tran b8a80869d4
feat: introduce a new way of max_sequence_number for ingester, compactor and querier (#6692)
* feat: introduce a new way of max_sequence_number for ingester, compactor and querier

* chore: cleanup

* feat: new column max_l0_created_at to order files for deduplication

* chore: cleanup

* chore: debug info for changing cpu.parquet

* fix: update test parquet file

Co-authored-by: Marco Neumann <marco@crepererum.net>
2023-01-26 10:52:47 +00:00
dependabot[bot] 0114e7ee50
chore(deps): Bump async-trait from 0.1.61 to 0.1.63 (#6660)
Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.61 to 0.1.63.
- [Release notes](https://github.com/dtolnay/async-trait/releases)
- [Commits](https://github.com/dtolnay/async-trait/compare/0.1.61...0.1.63)

---
updated-dependencies:
- dependency-name: async-trait
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-01-23 08:41:27 +00:00
Andrew Lamb 65c020c9f2
refactor: remove iox_arrow_flight use in `influxdb_iox_client` and `querier` (#6624)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-01-19 18:48:23 +00:00
Andrew Lamb 57f08dbccd
chore: Update datafusion to Jan 9, 2023 (1 / 2) (#6603)
* refactor: Update DataFusion pin to early Jan 2023

* fix: Update tests now that planning is async

* fix: Updates for API changes

* chore: Run cargo hakari tasks

* fix: Update comment

* refactor: nicer config setup

* fix: gapfill async

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2023-01-18 12:19:32 +00:00
Marco Neumann 7d06a61b5f
fix: use `create_at` to order querier chunks under kafkaless (#6554)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-01-10 17:15:08 +00:00
Marco Neumann 042b7c4521
refactor: invalidate querier cache if ingester is gone (#6550)
* refactor: invalidate querier cache if ingester is gone

For #6549 but I think even w/o the plan illustrated there, this is the
right thing to do.

Also changes the cache system to use flat sorted vectors instead of costly hash
maps.

* refactor: simplify code

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-01-10 13:46:18 +00:00
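The "flat sorted vectors instead of hash maps" idea mentioned above can be sketched as a sorted `Vec` of key/value pairs queried with binary search. `FlatMap` and its `u32`/`u64` key and value types are hypothetical; the actual cache structures in the querier may look quite different.

```rust
/// A read-mostly map stored as a vector sorted by key (illustrative only).
struct FlatMap {
    entries: Vec<(u32, u64)>,
}

impl FlatMap {
    /// Build from unsorted pairs; for duplicate keys the first occurrence wins.
    fn from_unsorted(mut entries: Vec<(u32, u64)>) -> Self {
        // `sort_by_key` is stable, so equal keys keep their input order.
        entries.sort_by_key(|(k, _)| *k);
        entries.dedup_by_key(|(k, _)| *k);
        Self { entries }
    }

    /// Look up a key with binary search: O(log n), no hashing involved.
    fn get(&self, key: u32) -> Option<u64> {
        self.entries
            .binary_search_by_key(&key, |(k, _)| *k)
            .ok()
            .map(|idx| self.entries[idx].1)
    }
}
```

Such a layout tends to pay off when the map is built once and read many times, since lookups avoid hashing and the data sits in one contiguous allocation.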
Marco Neumann 2bb6db3f37
fix: ensure ingester state tracked in querier cache is always in-sync (#6512)
Fixes #6510.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-01-10 12:00:05 +00:00
dependabot[bot] c68049c37a
chore(deps): Bump regex from 1.7.0 to 1.7.1 (#6546)
Bumps [regex](https://github.com/rust-lang/regex) from 1.7.0 to 1.7.1.
- [Release notes](https://github.com/rust-lang/regex/releases)
- [Changelog](https://github.com/rust-lang/regex/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-lang/regex/compare/1.7.0...1.7.1)

---
updated-dependencies:
- dependency-name: regex
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-01-10 09:55:41 +00:00
dependabot[bot] b49cc2e35e
chore(deps): Bump tokio from 1.24.0 to 1.24.1 (#6545)
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.24.0 to 1.24.1.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.24.0...tokio-1.24.1)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-01-10 09:48:44 +00:00
dependabot[bot] e31c84a794
chore(deps): Bump async-trait from 0.1.60 to 0.1.61 (#6533)
Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.60 to 0.1.61.
- [Release notes](https://github.com/dtolnay/async-trait/releases)
- [Commits](https://github.com/dtolnay/async-trait/compare/0.1.60...0.1.61)

---
updated-dependencies:
- dependency-name: async-trait
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-01-09 07:44:35 +00:00
Raphael Taylor-Davies e1036a0c63
refactor: cleanup schema boxing (#6511)
* refactor: cleanup Schema boxing

* chore: clippy
2023-01-06 10:57:39 +00:00
Marco Neumann 6f4b128285
refactor: improve "Parquet files after filtering" dbg log (#6502)
- Place IDs last because they may hit the "max line length" limit and be
  truncated. The other information should NOT be truncated with it.
- Unpack IDs to integer to remove useless `ParquetFileID(...)` wrappers
  in output.
- Print number of files in addition to the actual list to simplify
  debugging.
2023-01-05 11:13:33 +00:00
Carol (Nichols || Goulding) f121d395cc
refactor: Extract a constructor for PolicyBackend using a HashMap 2022-12-21 14:32:35 -05:00
Carol (Nichols || Goulding) 7c6ccdb6d7
fix: Use keys and values functions. Thanks clippy! 2022-12-21 14:32:35 -05:00
Carol (Nichols || Goulding) 56ba3b17de
fix: Allow partitions from ingesters to overlap in RPC write mode
This was added in c82d0d8ca6dc02dcdd40a4c656a1ee51f3f9bfee with the
comment:

> Right now this would clearly indicate a bug and before I am trying to
> understand some prod issues, I wanna rule that one out.

In the RPC write path, this isn't a bug, it's quite expected.
2022-12-21 11:32:58 -05:00
Carol (Nichols || Goulding) 257c155d1e
chore: Line wrapping at 100 cols 2022-12-21 11:18:47 -05:00
Dom Dwyer adc6fcfb04
feat(catalog): linearise sort key updates
Updating the sort key is not commutative and MUST be serialised. The
correctness of the current catalog interface relies on the caller
serialising updates globally, something it cannot reasonably assert in a
distributed system.

This change of the catalog interface pushes this responsibility to the
catalog itself where it can be effectively enforced, and allows a caller
to detect parallel updates to the sort key.
2022-12-20 12:31:00 +01:00
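One way to let a caller detect parallel updates is a compare-and-swap style interface, sketched below with an in-memory record. This is an assumption about the general shape of such an API, not the catalog's real one: `PartitionRecord`, `cas_sort_key`, `CasError`, and `to_key` are hypothetical names, and a real catalog would enforce the check in its backing store rather than behind a `Mutex`.

```rust
use std::sync::Mutex;

/// Convert column names into an owned sort key.
fn to_key(columns: &[&str]) -> Vec<String> {
    columns.iter().map(|c| c.to_string()).collect()
}

/// Returned when the stored sort key differs from what the caller expected.
#[derive(Debug, PartialEq)]
enum CasError {
    Mismatch { observed: Vec<String> },
}

/// Hypothetical in-memory partition record.
struct PartitionRecord {
    sort_key: Mutex<Vec<String>>,
}

impl PartitionRecord {
    fn new(initial: &[&str]) -> Self {
        Self {
            sort_key: Mutex::new(to_key(initial)),
        }
    }

    /// Replace the sort key only if it still equals `expected`.
    /// A concurrent writer that got there first causes a `Mismatch`.
    fn cas_sort_key(&self, expected: &[&str], new: &[&str]) -> Result<(), CasError> {
        let mut current = self.sort_key.lock().expect("lock poisoned");
        if *current != to_key(expected) {
            return Err(CasError::Mismatch {
                observed: (*current).clone(),
            });
        }
        *current = to_key(new);
        Ok(())
    }
}
```

A caller that receives `Mismatch` can re-read the observed key, recompute its update against it, and retry, which serialises updates without any global coordination among callers.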
Carol (Nichols || Goulding) 200f4fe9bd
fix: Disable parquet file filtering in the querier based on max seq num in RPC write mode (#6443)
Connects to #6421.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-12-19 18:01:21 +00:00
Andrew Lamb 9b22ede3f0
refactor: Make arrow flight client return `futures::Streams` (#6438)
* refactor: Make arrow flight client use futures::Streams

* refactor: concision
2022-12-19 17:09:26 +00:00
Andrew Lamb 94c2f94ea1
refactor: Extract common ArrowFlight client into iox_arrow_flight (#6427)
* refactor: Extract common ArrowFlight client into iox_arrow_flight

* chore: Run cargo hakari tasks

* fix: clarify intent of iox_arrow_flight crate

* refactor: Apply suggestions from code review

Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>

* fix: loop --> while let

* fix: Remove make_tonic_error in favor of From impl

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-12-19 11:35:20 +00:00
dependabot[bot] c72734473c
chore(deps): Bump async-trait from 0.1.59 to 0.1.60 (#6433)
Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.59 to 0.1.60.
- [Release notes](https://github.com/dtolnay/async-trait/releases)
- [Commits](https://github.com/dtolnay/async-trait/compare/0.1.59...0.1.60)

---
updated-dependencies:
- dependency-name: async-trait
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-12-19 10:09:23 +00:00
Marco Neumann ffe8b98f47
refactor: clean up querier code base (#6404)
* refactor: `s/QuerierChunk/QuerierParquetChunk/g`

* refactor: isolate parquet chunk creation code

* refactor: fuse `chunk` and `chunk_parts`

* refactor: pass catalog cache instead of chunk adapter to state reconciler

* refactor: move parquet chunks creation into its own method

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-12-15 07:01:11 +00:00
kodiakhq[bot] d6afc9eee1
Merge branch 'main' into cn/ingester-persisted-file-count 2022-12-14 15:48:59 +00:00
Marco Neumann 4e36c590af
refactor: speed up partition sort key syncing (#6400)
* refactor: speed up partition sort key syncing

Prior to syncing, all chunks have a "locally correct" partition sort key,
i.e. one that at least covers all chunk columns (this is ensured during
chunk creation, both for parquet chunks as well as ingester chunks).
However due to the timing, some chunks may have a newer (= longer)
partition sort key. All we need to do to fix this is to pick the longest
partition sort key, there is no need to go through the whole cache
system again.

For #6358.

* docs: improve

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2022-12-14 15:48:08 +00:00
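The syncing step described in this commit (pick the longest partition sort key among the chunks) can be sketched as follows. Representing a sort key as a `Vec<String>` of column names and the function name `longest_sort_key` are simplifications made for the example.

```rust
/// Return the longest sort key among the chunks' keys, or `None` if there are
/// no chunks. Relies on the premise from the commit message that a newer key
/// is a longer one, so the longest key is the most up-to-date.
fn longest_sort_key(keys: &[Vec<String>]) -> Option<&Vec<String>> {
    keys.iter().max_by_key(|key| key.len())
}
```

This is a single pass over keys that are already in memory, which is why no further round trip through the cache system is needed.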
kodiakhq[bot] 66c610f7b1
Merge branch 'main' into cn/ingester-persisted-file-count 2022-12-14 14:58:31 +00:00
Marco Neumann c51548f28b
refactor: improve concurrency during parquet chunk creation (#6376)
* refactor: de-correlate parquet file processing

* refactor: increase concurrent chunk creation jobs to 100 (from 10)

* docs: improve

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

* refactor: use deterministic RNG

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-12-13 16:16:09 +00:00
Carol (Nichols || Goulding) 44c3486db0
feat: Expire the querier's cache using info from ingester2
Fixes #6335.

For each table, keep track of the ingester UUIDs and associated
persisted Parquet file counts that we've seen from previous requests to
ingesters. When doing a query, determine if we should expire the Parquet
file catalog cache by looking at the new information from the ingesters.

If we see a new ingester UUID or if the number of persisted files for a
known ingester UUID is different than what we've stored, then we should
expire this table's Parquet file cache.

Either way, incorporate the new information into the saved values for
comparing with the next request.
2022-12-12 15:53:39 -05:00
Carol (Nichols || Goulding) b4b50d7dc1
feat: Collect the ingester UUIDs and persistence counts in the table
And pass them to the parquet file cache, which doesn't use them yet.
2022-12-12 15:52:56 -05:00
Carol (Nichols || Goulding) b0ba171742
feat: Keep track of ingester UUIDs and counts in IngesterPartition 2022-12-12 15:52:08 -05:00
Carol (Nichols || Goulding) 9c8b55c5be
docs: Fix some wrapping/typos in comments 2022-12-12 14:30:52 -05:00
Carol (Nichols || Goulding) 1c7f322a4e
feat: Keep track of and report number of Parquet files persisted
Per partition and starting over each time the ingester restarts.

Fixes #6334.
2022-12-12 11:45:00 -05:00
Carol (Nichols || Goulding) 33886970ef
refactor: Extract a helper fn for test messages
Reduces duplication, makes it easier to see what's different between the
tests, will make it easier to add another field in the next commit
2022-12-12 11:45:00 -05:00
kodiakhq[bot] 727efcbdee
Merge branch 'main' into cn/ingester2-uuid 2022-12-12 16:21:15 +00:00
Marco Neumann e49ffc02f8
refactor: faster sort key calculation (#6375)
Avoid nasty string lookups to determine which columns make up a parquet
file's sort key.

For #6358.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-12-12 15:32:04 +00:00
Marco Neumann 6b1c43f01e
refactor: use column IDs for partition cache invalidation (#6374)
This should avoid a bunch of string hashing during query planning.

For #6358.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-12-12 14:22:28 +00:00
Marco Neumann db933c44b6
refactor: store reverse column ID map for cached tables (#6360) 2022-12-09 11:58:24 +00:00
Marco Neumann 450b452148
refactor: avoid string-hashing of parquet file column names (#6359) 2022-12-09 11:51:18 +00:00
Carol (Nichols || Goulding) 2fd2d05ef6
feat: Identify each run of an ingester with a Uuid
And send that UUID in the Flight response for queries to that ingester
run.

Fixes #6333.
2022-12-08 17:22:52 -05:00
kodiakhq[bot] 6f7cb5ccf0
Merge branch 'main' into cn/ingester2-querier 2022-12-08 14:00:49 +00:00
Marco Neumann d4e321a2bd
refactor: add additional span around chunk spans (#6353)
* refactor: add additional span around chunk spans

* docs: improve

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2022-12-08 13:57:32 +00:00
Andrew Lamb 9175f4a0b5
chore: Upgrade datafusion to get correct support for multi-part identifiers (#6349)
* test: add tests for periods in measurement names

* chore: Update Datafusion

* chore: Update for changed APIs

* chore: Update expected plan output

* chore: Run cargo hakari tasks

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-12-08 11:27:13 +00:00
Carol (Nichols || Goulding) e13e668d26
refactor: Share more code in the querier in the RPC write path mode 2022-12-07 13:54:08 -05:00
Carol (Nichols || Goulding) b1c5ec4dee
fix: Correct compiler errors in places I missed while running crate tests 2022-12-07 10:25:36 -05:00
Carol (Nichols || Goulding) 9166ace796
feat: Make a mode for the querier to use ingester2 instead, behind the rpc_write feature flag 2022-12-07 09:56:50 -05:00
dependabot[bot] 1d38d400f0
chore(deps): Bump object_store from 0.5.1 to 0.5.2 (#6339)
* chore(deps): Bump object_store from 0.5.1 to 0.5.2

Bumps [object_store](https://github.com/apache/arrow-rs) from 0.5.1 to 0.5.2.
- [Release notes](https://github.com/apache/arrow-rs/releases)
- [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG-old.md)
- [Commits](https://github.com/apache/arrow-rs/compare/object_store_0.5.1...object_store_0.5.2)

---
updated-dependencies:
- dependency-name: object_store
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore: Run cargo hakari tasks

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-12-06 07:53:54 +00:00
Marco Neumann cd6a8a1a82
refactor: DF-driven on-demand mem limit instead of ahead-of-time heuristics (#6313)
* refactor: DF-driven on-demand mem limit instead of ahead-of-time heuristics

Closes #6310.

* refactor: rename and tune default exec mem limits

* fix: ingester2 bits after rebase
2022-12-05 12:38:28 +00:00
Marco Neumann ec2e72d223
test: simplify test executors (#6312)
Have a single global test executor with reasonable defaults. Also don't
require tests to join/await executor shutdowns (most tests forget this
anyway and will get a runtime warning).

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-12-02 11:38:18 +00:00
Marco Neumann befc6d668b
fix: avoid user error for unsupported querier<>ingester preds (#6238)
Fixes #6195.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-28 16:51:41 +00:00
Nga Tran 45d25b0af2
refactor: remove duplicate tests (#6243) 2022-11-28 16:39:57 +00:00
Andrew Lamb 1a1ea74cb7
chore: Upgrade datafusion again (#6160)
* Revert "Revert "chore: Update datafusion again (#6108)""

This reverts commit 766b3bbeb440618cfe332f6ee7d4f8a8217acc48.

* fix: Respect the partition sort key

* chore: update plans

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-22 19:28:26 +00:00