Commit Graph

513 Commits (bf4fce3c77b3f996cb151cb9aed0fd511b011ace)

Author SHA1 Message Date
Andrew Lamb 1ff76b7bf2 chore: use workspace dependencies for `object_store` 2023-05-26 07:03:42 -04:00
Andrew Lamb c1a448e930
feat: Add decoded payload type and size to querier <--> ingester tracing (#7870)
* feat: Add decoded payload type and size to querier <--> ingester tracing

* feat: add aggregate sizes

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-05-26 10:05:14 +00:00
Andrew Lamb d68a399a7b
fix: fix span name (#7868)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-05-25 17:40:43 +00:00
Dom Dwyer 928a4d163e
build: remove unused dependencies from crates
This commit fixes loads of crates (47!) had unused dependencies, or
mis-configured dependencies (test deps as normal deps).

I added the "unused_crate_dependencies" to all crates to help prevent
this mess from growing again!

    https://doc.rust-lang.org/beta/nightly-rustc/rustc_lint_defs/builtin/static.UNUSED_CRATE_DEPENDENCIES.html

This has the minor downside of false-positives when specifying
dev-dependencies for test/bench binaries - these are files in /test or
/benches (not normal tests). This commit includes a workaround,
importing them in lib.rs (gated by a feature flag). I think the
trade-off of better dependency management is worth it!
2023-05-23 14:55:43 +02:00
Marco Neumann 31b8813760
feat: hide `system.queries` table from prod by default (#7810)
Introduce a new header called `iox-debug` which when set enables certain
debug features. The first one will be the `system.queries` table which
is a process-local, namespace-scoped query log. In most prod setups this
is only useful for debugging and will confuse the user a lot because
when multiple queries are deployed then the K8s routing decides which
pod/process the users hits. This leads to an inconsistent view. However
the log is still useful for debugging.

This also wires the "debug header set" flag through the Flight ticket,
because JDBC proved (integration tests FTW!) that headers are only
passed to `GetFlightInfo` but not to `DoGet` and the ticket must encode
all the relevant information.

Closes #7119.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-05-22 12:29:24 +00:00
Andrew Lamb 6344fe8c3f
chore: Add rationale for `clippy::future_not_send` (#7822)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-05-18 16:58:56 +00:00
Dom 6aa634c1b9
Merge branch 'main' into cn/move-peas 2023-05-15 13:29:42 +01:00
dependabot[bot] fba9836f2a
chore(deps): Bump pin-project from 1.0.12 to 1.1.0
Bumps [pin-project](https://github.com/taiki-e/pin-project) from 1.0.12 to 1.1.0.
- [Release notes](https://github.com/taiki-e/pin-project/releases)
- [Changelog](https://github.com/taiki-e/pin-project/blob/main/CHANGELOG.md)
- [Commits](https://github.com/taiki-e/pin-project/compare/v1.0.12...v1.1.0)

---
updated-dependencies:
- dependency-name: pin-project
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-05-15 02:02:32 +00:00
Carol (Nichols || Goulding) 1770d0f4d8
fix: Move ingester-querier gRPC communication to its own crate 2023-05-12 13:28:30 -04:00
Carol (Nichols || Goulding) 92e5036943
fix: Size of ColumnSet shouldn't be using ChunkId (#7786) 2023-05-12 14:58:03 +00:00
Carol (Nichols || Goulding) cc41216382
fix: Undo the addition of a TableInfo type; store partition_template on TableSchema 2023-05-09 14:54:59 +02:00
Carol (Nichols || Goulding) 596673d515
refactor: Create a new ColumnsByName type to abstract over TableSchema columns
And allow usage of just the columns when that's all that's needed
without leaking the BTreeMap implementation detail everywhere
2023-05-09 14:54:58 +02:00
Carol (Nichols || Goulding) 1f1dcc947d
fix: Don't change how the compactor gets the table schema 2023-05-09 14:54:58 +02:00
Carol (Nichols || Goulding) 58d9c40ffd
feat: If namespace or table partition templates are specified, use those 2023-05-09 14:54:57 +02:00
Carol (Nichols || Goulding) 56916cf942
fix: Rename ingester2 to ingester 2023-05-08 12:03:05 -04:00
Andrew Lamb 2860d87fe1
chore: Update DataFusion (#7756)
* chore: Update DataFusion pin

* chore: Update explain plans

* chore: Run cargo hakari tasks

---------

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2023-05-05 18:58:18 +00:00
Carol (Nichols || Goulding) 621caab2e9
fix: Remove unused parquet_max_sequence_number metadata 2023-05-03 10:57:27 -04:00
Carol (Nichols || Goulding) dfa184e296
fix: Make ingester UUID an expected, required field of IngesterPartition 2023-05-03 10:45:02 -04:00
Marco Neumann 0556fdae53
refactor: remove `QueryChunk::partition_sort_key` (#7680)
As of #7250 / #7449 the partition sort key is no longer required for
query planning. Instead we use a combination of
`QueryChunk::partition_id` and `QueryChunk::sort_key` which is more
robust and easier to reason about.

Removing it simplifies the querier code a lot since we no longer need to
have a sort key for the ingester chunks and also don't need to "sync"
the sort key between chunks for consistency.
2023-04-27 10:54:41 +00:00
Marco Neumann 2bf867ea0a
refactor: do not block on querier cache warm-up (#7679)
Warming up a cache should not block the planning, it is a mere signal to
the cache system to start to fetch data. See code comment for more
details.

This lowers the query latency in a few cases. I've seen at least one
trace were this would have been useful. This will never make things
worse (because the cache system drives the request to completion anyways).

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-04-27 08:57:55 +00:00
Carol (Nichols || Goulding) 038f8e9ce0
fix: Move shard concepts into only the catalog
This still inserts the shard id into the database, always set to the
TRANSITION_SHARD_ID, but never reads it back out again.
2023-04-26 11:42:32 -04:00
dependabot[bot] bdf7f316d7
chore(deps): Bump tokio from 1.27.0 to 1.28.0 (#7667)
* chore(deps): Bump tokio from 1.27.0 to 1.28.0

Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.27.0 to 1.28.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.27.0...tokio-1.28.0)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore: Run cargo hakari tasks

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: Dom <dom@itsallbroken.com>
2023-04-26 12:53:26 +00:00
dependabot[bot] 0b9240cbbe
chore(deps): Bump tokio-util from 0.7.7 to 0.7.8 (#7665)
Bumps [tokio-util](https://github.com/tokio-rs/tokio) from 0.7.7 to 0.7.8.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-util-0.7.7...tokio-util-0.7.8)

---
updated-dependencies:
- dependency-name: tokio-util
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-04-26 09:24:39 +00:00
Carol (Nichols || Goulding) 8d4c2bfabb
fix: Only ever create the transition shard in the in-memory catalog
Tests that use the in-memory catalog are creating different shards that
then creates old-style Parquet file paths, but in production, everything
uses the transition shard now. To make the tests more like production,
only ever create and use the transition shard, and stop checking for
different shard IDs.
2023-04-24 10:08:00 -04:00
Marco Neumann d7dc305972
feat: allow overwriting DataFusion's default config (#7586)
This is helpful to test changes in our defaults but also for testing.

Required for https://github.com/influxdata/idpe/issues/17474 .

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-04-18 11:28:45 +00:00
Dom Dwyer c5bb88e173
chore: remove unused dependencies
Some crates import dependencies they never use.
2023-04-18 12:07:13 +02:00
kodiakhq[bot] da96239605
Merge branch 'main' into cn/delete-tombstones 2023-04-17 13:59:49 +00:00
Carol (Nichols || Goulding) 5f2d82fbc6
fix: Remove tombstones from querier; they're unused 2023-04-14 13:20:39 -04:00
Andrew Lamb f46d06d56f
chore: Update DataFusion + arrow ecosystem to 37 (#7544)
* chore: Update datafusion and arrow/parquet to 37, tonic to 0.9.1

* refactor: Update for FieldRef and other API changes

* fix: Update field size calculation

* fix: Use `NullBuffer` directly

* fix: remove outdated comment

* chore: Update test for tonic

* chore: Run cargo hakari tasks

* chore: cargo update

---------

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-04-14 12:43:01 +00:00
Carol (Nichols || Goulding) f0f74bae02
fix: Treat empty ingester info differently than not having ingester info
When pre-warming the catalog cache before the ingester responses have
returned, we don't have any ingester parquet file counts. This is
different than asking the ingesters for the parquet file counts and not
getting any. So keep the Option to be able to treat "not present"
differently from "present but empty".
2023-04-12 14:50:18 -04:00
Carol (Nichols || Goulding) acf857816e
fix: Remove old querier 2023-04-12 13:18:23 -04:00
Carol (Nichols || Goulding) 6387a9576a
fix: Remove the write_summary crate and write info service 2023-04-12 11:31:23 -04:00
Marco Neumann b29bdf73ab
feat: improve querier->ingester tracing (#7501)
* feat: improve querier->ingester tracing

- add more hierarchy items on the querier side
- ensure that streaming is correctly traced by the querier

* refactor: improve span name

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

* docs: `QueryDataTracer`

---------

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2023-04-11 13:43:42 +00:00
Andrew Lamb 1a80b8073c
fix: Improve span names for query access (#7476)
* fix: Improve span names for query access

* fix: update test

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-04-11 10:34:09 +00:00
Marco Neumann 5f43f2a719
refactor: remove old query planning code (#7449)
Closes #7406.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-04-06 16:05:08 +00:00
Marco Neumann c03a5c7c14
fix: tracing span hierarchy in querier (#7469)
The span for the individual chunk creations should be under
"create individual chunks".
2023-04-06 10:01:39 +00:00
dependabot[bot] 66982f988b
chore(deps): Bump object_store from 0.5.5 to 0.5.6 (#7433)
Bumps [object_store](https://github.com/apache/arrow-rs) from 0.5.5 to 0.5.6.
- [Release notes](https://github.com/apache/arrow-rs/releases)
- [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG-old.md)
- [Commits](https://github.com/apache/arrow-rs/commits)

---
updated-dependencies:
- dependency-name: object_store
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Dom <dom@itsallbroken.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-04-04 08:43:34 +00:00
Marco Neumann f04962d630
feat: new query planning (#7250)
Closes #6098.
2023-04-03 10:31:03 +00:00
dependabot[bot] 4eedb7ea77
chore(deps): Bump async-trait from 0.1.66 to 0.1.68 (#7374)
* chore(deps): Bump async-trait from 0.1.66 to 0.1.68

Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.66 to 0.1.68.
- [Release notes](https://github.com/dtolnay/async-trait/releases)
- [Commits](https://github.com/dtolnay/async-trait/compare/0.1.66...0.1.68)

---
updated-dependencies:
- dependency-name: async-trait
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore: Run cargo hakari tasks

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2023-03-30 10:14:36 +00:00
dependabot[bot] 9cbcdc7672
chore(deps): Bump tokio from 1.26.0 to 1.27.0 (#7373)
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.26.0 to 1.27.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.26.0...tokio-1.27.0)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-03-30 09:36:04 +00:00
Marco Neumann 75dba43ced
test: extend retention policy query test (#7352)
Add an ingester chunk to the parquet chunks.

Helpful for #6098.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-03-29 05:32:22 +00:00
dependabot[bot] 4b888c7255
chore(deps): Bump insta from 1.28.0 to 1.29.0 (#7322)
Bumps [insta](https://github.com/mitsuhiko/insta) from 1.28.0 to 1.29.0.
- [Release notes](https://github.com/mitsuhiko/insta/releases)
- [Changelog](https://github.com/mitsuhiko/insta/blob/master/CHANGELOG.md)
- [Commits](https://github.com/mitsuhiko/insta/compare/1.28.0...1.29.0)

---
updated-dependencies:
- dependency-name: insta
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-03-24 18:25:01 +00:00
Andrew Lamb 5dd71998a1
chore: Update datafusion (#7318)
* chore: Update datafusion

* chore: Update for API change

* chore: Run cargo hakari tasks

---------

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-03-24 15:07:23 +00:00
Marco Neumann 07b7107f9a
feat: sub-traces for `create_chunks` (#7148)
In one prod case the majority of this was NOT spend on creating the
child chunks. I suspect that the summary creation and the string cloning
involved in there are quite slow. So let's have slightly more detailed
tracing and see.
2023-03-07 15:06:37 +00:00
dependabot[bot] 8f3a9396d0
chore(deps): Bump async-trait from 0.1.64 to 0.1.66 (#7129)
Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.64 to 0.1.66.
- [Release notes](https://github.com/dtolnay/async-trait/releases)
- [Commits](https://github.com/dtolnay/async-trait/compare/0.1.64...0.1.66)

---
updated-dependencies:
- dependency-name: async-trait
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-03-06 10:13:29 +00:00
dependabot[bot] 3256fcc72e
chore(deps): Bump object_store from 0.5.4 to 0.5.5
Bumps [object_store](https://github.com/apache/arrow-rs) from 0.5.4 to 0.5.5.
- [Release notes](https://github.com/apache/arrow-rs/releases)
- [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG-old.md)
- [Commits](https://github.com/apache/arrow-rs/compare/object_store_0.5.4...object_store_0.5.5)

---
updated-dependencies:
- dependency-name: object_store
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-03-03 02:00:51 +00:00
dependabot[bot] c538cac4ef
chore(deps): Bump tokio from 1.25.0 to 1.26.0 (#7107)
* chore(deps): Bump tokio from 1.25.0 to 1.26.0

Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.25.0 to 1.26.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.25.0...tokio-1.26.0)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore: Run cargo hakari tasks

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: Dom <dom@itsallbroken.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-03-02 09:50:39 +00:00
Carol (Nichols || Goulding) 3bf0f2779e
refactor: Move query plan normalizer to arrow_util 2023-03-01 15:44:22 -05:00
Carol (Nichols || Goulding) bbfff8699c
fix: Use the same normalization code for explain tests as e2e tests do
The regex for replacing UUIDs needed to be changed like the normalizer's
regex did, so keep them in sync by using the same code.

This might point to the normalizer needing to be moved somewhere else,
or changing these tests to be e2e?
2023-03-01 13:00:04 -05:00
kodiakhq[bot] b7170e41fb
Merge branch 'main' into cn/more-querier-tests-to-kafkaless 2023-03-01 16:05:41 +00:00