Commit Graph

439 Commits (0d111c4672102ceb373c19e77e2f9cc98103e12d)

Author SHA1 Message Date
Marco Neumann 7d06a61b5f
fix: use `create_at` to order querier chunks under kafkaless (#6554)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-01-10 17:15:08 +00:00
Marco Neumann 042b7c4521
refactor: invalidate querier cache if ingester is gone (#6550)
* refactor: invalidate querier cache if ingester is gone

For #6549 but I think even w/o the plan illustrated there, this is the
right thing to do.

Also changes the cache system to use flats sorted vectors instead of costly hash
maps.

* refactor: simplify code

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-01-10 13:46:18 +00:00
Marco Neumann 2bb6db3f37
fix: ensure ingester state tracked in querier cache is always in-sync (#6512)
Fixes #6510.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-01-10 12:00:05 +00:00
dependabot[bot] c68049c37a
chore(deps): Bump regex from 1.7.0 to 1.7.1 (#6546)
Bumps [regex](https://github.com/rust-lang/regex) from 1.7.0 to 1.7.1.
- [Release notes](https://github.com/rust-lang/regex/releases)
- [Changelog](https://github.com/rust-lang/regex/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-lang/regex/compare/1.7.0...1.7.1)

---
updated-dependencies:
- dependency-name: regex
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-01-10 09:55:41 +00:00
dependabot[bot] b49cc2e35e
chore(deps): Bump tokio from 1.24.0 to 1.24.1 (#6545)
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.24.0 to 1.24.1.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.24.0...tokio-1.24.1)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-01-10 09:48:44 +00:00
dependabot[bot] e31c84a794
chore(deps): Bump async-trait from 0.1.60 to 0.1.61 (#6533)
Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.60 to 0.1.61.
- [Release notes](https://github.com/dtolnay/async-trait/releases)
- [Commits](https://github.com/dtolnay/async-trait/compare/0.1.60...0.1.61)

---
updated-dependencies:
- dependency-name: async-trait
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-01-09 07:44:35 +00:00
Raphael Taylor-Davies e1036a0c63
refactor: cleanup schema boxing (#6511)
* refactor: cleanup Schema boxing

* chore: clippy
2023-01-06 10:57:39 +00:00
Marco Neumann 6f4b128285
refactor: improve "Parquet files after filtering" dbg log (#6502)
- Place IDs last because they may hit the "max line length" limit and be
  truncated. The other information should NOT be truncated with it.
- Unpack IDs to integer to remove useless `ParquetFileID(...)` wrappers
  in output.
- Print number of files in addition to the actual list to simplify
  debugging.
2023-01-05 11:13:33 +00:00
Carol (Nichols || Goulding) f121d395cc
refactor: Extract a constructor for PolicyBackend using a HashMap 2022-12-21 14:32:35 -05:00
Carol (Nichols || Goulding) 7c6ccdb6d7
fix: Use keys and values functions. Thanks clippy! 2022-12-21 14:32:35 -05:00
Carol (Nichols || Goulding) 56ba3b17de
fix: Allow partitions from ingesters to overlap in RPC write mode
This was added in c82d0d8ca6dc02dcdd40a4c656a1ee51f3f9bfee with the
comment:

> Right now this would clearly indicate a bug and before I am trying to
> understand some prod issues, I wanna rule that one out.

In the RPC write path, this isn't a bug, it's quite expected.
2022-12-21 11:32:58 -05:00
Carol (Nichols || Goulding) 257c155d1e
chore: Line wrapping at 100 cols 2022-12-21 11:18:47 -05:00
Dom Dwyer adc6fcfb04
feat(catalog): linearise sort key updates
Updating the sort key is not commutative and MUST be serialised. The
correctness of the current catalog interface relies on the caller
serialising updates globally, something it cannot reasonably assert in a
distributed system.

This change of the catalog interface pushes this responsibility to the
catalog itself where it can be effectively enforced, and allows a caller
to detect parallel updates to the sort key.
2022-12-20 12:31:00 +01:00
Carol (Nichols || Goulding) 200f4fe9bd
fix: Disable parquet file filtering in the querier based on max seq num in RPC write mode (#6443)
Connects to #6421.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-12-19 18:01:21 +00:00
Andrew Lamb 9b22ede3f0
refactor: Make arrow flight client return `futures::Streams` (#6438)
* refactor: Make arrow flight client use futures::Streams

* refactor: concision
2022-12-19 17:09:26 +00:00
Andrew Lamb 94c2f94ea1
refactor: Extract common ArrowFlight client into iox_arrow_flight (#6427)
* refactor: Extract common ArrowFlight client into iox_arrow_flight

* chore: Run cargo hakari tasks

* fix: clarify intent of iox_arrow_flight crate

* refactor: Apply suggestions from code review

Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>

* fix: loop --> while let

* fix: REmove make_tonic_error in favor of From impl

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-12-19 11:35:20 +00:00
dependabot[bot] c72734473c
chore(deps): Bump async-trait from 0.1.59 to 0.1.60 (#6433)
Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.59 to 0.1.60.
- [Release notes](https://github.com/dtolnay/async-trait/releases)
- [Commits](https://github.com/dtolnay/async-trait/compare/0.1.59...0.1.60)

---
updated-dependencies:
- dependency-name: async-trait
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-12-19 10:09:23 +00:00
Marco Neumann ffe8b98f47
refactor: clean up querier code base (#6404)
* refactor: `s/QuerierChunk/QuerierParquetChunk/g`

* refactor: isolate parquet chunk creation code

* refactor: fuse `chunk` and `chunk_parts`

* refactor: pass catalog cache instead of chunk adapter to state reconciler

* refactor: move parquet chunks creation into its own method

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-12-15 07:01:11 +00:00
kodiakhq[bot] d6afc9eee1
Merge branch 'main' into cn/ingester-persisted-file-count 2022-12-14 15:48:59 +00:00
Marco Neumann 4e36c590af
refactor: speed up partition sort key syncing (#6400)
* refactor: speed up partition sort key syncing

Prior to syncing, all chunks have a "locally correct" partiton sort key,
i.e. one that at least covers all chunk columns (this is ensured during
chunk creation, both for parquet chunks as well as ingester chunks).
However due to the timing, some chunks may have a newer (= longer)
partition sort key. All we need to do to fix this is to pick the longest
partition sort key, there is no need to go through the whole cache
system again.

For #6358.

* docs: improve

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2022-12-14 15:48:08 +00:00
kodiakhq[bot] 66c610f7b1
Merge branch 'main' into cn/ingester-persisted-file-count 2022-12-14 14:58:31 +00:00
Marco Neumann c51548f28b
refactor: improve concurrency during parquet chunk creation (#6376)
* refactor: de-correletate parquet file processing

* refactor: increase concurrent chunk creation jobs to 100 (from 10)

* docs: improve

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

* refactor: use deterministic RNG

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-12-13 16:16:09 +00:00
Carol (Nichols || Goulding) 44c3486db0
feat: Expire the querier's cache using info from ingester2
Fixes #6335.

For each table, keep track of the ingester UUIDs and associated
persisted Parquet file counts that we've seen from previous requests to
ingesters. When doing a query, determine if we should expire the Parquet
file catalog cache by looking at the new information from the ingesters.

If we see a new ingester UUID or if the number of persisted files for a
known ingester UUID is different than what we've stored, then we should
expire this table's Parquet file cache.

Either way, incorporate the new information into the saved values for
comparing with the next request.
2022-12-12 15:53:39 -05:00
Carol (Nichols || Goulding) b4b50d7dc1
feat: Collect the ingester UUIDs and persistence counts in the table
And pass them to the parquet file cache, which doesn't use them yet.
2022-12-12 15:52:56 -05:00
Carol (Nichols || Goulding) b0ba171742
feat: Keep track of ingester UUIDs and counts in IngesterPartition 2022-12-12 15:52:08 -05:00
Carol (Nichols || Goulding) 9c8b55c5be
docs: Fix some wrapping/typos in comments 2022-12-12 14:30:52 -05:00
Carol (Nichols || Goulding) 1c7f322a4e
feat: Keep track of and report number of Parquet files persisted
Per partition and starting over each time the ingester restarts.

Fixes #6334.
2022-12-12 11:45:00 -05:00
Carol (Nichols || Goulding) 33886970ef
refactor: Extract a helper fn for test messages
Reduces duplication, makes it easier to see what's different between the
tests, will make it easier to add another field in the next commit
2022-12-12 11:45:00 -05:00
kodiakhq[bot] 727efcbdee
Merge branch 'main' into cn/ingester2-uuid 2022-12-12 16:21:15 +00:00
Marco Neumann e49ffc02f8
refactor: faster sort key calculation (#6375)
Avoid nasty string lookups to dermine which columns make a parquet's
sort key.

For #6358.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-12-12 15:32:04 +00:00
Marco Neumann 6b1c43f01e
refactor: use column IDs for partition cache invalidation (#6374)
This shall avoid a bunch of string hashing during query planning.

For #6358.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-12-12 14:22:28 +00:00
Marco Neumann db933c44b6
refactor: store reverse column ID map for cached tables (#6360) 2022-12-09 11:58:24 +00:00
Marco Neumann 450b452148
refactor: avoid string-hashing of parquet file column names (#6359) 2022-12-09 11:51:18 +00:00
Carol (Nichols || Goulding) 2fd2d05ef6
feat: Identify each run of an ingester with a Uuid
And send that UUID in the Flight response for queries to that ingester
run.

Fixes #6333.
2022-12-08 17:22:52 -05:00
kodiakhq[bot] 6f7cb5ccf0
Merge branch 'main' into cn/ingester2-querier 2022-12-08 14:00:49 +00:00
Marco Neumann d4e321a2bd
refactor: add additional span around chunk spans (#6353)
* refactor: add additional span around chunk spans

* docs: improve

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2022-12-08 13:57:32 +00:00
Andrew Lamb 9175f4a0b5
chore: Upgrade datafusion to get correct support for multi-part identifiers (#6349)
* test: add tests for periods in measurement names

* chore: Update Datafusion

* chore: Update for changed APIs

* chore: Update expected plan output

* chore: Run cargo hakari tasks

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-12-08 11:27:13 +00:00
Carol (Nichols || Goulding) e13e668d26
refactor: Share more code in the querier in the RPC write path mode 2022-12-07 13:54:08 -05:00
Carol (Nichols || Goulding) b1c5ec4dee
fix: Correct compiler errors in places I missed while running crate tests 2022-12-07 10:25:36 -05:00
Carol (Nichols || Goulding) 9166ace796
feat: Make a mode for the querier to use ingester2 instead, behind the rpc_write feature flag 2022-12-07 09:56:50 -05:00
dependabot[bot] 1d38d400f0
chore(deps): Bump object_store from 0.5.1 to 0.5.2 (#6339)
* chore(deps): Bump object_store from 0.5.1 to 0.5.2

Bumps [object_store](https://github.com/apache/arrow-rs) from 0.5.1 to 0.5.2.
- [Release notes](https://github.com/apache/arrow-rs/releases)
- [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG-old.md)
- [Commits](https://github.com/apache/arrow-rs/compare/object_store_0.5.1...object_store_0.5.2)

---
updated-dependencies:
- dependency-name: object_store
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore: Run cargo hakari tasks

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-12-06 07:53:54 +00:00
Marco Neumann cd6a8a1a82
refactor: DF-driven on-demand mem limit instead of ahead-of-time heuristics (#6313)
* refactor: DF-driven on-demand mem limit instead of ahead-of-time heuristics

Closes #6310.

* refactor: rename and tune default exec mem limits

* fix: ingester2 bits after rebase
2022-12-05 12:38:28 +00:00
Marco Neumann ec2e72d223
test: simplify test executors (#6312)
Have a single global test executor w/ reasonable defaults. Also don't
require tests to join/await executor shutdowns (most tests forget this
anyways and will get a runtime warning).

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-12-02 11:38:18 +00:00
Marco Neumann befc6d668b
fix: avoid user error for unsupported querier<>ingester preds (#6238)
Fixes #6195.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-28 16:51:41 +00:00
Nga Tran 45d25b0af2
refactor: remove duplicate tests (#6243) 2022-11-28 16:39:57 +00:00
Andrew Lamb 1a1ea74cb7
chore: Upgrade datafusion again (#6160)
* Revert "Revert "chore: Update datafusion again (#6108)""

This reverts commit 766b3bbeb440618cfe332f6ee7d4f8a8217acc48.

* fix: Respect the partition sort key

* chore: update plans

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-22 19:28:26 +00:00
Andrew Lamb f89d542715
refactor: Minor cleanup of retention predicate code (#6211)
* refactor: Minor cleanup of retention predicate code

* fix: use cow
2022-11-22 18:28:54 +00:00
Nga Tran dd1755b23a
feat: querier filters data outsude retnetion period (#6209) 2022-11-22 15:41:00 +00:00
Marco Neumann 0c6afd7dbe
refactor: tune circuit breaker config (#6202)
At the moment it takes way to long to half-open and close circuits ones
they were opened.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-22 09:03:04 +00:00
dependabot[bot] 04c00bbb62
chore(deps): Bump bytes from 1.2.1 to 1.3.0 (#6199)
Bumps [bytes](https://github.com/tokio-rs/bytes) from 1.2.1 to 1.3.0.
- [Release notes](https://github.com/tokio-rs/bytes/releases)
- [Changelog](https://github.com/tokio-rs/bytes/blob/master/CHANGELOG.md)
- [Commits](https://github.com/tokio-rs/bytes/commits)

---
updated-dependencies:
- dependency-name: bytes
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-22 08:23:24 +00:00