Commit Graph

49187 Commits (3c9e6ed836fcd607af90589e438cb35ca83efeb6)

Author SHA1 Message Date
kodiakhq[bot] b5c0ecd140
Merge pull request #8767 from influxdata/savage/respect-ingest-system-state-during-wal-replay
feat(ingester): Respect `IngestState` during WAL replay
2023-09-21 10:19:37 +00:00
kodiakhq[bot] 12b02359aa
Merge branch 'main' into savage/respect-ingest-system-state-during-wal-replay 2023-09-21 10:13:27 +00:00
dependabot[bot] 82382b9b3a
chore(deps): Bump insta from 1.31.0 to 1.32.0
Bumps [insta](https://github.com/mitsuhiko/insta) from 1.31.0 to 1.32.0.
- [Changelog](https://github.com/mitsuhiko/insta/blob/master/CHANGELOG.md)
- [Commits](https://github.com/mitsuhiko/insta/compare/1.31.0...1.32.0)

---
updated-dependencies:
- dependency-name: insta
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-09-21 10:10:26 +00:00
Dom bd1b668dbb
Merge branch 'main' into dependabot/cargo/tokio-util-0.7.9 2023-09-21 11:04:42 +01:00
Dom 29462d0fe5
Merge pull request #8789 from influxdata/dependabot/cargo/smallvec-1.11.1
chore(deps): Bump smallvec from 1.11.0 to 1.11.1
2023-09-21 11:04:28 +01:00
dependabot[bot] 37d37f3626
chore(deps): Bump smallvec from 1.11.0 to 1.11.1
Bumps [smallvec](https://github.com/servo/rust-smallvec) from 1.11.0 to 1.11.1.
- [Release notes](https://github.com/servo/rust-smallvec/releases)
- [Commits](https://github.com/servo/rust-smallvec/compare/v1.11.0...v1.11.1)

---
updated-dependencies:
- dependency-name: smallvec
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-09-21 02:02:29 +00:00
dependabot[bot] 661acc77f0
chore(deps): Bump tokio-util from 0.7.8 to 0.7.9
Bumps [tokio-util](https://github.com/tokio-rs/tokio) from 0.7.8 to 0.7.9.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-util-0.7.8...tokio-util-0.7.9)

---
updated-dependencies:
- dependency-name: tokio-util
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-09-21 02:01:19 +00:00
Brandon Pfeifer b3b982d746
chore: update MacOS executor to M1 (#24372) 2023-09-20 14:30:21 -04:00
kodiakhq[bot] 3e5196bbda
Merge pull request #8782 from influxdata/cn/independent-refactorings
refactor: Improvements made during UpsertSchema work that are actually independent
2023-09-20 16:56:41 +00:00
Carol (Nichols || Goulding) d1f355bb58
fix: Remove an Arc wrapping that's no longer needed 2023-09-20 11:14:20 -04:00
Carol (Nichols || Goulding) 7c31771c64
refactor: Implement NamespaceSchema proto conversion as From 2023-09-20 11:13:53 -04:00
Carol (Nichols || Goulding) 11f916eee1
refactor: Extract test helper functions to improve readability 2023-09-20 10:42:26 -04:00
Fraser Savage cb7a26cb65
refactor(ingester): Revert `IngestState` methods to crate public
These methods should not be used at all outside the ingester, and only
the type itself needs to be accessible in the WAL replay benchmark.
2023-09-20 15:03:39 +01:00
Carol (Nichols || Goulding) eda4ccdf1a
test: Clean up existing test for consistency with current test style
- Extract some shared values
- Remove an unneeded Arc::clone
- Change expects that don't provide much clarity to unwraps
- Give the test a more distinctive and less redundant name
2023-09-20 10:01:06 -04:00
Carol (Nichols || Goulding) 257a6d2552
fix: Generating proto doesn't need ownership of an Arc 2023-09-20 10:01:06 -04:00
Carol (Nichols || Goulding) 265941f1a8
refactor: Only return NamespaceSchema proto so it can be reused in different responses 2023-09-20 09:57:06 -04:00
Dom 28c3637c01
Merge pull request #8780 from influxdata/dom/enable-merkle-tracking
feat(router): init anti-entropy merkle search tree
2023-09-20 14:41:30 +01:00
Dom d6f87cc569
Merge branch 'main' into dom/enable-merkle-tracking 2023-09-20 14:25:45 +01:00
Marco Neumann 5269285250
refactor: isolate V1 ingester->querier client (#8778)
Isolate the actual client from the query planning parts
(`Ingester{Chunk,Partition}`) so we can hook up the V2 client in #8350.

The PR looks large, but it just moves code around and decouples the
error handling.
2023-09-20 12:55:30 +00:00
Dom Dwyer 39768fa989
feat(router): init anti-entropy merkle search tree
Adds initialisation code to the routers to instantiate an
AntiEntropyActor, pre-populate the Merkle Search Tree during schema
warmup, and maintain it at runtime.
2023-09-20 13:47:16 +02:00
Dom 3ca05369dc
Merge pull request #8777 from influxdata/dom/rust-bump
build: bump rust to 1.72.1
2023-09-20 12:08:00 +01:00
Dom 6d0ef588f4
Merge branch 'main' into dom/rust-bump 2023-09-20 11:10:10 +01:00
Dom 485dbe9a82
Merge pull request #8776 from influxdata/dom/merkle-snap
feat: merkle search tree snapshot generation
2023-09-20 11:10:03 +01:00
Dom Dwyer 3c9f282837
build: bump rust to 1.72.1
https://github.com/rust-lang/rust/releases/tag/1.72.1
2023-09-20 12:00:04 +02:00
Dom Dwyer ec96545c34
feat: merkle search tree snapshot generation
Allow an owned, compact content summary snapshot of the merkle search
tree state to be read from the MST actor.

This snapshot describes the structure of the MST in a compact/efficient
representation suitable for exchanging over the network between peers.
2023-09-20 11:43:47 +02:00
Marco Neumann 7b4dbb570d
refactor: clean up query log impl (#8775)
- take span ctx directly instead of the execution context (see point
  below)
- use the original trace ID (i.e. the one that we get via HTTP header),
  NOT some internal span/trace because the latter is only available for
  sampled requests, while the former one is generally more available (we
  also do that for the stdout logs btw.)
- minor code clean ups

This is prep work for #8774.
2023-09-20 09:20:19 +00:00
Marco Neumann fd50d7cfcf
Merge pull request #8771 from influxdata/crepererum/issue8705
feat: prune partitions before creating parquet chunks
2023-09-20 10:49:48 +02:00
Marco Neumann 4219d7d318
feat: prune partitions before creating parquet chunks
This should lower query latency, because creating many chunks just to
throw them away afterwards isn't exactly cheap.

Closes #8705.
2023-09-20 10:43:39 +02:00
Marco Neumann 60e795e15e
Merge pull request #8768 from influxdata/crepererum/issue8350b
refactor: allow streaming record batches into query
2023-09-20 10:36:23 +02:00
kodiakhq[bot] 809e0f4a42
Merge branch 'main' into crepererum/issue8350b 2023-09-20 08:21:04 +00:00
Andrew Lamb 65d0ea2055
chore: Update DataFusion (#8765)
* chore: Update DataFusion pin again

* chore: update for different type

* fix: statistics

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-09-19 22:26:53 +00:00
Joe-Blount c05739ff20
chore(compactor): move CompactRange up to RoundInfo (#8736)
* chore(compactor): move CompactRange up to RoundInfo

* chore: insta updates from compactor CompactRange refactor

* chore: lint cleanup

* chore: addressing some of the comments

* chore: remove duplicated done check

* chore: variable renaming
2023-09-19 16:53:36 +00:00
Nga Tran 0a7ae5603f
feat: make sort_key_ids non-optional (#8750)
* feat: make sort_key_ids non-optional

* refactor: address review comments

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-09-19 14:22:29 +00:00
Dom a74aab709c
Merge pull request #8770 from influxdata/dom/merkle-bump-version
build: latest merkle-search-tree version
2023-09-19 15:15:41 +01:00
Dom 8742de9819
Merge branch 'main' into dom/merkle-bump-version 2023-09-19 14:44:28 +01:00
Martin Hilton 39e35eb0a7
feat(querier): convert timezone sent from ingester (#8769)
* feat(querier): convert timezone sent from ingester

In order to facilitate the change of default timezone from None to
UTC make the querier able to convert the timezone sent from the
ingester into its preferred type. This can convert from None to UTC
or UTC to None and should allow the interaction between ingesters
and queriers with differing settings for the default timezone.

To allow testing of both conversions, the type checking has been
made more liberal when converting an arrow schema to an IOx one.

* fix: fmt

* fix: lint

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-09-19 13:31:48 +00:00
Dom Dwyer 33a441fbec
build: pick up latest merkle-search-tree version
Pick up the improvements allowing construction of PageRangeSnapshots
from owned keys / no cloning.
2023-09-19 14:09:07 +02:00
Marco Neumann 74b1a5e368
refactor: allow streaming record batches into query
For #8350, we won't have all the record batches from the ingester during
planning but we'll stream them during the execution. Technically the
DF plan is already based on streams, it's just `QueryChunkData` that
required a materialized `Vec<RecordBatch>`. This change moves the stream
creation up so a chunk can decide to either use `QueryChunkData::in_mem`
(which conveniently creates the stream) or it can provide its own
stream.
2023-09-19 13:53:37 +02:00
Fraser Savage f6d6dd9b5b
test(ingester): Add timeout panic to blocked `IngestState` WAL replay test 2023-09-19 11:58:52 +01:00
Marco Neumann ca791386eb
refactor: clean up chunk pruning metrics/observers (#8766)
There were like 3 layers (metrics, observer, pruner) that all only had
a single implementation. IIRC this is a leftover from older code where
`iox_query` was more involved in query pruning. With #8705 however the
chunk pruning is pushed even closer to the source (i.e. the querier
code) and it is just more practical to perform the metric management
directly in the querier code (this was the case already, it was just
somewhat hidden by the interfaces). This also allows us to add metrics
for #8705 more easily.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-09-19 10:53:14 +00:00
Fraser Savage 9e80e03069
feat(ingester): Respect non-disk full `IngestStateError` during WAL replay
This change causes WAL replay to mimic the RPC write handler, mostly
respecting the `IngestState` before applying an op while replaying a WAL
file. The caveat is that `DiskFull` is ignored as WAL replay specifically
helps with this state.
2023-09-19 11:34:24 +01:00
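The behaviour described in the commit above (9e80e03069) can be pictured with a minimal sketch. This is not the ingester's code: the enum variants other than `DiskFull`, the `check_for_replay` and `replay` functions, and the use of `u64` as a stand-in for a WAL op are all assumptions made for illustration only.

```rust
// Hypothetical, simplified stand-in for the ingester's error type.
// Only `DiskFull` is named in the commit message; the other variants
// are assumed here purely for illustration.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum IngestStateError {
    DiskFull,
    PersistSaturated,
    GracefulStop,
}

/// Decide whether a replayed op may be applied given the current state.
/// `DiskFull` is ignored, because WAL replay is what helps the system
/// get out of that state; every other error is respected.
fn check_for_replay(state: Result<(), IngestStateError>) -> Result<(), IngestStateError> {
    match state {
        Err(IngestStateError::DiskFull) => Ok(()),
        other => other,
    }
}

/// Replay a list of ops (plain `u64`s in this sketch), reading the
/// ingest state before each one. Returns how many ops were applied,
/// or the first state error that is not ignored.
fn replay(
    ops: &[u64],
    state: impl Fn() -> Result<(), IngestStateError>,
) -> Result<usize, IngestStateError> {
    let mut applied = 0;
    for _op in ops {
        check_for_replay(state())?;
        applied += 1;
    }
    Ok(applied)
}
```

In this sketch the state is read once per op, mimicking a write handler that checks the state before each write; whether the real replay loop checks per op or per batch is not stated in the log.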
dependabot[bot] b135cb8d23
chore(deps): Bump pbjson from 0.5.1 to 0.6.0 (#8755)
Bumps [pbjson](https://github.com/influxdata/pbjson) from 0.5.1 to 0.6.0.
- [Commits](https://github.com/influxdata/pbjson/commits)

---
updated-dependencies:
- dependency-name: pbjson
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-09-19 10:24:39 +00:00
Fraser Savage 8548ea3b31
refactor(ingester): Pass `IngestState` to WAL replay
This requires the `IngestState` and associated types to be public so
that WAL replay can be called by the benchmarker. The module containing
the `IngestState` is private and is only conditionally re-exported under
the benchmark feature as part of the `internal_implementation_details`
module.
2023-09-19 11:21:42 +01:00
dependabot[bot] 9123c6126d
chore(deps): Bump predicates from 3.0.3 to 3.0.4 (#8761)
Bumps [predicates](https://github.com/assert-rs/predicates-rs) from 3.0.3 to 3.0.4.
- [Changelog](https://github.com/assert-rs/predicates-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/assert-rs/predicates-rs/compare/v3.0.3...v3.0.4)

---
updated-dependencies:
- dependency-name: predicates
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-09-19 10:18:03 +00:00
kodiakhq[bot] 4a653a3ff7
Merge pull request #8666 from influxdata/savage/automatic-recovery-from-incomplete-wal-write
feat(ingester): Automatically recover from an incomplete WAL write
2023-09-19 09:41:57 +00:00
kodiakhq[bot] 87a25cf3cb
Merge branch 'main' into savage/automatic-recovery-from-incomplete-wal-write 2023-09-19 09:35:33 +00:00
Dom 5fbe7b80b9
Merge pull request #8762 from influxdata/dependabot/cargo/clap-4.4.4
chore(deps): Bump clap from 4.4.3 to 4.4.4
2023-09-19 10:35:18 +01:00
kodiakhq[bot] d034a0ed5f
Merge branch 'main' into savage/automatic-recovery-from-incomplete-wal-write 2023-09-19 09:34:45 +00:00
Dom 500112bd47
Merge branch 'main' into dependabot/cargo/clap-4.4.4 2023-09-19 10:28:35 +01:00
Marco Neumann 949635b324
feat: use time-based column ranges in querier (#8732)
Use output of #8725 within the column ranges of the querier. Currently
this won't have any effect since the column ranges are only used to
prune parquet files and parquet files come with their own, more precise
time range (and that information has priority). However for #8705 we
want to use it to prune partitions before needing to deal with the
parquet files.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-09-19 09:13:50 +00:00
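As a rough illustration of the idea in the commit above (949635b324), the sketch below drops partitions whose time range cannot overlap the query's time range before any per-file work happens. The `TimeRange` and `Partition` types and the `prune_partitions` function are hypothetical and are not taken from the querier's code.

```rust
/// Hypothetical inclusive time range (e.g. nanosecond timestamps).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct TimeRange {
    min: i64,
    max: i64,
}

impl TimeRange {
    /// Two inclusive ranges overlap if each starts before the other ends.
    fn overlaps(&self, other: &TimeRange) -> bool {
        self.min <= other.max && other.min <= self.max
    }
}

/// Hypothetical partition carrying a time-based column range.
#[derive(Debug)]
struct Partition {
    id: u32,
    time_range: TimeRange,
}

/// Keep only the partitions that could contain rows for the query,
/// so that no chunks are created for the rest.
fn prune_partitions(partitions: Vec<Partition>, query: TimeRange) -> Vec<Partition> {
    partitions
        .into_iter()
        .filter(|p| p.time_range.overlaps(&query))
        .collect()
}
```

A partition range is coarser than the per-file time range a parquet file carries, so this kind of check can only rule partitions out; surviving partitions would still need file-level pruning.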