775 Commits (9b52bfdeaaa437daa0558f567b2dfd7ef215f47f)

Author SHA1 Message Date
Dom a37b85804d
Merge branch 'main' into dom/cached-fsm-schema 2023-07-27 10:31:02 +01:00
Dom Dwyer ef158a664b
docs: ref-clone indicators for Schema
Cloning a Schema looks expensive, but it's not!
2023-07-27 11:12:11 +02:00
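A minimal sketch of why such a clone can be cheap, assuming the schema's contents sit behind a reference-counted pointer. The `Schema` type below is a hypothetical stand-in written for this illustration, not the project's actual type.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for a schema type: the column list lives behind an
/// `Arc`, so every clone shares the same allocation.
#[derive(Debug, Clone)]
pub struct Schema {
    columns: Arc<Vec<String>>,
}

impl Schema {
    pub fn new(columns: Vec<String>) -> Self {
        Self {
            columns: Arc::new(columns),
        }
    }

    /// Returns true if both handles point at the same underlying allocation.
    pub fn shares_storage_with(&self, other: &Schema) -> bool {
        Arc::ptr_eq(&self.columns, &other.columns)
    }

    /// Number of live handles to the shared column list.
    pub fn handle_count(&self) -> usize {
        Arc::strong_count(&self.columns)
    }
}
```

Calling `clone()` on this type only increments the reference count; the column names themselves are not copied.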
Fraser Savage 3133f9c2eb
fix(ingester): Skip empty writes with no data during WAL replay
In very rare cases a panic mid-write can result in a partially completed
write to the WAL which contains no table data. This is now not replayed
(as there is nothing to replay) and does not panic when encountered,
but tracks the occurrence into the WAL replayed ops metric and logs a
warning.
2023-07-26 10:43:10 +01:00
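The replay behaviour described in the commit above could look roughly like the following. This is a sketch only: `WalOp`, `ReplayMetrics` and `replay` are hypothetical names, and the real WAL entry format and metric types are not shown in this log.

```rust
/// Hypothetical WAL entry: a sequence number plus per-table row payloads.
#[derive(Debug, Clone)]
pub struct WalOp {
    pub sequence_number: u64,
    pub tables: Vec<(String, Vec<String>)>,
}

/// Hypothetical replay counters standing in for the replayed ops metric.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct ReplayMetrics {
    pub applied: u64,
    pub skipped_empty: u64,
}

/// Replays `ops` through `apply`, skipping entries that carry no table data
/// instead of panicking on them.
pub fn replay<F: FnMut(&WalOp)>(ops: &[WalOp], mut apply: F) -> ReplayMetrics {
    let mut metrics = ReplayMetrics::default();
    for op in ops {
        if op.tables.is_empty() {
            // Nothing to replay: record the occurrence and warn.
            eprintln!("warning: skipping empty WAL op {}", op.sequence_number);
            metrics.skipped_empty += 1;
            continue;
        }
        apply(op);
        metrics.applied += 1;
    }
    metrics
}
```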
Dom Dwyer 41c9c0f396
perf(ingester): reusable FSM / RecordBatch schemas
Cache the merged Schema of all the RecordBatch within a buffer at
snapshot generation time.

To be useful, this cached schema is made available to the PartitionData
for re-use, allowing the schema of "hot" data within a partition's
mutable buffer to be read without generating a RecordBatch first.
2023-07-25 17:10:06 +02:00
Dom Dwyer b79b120788
refactor: per-partition summary statistics
Provide row count & timestamp min/max statistics on a per-partition
basis.

This commit builds on the FSM summary statistics, merging all FSM
statistics across all data within the PartitionData (in various states)
and making them available to the caller.
2023-07-25 14:44:38 +02:00
Dom Dwyer b4b7822f2b
perf: cache summary statistics in partition FSM
Cache the row count & timestamp min/max values within the partition FSM
/ buffer, and make them available through the Queryable trait.

This allows the PartitionData to read the row count of a buffer (either
"hot" for writes, a "snapshot" of immutable RecordBatch, or "persisting"
for in-flight persisting data).

These values will enable early partition pruning.
2023-07-25 14:44:37 +02:00
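One way to picture the caching in the commit above is a buffer that updates its statistics on every write, so readers never have to scan the rows. The `Stats` and `Buffer` types here are hypothetical and far simpler than a real partition FSM; the `may_overlap` check is an assumed example of how such values could support pruning.

```rust
/// Hypothetical summary statistics cached alongside buffered rows.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Stats {
    pub row_count: usize,
    pub min_ts: i64,
    pub max_ts: i64,
}

/// Hypothetical buffer holding only timestamps, to keep the sketch small.
#[derive(Debug, Default)]
pub struct Buffer {
    timestamps: Vec<i64>,
    stats: Option<Stats>,
}

impl Buffer {
    /// Appends a row and updates the cached statistics in O(1).
    pub fn write(&mut self, ts: i64) {
        self.timestamps.push(ts);
        self.stats = Some(match self.stats {
            None => Stats { row_count: 1, min_ts: ts, max_ts: ts },
            Some(s) => Stats {
                row_count: s.row_count + 1,
                min_ts: s.min_ts.min(ts),
                max_ts: s.max_ts.max(ts),
            },
        });
    }

    /// The buffered timestamps, in write order.
    pub fn rows(&self) -> &[i64] {
        &self.timestamps
    }

    /// Cached statistics; `None` while the buffer is empty.
    pub fn stats(&self) -> Option<Stats> {
        self.stats
    }

    /// Pruning check: can any row fall inside the inclusive range `[start, end]`?
    pub fn may_overlap(&self, start: i64, end: i64) -> bool {
        match self.stats {
            Some(s) => s.min_ts <= end && s.max_ts >= start,
            None => false,
        }
    }
}
```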
Dom Dwyer 5c3e19742a
refactor: remove unused projection code
This code was superseded in:

    https://github.com/influxdata/influxdb_iox/pull/8154

This code is now unused.
2023-07-25 12:54:20 +02:00
Dom Dwyer 32414acb00
test(bench): ingester query partition pruning
Adds benchmarks that exercise partition pruning during query execution
within the ingester, for varying partition counts within a table, and
varying row counts within each partition.
2023-07-24 17:26:48 +02:00
dependabot[bot] faa8d44492
chore(deps): Bump thiserror from 1.0.43 to 1.0.44 (#8315)
Bumps [thiserror](https://github.com/dtolnay/thiserror) from 1.0.43 to 1.0.44.
- [Release notes](https://github.com/dtolnay/thiserror/releases)
- [Commits](https://github.com/dtolnay/thiserror/compare/1.0.43...1.0.44)

---
updated-dependencies:
- dependency-name: thiserror
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-24 10:18:44 +00:00
dependabot[bot] cd31492e5b
chore(deps): Bump async-trait from 0.1.71 to 0.1.72 (#8317)
Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.71 to 0.1.72.
- [Release notes](https://github.com/dtolnay/async-trait/releases)
- [Commits](https://github.com/dtolnay/async-trait/compare/0.1.71...0.1.72)

---
updated-dependencies:
- dependency-name: async-trait
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-24 10:07:18 +00:00
kodiakhq[bot] 0ed4e8509b
Merge branch 'main' into savage/await-enqueue-rotation-returned-receiver-during-shutdown 2023-07-24 09:17:55 +00:00
kodiakhq[bot] 76ecfcc815
Merge branch 'main' into cn/cleanups 2023-07-21 13:22:50 +00:00
Carol (Nichols || Goulding) 3ac0e30ac9
fix: Remove namespace ID from a partition identifier type (#8288)
I'm going to make a change in the future that removes the access to the
namespace ID from this code, and it's not needed anyway as partitions
are uniquely identifiable by only table ID and partition key.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-21 13:22:33 +00:00
Fraser Savage 6347d833e7
fix(ingester): Graceful shutdown must wait for file rotation receiver
This adds a second level of wait to the WAL drain & rotate before
allowing shutdown to proceed. Only once the returned receiver has
notified the caller that the notification has been handled may it await
the empty waker set.
2023-07-21 12:31:04 +01:00
dependabot[bot] 0d0f07b34e
chore(deps): Bump tempfile from 3.6.0 to 3.7.0 (#8297)
* chore(deps): Bump tempfile from 3.6.0 to 3.7.0

Bumps [tempfile](https://github.com/Stebalien/tempfile) from 3.6.0 to 3.7.0.
- [Changelog](https://github.com/Stebalien/tempfile/blob/master/CHANGELOG.md)
- [Commits](https://github.com/Stebalien/tempfile/compare/v3.6.0...v3.7.0)

---
updated-dependencies:
- dependency-name: tempfile
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore: Run cargo hakari tasks

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-21 09:06:34 +00:00
Carol (Nichols || Goulding) 3dd29384ed
fix: Remove unneeded allow unused_imports and remove unused imports 2023-07-20 14:20:55 -04:00
Carol (Nichols || Goulding) fce4f3f346
test: Remove outdated test comments
This is not at all what this test is doing; probably copypasta
2023-07-20 14:13:15 -04:00
dependabot[bot] 4c4c9f731c
chore(deps): Bump uuid from 1.4.0 to 1.4.1 (#8256)
Bumps [uuid](https://github.com/uuid-rs/uuid) from 1.4.0 to 1.4.1.
- [Release notes](https://github.com/uuid-rs/uuid/releases)
- [Commits](https://github.com/uuid-rs/uuid/compare/1.4.0...1.4.1)

---
updated-dependencies:
- dependency-name: uuid
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-18 12:30:42 +00:00
dependabot[bot] e33a078128
chore(deps): Bump paste from 1.0.13 to 1.0.14 (#8244)
Bumps [paste](https://github.com/dtolnay/paste) from 1.0.13 to 1.0.14.
- [Release notes](https://github.com/dtolnay/paste/releases)
- [Commits](https://github.com/dtolnay/paste/compare/1.0.13...1.0.14)

---
updated-dependencies:
- dependency-name: paste
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-07-17 16:10:02 +00:00
Carol (Nichols || Goulding) cf046d0b3e
refactor: Extract a `From` implementation for creating TransitionPartitionId 2023-07-17 10:34:01 -04:00
Carol (Nichols || Goulding) a9b788b58f
feat: Collate chunks based on their partition hash id if they have it 2023-07-17 10:34:01 -04:00
Carol (Nichols || Goulding) c2606ff3ac
test: Add and use methods creating arbitrary TransitionPartitionId and PartitionHashIds 2023-07-17 09:56:55 -04:00
Carol (Nichols || Goulding) 745895c643
test: Ensure ingester only sends partition hash id if the partition provider gives it one
Also set a partition hash ID on all partitions built in tests with the
`PartitionDataBuilder` by default.
2023-07-17 09:56:55 -04:00
Carol (Nichols || Goulding) 54adecca58
test: Send partition_hash_id to the querier iff PartitionResponse gets a partition hash ID
Test to make sure I don't break this behavior before I intend to
2023-07-17 09:54:44 -04:00
Fraser Savage a2ca5ca17c
Merge branch 'main' into savage/hook-up-wal-reference-counter-actor 2023-07-17 10:49:45 +01:00
Carol (Nichols || Goulding) 10a0f8e3bf
fix: Remove ::default() when constructing unit structs
As recommended by https://rust-lang.github.io/rust-clippy/master/index.html#default_constructed_unit_structs
2023-07-14 10:50:55 -04:00
Carol (Nichols || Goulding) d40bc54b71
fix: Remove unneeded double derefs found with new lint suspicious_double_ref_op 2023-07-14 10:25:21 -04:00
kodiakhq[bot] 699fb70616
Merge branch 'main' into savage/propagate-tracing-spans-from-router-to-ingester 2023-07-14 12:28:56 +00:00
Fraser Savage 181cc0096c
fix(ingester): Prevent WAL segment deletion wait deadlock during shutdown
The WAL reference tracker's inactive segment empty notification only fires
when dropping to 0, not if it is already 0 (it may do this multiple times
over the lifetime of the ingester). This makes sure that the graceful
shutdown notifier listener is set up before the WAL is rotated and the
file is enqueued in the tracker for deletion.
2023-07-13 16:26:16 +01:00
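The ordering problem in the commit above can be illustrated with an edge-triggered notifier: it fires only on the transition from non-empty to empty, so a listener that subscribes after the set is already empty waits forever. The `SegmentTracker` below is a hypothetical, single-threaded model built on `std::sync::mpsc`, not the ingester's actual reference tracker.

```rust
use std::collections::HashSet;
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical tracker of inactive WAL segments awaiting deletion.
#[derive(Default)]
pub struct SegmentTracker {
    inactive: HashSet<u64>,
    waiters: Vec<Sender<()>>,
}

impl SegmentTracker {
    /// Registers interest in the next "inactive set became empty" event.
    pub fn subscribe(&mut self) -> Receiver<()> {
        let (tx, rx) = channel();
        self.waiters.push(tx);
        rx
    }

    /// Enqueues a rotated segment for deletion.
    pub fn enqueue(&mut self, segment_id: u64) {
        self.inactive.insert(segment_id);
    }

    /// Marks a segment as deleted. Waiters are notified only when this call
    /// takes the set from non-empty to empty, never when it was already empty.
    pub fn release(&mut self, segment_id: u64) {
        let removed = self.inactive.remove(&segment_id);
        if removed && self.inactive.is_empty() {
            for tx in self.waiters.drain(..) {
                // A waiter that has gone away is not an error here.
                let _ = tx.send(());
            }
        }
    }
}
```

In this model, subscribing before `enqueue` and `release` guarantees the notification is observed, which mirrors the ordering the fix enforces.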
Fraser Savage 4572057a11
Merge branch 'main' into savage/hook-up-wal-reference-counter-actor 2023-07-13 14:37:42 +01:00
Dom Dwyer 787f9a57dc
test: projection with and without "time"
Add test cases driving projection logic for BufferTree queries with, and
without the "time" column.
2023-07-13 14:42:52 +02:00
Dom Dwyer 7f7d1f2ee7
fix(ingester): projection without time column
The ingester can project arbitrary columns at query time, and has no
special requirement that the "time" column be part of that projection.

Because the timestamp summary generation explicitly requires the time
column to exist, it panics when there's no "time" column in the
projection - this is a bit of a modelling mismatch more than anything.
2023-07-13 14:22:48 +02:00
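A sketch of one way to remove the modelling mismatch described above: make the timestamp summary optional, so a projection without "time" yields no summary rather than a panic. The log does not show how the actual fix was written; `Batch`, `project` and `timestamp_range` are hypothetical names for this illustration.

```rust
use std::collections::BTreeMap;

/// Hypothetical columnar batch: column name -> integer values.
pub type Batch = BTreeMap<String, Vec<i64>>;

/// Keeps only the requested columns; unknown names are ignored.
pub fn project(batch: &Batch, columns: &[&str]) -> Batch {
    columns
        .iter()
        .filter_map(|name| {
            batch
                .get(*name)
                .map(|values| ((*name).to_string(), values.clone()))
        })
        .collect()
}

/// Returns the (min, max) of the "time" column, or `None` when the column
/// is absent from the projection or holds no rows.
pub fn timestamp_range(batch: &Batch) -> Option<(i64, i64)> {
    let time = batch.get("time")?;
    let min = *time.iter().min()?;
    let max = *time.iter().max()?;
    Some((min, max))
}
```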
Fraser Savage 4c54d10098
refactor(ingester): Simplify test code and reference actor handle passing
As pointed out, use of the turbofish for the `MockPersistQueue` default
constructor can be avoided by a specialised `Default` implementation
on the type. The WAL reference actor handle is internally refcounted,
so this commit also stops wrapping it in an `Arc`.

Co-authored-by: Dom <dom@itsallbroken.com>
2023-07-13 12:20:58 +01:00
Fraser Savage 516880eeb8
docs(ingester): Make it clear that getting op `SequenceNumberSet` is not free
The type is not copy and does not use a cached value - it collects a new
owned set.

Co-authored-by: Dom <dom@itsallbroken.com>
2023-07-13 11:42:30 +01:00
Dom Dwyer 7bd6e90830
perf: only send metadata for relevant partitions
When partition pruning is possible, it skips sending the data for
partitions that have no effect on the query outcome.

This commit does the same for the partition metadata - these frames can
form a significant portion of the query response when the row count is
low, and for pruned partitions have no bearing on the query result.
2023-07-12 18:38:43 +02:00
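The idea in the commit above, sketched under simplifying assumptions: the response is a flat list of frames, and the pruning decision is supplied as a closure. `Frame`, `Partition` and `respond` are hypothetical names; the real query response types are not part of this log.

```rust
/// Hypothetical response frames for a partitioned query.
#[derive(Debug, PartialEq, Eq)]
pub enum Frame {
    PartitionMetadata { partition_key: String },
    Data { partition_key: String, rows: Vec<i64> },
}

/// Hypothetical partition holding a key and some buffered values.
pub struct Partition {
    pub key: String,
    pub rows: Vec<i64>,
}

/// Builds the response, emitting neither metadata nor data frames for
/// partitions rejected by the `keep` pruning predicate.
pub fn respond<F: Fn(&Partition) -> bool>(partitions: &[Partition], keep: F) -> Vec<Frame> {
    let mut frames = Vec::new();
    for partition in partitions {
        if !keep(partition) {
            // Pruned: this partition cannot affect the query result.
            continue;
        }
        frames.push(Frame::PartitionMetadata {
            partition_key: partition.key.clone(),
        });
        frames.push(Frame::Data {
            partition_key: partition.key.clone(),
            rows: partition.rows.clone(),
        });
    }
    frames
}
```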
kodiakhq[bot] e73116a122
Merge branch 'main' into cn/query-catalog-with-either-partition-identifier 2023-07-12 14:51:02 +00:00
Fraser Savage 729851be58
test(ingester): Integration test for RPC write trace context inheritance 2023-07-12 15:48:41 +01:00
Fraser Savage 458b1bf1a6
feat(ingester): Extract SpanContext from RPC write request
Ensure that if a `SpanContext` type is present in the request that the
trace ID is used for spans in the RPC write path.
2023-07-12 14:22:58 +01:00
Dom Dwyer af56985d70
refactor(ingester): emit span for query handler
Emit a span that covers the entire flight query handler.
2023-07-12 14:42:43 +02:00
Carol (Nichols || Goulding) 22c17fb970
feat: Abstract over which partition ID type we're using to list Parquet files 2023-07-10 13:40:01 -04:00
Carol (Nichols || Goulding) c1e42651ec
feat: Abstract over which partition ID type we're using to compare and swap sort keys 2023-07-10 13:39:19 -04:00
Carol (Nichols || Goulding) eec31b7f00
feat: Abstract over which partition ID type we're using to get a partition from the catalog 2023-07-10 10:43:20 -04:00
kodiakhq[bot] 5fa861abab
Merge branch 'main' into savage/individually-sequence-partitions-within-writes 2023-07-10 12:48:37 +00:00
Dom 341dcf2124
Merge branch 'main' into dom/partition-query-concurrency 2023-07-10 10:24:09 +01:00
dependabot[bot] 12317fee23
chore(deps): Bump async-channel from 1.8.0 to 1.9.0
Bumps [async-channel](https://github.com/smol-rs/async-channel) from 1.8.0 to 1.9.0.
- [Release notes](https://github.com/smol-rs/async-channel/releases)
- [Changelog](https://github.com/smol-rs/async-channel/blob/master/CHANGELOG.md)
- [Commits](https://github.com/smol-rs/async-channel/compare/v1.8.0...v1.9.0)

---
updated-dependencies:
- dependency-name: async-channel
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-07-10 01:42:26 +00:00
Dom Dwyer ea38e93511
test(bench): concurrent partition queries
Benchmark the performance of concurrent queries against a single
partition, varying the number of concurrent queries and size of buffered
data in the partition.
2023-07-07 16:27:44 +02:00
kodiakhq[bot] e06b6987f0
Merge branch 'main' into savage/remove-op-level-sequence-number-for-writes 2023-07-07 10:12:04 +00:00
dependabot[bot] 057ee40cb9
chore(deps): Bump thiserror from 1.0.41 to 1.0.43 (#8181)
Bumps [thiserror](https://github.com/dtolnay/thiserror) from 1.0.41 to 1.0.43.
- [Release notes](https://github.com/dtolnay/thiserror/releases)
- [Commits](https://github.com/dtolnay/thiserror/compare/1.0.41...1.0.43)

---
updated-dependencies:
- dependency-name: thiserror
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-07-07 09:25:12 +00:00
Fraser Savage 197078d6c7
feat(ingester): Use WAL reference actor for file deletion during graceful shutdown
This change integrates the WAL reference actor with the graceful
shutdown buffer drain & persist behaviour, relying on its knowledge of
partial persistence for deletion and shutdown timing.
2023-07-06 17:18:12 +01:00
Dom a005f344d8
Merge branch 'main' into 7899/wal-disk-metrics 2023-07-06 14:44:11 +01:00