Commit Graph

13085 Commits (a49a0aaa213e49cfc4440ac1cc6c04193b949733)

Author SHA1 Message Date
wiedld d43300635e
Revert "feat(idpe-17789): scheduler job_status() (#8202)" (#8213)
This reverts commit 3dabccd84b.
2023-07-11 10:33:56 -07:00
Joe-Blount c5a4912399 chore: add compactor tracing test case 2023-07-11 10:43:09 -05:00
Joe-Blount 23aff4afc4 chore: add more useful info to compactor tracing 2023-07-11 10:42:32 -05:00
wiedld 3dabccd84b
feat(idpe-17789): scheduler job_status() (#8202)
* feat(idpe-17789): scheduler job_status() (#8121)

This block of work moves into the scheduler some of the downstream actions associated with compaction outcomes. Which responsibilities stay in the compactor and which move to the scheduler roughly follows a heuristic based on (a) whether the action affects global catalog state (i.e. commits and partition skipping), (b) whether the logging concerns compactor health (e.g. PartitionDoneSink logging outcomes) or system health (e.g. logging commits), and (c) the need to report to the scheduler any errors encountered during compaction. This boundary is subject to change as we move forward.

Also, a noted caveat (TODO) on this commit. We have a CompactionJob which is used to track work handed off to each compactor. Currently it still uses the partition_id for tracking, but the followup PR will start moving the compactor to have more CompactionJob uuid awareness.
2023-07-11 08:41:12 -07:00
Andrew Lamb b24f9c81ba
chore: Update DataFusion pin, update for API changes (#8199) 2023-07-11 13:36:38 +00:00
Dom aaaa669bfb
Merge branch 'main' into cn/query-catalog-with-either-partition-identifier 2023-07-11 10:47:56 +01:00
dependabot[bot] 2d5decf108
chore(deps): Bump regex-syntax from 0.7.3 to 0.7.4 (#8206)
Bumps [regex-syntax](https://github.com/rust-lang/regex) from 0.7.3 to 0.7.4.
- [Release notes](https://github.com/rust-lang/regex/releases)
- [Changelog](https://github.com/rust-lang/regex/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-lang/regex/compare/regex-syntax-0.7.3...regex-syntax-0.7.4)

---
updated-dependencies:
- dependency-name: regex-syntax
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-11 09:27:00 +00:00
dependabot[bot] f3b6c8bf15
chore(deps): Bump rustls from 0.21.3 to 0.21.5 (#8207)
Bumps [rustls](https://github.com/rustls/rustls) from 0.21.3 to 0.21.5.
- [Release notes](https://github.com/rustls/rustls/releases)
- [Commits](https://github.com/rustls/rustls/compare/v/0.21.3...v/0.21.5)

---
updated-dependencies:
- dependency-name: rustls
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Dom <dom@itsallbroken.com>
2023-07-11 09:21:38 +00:00
Martin Hilton 9111cd517f
feat(influxql): PERCENTILE function (#8187)
* feat(influxql): support TOP and BOTTOM functions

Add support for the TOP and BOTTOM functions which return the first
n rows in some ordered data set.

* fix: clippy

* refactor(influxql): use window aggregates for selectors

Change the implementation of ProjectionType::Selector to use a window
aggregate, rather than an aggregate with a custom selector function.
This is in preparation for implementing PERCENTILE.

* feat(influxql): PERCENTILE selector

Add a selector for the row containing the nth percentile of a
partition. This is the behaviour used when a single selector function
is used in an influxql query.

* feat(influxql): PERCENTILE aggregator

Add the PERCENTILE aggregation function for when the PERCENTILE
function is used in an aggregating projection. This implementation
buffers all non-null field values in memory in order to perform the
operation and therefore could be an expensive operation. This is
necessary for compatibility with earlier influxdb versions.

* refactor(influxql): move PERCENTILE implementation out of plan

The plan module is getting rather full of user-defined function
implementations. This breaks the new functions used to implement
percentile into some new top-level modules for aggregate and window
UDFs.

* fix: doc-lint

* chore: refactor `find_enumerated`

* chore: use `s` in format string

* chore: include the unexpected selector function in the error

* chore(influxql): review suggestions

Added some additional comments to aid understanding.

Changed the handling of selector functions such that FIRST, LAST,
MAX & MIN behave the same as they did before PERCENTILE was added.

* chore(influxql): make percent_row_number a window UDF

Now that user-defined window functions are available, make the
percent_row_number function one of them. This allows the values
to be calculated for the entire window partition in one go.

For some reason the user-defined window function cannot return NULL
values. This function uses 0 where it would otherwise use NULL, as
row numbering starts at 1.

---------

Co-authored-by: Stuart Carnie <stuart.carnie@gmail.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-11 05:33:16 +00:00
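The PERCENTILE aggregator described in the commit above buffers every non-null field value before it can produce a result. The following is a minimal sketch of that idea, not the code from the commit: the function name `percentile` is hypothetical, and the rounding rule (`floor(n * p / 100 + 0.5)` as a 1-based rank into the sorted values) is an assumption about the compatible behaviour.

```rust
/// Hypothetical helper: returns the value at the `p`-th percentile of the
/// non-null inputs, or `None` when no row qualifies.
fn percentile(values: &[Option<f64>], p: f64) -> Option<f64> {
    // Buffer all non-null values in memory; this is the potentially
    // expensive step the commit message warns about.
    let mut buf: Vec<f64> = values.iter().flatten().copied().collect();
    if buf.is_empty() || !(0.0..=100.0).contains(&p) {
        return None;
    }

    // Total ordering over f64 so the sort cannot panic on NaN.
    buf.sort_by(|a, b| a.total_cmp(b));

    // Assumed rounding rule: 1-based rank, rounded half up.
    let rank = (buf.len() as f64 * p / 100.0 + 0.5).floor() as usize;
    if rank == 0 || rank > buf.len() {
        return None;
    }
    Some(buf[rank - 1])
}
```

Because the whole input is held in `buf`, memory use grows linearly with the number of non-null rows in the group, which is the cost the commit message calls out.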
Joe-Blount 16939c849d chore: add tracing to compactor 2023-07-10 16:36:24 -05:00
Carol (Nichols || Goulding) f20e9e6368
fix: Add index on parquet_file.partition_hash_id for lookup perf 2023-07-10 13:40:03 -04:00
Carol (Nichols || Goulding) 22c17fb970
feat: Abstract over which partition ID type we're using to list Parquet files 2023-07-10 13:40:01 -04:00
Carol (Nichols || Goulding) c1e42651ec
feat: Abstract over which partition ID type we're using to compare and swap sort keys 2023-07-10 13:39:19 -04:00
Carol (Nichols || Goulding) eec31b7f00
feat: Abstract over which partition ID type we're using to get a partition from the catalog 2023-07-10 10:43:20 -04:00
kodiakhq[bot] 5521310005
Merge pull request #8094 from influxdata/savage/individually-sequence-partitions-within-writes
feat(ingester): Assign individual sequence numbers for writes per partition
2023-07-10 14:39:39 +00:00
Fraser Savage dec0244bff
refactor(e2e): Wait 100ms between queries in debug::build_catalog test 2023-07-10 15:27:30 +01:00
Fraser Savage 7e17b54f2a
Merge branch 'main' into savage/individually-sequence-partitions-within-writes 2023-07-10 15:19:45 +01:00
Fraser Savage 0978aa0551
fix(e2e): Add small busy-loop to debug::build_catalog test to assert only on non-empty results 2023-07-10 15:13:37 +01:00
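A retry loop of the kind this fix describes, combined with a pause between attempts, could look roughly like the sketch below. The helper name `poll_until_non_empty` and its parameters are hypothetical and are not taken from the test itself.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper: calls `query` repeatedly until it returns a
/// non-empty result, sleeping `interval` between attempts. Gives up and
/// returns `None` once `timeout` has elapsed.
fn poll_until_non_empty<T>(
    mut query: impl FnMut() -> Vec<T>,
    interval: Duration,
    timeout: Duration,
) -> Option<Vec<T>> {
    let deadline = Instant::now() + timeout;
    loop {
        let rows = query();
        if !rows.is_empty() {
            // Only non-empty results are handed back for assertions.
            return Some(rows);
        }
        if Instant::now() >= deadline {
            return None;
        }
        std::thread::sleep(interval);
    }
}
```

Bounding the loop with a deadline keeps a test from hanging forever if the expected rows never appear.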
Joe-Blount 83febf3eef
Merge pull request #8192 from influxdata/jrb_62_index_partition_tbl_on_createdat
chore: create index on created_at in partition table
2023-07-10 08:44:57 -05:00
Joe-Blount fdecf96103
Merge branch 'main' into jrb_62_index_partition_tbl_on_createdat 2023-07-10 08:38:12 -05:00
kodiakhq[bot] 5fa861abab
Merge branch 'main' into savage/individually-sequence-partitions-within-writes 2023-07-10 12:48:37 +00:00
Dom Dwyer c2273e6488
docs: remove outdated comment 2023-07-10 14:27:08 +02:00
Dom Dwyer 701da1363c
refactor: remove panic on impossible error
Remove the logical complexity of error handling for an error that cannot
occur.

This was an artifact of pre-PR refactoring - the error being returned
SHOULD never be reached, as the only error returned is the "your message
is too big" error, and that's not possible because the message size is
validated in the GossipHandle::broadcast() method before it reaches the
reactor.
2023-07-10 14:10:03 +02:00
Dom Dwyer a686580ffa
test: multiple messages in single test
This ensures various reused scratch buffers are wiped between uses.
2023-07-10 14:03:57 +02:00
Dom Dwyer 71625043e2
test: remove dbg!() 2023-07-10 14:02:57 +02:00
Dom Dwyer 060f1b2ed6
docs: unwrap correctness docs
Describe the possible reasons a socket recvfrom() would cause a panic.
2023-07-10 14:01:11 +02:00
Dom Dwyer 991692d2fb
refactor: short/long panic message 2023-07-10 13:51:40 +02:00
Dom Dwyer bee1b45c13
build: reuse path var
DRY the path var.
2023-07-10 13:48:01 +02:00
Dom Dwyer 118aefe2d2
chore: use workspace crate config
Inherit version/authors/edition from the workspace.
2023-07-10 13:39:52 +02:00
Dom Dwyer 7880f9287f
chore: add license 2023-07-10 12:11:16 +02:00
Dom Dwyer 58c4874880
chore: workspace_hack support
Add workspace_hack and whitelist the import.
2023-07-10 12:11:15 +02:00
Dom Dwyer 48466bfa89
feat(metrics): bytes/frames sent/received & peers
Emit metrics tracking the number of bytes sent / received, and number of
frames sent / received by the local node.

Track the number of discovered peers to record peer discovery rate and
current number of known peers per node.
2023-07-10 12:11:15 +02:00
Dom Dwyer 93789d7abb
feat: refuse oversized gossip payloads
Calculate the available byte size for a user payload sent via gossip,
and proactively check this limit earlier, when the caller is attempting
to send the frame, rather than later in the reactor where there's no
feedback to the caller.

DRY frame serialisation to simplify enforcement, and validate/refuse
oversized frames in the reactor so that frames are unlikely to be
truncated by receivers.
2023-07-10 12:11:14 +02:00
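The early size check described in the commit above can be illustrated with a short sketch. All names and numbers here are assumptions made for illustration: `MAX_FRAME_BYTES`, `FRAME_OVERHEAD_BYTES`, `PayloadTooLarge` and `check_payload` do not come from the gossip crate, and the real limits may differ.

```rust
/// Assumed maximum size of one serialised frame on the wire.
const MAX_FRAME_BYTES: usize = 1024;
/// Assumed bytes reserved for framing metadata (identity, headers).
const FRAME_OVERHEAD_BYTES: usize = 64;
/// Bytes left over for the caller's payload.
const MAX_USER_PAYLOAD_BYTES: usize = MAX_FRAME_BYTES - FRAME_OVERHEAD_BYTES;

/// Hypothetical error returned to the caller at send time.
#[derive(Debug, PartialEq, Eq)]
struct PayloadTooLarge {
    len: usize,
    max: usize,
}

/// Validate the payload before it is queued, so the caller gets
/// immediate feedback instead of a silent drop later on.
fn check_payload(payload: &[u8]) -> Result<(), PayloadTooLarge> {
    if payload.len() > MAX_USER_PAYLOAD_BYTES {
        return Err(PayloadTooLarge {
            len: payload.len(),
            max: MAX_USER_PAYLOAD_BYTES,
        });
    }
    Ok(())
}
```

Deriving the payload limit from the frame limit in one place means the send-side check and any later frame-size validation cannot drift apart.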
Dom Dwyer bc9ebc9c66
feat: gossip primitive
Adds a simple "gossip" implementation (more accurately described as a
pub/sub primitive currently) that supports broadcasting
application-level messages to the set of active peers.

This implementation uses UDP as a transport for best-effort delivery,
and enables zero-copy use of the payload using the Bytes crate.

Only peers explicitly provided as "seeds" when initialising will be
known to a gossip node - there's currently no peer exchange mechanism.
This implementation tolerates seeds changing their DNS entries when
restarting to point at new socket addresses (such as within Kubernetes
when pods move around).
2023-07-10 12:11:14 +02:00
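As a rough illustration of best-effort delivery to a fixed set of seeds, the sketch below uses only the standard library's `UdpSocket`. It is not the gossip crate's implementation: the `broadcast` function is hypothetical, it sends raw bytes without framing, and it uses plain slices rather than the Bytes crate. Resolving each seed on every call is one simple way to tolerate seeds whose DNS entries change.

```rust
use std::net::{ToSocketAddrs, UdpSocket};

/// Hypothetical best-effort broadcast: sends `payload` to every seed and
/// returns how many seeds a datagram was handed off for. Failures are
/// ignored, since UDP gives no delivery guarantee anyway.
fn broadcast(socket: &UdpSocket, seeds: &[String], payload: &[u8]) -> usize {
    let mut sent = 0;
    for seed in seeds {
        // Re-resolve the seed each time so a changed DNS entry is picked up.
        let addrs = match seed.as_str().to_socket_addrs() {
            Ok(addrs) => addrs,
            Err(_) => continue, // Unresolvable right now; try again next call.
        };
        for addr in addrs {
            // Use the first resolved address that accepts the datagram.
            if socket.send_to(payload, addr).is_ok() {
                sent += 1;
                break;
            }
        }
    }
    sent
}
```

A seed is written as a `host:port` string, for example `"gossip-0.example:4242"` (a made-up address); a count lower than `seeds.len()` only means some sends could not be attempted, not that the others were received.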
Dom Dwyer 48aa4a5e33
feat(gossip): frame proto definitions
Adds a proto definition and configures prost to build the rust types
from it.

The gossip framing is intended to be flexible and decoupled - the gossip
library will batch together one or more opaque application messages
and/or control frames, and uniquely identify each peer with a
per-instance UUID to detect crashes/restarts and track peers.
2023-07-10 12:11:13 +02:00
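The per-instance identifier mentioned in the commit above lets a node tell a restarted peer apart from one it already knows. The sketch below shows one way this could work under stated assumptions: `PeerTable`, `PeerEvent` and `observe` are hypothetical names, peers are keyed by address, and a `u128` stands in for the UUID.

```rust
use std::collections::HashMap;

/// What was learned from seeing a frame from a peer.
#[derive(Debug, PartialEq, Eq)]
enum PeerEvent {
    /// First frame seen from this address.
    New,
    /// Same address, same instance identifier as before.
    Known,
    /// Same address, different identifier: the peer crashed or restarted.
    Restarted,
}

/// Hypothetical peer table mapping a socket address to the last
/// instance identifier seen from it.
#[derive(Default)]
struct PeerTable {
    by_addr: HashMap<String, u128>,
}

impl PeerTable {
    /// Record the identifier carried by an incoming frame and report
    /// whether it reveals a new, known, or restarted peer.
    fn observe(&mut self, addr: &str, instance_id: u128) -> PeerEvent {
        match self.by_addr.insert(addr.to_string(), instance_id) {
            None => PeerEvent::New,
            Some(prev) if prev == instance_id => PeerEvent::Known,
            Some(_) => PeerEvent::Restarted,
        }
    }
}
```

Since the identifier is generated once per process start, a changed value at the same address is enough to infer a restart without any extra control messages.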
Dom Dwyer 5c191ce6cf
ci: enable standard lint set
Adds the somewhat "standard" lint set we use to the gossip lib.
2023-07-10 12:11:12 +02:00
Dom Dwyer 69ab70ce99
feat: init gossip package
Adds a new empty "gossip" package to the workspace.
2023-07-10 12:11:12 +02:00
Dom Dwyer 5027c9a88c
chore: sort workspace members
Sort the package names in the workspace member declaration.
2023-07-10 12:11:03 +02:00
Andrew Lamb 3ce11d8d66
chore: Update DataFusion (#8190)
* chore: Update DataFusion

* chore: Run cargo hakari tasks

* fix: Update for API changes

* fix: use display format

* chore: Update explain plan output

* fix: update plans

---------

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-10 09:54:50 +00:00
kodiakhq[bot] 1f1b17f712
Merge pull request #8188 from influxdata/dom/partition-query-concurrency
test(bench): concurrent ingester partition queries
2023-07-10 09:41:03 +00:00
kodiakhq[bot] 19b59d9de5
Merge branch 'main' into dom/partition-query-concurrency 2023-07-10 09:35:22 +00:00
dependabot[bot] 1f82f6b059
chore(deps): Bump snafu from 0.7.4 to 0.7.5 (#8193)
Bumps [snafu](https://github.com/shepmaster/snafu) from 0.7.4 to 0.7.5.
- [Changelog](https://github.com/shepmaster/snafu/blob/main/CHANGELOG.md)
- [Commits](https://github.com/shepmaster/snafu/compare/0.7.4...0.7.5)

---
updated-dependencies:
- dependency-name: snafu
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-10 09:35:10 +00:00
Dom d395a4907a
Merge branch 'main' into dom/partition-query-concurrency 2023-07-10 10:34:43 +01:00
dependabot[bot] f0789b74ce
chore(deps): Bump regex from 1.9.0 to 1.9.1 (#8195)
Bumps [regex](https://github.com/rust-lang/regex) from 1.9.0 to 1.9.1.
- [Release notes](https://github.com/rust-lang/regex/releases)
- [Changelog](https://github.com/rust-lang/regex/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-lang/regex/compare/1.9.0...1.9.1)

---
updated-dependencies:
- dependency-name: regex
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-10 09:29:12 +00:00
Dom 341dcf2124
Merge branch 'main' into dom/partition-query-concurrency 2023-07-10 10:24:09 +01:00
dependabot[bot] ab16180f15
chore(deps): Bump serde from 1.0.167 to 1.0.168 (#8194)
Bumps [serde](https://github.com/serde-rs/serde) from 1.0.167 to 1.0.168.
- [Release notes](https://github.com/serde-rs/serde/releases)
- [Commits](https://github.com/serde-rs/serde/compare/v1.0.167...v1.0.168)

---
updated-dependencies:
- dependency-name: serde
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-10 09:24:01 +00:00
Dom d87f69a76c
Merge pull request #8197 from influxdata/dependabot/cargo/async-channel-1.9.0
chore(deps): Bump async-channel from 1.8.0 to 1.9.0
2023-07-10 10:17:46 +01:00
dependabot[bot] 12317fee23
chore(deps): Bump async-channel from 1.8.0 to 1.9.0
Bumps [async-channel](https://github.com/smol-rs/async-channel) from 1.8.0 to 1.9.0.
- [Release notes](https://github.com/smol-rs/async-channel/releases)
- [Changelog](https://github.com/smol-rs/async-channel/blob/master/CHANGELOG.md)
- [Commits](https://github.com/smol-rs/async-channel/compare/v1.8.0...v1.9.0)

---
updated-dependencies:
- dependency-name: async-channel
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-07-10 01:42:26 +00:00
Joe-Blount c2442c31f3 chore: create partition table index for created_at 2023-07-07 16:27:05 -05:00
Marko Mikulicic b5faa37152
fix: Plumb tracing header name env/flag to client (#8189)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-07 21:07:29 +00:00