Commit Graph

13165 Commits (5453ad8ba4aa2f729b7ac0ceb829d5dbd343dcbc)

Author SHA1 Message Date
Fraser Savage 5453ad8ba4
feat(router): Include table/column diff for namespace schema cache update
This adds some computational overhead during the merging of new
namespace schema with what's in the router's local cache, but will allow
gossiping of changes.
2023-07-27 13:37:47 +01:00
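As an illustration of what commit 5453ad8ba4 describes, the sketch below merges an incoming schema into a local cache and returns only the tables and columns that were new, which is the kind of delta that could later be gossiped. The `Schema` alias, `SchemaDiff`, and `merge_with_diff` are hypothetical names for this example, not the router's actual types.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical schema shape: table name -> set of column names.
pub type Schema = BTreeMap<String, BTreeSet<String>>;

/// What a merge added to the cache (hypothetical type for illustration).
#[derive(Debug, Default, PartialEq, Eq)]
pub struct SchemaDiff {
    /// Tables that were not in the cache before the merge.
    pub new_tables: BTreeSet<String>,
    /// Columns that were not in the cache before the merge, per table.
    /// Columns of a brand-new table are listed here as well.
    pub new_columns: BTreeMap<String, BTreeSet<String>>,
}

/// Merge `incoming` into `cache` and report what changed.
pub fn merge_with_diff(cache: &mut Schema, incoming: &Schema) -> SchemaDiff {
    let mut diff = SchemaDiff::default();
    for (table, columns) in incoming {
        if !cache.contains_key(table) {
            diff.new_tables.insert(table.clone());
        }
        let cached_cols = cache.entry(table.clone()).or_default();
        for col in columns {
            // `insert` returns true only if the column was not already cached.
            if cached_cols.insert(col.clone()) {
                diff.new_columns
                    .entry(table.clone())
                    .or_default()
                    .insert(col.clone());
            }
        }
    }
    diff
}
```

The extra bookkeeping per table and column is the "computational overhead" the commit message mentions; merging the same schema a second time yields an empty diff.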
Marco Neumann 73339cfc57
fix: remove sqlx "used" metrics (#8336)
PR #8327 introduced a bunch of metrics for the sqlx connection pool. One
of the metrics was the "used" metric that was supposed to count
"currently in use" connections. In prod however this metric underflows to
a very large integer. It seems that the "acquire" callback is only used by sqlx for
re-used connections (i.e. for the transition from "idle" to "used").
Now we could try to work around it but since there is no "close
connection" callback, I doubt it is possible to do that accurately.

Luckily though we don't really need that counter. sqlx already offers
"active" (defined as idle + used) and "idle", so getting "used" is just
the difference. I removed the "used" metric nevertheless because
"active" and "idle" are read independently from each other (based on atomic
integers) and are NOT guaranteed to be in-sync. Calculating the
difference within IOx however would give the illusion that they are. So
I leave this to the dashboard / alert / whatever, because there it is
usually understood that metrics are samples and may be out of sync for a
very short time.

A nice side effect of this change is that it simplifies the code quite a
bit.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-27 10:04:56 +00:00
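The second argument in commit 73339cfc57 (that "active" and "idle" are read independently and may be out of sync) can be sketched in plain Rust. `PoolGauges` and its fields are hypothetical stand-ins, not the sqlx API; the example only shows why an unchecked `active - idle` can wrap to a very large integer and why any derived value is an estimate.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical stand-in for a pool's counters; not the sqlx API.
pub struct PoolGauges {
    /// Connections that exist in the pool (idle + used).
    pub active: AtomicU64,
    /// Connections currently sitting idle.
    pub idle: AtomicU64,
}

impl PoolGauges {
    /// Derive "used" from two independent reads.
    ///
    /// The two loads are not atomic as a pair, so `idle` may briefly be
    /// observed as larger than `active`. `saturating_sub` clamps that case
    /// to 0 instead of wrapping around.
    pub fn used_estimate(&self) -> u64 {
        let active = self.active.load(Ordering::Relaxed);
        let idle = self.idle.load(Ordering::Relaxed);
        active.saturating_sub(idle)
    }
}

/// What a wrapping subtraction yields when the reads are out of sync.
pub fn wrapped_used(active: u64, idle: u64) -> u64 {
    active.wrapping_sub(idle)
}
```

Even the clamped value is only a sample, which is in line with the commit's choice to leave the subtraction to the dashboard or alerting layer.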
Dom b372e5532f
Merge pull request #8333 from influxdata/dom/cached-fsm-schema
perf(ingester): reusable FSM / RecordBatch schemas
2023-07-27 10:36:22 +01:00
Dom a37b85804d
Merge branch 'main' into dom/cached-fsm-schema 2023-07-27 10:31:02 +01:00
dependabot[bot] 854c4c25e9
chore(deps): Bump sysinfo from 0.29.6 to 0.29.7 (#8341)
Bumps [sysinfo](https://github.com/GuillaumeGomez/sysinfo) from 0.29.6 to 0.29.7.
- [Changelog](https://github.com/GuillaumeGomez/sysinfo/blob/master/CHANGELOG.md)
- [Commits](https://github.com/GuillaumeGomez/sysinfo/commits)

---
updated-dependencies:
- dependency-name: sysinfo
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-27 09:22:52 +00:00
dependabot[bot] ac810aab8a
chore(deps): Bump serde from 1.0.175 to 1.0.176 (#8343)
Bumps [serde](https://github.com/serde-rs/serde) from 1.0.175 to 1.0.176.
- [Release notes](https://github.com/serde-rs/serde/releases)
- [Commits](https://github.com/serde-rs/serde/compare/v1.0.175...v1.0.176)

---
updated-dependencies:
- dependency-name: serde
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-27 09:17:39 +00:00
dependabot[bot] 700830d7d3
chore(deps): Bump serde_json from 1.0.103 to 1.0.104 (#8342)
Bumps [serde_json](https://github.com/serde-rs/json) from 1.0.103 to 1.0.104.
- [Release notes](https://github.com/serde-rs/json/releases)
- [Commits](https://github.com/serde-rs/json/compare/v1.0.103...v1.0.104)

---
updated-dependencies:
- dependency-name: serde_json
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-27 09:12:35 +00:00
Dom Dwyer ef158a664b
docs: ref-clone indicators for Schema
Cloning a Schema looks expensive, but it's not!
2023-07-27 11:12:11 +02:00
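A minimal sketch of why a clone can be cheap, assuming the schema's contents sit behind a reference-counted pointer as commit ef158a664b suggests. `SharedSchema` is a hypothetical type for illustration, not the IOx `Schema`.

```rust
use std::sync::Arc;

/// Hypothetical schema whose contents live behind an `Arc`.
#[derive(Debug, Clone)]
pub struct SharedSchema {
    fields: Arc<Vec<String>>,
}

impl SharedSchema {
    pub fn new(fields: Vec<String>) -> Self {
        SharedSchema { fields: Arc::new(fields) }
    }

    /// True if both handles point at the same underlying allocation.
    pub fn shares_storage_with(&self, other: &SharedSchema) -> bool {
        Arc::ptr_eq(&self.fields, &other.fields)
    }

    /// Number of live handles to the shared field list.
    pub fn ref_count(&self) -> usize {
        Arc::strong_count(&self.fields)
    }
}
```

Calling `clone()` on such a type increments a reference count and copies a pointer; the field list itself is not duplicated.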
Dom cb3bc1f0fa
Merge pull request #8344 from influxdata/dependabot/cargo/pprof-0.12.1
chore(deps): Bump pprof from 0.12.0 to 0.12.1
2023-07-27 10:06:36 +01:00
dependabot[bot] cdaa3bc720
chore(deps): Bump pprof from 0.12.0 to 0.12.1
Bumps [pprof](https://github.com/tikv/pprof-rs) from 0.12.0 to 0.12.1.
- [Changelog](https://github.com/tikv/pprof-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/tikv/pprof-rs/commits)

---
updated-dependencies:
- dependency-name: pprof
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-07-27 01:54:25 +00:00
wiedld 8a16703ff8
Merge pull request #8325 from influxdata/idpe-17789/compaction-job-in-compactor
feat(idpe-17789): provide CompactionJob throughout compactor
2023-07-26 10:52:44 -07:00
wiedld fe3897dd62
Merge branch 'main' into idpe-17789/compaction-job-in-compactor 2023-07-26 10:45:09 -07:00
kodiakhq[bot] 1a6aa1a8dc
Merge pull request #8337 from influxdata/savage/tolerate-empty-wal-write-during-replay
fix(ingester): Skip empty writes with no data during WAL replay
2023-07-26 10:11:33 +00:00
kodiakhq[bot] 352bb79fc1
Merge branch 'main' into savage/tolerate-empty-wal-write-during-replay 2023-07-26 10:06:25 +00:00
kodiakhq[bot] 11aba01385
Merge pull request #8330 from influxdata/savage/expose-router-health-check-configuration-options
feat(router): Expose circuit breaker healthcheck `ERROR_WINDOW` for config
2023-07-26 10:06:09 +00:00
Fraser Savage c818f90aef
docs(router): Remove code doc ref from router CLI flag text 2023-07-26 11:01:13 +01:00
Fraser Savage 3133f9c2eb
fix(ingester): Skip empty writes with no data during WAL replay
In very rare cases a panic mid-write can result in a partially completed
write to the WAL which contains no table data. This is now not replayed
(as there is nothing to replay) and does not panic when encountered,
but tracks the occurrence into the WAL replayed ops metric and logs a
warning.
2023-07-26 10:43:10 +01:00
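The behaviour described in commit 3133f9c2eb can be sketched roughly as follows. `WalWrite`, `ReplayStats`, and `replay` are hypothetical names, and the warning goes to stderr here rather than to a real logger or metric.

```rust
/// Hypothetical WAL entry: each item pairs a table name with its rows.
pub struct WalWrite {
    pub tables: Vec<(String, Vec<String>)>,
}

/// Counters a replay could expose as metrics (hypothetical).
#[derive(Debug, Default, PartialEq, Eq)]
pub struct ReplayStats {
    pub applied: usize,
    pub skipped_empty: usize,
}

/// Replay `ops` through `apply`, skipping writes that carry no table data.
pub fn replay<F: FnMut(&WalWrite)>(ops: &[WalWrite], mut apply: F) -> ReplayStats {
    let mut stats = ReplayStats::default();
    for op in ops {
        if op.tables.is_empty() {
            // Nothing to replay: record the occurrence and warn, do not panic.
            eprintln!("warning: skipping WAL write with no table data");
            stats.skipped_empty += 1;
            continue;
        }
        apply(op);
        stats.applied += 1;
    }
    stats
}
```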
Fraser Savage 61e79374e0
feat(router): Expose circuit breaker healthcheck config
Exposes the `ERROR_WINDOW` parameter that controls the router's
downstream error-gate health check behaviour as an environment
variable/command line flag. This allows tuning, per-environment, the
period over which the error rate of 80% must be exceeded to cause an
ingester to appear unhealthy.
2023-07-26 09:48:55 +01:00
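To make the role of `ERROR_WINDOW` in commit 61e79374e0 concrete, here is a simplified error-rate gate. The `ErrorGate` type, its reset-per-window policy, and the explicit `now` argument are all assumptions made for this sketch; the router's real health check may track the window differently.

```rust
use std::time::Duration;

/// Hypothetical error-rate gate; names and reset policy are assumptions.
pub struct ErrorGate {
    /// Period over which the error rate is evaluated.
    error_window: Duration,
    /// Start of the current window, as time since start-up.
    window_start: Duration,
    ok: u64,
    err: u64,
}

impl ErrorGate {
    pub fn new(error_window: Duration) -> Self {
        ErrorGate { error_window, window_start: Duration::ZERO, ok: 0, err: 0 }
    }

    /// Record one request outcome observed at `now` (time since start-up).
    pub fn observe(&mut self, now: Duration, is_ok: bool) {
        // Begin a fresh window once the configured period has elapsed.
        if now.saturating_sub(self.window_start) >= self.error_window {
            self.window_start = now;
            self.ok = 0;
            self.err = 0;
        }
        if is_ok {
            self.ok += 1;
        } else {
            self.err += 1;
        }
    }

    /// Healthy unless more than 80% of requests in the window failed.
    pub fn is_healthy(&self) -> bool {
        let total = self.ok + self.err;
        total == 0 || self.err * 100 <= total * 80
    }
}
```

In this sketch a shorter window forgets old errors sooner, while a longer one keeps them in the ratio for longer. That trade-off is presumably what per-environment tuning is for.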
Dom Dwyer 41c9c0f396
perf(ingester): reusable FSM / RecordBatch schemas
Cache the merged Schema of all the RecordBatch within a buffer at
snapshot generation time.

To be useful, this cached schema is made available to the PartitionData
for re-use, allowing the schema of "hot" data within a partition's
mutable buffer to be read without generating a RecordBatch first.
2023-07-25 17:10:06 +02:00
Dom 7df6028bf1
Merge pull request #8331 from influxdata/dom/cached-summary-statistics
perf(ingester): cache summary statistics in partition FSM
2023-07-25 15:56:21 +01:00
Dom Dwyer b79b120788
refactor: per-partition summary statistics
Provide row count & timestamp min/max statistics on a per-partition
basis.

This commit builds on the FSM summary statistics, merging all FSM
statistics across all data within the PartitionData (in various states)
and making them available to the caller.
2023-07-25 14:44:38 +02:00
Dom Dwyer b4b7822f2b
perf: cache summary statistics in partition FSM
Cache the row count & timestamp min/max values within the partition FSM
/ buffer, and make them available through the Queryable trait.

This allows the PartitionData to read the row count of a buffer (either
"hot" for writes, a "snapshot" of immutable RecordBatch, or "persisting"
for in-flight persisting data).

These values will enable early partition pruning.
2023-07-25 14:44:37 +02:00
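Commits b79b120788 and b4b7822f2b together describe cached per-buffer statistics that are merged per partition and later used for pruning. A rough sketch of that idea, with hypothetical names (`Stats`, `merge_all`, `overlaps`) and timestamps as plain `i64` values:

```rust
/// Hypothetical cached statistics for one non-empty buffer state.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Stats {
    pub rows: usize,
    pub ts_min: i64,
    pub ts_max: i64,
}

impl Stats {
    /// Combine two summaries without touching the underlying rows.
    pub fn merge(self, other: Stats) -> Stats {
        Stats {
            rows: self.rows + other.rows,
            ts_min: self.ts_min.min(other.ts_min),
            ts_max: self.ts_max.max(other.ts_max),
        }
    }

    /// True if any row could fall inside the inclusive range [start, end].
    pub fn overlaps(&self, start: i64, end: i64) -> bool {
        self.rows > 0 && self.ts_min <= end && self.ts_max >= start
    }
}

/// Merge the statistics of every buffer state in a partition.
/// Returns `None` when the partition holds no data at all.
pub fn merge_all<I: IntoIterator<Item = Stats>>(parts: I) -> Option<Stats> {
    parts.into_iter().fold(None, |acc, s| match acc {
        None => Some(s),
        Some(a) => Some(a.merge(s)),
    })
}
```

A query whose time range does not overlap the merged statistics can skip the partition without generating a RecordBatch, which is the early pruning the commit message anticipates.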
Dom 16a3ff8dfe
Merge pull request #8329 from influxdata/dom/remove-unused-projection
refactor: remove unused projection code
2023-07-25 12:10:18 +01:00
Dom Dwyer 5c3e19742a
refactor: remove unused projection code
This code was superseded in:

    https://github.com/influxdata/influxdb_iox/pull/8154

This code is now unused.
2023-07-25 12:54:20 +02:00
Dom d20cf4b094
Merge pull request #8324 from influxdata/dom/query-pruning-bench
test(bench): ingester query partition pruning
2023-07-25 11:29:06 +01:00
Dom 869d760b80
Merge branch 'main' into dom/query-pruning-bench 2023-07-25 11:07:33 +01:00
Marco Neumann b62e98cef1
feat: metrics for sqlx conn pools (#8327)
To better gauge how many connections we use and especially if we hit the
max connection limit, it would be helpful to actually have some metrics
available for the pool usage. This change adds a few basic metrics.
2023-07-25 10:07:25 +00:00
Dom 0c940222d2
Merge branch 'main' into dom/query-pruning-bench 2023-07-25 11:06:39 +01:00
Marco Neumann b883c7c554
chore: manual cargo update (#8328)
* chore: manual cargo update

Dependabot seemed to have fallen behind a bit.

```console
❯ cargo update
    Updating crates.io index
    Updating git repository `https://github.com/apache/arrow-datafusion.git`
    Updating git repository `https://github.com/mkmik/heappy`
    Updating allocator-api2 v0.2.15 -> v0.2.16
    Updating anyhow v1.0.71 -> v1.0.72
    Updating async-compression v0.4.0 -> v0.4.1
    Updating axum v0.6.18 -> v0.6.19
    Updating blake3 v1.4.0 -> v1.4.1
    Updating bstr v1.5.0 -> v1.6.0
    Updating constant_time_eq v0.2.6 -> v0.3.0
    Updating cpufeatures v0.2.8 -> v0.2.9
    Updating dashmap v5.4.0 -> v5.5.0
    Updating equivalent v1.0.0 -> v1.0.1
    Updating http-range-header v0.3.0 -> v0.3.1
    Updating hyper-rustls v0.24.0 -> v0.24.1
    Updating itoa v1.0.7 -> v1.0.9
    Updating num v0.4.0 -> v0.4.1
    Updating pest v2.7.0 -> v2.7.1
    Updating pest_derive v2.7.0 -> v2.7.1
    Updating pest_generator v2.7.0 -> v2.7.1
    Updating pest_meta v2.7.0 -> v2.7.1
    Updating proc-macro2 v1.0.63 -> v1.0.66
    Updating quote v1.0.29 -> v1.0.32
    Updating rustversion v1.0.12 -> v1.0.14
    Updating ryu v1.0.13 -> v1.0.15
    Updating semver v1.0.17 -> v1.0.18
    Updating seq-macro v0.3.3 -> v0.3.5
    Updating stringprep v0.1.2 -> v0.1.3
    Updating strum_macros v0.25.0 -> v0.25.1
    Updating symbolic-common v12.2.0 -> v12.3.0
    Updating symbolic-demangle v12.2.0 -> v12.3.0
    Updating syn v2.0.26 -> v2.0.27
    Updating toml_edit v0.19.12 -> v0.19.14
    Updating ucd-trie v0.1.5 -> v0.1.6
    Updating unicode-ident v1.0.9 -> v1.0.11
    Updating winnow v0.4.7 -> v0.5.1
```

* chore: Run cargo hakari tasks

---------

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2023-07-25 09:56:55 +00:00
wiedld bab6f239ea feat(idpe-17789): move Compactor abstractions PartitionsSource and PartitionStream to use CompactionJob 2023-07-24 18:40:04 -07:00
wiedld 817bd595ca chore(idpe-17789): make PartitionsSource be crate private for scheduler 2023-07-24 14:44:40 -07:00
wiedld 82d55fd6b6 refactor(idpe-17789): move PartitionsSource to be separate traits in scheduler vs compactor 2023-07-24 14:43:52 -07:00
wiedld 02088995b2
feat(idpe 17789): compactor to scheduler communication. `update_job_status()` and `end_job()` (#8216)
* feat(idpe-17789): scheduler job_status() (#8121)

This block of work moves into the scheduler some of the specific downstream actions affiliated with compaction outcomes. Which responsibilities stay in the compactor, versus move to the scheduler, roughly followed the heuristic of whether the action (a) had an impact on global catalog state (a.k.a. commits and partition skipping), (b) was logging affiliated with compactor health (e.g. PartitionDoneSink logging outcomes) versus system health (e.g. logging commits), or (c) reported to the scheduler on any errors encountered during compaction. This boundary is subject to change as we move forward.

Also, a noted caveat (TODO) on this commit. We have a CompactionJob which is used to track work handed off to each compactor. Currently it still uses the partition_id for tracking, but the followup PR will start moving the compactor to have more CompactionJob uuid awareness.

* fix(idpe-17789): need to remove partition from uniqueness tracking, so it becomes available again

* refactor(idpe-17789): split up the single-use end_job() from the multi-use update_job_status()

* feat(idpe-17789): Commit is now a scheduler trait, only used externally in the compactor_test_utils

* feat(idpe-17789): Propagate errors pertaining to commit, in both the scheduler and the compactor.

* feat(idpe-17789): PartitionDoneSink should have different crate-private traits for scheduler versus compactor.

* feat(idpe-17789): PartitionDoneSink should propagate errors

* test(idpe-17789): integration tests suite

* test(idpe-17789): test documenting what skip request does (as outcome)

* refactor(idpe-17789): make the validation of the upgrade commit, versus replacement commit, more explicit.

* feat(idpe-17789): switch to using parking_lot Mutex within the scheduler
2023-07-24 12:01:28 -07:00
wiedld bf1e28ba96
refactor: implementations of PartitionsSource in compactor_scheduler should use the parking lot mutex (#8309) 2023-07-24 11:29:20 -07:00
Dom Dwyer 32414acb00
test(bench): ingester query partition pruning
Adds benchmarks that exercise partition pruning during query execution
within the ingester, for varying partition counts within a table, and
varying row counts within each partition.
2023-07-24 17:26:48 +02:00
Joe-Blount acf9da2336
fix: detect empty list in compactor before assert (#8323) 2023-07-24 15:02:47 +00:00
Joe-Blount 3985c28bdb
Merge pull request #8306 from influxdata/jrb_69_smooth_rate_limiter
chore: improve rate limiter accuracy
2023-07-24 09:04:47 -05:00
Joe-Blount 968a0fc574
Merge branch 'main' into jrb_69_smooth_rate_limiter 2023-07-24 08:52:55 -05:00
Marco Neumann e822374270
chore: build annotated OCI images (#8301)
* refactor: isolate docker build to script

* chore: add labels to docker image

* chore: export image as OCI

* chore: print image digest

* fix: convert to OCI BEFORE calculating digest

* fix: use digest of uploaded image, not of the local archive

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-24 13:28:04 +00:00
kodiakhq[bot] 6aaa7edcbe
Merge pull request #8321 from influxdata/savage/reject-time-as-tag-for-custom-partition-schemes
fix: Reject `time` as a tag value for custom partition templates
2023-07-24 12:18:57 +00:00
kodiakhq[bot] 234c72e4af
Merge branch 'main' into savage/reject-time-as-tag-for-custom-partition-schemes 2023-07-24 12:13:53 +00:00
Marco Neumann edf77c73d8
fix: avoid panic when clock goes backwards (#8322)
I've seen at least one case in prod where the UTC clock goes backwards.
The `TimeProvider` and `Time` interface even warns about that. However
there was a `Sub` impl that would panic if that happens and even though
this was documented, I think we can do better and just not offer a
panicky interface at all.

So this removes the `Sub` impl. and replaces all uses with
`checked_duration_since`.
2023-07-24 12:10:41 +00:00
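The difference between a panicking `Sub` and a checked alternative, as discussed in commit edf77c73d8, can be sketched like this. `Timestamp` is a hypothetical stand-in and not the IOx `Time` type; only the method name `checked_duration_since` is taken from the commit message.

```rust
use std::time::Duration;

/// Hypothetical wall-clock timestamp in nanoseconds since the Unix epoch.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Timestamp(pub i64);

impl Timestamp {
    /// Time elapsed from `earlier` to `self`, or `None` if `earlier` is
    /// actually later (for example because the clock stepped backwards).
    pub fn checked_duration_since(self, earlier: Timestamp) -> Option<Duration> {
        let nanos = self.0.checked_sub(earlier.0)?;
        if nanos < 0 {
            None
        } else {
            Some(Duration::from_nanos(nanos as u64))
        }
    }
}

/// One possible caller policy: treat a backwards clock as zero elapsed time.
pub fn elapsed_or_zero(now: Timestamp, start: Timestamp) -> Duration {
    now.checked_duration_since(start).unwrap_or(Duration::ZERO)
}
```

Returning an `Option` forces each call site to decide what a backwards clock means, instead of inheriting a panic from an operator.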
Fraser Savage c834ec171f
test(router): Custom partition template API create using `time` tag value is rejected
This removes the double negative from the error message and adds
coverage at the router's gRPC API level for the rejection of the bad
TagValue value.
2023-07-24 13:07:04 +01:00
Fraser Savage aac4166bf0
fix: Reject `time` as a tag value for custom partition templates
Time has a special meaning and can be partitioned on by the strftime
formatter. It should not be used as a tag value part in a custom
partitioning template.
2023-07-24 12:49:13 +01:00
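A validation along the lines of commit aac4166bf0 might look like the sketch below. The `TemplatePart` enum, the `validate` function, and the error text are hypothetical and chosen for this example only.

```rust
/// Hypothetical parts of a custom partition template.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum TemplatePart {
    /// Partition on the value of the named tag.
    TagValue(String),
    /// Partition on the row timestamp, formatted with a strftime pattern.
    TimeFormat(String),
}

/// Reject templates that use `time` as a tag value part.
pub fn validate(parts: &[TemplatePart]) -> Result<(), String> {
    for part in parts {
        if let TemplatePart::TagValue(name) = part {
            if name.as_str() == "time" {
                // `time` is reserved; the strftime part covers time partitioning.
                return Err("`time` cannot be used as a tag value in a partition template".to_string());
            }
        }
    }
    Ok(())
}
```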
dependabot[bot] fca624a039
chore(deps): Bump sqlparser from 0.36.0 to 0.36.1 (#8312)
Bumps [sqlparser](https://github.com/sqlparser-rs/sqlparser-rs) from 0.36.0 to 0.36.1.
- [Changelog](https://github.com/sqlparser-rs/sqlparser-rs/blob/main/CHANGELOG.md)
- [Commits](https://github.com/sqlparser-rs/sqlparser-rs/compare/v0.36.0...v0.36.1)

---
updated-dependencies:
- dependency-name: sqlparser
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-24 10:28:23 +00:00
dependabot[bot] b7ab20be0d
chore(deps): Bump either from 1.8.1 to 1.9.0 (#8314)
Bumps [either](https://github.com/bluss/either) from 1.8.1 to 1.9.0.
- [Commits](https://github.com/bluss/either/compare/1.8.1...1.9.0)

---
updated-dependencies:
- dependency-name: either
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-24 10:23:38 +00:00
dependabot[bot] faa8d44492
chore(deps): Bump thiserror from 1.0.43 to 1.0.44 (#8315)
Bumps [thiserror](https://github.com/dtolnay/thiserror) from 1.0.43 to 1.0.44.
- [Release notes](https://github.com/dtolnay/thiserror/releases)
- [Commits](https://github.com/dtolnay/thiserror/compare/1.0.43...1.0.44)

---
updated-dependencies:
- dependency-name: thiserror
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-24 10:18:44 +00:00
dependabot[bot] c6df58b30c
chore(deps): Bump nu-ansi-term from 0.48.0 to 0.49.0 (#8316)
Bumps [nu-ansi-term](https://github.com/nushell/nu-ansi-term) from 0.48.0 to 0.49.0.
- [Release notes](https://github.com/nushell/nu-ansi-term/releases)
- [Changelog](https://github.com/nushell/nu-ansi-term/blob/main/CHANGELOG.md)
- [Commits](https://github.com/nushell/nu-ansi-term/compare/v0.48.0...v0.49.0)

---
updated-dependencies:
- dependency-name: nu-ansi-term
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-24 10:12:56 +00:00
dependabot[bot] cd31492e5b
chore(deps): Bump async-trait from 0.1.71 to 0.1.72 (#8317)
Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.71 to 0.1.72.
- [Release notes](https://github.com/dtolnay/async-trait/releases)
- [Commits](https://github.com/dtolnay/async-trait/compare/0.1.71...0.1.72)

---
updated-dependencies:
- dependency-name: async-trait
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-24 10:07:18 +00:00
dependabot[bot] 1d1cc86912
chore(deps): Bump serde from 1.0.173 to 1.0.175 (#8318)
Bumps [serde](https://github.com/serde-rs/serde) from 1.0.173 to 1.0.175.
- [Release notes](https://github.com/serde-rs/serde/releases)
- [Commits](https://github.com/serde-rs/serde/compare/v1.0.173...v1.0.175)

---
updated-dependencies:
- dependency-name: serde
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Dom <dom@itsallbroken.com>
2023-07-24 10:02:07 +00:00