13254 Commits (dac0db21960c871c298924269d198a8b01849724)

Author SHA1 Message Date
wiedld 9a7ff9ecfc chore(idpe-17789): update code comments to reflect both jobs and partitions 2023-07-27 15:39:18 -07:00
wiedld d7fee9fdb8 refactor(idpe-17789): rename local variables and private struct properties to jobs (versus partitions). 2023-07-27 15:38:21 -07:00
wiedld 7ac6c6d80f
Merge pull request #8326 from influxdata/idpe-17789/compaction-job-renaming
refactor(idpe-17789): renaming abstractions related to partitions source, to compaction jobs source
2023-07-27 15:30:45 -07:00
wiedld 78ef536954
Merge branch 'main' into idpe-17789/compaction-job-renaming 2023-07-27 15:05:49 -07:00
Nga Tran e1626c3ba4
Merge pull request #8307 from influxdata/ntran/table_cli
feat: create table CLI
2023-07-27 11:03:07 -04:00
NGA-TRAN 091d387e2a chore: merge main to branch ntran/table_cli 2023-07-27 10:28:45 -04:00
Joe-Blount 4af45e0cee
Merge pull request #8340 from influxdata/jrb_73_stuck
fix: no percent split during ManySmallFiles compaction
2023-07-27 09:03:56 -05:00
Joe-Blount f5a41592da Merge remote-tracking branch 'origin/main' into jrb_73_stuck
# Conflicts:
#	compactor/tests/layouts/stuck.rs
2023-07-27 08:54:50 -05:00
Joe-Blount 525f8ec0cb
fix: compactor loop splitting then undoing it (#8338) 2023-07-27 13:17:30 +00:00
Fraser Savage e00a5cab13
perf(router): Pre-compute `ChangeStats` new column total during schema merge
During the schema merge the new tables are iterated over already (to find
which tables and columns are new), so the number needed for the metrics
can be pre-computed to spare two extra loops over the new tables and new
columns returned in `ChangeStats`.
2023-07-27 14:01:50 +01:00
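The idea in the commit above (counting new tables and columns during the merge pass itself rather than in extra loops afterwards) can be sketched roughly as follows. This is a minimal illustration, not the router's code: `Schema`, `ChangeSummary`, and `merge_counting` are hypothetical names, and the real `ChangeStats` type is not reproduced here.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical schema shape: table name -> set of column names.
type Schema = BTreeMap<String, BTreeSet<String>>;

/// Hypothetical summary of what a merge added (not the real `ChangeStats`).
#[derive(Debug, Default, PartialEq)]
struct ChangeSummary {
    new_tables: usize,
    new_columns: usize,
}

/// Merge `incoming` into `cache`, counting additions in the same pass
/// so that no second loop over the new tables/columns is needed.
fn merge_counting(cache: &mut Schema, incoming: &Schema) -> ChangeSummary {
    let mut summary = ChangeSummary::default();
    for (table, columns) in incoming {
        if !cache.contains_key(table) {
            summary.new_tables += 1;
        }
        let existing = cache.entry(table.clone()).or_default();
        for column in columns {
            // `insert` returns true only when the column was not present.
            if existing.insert(column.clone()) {
                summary.new_columns += 1;
            }
        }
    }
    summary
}
```

The counts fall out of work the merge already has to do, which is the saving the commit message describes.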
Fraser Savage 5453ad8ba4
feat(router): Include table/column diff for namespace schema cache update
This adds some computational overhead during the merging of new
namespace schema with what's in the router's local cache, but will allow
gossiping of changes.
2023-07-27 13:37:47 +01:00
Marco Neumann 73339cfc57
fix: remove sqlx "used" metrics (#8336)
PR #8327 introduced a bunch of metrics for the sqlx connection pool. One
of the metrics was the "used" metric, which was supposed to count
"currently in use" connections. In prod, however, this metric underflows to
a very large integer. It seems that the "acquire" callback is only used by sqlx for
re-used connections (i.e. for the transition from "idle" to "used").
Now we could try to work around it, but since there is no "close
connection" callback, I doubt it is possible to do that accurately.

Luckily though we don't really need that counter. sqlx already offers
"active" (defined as idle + used) and "idle", so getting "used" is just
the difference. I removed the "used" metric nevertheless because
"active" and "idle" are read independently from each other (based on atomic
integers) and are NOT guaranteed to be in-sync. Calculating the
difference within IOx however would give the illusion that they are. So
I leave this to the dashboard / alert / whatever, because there it is
usually understood that metrics are samples and may be out of sync for a
very short time.

A nice side effect of this change is that it simplifies the code quite a
bit.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-27 10:04:56 +00:00
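The reasoning in the commit above can be illustrated with a small sketch. It assumes the counter wrapped because of an unmatched decrement, which is one plausible reading of "underflows to a very large integer" rather than a confirmed cause. `PoolCounters`, `derive_used`, and `unmatched_decrement` are hypothetical names and do not correspond to sqlx or IOx APIs.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical stand-in for a pool's two independently updated counters.
struct PoolCounters {
    active: AtomicU64, // defined as idle + used
    idle: AtomicU64,
}

impl PoolCounters {
    /// Two separate atomic loads: the pair is NOT a consistent snapshot,
    /// because the pool may change between the first and second load.
    fn sample(&self) -> (u64, u64) {
        let active = self.active.load(Ordering::Relaxed);
        let idle = self.idle.load(Ordering::Relaxed);
        (active, idle)
    }
}

/// Derive "used" from one sample, clamping at zero in case the two
/// values were read out of sync and `idle` briefly exceeds `active`.
fn derive_used(active: u64, idle: u64) -> u64 {
    active.saturating_sub(idle)
}

/// What an unmatched decrement does to an unsigned counter:
/// it wraps around to a very large value.
fn unmatched_decrement(counter: u64) -> u64 {
    counter.wrapping_sub(1)
}
```

A dashboard that subtracts the two exported series performs the same arithmetic as `derive_used`, but there the reader already expects sampled values to be slightly out of step.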
Dom b372e5532f
Merge pull request #8333 from influxdata/dom/cached-fsm-schema
perf(ingester): reusable FSM / RecordBatch schemas
2023-07-27 10:36:22 +01:00
Dom a37b85804d
Merge branch 'main' into dom/cached-fsm-schema 2023-07-27 10:31:02 +01:00
dependabot[bot] 854c4c25e9
chore(deps): Bump sysinfo from 0.29.6 to 0.29.7 (#8341)
Bumps [sysinfo](https://github.com/GuillaumeGomez/sysinfo) from 0.29.6 to 0.29.7.
- [Changelog](https://github.com/GuillaumeGomez/sysinfo/blob/master/CHANGELOG.md)
- [Commits](https://github.com/GuillaumeGomez/sysinfo/commits)

---
updated-dependencies:
- dependency-name: sysinfo
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-27 09:22:52 +00:00
dependabot[bot] ac810aab8a
chore(deps): Bump serde from 1.0.175 to 1.0.176 (#8343)
Bumps [serde](https://github.com/serde-rs/serde) from 1.0.175 to 1.0.176.
- [Release notes](https://github.com/serde-rs/serde/releases)
- [Commits](https://github.com/serde-rs/serde/compare/v1.0.175...v1.0.176)

---
updated-dependencies:
- dependency-name: serde
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-27 09:17:39 +00:00
dependabot[bot] 700830d7d3
chore(deps): Bump serde_json from 1.0.103 to 1.0.104 (#8342)
Bumps [serde_json](https://github.com/serde-rs/json) from 1.0.103 to 1.0.104.
- [Release notes](https://github.com/serde-rs/json/releases)
- [Commits](https://github.com/serde-rs/json/compare/v1.0.103...v1.0.104)

---
updated-dependencies:
- dependency-name: serde_json
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-27 09:12:35 +00:00
Dom Dwyer ef158a664b
docs: ref-clone indicators for Schema
Cloning a Schema looks expensive, but it's not!
2023-07-27 11:12:11 +02:00
Dom cb3bc1f0fa
Merge pull request #8344 from influxdata/dependabot/cargo/pprof-0.12.1
chore(deps): Bump pprof from 0.12.0 to 0.12.1
2023-07-27 10:06:36 +01:00
dependabot[bot] cdaa3bc720
chore(deps): Bump pprof from 0.12.0 to 0.12.1
Bumps [pprof](https://github.com/tikv/pprof-rs) from 0.12.0 to 0.12.1.
- [Changelog](https://github.com/tikv/pprof-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/tikv/pprof-rs/commits)

---
updated-dependencies:
- dependency-name: pprof
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-07-27 01:54:25 +00:00
Joe-Blount 6246275c4a chore: insta churn updates 2023-07-26 15:59:49 -05:00
Joe-Blount f1e088aa0e fix: no percent split during ManySmallFiles compaction 2023-07-26 15:59:20 -05:00
NGA-TRAN afd5b12324 chore: address review comments and fix tests due to previous commit 2023-07-26 16:40:54 -04:00
wiedld 8a16703ff8
Merge pull request #8325 from influxdata/idpe-17789/compaction-job-in-compactor
feat(idpe-17789): provide CompactionJob throughout compactor
2023-07-26 10:52:44 -07:00
wiedld fe3897dd62
Merge branch 'main' into idpe-17789/compaction-job-in-compactor 2023-07-26 10:45:09 -07:00
Carol (Nichols || Goulding) 14815435b8
fix: Use proto::PartitionTemplate's serde_json support directly
Rather than defining new types and implementing serialization on them
2023-07-26 13:04:55 -04:00
NGA-TRAN 1ddc64d68d test: modify and add tests to check valid and invalid strftime 2023-07-26 11:44:13 -04:00
kodiakhq[bot] 1a6aa1a8dc
Merge pull request #8337 from influxdata/savage/tolerate-empty-wal-write-during-replay
fix(ingester): Skip empty writes with no data during WAL replay
2023-07-26 10:11:33 +00:00
kodiakhq[bot] 352bb79fc1
Merge branch 'main' into savage/tolerate-empty-wal-write-during-replay 2023-07-26 10:06:25 +00:00
kodiakhq[bot] 11aba01385
Merge pull request #8330 from influxdata/savage/expose-router-health-check-configuration-options
feat(router): Expose circuit breaker healthcheck `ERROR_WINDOW` for config
2023-07-26 10:06:09 +00:00
Fraser Savage c818f90aef
docs(router): Remove code doc ref from router CLI flag text 2023-07-26 11:01:13 +01:00
Fraser Savage 3133f9c2eb
fix(ingester): Skip empty writes with no data during WAL replay
In very rare cases a panic mid-write can result in a partially completed
write to the WAL which contains no table data. This is now not replayed
(as there is nothing to replay) and does not panic when encountered,
but tracks the occurrence in the WAL replayed ops metric and logs a
warning.
2023-07-26 10:43:10 +01:00
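A rough sketch of the replay behaviour described in the commit above: empty writes are skipped, counted, and logged instead of causing a panic. The types `WalWrite` and `ReplayStats` and the function `replay` are hypothetical, and the warning goes to stderr here rather than through the ingester's logging.

```rust
/// Hypothetical WAL entry: one opaque payload per table in the write.
struct WalWrite {
    table_batches: Vec<Vec<u8>>,
}

/// Hypothetical replay counters, standing in for the replayed-ops metric.
#[derive(Debug, Default, PartialEq)]
struct ReplayStats {
    applied: usize,
    skipped_empty: usize,
}

/// Replay writes in order; a write with no table data is counted and
/// skipped instead of being treated as a fatal error.
fn replay(writes: &[WalWrite]) -> ReplayStats {
    let mut stats = ReplayStats::default();
    for (index, write) in writes.iter().enumerate() {
        if write.table_batches.is_empty() {
            eprintln!("warning: skipping empty WAL write at index {index}");
            stats.skipped_empty += 1;
            continue;
        }
        // A real implementation would apply the batches to the buffer here.
        stats.applied += 1;
    }
    stats
}
```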
Fraser Savage 61e79374e0
feat(router): Expose circuit breaker healthcheck config
Exposes the `ERROR_WINDOW` parameter that controls the router's
downstream error-gate health check behaviour as an environment
variable/command line flag. This allows tuning, per-environment, the
period over which the error rate of 80% must be exceeded to cause an
ingester to appear unhealthy.
2023-07-26 09:48:55 +01:00
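To make the role of the window concrete, here is a minimal sketch of an error-rate gate evaluated over a configurable period. It assumes a simple fixed window that resets once the period elapses; the router's actual circuit breaker may work differently. `ErrorWindow` and its methods are hypothetical names; only the 80% threshold and the notion of a tunable window come from the commit message.

```rust
use std::time::{Duration, Instant};

/// Hypothetical fixed-window error-rate tracker.
struct ErrorWindow {
    window: Duration, // the tunable period (cf. `ERROR_WINDOW`)
    started: Instant,
    ok: u64,
    err: u64,
}

impl ErrorWindow {
    fn new(window: Duration, now: Instant) -> Self {
        Self { window, started: now, ok: 0, err: 0 }
    }

    /// Record one request outcome, starting a fresh window if the
    /// current one has elapsed.
    fn observe(&mut self, now: Instant, is_err: bool) {
        if now.duration_since(self.started) >= self.window {
            self.started = now;
            self.ok = 0;
            self.err = 0;
        }
        if is_err {
            self.err += 1;
        } else {
            self.ok += 1;
        }
    }

    /// Unhealthy when strictly more than 80% of requests in the
    /// current window failed.
    fn is_unhealthy(&self) -> bool {
        let total = self.ok + self.err;
        total > 0 && self.err * 100 > total * 80
    }
}
```

In this sketch a shorter window reacts faster to bursts of errors, while a longer one smooths over brief blips, which is the kind of per-environment trade-off the flag is meant to expose.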
NGA-TRAN 62c9424cca chore: Merge branch 'main' into ntran/table_cli 2023-07-25 17:42:01 -04:00
NGA-TRAN 57eed252c7 chore: fix a comment 2023-07-25 17:11:48 -04:00
NGA-TRAN e6cf9c9d61 fix: rename namespaces of different tests to avoid test failures 2023-07-25 17:07:28 -04:00
NGA-TRAN 44e1c1abdb feat: implement partition template as json and more tests as well as verify the partition key after inserting data 2023-07-25 16:51:57 -04:00
Dom Dwyer 41c9c0f396
perf(ingester): reusable FSM / RecordBatch schemas
Cache the merged Schema of all the RecordBatch within a buffer at
snapshot generation time.

To be useful, this cached schema is made available to the PartitionData
for re-use, allowing the schema of "hot" data within a partition's
mutable buffer to be read without generating a RecordBatch first.
2023-07-25 17:10:06 +02:00
Dom 7df6028bf1
Merge pull request #8331 from influxdata/dom/cached-summary-statistics
perf(ingester): cache summary statistics in partition FSM
2023-07-25 15:56:21 +01:00
Dom Dwyer fc866ebe92
feat(gossip): peer exchange
This commit implements peer exchange (abbreviated PEX) between peers
of the gossip cluster.

This allows using a set of fixed seeds and dynamic node membership -
nodes can come and go without having to be manually configured across
all peers in order to communicate.

"Dead" peers are periodically cleaned from the local list of active
peers, ensuring the list of peers doesn't grow forever as node churn
occurs. This is a best-effort, conservative process, biasing towards
reliability/deliverability rather than accuracy and fast removal - it's
not a health check!
2023-07-25 15:14:19 +02:00
Dom Dwyer b79b120788
refactor: per-partition summary statistics
Provide row count & timestamp min/max statistics on a per-partition
basis.

This commit builds on the FSM summary statistics, merging all FSM
statistics across all data within the PartitionData (in various states)
and making them available to the caller.
2023-07-25 14:44:38 +02:00
Dom Dwyer b4b7822f2b
perf: cache summary statistics in partition FSM
Cache the row count & timestamp min/max values within the partition FSM
/ buffer, and make them available through the Queryable trait.

This allows the PartitionData to read the row count of a buffer (either
"hot" for writes, a "snapshot" of immutable RecordBatch, or "persisting"
for in-flight persisting data).

These values will enable early partition pruning.
2023-07-25 14:44:37 +02:00
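The caching approach in the commit above can be sketched as a buffer that maintains its row count and timestamp min/max on every write, so that readers never have to scan the data. This is an illustrative assumption about the shape of the technique; `Buffer`, `TimestampRange`, and `may_overlap` are hypothetical names and are not the ingester's FSM or `Queryable` trait.

```rust
/// Hypothetical inclusive timestamp bounds for buffered rows.
#[derive(Debug, Clone, Copy, PartialEq)]
struct TimestampRange {
    min: i64,
    max: i64,
}

/// Hypothetical buffer that keeps summary statistics up to date on write.
#[derive(Default)]
struct Buffer {
    timestamps: Vec<i64>,
    row_count: usize,
    range: Option<TimestampRange>,
}

impl Buffer {
    /// Append rows and update the cached statistics incrementally.
    fn write(&mut self, ts: &[i64]) {
        for &t in ts {
            self.range = Some(match self.range {
                None => TimestampRange { min: t, max: t },
                Some(r) => TimestampRange { min: r.min.min(t), max: r.max.max(t) },
            });
        }
        self.row_count += ts.len();
        self.timestamps.extend_from_slice(ts);
    }

    /// Cheap pruning check: can any buffered row fall in [start, end]?
    /// Uses only the cached range, never the row data.
    fn may_overlap(&self, start: i64, end: i64) -> bool {
        match self.range {
            None => false,
            Some(r) => r.min <= end && start <= r.max,
        }
    }
}
```

A query whose time bounds fail `may_overlap` can skip the buffer entirely, which is the sort of early partition pruning the commit message anticipates.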
Dom 16a3ff8dfe
Merge pull request #8329 from influxdata/dom/remove-unused-projection
refactor: remove unused projection code
2023-07-25 12:10:18 +01:00
Dom Dwyer 5c3e19742a
refactor: remove unused projection code
This code was superseded in:

    https://github.com/influxdata/influxdb_iox/pull/8154

This code is now unused.
2023-07-25 12:54:20 +02:00
Dom d20cf4b094
Merge pull request #8324 from influxdata/dom/query-pruning-bench
test(bench): ingester query partition pruning
2023-07-25 11:29:06 +01:00
Dom 869d760b80
Merge branch 'main' into dom/query-pruning-bench 2023-07-25 11:07:33 +01:00
Marco Neumann b62e98cef1
feat: metrics for sqlx conn pools (#8327)
To better gauge how many connections we use and especially if we hit the
max connection limit, it would be helpful to actually have some metrics
available for the pool usage. This change adds a few basic metrics.
2023-07-25 10:07:25 +00:00
Dom 0c940222d2
Merge branch 'main' into dom/query-pruning-bench 2023-07-25 11:06:39 +01:00
Marco Neumann b883c7c554
chore: manual cargo update (#8328)
* chore: manual cargo update

Dependabot seemed to have fallen behind a bit.

```console
❯ cargo update
    Updating crates.io index
    Updating git repository `https://github.com/apache/arrow-datafusion.git`
    Updating git repository `https://github.com/mkmik/heappy`
    Updating allocator-api2 v0.2.15 -> v0.2.16
    Updating anyhow v1.0.71 -> v1.0.72
    Updating async-compression v0.4.0 -> v0.4.1
    Updating axum v0.6.18 -> v0.6.19
    Updating blake3 v1.4.0 -> v1.4.1
    Updating bstr v1.5.0 -> v1.6.0
    Updating constant_time_eq v0.2.6 -> v0.3.0
    Updating cpufeatures v0.2.8 -> v0.2.9
    Updating dashmap v5.4.0 -> v5.5.0
    Updating equivalent v1.0.0 -> v1.0.1
    Updating http-range-header v0.3.0 -> v0.3.1
    Updating hyper-rustls v0.24.0 -> v0.24.1
    Updating itoa v1.0.7 -> v1.0.9
    Updating num v0.4.0 -> v0.4.1
    Updating pest v2.7.0 -> v2.7.1
    Updating pest_derive v2.7.0 -> v2.7.1
    Updating pest_generator v2.7.0 -> v2.7.1
    Updating pest_meta v2.7.0 -> v2.7.1
    Updating proc-macro2 v1.0.63 -> v1.0.66
    Updating quote v1.0.29 -> v1.0.32
    Updating rustversion v1.0.12 -> v1.0.14
    Updating ryu v1.0.13 -> v1.0.15
    Updating semver v1.0.17 -> v1.0.18
    Updating seq-macro v0.3.3 -> v0.3.5
    Updating stringprep v0.1.2 -> v0.1.3
    Updating strum_macros v0.25.0 -> v0.25.1
    Updating symbolic-common v12.2.0 -> v12.3.0
    Updating symbolic-demangle v12.2.0 -> v12.3.0
    Updating syn v2.0.26 -> v2.0.27
    Updating toml_edit v0.19.12 -> v0.19.14
    Updating ucd-trie v0.1.5 -> v0.1.6
    Updating unicode-ident v1.0.9 -> v1.0.11
    Updating winnow v0.4.7 -> v0.5.1
```

* chore: Run cargo hakari tasks

---------

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2023-07-25 09:56:55 +00:00
wiedld 4dd73a036b refactor(idpe-17789): rename remaining references (in methods and report output) to be compaction_job_stream 2023-07-24 18:40:15 -07:00