13247 Commits (c8242c74696bd849e8b296f7b255d909babd7bd5)

Author SHA1 Message Date
Chunchun Ye c8242c7469
chore(cli): add `--partition-template` to `namespace create` (#8365)
* chore(cli): add `--partition-template` to namespace create

* chore: fix typo in doc for `PartitionTemplateConfig`

chore: add max limit 8 for partition template in doc

* chore: add e2e tests

* chore: fmt

* chore: add more e2e tests for namespace create with partition template

* chore: show doc comments in cli help interface

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-08-01 14:37:00 +00:00
Nga Tran 73f38077b6
feat: add sort_key_ids as array of bigints into catalog partition (#8375)
* feat: add sort_key_ids as array of bigints into catalog partition

* chore: add comments

* chore: remove comments to avoid changing them in the future due to checksum requirement

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-08-01 14:28:30 +00:00
kodiakhq[bot] 4994157910
Merge pull request #8380 from influxdata/savage/configure-router-health-probe-numbers
feat(router): Expose `num_probes` request count used to health-check ingesters as config option
2023-08-01 13:57:41 +00:00
Fraser Savage df2c1850fb
refactor(router): Try to fix rustfmt having a nap
2023-08-01 14:51:20 +01:00
Fraser Savage a05fecd8dd
docs(router): Clearer documentation of probe request behaviour
Co-authored-by: Dom <dom@itsallbroken.com>
2023-08-01 14:48:18 +01:00
Fraser Savage e643014900
docs(router): Fix typo in circuit breaker document comment
2023-08-01 14:46:17 +01:00
Fraser Savage e4a5d2efaa
feat(router): Expose `num_probes` request count used to health-check ingesters as config option
This allows routers to be configured with the number of probe requests
that can/must be collected before the health checker's circuit state
transitions to healthy/unhealthy, and so before a downstream is marked
healthy/unhealthy.
2023-08-01 14:21:56 +01:00
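
The probe-count behaviour described in the commit above can be sketched as follows. This is a minimal illustration only: `Health`, `ProbeCircuit`, `observe`, and the transition rule (a run of consecutive successes, reset by any failure) are assumptions made for the example and are not taken from the router's code.

```rust
/// Hypothetical health state for one downstream ingester.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Health {
    Healthy,
    Unhealthy,
}

/// Hypothetical probe-counting circuit; not the router's actual type.
#[derive(Debug)]
pub struct ProbeCircuit {
    /// Number of consecutive successful probes required to become healthy.
    num_probes: u64,
    /// Successful probes observed since the last failure.
    successes: u64,
    state: Health,
}

impl ProbeCircuit {
    /// Create a circuit that starts unhealthy and needs `num_probes`
    /// consecutive successes before it is considered healthy.
    pub fn new(num_probes: u64) -> Self {
        Self {
            num_probes,
            successes: 0,
            state: Health::Unhealthy,
        }
    }

    /// Record the outcome of one probe request and return the new state.
    pub fn observe(&mut self, ok: bool) -> Health {
        if ok {
            self.successes += 1;
            if self.successes >= self.num_probes {
                self.state = Health::Healthy;
            }
        } else {
            // Assumption: a single failure resets the count and marks the
            // downstream unhealthy again.
            self.successes = 0;
            self.state = Health::Unhealthy;
        }
        self.state
    }
}
```

With `ProbeCircuit::new(3)`, the first two successful probes leave the downstream unhealthy and the third marks it healthy; raising or lowering the configured count trades recovery speed against confidence.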
dependabot[bot] 72feefc3cc
chore(deps): Bump serde from 1.0.179 to 1.0.180 (#8376)
Bumps [serde](https://github.com/serde-rs/serde) from 1.0.179 to 1.0.180.
- [Release notes](https://github.com/serde-rs/serde/releases)
- [Commits](https://github.com/serde-rs/serde/compare/v1.0.179...v1.0.180)

---
updated-dependencies:
- dependency-name: serde
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-08-01 08:24:30 +00:00
Marco Neumann 743a59aa64
feat: use single per-migration txn when possible (#8373)
* test: improve `test_step_sql_statement_no_transaction`

* feat: also print number of steps in "applying migration step"

* feat: use single per-migration txn when possible

If all steps can (and want to) run in a transaction block, then wrap the
migration bookkeeping and the migration script into a single
transaction. This way we avoid the dirty state altogether because it's
now an "all or nothing" migration.

Note that we still guarantee that there is only a single migration
running at the same time due to the locking mechanism. Otherwise we
would potentially run into nasty transaction failures during schema
modifications.

This is related to #7897 but only fixes / self-heals the "dirty" state
for migrations that can run in transactions. For concurrent index
migrations (which we need in prod) we need to be a bit smarter and this
will be done in a follow-up. However I feel that not leaving half-done
migrations for the cases where it's technically possible (e.g. adding
columns) is already a huge step forward.

* test: make `test_migrator_uses_single_transaction_when_possible` harder

* test: explain test

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-08-01 08:18:39 +00:00
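
The decision described in the commit above (one transaction only when every step can run in a transaction block) can be sketched like this. The `Step`, `Strategy`, and `choose_strategy` names are hypothetical and chosen for illustration; the real migrator's types and bookkeeping are not shown in this log.

```rust
/// Hypothetical description of one migration step.
#[derive(Debug, Clone)]
pub struct Step {
    pub sql: String,
    /// Whether this step can (and wants to) run inside a transaction block.
    pub in_transaction: bool,
}

/// How the (hypothetical) migrator would execute a migration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Strategy {
    /// Bookkeeping and all steps share one transaction: all or nothing.
    SingleTransaction,
    /// Steps run individually; a failure can leave a "dirty" migration.
    StepByStep,
}

/// Pick the single-transaction strategy only if every step allows it.
pub fn choose_strategy(steps: &[Step]) -> Strategy {
    if steps.iter().all(|s| s.in_transaction) {
        Strategy::SingleTransaction
    } else {
        Strategy::StepByStep
    }
}
```

A migration that only adds columns would get `Strategy::SingleTransaction`, while one containing a concurrent index build (which cannot run inside a transaction block in Postgres) would fall back to `Strategy::StepByStep`.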
Martin Hilton 25c3ce805d
refactor(influxql): make MOVING_AVERAGE a user-defined window function (#8377)
Update the implementation of the MOVING_AVERAGE function to be a
user-defined window function allowing the values to be calculated
for the entire window in one go.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-08-01 06:30:06 +00:00
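
As a rough illustration of calculating the values "for the entire window in one go", the sketch below computes a moving average over a whole slice in a single pass. It is plain Rust and does not use the DataFusion user-defined window function API; the `moving_average` function and its signature are assumptions made for this example.

```rust
/// Moving average over `values` using a window of `n` points, computed in a
/// single pass with a running sum. Returns one value per complete window,
/// so the output has `values.len() - n + 1` entries (or none if there are
/// fewer than `n` inputs or `n` is zero).
pub fn moving_average(values: &[f64], n: usize) -> Vec<f64> {
    if n == 0 || values.len() < n {
        return Vec::new();
    }
    let mut out = Vec::with_capacity(values.len() - n + 1);
    // Sum of the first complete window.
    let mut sum: f64 = values[..n].iter().sum();
    out.push(sum / n as f64);
    for i in n..values.len() {
        // Slide the window: add the newest point, drop the oldest one.
        sum += values[i] - values[i - n];
        out.push(sum / n as f64);
    }
    out
}
```

For example, `moving_average(&[1.0, 2.0, 3.0, 4.0], 2)` yields the averages of (1, 2), (2, 3) and (3, 4).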
Andrew Lamb de79619e71
chore: Update datafusion (#8355)
* chore: Update datafusion pin

* fix: Update for change in API

* chore: Update plan

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-31 15:41:00 +00:00
wiedld b58c26368f
Merge pull request #8367 from influxdata/idpe-17789/provide-job-on-commit
feat(idpe-17789): provide job from compactor --> scheduler, on commit
2023-07-31 08:35:49 -07:00
wiedld cc70a2c38b
Merge branch 'main' into idpe-17789/provide-job-on-commit
2023-07-31 08:20:45 -07:00
Dom 878f217631
Merge pull request #8372 from influxdata/dom/enable-gossip
feat: optional gossip clustering for router/ingester
2023-07-31 15:49:32 +01:00
Dom e98188b181
Merge branch 'main' into dom/enable-gossip
2023-07-31 15:27:44 +01:00
Dom 336a50017e
Merge pull request #8362 from influxdata/dom/persist-list-stat-cache
perf(ingester): persisting list & cached statistics
2023-07-31 14:34:57 +01:00
Dom Dwyer 5d4ce7eacc
docs: fix up comments
Fixes some outdated comments.
2023-07-31 15:19:31 +02:00
Dom Dwyer e3550f78a3
perf(ingester): persisting list & cached statistics
This commit breaks the ordered list of persisting buffers into its own
type (PersistingList) for clarity, and implements a cache within it of
the merged set of schemas across all persisting buffer FSMs, and
row/timestamp summaries.

This cleans up the code, and prevents N persisting schemas from being
merged at query time (for every query!), instead schemas and statistics
are incrementally maintained, pushing the computation to persist time
rather than query time.
2023-07-31 15:19:30 +02:00
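
The idea of moving the merge from query time to persist time can be sketched as below. This is an illustrative model only: the `Buffer` type, the column-name set standing in for a schema, and the method names are assumptions, and the real `PersistingList` may track different data and update its cache differently.

```rust
use std::collections::{BTreeSet, VecDeque};

/// Hypothetical stand-in for one persisting buffer.
#[derive(Debug, Clone)]
pub struct Buffer {
    pub columns: BTreeSet<String>,
    pub rows: usize,
}

impl Buffer {
    pub fn new(columns: &[&str], rows: usize) -> Self {
        Self {
            columns: columns.iter().map(|c| c.to_string()).collect(),
            rows,
        }
    }
}

/// Hypothetical ordered list of persisting buffers with cached summaries.
#[derive(Debug, Default)]
pub struct PersistingList {
    buffers: VecDeque<Buffer>,
    /// Union of all column names across `buffers`.
    cached_columns: BTreeSet<String>,
    /// Sum of row counts across `buffers`.
    cached_rows: usize,
}

impl PersistingList {
    /// Add a buffer that has started persisting, updating the cache.
    pub fn push(&mut self, b: Buffer) {
        self.cached_columns.extend(b.columns.iter().cloned());
        self.cached_rows += b.rows;
        self.buffers.push_back(b);
    }

    /// Remove the oldest buffer once it is persisted, rebuilding the
    /// column cache from the buffers that remain.
    pub fn pop_oldest(&mut self) -> Option<Buffer> {
        let done = self.buffers.pop_front()?;
        self.cached_rows -= done.rows;
        self.cached_columns = self
            .buffers
            .iter()
            .flat_map(|b| b.columns.iter().cloned())
            .collect();
        Some(done)
    }

    /// Query-time accessor: returns the cached union, no merging here.
    pub fn columns(&self) -> &BTreeSet<String> {
        &self.cached_columns
    }

    /// Query-time accessor: returns the cached row total.
    pub fn rows(&self) -> usize {
        self.cached_rows
    }
}
```

Queries call only `columns()` and `rows()`, which read cached values; all merging work happens in `push` and `pop_oldest`, which run when a buffer starts or finishes persisting.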
Joe-Blount 44e266d000
fix: compaction looping fixes (#8363)
* fix: selectively merge L1 to L2 when L0s still exist

* fix: avoid grouping files that undo previous splits

* chore: add test case for new fixes

* chore: insta test churn

* chore: lint cleanup
2023-07-31 13:15:49 +00:00
Marco Neumann aa7a38be55
fix: re-design LRU cache to be deadlock-free (#8345)
* fix: re-design LRU cache to be deadlock-free

Fixes #8334.

* test: explain test

* test: add regression test

* docs: extend "overdelete" section
2023-07-31 13:04:34 +00:00
kodiakhq[bot] 8d0caae186
Merge pull request #8374 from influxdata/savage/notify-watchers-of-disk-usage-changes
refactor(tracker): Return disk usage watcher from `DiskUsageMetrics`
2023-07-31 12:49:00 +00:00
kodiakhq[bot] 8197dd10a7
Merge branch 'main' into savage/notify-watchers-of-disk-usage-changes
2023-07-31 12:44:05 +00:00
Carol (Nichols || Goulding) 4a9e76b8b7
feat: Make parquet_file.partition_id optional in the catalog (#8339)
* feat: Make parquet_file.partition_id optional in the catalog

This will acquire a short lock on the table in postgres, per:
<https://stackoverflow.com/questions/52760971/will-making-column-nullable-lock-the-table-for-reads>

This allows us to persist data for new partitions and associate the
Parquet file catalog records with the partition records using only the
partition hash ID, rather than both that are used now.

* fix: Support transition partition ID in the catalog service

* fix: Use transition partition ID in import/export

This commit also removes support for the `--partition-id` flag of the
`influxdb_iox remote store get-table` command, which Andrew approved.

The `--partition-id` filter was getting the results of the catalog gRPC
service's query for Parquet files of a table and then keeping only the
files whose partition IDs matched. The gRPC query is no longer returning
the partition ID from the Parquet file table, and really, this command
should instead be using `GetParquetFilesByPartitionId` to only request
what's needed rather than filtering.

* feat: Support looking up Parquet files by either kind of Partition id

Regardless of which is actually stored on the Parquet file record.

That is, say there's a Partition in the catalog with:

Partition {
    id: 3,
    hash_id: abcdefg,
}

and a Parquet file that has:

ParquetFile {
    partition_hash_id: abcdefg,
}

calling `list_by_partition_not_to_delete(PartitionId(3))` should still
return this Parquet file because it is associated with the partition
that has ID 3.

This is important for the compactor, which is currently only dealing in
PartitionIds, and I'd like to keep it that way for now to avoid having
to change Even More in this PR.

* fix: Use and set new partition ID fields everywhere they want to be

---------

Co-authored-by: Dom <dom@itsallbroken.com>
2023-07-31 12:40:56 +00:00
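
The lookup rule from the commit above (a Parquet file belongs to a partition if either kind of partition ID matches) can be sketched as follows. The struct fields mirror the example in the commit message, but the concrete types, the `name` field, and the `files_for_partition` function are assumptions made for illustration and are not the catalog's real API.

```rust
/// Hypothetical, simplified catalog partition record.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Partition {
    pub id: i64,
    pub hash_id: Option<String>,
}

/// Hypothetical, simplified Parquet file record. Either ID may be absent.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ParquetFile {
    pub name: String,
    pub partition_id: Option<i64>,
    pub partition_hash_id: Option<String>,
}

/// Return the files associated with `partition`, matching on the numeric
/// ID or on the hash ID, whichever the file record happens to store.
pub fn files_for_partition<'a>(
    partition: &Partition,
    files: &'a [ParquetFile],
) -> Vec<&'a ParquetFile> {
    files
        .iter()
        .filter(|f| {
            let by_id = f.partition_id == Some(partition.id);
            let by_hash = partition.hash_id.is_some()
                && f.partition_hash_id == partition.hash_id;
            by_id || by_hash
        })
        .collect()
}
```

With a partition of `id: 3` and `hash_id: abcdefg`, a file that stores only `partition_hash_id: abcdefg` is still returned, which matches the behaviour the commit describes for `list_by_partition_not_to_delete(PartitionId(3))`.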
Fraser Savage 8e0cee8e73
refactor(tracker): Return disk usage watcher from `DiskUsageMetrics`
This allows the creator to pass around a handle to the latest observed
disk usage statistics, allowing other threads to act upon changes.
2023-07-31 12:14:13 +01:00
Dom Dwyer 8da08fa574
feat(router): optionally enable gossip subsystem
Allows the router to optionally enable and start the gossip subsystem
(disabled by default).

No code uses the gossip system, so no application-level messages are
exchanged, but this allows the gossip subsystem to run and exchange
control frames / perform discovery / etc.
2023-07-31 11:01:30 +02:00
Dom Dwyer d8badbe9ca
feat(ingester): optionally enable gossip subsystem
Allows the ingester to optionally enable and start the gossip subsystem
(disabled by default).

No code uses the gossip system, so no application-level messages are
exchanged, but this allows the gossip subsystem to run and exchange
control frames / perform discovery / etc.
2023-07-31 11:01:30 +02:00
Dom Dwyer 901839d66b
refactor: debug log for no-op payload dispatcher
Emit a debug log when the NopDispatcher receives a payload over gossip.
2023-07-31 11:01:29 +02:00
Dom Dwyer 1ec1b9155a
feat(ingester): optional gossip configuration
Exposes configuration parameters (on the ingester only) for
configuration of the gossip sub-system.
2023-07-31 11:01:25 +02:00
Dom 9b52bfdeaa
Merge pull request #8371 from influxdata/dependabot/cargo/serde-1.0.179
chore(deps): Bump serde from 1.0.177 to 1.0.179
2023-07-31 09:51:07 +01:00
dependabot[bot] 4fa6ead27d
chore(deps): Bump serde from 1.0.177 to 1.0.179
Bumps [serde](https://github.com/serde-rs/serde) from 1.0.177 to 1.0.179.
- [Release notes](https://github.com/serde-rs/serde/releases)
- [Commits](https://github.com/serde-rs/serde/compare/v1.0.177...v1.0.179)

---
updated-dependencies:
- dependency-name: serde
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-07-31 08:38:48 +00:00
dependabot[bot] ec8033a65b
chore(deps): Bump regex-automata from 0.3.3 to 0.3.4 (#8370)
Bumps [regex-automata](https://github.com/rust-lang/regex) from 0.3.3 to 0.3.4.
- [Release notes](https://github.com/rust-lang/regex/releases)
- [Changelog](https://github.com/rust-lang/regex/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-lang/regex/compare/regex-automata-0.3.3...regex-automata-0.3.4)

---
updated-dependencies:
- dependency-name: regex-automata
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-07-31 08:34:24 +00:00
wiedld 1ce8e50f1a
feat(idpe-17789): provide job from compactor --> scheduler, on commit
2023-07-28 15:58:50 -07:00
wiedld 61b65a9cbb
Merge pull request #8357 from influxdata/idpe-17789/rename-internal-variables
refactor(idpe-17789): rename internal variables
2023-07-28 14:42:37 -07:00
wiedld cfcef35680
Merge branch 'main' into idpe-17789/rename-internal-variables
2023-07-28 14:25:16 -07:00
kodiakhq[bot] 4f9c901dcf
Merge pull request #8354 from influxdata/savage/router-return-schema-change-diff-values-for-gossip
feat(router): Include table/column diff for namespace schema cache update
2023-07-28 14:23:52 +00:00
kodiakhq[bot] b3f35b4e7c
Merge branch 'main' into savage/router-return-schema-change-diff-values-for-gossip
2023-07-28 14:16:12 +00:00
Fraser Savage a930be45f7
refactor(router): Use map & sum over values instead of fold over iter
Also add a nice comment explaining what the string keys are for
[`ChangeStats`].

Co-authored-by: Dom <dom@itsallbroken.com>
2023-07-28 15:11:13 +01:00
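
The refactor named in the commit title above follows a common Rust pattern, sketched here on a made-up map of table name to new column names. The map shape and both function names are hypothetical; the actual `ChangeStats` structure is not shown in this log.

```rust
use std::collections::BTreeMap;

/// Total number of new columns across all tables, written with `fold`
/// over the key/value iterator.
pub fn total_with_fold(new_columns: &BTreeMap<String, Vec<String>>) -> usize {
    new_columns
        .iter()
        .fold(0, |acc, (_table, cols)| acc + cols.len())
}

/// The same total using `values()`, `map` and `sum`, which skips the
/// unused key and states the intent more directly.
pub fn total_with_sum(new_columns: &BTreeMap<String, Vec<String>>) -> usize {
    new_columns.values().map(|cols| cols.len()).sum()
}
```

Both functions return the same result; the second form avoids destructuring a key that is never used.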
dependabot[bot] 940f441259
chore(deps): Bump tikv-jemalloc-ctl from 0.5.0 to 0.5.4 (#8359)
Bumps [tikv-jemalloc-ctl](https://github.com/tikv/jemallocator) from 0.5.0 to 0.5.4.
- [Release notes](https://github.com/tikv/jemallocator/releases)
- [Changelog](https://github.com/tikv/jemallocator/blob/main/CHANGELOG.md)
- [Commits](https://github.com/tikv/jemallocator/compare/0.5.0...0.5.4)

---
updated-dependencies:
- dependency-name: tikv-jemalloc-ctl
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Dom <dom@itsallbroken.com>
2023-07-28 12:45:43 +00:00
Dom aad6608a9a
Merge pull request #8360 from influxdata/dependabot/cargo/serde-1.0.177
chore(deps): Bump serde from 1.0.176 to 1.0.177
2023-07-28 13:40:42 +01:00
dependabot[bot] 29d66444e5
chore(deps): Bump serde from 1.0.176 to 1.0.177
Bumps [serde](https://github.com/serde-rs/serde) from 1.0.176 to 1.0.177.
- [Release notes](https://github.com/serde-rs/serde/releases)
- [Commits](https://github.com/serde-rs/serde/compare/v1.0.176...v1.0.177)

---
updated-dependencies:
- dependency-name: serde
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-07-28 10:40:15 +00:00
dependabot[bot] 91baa9bab7
chore(deps): Bump tikv-jemalloc-sys (#8361)
Bumps [tikv-jemalloc-sys](https://github.com/tikv/jemallocator) from 0.5.3+5.3.0-patched to 0.5.4+5.3.0-patched.
- [Release notes](https://github.com/tikv/jemallocator/releases)
- [Changelog](https://github.com/tikv/jemallocator/blob/main/CHANGELOG.md)
- [Commits](https://github.com/tikv/jemallocator/compare/tikv-jemalloc-sys-0.5.3...0.5.4)

---
updated-dependencies:
- dependency-name: tikv-jemalloc-sys
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Dom <dom@itsallbroken.com>
2023-07-28 10:38:16 +00:00
Dom 783454adb6
Merge pull request #8332 from influxdata/dom/gossip
feat(gossip): peer exchange
2023-07-28 11:32:37 +01:00
Dom 61348373ff
Merge branch 'main' into dom/gossip
2023-07-28 11:13:06 +01:00
wiedld 9a7ff9ecfc
chore(idpe-17789): update code comments to reflect both jobs and partitions
2023-07-27 15:39:18 -07:00
wiedld d7fee9fdb8
refactor(idpe-17789): rename local variables and private struct properties to jobs (versus partitions).
2023-07-27 15:38:21 -07:00
wiedld 7ac6c6d80f
Merge pull request #8326 from influxdata/idpe-17789/compaction-job-renaming
refactor(idpe-17789): renaming abstractions related to partitions source, to compaction jobs source
2023-07-27 15:30:45 -07:00
wiedld 78ef536954
Merge branch 'main' into idpe-17789/compaction-job-renaming
2023-07-27 15:05:49 -07:00
Nga Tran e1626c3ba4
Merge pull request #8307 from influxdata/ntran/table_cli
feat: create table CLI
2023-07-27 11:03:07 -04:00
NGA-TRAN 091d387e2a
chore: merge main to branch ntran/table_cli
2023-07-27 10:28:45 -04:00
Joe-Blount 4af45e0cee
Merge pull request #8340 from influxdata/jrb_73_stuck
fix: no percent split during ManySmallFiles compaction
2023-07-27 09:03:56 -05:00