Commit Graph

451 Commits (ae5883146757fa9dcfced8b4151722a9b69e9fc4)

Author SHA1 Message Date
Dom Dwyer 2d46a364dc
feat: namespace soft-delete support
This commit adds initial support for "soft" namespace deletion, where
the actual records & data remain, but are no longer queryable /
writeable.

Soft deletion is eventually consistent - users can expect to continue
writing to and reading from a bucket after issuing a soft delete call,
until the various components either restart, or have their caches
flushed.

The components treat soft-deleted namespaces differently:

    * router: ignore soft deleted namespaces
    * ingester: accept soft deleted namespaces
    * compactor: accept soft deleted namespaces
    * querier: ignore soft deleted namespaces
    * various gRPC services: ignore soft deleted namespaces

This ensures that the ingester & compactor do not see rows "vanishing"
from the database, and continue to make forward progress.

Writes for the deleted namespace that are buffered in the ingester will
be persisted as normal, allowing us to support "un-delete" operations
where the system is restored to a the state at which the delete was
issued (rather than loosing the buffered data).

Follow-on work is required to ensure GC drops the orphaned parquet files
after the configured GC time, and optimisations such as not compacting
parquet from soft-deleted namespaces seems like a trivial win.
2023-02-13 12:01:35 +01:00
dependabot[bot] 0cbd9f6a82
chore(deps): Bump tokio-util from 0.7.5 to 0.7.7 (#6964)
---
updated-dependencies:
- dependency-name: tokio-util
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-02-13 10:10:53 +00:00
Andrew Lamb 779fb93ce7
refactor: move test builders out of compactor2 code (#6953)
* refactor: move test builders out of compactor2 code

* fix: docs
2023-02-10 18:28:09 +00:00
dependabot[bot] c0c9b51b9e
chore(deps): Bump tokio-util from 0.7.4 to 0.7.5 (#6941)
Bumps [tokio-util](https://github.com/tokio-rs/tokio) from 0.7.4 to 0.7.5.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-util-0.7.4...tokio-util-0.7.5)

---
updated-dependencies:
- dependency-name: tokio-util
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-02-10 09:42:00 +00:00
dependabot[bot] 0ecde75af5
chore(deps): Bump object_store from 0.5.3 to 0.5.4 (#6900)
Bumps [object_store](https://github.com/apache/arrow-rs) from 0.5.3 to 0.5.4.
- [Release notes](https://github.com/apache/arrow-rs/releases)
- [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG-old.md)
- [Commits](https://github.com/apache/arrow-rs/compare/object_store_0.5.3...object_store_0.5.4)

---
updated-dependencies:
- dependency-name: object_store
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-02-08 09:40:11 +00:00
Raphael Taylor-Davies d3601a59f8
chore: update DataFusion, upgrade `arrow` `arrow-flight` and `parquet` to `32.0.0` (#6756)
* chore: update DataFusion

* fix: test

* chore: format

* chore: clippy

* chore: update arrow

* chore: arrow upgrade fallout

* chore: Run cargo hakari tasks

* chore: remove failing warm compaction test

* fix: flight error propagation

* chore: update parquet size

* fix: Update error message

* chore: Update parquet metadata test

---------

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-02-06 11:35:39 +00:00
Carol (Nichols || Goulding) 30fea67701
fix: Move variables within format strings. Thanks clippy!
Changes made automatically using `cargo clippy --fix`.
2023-02-03 13:06:17 -05:00
dependabot[bot] d0e6b16450
chore(deps): Bump bytes from 1.3.0 to 1.4.0
Bumps [bytes](https://github.com/tokio-rs/bytes) from 1.3.0 to 1.4.0.
- [Release notes](https://github.com/tokio-rs/bytes/releases)
- [Changelog](https://github.com/tokio-rs/bytes/blob/master/CHANGELOG.md)
- [Commits](https://github.com/tokio-rs/bytes/compare/v1.3.0...v1.4.0)

---
updated-dependencies:
- dependency-name: bytes
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-02-01 00:30:56 +00:00
dependabot[bot] 6f032b1d57
chore(deps): Bump async-trait from 0.1.63 to 0.1.64 (#6769)
Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.63 to 0.1.64.
- [Release notes](https://github.com/dtolnay/async-trait/releases)
- [Commits](https://github.com/dtolnay/async-trait/compare/0.1.63...0.1.64)

---
updated-dependencies:
- dependency-name: async-trait
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-01-31 10:18:27 +00:00
dependabot[bot] ed7d02a225
chore(deps): Bump tokio from 1.24.2 to 1.25.0
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.24.2 to 1.25.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/commits/tokio-1.25.0)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-01-30 01:57:27 +00:00
Andrew Lamb ead6812210
fix: reduce logging verbosity (#6704)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-01-27 13:53:42 +00:00
Nga Tran b8a80869d4
feat: introduce a new way of max_sequence_number for ingester, compactor and querier (#6692)
* feat: introduce a new way of max_sequence_number for ingester, compactor and querier

* chore: cleanup

* feat: new column max_l0_created_at to order files for deduplication

* chore: cleanup

* chore: debug info for chnaging cpu.parquet

* fix: update test parquet file

Co-authored-by: Marco Neumann <marco@crepererum.net>
2023-01-26 10:52:47 +00:00
dependabot[bot] 0114e7ee50
chore(deps): Bump async-trait from 0.1.61 to 0.1.63 (#6660)
Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.61 to 0.1.63.
- [Release notes](https://github.com/dtolnay/async-trait/releases)
- [Commits](https://github.com/dtolnay/async-trait/compare/0.1.61...0.1.63)

---
updated-dependencies:
- dependency-name: async-trait
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-01-23 08:41:27 +00:00
Nga Tran e596f5f074
chore: metrics for compaction candidate counts (#6593)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-01-13 19:00:43 +00:00
Nga Tran 550cea8bc5
perf: optimize not to update partitions with newly created level 2 files (#6590)
* perf: optimize not to update partitions with newly created level 2 files

* chore: cleanup

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-01-13 14:46:58 +00:00
Nga Tran fa0893819c
fix: have warm compaction work with compactor2 (#6571)
* refactor: same function to select partition candidates

* fix: have warm compaction work with compactor2

* fix: format

* chore: cleanup
2023-01-12 02:32:39 +00:00
Nga Tran 1f508b76fc
refactor: same function to select partition candidates (#6569)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-01-12 01:14:49 +00:00
Nga Tran d3b2203560
fix: bug in count processed partittions (#6572) 2023-01-11 22:53:52 +00:00
Nga Tran 2de0e45b0a
fix: using created_at to order chunks for deduplication (#6556)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-01-10 18:18:33 +00:00
NGA-TRAN 2ae018e2f9 chore: merge main to branch 2023-01-10 11:55:07 -05:00
NGA-TRAN 1a93f70a8b fix: use created_at to order L0 during comapction 2023-01-10 11:48:05 -05:00
Nga Tran 62c0f3dbdd
feat: have cold compaction work with Compactor2 (#6542)
* feat: cold

* chore: debug info

* feat: only compact qualified cold partition candidates

* fix: catalog test

* chore: cleanup

* chore: add new config flag for cold partition candidates

* chore: implement display for CompactionType and add tests for max num partitions

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-01-10 16:42:57 +00:00
dependabot[bot] b49cc2e35e
chore(deps): Bump tokio from 1.24.0 to 1.24.1 (#6545)
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.24.0 to 1.24.1.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.24.0...tokio-1.24.1)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-01-10 09:48:44 +00:00
dependabot[bot] e31c84a794
chore(deps): Bump async-trait from 0.1.60 to 0.1.61 (#6533)
Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.60 to 0.1.61.
- [Release notes](https://github.com/dtolnay/async-trait/releases)
- [Commits](https://github.com/dtolnay/async-trait/compare/0.1.60...0.1.61)

---
updated-dependencies:
- dependency-name: async-trait
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-01-09 07:44:35 +00:00
Nga Tran 4031ea1c10
feat: integrate new way to get partition candidates for hot compaction (#6525)
* feat: integrate new way to get partition candidates for hot compaction

* chore: rename
2023-01-06 18:51:52 +00:00
Raphael Taylor-Davies e1036a0c63
refactor: cleanup schema boxing (#6511)
* refactor: cleanup Schema boxing

* chore: clippy
2023-01-06 10:57:39 +00:00
Nga Tran cd1a604df0
fix: make estimate memory the a query needs higher due to recent observation (#6476) 2022-12-30 21:15:39 +00:00
Nga Tran 0c944346e5
chore: debug info of estimate file size (#6475) 2022-12-30 20:13:21 +00:00
Nga Tran d27e137c39
chore: add debug info for the investigation (#6472) 2022-12-29 23:49:29 +00:00
Carol (Nichols || Goulding) 39acfc4f0d
fix: Remove needless casts. Thanks clippy! 2022-12-21 14:32:34 -05:00
Dom Dwyer adc6fcfb04
feat(catalog): linearise sort key updates
Updating the sort key is not commutative and MUST be serialised. The
correctness of the current catalog interface relies on the caller
serialising updates globally, something it cannot reasonably assert in a
distributed system.

This change of the catalog interface pushes this responsibility to the
catalog itself where it can be effectively enforced, and allows a caller
to detect parallel updates to the sort key.
2022-12-20 12:31:00 +01:00
kodiakhq[bot] c0f2ba09ee
Merge branch 'main' into cn/compactor2 2022-12-19 14:22:56 +00:00
dependabot[bot] c72734473c
chore(deps): Bump async-trait from 0.1.59 to 0.1.60 (#6433)
Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.59 to 0.1.60.
- [Release notes](https://github.com/dtolnay/async-trait/releases)
- [Commits](https://github.com/dtolnay/async-trait/compare/0.1.59...0.1.60)

---
updated-dependencies:
- dependency-name: async-trait
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-12-19 10:09:23 +00:00
Carol (Nichols || Goulding) dfd979477c
fix: Update warm compaction code to optionally take shard ID 2022-12-16 17:41:57 -05:00
Carol (Nichols || Goulding) d7e75d43ea
fix: Make shard ID optional for compactor queries in RPC write mode 2022-12-16 17:28:53 -05:00
Luke Bond f419e2c378
feat: warm compaction (#6192)
* feat: warm compaction

chore: add missing warm compaction config

chore: tests for warm compaction

chore: modify count usage in warm compaction sql

chore: catalog test for warm compaction; sql fixes

feat: settable target level for compact w/ budget

chore: tests for warm compaction

chore: clarifying comments in warm compaction test

chore: fixed erroneous comment in catalog test

chore: improve warm compactor test by checking file exists

chore: tests for warm compaction

chore: warm compactor test tidy-ups

* chore: improve test for warm compaction

* chore: fix erroneous comment in warm compaction code
2022-12-16 15:59:45 +00:00
dependabot[bot] 1d38d400f0
chore(deps): Bump object_store from 0.5.1 to 0.5.2 (#6339)
* chore(deps): Bump object_store from 0.5.1 to 0.5.2

Bumps [object_store](https://github.com/apache/arrow-rs) from 0.5.1 to 0.5.2.
- [Release notes](https://github.com/apache/arrow-rs/releases)
- [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG-old.md)
- [Commits](https://github.com/apache/arrow-rs/compare/object_store_0.5.1...object_store_0.5.2)

---
updated-dependencies:
- dependency-name: object_store
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore: Run cargo hakari tasks

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-12-06 07:53:54 +00:00
Nga Tran 77cbc880f6
feat: Add cap limit on number of partitions to be compacted in parallel (#6305)
* feat: Add cap limit on number of partitions to be comapcted in parallel

* chore: cleanup

* chore: clearer comments
2022-12-01 21:23:44 +00:00
Andrew Lamb 255a168d07
refactor: Refactor ParquetFileCombining into a builder and plan, and add sort exec test (#6196)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-22 14:47:47 +00:00
dependabot[bot] 04c00bbb62
chore(deps): Bump bytes from 1.2.1 to 1.3.0 (#6199)
Bumps [bytes](https://github.com/tokio-rs/bytes) from 1.2.1 to 1.3.0.
- [Release notes](https://github.com/tokio-rs/bytes/releases)
- [Changelog](https://github.com/tokio-rs/bytes/blob/master/CHANGELOG.md)
- [Commits](https://github.com/tokio-rs/bytes/commits)

---
updated-dependencies:
- dependency-name: bytes
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-22 08:23:24 +00:00
dependabot[bot] a9db7581cd
chore(deps): Bump tokio from 1.21.2 to 1.22.0 (#6183)
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.21.2 to 1.22.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.21.2...tokio-1.22.0)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-21 10:21:24 +00:00
Luke Bond 7c813c170a
feat: reintroduce compactor first file in partition exception (#6176)
* feat: compactor ignores max file count for first file

chore: typo in comment in compactor

* feat: restore special first file in partition compaction logic; add limit

* fix: calculation in compaction max file count

chore: clippy

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-18 15:58:59 +00:00
Nga Tran 49a9565240
feat: gRPC that creates namespace (#6103)
* feat: create namespace API call in router

Co-authored-by: Nga Tran <nga-tran@live.com>

* chore: treat retention as ns except in CLI

* fix: overflow in nanosecond calc

* fix: retention test after changing it from hours to ns

* chore: comment clarification in cli; better response type for error in ns API

* fix: correct some rebase mistakes

* chore: merge namespace create & create_with_retention; renamed ns create test helper fn & const

* fix: ns autocreation test was wrong after rebase

* fix: mem catalog has default 1hr retention, accidently removed in rebase

* chore: remove mem catalogs default 1hr retention; make it settable in sets & router

Co-authored-by: Luke Bond <luke.n.bond@gmail.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-18 13:02:12 +00:00
Nga Tran 80e91a644b
chore: Revert "feat: compactor ignores max file count for first file (#6144)" (#6158)
This reverts commit bf1681f4fe.
2022-11-16 19:58:46 +00:00
Luke Bond bf1681f4fe
feat: compactor ignores max file count for first file (#6144)
* feat: compactor ignores max file count for first file

* chore: typo in comment in compactor
2022-11-16 11:21:28 +00:00
Andrew Lamb 448911794c
test: test coverage for sorting and merging in compactor (#6136)
* test: test coverage for sorting and merging in compactor

* fix: Apply suggestions from code review (comments)

Co-authored-by: Marco Neumann <marco@crepererum.net>

* feat: use itertools to cover all permutations

Co-authored-by: Marco Neumann <marco@crepererum.net>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-15 20:39:45 +00:00
Nga Tran 9c4266c503
refactor: first step to remove unused retention_duration (#6113)
* refactor: first step to remove unused retention_duration

* refactor: remove retenion_duration from update catalog

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-11 15:21:06 +00:00
Luke Bond f9316decee
chore: expose compactor's hot compaction hours thresholds as cfg (#6060)
* chore: expose compactor's hot compaction hours thresholds as cfg

* fix: add missing compactor arg envar; fix some comments

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-07 15:29:17 +00:00
Marco Neumann f511db380c
refactor: remove table name from chunks (#6063)
It should be always clear from the context to which table a chunk
belongs.

I think having a table name bound to a chunk goes back to a time where
chunks had multiple tables.

Helps with #6049.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-07 10:42:57 +00:00
Nga Tran 654ed98d1f
feat: config param to set when partition is cold (#6044)
* feat: config param to set when partition is cold

* chore: Apply suggestions from code review

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

* fix: make default 8 hours and avoid using 8 * 60 becasue it is a string, not expression which makes a test fail

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-03 15:03:56 +00:00