Commit Graph

315 Commits (3997de6a524ab77a765f62ec07d50b26d1d0795c)

Author SHA1 Message Date
Dom Dwyer 2d46a364dc
feat: namespace soft-delete support
This commit adds initial support for "soft" namespace deletion, where
the actual records & data remain, but are no longer queryable /
writeable.

Soft deletion is eventually consistent - users can expect to continue
writing to and reading from a bucket after issuing a soft delete call,
until the various components either restart, or have their caches
flushed.

The components treat soft-deleted namespaces differently:

    * router: ignore soft deleted namespaces
    * ingester: accept soft deleted namespaces
    * compactor: accept soft deleted namespaces
    * querier: ignore soft deleted namespaces
    * various gRPC services: ignore soft deleted namespaces

This ensures that the ingester & compactor do not see rows "vanishing"
from the database, and continue to make forward progress.

Writes for the deleted namespace that are buffered in the ingester will
be persisted as normal, allowing us to support "un-delete" operations
where the system is restored to a the state at which the delete was
issued (rather than loosing the buffered data).

Follow-on work is required to ensure GC drops the orphaned parquet files
after the configured GC time, and optimisations such as not compacting
parquet from soft-deleted namespaces seems like a trivial win.
2023-02-13 12:01:35 +01:00
Dom Dwyer a85dcd745b
refactor(catalog): expose deleted_at on Namespace
Add the new catalog column to the Namespace representation/model.
2023-02-10 14:15:01 +01:00
Dom Dwyer aa74bb6292
test(catalog): independent test state
All our catalog tests run as one test, over one database connection.
Prior to this commit, there was no state reset during test execution, so
earlier tests would pollute the state of later tests, making it an
increasingly complex and intermingled set of tests trying to assert
their entities while ignoring other, previously created entities (or
changing the order of test execution in search of the golden ordering
that makes everything great again.)

This is a bit of a hack, and is not how I'd have structured catalog
testing w/ clean state if I was writing it fresh. It is what it is.

This has been driving me mad for SO LONG it's SO BAD <shakes fist>.
2023-02-09 17:27:58 +01:00
Dom Dwyer 61409f062c
refactor(catalog): soft delete namespace column
Adds a "deleted_at" column that will indicate the timestamp at which is
was marked as logically deleted.
2023-02-09 11:35:27 +01:00
Marco Neumann dcba47ab58
feat: allow the compactor to process all known partitions (#6887)
* feat: `PartitionRepo::list_ids`

* refactor: `CatalogPartitionsSource` => `CatalogToCompactPartitionsSource`

* feat: allow the compactor to process all known partitions

Closes #6648.

* docs: improve

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

---------

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2023-02-08 09:32:21 +00:00
Dom 9e8785f9c1
Merge branch 'main' into dom/cache-table-limit 2023-02-07 10:04:50 +00:00
Stuart Carnie eb245d6774
feat: Initial SQLite catalog schema (#6851)
* feat: Initial SQLite catalog schema

* chore: Run cargo hakari tasks

* feat: impls, many TODOs

* feat: completed `todo!()`'s

* chore: add remaining tests from postgres module

* feat: add SQLite to get_catalog API

* chore: Add docs

* chore: Placate clippy

* chore: Placate clippy

* chore: PR feedback from @domodwyer

---------

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2023-02-06 22:55:14 +00:00
Dom Dwyer a633964f2b
feat(catalog): return max table limit in schema
The maximum number of tables is part of the Namespace, which is already
loaded in its entirety. This commit copies the value into the
NamespaceSchema, making it available for the router to utilise.
2023-02-06 17:33:55 +01:00
Carol (Nichols || Goulding) 30fea67701
fix: Move variables within format strings. Thanks clippy!
Changes made automatically using `cargo clippy --fix`.
2023-02-03 13:06:17 -05:00
Dom Dwyer 2af563ad94
refactor(iox): lower default table limit
Drops the default table limit from 10,000 to 500. This will apply to all
new namespaces / org&bucket created after this change has been deployed.
2023-02-03 16:28:38 +01:00
Dom Dwyer c343ea60b7
feat(catalog): use explicit UTC time zone
We use UTC, but that doesn't mean everyone does. Queries that utilise
NOW() will return incorrect results when the server is using a non-UTC
tz, but application-provided UTC timestamps / epochs.
2023-02-02 11:51:25 +01:00
Marko Mikulicic 167cde1838
fix(iox): Add transition shard to catalog setup (#6799) 2023-02-01 17:04:03 +00:00
dependabot[bot] 6f032b1d57
chore(deps): Bump async-trait from 0.1.63 to 0.1.64 (#6769)
Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.63 to 0.1.64.
- [Release notes](https://github.com/dtolnay/async-trait/releases)
- [Commits](https://github.com/dtolnay/async-trait/compare/0.1.63...0.1.64)

---
updated-dependencies:
- dependency-name: async-trait
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-01-31 10:18:27 +00:00
Marco Neumann 9279570299
fix: PG `flag_for_delete_by_retention` (#6762)
* test: failing test

* fix: PG `flag_for_delete_by_retention`
2023-01-30 21:27:12 +00:00
dependabot[bot] ed7d02a225
chore(deps): Bump tokio from 1.24.2 to 1.25.0
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.24.2 to 1.25.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/commits/tokio-1.25.0)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-01-30 01:57:27 +00:00
Nga Tran b8a80869d4
feat: introduce a new way of max_sequence_number for ingester, compactor and querier (#6692)
* feat: introduce a new way of max_sequence_number for ingester, compactor and querier

* chore: cleanup

* feat: new column max_l0_created_at to order files for deduplication

* chore: cleanup

* chore: debug info for chnaging cpu.parquet

* fix: update test parquet file

Co-authored-by: Marco Neumann <marco@crepererum.net>
2023-01-26 10:52:47 +00:00
Nga Tran 06d4a5fe4e
refactor: ignore partitions in table skipped compactions (#6666)
* refactor: ignore partitions in table skipped compactions

* refactor: continue ignoring partitions in skipped compaction

* test: skip partition
2023-01-23 19:53:05 +00:00
dependabot[bot] 0114e7ee50
chore(deps): Bump async-trait from 0.1.61 to 0.1.63 (#6660)
Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.61 to 0.1.63.
- [Release notes](https://github.com/dtolnay/async-trait/releases)
- [Commits](https://github.com/dtolnay/async-trait/compare/0.1.61...0.1.63)

---
updated-dependencies:
- dependency-name: async-trait
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-01-23 08:41:27 +00:00
Marco Neumann 5c462937ca
refactor: converters for `ParquetFile`<>`ParquetFileParams` (#6637)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-01-19 16:47:54 +00:00
Nga Tran 7a5fdd1d95
feat: function to read partition IDs of all partitions with new writes (#6613)
* feat: function to read partition IDs of all partitions with new writes

* chore: run fmt

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-01-18 16:30:15 +00:00
Nga Tran 550cea8bc5
perf: optimize not to update partitions with newly created level 2 files (#6590)
* perf: optimize not to update partitions with newly created level 2 files

* chore: cleanup

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-01-13 14:46:58 +00:00
Nga Tran 62c0f3dbdd
feat: have cold compaction work with Compactor2 (#6542)
* feat: cold

* chore: debug info

* feat: only compact qualified cold partition candidates

* fix: catalog test

* chore: cleanup

* chore: add new config flag for cold partition candidates

* chore: implement display for CompactionType and add tests for max num partitions

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-01-10 16:42:57 +00:00
dependabot[bot] b49cc2e35e
chore(deps): Bump tokio from 1.24.0 to 1.24.1 (#6545)
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.24.0 to 1.24.1.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.24.0...tokio-1.24.1)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-01-10 09:48:44 +00:00
dependabot[bot] e31c84a794
chore(deps): Bump async-trait from 0.1.60 to 0.1.61 (#6533)
Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.60 to 0.1.61.
- [Release notes](https://github.com/dtolnay/async-trait/releases)
- [Commits](https://github.com/dtolnay/async-trait/compare/0.1.60...0.1.61)

---
updated-dependencies:
- dependency-name: async-trait
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-01-09 07:44:35 +00:00
Nga Tran b20226797a
fix: make trigger midification in different file (#6526) 2023-01-06 20:34:48 +00:00
Nga Tran b856edf826
feat: function to get parttion candidates from partition table (#6519)
* feat: function to get parttion candidates from partition table

* chore: cleanup

* fix: make new_file_at the same value as created_at

* chore: cleanup

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-01-06 16:20:45 +00:00
Nga Tran 23807df7a9
feat: trigger that updates partition table when a parquet file is created (#6514)
* feat: trigger that update partition table when a parquet file is created

* chore: simplify epoch of now
2023-01-05 19:57:23 +00:00
NGA-TRAN c733f27eea chore: cleanup 2023-01-04 16:00:42 -05:00
NGA-TRAN 72977bf250 feat: catalog query to select partitions with recently created files 2023-01-04 15:54:46 -05:00
Nga Tran 1088baea3d
chore: index for selecting partitions with parquet files created after a given time (#6496)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-01-04 18:07:07 +00:00
Luke Bond fb8ead8515
Merge branch 'main' into feat/delete-namespace-api 2023-01-04 11:55:01 +00:00
Carol (Nichols || Goulding) 72aab99951
fix: Remove needless borrow. Thanks clippy! 2022-12-21 14:32:34 -05:00
Dom Dwyer 2f88fc71ce
docs: fix incomplete comment 2022-12-20 12:31:01 +01:00
Dom Dwyer adc6fcfb04
feat(catalog): linearise sort key updates
Updating the sort key is not commutative and MUST be serialised. The
correctness of the current catalog interface relies on the caller
serialising updates globally, something it cannot reasonably assert in a
distributed system.

This change of the catalog interface pushes this responsibility to the
catalog itself where it can be effectively enforced, and allows a caller
to detect parallel updates to the sort key.
2022-12-20 12:31:00 +01:00
kodiakhq[bot] c0f2ba09ee
Merge branch 'main' into cn/compactor2 2022-12-19 14:22:56 +00:00
dependabot[bot] 299f0e99f9
chore(deps): Bump thiserror from 1.0.37 to 1.0.38
Bumps [thiserror](https://github.com/dtolnay/thiserror) from 1.0.37 to 1.0.38.
- [Release notes](https://github.com/dtolnay/thiserror/releases)
- [Commits](https://github.com/dtolnay/thiserror/compare/1.0.37...1.0.38)

---
updated-dependencies:
- dependency-name: thiserror
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-12-19 10:33:32 +00:00
dependabot[bot] 8478d41bcb
chore(deps): Bump paste from 1.0.10 to 1.0.11 (#6430)
Bumps [paste](https://github.com/dtolnay/paste) from 1.0.10 to 1.0.11.
- [Release notes](https://github.com/dtolnay/paste/releases)
- [Commits](https://github.com/dtolnay/paste/compare/1.0.10...1.0.11)

---
updated-dependencies:
- dependency-name: paste
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-12-19 10:31:05 +00:00
dependabot[bot] c72734473c
chore(deps): Bump async-trait from 0.1.59 to 0.1.60 (#6433)
Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.59 to 0.1.60.
- [Release notes](https://github.com/dtolnay/async-trait/releases)
- [Commits](https://github.com/dtolnay/async-trait/compare/0.1.59...0.1.60)

---
updated-dependencies:
- dependency-name: async-trait
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-12-19 10:09:23 +00:00
Carol (Nichols || Goulding) dfd979477c
fix: Update warm compaction code to optionally take shard ID 2022-12-16 17:41:57 -05:00
Carol (Nichols || Goulding) b1b9b3122a
test: Run cold compaction catalog functions first and in a transaction
To avoid other tests' state bleeding into this one and this one's state
bleeding into other tests, now that it's testing some queries without
scoping by shard.
2022-12-16 17:28:55 -05:00
Carol (Nichols || Goulding) d7e75d43ea
fix: Make shard ID optional for compactor queries in RPC write mode 2022-12-16 17:28:53 -05:00
Luke Bond f419e2c378
feat: warm compaction (#6192)
* feat: warm compaction

chore: add missing warm compaction config

chore: tests for warm compaction

chore: modify count usage in warm compaction sql

chore: catalog test for warm compaction; sql fixes

feat: settable target level for compact w/ budget

chore: tests for warm compaction

chore: clarifying comments in warm compaction test

chore: fixed erroneous comment in catalog test

chore: improve warm compactor test by checking file exists

chore: tests for warm compaction

chore: warm compactor test tidy-ups

* chore: improve test for warm compaction

* chore: fix erroneous comment in warm compaction code
2022-12-16 15:59:45 +00:00
Luke Bond 1bc2003cf4 chore: simplify delete namespace sql query 2022-12-16 10:23:50 +00:00
Luke Bond a6036631ad chore: comment typo in catalog 2022-12-16 10:23:50 +00:00
Luke Bond 6263ca234a chore: delete ns postgres impl, test improvements, fix to mem impl 2022-12-16 10:23:50 +00:00
Luke Bond 3659be59c7 feat: delete namespace api mem impl
chore: tests for delete namespace; use unique ptn names in tests
2022-12-16 10:23:50 +00:00
dependabot[bot] e108a8b6c9
chore(deps): Bump paste from 1.0.9 to 1.0.10 (#6384)
Bumps [paste](https://github.com/dtolnay/paste) from 1.0.9 to 1.0.10.
- [Release notes](https://github.com/dtolnay/paste/releases)
- [Commits](https://github.com/dtolnay/paste/compare/1.0.9...1.0.10)

---
updated-dependencies:
- dependency-name: paste
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-12-13 06:03:05 +00:00
dependabot[bot] a9db7581cd
chore(deps): Bump tokio from 1.21.2 to 1.22.0 (#6183)
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.21.2 to 1.22.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.21.2...tokio-1.22.0)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-21 10:21:24 +00:00
Luke Bond 7c813c170a
feat: reintroduce compactor first file in partition exception (#6176)
* feat: compactor ignores max file count for first file

chore: typo in comment in compactor

* feat: restore special first file in partition compaction logic; add limit

* fix: calculation in compaction max file count

chore: clippy

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-18 15:58:59 +00:00
Nga Tran 49a9565240
feat: gRPC that creates namespace (#6103)
* feat: create namespace API call in router

Co-authored-by: Nga Tran <nga-tran@live.com>

* chore: treat retention as ns except in CLI

* fix: overflow in nanosecond calc

* fix: retention test after changing it from hours to ns

* chore: comment clarification in cli; better response type for error in ns API

* fix: correct some rebase mistakes

* chore: merge namespace create & create_with_retention; renamed ns create test helper fn & const

* fix: ns autocreation test was wrong after rebase

* fix: mem catalog has default 1hr retention, accidently removed in rebase

* chore: remove mem catalogs default 1hr retention; make it settable in sets & router

Co-authored-by: Luke Bond <luke.n.bond@gmail.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-18 13:02:12 +00:00