Commit Graph

332 Commits (c506d883818a6059760bab0fc6cdb6f29b125201)

Author SHA1 Message Date
Dom Dwyer adc6fcfb04
feat(catalog): linearise sort key updates
Updating the sort key is not commutative and MUST be serialised. The
correctness of the current catalog interface relies on the caller
serialising updates globally, something it cannot reasonably assert in a
distributed system.

This change of the catalog interface pushes this responsibility to the
catalog itself where it can be effectively enforced, and allows a caller
to detect parallel updates to the sort key.
2022-12-20 12:31:00 +01:00
kodiakhq[bot] c0f2ba09ee
Merge branch 'main' into cn/compactor2 2022-12-19 14:22:56 +00:00
dependabot[bot] 299f0e99f9
chore(deps): Bump thiserror from 1.0.37 to 1.0.38
Bumps [thiserror](https://github.com/dtolnay/thiserror) from 1.0.37 to 1.0.38.
- [Release notes](https://github.com/dtolnay/thiserror/releases)
- [Commits](https://github.com/dtolnay/thiserror/compare/1.0.37...1.0.38)

---
updated-dependencies:
- dependency-name: thiserror
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-12-19 10:33:32 +00:00
dependabot[bot] 8478d41bcb
chore(deps): Bump paste from 1.0.10 to 1.0.11 (#6430)
Bumps [paste](https://github.com/dtolnay/paste) from 1.0.10 to 1.0.11.
- [Release notes](https://github.com/dtolnay/paste/releases)
- [Commits](https://github.com/dtolnay/paste/compare/1.0.10...1.0.11)

---
updated-dependencies:
- dependency-name: paste
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-12-19 10:31:05 +00:00
dependabot[bot] c72734473c
chore(deps): Bump async-trait from 0.1.59 to 0.1.60 (#6433)
Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.59 to 0.1.60.
- [Release notes](https://github.com/dtolnay/async-trait/releases)
- [Commits](https://github.com/dtolnay/async-trait/compare/0.1.59...0.1.60)

---
updated-dependencies:
- dependency-name: async-trait
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-12-19 10:09:23 +00:00
Carol (Nichols || Goulding) dfd979477c
fix: Update warm compaction code to optionally take shard ID 2022-12-16 17:41:57 -05:00
Carol (Nichols || Goulding) b1b9b3122a
test: Run cold compaction catalog functions first and in a transaction
To avoid other tests' state bleeding into this one and this one's state
bleeding into other tests, now that it's testing some queries without
scoping by shard.
2022-12-16 17:28:55 -05:00
Carol (Nichols || Goulding) d7e75d43ea
fix: Make shard ID optional for compactor queries in RPC write mode 2022-12-16 17:28:53 -05:00
Luke Bond f419e2c378
feat: warm compaction (#6192)
* feat: warm compaction

chore: add missing warm compaction config

chore: tests for warm compaction

chore: modify count usage in warm compaction sql

chore: catalog test for warm compaction; sql fixes

feat: settable target level for compact w/ budget

chore: tests for warm compaction

chore: clarifying comments in warm compaction test

chore: fixed erroneous comment in catalog test

chore: improve warm compactor test by checking file exists

chore: tests for warm compaction

chore: warm compactor test tidy-ups

* chore: improve test for warm compaction

* chore: fix erroneous comment in warm compaction code
2022-12-16 15:59:45 +00:00
Luke Bond 1bc2003cf4 chore: simplify delete namespace sql query 2022-12-16 10:23:50 +00:00
Luke Bond a6036631ad chore: comment typo in catalog 2022-12-16 10:23:50 +00:00
Luke Bond 6263ca234a chore: delete ns postgres impl, test improvements, fix to mem impl 2022-12-16 10:23:50 +00:00
Luke Bond 3659be59c7 feat: delete namespace api mem impl
chore: tests for delete namespace; use unique ptn names in tests
2022-12-16 10:23:50 +00:00
dependabot[bot] e108a8b6c9
chore(deps): Bump paste from 1.0.9 to 1.0.10 (#6384)
Bumps [paste](https://github.com/dtolnay/paste) from 1.0.9 to 1.0.10.
- [Release notes](https://github.com/dtolnay/paste/releases)
- [Commits](https://github.com/dtolnay/paste/compare/1.0.9...1.0.10)

---
updated-dependencies:
- dependency-name: paste
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-12-13 06:03:05 +00:00
dependabot[bot] a9db7581cd
chore(deps): Bump tokio from 1.21.2 to 1.22.0 (#6183)
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.21.2 to 1.22.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.21.2...tokio-1.22.0)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-21 10:21:24 +00:00
Luke Bond 7c813c170a
feat: reintroduce compactor first file in partition exception (#6176)
* feat: compactor ignores max file count for first file

chore: typo in comment in compactor

* feat: restore special first file in partition compaction logic; add limit

* fix: calculation in compaction max file count

chore: clippy

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-18 15:58:59 +00:00
Nga Tran 49a9565240
feat: gRPC that creates namespace (#6103)
* feat: create namespace API call in router

Co-authored-by: Nga Tran <nga-tran@live.com>

* chore: treat retention as ns except in CLI

* fix: overflow in nanosecond calc

* fix: retention test after changing it from hours to ns

* chore: comment clarification in cli; better response type for error in ns API

* fix: correct some rebase mistakes

* chore: merge namespace create & create_with_retention; renamed ns create test helper fn & const

* fix: ns autocreation test was wrong after rebase

* fix: mem catalog has default 1hr retention, accidently removed in rebase

* chore: remove mem catalogs default 1hr retention; make it settable in sets & router

Co-authored-by: Luke Bond <luke.n.bond@gmail.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-18 13:02:12 +00:00
Nga Tran 6f7b1e2e26
feat: reject writes that are outside the retention period (#6148)
* feat: reject writes that are outside the retention period

* feat: add retention validator into handler stack

* chore: Apply suggestions from code review

Co-authored-by: Dom <dom@itsallbroken.com>

* refactor: address review comments

* test: unit tests fot retention validation

* chore: address review comments

* test: more unit tests and integration tests

* refactor: make time inside retention period for emphemeral_mode test

* fix: 2 hours

Co-authored-by: Dom <dom@itsallbroken.com>
2022-11-17 20:55:58 +00:00
Carol (Nichols || Goulding) bdff4e8848
fix: Consistently use 'namespace' instead of 'database' in comments and other internal text 2022-11-11 15:46:04 -05:00
Nga Tran a3f2fe489c
refactor: remove retention_duration field from namespace catalog table (#6124) 2022-11-11 20:30:42 +00:00
Nga Tran 9c4266c503
refactor: first step to remove unused retention_duration (#6113)
* refactor: first step to remove unused retention_duration

* refactor: remove retenion_duration from update catalog

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-11 15:21:06 +00:00
Nga Tran 93e11d4c91
chore: Revert "feat: flag partitions for delete (#6075)" (#6111)
This reverts commit 77a2541172.
2022-11-10 17:01:39 +00:00
Nga Tran e81ff1f4d5
chore: Revert "feat: catalog delete old partitions (#6099)" (#6109)
This reverts commit 664b0578e9.
2022-11-10 15:31:16 +00:00
Luke Bond 664b0578e9
feat: catalog delete old partitions (#6099)
* feat: catalog delete old partitions

* chore: remove debug println

* chore: remove debug println

* chore: clippy

* chore: sql statement refactor for deleting partitions

* chore: improve delete partition test

* chore: clippy
2022-11-10 10:51:22 +00:00
Carol (Nichols || Goulding) 43687a86d2
fix: Remove lots of needless borrows that Clippy can now identify
Except for in generated code that we don't control.
2022-11-09 10:54:18 -05:00
Carol (Nichols || Goulding) 07505c8f72
fix: Remove needless borrows, thanks Clippy! 2022-11-09 10:54:18 -05:00
Nga Tran 77a2541172
feat: flag partitions for delete (#6075)
* feat: flag partition for delete

* fix: compare the right date and time

* chore: Run cargo hakari tasks

* chore: cleanup

* fix: typos

* chore: rust style tidy ups in catalog

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: Luke Bond <luke.n.bond@gmail.com>
2022-11-09 12:06:23 +00:00
Luke Bond dfb820615c
feat: deletion flagging in GC based on retention policy (#6073)
* feat: deletion flagging in GC based on retention policy

* chore: typo in comment

* fix: only soft delete parquet files that aren't yet soft deleted

* fix: guard against flakiness in catalog test

* chore: some better tests for parquet file delete flagging

Co-authored-by: Nga Tran <nga-tran@live.com>
2022-11-08 20:22:35 +00:00
kodiakhq[bot] 369937d68f
Merge branch 'main' into cn/order-by-insert 2022-11-07 19:18:13 +00:00
Carol (Nichols || Goulding) 74a40cc9bd
fix: Assert that there aren't two columns with the same name in the same batch
This shouldn't be possible; let's make sure we know if it happens!
2022-11-07 14:10:12 -05:00
Luke Bond 5e05fa52cf
feat: soft delete parquet files based on retention period (#6070) 2022-11-07 17:31:29 +00:00
Nga Tran 9356f2a1b9
feat: grpc for updating namespace retention period (#6041)
* refactor: make namespace folder for all namesapce's commands

* feat: WIP for add command to set retention period

* feat: more on updating retention period

* feat: grpc for update namespace retention period

* test: end to end test fpr namespace retention

* fix: lint proto

* chore: cleanup

* chore: kick CI run again

* fix: command hierachy

* chore: fix comments
2022-11-04 20:58:11 +00:00
Carol (Nichols || Goulding) 9b0af96927
docs: Add a link to the idpe issue
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
2022-11-04 13:39:25 -04:00
Carol (Nichols || Goulding) d454c66b4b
fix: Use a HashMap for column lookup instead of Vec ordering
The checks for whether a column already exists with a different type
were relying on ordering of the input matching the ordering of the
columns returned from inserting the columns in Postgres.

Rather than trying to match the new ordering that is required to avoid
Postgres deadlocks, switch from a Vec to a HashMap and look up the
column type from the name.

This also reduces some allocations that weren't really needed.
2022-11-04 11:52:37 -04:00
Carol (Nichols || Goulding) a6634ada19
fix: Add an ORDER BY to the insert to prevent Postgres deadlocks
Fixes influxdata/idpe#16298.

Without this ORDER BY, concurrent writes that add many column records to
this table can deadlock because they grab locks to rows/index entries in
an arbitrary order to check the unique index.

By switching to a consistent order across all requests, inserts won't
get in a deadlock loop waiting for each other.

More info:

- <https://rcoh.svbtle.com/postgres-unique-constraints-can-cause-deadlock>
- <https://dba.stackexchange.com/a/195220/27897>
2022-11-04 11:52:37 -04:00
NGA-TRAN 498851eaf5 feat: add catalog columns needed for retention policy 2022-11-01 15:35:15 -04:00
Carol (Nichols || Goulding) 2e83e04eab
feat: Use workspace package metadata to reduce differences and repetition 2022-10-24 13:04:09 -04:00
Jake Goulding df2ba85661 feat: add the ability to delete a skipped compaction 2022-10-21 15:12:20 -04:00
dependabot[bot] b5574c07b7
chore(deps): Bump async-trait from 0.1.57 to 0.1.58 (#5904)
Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.57 to 0.1.58.
- [Release notes](https://github.com/dtolnay/async-trait/releases)
- [Commits](https://github.com/dtolnay/async-trait/compare/0.1.57...0.1.58)

---
updated-dependencies:
- dependency-name: async-trait
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-10-19 09:40:26 +00:00
dependabot[bot] f3c27c5c71
chore(deps): Bump dotenvy from 0.15.5 to 0.15.6 (#5881)
Bumps [dotenvy](https://github.com/allan2/dotenvy) from 0.15.5 to 0.15.6.
- [Release notes](https://github.com/allan2/dotenvy/releases)
- [Changelog](https://github.com/allan2/dotenvy/blob/master/CHANGELOG.md)
- [Commits](https://github.com/allan2/dotenvy/compare/v0.15.5...v0.15.6)

---
updated-dependencies:
- dependency-name: dotenvy
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-10-18 07:06:40 +00:00
Dom Dwyer afdc008855 fix: correct default limits 2022-10-14 16:05:56 +02:00
Dom Dwyer e4179605df test: assert default service limit values
Adds tests that assert the default service limit values.
2022-10-14 14:46:34 +02:00
Dom Dwyer 46bbee5423 refactor: reduce default column limit
Reduces the default number of columns allowed per-table, from 1,000 to
200.
2022-10-14 14:45:48 +02:00
Carol (Nichols || Goulding) efb964c390
feat: Enforce table column limits from the schema cache (#5819)
* fix: Avoid some allocations by collecting instead of inserting into a vec

* refactor: Encode that adding columns is for one table at a time

* test: Add another test of column limits

* test: Add below/above limit tests for create_or_get_many

* fix: Explicitly DO NOT check column limits when inserting many columns

* feat: Cache the max_columns_per_table on the NamespaceSchema

* feat: Add a function to validate column limits in-memory

* fix: Provide more useful information when over column limits

* fix: Swap types to remove intermediate allocation

* docs: Explain the interactions of the cache and the column limits

* test: Actually set up test that showcases column limit race condition

* fix: Allow writing to existing columns even if table is over column limit

Co-authored-by: Dom <dom@itsallbroken.com>
2022-10-14 11:34:17 +00:00
Dom Dwyer 3e70dc44a0 refactor(catalog): remove partition_info_by_id()
This method used to return a subset of partition metadata, and was used
exclusively for persistence in the ingester. It is now no longer
necessary.
2022-10-13 15:26:36 +02:00
Dom Dwyer 3e1e4c1f0b refactor: remove Table::get_table_persist_info()
Remove the now-redundant get_table_persist_info() implementations.
2022-10-13 13:44:50 +02:00
Dom Dwyer c4f542bbe2 refactor(ingester): remove tombstone support
This commit removes tombstone support from the ingester, and deletes
associated code/helpers/tests. This commit does NOT remove tombstone
support from any other service, but MAY include removing overlapping
test coverage.

This also removes the tombstone support from the Ingester -> Querier RPC
response message.

This has the nice side effect of removing a whole lot of thread spawning
in the ingester tests for the Executor, speeding everything up!
2022-10-11 13:10:04 +02:00
Dom Dwyer afcb96ae47 perf(ingester): deferred sort key lookup queries
This commit carries the SortKey in the PartitionData, and configures the
ingester to use deferred sort key lookups, smearing the lookups across a
fixed period of time after initialising the PartitionData, instead of
querying for the sort key at persist time.

This allows large numbers of PartitionData to be initialised without
causing a equally large spike in catalog load to resolve the sort key -
instead this load is spread out randomly to reduce peak query rps.
2022-10-06 16:39:54 +02:00
Nga Tran d171697fd7
feat: always pick cold partitions in next cycle even if it has been pa… (#5772)
* fix: always pick cold partitions in next cycle even if it has been partially compacted recently

* fix: comment

* fix: test output

* refactor: using var instead of literal

* fix: consider deleted L0s for recent writes

* chore: cleanup

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-09-30 15:54:00 +00:00
Dom Dwyer cd4087e00d style: add no todo!() or dbg!() lints
Some crates had theme, some not - lets be consistent and have the
compiler spot dbg!() and todo!() macro calls - they should never be in
prod code!
2022-09-29 13:10:07 +02:00
dependabot[bot] 227dde1dfc
chore(deps): Bump thiserror from 1.0.36 to 1.0.37 (#5753)
Bumps [thiserror](https://github.com/dtolnay/thiserror) from 1.0.36 to 1.0.37.
- [Release notes](https://github.com/dtolnay/thiserror/releases)
- [Commits](https://github.com/dtolnay/thiserror/compare/1.0.36...1.0.37)

---
updated-dependencies:
- dependency-name: thiserror
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-09-29 10:37:14 +00:00
Dom Dwyer e19b88cae9 feat(catalog): most recent N partitions
Adds a Partition::most_recent_n() method to the catalog interface,
returning the N most recent partitions for a given set of shards.

The most recently created partitions are likely to be currently "hot"
for writes, and are cheap to list.
2022-09-27 16:22:00 +02:00
Nga Tran 75ff805ee2
feat: instead of adding num_files and memory budget into the reason text column, let us create differnt columns for them. We will be able to filter them easily (#5742)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-09-26 20:14:04 +00:00
dependabot[bot] b1740f45d6
chore(deps): Bump thiserror from 1.0.35 to 1.0.36 (#5737)
Bumps [thiserror](https://github.com/dtolnay/thiserror) from 1.0.35 to 1.0.36.
- [Release notes](https://github.com/dtolnay/thiserror/releases)
- [Commits](https://github.com/dtolnay/thiserror/compare/1.0.35...1.0.36)

---
updated-dependencies:
- dependency-name: thiserror
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-09-26 14:44:36 +00:00
Carol (Nichols || Goulding) c8108f01e7
chore: Upgrade to Rust 1.64 (#5727)
* chore: Upgrade to Rust 1.64

* fix: Use iter find instead of a for loop, thanks clippy

* fix: Remove some needless borrows, thanks clippy

* fix: Use then_some rather than then with a closure, thanks clippy

* fix: Use iter retain rather than filter collect, thanks clippy

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-09-22 18:04:00 +00:00
Marco Neumann 84e8a4ac41
fix: GC `parquet_file` table in batches (#5691)
* fix: GC `parquet_file` table in batches

Otherwise this transaction will never finish in prod.

* fix: GC shutdown

* refactor: use constant
2022-09-20 11:14:39 +00:00
dependabot[bot] b6fb481b0f
chore(deps): Bump dotenvy from 0.15.3 to 0.15.5 (#5689)
Bumps [dotenvy](https://github.com/allan2/dotenvy) from 0.15.3 to 0.15.5.
- [Release notes](https://github.com/allan2/dotenvy/releases)
- [Changelog](https://github.com/allan2/dotenvy/blob/master/CHANGELOG.md)
- [Commits](https://github.com/allan2/dotenvy/compare/v0.15.3...v0.15.5)

---
updated-dependencies:
- dependency-name: dotenvy
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-09-20 05:28:47 +00:00
Marko Mikulicic 758649296b fix: Close old connection pool after swap 2022-09-20 01:56:42 +02:00
Marko Mikulicic 14a6b437d8
chore: Lower sqlx logging verbosity (#5681)
The default statement logging verbosity of the `sqlx` crate is INFO, which
is frankly surprising.

The reason we didn't bother with lowering this before is that the `sqlx` crate
emits logs using the `log` crate, and we're using the `tracing` crate for logging too.

We did bridge the two logging ecosystems with https://docs.rs/tracing-log/latest/tracing_log/
but until https://github.com/influxdata/influxdb_iox/pull/5680 the bridge wasn't really working
so we didn't notice the *very* verbose logs of sqlx sstatement logging (which log our whole SQL multiline statements as INFO logs...)
2022-09-19 22:56:05 +00:00
Carol (Nichols || Goulding) 414b0f02ca
fix: Use time helper methods in more places 2022-09-19 13:24:08 -04:00
Carol (Nichols || Goulding) c0c0349bc5
fix: Use typed Time values rather than ns 2022-09-19 12:59:20 -04:00
Carol (Nichols || Goulding) 0e23360da1
refactor: Add helper methods for computing times to TimeProvider 2022-09-19 11:34:43 -04:00
Dom Dwyer 66bf0ff272 refactor(db): NULLable persisted_sequence_number
Makes the partition.persisted_sequence_number column in the catalog DB
NULLable. 0 is a valid persisted sequence number.
2022-09-15 18:19:39 +02:00
Dom Dwyer 234d460fcb chore: rename update_persisted_sequence_number fn 2022-09-15 16:10:35 +02:00
Dom Dwyer d199a83355 feat(catalog): per-partition persist mark API
Adds the "persisted_sequence_number" field to the Partition model, and
updates the catalog API to read & update it.
2022-09-15 16:10:35 +02:00
Dom Dwyer c5ac17399a refactor(db): persist marker for partition table
Adds a migration to add a column "persisted_sequence_number" that
defines the inclusive upper-bound on sequencer writes materialised and
uploaded to object store for the partition.
2022-09-15 16:10:35 +02:00
Luke Bond b52865e018
feat: garbage collector now cleans up old parquet files (#5588)
* feat: garbage collector now cleans up old parquet files

* chore: clarifying comment in GC

* chore: typos in GC

* chore: typos in GC

* fix: cmdline arg in GC test needs updating after refactor

* fix: use select! on shutdown rx in GC

* fix: recalc cutoff in GD each loop

* chore: add delete_old that returns IDs only, for GC

* chore: use duration in GC args instead of usize days

* chore: GC lister runs forever w/ sleep; tests updated accordingly

* docs: fix link in GC comments to automatic link

* chore: test for delete_old_ids_only; refactor mem impl thereof

* chore: make GC test less flakey

* chore: make GC test less flakey

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-09-14 14:09:28 +00:00
dependabot[bot] b4a25fdb0e
chore(deps): Bump thiserror from 1.0.34 to 1.0.35 (#5629)
Bumps [thiserror](https://github.com/dtolnay/thiserror) from 1.0.34 to 1.0.35.
- [Release notes](https://github.com/dtolnay/thiserror/releases)
- [Commits](https://github.com/dtolnay/thiserror/compare/1.0.34...1.0.35)

---
updated-dependencies:
- dependency-name: thiserror
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-09-14 12:54:12 +00:00
kodiakhq[bot] 85641efa6f
Merge branch 'main' into cn/infallible-estimated-bytes 2022-09-14 01:00:10 +00:00
Luke Bond ee3f172d45 chore: renamed DB migration for billing trigger 2022-09-13 16:29:14 +01:00
Luke Bond c8b545134e chore: add index to speed up billing_summary upsert 2022-09-13 16:22:44 +01:00
Luke Bond feae712881 fix: parquet_file billing trigger respects to_delete 2022-09-13 16:22:44 +01:00
Luke Bond 80661a5d1c chore: clippy 2022-09-13 16:22:44 +01:00
Luke Bond 10acaf4567 chore: added test for parquet file delete trigger 2022-09-13 16:22:44 +01:00
Luke Bond cc93b2c275 chore: add catalog trigger for billing 2022-09-13 16:22:44 +01:00
Carol (Nichols || Goulding) 224e3cec10
fix: Use ColumnType in errors rather than strings 2022-09-12 17:35:27 -04:00
Carol (Nichols || Goulding) 20e6d26aa9
refactor: Have sqlx decode ColumnTypes in the catalog 2022-09-12 16:50:25 -04:00
Carol (Nichols || Goulding) aba9759268
test: Correct expectations with skipped compactions 2022-09-12 13:13:29 -04:00
Carol (Nichols || Goulding) ef35f2e236
fix: Always parameterize the compaction level to postgres
So that we don't have hardcoded values in SQL that could get out of sync
2022-09-12 13:13:26 -04:00
Carol (Nichols || Goulding) da201ba87f
fix: Select by num of both l0 and l1 files for cold compaction
Now that we're going to compact level 1 files in to level 2 files as
well.
2022-09-12 13:13:26 -04:00
Carol (Nichols || Goulding) 6bba3fafaa
fix: If full compaction group has only 1 file, upgrade level
As opposed to running full compaction.

Makes the catalog function general and take the level as a parameter
rather than only upgrade to level 1.
2022-09-12 13:13:26 -04:00
Carol (Nichols || Goulding) 327446f0cd
fix: Change default cold hours threshold from 24 hours to 8
As requested in https://github.com/influxdata/influxdb_iox/issues/5330#issuecomment-1212468682
2022-09-12 13:13:26 -04:00
Carol (Nichols || Goulding) 20e7c4f4e5
test: Check all returned partitions to make sure the skipped one isn't there 2022-09-09 17:24:09 -04:00
Carol (Nichols || Goulding) c92aebd595
feat: Exclude skipped partitions from compaction candidacy
Connects to #5458.
2022-09-09 15:31:07 -04:00
Carol (Nichols || Goulding) fbe3e360d2
feat: Record skipped compactions in memory
Connects to #5458.
2022-09-09 15:31:07 -04:00
YIXIAO SHI 52ae60bf2e
chore: fix comment typo (#5551)
Co-authored-by: Dom <dom@itsallbroken.com>
2022-09-07 08:49:29 +00:00
Marco Neumann adeacf416c
ci: fix (#5569)
* ci: use same feature set in `build_dev` and `build_release`

* ci: also enable unstable tokio for `build_dev`

* chore: update tokio to 1.21 (to fix console-subscriber 0.1.8

* fix: "must use"
2022-09-06 14:13:28 +00:00
dependabot[bot] 9f0b0328f7
chore(deps): Bump thiserror from 1.0.33 to 1.0.34 (#5556)
Bumps [thiserror](https://github.com/dtolnay/thiserror) from 1.0.33 to 1.0.34.
- [Release notes](https://github.com/dtolnay/thiserror/releases)
- [Commits](https://github.com/dtolnay/thiserror/compare/1.0.33...1.0.34)

---
updated-dependencies:
- dependency-name: thiserror
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-09-06 09:18:41 +00:00
Nga Tran dde65fa7ef
fix: remove timestamp functions from SQLs to be able to use index for improving performance (#5547) 2022-09-02 19:43:52 +00:00
Nga Tran cbfd37540a
feat: add index on parquet_file(shard_id, compaction_level, to_delete, created_at) (#5544) 2022-09-02 14:27:29 +00:00
Nga Tran c8cbc5299b
feat: make compactors to select candidates based on the last n minutes (#5535)
* feat: make compactors to select candidates based on the last n minutes to reduce workload for postgres catalog query

* refactor: remove 1-minute case per review comment
2022-09-01 20:07:26 +00:00
Carol (Nichols || Goulding) 8a0fa616cf
fix: Rename columns, tables, indexes and constraints in postgres catalog 2022-09-01 10:00:54 -04:00
dependabot[bot] 7c61bdcf35
chore(deps): Bump paste from 1.0.8 to 1.0.9 (#5526)
Bumps [paste](https://github.com/dtolnay/paste) from 1.0.8 to 1.0.9.
- [Release notes](https://github.com/dtolnay/paste/releases)
- [Commits](https://github.com/dtolnay/paste/compare/1.0.8...1.0.9)

---
updated-dependencies:
- dependency-name: paste
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-09-01 12:07:53 +00:00
dependabot[bot] 9af93ca9ba
chore(deps): Bump pretty_assertions from 1.2.1 to 1.3.0 (#5517)
Bumps [pretty_assertions](https://github.com/rust-pretty-assertions/rust-pretty-assertions) from 1.2.1 to 1.3.0.
- [Release notes](https://github.com/rust-pretty-assertions/rust-pretty-assertions/releases)
- [Changelog](https://github.com/rust-pretty-assertions/rust-pretty-assertions/blob/main/CHANGELOG.md)
- [Commits](https://github.com/rust-pretty-assertions/rust-pretty-assertions/compare/v1.2.1...v1.3.0)

---
updated-dependencies:
- dependency-name: pretty_assertions
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-09-01 10:20:26 +00:00
dependabot[bot] 00ed79ff1b
chore(deps): Bump thiserror from 1.0.32 to 1.0.33 (#5524)
Bumps [thiserror](https://github.com/dtolnay/thiserror) from 1.0.32 to 1.0.33.
- [Release notes](https://github.com/dtolnay/thiserror/releases)
- [Commits](https://github.com/dtolnay/thiserror/compare/1.0.32...1.0.33)

---
updated-dependencies:
- dependency-name: thiserror
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-09-01 09:11:31 +00:00
Nga Tran cb10a7c6d8
feat: More accurate memory estimate for compaction (#5471)
* feat: initial implementation of memory estimation for a compaction

* feat: estimate size of files and have the right actions for the needed budget

* feat: run candidates in parallel

* fix: have the right name for the column field of the output struct

* feat: add metrics for estimated budgets

* chore: cleanup

* chore: Apply suggestions from code review

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

* fix: fix syntax after applying review's suggestions

* refactor: Convert a Vec to VecDeque to go well with pop and push

* chore: remove max_concurrent_size_bytes and input_size_threshold_bytes

* chore: remove input_file_count_threshold

* test: tests for estimate_arrow_bytes_for_file

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-30 13:44:44 +00:00
Carol (Nichols || Goulding) dbd27f648f
refactor: Rename more mentions of Kafka to their other name where appropriate 2022-08-29 14:27:02 -04:00
Carol (Nichols || Goulding) 1b49ad25f7
refactor: Rename KafkaTopicId to TopicId 2022-08-29 14:27:02 -04:00
Carol (Nichols || Goulding) 58f0b63cdc
refactor: Rename KafkaTopic to Topic or TopicMetadata or topic name as appropriate 2022-08-29 14:27:02 -04:00
Carol (Nichols || Goulding) 74c9529062
fix: Rename KafkaPartition to ShardIndex 2022-08-29 14:07:18 -04:00