* perf: optimize not to update partitions with newly created level 2 files
* chore: cleanup
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* feat: cold
* chore: debug info
* feat: only compact qualified cold partition candidates
* fix: catalog test
* chore: cleanup
* chore: add new config flag for cold partition candidates
* chore: implement display for CompactionType and add tests for max num partitions
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* feat: function to get parttion candidates from partition table
* chore: cleanup
* fix: make new_file_at the same value as created_at
* chore: cleanup
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
Updating the sort key is not commutative and MUST be serialised. The
correctness of the current catalog interface relies on the caller
serialising updates globally, something it cannot reasonably assert in a
distributed system.
This change of the catalog interface pushes this responsibility to the
catalog itself where it can be effectively enforced, and allows a caller
to detect parallel updates to the sort key.
To avoid other tests' state bleeding into this one and this one's state
bleeding into other tests, now that it's testing some queries without
scoping by shard.
* feat: compactor ignores max file count for first file
chore: typo in comment in compactor
* feat: restore special first file in partition compaction logic; add limit
* fix: calculation in compaction max file count
chore: clippy
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* feat: create namespace API call in router
Co-authored-by: Nga Tran <nga-tran@live.com>
* chore: treat retention as ns except in CLI
* fix: overflow in nanosecond calc
* fix: retention test after changing it from hours to ns
* chore: comment clarification in cli; better response type for error in ns API
* fix: correct some rebase mistakes
* chore: merge namespace create & create_with_retention; renamed ns create test helper fn & const
* fix: ns autocreation test was wrong after rebase
* fix: mem catalog has default 1hr retention, accidently removed in rebase
* chore: remove mem catalogs default 1hr retention; make it settable in sets & router
Co-authored-by: Luke Bond <luke.n.bond@gmail.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* feat: reject writes that are outside the retention period
* feat: add retention validator into handler stack
* chore: Apply suggestions from code review
Co-authored-by: Dom <dom@itsallbroken.com>
* refactor: address review comments
* test: unit tests fot retention validation
* chore: address review comments
* test: more unit tests and integration tests
* refactor: make time inside retention period for emphemeral_mode test
* fix: 2 hours
Co-authored-by: Dom <dom@itsallbroken.com>
* feat: flag partition for delete
* fix: compare the right date and time
* chore: Run cargo hakari tasks
* chore: cleanup
* fix: typos
* chore: rust style tidy ups in catalog
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: Luke Bond <luke.n.bond@gmail.com>
* feat: deletion flagging in GC based on retention policy
* chore: typo in comment
* fix: only soft delete parquet files that aren't yet soft deleted
* fix: guard against flakiness in catalog test
* chore: some better tests for parquet file delete flagging
Co-authored-by: Nga Tran <nga-tran@live.com>
* refactor: make namespace folder for all namesapce's commands
* feat: WIP for add command to set retention period
* feat: more on updating retention period
* feat: grpc for update namespace retention period
* test: end to end test fpr namespace retention
* fix: lint proto
* chore: cleanup
* chore: kick CI run again
* fix: command hierachy
* chore: fix comments
The checks for whether a column already exists with a different type
were relying on ordering of the input matching the ordering of the
columns returned from inserting the columns in Postgres.
Rather than trying to match the new ordering that is required to avoid
Postgres deadlocks, switch from a Vec to a HashMap and look up the
column type from the name.
This also reduces some allocations that weren't really needed.