* feat: teach compactor to use sort_key_ids instead of sort_key
* test: update the test output after chatting with Joe and know the reason of the chnanges
* feat: fill catalog sort_key_ids for partition with coming data
* test: sort_key_ids has empty array for newly create partition
* test: name of non-existing column
* chore: add comments to ask Andrew about the code
* chore: make comments clearer
* chore: fix a comment to avoid failure in doc
* chore: add comment for the panic if column name of sort key not found
* fix: during import files the partition has to be created with empty sort key first. Then after its files are created, the partition will be uodated with sort key
* chore: remove no longer needed comments after the bug in build_catalog test is fixed
* chore: address review comments
* refactor: Use ColumnSet type
* chore: Apply suggestions from code review
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
* chore: fix a clippy
---------
Co-authored-by: Carol (Nichols || Goulding) <carol.nichols@gmail.com>
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
* feat: Make parquet_file.partition_id optional in the catalog
This will acquire a short lock on the table in postgres, per:
<https://stackoverflow.com/questions/52760971/will-making-column-nullable-lock-the-table-for-reads>
This allows us to persist data for new partitions and associate the
Parquet file catalog records with the partition records using only the
partition hash ID, rather than both that are used now.
* fix: Support transition partition ID in the catalog service
* fix: Use transition partition ID in import/export
This commit also removes support for the `--partition-id` flag of the
`influxdb_iox remote store get-table` command, which Andrew approved.
The `--partition-id` filter was getting the results of the catalog gRPC
service's query for Parquet files of a table and then keeping only the
files whose partition IDs matched. The gRPC query is no longer returning
the partition ID from the Parquet file table, and really, this command
should instead be using `GetParquetFilesByPartitionId` to only request
what's needed rather than filtering.
* feat: Support looking up Parquet files by either kind of Partition id
Regardless of which is actually stored on the Parquet file record.
That is, say there's a Partition in the catalog with:
Partition {
id: 3,
hash_id: abcdefg,
}
and a Parquet file that has:
ParquetFile {
partition_hash_id: abcdefg,
}
calling `list_by_partition_not_to_delete(PartitionId(3))` should still
return this Parquet file because it is associated with the partition
that has ID 3.
This is important for the compactor, which is currently only dealing in
PartitionIds, and I'd like to keep it that way for now to avoid having
to change Even More in this PR.
* fix: Use and set new partition ID fields everywhere they want to be
---------
Co-authored-by: Dom <dom@itsallbroken.com>
* feat: metrics for main tokio runtime
* feat: instrument executor tokio runtime
---------
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
This will hold the deterministic ID for partitions.
Until all existing partitions have this value, this is optional/nullable.
The row ID still exists and is used as the main foreign key in the
parquet_file and skipped_compaction tables.
The hash_id has a unique index so that we can look up records based on
it (if it's available).
If the parquet file record has a partition_hash_id value, use that to
generate the object storage path instead of the partition_id.
* feat: Allow passing service protection limits in create db gRPC call
* fix: Move the impl into the catalog namespace trait
---------
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
Nothing gets the partition ID out of the metadata. The parts of the code
interacting with object storage that need the ID to create the object
store path were using the partition ID from the metadata out of
convenience, but I changed those places to pass in the partition ID in a
separate argument instead.
This will make the transition to deterministic partition IDs a bit
smoother.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* refactor: remove `ParquetFileRepo::flag_for_delete`
* refactor: batch update parquet files in catalog
* refactor: avoid data roundtrips through postgres
* refactor: do not return ID from PG when we do not need it
---------
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
This commit fixes loads of crates (47!) had unused dependencies, or
mis-configured dependencies (test deps as normal deps).
I added the "unused_crate_dependencies" to all crates to help prevent
this mess from growing again!
https://doc.rust-lang.org/beta/nightly-rustc/rustc_lint_defs/builtin/static.UNUSED_CRATE_DEPENDENCIES.html
This has the minor downside of false-positives when specifying
dev-dependencies for test/bench binaries - these are files in /test or
/benches (not normal tests). This commit includes a workaround,
importing them in lib.rs (gated by a feature flag). I think the
trade-off of better dependency management is worth it!
Still insert them into the database and associate them with namespaces,
but don't ever query them back out.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* test: set max_l0_created_at to reasonable values for the tests and also verify it using both test layout and catalog function
* fix: typo
---------
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* refactor: n_threads and n_target_partitions are non-zero
Zero values will just panic. Prevent that earlier.
* fix: typo
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
---------
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
* feat: initial implementation of the split
* feat: split many L0 files in groups and compact them into new and fewer L0 files
* test: remove iappropriate AllAtOnce test
* refactor: move file classification for initial target to its own function
* fix: pop the branch from start to end
* chore: address review comments
* feat: support splitting to many L1 files
* feat: only add extra round to compact level-n files to same level-n files if their files plus overlapped level-n-plus-1 over limit
* chore: Apply suggestions from code review
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
* chore: final cleanup and address comments
* chore: run fmt
---------
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
This commit adds initial support for "soft" namespace deletion, where
the actual records & data remain, but are no longer queryable /
writeable.
Soft deletion is eventually consistent - users can expect to continue
writing to and reading from a bucket after issuing a soft delete call,
until the various components either restart, or have their caches
flushed.
The components treat soft-deleted namespaces differently:
* router: ignore soft deleted namespaces
* ingester: accept soft deleted namespaces
* compactor: accept soft deleted namespaces
* querier: ignore soft deleted namespaces
* various gRPC services: ignore soft deleted namespaces
This ensures that the ingester & compactor do not see rows "vanishing"
from the database, and continue to make forward progress.
Writes for the deleted namespace that are buffered in the ingester will
be persisted as normal, allowing us to support "un-delete" operations
where the system is restored to a the state at which the delete was
issued (rather than loosing the buffered data).
Follow-on work is required to ensure GC drops the orphaned parquet files
after the configured GC time, and optimisations such as not compacting
parquet from soft-deleted namespaces seems like a trivial win.
Use the namespace schema cache in the router to enforce the
per-namespace table limit (service protection limit), adding O(1)
overhead to the existing column limit evaluation logic.
Prior to this commit, each request that would breach the table limit
would be (potentially partially) applied to the catalog and return an
error. Every subsequent request creating a new table continued to cause
a catalog query, unnecessarily adding load proportional to request
counts.
After this commit, catalog requests are sent when the router instance
can determine (to the best of it's ability, see below) that the request
will not cause the namespace to exceed the table limit.
Because this uses cached schemas, the actual state set of tables may
have changed - this will cause inconsistent enforcement and spurious
errors in the same way it currently does for the column limit. For more
details (and to track a resolution) see:
https://github.com/influxdata/influxdb_iox/issues/5957