Commit Graph

8 Commits (dac0db21960c871c298924269d198a8b01849724)

Author SHA1 Message Date
Nga Tran dac0db2196
feat: add sort_key_ids into sqlite catalog (#8384) 2023-08-01 20:15:27 +00:00
Carol (Nichols || Goulding) 4a9e76b8b7
feat: Make parquet_file.partition_id optional in the catalog (#8339)
* feat: Make parquet_file.partition_id optional in the catalog

This will acquire a short lock on the table in postgres, per:
<https://stackoverflow.com/questions/52760971/will-making-column-nullable-lock-the-table-for-reads>

This allows us to persist data for new partitions and associate the
Parquet file catalog records with the partition records using only the
partition hash ID, rather than both that are used now.

* fix: Support transition partition ID in the catalog service

* fix: Use transition partition ID in import/export

This commit also removes support for the `--partition-id` flag of the
`influxdb_iox remote store get-table` command, which Andrew approved.

The `--partition-id` filter was getting the results of the catalog gRPC
service's query for Parquet files of a table and then keeping only the
files whose partition IDs matched. The gRPC query is no longer returning
the partition ID from the Parquet file table, and really, this command
should instead be using `GetParquetFilesByPartitionId` to only request
what's needed rather than filtering.

* feat: Support looking up Parquet files by either kind of Partition id

Regardless of which is actually stored on the Parquet file record.

That is, say there's a Partition in the catalog with:

Partition {
    id: 3,
    hash_id: abcdefg,
}

and a Parquet file that has:

ParquetFile {
    partition_hash_id: abcdefg,
}

calling `list_by_partition_not_to_delete(PartitionId(3))` should still
return this Parquet file because it is associated with the partition
that has ID 3.

This is important for the compactor, which is currently only dealing in
PartitionIds, and I'd like to keep it that way for now to avoid having
to change Even More in this PR.

* fix: Use and set new partition ID fields everywhere they want to be

---------

Co-authored-by: Dom <dom@itsallbroken.com>
2023-07-31 12:40:56 +00:00
Joe-Blount 629f9d20db fix: update new_file_at following all compactions 2023-07-20 13:27:54 -05:00
Carol (Nichols || Goulding) f20e9e6368
fix: Add index on parquet_file.partition_hash_id for lookup perf 2023-07-10 13:40:03 -04:00
Carol (Nichols || Goulding) 62ba18171a
feat: Add a new hash column on the partition and parquet file tables
This will hold the deterministic ID for partitions.

Until all existing partitions have this value, this is optional/nullable.

The row ID still exists and is used as the main foreign key in the
parquet_file and skipped_compaction tables.

The hash_id has a unique index so that we can look up records based on
it (if it's available).

If the parquet file record has a partition_hash_id value, use that to
generate the object storage path instead of the partition_id.
2023-06-22 09:01:22 -04:00
Carol (Nichols || Goulding) 47157015d9
feat: Add columns to store the partition templates 2023-05-24 10:10:34 -04:00
Dom Dwyer 61409f062c
refactor(catalog): soft delete namespace column
Adds a "deleted_at" column that will indicate the timestamp at which is
was marked as logically deleted.
2023-02-09 11:35:27 +01:00
Stuart Carnie eb245d6774
feat: Initial SQLite catalog schema (#6851)
* feat: Initial SQLite catalog schema

* chore: Run cargo hakari tasks

* feat: impls, many TODOs

* feat: completed `todo!()`'s

* chore: add remaining tests from postgres module

* feat: add SQLite to get_catalog API

* chore: Add docs

* chore: Placate clippy

* chore: Placate clippy

* chore: PR feedback from @domodwyer

---------

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2023-02-06 22:55:14 +00:00