Commit Graph

66 Commits (dac0db21960c871c298924269d198a8b01849724)

Author SHA1 Message Date
Nga Tran 73f38077b6
feat: add sort_key_ids as array of bigints into catalog partition (#8375)
* feat: add sort_key_ids as array of bigints into catalog partition

* chore: add comments

* chore: remove comments to avoid changing them in the future due to checksum requirement
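A migration of this shape would add the column (a sketch only; the real migration file in the repo may differ, and leaving the column nullable with no default is an assumption):

```sql
-- Sketch: add sort_key_ids as a nullable array of column IDs on the
-- partition table. Nullable, so existing rows need no immediate backfill.
ALTER TABLE partition
    ADD COLUMN sort_key_ids BIGINT[];
```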

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-08-01 14:28:30 +00:00
Carol (Nichols || Goulding) 4a9e76b8b7
feat: Make parquet_file.partition_id optional in the catalog (#8339)
* feat: Make parquet_file.partition_id optional in the catalog

This will acquire a short lock on the table in postgres, per:
<https://stackoverflow.com/questions/52760971/will-making-column-nullable-lock-the-table-for-reads>

This allows us to persist data for new partitions and associate the
Parquet file catalog records with the partition records using only the
partition hash ID, rather than both IDs as is done now.
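The change itself is small; a sketch of the migration (dropping the constraint takes only a brief lock and does not rewrite the table, per the linked answer):

```sql
-- Sketch: make parquet_file.partition_id optional so new files can be
-- associated via partition_hash_id alone.
ALTER TABLE parquet_file
    ALTER COLUMN partition_id DROP NOT NULL;
```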

* fix: Support transition partition ID in the catalog service

* fix: Use transition partition ID in import/export

This commit also removes support for the `--partition-id` flag of the
`influxdb_iox remote store get-table` command, which Andrew approved.

The `--partition-id` filter was getting the results of the catalog gRPC
service's query for Parquet files of a table and then keeping only the
files whose partition IDs matched. The gRPC query is no longer returning
the partition ID from the Parquet file table, and really, this command
should instead be using `GetParquetFilesByPartitionId` to only request
what's needed rather than filtering.

* feat: Support looking up Parquet files by either kind of Partition id

Regardless of which is actually stored on the Parquet file record.

That is, say there's a Partition in the catalog with:

Partition {
    id: 3,
    hash_id: abcdefg,
}

and a Parquet file that has:

ParquetFile {
    partition_hash_id: abcdefg,
}

calling `list_by_partition_not_to_delete(PartitionId(3))` should still
return this Parquet file because it is associated with the partition
that has ID 3.
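In SQL terms, the lookup has to join through the partition table so either key matches; a hedged sketch (the real query lives in the Rust catalog code, and the exact shape here is an assumption):

```sql
-- Sketch: given a PartitionId (here 3), return its Parquet files
-- regardless of which identifier the file record actually stores.
SELECT pf.*
FROM parquet_file pf
JOIN partition p
  ON pf.partition_id = p.id
  OR pf.partition_hash_id = p.hash_id
WHERE p.id = 3
  AND pf.to_delete IS NULL;
```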

This is important for the compactor, which is currently only dealing in
PartitionIds, and I'd like to keep it that way for now to avoid having
to change Even More in this PR.

* fix: Use and set new partition ID fields everywhere they want to be

---------

Co-authored-by: Dom <dom@itsallbroken.com>
2023-07-31 12:40:56 +00:00
Joe-Blount 629f9d20db fix: update new_file_at following all compactions 2023-07-20 13:27:54 -05:00
Fraser Savage e894ea73f7
refactor(catalog): Allow kafka columns to be nullable 2023-07-20 11:18:02 +01:00
Carol (Nichols || Goulding) f20e9e6368
fix: Add index on parquet_file.partition_hash_id for lookup perf 2023-07-10 13:40:03 -04:00
Joe-Blount c2442c31f3 chore: create partition table index for created_at 2023-07-07 16:27:05 -05:00
Carol (Nichols || Goulding) 62ba18171a
feat: Add a new hash column on the partition and parquet file tables
This will hold the deterministic ID for partitions.

Until all existing partitions have this value, this is optional/nullable.

The row ID still exists and is used as the main foreign key in the
parquet_file and skipped_compaction tables.

The hash_id has a unique index so that we can look up records based on
it (if it's available).

If the parquet file record has a partition_hash_id value, use that to
generate the object storage path instead of the partition_id.
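A sketch of the partition-side migration described above (the `BYTEA` type and the index name are assumptions; the unique index is what enables lookups by hash):

```sql
-- Sketch: nullable hash column plus a unique index for lookups by hash.
ALTER TABLE partition ADD COLUMN hash_id BYTEA;

CREATE UNIQUE INDEX IF NOT EXISTS partition_hash_id_unique
    ON partition (hash_id)
    WHERE hash_id IS NOT NULL;
```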
2023-06-22 09:01:22 -04:00
Marco Neumann 551e838db3
refactor: remove unused PG indices (#7905)
Similar to #7859. To test index usage, execute the following query on
the writer replica:

```sql
SELECT
    n.nspname                                      AS namespace_name,
    t.relname                                      AS table_name,
    pg_size_pretty(pg_relation_size(t.oid))        AS table_size,
    t.reltuples::bigint                            AS num_rows,
    psai.indexrelname                              AS index_name,
    pg_size_pretty(pg_relation_size(i.indexrelid)) AS index_size,
    CASE WHEN i.indisunique THEN 'Y' ELSE 'N' END  AS "unique",
    psai.idx_scan                                  AS number_of_scans,
    psai.idx_tup_read                              AS tuples_read,
    psai.idx_tup_fetch                             AS tuples_fetched
FROM
    pg_index i
    INNER JOIN pg_class t               ON t.oid = i.indrelid
    INNER JOIN pg_namespace n           ON n.oid = t.relnamespace
    INNER JOIN pg_stat_all_indexes psai ON i.indexrelid = psai.indexrelid
WHERE
    n.nspname = 'iox_catalog' AND t.relname = 'parquet_file'
ORDER BY 1, 2, 5;
```

Data for eu-west-1 at `2023-05-31T16:30:00Z`:

```text
namespace_name |  table_name  | table_size | num_rows  |            index_name             | index_size | unique | number_of_scans |  tuples_read   | tuples_fetched
----------------+--------------+------------+-----------+-----------------------------------+------------+--------+-----------------+----------------+----------------
 iox_catalog    | parquet_file | 38 GB      | 146489216 | parquet_file_deleted_at_idx       | 6442 MB    | N      |      1693534991 | 21602734184385 |    21694365037
 iox_catalog    | parquet_file | 38 GB      | 146489216 | parquet_file_partition_delete_idx | 20 MB      | N      |        17854904 |     3087700816 |      384603858
 iox_catalog    | parquet_file | 38 GB      | 146489216 | parquet_file_partition_idx        | 2325 MB    | N      |      1627977474 | 12604272924323 | 11088781876397
 iox_catalog    | parquet_file | 38 GB      | 146489216 | parquet_file_pkey                 | 8290 MB    | Y      |       480767174 |      481021514 |      480733966
 iox_catalog    | parquet_file | 38 GB      | 146489216 | parquet_file_table_delete_idx     | 174 MB     | N      |         1006563 |    24687617719 |      385132581
 iox_catalog    | parquet_file | 38 GB      | 146489216 | parquet_file_table_idx            | 1905 MB    | N      |         9288042 |   351240529272 |          27551
 iox_catalog    | parquet_file | 38 GB      | 146489216 | parquet_location_unique           | 6076 MB    | Y      |       385294957 |         109448 |         109445
```

and at `2023-06-01T13:00:00Z`:

```text
 namespace_name |  table_name  | table_size | num_rows  |            index_name             | index_size | unique | number_of_scans |  tuples_read   | tuples_fetched
----------------+--------------+------------+-----------+-----------------------------------+------------+--------+-----------------+----------------+----------------
 iox_catalog    | parquet_file | 43 GB      | 152684560 | parquet_file_deleted_at_idx       | 6976 MB    | N      |      1693535032 | 21602834620294 |    21736731439
 iox_catalog    | parquet_file | 43 GB      | 152684560 | parquet_file_partition_delete_idx | 21 MB      | N      |        31468423 |     7397141567 |      677909956
 iox_catalog    | parquet_file | 43 GB      | 152684560 | parquet_file_partition_idx        | 2464 MB    | N      |      1627977474 | 12604272924323 | 11088781876397
 iox_catalog    | parquet_file | 43 GB      | 152684560 | parquet_file_pkey                 | 8785 MB    | Y      |       492762975 |      493017342 |      492729691
 iox_catalog    | parquet_file | 43 GB      | 152684560 | parquet_file_table_delete_idx     | 241 MB     | N      |         1136317 |    24735561304 |      429892231
 iox_catalog    | parquet_file | 43 GB      | 152684560 | parquet_file_table_idx            | 2058 MB    | N      |         9288042 |   351240529272 |          27551
 iox_catalog    | parquet_file | 43 GB      | 152684560 | parquet_location_unique           | 6776 MB    | Y      |       399142416 |         124810 |         124807
```

Due to #7842 and #7894, the following indices are no longer used:

- `parquet_file_partition_idx`
- `parquet_file_table_idx`
2023-06-01 13:45:05 +00:00
Marco Neumann e14305ac33
feat: add index for compactor (#7894)
* fix: migration name

* feat: add index for compactor
2023-05-31 12:29:00 +00:00
Marco Neumann e1c1908a0b
refactor: add `parquet_file` PG index for querier (#7842)
* refactor: add `parquet_file` PG index for querier

Currently the `list_by_table_not_to_delete` catalog query is somewhat
expensive:

```text
iox_catalog_prod=> select table_id, sum((to_delete is NULL)::int) as n from parquet_file group by table_id order by n desc limit 5;
 table_id |  n
----------+------
  1489038 | 7221
  1489037 | 7019
  1491534 | 5793
  1491951 | 5522
  1513377 | 5339
(5 rows)

iox_catalog_prod=> EXPLAIN ANALYZE SELECT id, namespace_id, table_id, partition_id, object_store_id,
       min_time, max_time, to_delete, file_size_bytes,
       row_count, compaction_level, created_at, column_set, max_l0_created_at
FROM parquet_file
WHERE table_id = 1489038 AND to_delete IS NULL;
                                                                          QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on parquet_file  (cost=46050.91..47179.26 rows=283 width=200) (actual time=464.368..472.514 rows=7221 loops=1)
   Recheck Cond: ((table_id = 1489038) AND (to_delete IS NULL))
   Heap Blocks: exact=7152
   ->  BitmapAnd  (cost=46050.91..46050.91 rows=283 width=0) (actual time=463.341..463.343 rows=0 loops=1)
         ->  Bitmap Index Scan on parquet_file_table_idx  (cost=0.00..321.65 rows=22545 width=0) (actual time=1.674..1.674 rows=7221 loops=1)
               Index Cond: (table_id = 1489038)
         ->  Bitmap Index Scan on parquet_file_deleted_at_idx  (cost=0.00..45728.86 rows=1525373 width=0) (actual time=460.717..460.717 rows=4772117 loops=1)
               Index Cond: (to_delete IS NULL)
 Planning Time: 0.092 ms
 Execution Time: 472.907 ms
(10 rows)
```

I think this may also be because PostgreSQL kinda chooses the wrong
strategy, because it could just look at the existing index and filter
from there:

```text
iox_catalog_prod=> EXPLAIN ANALYZE SELECT id, namespace_id, table_id, partition_id, object_store_id,
       min_time, max_time, to_delete, file_size_bytes,
       row_count, compaction_level, created_at, column_set, max_l0_created_at
FROM parquet_file
WHERE table_id = 1489038;
                                                                    QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using parquet_file_table_idx on parquet_file  (cost=0.57..86237.78 rows=22545 width=200) (actual time=0.057..6.994 rows=7221 loops=1)
   Index Cond: (table_id = 1489038)
 Planning Time: 0.094 ms
 Execution Time: 7.297 ms
(4 rows)
```

However, PostgreSQL doesn't know the cardinalities well enough, so
let's add a dedicated index to make the querier faster.
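A sketch of such an index; a partial index on `table_id` restricted to live files serves `list_by_table_not_to_delete` directly, so the planner no longer needs the expensive `BitmapAnd` of two large indexes. The index name is inferred, not confirmed by this message:

```sql
-- Sketch: partial index covering exactly the WHERE clause of the
-- querier's catalog query. CONCURRENTLY avoids blocking writes, but
-- cannot run inside a transaction block.
CREATE INDEX CONCURRENTLY IF NOT EXISTS parquet_file_table_delete_idx
    ON parquet_file (table_id)
    WHERE to_delete IS NULL;
```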

* feat: new migration system

* docs: explain dirty migrations
2023-05-31 10:56:32 +00:00
Carol (Nichols || Goulding) 47157015d9
feat: Add columns to store the partition templates 2023-05-24 10:10:34 -04:00
Marco Neumann b71564f455 refactor: remove unused `parquet_file` indices
Remove unused Postgres indices. This lowers database load and also gives
us room to install actually useful indices (see #7842).

To detect which indices are used, I've used the following query (on the
actual write/master replica in eu-central-1):

```sql
SELECT
    n.nspname                                      AS namespace_name,
    t.relname                                      AS table_name,
    pg_size_pretty(pg_relation_size(t.oid))        AS table_size,
    t.reltuples::bigint                            AS num_rows,
    psai.indexrelname                              AS index_name,
    pg_size_pretty(pg_relation_size(i.indexrelid)) AS index_size,
    CASE WHEN i.indisunique THEN 'Y' ELSE 'N' END  AS "unique",
    psai.idx_scan                                  AS number_of_scans,
    psai.idx_tup_read                              AS tuples_read,
    psai.idx_tup_fetch                             AS tuples_fetched
FROM
    pg_index i
    INNER JOIN pg_class t               ON t.oid = i.indrelid
    INNER JOIN pg_namespace n           ON n.oid = t.relnamespace
    INNER JOIN pg_stat_all_indexes psai ON i.indexrelid = psai.indexrelid
WHERE
    n.nspname = 'iox_catalog' AND t.relname = 'parquet_file'
ORDER BY 1, 2, 5;
```

At `2023-05-23T16:00:00Z`:

```text
 namespace_name |  table_name  | table_size | num_rows  |                    index_name                    | index_size | unique | number_of_scans |  tuples_read   | tuples_fetched
----------------+--------------+------------+-----------+--------------------------------------------------+------------+--------+-----------------+----------------+----------------
 iox_catalog    | parquet_file | 31 GB      | 120985000 | parquet_file_deleted_at_idx                      | 5398 MB    | N      |      1693383413 | 21036174283392 |    21336337964
 iox_catalog    | parquet_file | 31 GB      | 120985000 | parquet_file_partition_created_idx               | 11 GB      | N      |        34190874 |     4749070532 |       61934212
 iox_catalog    | parquet_file | 31 GB      | 120985000 | parquet_file_partition_idx                       | 2032 MB    | N      |      1612961601 |  9935669905489 |  8611676799872
 iox_catalog    | parquet_file | 31 GB      | 120985000 | parquet_file_pkey                                | 7135 MB    | Y      |       453927041 |      454181262 |      453894565
 iox_catalog    | parquet_file | 31 GB      | 120985000 | parquet_file_shard_compaction_delete_created_idx | 14 GB      | N      |               0 |              0 |              0
 iox_catalog    | parquet_file | 31 GB      | 120985000 | parquet_file_shard_compaction_delete_idx         | 8767 MB    | N      |               2 |          30717 |           4860
 iox_catalog    | parquet_file | 31 GB      | 120985000 | parquet_file_table_idx                           | 1602 MB    | N      |         9136844 |   341839537275 |          27551
 iox_catalog    | parquet_file | 31 GB      | 120985000 | parquet_location_unique                          | 4989 MB    | Y      |       332341872 |           3123 |           3123
```

At `2023-05-24T09:50:00Z` (i.e. nearly 18h later):

```text
 namespace_name |  table_name  | table_size | num_rows  |                    index_name                    | index_size | unique | number_of_scans |  tuples_read   | tuples_fetched
----------------+--------------+------------+-----------+--------------------------------------------------+------------+--------+-----------------+----------------+----------------
 iox_catalog    | parquet_file | 31 GB      | 123869328 | parquet_file_deleted_at_idx                      | 5448 MB    | N      |      1693485804 | 21409285169862 |    21364369704
 iox_catalog    | parquet_file | 31 GB      | 123869328 | parquet_file_partition_created_idx               | 11 GB      | N      |        34190874 |     4749070532 |       61934212
 iox_catalog    | parquet_file | 31 GB      | 123869328 | parquet_file_partition_idx                       | 2044 MB    | N      |      1615214409 | 10159380553599 |  8811036969123
 iox_catalog    | parquet_file | 31 GB      | 123869328 | parquet_file_pkey                                | 7189 MB    | Y      |       455128165 |      455382386 |      455095624
 iox_catalog    | parquet_file | 31 GB      | 123869328 | parquet_file_shard_compaction_delete_created_idx | 14 GB      | N      |               0 |              0 |              0
 iox_catalog    | parquet_file | 31 GB      | 123869328 | parquet_file_shard_compaction_delete_idx         | 8849 MB    | N      |               2 |          30717 |           4860
 iox_catalog    | parquet_file | 31 GB      | 123869328 | parquet_file_table_idx                           | 1618 MB    | N      |         9239071 |   348304417343 |          27551
 iox_catalog    | parquet_file | 31 GB      | 123869328 | parquet_location_unique                          | 5043 MB    | Y      |       343484617 |           3123 |           3123
```

The cluster currently is under load and all components are running.
Conclusion:

- `parquet_file_deleted_at_idx`: Used, likely by the GC. We could
  probably shrink this index by binning `deleted_at` (within the index,
  not within the actual database table), but let's do this in a later PR.
- `parquet_file_partition_created_idx`: Unused and huge (`created_at` is
  NOT binned). So let's remove it.
- `parquet_file_partition_idx`: Used, likely by the compactor and
  querier because we currently don't have a better index (see #7842 as
  well). This includes deleted files as well, which is somewhat
  pointless. May become obsolete after #7842; not touching for now.
- `parquet_file_pkey`: Primary key. We should probably use the object
  store UUID as a primary key BTW, which would also make the GC faster.
  Not touching for now.
- `parquet_file_shard_compaction_delete_created_idx`: Huge unused index.
  Shards don't exist anymore. Delete it.
- `parquet_file_shard_compaction_delete_idx`: Same as
  `parquet_file_shard_compaction_delete_created_idx`.
- `parquet_file_table_idx`: Used but is somewhat too large because it
  contains deleted files. Might become obsolete after #7842, don't touch
  for now.
- `parquet_location_unique`: See note `parquet_file_pkey`, it's
  pointless to have two IDs here. Not touching for now but this is a
  potential future improvement.

So we remove:

- `parquet_file_partition_created_idx`
- `parquet_file_shard_compaction_delete_created_idx`
- `parquet_file_shard_compaction_delete_idx`
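The removal itself is a sketch like the following (using `CONCURRENTLY` to keep the large `parquet_file` table available is an assumption; it cannot run inside a transaction block):

```sql
-- Sketch: drop the three unused indices identified above.
DROP INDEX CONCURRENTLY IF EXISTS parquet_file_partition_created_idx;
DROP INDEX CONCURRENTLY IF EXISTS parquet_file_shard_compaction_delete_created_idx;
DROP INDEX CONCURRENTLY IF EXISTS parquet_file_shard_compaction_delete_idx;
```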
2023-05-24 12:10:22 +02:00
Marco Neumann d34d23c354
refactor: remove `processed_tombstone` table (#7840)
- the table is unused
- there are no foreign keys or triggers based on this table
- the design is generally not scalable (N*M entries); tombstones should
  rather have a timestamp so we can check whether a parquet file includes
  that information or not (or some other form of serialization mechanism)
- it's currently empty in prod (and never was filled w/ data in any
  cluster)
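Since there are no foreign keys or triggers depending on the table, the migration is presumably as simple as:

```sql
-- Sketch: the table is unused and empty in prod, so a plain drop
-- suffices; nothing references it.
DROP TABLE IF EXISTS processed_tombstone;
```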
2023-05-22 15:56:23 +00:00
Dom Dwyer 61409f062c
refactor(catalog): soft delete namespace column
Adds a "deleted_at" column that will indicate the timestamp at which the
namespace was marked as logically deleted.
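A sketch of the migration; the `BIGINT` type is an assumption (the IOx catalog stores other timestamps as nanosecond integers), not confirmed by this message:

```sql
-- Sketch: nullable soft-delete marker; NULL means "not deleted".
ALTER TABLE namespace ADD COLUMN deleted_at BIGINT;
```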
2023-02-09 11:35:27 +01:00
Nga Tran b8a80869d4
feat: introduce a new way of max_sequence_number for ingester, compactor and querier (#6692)
* feat: introduce a new way of max_sequence_number for ingester, compactor and querier

* chore: cleanup

* feat: new column max_l0_created_at to order files for deduplication

* chore: cleanup

* chore: debug info for changing cpu.parquet

* fix: update test parquet file

Co-authored-by: Marco Neumann <marco@crepererum.net>
2023-01-26 10:52:47 +00:00
Nga Tran 550cea8bc5
perf: optimize not to update partitions with newly created level 2 files (#6590)
* perf: optimize not to update partitions with newly created level 2 files

* chore: cleanup

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-01-13 14:46:58 +00:00
Nga Tran b20226797a
fix: make trigger modification in different file (#6526) 2023-01-06 20:34:48 +00:00
Nga Tran b856edf826
feat: function to get partition candidates from partition table (#6519)
* feat: function to get partition candidates from partition table

* chore: cleanup

* fix: make new_file_at the same value as created_at

* chore: cleanup

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-01-06 16:20:45 +00:00
Nga Tran 23807df7a9
feat: trigger that updates partition table when a parquet file is created (#6514)
* feat: trigger that updates partition table when a parquet file is created

* chore: simplify epoch of now
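Such a trigger could look roughly like this (function and trigger names are hypothetical; the `new_file_at` column matches the related commits above, which set it to the file's `created_at`):

```sql
-- Sketch: stamp the owning partition whenever a parquet file is created.
CREATE OR REPLACE FUNCTION update_partition_on_new_parquet_file()
RETURNS TRIGGER AS $$
BEGIN
    UPDATE partition
       SET new_file_at = NEW.created_at
     WHERE id = NEW.partition_id;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER update_partition_on_parquet_file_insert
    AFTER INSERT ON parquet_file
    FOR EACH ROW
    EXECUTE PROCEDURE update_partition_on_new_parquet_file();
```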
2023-01-05 19:57:23 +00:00
Nga Tran 1088baea3d
chore: index for selecting partitions with parquet files created after a given time (#6496)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-01-04 18:07:07 +00:00
Luke Bond 6263ca234a chore: delete ns postgres impl, test improvements, fix to mem impl 2022-12-16 10:23:50 +00:00
Luke Bond 7c813c170a
feat: reintroduce compactor first file in partition exception (#6176)
* feat: compactor ignores max file count for first file

chore: typo in comment in compactor

* feat: restore special first file in partition compaction logic; add limit

* fix: calculation in compaction max file count

chore: clippy

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-18 15:58:59 +00:00
Nga Tran a3f2fe489c
refactor: remove retention_duration field from namespace catalog table (#6124) 2022-11-11 20:30:42 +00:00
NGA-TRAN 498851eaf5 feat: add catalog columns needed for retention policy 2022-11-01 15:35:15 -04:00
Dom Dwyer 46bbee5423 refactor: reduce default column limit
Reduces the default number of columns allowed per-table, from 1,000 to
200.
2022-10-14 14:45:48 +02:00
Nga Tran 75ff805ee2
feat: instead of adding num_files and memory budget into the reason text column, let us create different columns for them. We will be able to filter on them easily (#5742)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-09-26 20:14:04 +00:00
Dom Dwyer 66bf0ff272 refactor(db): NULLable persisted_sequence_number
Makes the partition.persisted_sequence_number column in the catalog DB
NULLable. 0 is a valid persisted sequence number.
2022-09-15 18:19:39 +02:00
Dom Dwyer c5ac17399a refactor(db): persist marker for partition table
Adds a migration to add a column "persisted_sequence_number" that
defines the inclusive upper-bound on sequencer writes materialised and
uploaded to object store for the partition.
2022-09-15 16:10:35 +02:00
Luke Bond ee3f172d45 chore: renamed DB migration for billing trigger 2022-09-13 16:29:14 +01:00
Luke Bond c8b545134e chore: add index to speed up billing_summary upsert 2022-09-13 16:22:44 +01:00
Luke Bond feae712881 fix: parquet_file billing trigger respects to_delete 2022-09-13 16:22:44 +01:00
Luke Bond cc93b2c275 chore: add catalog trigger for billing 2022-09-13 16:22:44 +01:00
Carol (Nichols || Goulding) fbe3e360d2
feat: Record skipped compactions in memory
Connects to #5458.
2022-09-09 15:31:07 -04:00
Nga Tran cbfd37540a
feat: add index on parquet_file(shard_id, compaction_level, to_delete, created_at) (#5544) 2022-09-02 14:27:29 +00:00
Carol (Nichols || Goulding) 8a0fa616cf
fix: Rename columns, tables, indexes and constraints in postgres catalog 2022-09-01 10:00:54 -04:00
Nga Tran a2c82a6f1c
chore: remove min sequence number from the catalog table as we no longer use it (#5178)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-07-21 20:47:55 +00:00
Marco Neumann be53716e4d
refactor: use IDs for `parquet_file.column_set` (#4965)
* feat: `ColumnRepo::list_by_table_id`

* refactor: use IDs for `parquet_file.column_set`

Closes #4959.

* refactor: introduce `TableSchema::column_id_map`
2022-06-30 15:08:41 +00:00
Marco Neumann 215f297162
refactor: parquet file metadata from catalog (#4949)
* refactor: remove `ParquetFileWithMetadata`

* refactor: remove `ParquetFileRepo::parquet_metadata`

* refactor: parquet file metadata from catalog

Closes #4124.
2022-06-27 15:38:39 +00:00
Nga Tran 92eeb5b232
chore: remove unused sort_key_old from catalog partition (#4944)
* chore: remove unused sort_key_old from catalog partition

* chore: add new line at the end of the SQL file
2022-06-24 15:02:38 +00:00
Marco Neumann 994bc5fefd
refactor: ensure that SQL parquet file column sets are not NULL (#4937)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-06-24 14:26:18 +00:00
Marco Neumann c3912e34e9
refactor: store per-file column set in catalog (#4908)
* refactor: store per-file column set in catalog

Together with the table-wide schema and the partition-wide sort key, this should
be everything we need to read a parquet file directly into memory
without peeking any file-level metadata.

The querier will use this to directly load parquet files into the read
buffer.

**WARNING: This requires a catalog wipe!**

Ref #4124.

* refactor: use proper `ColumnSet` type
2022-06-21 10:26:12 +00:00
Nga Tran 13c57d524a
feat: Change data type of catalog partition's sort_key from a string to an array of string (#4801)
* feat: Change data type of catalog Postgres partition's sort_key from a string to an array of string

* test: add column with comma

* fix: use new protobuf field to avoid incompatibility

* fix: ensure sort_key is an empty array rather than NULL

* refactor: address review comments

* refactor: address more comments

* chore: clearer comments

* chore: Update iox_catalog/migrations/20220607102200_change_sort_key_type_to_array.sql

* chore: Update iox_catalog/migrations/20220607102200_change_sort_key_type_to_array.sql

* fix: Rename migration so it will be applied after
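One way such a conversion can be expressed in a single step (a sketch only; the "column with comma" test above is exactly why the real migration has to be more careful than a naive `string_to_array` split):

```sql
-- Sketch: convert sort_key in place from a comma-separated string to a
-- text array. Embedded commas in column names would break this split.
ALTER TABLE partition
    ALTER COLUMN sort_key TYPE TEXT[]
    USING string_to_array(sort_key, ',');
```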

Co-authored-by: Marko Mikulicic <mkm@influxdata.com>
2022-06-10 13:31:31 +00:00
Marko Mikulicic c09f6f6bc9
chore: Incrementally migrate sort_key to array type (#4826)
This PR is the first step where we add a new column sort_key_arr whose content we'll manually migrate from sort_key.

When we're done with this, we'll merge https://github.com/influxdata/influxdb_iox/pull/4801/ (whose migration script must be adapted slightly to rename the `sort_key_arr` column back to `sort_key`).

All this must be done while we shut down the ingesters and the compactors.
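The two steps described above might be sketched like this (column names per the message; using `string_to_array` for the manual backfill is an assumption):

```sql
-- Step 1 (sketch): add the parallel array column alongside sort_key.
ALTER TABLE partition ADD COLUMN sort_key_arr TEXT[];

-- Step 2 (sketch): backfill it manually from the old string column
-- while ingesters and compactors are shut down.
UPDATE partition
   SET sort_key_arr = string_to_array(sort_key, ',')
 WHERE sort_key IS NOT NULL;
```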

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-06-10 11:35:43 +00:00
Marco Neumann 86e8f05ed1
fix: make all catalog IDs 64bit (#4418)
Closes #4365.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-25 16:49:34 +00:00
kodiakhq[bot] e2439c0a4f
Merge branch 'main' into cn/sort-key-catalog 2022-04-04 16:54:48 +00:00
Dom Dwyer 61bc9c83ad refactor: add table_id index on column_name
After checking the postgres workload for the catalog in prod, this
missing index was noted as the cause of unexpectedly expensive plans for
simple queries.
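A sketch of the fix; the index name is hypothetical, but the shape follows from the message (per-table lookups on `column_name` were doing full scans without it):

```sql
-- Sketch: index the foreign key so per-table column lookups are cheap.
CREATE INDEX IF NOT EXISTS column_name_table_id_idx
    ON column_name (table_id);
```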
2022-04-04 13:04:25 +01:00
Carol (Nichols || Goulding) c9bc70f03a
feat: Add optional sort_key column to partition table
Connects to #4195.
2022-04-01 15:45:51 -04:00
Paul Dix 6479e1fc8e
fix: add indexes to parquet_file (#4198)
Add indexes so the compactor can find candidate partitions and specific partition files quickly.
Limit the number of level 0 files returned for determining candidates. This should ensure that if compaction is very backed up, it will be able to work through the backlog without evaluating the entire world.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-01 09:59:39 +00:00
Marko Mikulicic 2c47d77a5b
fix: Backfill namespace_id in schema migration (#4177)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-30 16:31:26 +00:00
Carol (Nichols || Goulding) 5c8a80dca6
fix: Add an index to parquet_file to_delete 2022-03-29 08:15:26 -04:00