Commit Graph

443 Commits (dac0db21960c871c298924269d198a8b01849724)

Author SHA1 Message Date
Nga Tran dac0db2196
feat: add sort_key_ids into sqlite catalog (#8384) 2023-08-01 20:15:27 +00:00
Nga Tran 73f38077b6
feat: add sort_key_ids as array of bigints into catalog partition (#8375)
* feat: add sort_key_ids as array of bigints into catalog partition

* chore: add comments

* chore: remove comments to avoid changing them in the future due to checksum requirement

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-08-01 14:28:30 +00:00
Marco Neumann 743a59aa64
feat: use single per-migration txn when possible (#8373)
* test: improve `test_step_sql_statement_no_transaction`

* feat: also print number of steps in "applying migration step"

* feat: use single per-migration txn when possible

If all steps can (and want to) run in a transaction block, then wrap the
migration bookkeeping and the migration script into a single
transaction. This way we avoid the dirty state altogether because it's
now an "all or nothing" migration.

Note that we still guarantee that there is only a single migration
running at the same time due to the locking mechanism. Otherwise we
would potentially run into nasty transaction failures during schema
modifications.

This is related to #7897 but only fixes / self-heals the "dirty" state
for migrations that can run in transactions. For concurrent index
migrations (which we need in prod) we need to be a bit smarter and this
will be done in a follow-up. However I feel that not leaving half-done
migrations for the cases where it's technically possible (e.g. adding
columns) is already a huge step forward.

* test: make `test_migrator_uses_single_transaction_when_possible` harder

* test: explain test

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-08-01 08:18:39 +00:00
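The "all or nothing" behaviour described above can be sketched with a toy migrator (a hypothetical illustration in Python/sqlite3, not the actual sqlx migration code): the bookkeeping row and the migration steps commit or roll back together, so a failed step cannot leave the dirty state behind.

```python
import sqlite3

def apply_migration(conn, name, steps):
    # Run every step plus the bookkeeping insert in ONE transaction: if a
    # step fails, the rollback also discards the bookkeeping row and any
    # schema changes, so no half-applied "dirty" state is left behind.
    try:
        conn.execute("BEGIN")
        for sql in steps:
            conn.execute(sql)
        conn.execute("INSERT INTO _migrations (name) VALUES (?)", (name,))
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise

# autocommit mode, so the explicit BEGIN/COMMIT above are the only
# transaction control in play
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE _migrations (name TEXT PRIMARY KEY)")
apply_migration(conn, "add_table_t", ["CREATE TABLE t (id INTEGER)"])
```

A failing step rolls back both the schema change and the bookkeeping row, which is exactly why the single-transaction variant self-heals.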
Carol (Nichols || Goulding) 4a9e76b8b7
feat: Make parquet_file.partition_id optional in the catalog (#8339)
* feat: Make parquet_file.partition_id optional in the catalog

This will acquire a short lock on the table in postgres, per:
<https://stackoverflow.com/questions/52760971/will-making-column-nullable-lock-the-table-for-reads>

This allows us to persist data for new partitions and associate the
Parquet file catalog records with the partition records using only the
partition hash ID, rather than both IDs as is done now.

* fix: Support transition partition ID in the catalog service

* fix: Use transition partition ID in import/export

This commit also removes support for the `--partition-id` flag of the
`influxdb_iox remote store get-table` command, which Andrew approved.

The `--partition-id` filter was getting the results of the catalog gRPC
service's query for Parquet files of a table and then keeping only the
files whose partition IDs matched. The gRPC query is no longer returning
the partition ID from the Parquet file table, and really, this command
should instead be using `GetParquetFilesByPartitionId` to only request
what's needed rather than filtering.

* feat: Support looking up Parquet files by either kind of Partition id

Regardless of which is actually stored on the Parquet file record.

That is, say there's a Partition in the catalog with:

Partition {
    id: 3,
    hash_id: abcdefg,
}

and a Parquet file that has:

ParquetFile {
    partition_hash_id: abcdefg,
}

calling `list_by_partition_not_to_delete(PartitionId(3))` should still
return this Parquet file because it is associated with the partition
that has ID 3.

This is important for the compactor, which is currently only dealing in
PartitionIds, and I'd like to keep it that way for now to avoid having
to change Even More in this PR.

* fix: Use and set new partition ID fields everywhere they want to be

---------

Co-authored-by: Dom <dom@itsallbroken.com>
2023-07-31 12:40:56 +00:00
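The "either kind of partition ID" lookup described above can be illustrated with a toy query (hypothetical schema in Python/sqlite3, not the real catalog code): joining through the partition table lets a Parquet file match on whichever identifier it stored.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE partitions (id INTEGER PRIMARY KEY, hash_id TEXT UNIQUE);
CREATE TABLE parquet_file (
    id INTEGER PRIMARY KEY,
    partition_id INTEGER,       -- nullable: new records may only carry the hash
    partition_hash_id TEXT,
    to_delete INTEGER
);
INSERT INTO partitions VALUES (3, 'abcdefg');
INSERT INTO parquet_file VALUES (1, 3, NULL, NULL);         -- old style: row ID
INSERT INTO parquet_file VALUES (2, NULL, 'abcdefg', NULL); -- new style: hash ID
""")

def list_by_partition_not_to_delete(partition_id):
    # Join through the partition so a file matches on EITHER its numeric
    # partition_id OR the partition's hash_id, whichever it stored.
    rows = conn.execute(
        """
        SELECT pf.id FROM parquet_file pf
        JOIN partitions p ON pf.partition_id = p.id
                          OR pf.partition_hash_id = p.hash_id
        WHERE p.id = ? AND pf.to_delete IS NULL
        ORDER BY pf.id
        """,
        (partition_id,),
    ).fetchall()
    return [r[0] for r in rows]
```

Both files come back for `PartitionId(3)`, so a caller (like the compactor) can keep dealing only in numeric IDs.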
Marco Neumann 73339cfc57
fix: remove sqlx "used" metrics (#8336)
PR #8327 introduced a bunch of metrics for the sqlx connection pool. One
of the metrics was the "used" metric that was supposed to count
"currently in use" connections. In prod however this metric underflows to
a very large integer. It seems that the "acquire" callback is only used by
sqlx for re-used connections (i.e. for the transition from "idle" to
"used"). We could try to work around it, but since there is no "close
connection" callback, I doubt it is possible to do this accurately.

Luckily though we don't really need that counter. sqlx already offers
"active" (defined as idle + used) and "idle", so getting "used" is just
the difference. I removed the "used" metric nevertheless because
"active" and "idle" are read independently from each other (based on atomic
integers) and are NOT guaranteed to be in-sync. Calculating the
difference within IOx however would give the illusion that they are. So
I leave this to the dashboard / alert / whatever, because there it is
usually understood that metrics are samples and may be out of sync for a
very short time.

A nice side effect of this change is that it simplifies the code quite a
bit.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-27 10:04:56 +00:00
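Why an independently-sampled difference can underflow, as a toy illustration (hypothetical pool bookkeeping, not sqlx's actual internals): two counters updated one at a time can be observed mid-update.

```python
# Toy pool counters: "active" (idle + in-use) and "idle" are two separate
# atomics, updated one at a time rather than as a consistent pair.
active, idle = 10, 10          # all ten connections are currently idle

# An idle connection gets closed; the pool decrements the two counters
# in sequence, and a metrics sample lands between the two updates:
active -= 1                    # active = 9, idle still = 10
used = active - idle           # sampler computes "used": 9 - 10 = -1
idle -= 1                      # idle only catches up afterwards

# On an unsigned 64-bit counter, that -1 wraps to a very large integer,
# matching the underflow seen in prod:
wrapped = used % (1 << 64)
```

Leaving the subtraction to the dashboard layer avoids baking this illusion of consistency into IOx itself.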
Marco Neumann b62e98cef1
feat: metrics for sqlx conn pools (#8327)
To better gauge how many connections we use and especially if we hit the
max connection limit, it would be helpful to actually have some metrics
available for the pool usage. This change adds a few basic metrics.
2023-07-25 10:07:25 +00:00
dependabot[bot] faa8d44492
chore(deps): Bump thiserror from 1.0.43 to 1.0.44 (#8315)
Bumps [thiserror](https://github.com/dtolnay/thiserror) from 1.0.43 to 1.0.44.
- [Release notes](https://github.com/dtolnay/thiserror/releases)
- [Commits](https://github.com/dtolnay/thiserror/compare/1.0.43...1.0.44)

---
updated-dependencies:
- dependency-name: thiserror
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-24 10:18:44 +00:00
dependabot[bot] cd31492e5b
chore(deps): Bump async-trait from 0.1.71 to 0.1.72 (#8317)
Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.71 to 0.1.72.
- [Release notes](https://github.com/dtolnay/async-trait/releases)
- [Commits](https://github.com/dtolnay/async-trait/compare/0.1.71...0.1.72)

---
updated-dependencies:
- dependency-name: async-trait
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-24 10:07:18 +00:00
Joe-Blount 629f9d20db fix: update new_file_at following all compactions 2023-07-20 13:27:54 -05:00
Fraser Savage e894ea73f7
refactor(catalog): Allow kafka columns to be nullable 2023-07-20 11:18:02 +01:00
Marco Neumann 4e88571142
feat: add batch partition getters (#8268) 2023-07-19 15:05:41 +00:00
Marco Neumann 004b401a05
chore: upgrade to sqlx 0.7.1 (#8266)
There are a bunch of dependencies in `Cargo.lock` that are related to
mysql. These are NOT compiled at all, and are also not part of `cargo
tree`. The reason for the inclusion is a bug in cargo:

https://github.com/rust-lang/cargo/issues/10801

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-19 12:18:57 +00:00
dependabot[bot] e33a078128
chore(deps): Bump paste from 1.0.13 to 1.0.14 (#8244)
Bumps [paste](https://github.com/dtolnay/paste) from 1.0.13 to 1.0.14.
- [Release notes](https://github.com/dtolnay/paste/releases)
- [Commits](https://github.com/dtolnay/paste/compare/1.0.13...1.0.14)

---
updated-dependencies:
- dependency-name: paste
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-07-17 16:10:02 +00:00
Carol (Nichols || Goulding) f20e9e6368
fix: Add index on parquet_file.partition_hash_id for lookup perf 2023-07-10 13:40:03 -04:00
Carol (Nichols || Goulding) 22c17fb970
feat: Abstract over which partition ID type we're using to list Parquet files 2023-07-10 13:40:01 -04:00
Carol (Nichols || Goulding) c1e42651ec
feat: Abstract over which partition ID type we're using to compare and swap sort keys 2023-07-10 13:39:19 -04:00
Carol (Nichols || Goulding) eec31b7f00
feat: Abstract over which partition ID type we're using to get a partition from the catalog 2023-07-10 10:43:20 -04:00
Joe-Blount c2442c31f3 chore: create partition table index for created_at 2023-07-07 16:27:05 -05:00
dependabot[bot] 8b000862e1
chore(deps): Bump pretty_assertions from 1.3.0 to 1.4.0 (#8182)
Bumps [pretty_assertions](https://github.com/rust-pretty-assertions/rust-pretty-assertions) from 1.3.0 to 1.4.0.
- [Release notes](https://github.com/rust-pretty-assertions/rust-pretty-assertions/releases)
- [Changelog](https://github.com/rust-pretty-assertions/rust-pretty-assertions/blob/main/CHANGELOG.md)
- [Commits](https://github.com/rust-pretty-assertions/rust-pretty-assertions/compare/v1.3.0...v1.4.0)

---
updated-dependencies:
- dependency-name: pretty_assertions
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-07-07 09:35:18 +00:00
dependabot[bot] 057ee40cb9
chore(deps): Bump thiserror from 1.0.41 to 1.0.43 (#8181)
Bumps [thiserror](https://github.com/dtolnay/thiserror) from 1.0.41 to 1.0.43.
- [Release notes](https://github.com/dtolnay/thiserror/releases)
- [Commits](https://github.com/dtolnay/thiserror/compare/1.0.41...1.0.43)

---
updated-dependencies:
- dependency-name: thiserror
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-07-07 09:25:12 +00:00
dependabot[bot] 26a6113a37
chore(deps): Bump async-trait from 0.1.70 to 0.1.71 (#8163)
Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.70 to 0.1.71.
- [Release notes](https://github.com/dtolnay/async-trait/releases)
- [Commits](https://github.com/dtolnay/async-trait/compare/0.1.70...0.1.71)

---
updated-dependencies:
- dependency-name: async-trait
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-06 09:58:51 +00:00
dependabot[bot] 3827257f94
chore(deps): Bump thiserror from 1.0.40 to 1.0.41 (#8149)
Bumps [thiserror](https://github.com/dtolnay/thiserror) from 1.0.40 to 1.0.41.
- [Release notes](https://github.com/dtolnay/thiserror/releases)
- [Commits](https://github.com/dtolnay/thiserror/compare/1.0.40...1.0.41)

---
updated-dependencies:
- dependency-name: thiserror
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Dom <dom@itsallbroken.com>
2023-07-05 09:25:14 +00:00
Marco Neumann 9c65185068
refactor: normalize catalog metric names (#8152)
Use the same prefix for all metrics of the same repo type. This makes
reading dashboards way easier.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-05 09:18:39 +00:00
dependabot[bot] b5c9628f0f
chore(deps): Bump async-trait from 0.1.69 to 0.1.70 (#8148)
Bumps [async-trait](https://github.com/dtolnay/async-trait) from 0.1.69 to 0.1.70.
- [Release notes](https://github.com/dtolnay/async-trait/releases)
- [Commits](https://github.com/dtolnay/async-trait/compare/0.1.69...0.1.70)

---
updated-dependencies:
- dependency-name: async-trait
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-05 09:05:13 +00:00
dependabot[bot] 9a03d9c9fe
chore(deps): Bump paste from 1.0.12 to 1.0.13 (#8139)
Bumps [paste](https://github.com/dtolnay/paste) from 1.0.12 to 1.0.13.
- [Release notes](https://github.com/dtolnay/paste/releases)
- [Commits](https://github.com/dtolnay/paste/compare/1.0.12...1.0.13)

---
updated-dependencies:
- dependency-name: paste
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-07-04 07:57:41 +00:00
dependabot[bot] b15c6062a9
chore(deps): Bump tokio from 1.28.2 to 1.29.0 (#8100)
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.28.2 to 1.29.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.28.2...tokio-1.29.0)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-06-28 13:18:08 +00:00
Carol (Nichols || Goulding) 60d0858381
feat: Add catalog method for looking up partitions by their hash ID (#8018)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-06-23 14:42:50 +00:00
Carol (Nichols || Goulding) 62ab8d21c2
fix: Eliminate need to have 2 separate insert statements depending on presence of hash ID
I figured out that inserting `Option<PartitionHashId>` gave a compiler
error that `Encode` wasn't implemented because I only implemented
`Encode` for `&PartitionHashId`, and sqlx only implements `Encode` for
`Option<T: Encode>`, not `Option<T> where &T: Encode`. Using `as_ref`
makes this work and gets rid of the `match` that created two different
queries (one of which was wrong!).

Also add tests that we can insert Parquet file records for partitions
that don't have hash IDs to ensure we don't break ingest of new data for
old-style partitions.
2023-06-22 09:01:22 -04:00
Carol (Nichols || Goulding) bffb2f8f9f
fix: Specialize Partition constructors to clarify appropriate usage 2023-06-22 09:01:22 -04:00
Carol (Nichols || Goulding) 62ba18171a
feat: Add a new hash column on the partition and parquet file tables
This will hold the deterministic ID for partitions.

Until all existing partitions have this value, this is optional/nullable.

The row ID still exists and is used as the main foreign key in the
parquet_file and skipped_compaction tables.

The hash_id has a unique index so that we can look up records based on
it (if it's available).

If the parquet file record has a partition_hash_id value, use that to
generate the object storage path instead of the partition_id.
2023-06-22 09:01:22 -04:00
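The path-generation rule in the last paragraph can be sketched as follows (hypothetical path layout and function name for illustration, not the exact IOx format):

```python
def object_store_path(namespace_id, table_id, partition_id,
                      partition_hash_id, object_store_id):
    # Prefer the deterministic hash ID when the parquet file record has
    # one; fall back to the numeric partition_id for old-style partitions
    # that were created before the hash column existed.
    part = partition_hash_id if partition_hash_id is not None else str(partition_id)
    return f"{namespace_id}/{table_id}/{part}/{object_store_id}.parquet"
```

Because the hash ID is deterministic, new-style paths stay stable even if catalog rows are ever re-created with different row IDs.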
Phil Bracikowski 0c9cc2c1ed
chore(garbage-collector): increase batch size for clearing deleted files (#8009)
This PR increases the constant for the number of parquet files to remove
from the catalog that are soft deleted and older than a configurable
cutoff.

* closes #8008

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-06-16 15:22:13 +00:00
Phil Bracikowski e34ec77e8d
feat(garbage-collector): batch parquet existence checks to catalog (#7964)
* feat(garbage-collector): batch parquet existence checks to catalog

The core feature of this PR is batching the existence checks of parquet
files in object store against the catalog. Before, there was 1 catalog
query per each parquet file in object store. This can be a lot of
requests.

This PR can perform one catalog query for a batch of at most 100 parquet
file uuids. A hundred seems like a decent starting place.

The batch may not reach 100 because there is also a timeout on receiving
object store meta objects from the object store lister thread. That
timeout is set to 100 milliseconds. If more than 100 are received, they
are split into batches of 100 for the catalog.

Additionally, this PR includes surrounding code changes to make it more
idiomatic (but not perfect). It follows up some suggested work from
 #7652 for watching for shutdown on the threads.

* fixes #7784

* use hashset instead of vec to test for contains
* chore: add test for db failure path
* remove ParquetFileExistsByOSID and other single field structs that are
  just for sql deserialization; map to uuid explicitly
* fix the sqlite query by using a blob literal X'<hex>' for uuids
* comment clarifications
* adjust logging to warn from debug for expected rare events

Many thanks to Carol for help implementing this!
2023-06-14 07:59:00 -07:00
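The batching described above can be sketched like this (a hypothetical Python/sqlite3 illustration, not the garbage collector's actual Rust code), including binding the UUIDs as blobs as the sqlite fix above mentions:

```python
import sqlite3
import uuid

BATCH_SIZE = 100  # matches the batch size chosen above

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parquet_file (object_store_id BLOB PRIMARY KEY)")
known = [uuid.uuid4() for _ in range(150)]
conn.executemany("INSERT INTO parquet_file VALUES (?)",
                 [(u.bytes,) for u in known])

def existing_ids(candidates):
    # One catalog query per batch of at most BATCH_SIZE uuids, instead of
    # one query per object store file. UUIDs are bound as raw blobs,
    # mirroring the X'<hex>' blob-literal fix for sqlite.
    found = set()
    for i in range(0, len(candidates), BATCH_SIZE):
        batch = candidates[i:i + BATCH_SIZE]
        placeholders = ",".join("?" * len(batch))
        rows = conn.execute(
            "SELECT object_store_id FROM parquet_file "
            f"WHERE object_store_id IN ({placeholders})",
            [u.bytes for u in batch],
        )
        found.update(uuid.UUID(bytes=r[0]) for r in rows)
    return found
```

Checking 155 candidates costs two queries here instead of 155, which is the whole point of the change.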
Carol (Nichols || Goulding) 566ec68c58
refactor: Extract a test helper method for creating ParquetFileParams (#7959)
Co-authored-by: Dom <dom@itsallbroken.com>
2023-06-09 09:44:30 +00:00
Marko Mikulicic d26ad8e079
feat: Allow passing service protection limits in create db gRPC call (#7941)
* feat: Allow passing service protection limits in create db gRPC call

* fix: Move the impl into the catalog namespace trait

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-06-08 14:28:32 +00:00
Phil Bracikowski 92a83270f3
fix(garbage-collector): just test parquet file exists (#7948)
* fix(garbage-collector): just test parquet file existence

The GC, when checking files in object store against the catalog, only
cares if the parquet file for the given object store id exists in the
catalog. It doesn't need the full parquet file. Let's not transmit it
over the wire.

This PR uses a `SELECT 1` and a boolean to test whether the parquet file exists.

* helps #7784

* chore: use struct for from_row

* chore: satisfy clippy

* chore: fmt
2023-06-07 15:12:48 -07:00
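The existence check amounts to the following (a hypothetical Python/sqlite3 sketch of the `SELECT 1` pattern, not the GC's actual code):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parquet_file "
             "(object_store_id TEXT PRIMARY KEY, payload BLOB)")
conn.execute("INSERT INTO parquet_file VALUES ('abc', zeroblob(1024))")

def parquet_file_exists(object_store_id):
    # SELECT 1 ... LIMIT 1 returns at most one tiny row; the full record
    # (payload included) never has to cross the wire.
    row = conn.execute(
        "SELECT 1 FROM parquet_file WHERE object_store_id = ? LIMIT 1",
        (object_store_id,),
    ).fetchone()
    return row is not None
```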
Carol (Nichols || Goulding) 2becc950e1
fix: Use expect rather than returning error in a theoretically impossible case 2023-06-07 11:38:12 -04:00
Carol (Nichols || Goulding) ac26ceef91
feat: Make a place to do partition template validation
- Create data_types::partition_template::ValidationError
- Make creation of NamespacePartitionTemplateOverride and
  TablePartitionTemplateOverride fallible
- Move SerializationWrapper into a module to make its inner field
  private to force creation through one fallible constructor; this is
  where the validation logic will go to be shared among all uses of
  partition templates
2023-06-07 11:38:12 -04:00
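The "fallible constructor with a private inner field" pattern described above can be sketched in Python (names borrowed from the commit, with a hypothetical validation rule for illustration):

```python
class ValidationError(ValueError):
    """Raised when a partition template fails validation."""

class TablePartitionTemplateOverride:
    # The only way to obtain an instance is this fallible constructor, so
    # every instance that exists is known-valid -- mirroring the private
    # inner field forcing creation through one validated code path.
    def __init__(self, parts):
        if not parts:
            raise ValidationError("partition template must not be empty")
        self._parts = tuple(parts)

    @property
    def parts(self):
        return self._parts
```

Centralizing validation in the constructor means every consumer of partition templates shares the same checks.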
Carol (Nichols || Goulding) 7f1d4a5bcd
fix: Create tag columns used in table partition template in a transaction with table create 2023-06-05 11:21:33 -04:00
Carol (Nichols || Goulding) bf69f17b3f
test: Add checks for tag columns being added by table creation to the catalog tests
These currently fail because the implementation still only exists in the
table grpc service.
2023-06-05 10:24:45 -04:00
Marco Neumann 86a2c249ec
refactor: faster PG `ParquetFileRepo` (#7907)
* refactor: remove `ParquetFileRepo::flag_for_delete`

* refactor: batch update parquet files in catalog

* refactor: avoid data roundtrips through postgres

* refactor: do not return ID from PG when we do not need it

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-06-01 16:17:28 +00:00
Marco Neumann 551e838db3
refactor: remove unused PG indices (#7905)
Similar to #7859. To test index usage, execute the following query on
the writer replica:

```sql
SELECT
    n.nspname                                      AS namespace_name,
    t.relname                                      AS table_name,
    pg_size_pretty(pg_relation_size(t.oid))        AS table_size,
    t.reltuples::bigint                            AS num_rows,
    psai.indexrelname                              AS index_name,
    pg_size_pretty(pg_relation_size(i.indexrelid)) AS index_size,
    CASE WHEN i.indisunique THEN 'Y' ELSE 'N' END  AS "unique",
    psai.idx_scan                                  AS number_of_scans,
    psai.idx_tup_read                              AS tuples_read,
    psai.idx_tup_fetch                             AS tuples_fetched
FROM
    pg_index i
    INNER JOIN pg_class t               ON t.oid = i.indrelid
    INNER JOIN pg_namespace n           ON n.oid = t.relnamespace
    INNER JOIN pg_stat_all_indexes psai ON i.indexrelid = psai.indexrelid
WHERE
    n.nspname = 'iox_catalog' AND t.relname = 'parquet_file'
ORDER BY 1, 2, 5;
```

Data for eu-west-1 at `2023-05-31T16:30:00Z`:

```text
namespace_name |  table_name  | table_size | num_rows  |            index_name             | index_size | unique | number_of_scans |  tuples_read   | tuples_fetched
----------------+--------------+------------+-----------+-----------------------------------+------------+--------+-----------------+----------------+----------------
 iox_catalog    | parquet_file | 38 GB      | 146489216 | parquet_file_deleted_at_idx       | 6442 MB    | N      |      1693534991 | 21602734184385 |    21694365037
 iox_catalog    | parquet_file | 38 GB      | 146489216 | parquet_file_partition_delete_idx | 20 MB      | N      |        17854904 |     3087700816 |      384603858
 iox_catalog    | parquet_file | 38 GB      | 146489216 | parquet_file_partition_idx        | 2325 MB    | N      |      1627977474 | 12604272924323 | 11088781876397
 iox_catalog    | parquet_file | 38 GB      | 146489216 | parquet_file_pkey                 | 8290 MB    | Y      |       480767174 |      481021514 |      480733966
 iox_catalog    | parquet_file | 38 GB      | 146489216 | parquet_file_table_delete_idx     | 174 MB     | N      |         1006563 |    24687617719 |      385132581
 iox_catalog    | parquet_file | 38 GB      | 146489216 | parquet_file_table_idx            | 1905 MB    | N      |         9288042 |   351240529272 |          27551
 iox_catalog    | parquet_file | 38 GB      | 146489216 | parquet_location_unique           | 6076 MB    | Y      |       385294957 |         109448 |         109445
```

and at `2023-06-01T13:00:00Z`:

```text
 namespace_name |  table_name  | table_size | num_rows  |            index_name             | index_size | unique | number_of_scans |  tuples_read   | tuples_fetched
----------------+--------------+------------+-----------+-----------------------------------+------------+--------+-----------------+----------------+----------------
 iox_catalog    | parquet_file | 43 GB      | 152684560 | parquet_file_deleted_at_idx       | 6976 MB    | N      |      1693535032 | 21602834620294 |    21736731439
 iox_catalog    | parquet_file | 43 GB      | 152684560 | parquet_file_partition_delete_idx | 21 MB      | N      |        31468423 |     7397141567 |      677909956
 iox_catalog    | parquet_file | 43 GB      | 152684560 | parquet_file_partition_idx        | 2464 MB    | N      |      1627977474 | 12604272924323 | 11088781876397
 iox_catalog    | parquet_file | 43 GB      | 152684560 | parquet_file_pkey                 | 8785 MB    | Y      |       492762975 |      493017342 |      492729691
 iox_catalog    | parquet_file | 43 GB      | 152684560 | parquet_file_table_delete_idx     | 241 MB     | N      |         1136317 |    24735561304 |      429892231
 iox_catalog    | parquet_file | 43 GB      | 152684560 | parquet_file_table_idx            | 2058 MB    | N      |         9288042 |   351240529272 |          27551
 iox_catalog    | parquet_file | 43 GB      | 152684560 | parquet_location_unique           | 6776 MB    | Y      |       399142416 |         124810 |         124807
```

Due to #7842 and #7894, the following indices are no longer used:

- `parquet_file_partition_idx`
- `parquet_file_table_idx`
2023-06-01 13:45:05 +00:00
Marco Neumann e14305ac33
feat: add index for compactor (#7894)
* fix: migration name

* feat: add index for compactor
2023-05-31 12:29:00 +00:00
Marco Neumann e1c1908a0b
refactor: add `parquet_file` PG index for querier (#7842)
* refactor: add `parquet_file` PG index for querier

Currently the `list_by_table_not_to_delete` catalog query is somewhat
expensive:

```text
iox_catalog_prod=> select table_id, sum((to_delete is NULL)::int) as n from parquet_file group by table_id order by n desc limit 5;
 table_id |  n
----------+------
  1489038 | 7221
  1489037 | 7019
  1491534 | 5793
  1491951 | 5522
  1513377 | 5339
(5 rows)

iox_catalog_prod=> EXPLAIN ANALYZE SELECT id, namespace_id, table_id, partition_id, object_store_id,
       min_time, max_time, to_delete, file_size_bytes,
       row_count, compaction_level, created_at, column_set, max_l0_created_at
FROM parquet_file
WHERE table_id = 1489038 AND to_delete IS NULL;
                                                                          QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on parquet_file  (cost=46050.91..47179.26 rows=283 width=200) (actual time=464.368..472.514 rows=7221 loops=1)
   Recheck Cond: ((table_id = 1489038) AND (to_delete IS NULL))
   Heap Blocks: exact=7152
   ->  BitmapAnd  (cost=46050.91..46050.91 rows=283 width=0) (actual time=463.341..463.343 rows=0 loops=1)
         ->  Bitmap Index Scan on parquet_file_table_idx  (cost=0.00..321.65 rows=22545 width=0) (actual time=1.674..1.674 rows=7221 loops=1)
               Index Cond: (table_id = 1489038)
         ->  Bitmap Index Scan on parquet_file_deleted_at_idx  (cost=0.00..45728.86 rows=1525373 width=0) (actual time=460.717..460.717 rows=4772117 loops=1)
               Index Cond: (to_delete IS NULL)
 Planning Time: 0.092 ms
 Execution Time: 472.907 ms
(10 rows)
```

I think this may also be because PostgreSQL chooses the wrong strategy
here; it could just scan the existing index and filter from there:

```text
iox_catalog_prod=> EXPLAIN ANALYZE SELECT id, namespace_id, table_id, partition_id, object_store_id,
       min_time, max_time, to_delete, file_size_bytes,
       row_count, compaction_level, created_at, column_set, max_l0_created_at
FROM parquet_file
WHERE table_id = 1489038;
                                                                    QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using parquet_file_table_idx on parquet_file  (cost=0.57..86237.78 rows=22545 width=200) (actual time=0.057..6.994 rows=7221 loops=1)
   Index Cond: (table_id = 1489038)
 Planning Time: 0.094 ms
 Execution Time: 7.297 ms
(4 rows)
```

However PostgreSQL doesn't know the cardinalities well enough. So
let's add a dedicated index to make the querier faster.

* feat: new migration system

* docs: explain dirty migrations
2023-05-31 10:56:32 +00:00
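The dedicated index might look something like the following (a hypothetical shape demonstrated with Python/sqlite3; the actual PG migration may differ): a partial index on `table_id` restricted to undeleted rows serves the hot query directly, so the planner no longer ANDs two huge single-column bitmaps together.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parquet_file "
             "(id INTEGER PRIMARY KEY, table_id INTEGER, to_delete INTEGER)")
conn.executemany("INSERT INTO parquet_file VALUES (?, ?, ?)",
                 [(1, 1489038, None), (2, 1489038, 123), (3, 7, None)])

# Partial index covering exactly the hot predicate (hypothetical name).
conn.execute("CREATE INDEX parquet_file_table_delete_idx "
             "ON parquet_file (table_id) WHERE to_delete IS NULL")

ids = [r[0] for r in conn.execute(
    "SELECT id FROM parquet_file "
    "WHERE table_id = 1489038 AND to_delete IS NULL")]
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT id FROM parquet_file "
    "WHERE table_id = 1489038 AND to_delete IS NULL").fetchall()
```

The query plan should show the dedicated index being searched instead of a bitmap-AND over two broad single-column indices.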
Dom Dwyer 9e0570f2bf
refactor: explicit submod for partition_template
Move the import into the submodule itself, rather than re-exporting it
at the crate level.

This will make it possible to link to the specific module/logic.
2023-05-30 15:13:20 +02:00
Dom Dwyer 2094b45c10
refactor(catalog): mark metrics() as test only
This method is used to enable tests - it's never intended to be used in
production code to access the underlying metric registry. The Catalog
trait is responsible for Catalog things, not acting as a dependency
injection for metrics.

The only current use of this is in test code, so no changes needed.
2023-05-24 17:38:10 +02:00
Carol (Nichols || Goulding) d91b75526f
fix: Clarify that the expect is on the Option, not the Result 2023-05-24 10:36:52 -04:00
Carol (Nichols || Goulding) efc817c2a8
fix: Remove From impl, leaving TablePartitionTemplateOverride::new as only creation mechanism
This makes it clearer that you do or do not have a custom table override
(in the first argument to `new`).
2023-05-24 10:36:52 -04:00
Carol (Nichols || Goulding) 46f7e3e48a
fix: Handle potential for data race in catalog table insertion by re-fetching if detected 2023-05-24 10:36:52 -04:00
Carol (Nichols || Goulding) 90cb4b6ed9
refactor: Extract a function for handling a table missing from the namespace cache 2023-05-24 10:36:52 -04:00
Carol (Nichols || Goulding) 73b09d895f
feat: Store and handle NULL partition_template database values
Treat them as the default partition template in the application, but
save space and avoid having to backfill the tables by having the
database values be NULL when no custom template has been specified.
2023-05-24 10:36:52 -04:00