Commit Graph

346 Commits (dac0db21960c871c298924269d198a8b01849724)

Author SHA1 Message Date
Marco Neumann 743a59aa64
feat: use single per-migration txn when possible (#8373)
* test: improve `test_step_sql_statement_no_transaction`

* feat: also print number of steps in "applying migration step"

* feat: use single per-migration txn when possible

If all steps can (and want) to run in a transaction block, then wrap the
migration bookkeeping and the migration script into a single
transaction. This way we avoid the dirty state altogether because its
now an "all or nothing" migration.

Note that we still guarantee that there is only a single migration
running at the same time due to the locking mechanism. Otherwise we
would potentially run into nasty transaction failures during schema
modifications.

This is related to #7897 but only fixes / self-heals the "dirty" state
for transaction that can run in transactions. For concurrent index
migrations (which we need in prod) we need to be a bit smarter and this
will be done in a follow-up. However I feel that not leaving half-done
migrations for the cases where it's technically possible (e.g. adding
columns) is already a huge step forward.

* test: make `test_migrator_uses_single_transaction_when_possible` harder

* test: explain test

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-08-01 08:18:39 +00:00
Carol (Nichols || Goulding) 4a9e76b8b7
feat: Make parquet_file.partition_id optional in the catalog (#8339)
* feat: Make parquet_file.partition_id optional in the catalog

This will acquire a short lock on the table in postgres, per:
<https://stackoverflow.com/questions/52760971/will-making-column-nullable-lock-the-table-for-reads>

This allows us to persist data for new partitions and associate the
Parquet file catalog records with the partition records using only the
partition hash ID, rather than both that are used now.

* fix: Support transition partition ID in the catalog service

* fix: Use transition partition ID in import/export

This commit also removes support for the `--partition-id` flag of the
`influxdb_iox remote store get-table` command, which Andrew approved.

The `--partition-id` filter was getting the results of the catalog gRPC
service's query for Parquet files of a table and then keeping only the
files whose partition IDs matched. The gRPC query is no longer returning
the partition ID from the Parquet file table, and really, this command
should instead be using `GetParquetFilesByPartitionId` to only request
what's needed rather than filtering.

* feat: Support looking up Parquet files by either kind of Partition id

Regardless of which is actually stored on the Parquet file record.

That is, say there's a Partition in the catalog with:

Partition {
    id: 3,
    hash_id: abcdefg,
}

and a Parquet file that has:

ParquetFile {
    partition_hash_id: abcdefg,
}

calling `list_by_partition_not_to_delete(PartitionId(3))` should still
return this Parquet file because it is associated with the partition
that has ID 3.

This is important for the compactor, which is currently only dealing in
PartitionIds, and I'd like to keep it that way for now to avoid having
to change Even More in this PR.

* fix: Use and set new partition ID fields everywhere they want to be

---------

Co-authored-by: Dom <dom@itsallbroken.com>
2023-07-31 12:40:56 +00:00
Marco Neumann 73339cfc57
fix: remove sqlx "used" metrics (#8336)
PR #8327 introduced a bunch of metrics for the sqlx connection pool. One
of the metrics was the "used" metrics that was supposed to count
"currently in use" connection. In prod however this metric underflows to
a very large integer. It seems that "acquire" callback is only used by sqlx for
re-used connections (i.e. for the transition from "idle" to "used").
Now we could try to work around it but since there is no "close
connection" callback, I doubt it it possible to do the accurately.

Luckily though we don't really need that counter. sqlx already offers
"active" (defined as idle + used) and "idle", so getting "used" is just
the difference. I removed the "used" metric nevertheless because
"active" and "idle" are read independently from each other (based on atomic
integers) and are NOT guaranteed to be in-sync. Calculating the
difference within IOx however would give the illusion that they are. So
I leave this to the dashboard / alert / whatever, because there it is
usually understood that metrics are samples and may be out of sync for a
very short time.

A nice side effect of this change is that it simplifies the code quite a
bit.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-27 10:04:56 +00:00
Marco Neumann b62e98cef1
feat: metrics for sqlx conn pools (#8327)
To better gauge how many connections we use and especially if we hit the
max connection limit, it would be helpful to actually have some metrics
available for the pool usage. This change adds a few basic metrics.
2023-07-25 10:07:25 +00:00
Joe-Blount 629f9d20db fix: update new_file_at following all compactions 2023-07-20 13:27:54 -05:00
Marco Neumann 4e88571142
feat: add batch partition getters (#8268) 2023-07-19 15:05:41 +00:00
Marco Neumann 004b401a05
chore: upgrade to sqlx 0.7.1 (#8266)
There are a bunch of dependencies in `Cargo.lock` that are related to
mysql. These are NOT compiled at all, and are also not part of `cargo
tree`. The reason for the inclusion is a bug in cargo:

https://github.com/rust-lang/cargo/issues/10801

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-19 12:18:57 +00:00
Carol (Nichols || Goulding) 22c17fb970
feat: Abstract over which partition ID type we're using to list Parquet files 2023-07-10 13:40:01 -04:00
Carol (Nichols || Goulding) c1e42651ec
feat: Abstract over which partition ID type we're using to compare and swap sort keys 2023-07-10 13:39:19 -04:00
Carol (Nichols || Goulding) eec31b7f00
feat: Abstract over which partition ID type we're using to get a partition from the catalog 2023-07-10 10:43:20 -04:00
Marco Neumann 9c65185068
refactor: normalize catalog metric names (#8152)
Use the same prefix for all metrics of the same repo type. This makes
reading dashboards way easier.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-05 09:18:39 +00:00
Carol (Nichols || Goulding) 60d0858381
feat: Add catalog method for looking up partitions by their hash ID (#8018)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-06-23 14:42:50 +00:00
Carol (Nichols || Goulding) 62ab8d21c2
fix: Eliminate need to have 2 separate insert statements depending on presence of hash ID
I figured out that the reason inserting `Option<PartitionHashId>` was
giving me a compiler error that `Encode` wasn't implemented was because
I only implemented `Encode` for `&PartitionHashId` and sqlx only
implements `Encode` for `Option<T: Encode>`, not `Option<T> where &T:
Encode`. Using `as_ref` makes this work and gets rid of the `match` that
created two different queries (one of which was wrong!)

Also add tests that we can insert Parquet file records for partitions
that don't have hash IDs to ensure we don't break ingest of new data for
old-style partitions.
2023-06-22 09:01:22 -04:00
Carol (Nichols || Goulding) bffb2f8f9f
fix: Specialize Partition constructors to clarify appropriate usage 2023-06-22 09:01:22 -04:00
Carol (Nichols || Goulding) 62ba18171a
feat: Add a new hash column on the partition and parquet file tables
This will hold the deterministic ID for partitions.

Until all existing partitions have this value, this is optional/nullable.

The row ID still exists and is used as the main foreign key in the
parquet_file and skipped_compaction tables.

The hash_id has a unique index so that we can look up records based on
it (if it's available).

If the parquet file record has a partition_hash_id value, use that to
generate the object storage path instead of the partition_id.
2023-06-22 09:01:22 -04:00
Phil Bracikowski 0c9cc2c1ed
chore(garbage-collector): increase batch size for clearing deleted files (#8009)
This pr increases the const for the number of parquet files to remove
from the catalog that are soft deleted and older than a configurable
cutoff.

* closes #8008

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-06-16 15:22:13 +00:00
Phil Bracikowski e34ec77e8d
feat(garbage-collector): batch parquet existence checks to catalog (#7964)
* feat(garbage-collector): batch parquet existence checks to catalog

The core feature of this PR is batching the existence checks of parquet
files in object store against the catalog. Before, there was 1 catalog
query per each parquet file in object store. This can be a lot of
requests.

This PR can perform one query of at most 100 parquet file uuids against
the catalog in one query. A hundred seems like a decent starting place.

The batch may not reach 100 because there is also a timeout on receiving
object store meta objects from the object store lister thread. That
timeout is set to 100 milliseconds. If more than 100 are received, they
are batched into 100 for the catalog.

Additionally, this PR includes surrounding code changes to make it more
idiomatic (but not perfect). It follows up some suggested work from
 #7652 for watching for shutdown on the threads.

* fixes #7784

* use hashset instead of vec to test for contains
* chore: add test for db failure path
* remove ParquetFileExistsByOSID and other single field structs that are
  just for sql deserialization; map to uuid explicitly
* fix the sqlite query by using a blob literal X'<hex>' for uuids
* comment clarifications
* adjust loggings to warn from debug for expected rare events

Many thanks to Carol for help implementing this!
2023-06-14 07:59:00 -07:00
Carol (Nichols || Goulding) 566ec68c58
refactor: Extract a test helper method for creating ParquetFileParams (#7959)
Co-authored-by: Dom <dom@itsallbroken.com>
2023-06-09 09:44:30 +00:00
Marko Mikulicic d26ad8e079
feat: Allow passing service protection limits in create db gRPC call (#7941)
* feat: Allow passing service protection limits in create db gRPC call

* fix: Move the impl into the catalog namespace trait

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-06-08 14:28:32 +00:00
Phil Bracikowski 92a83270f3
fix(garbage-collector): just test parquet file exists (#7948)
* fix(garbage-collector): just test parquet file existence

The GC, when checking files in object store against the catalog, only
cares if the parquet file for the given object store id exists in the
catalog. It doesn't need the full parquet file. Let's not transmit it
over the wire.

This PR uses a SELECT 1 and boolean to test for parquet file existing.

* helps #7784

* chore: use struct for from_row

* chore: satisfy clippy

* chore: fmt
2023-06-07 15:12:48 -07:00
Carol (Nichols || Goulding) 2becc950e1
fix: Use expect rather than returning error in a theoretically impossible case 2023-06-07 11:38:12 -04:00
Carol (Nichols || Goulding) ac26ceef91
feat: Make a place to do partition template validation
- Create data_types::partition_template::ValidationError
- Make creation of NamespacePartitionTemplateOverride and
  TablePartitionTemplateOverride fallible
- Move SerializationWrapper into a module to make its inner field
  private to force creation through one fallible constructor; this is
  where the validation logic will go to be shared among all uses of
  partition templates
2023-06-07 11:38:12 -04:00
Carol (Nichols || Goulding) 7f1d4a5bcd
fix: Create tag columns used in table partition template in a transaction with table create 2023-06-05 11:21:33 -04:00
Carol (Nichols || Goulding) bf69f17b3f
test: Add checks for tag columns being added by table creation to the catalog tests
These currently fail because the implementation still only exists in the
table grpc service.
2023-06-05 10:24:45 -04:00
Marco Neumann 86a2c249ec
refactor: faster PG `ParquetFileRepo` (#7907)
* refactor: remove `ParquetFileRepo::flag_for_delete`

* refactor: batch update parquet files in catalog

* refactor: avoid data roundtrips through postgres

* refactor: do not return ID from PG when we do not need it

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-06-01 16:17:28 +00:00
Marco Neumann e1c1908a0b
refactor: add `parquet_file` PG index for querier (#7842)
* refactor: add `parquet_file` PG index for querier

Currently the `list_by_table_not_to_delete` catalog query is somewhat
expensive:

```text
iox_catalog_prod=> select table_id, sum((to_delete is NULL)::int) as n from parquet_file group by table_id order by n desc limit 5;
 table_id |  n
----------+------
  1489038 | 7221
  1489037 | 7019
  1491534 | 5793
  1491951 | 5522
  1513377 | 5339
(5 rows)

iox_catalog_prod=> EXPLAIN ANALYZE SELECT id, namespace_id, table_id, partition_id, object_store_id,
       min_time, max_time, to_delete, file_size_bytes,
       row_count, compaction_level, created_at, column_set, max_l0_created_at
FROM parquet_file
WHERE table_id = 1489038 AND to_delete IS NULL;
                                                                          QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on parquet_file  (cost=46050.91..47179.26 rows=283 width=200) (actual time=464.368..472.514 rows=7221 loops=1)
   Recheck Cond: ((table_id = 1489038) AND (to_delete IS NULL))
   Heap Blocks: exact=7152
   ->  BitmapAnd  (cost=46050.91..46050.91 rows=283 width=0) (actual time=463.341..463.343 rows=0 loops=1)
         ->  Bitmap Index Scan on parquet_file_table_idx  (cost=0.00..321.65 rows=22545 width=0) (actual time=1.674..1.674 rows=7221 loops=1)
               Index Cond: (table_id = 1489038)
         ->  Bitmap Index Scan on parquet_file_deleted_at_idx  (cost=0.00..45728.86 rows=1525373 width=0) (actual time=460.717..460.717 rows=4772117 loops=1)
               Index Cond: (to_delete IS NULL)
 Planning Time: 0.092 ms
 Execution Time: 472.907 ms
(10 rows)
```

I think this may also be because PostgreSQL kinda chooses the wrong
strategy, because it could just look at the existing index and filter
from there:

```text
iox_catalog_prod=> EXPLAIN ANALYZE SELECT id, namespace_id, table_id, partition_id, object_store_id,
       min_time, max_time, to_delete, file_size_bytes,
       row_count, compaction_level, created_at, column_set, max_l0_created_at
FROM parquet_file
WHERE table_id = 1489038;
                                                                    QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
 Index Scan using parquet_file_table_idx on parquet_file  (cost=0.57..86237.78 rows=22545 width=200) (actual time=0.057..6.994 rows=7221 loops=1)
   Index Cond: (table_id = 1489038)
 Planning Time: 0.094 ms
 Execution Time: 7.297 ms
(4 rows)
```

However PostgreSQL doesn't know the cardinalities well enough. So
let's add a dedicated index to make the querier faster.

* feat: new migration system

* docs: explain dirty migrations
2023-05-31 10:56:32 +00:00
Dom Dwyer 9e0570f2bf
refactor: explicit submod for partition_template
Move the import into the submodule itself, rather than re-exporting it
at the crate level.

This will make it possible to link to the specific module/logic.
2023-05-30 15:13:20 +02:00
Dom Dwyer 2094b45c10
refactor(catalog): mark metrics() as test only
This method is used to enable tests - it's never intended to be used in
production code to access the underlying metric registry. The Catalog
trait is responsible for Catalog things, not acting as a dependency
injection for metrics.

The only current use of this is in test code, so no changes needed.
2023-05-24 17:38:10 +02:00
Carol (Nichols || Goulding) d91b75526f
fix: Clarify that the expect is on the Option, not the Result 2023-05-24 10:36:52 -04:00
Carol (Nichols || Goulding) efc817c2a8
fix: Remove From impl, leaving TablePartitionTemplateOverride::new as only creation mechanism
This makes it clearer that you do or do not have a custom table override
(in the first argument to `new`).
2023-05-24 10:36:52 -04:00
Carol (Nichols || Goulding) 46f7e3e48a
fix: Handle potential for data race in catalog table insertion by re-fetching if detected 2023-05-24 10:36:52 -04:00
Carol (Nichols || Goulding) 90cb4b6ed9
refactor: Extract a function for handling a table missing from the namespace cache 2023-05-24 10:36:52 -04:00
Carol (Nichols || Goulding) 73b09d895f
feat: Store and handle NULL partition_template database values
Treat them as the default partition template in the application, but
save space and avoid having to backfill the tables by having the
database values be NULL when no custom template has been specified.
2023-05-24 10:36:52 -04:00
Carol (Nichols || Goulding) a22d809cdf
test: Create an overridden namespace, and create a table from it (no override), read it back and assert the expected partitioning scheme is derived 2023-05-24 10:34:30 -04:00
Carol (Nichols || Goulding) 2ab3ea03b8
test: Create a default (not overridden) namespace, read it back, assert the expected partitioning scheme is derived 2023-05-24 10:34:30 -04:00
Carol (Nichols || Goulding) 9c0faa66f0
feat: Set a table partition template explicitly or from the namespace
And use the table partition template when partitioning writes to that
table.
2023-05-24 10:34:30 -04:00
Carol (Nichols || Goulding) 604bab9508
fix: Make Table create_or_get be only create 2023-05-24 10:34:30 -04:00
Carol (Nichols || Goulding) afb3838437
feat: Optionally supply the namespace partition template when creating a namespace 2023-05-24 10:10:34 -04:00
Marco Neumann 103e814f22
refactor: clean up catalog `parquet_files` interface (#7853)
* feat: `ParquetFileRepo::list_all`

* refactor: remove `ParquetFileRepo::list_by_table`

* refactor: simlify `ParquetFileRepo::list_by_table`

* refactor: remove `ParquetFileRepo::count`

* refactor: remove `ParquetFileRepo::update_compaction_level`

* refactor: remove `ParquetFileRepo::exists`

* fix: test

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-05-24 09:15:03 +00:00
Dom Dwyer 928a4d163e
build: remove unused dependencies from crates
This commit fixes loads of crates (47!) had unused dependencies, or
mis-configured dependencies (test deps as normal deps).

I added the "unused_crate_dependencies" to all crates to help prevent
this mess from growing again!

    https://doc.rust-lang.org/beta/nightly-rustc/rustc_lint_defs/builtin/static.UNUSED_CRATE_DEPENDENCIES.html

This has the minor downside of false-positives when specifying
dev-dependencies for test/bench binaries - these are files in /test or
/benches (not normal tests). This commit includes a workaround,
importing them in lib.rs (gated by a feature flag). I think the
trade-off of better dependency management is worth it!
2023-05-23 14:55:43 +02:00
Andrew Lamb 6344fe8c3f
chore: Add rationale for `clippy::future_not_send` (#7822)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-05-18 16:58:56 +00:00
Andrew Lamb 1ff11d0856
refactor: Change catalog configuration so it is entirely dsn based / support end to end testing without postgres (#7736)
* refactor: Change catalog configuration so it is entirely dsn based / support end to end testing without postgres

Restores code from https://github.com/influxdata/influxdb_iox/pull/7708

Revert "revert: PR #7708"

This reverts commit c9cfe05f8d.

* fix: merge

* fix: Update new test
2023-05-17 13:36:25 +00:00
Jeffrey Smith II 45628ba4d3 chore: Merge remote-tracking branch 'origin/main' into smith/remove-transactions-main 2023-05-16 08:49:25 -04:00
Carol (Nichols || Goulding) 7268ea5c29
refactor: Extract a test helper function to create a basic table 2023-05-15 14:31:24 -04:00
Carol (Nichols || Goulding) 57bedb1c2d
refactor: Extract a test helper function to create a basic namespace 2023-05-15 14:20:38 -04:00
Jeffrey Smith II 09dc622078 chore: Merge remote-tracking branch 'origin/main' into smith/remove-transactions-main 2023-05-15 10:06:11 -04:00
Kaya Gökalp 5fe8affb18
refactor: accept NamespaceName with Namespace create (#7774)
Co-authored-by: Dom <dom@itsallbroken.com>
2023-05-15 10:03:55 +00:00
Jeffrey Smith II e5ee47c32e chore: code cleanup 2023-05-11 13:21:15 -05:00
Jeffrey Smith II a9ee1bae6c chore: Merge remote-tracking branch 'origin/main' into smith/remove-transactions-main 2023-05-09 13:26:02 -05:00
Carol (Nichols || Goulding) e8a480f5f6
fix: Give up ownership of Column when adding to a table
To enable reuse of existing allocations rather than borrowing, creating
new allocations, then dropping them.
2023-05-09 14:55:00 +02:00