23 Commits (73a7e6f0a55262978361dd9b632dffcc07556fd1)

Author SHA1 Message Date
Marco Neumann 86e8f05ed1
fix: make all catalog IDs 64bit (#4418)
Closes #4365.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-25 16:49:34 +00:00
kodiakhq[bot] e2439c0a4f
Merge branch 'main' into cn/sort-key-catalog 2022-04-04 16:54:48 +00:00
Dom Dwyer 61bc9c83ad refactor: add table_id index on column_name
After checking the postgres workload for the catalog in prod, this
missing index was noted as the cause of unexpectedly expensive plans for
simple queries.
2022-04-04 13:04:25 +01:00
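
To illustrate the effect described in the commit above, here is a minimal sketch. It uses Python's built-in SQLite rather than the Postgres catalog, and the table layout and the index name `column_name_table_id_idx` are assumptions made for the example, not taken from the migration. It compares the query plan for a lookup by `table_id` before and after the index exists:

```python
import sqlite3

# Hypothetical, simplified stand-in for the column_name table; the real
# catalog runs on Postgres, whose planner and plan output differ.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE column_name (id INTEGER PRIMARY KEY, table_id INTEGER, name TEXT)"
)

query = "SELECT * FROM column_name WHERE table_id = ?"


def plan(conn, sql, params):
    """Return the human-readable plan details for a query as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last field of each row holds the textual description of the step.
    return " | ".join(row[-1] for row in rows)


# Without an index the lookup has to scan the whole table.
plan_before = plan(conn, query, (1,))

# Index name is illustrative only.
conn.execute("CREATE INDEX column_name_table_id_idx ON column_name (table_id)")

# With the index the planner can search by table_id directly.
plan_after = plan(conn, query, (1,))
```

On Postgres the equivalent check would be `EXPLAIN` on the slow query before and after applying the migration.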
Carol (Nichols || Goulding) c9bc70f03a
feat: Add optional sort_key column to partition table
Connects to #4195.
2022-04-01 15:45:51 -04:00
Paul Dix 6479e1fc8e
fix: add indexes to parquet_file (#4198)
Add indexes so compactor can find candidate partitions and specific partition files quickly.
Limit the number of level 0 files returned for determining candidates. This should ensure that if compaction is very backed up, it will be able to work through the backlog without evaluating the entire world.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-01 09:59:39 +00:00
Marko Mikulicic 2c47d77a5b
fix: Backfill namespace_id in schema migration (#4177)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-30 16:31:26 +00:00
Carol (Nichols || Goulding) 5c8a80dca6
fix: Add an index to parquet_file to_delete 2022-03-29 08:15:26 -04:00
Carol (Nichols || Goulding) f3f792fd08
feat: Add namespace_id to the parquet_files table; object store paths need it 2022-03-29 08:15:26 -04:00
Carol (Nichols || Goulding) 67e13a7c34
fix: Change to_delete column on parquet_files to be a time (#4117)
Set to_delete to the time the file was marked as deleted rather than
true.

Fixes #4059.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-23 18:47:27 +00:00
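
A rough sketch of why a time is more useful than a boolean here. It assumes a simplified table in Python's built-in SQLite, not the Postgres catalog, and the helper `mark_deleted` and the integer timestamps are hypothetical. With a nullable timestamp, live files are those where `to_delete` is NULL, and files can also be selected by how long ago they were marked:

```python
import sqlite3

# Hypothetical, simplified stand-in for the parquet_file table; the real
# table has many more columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE parquet_file (id INTEGER PRIMARY KEY, to_delete INTEGER)"
)
conn.executemany("INSERT INTO parquet_file (id) VALUES (?)", [(1,), (2,), (3,)])


def mark_deleted(conn, file_id, now):
    """Record when the file was marked, instead of setting a flag to true."""
    conn.execute(
        "UPDATE parquet_file SET to_delete = ? WHERE id = ?", (now, file_id)
    )


mark_deleted(conn, 1, 100)
mark_deleted(conn, 2, 500)

# Files that are still live: to_delete was never set.
live = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM parquet_file WHERE to_delete IS NULL ORDER BY id"
    )
]

# Files marked before a cutoff; a boolean column could not answer this.
marked_before_cutoff = [
    row[0]
    for row in conn.execute(
        "SELECT id FROM parquet_file WHERE to_delete < ? ORDER BY id", (300,)
    )
]
```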
Paul Dix 27999ff72f
feat: add compaction_level and created_at to parquet_file (#3972) 2022-03-10 15:56:57 +00:00
Dom Dwyer d31576b90c perf: get_table_persist_info indexes for joins
Adds indexes to the JOINed fields to reduce execution cost, as
TableRepo::get_table_persist_info() is currently by far the most
expensive catalog operation.
2022-03-08 12:12:47 +00:00
Carol (Nichols || Goulding) 252ced7adf
feat: Add row count to the parquet_file record in the catalog (#3847)
Fixes #3842.
2022-02-24 15:20:50 +00:00
Marco Neumann d62a052394
feat: extend catalog so we can recover `ParquetChunk`s from it (#3852)
* refactor: less parquet data copying

* feat: `PartitionRepo::get_by_id`

* feat: `TableRepo::get_by_id`

* feat: `ParquetFile::file_size_bytes`

* feat: `ParquetFile::parquet_metadata`

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-24 13:16:15 +00:00
Luke Bond e19609ab7b
feat: routing service protection (#3807)
* chore: db migration for namespace table & column limits

* feat: impl table & column limits in catalog

* chore: improved comment in catalog

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-22 17:26:37 +00:00
Dom Dwyer 4d54f8b42c refactor: remove migration create schema 2022-02-17 14:41:32 +00:00
Dom Dwyer 3b378418f7 refactor: do not specify schema in migrations
Allow the caller to set the Postgres schema a migration should be
applied to, rather than restricting the migration to a specific,
hard-coded schema.

BREAKING CHANGE: manually adds a new migration that precedes the
existing migration to ensure the iox_catalog schema exists before
applying the migration. You'll probably have to drop any existing
databases and migrate from scratch:

    sqlx database drop; sqlx database create;
2022-02-17 14:15:58 +00:00
Marco Neumann 74c251febb
feat: allow IOx catalog to set itself up (no SQLx CLI required) (#3584)
* feat: allow IOx catalog to set itself up (no SQLx CLI required)

* refactor: use SQLx macro instead of hand-rolled build script
2022-01-31 15:07:38 +00:00
Paul Dix 41038721e1 feat: Add parquet file records to iox_catalog
* Adds ParquetFile and scaffolding to IOx catalog
* Changed the file_location in parquet_file to object_store_id which is a uuid
2022-01-19 14:14:54 -05:00
Paul Dix f36d66deb7 feat: Add Tombstone to Catalog
* Adds TombstoneId and Tombstone to the iox_catalog with associated interfaces
* Adds SequenceNumber new type for use with Tombstone
* Adds Timestamp new type for use with Tombstone
* Adds constraint to the Postgres schema to enforce tombstone uniqueness by table_id, sequencer_id, and sequence_number
2022-01-18 18:17:21 -05:00
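
The uniqueness constraint mentioned in the last bullet above can be sketched as follows. This is an illustration under assumptions: it uses Python's built-in SQLite instead of Postgres, and the column set is reduced to the three fields named in the commit message plus an `id`. A second tombstone with the same (table_id, sequencer_id, sequence_number) is rejected, while one that differs in any of the three fields is accepted:

```python
import sqlite3

# Hypothetical, reduced version of the tombstone table.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE tombstone (
        id INTEGER PRIMARY KEY,
        table_id INTEGER NOT NULL,
        sequencer_id INTEGER NOT NULL,
        sequence_number INTEGER NOT NULL,
        UNIQUE (table_id, sequencer_id, sequence_number)
    )
    """
)

insert = (
    "INSERT INTO tombstone (table_id, sequencer_id, sequence_number) "
    "VALUES (?, ?, ?)"
)
conn.execute(insert, (1, 1, 42))
# Same table and sequence number, different sequencer: allowed.
conn.execute(insert, (1, 2, 42))

try:
    # Exact duplicate of the first row: violates the constraint.
    conn.execute(insert, (1, 1, 42))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM tombstone").fetchone()[0]
```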
Paul Dix b796d5e2d1 fix: query pool type and sequencer create 2022-01-17 10:00:33 -05:00
Dom aa6f118487 feat: iox_catalog sequencers (#3465)
* refactor: ensure sequencers are unique

Adds a unique constraint to ensure only one sequencer record exists for
each Kafka (topic, partition).

* test: use DSN from env for integration tests

Removes the hard-coded DSN, instead sourcing it from the DATABASE_URL
environment variable.

* docs: integration testing for iox_catalog

Documents the required steps in order to run the Postgres integration
tests for the iox_catalog crate.

* feat(iox_catalog): create & list sequencers

Adds support for interacting with the "sequencer" table.

* chore: update lockfile

Running cargo in iox_catalog generates a lockfile diff.
2022-01-17 10:00:31 -05:00
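
The commit above replaces a hard-coded DSN with the DATABASE_URL environment variable. A minimal sketch of that pattern follows, written in Python for brevity. The function name `catalog_dsn` and the choice to raise an error when the variable is unset are assumptions for illustration; the actual tests may behave differently when it is missing:

```python
import os


def catalog_dsn(environ=None):
    """Return the catalog DSN from DATABASE_URL rather than a hard-coded value.

    `environ` defaults to the process environment; passing a dict makes the
    function easy to exercise without touching real environment variables.
    """
    env = os.environ if environ is None else environ
    dsn = env.get("DATABASE_URL")
    if not dsn:
        # Assumed behaviour: fail loudly instead of falling back to a default.
        raise RuntimeError("DATABASE_URL must be set to run the integration tests")
    return dsn
```

A caller would typically export DATABASE_URL in the shell before running the integration tests, so that the same test code works against any Postgres instance.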
Paul Dix 8d6d9e679f refactor: update iox_catalog
Changed to use the iox_catalog schema in Postgres rather than public.
Updated table names to be singular.
Removed the connection_string from query_pool.
2022-01-17 09:56:20 -05:00
Paul Dix 4764e71c54 feat: Add initial iox_catalog skeleton 2022-01-17 09:56:20 -05:00