2d46a364dc
This commit adds initial support for "soft" namespace deletion, where the actual records & data remain, but are no longer queryable / writeable.

Soft deletion is eventually consistent - users can expect to continue writing to and reading from a bucket after issuing a soft delete call, until the various components either restart, or have their caches flushed.

The components treat soft-deleted namespaces differently:

* router: ignore soft deleted namespaces
* ingester: accept soft deleted namespaces
* compactor: accept soft deleted namespaces
* querier: ignore soft deleted namespaces
* various gRPC services: ignore soft deleted namespaces

This ensures that the ingester & compactor do not see rows "vanishing" from the database, and continue to make forward progress.

Writes for the deleted namespace that are buffered in the ingester will be persisted as normal, allowing us to support "un-delete" operations where the system is restored to the state at which the delete was issued (rather than losing the buffered data).

Follow-on work is required to ensure GC drops the orphaned parquet files after the configured GC time, and optimisations such as not compacting parquet from soft-deleted namespaces seem like a trivial win.
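The per-component behaviour described in this commit message can be sketched roughly as follows. This is an illustration only, not the code from the commit: the `Component` enum, the `Namespace` struct, its `deleted_at` field, and both function names are hypothetical and were chosen just to mirror the list of components.

```rust
/// Hypothetical component names, taken from the list in the commit message.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Component {
    Router,
    Ingester,
    Compactor,
    Querier,
    GrpcService,
}

/// Hypothetical stand-in for a catalog namespace row. `deleted_at` is an
/// assumed marker: `Some(timestamp)` once the namespace has been soft-deleted.
/// The record itself is never removed.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Namespace {
    pub name: String,
    pub deleted_at: Option<i64>,
}

impl Component {
    /// The ingester and compactor keep accepting soft-deleted namespaces so
    /// that rows do not appear to vanish underneath them. Everything else
    /// ignores soft-deleted namespaces.
    pub fn accepts_soft_deleted(self) -> bool {
        matches!(self, Component::Ingester | Component::Compactor)
    }
}

/// Return the namespaces that a given component should act on.
pub fn visible_namespaces(component: Component, all: &[Namespace]) -> Vec<&Namespace> {
    all.iter()
        .filter(|ns| ns.deleted_at.is_none() || component.accepts_soft_deleted())
        .collect()
}
```

Keeping the row and filtering on a marker, rather than deleting the row, is what would make an "un-delete" possible in a model like this: clearing the marker makes the namespace visible to every component again.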
README.md
IOx Catalog
This crate contains the code for the IOx Catalog. This includes the definitions of namespaces, their tables, the columns of those tables and their types, what Parquet files are in object storage and delete tombstones. There's also some configuration information that the overall distributed system uses for operation.
To run this crate's tests you'll need Postgres installed and running locally. You'll also need to set the INFLUXDB_IOX_CATALOG_DSN environment variable so that sqlx will be able to connect to your local DB. For example, with user and password filled in:
INFLUXDB_IOX_CATALOG_DSN=postgres://<postgres user>:<postgres password>@localhost/iox_shared
You can omit the host part if your Postgres is running on the default Unix domain socket (useful on macOS because, by default, the config installed by brew install postgres doesn't listen on a TCP port):
INFLUXDB_IOX_CATALOG_DSN=postgres:///iox_shared
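The two DSN shapes above differ only in whether a user, password, and host appear before the database name. As a small sketch of that difference, the helpers below build both forms. The function names are hypothetical and not part of this crate, and the code assumes the user and password contain no characters that would need URL escaping.

```rust
/// Build a DSN for a Postgres server reached over TCP, e.g.
/// `postgres://<user>:<password>@localhost/iox_shared`.
/// Assumption: `user` and `password` need no percent-encoding.
pub fn tcp_dsn(user: &str, password: &str, host: &str, database: &str) -> String {
    format!("postgres://{}:{}@{}/{}", user, password, host, database)
}

/// Build a DSN with the host part omitted, so the client falls back to the
/// default Unix domain socket, e.g. `postgres:///iox_shared`.
pub fn socket_dsn(database: &str) -> String {
    format!("postgres:///{}", database)
}
```

Note the three slashes in the socket form: the authority section between `//` and the next `/` is simply empty.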
You'll then need to create the database. You can do this via the sqlx command line.
cargo install sqlx-cli
DATABASE_URL=<dsn> sqlx database create
cargo run -q -- catalog setup
cargo run -- catalog topic update iox-shared
This will set up the database based on the files in ./migrations in this crate. SQLx also creates a table to keep track of which migrations have been run.
NOTE: do not use sqlx database setup, because that will create the migration table in the wrong schema (namespace). Our catalog setup code will do that part by using the same sqlx migration module but with the right namespace setup.
Migrations
If you need to create and run migrations to add, remove, or change the schema, you'll need the sqlx-cli tool. Install with cargo install sqlx-cli if you haven't already, then run sqlx migrate --help to see the commands relevant to migrations.
Tests
To run the Postgres integration tests, ensure the above setup is complete first.
CAUTION: existing data in the database is dropped when tests are run, so you should use a DIFFERENT database name for your test database than your INFLUXDB_IOX_CATALOG_DSN database.
- Set the TEST_INFLUXDB_IOX_CATALOG_DSN=<testdsn> env var, in the same way as the INFLUXDB_IOX_CATALOG_DSN env var above. The integration tests will pick up this value if set in your .env file.
- Set TEST_INTEGRATION=1
- Run cargo test -p iox_catalog
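Because the tests drop existing data, it can help to check the environment before running them. The sketch below shows one way such a check might look. It is not the crate's actual test harness: `TestPlan`, `plan_integration_tests`, and `plan_from_env` are hypothetical names, and skipping when TEST_INTEGRATION is unset is an assumption about the intended behaviour.

```rust
/// Hypothetical outcome of inspecting the test environment.
#[derive(Debug, PartialEq, Eq)]
pub enum TestPlan {
    /// TEST_INTEGRATION is not set (assumed to mean: skip integration tests).
    Skip,
    /// Run the integration tests against this DSN.
    Run { dsn: String },
}

/// Decide whether to run integration tests, given the values of
/// TEST_INTEGRATION, TEST_INFLUXDB_IOX_CATALOG_DSN and INFLUXDB_IOX_CATALOG_DSN.
pub fn plan_integration_tests(
    test_integration: Option<&str>,
    test_dsn: Option<&str>,
    main_dsn: Option<&str>,
) -> Result<TestPlan, String> {
    if test_integration.is_none() {
        return Ok(TestPlan::Skip);
    }
    let dsn = test_dsn
        .ok_or_else(|| "TEST_INFLUXDB_IOX_CATALOG_DSN must be set".to_string())?;
    // Tests drop existing data, so refuse to reuse the main database.
    if Some(dsn) == main_dsn {
        return Err("test DSN must differ from INFLUXDB_IOX_CATALOG_DSN".to_string());
    }
    Ok(TestPlan::Run { dsn: dsn.to_string() })
}

/// Convenience wrapper that reads the three variables from the process
/// environment.
pub fn plan_from_env() -> Result<TestPlan, String> {
    let flag = std::env::var("TEST_INTEGRATION").ok();
    let test_dsn = std::env::var("TEST_INFLUXDB_IOX_CATALOG_DSN").ok();
    let main_dsn = std::env::var("INFLUXDB_IOX_CATALOG_DSN").ok();
    plan_integration_tests(flag.as_deref(), test_dsn.as_deref(), main_dsn.as_deref())
}
```

The comparison is a plain string match, so it only catches an identical DSN. Two different DSN strings that point at the same database would still pass this check.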
Schema namespace
All iox catalog tables are created in an iox_catalog schema. Remember to set the schema search path when accessing the database with psql.
There are several ways to set the default search path, depending on whether you want to do it for your session, for the database or for the user.
Setting a default search path for the database or user may interfere with tests (e.g. it may make some tests pass when they should fail). The safest option is to set the search path on a per-session basis. As always, there are a few ways to do that:
- you can type set search_path to public,iox_catalog; inside psql.
- you can add the command from the first option to your ~/.psqlrc
- or you can just pass it as a CLI argument with:
  psql 'dbname=iox_shared options=-csearch_path=public,iox_catalog'
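The per-session options above all carry the same comma-separated schema list, either as a SQL statement or inside the connection string. A minimal sketch that generates both strings from one list follows. The helper names are hypothetical, and the code assumes schema names need no quoting.

```rust
/// Build the SQL statement to type inside psql (or place in ~/.psqlrc).
/// Assumption: schema names are plain identifiers that need no quoting.
pub fn set_search_path_sql(schemas: &[&str]) -> String {
    format!("set search_path to {};", schemas.join(","))
}

/// Build the connection string to pass to psql as a single CLI argument.
pub fn psql_conninfo(dbname: &str, schemas: &[&str]) -> String {
    format!("dbname={} options=-csearch_path={}", dbname, schemas.join(","))
}
```

For example, `psql_conninfo("iox_shared", &["public", "iox_catalog"])` yields the argument shown in the last option, which should be quoted as one word on the shell command line.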