* fix: do not allow operator token from being deleted
closes: https://github.com/influxdata/influxdb_pro/issues/819
* refactor: address PR feedback
* fix: add a word and clarifying colon
* fix: failing test
---------
Co-authored-by: Peter Barnett <peter.barnett03@gmail.com>
* feat: remove limit on LVC size
* fix: bad test case and incorrect info
* fix: more clarity and default value
* fix: light CLI polishes
* fix: bad snapshot
* feat: enable auth by default
- Removes `--bearer-token` support and starts the server with auth by
default.
- Adds `--without-auth` switch to start the server without any auth
* feat: changes for auth being turned off
when auth is turned off,
- disallow token endpoints (returns 405)
- remove hash column when querying tokens system table
* refactor: address PR feedback
This commit allows deletion of tokens by name. Below is an example,
`influxdb3 delete token --token-name _admin --token $CURRENT_ADMIN_TOKEN`
It needs user confirmation before proceeding with the delete
This commit adds TLS support to influxdb3 and allows users to pass in a
path to a key and cert file with the --tls-key and --tls-cert flags in
the serve command. It also adds the ability for every command to specify
a certificate authority for requests. This is mostly needed when the
cert is self signed, but there are other use cases for this.
The big thing is that most of our tests now use TLS by default. Included
are self signed certs for localhost and the the CA cert included in the
commit. Since these are *only* used for testing this should be fine to
include as they are not used in nor are they intended to be used in any
production system. The expiry has been set for 365 days and the file
perms are set to o600 like the original issue mentioned. The tests pass
with this restriction.
I've verified that the API works via curl with the self signed certs as
I did *not* need to pass in the -k option to bypass checking the certs
were valid. The same goes for our tests. They use the rootCA.pem file
to verify the self signed cert when connecting and reject it otherwise.
With this users can be confident that their queries are safely encrypted
during transport.
Note that TLS works for both FlightSQL and our normal APIs.
Closes#25774
* feat: generate persistable admin token
- this commit allows admin token creation using `influxdb3 create token
--admin` and also allows regeneration of admin token by `influxdb3
create token --admin --regenerate`
- `influxdb3_authz` crate hosts all low level token types and behaviour
- catalog log and snapshot types updated to use the token repo
- tests that relied on auth have been updated to use the new token
generation mechanism and new admin token generation/regeneration tests
have been added
* feat: list admin tokens
- allows listing admin tokens
- uses _internal db for token system table
- mostly test fixes due to _internal db
This commit restructures our tests to look like Enterprise in their
layout. We break cli.rs into it's own module, combine the server tests
and cli tests under one lib.rs file and handle the changes for
visibility and import paths needed to make things work. the packages
tests have been cfged out as a module so that it would not need to be
added on a per test basis. Note that those tests fail locally for me
currently, but it seems like we weren't testing these in CI at the
moment.
There is no issue for this.
This commit sets InfluxDB 3 Core to have a 72 hour limit for queries and
writes. What this means is that writes that contain historical data
older than 72 hours will be rejected and queries will filter out data
older than 72 hours. Core is intended to be a recent timeseries database
and performance over data older than 72 hours will degrade without a
garbage collector, a core feature of InfluxDB 3 Enterprise. InfluxDB 3
Enterprise does not have this write or query limit in place.
Note that this does *not* mean older data is deleted. Older data is
still accessible in object storage as Parquet files that can still be
used in other services and analyzed with dataframe libraries like pandas
and polars.
This commit does a few things:
- Uses timestamps in the year 2065 for tests as these should not break
for longer than many of us will be working in our lifetimes. This is
only needed for the integration tests as other tests use the
MockProvider for time.
- Filters the buffer and persisted files to only show data newer than
3 days ago
- Fixes the integration tests to work with the fact that writes older
than 3 days are rejected
_Follows #25737 (keeping in draft until that merges)_
Closes#25745
This PR provides both a CLI and underlying API for listing databases in the InfluxDB 3 Core server. Details are below.
There was already a method to list databases for the query executor for InfluxQL; this works by exposing that via the `HttpApi` in `influxdb3_server`.
However, one thing that we may address is that the query result for that uses `iox::database` as the column name. If we are removing references to `iox`, then we may want to just have it as `database`. I left it as is, for now, because I wanted to keep code churn down and wasn't sure why we use that prefix in the first place for the `SHOW DATABASES` and `SHOW RETENTION POLICIES` InfluxQL queries.
## Details
### CLI
This PR provides the `influxdb3 show` CLI:
```
influxdb3 show -h
List resources on the InfluxDB 3 Core server
Usage: influxdb3 show <COMMAND>
Commands:
databases List databases
help Print this message or the help of the given subcommand(s)
Options:
-h, --help Print help information
```
with the ability to list databases:
```
influxdb3 show databases -h
List databases
Usage: influxdb3 show databases [OPTIONS]
Options:
-H, --host <HOST_URL> The host URL of the running InfluxDB 3 Core server [env: INFLUXDB3_HOST_URL=] [default: http://127.0.0.1:8181]
--token <AUTH_TOKEN> The token for authentication with the InfluxDB 3 Core server [env: INFLUXDB3_AUTH_TOKEN=]
--show-deleted Include databases that were marked as deleted in the output
--format <OUTPUT_FORMAT> The format in which to output the list of databases [default: pretty] [possible values: pretty, json, json_lines, csv]
-h, --help Print help information
```
Since this uses the query executor, we can pass a `--format` argument to get the output as JSON, CSV, or JSONL, but by default, it uses the `pretty` format:
```
influxdb3 show databases
+---------------+
| iox::database |
+---------------+
| bar |
+---------------+
```
The `--show-deleted` flag will have the `deleted` column displayed as well as any databases that have been marked as deleted:
```
influxdb3 show databases --show-deleted
+---------------------+---------+
| iox::database | deleted |
+---------------------+---------+
| bar | false |
| foo-20250105T202949 | true |
+---------------------+---------+
```
### API
The API to list databases can be invoked via:
```
GET /api/v3/configure/database
```
with optional parameters:
* `format`: `pretty`, `json`, `csv`, `parquet`, or `jsonl`
* `show_deleted`: `bool`, defaults to `false`
Note that `database` is singular in the API endpoint, to be consistent with the other database related create/delete API endpoints. We could change it to be plural `databases` if that is the convention we want to go with.
This commit removes the required fields restriction when using the CLI
or the API to create a new table. As users can't write via the line
protocol without a field this is fine and the schema will be updated on
write. This expands the test to check for the correct response code now
and make sure that we can both query the empty table and write new data
to it.
Closes#25735
* feat: create DB and Tables via REST and CLI
This commit does a few things:
1. It brings the database command naming scheme for types inline with
the rest of the CLI types
2. It brings the table command naming scheme for types inline with
the rest of the CLI types
3. Adds tests to check that the num of dbs is not exceeded and that you
cannot create more than one database with a given name.
4. Adds tests to check that you can create a table and put data into it
and querying it
5. Adds tests for the CLI for both the database and table commands
6. It creates an endpoint to create databases given a JSON blob
7. It creates an endpoint to create tables given a JSON blob
With this users can now create a database or table without first needing
to write to the database via the line protocol!
Closes#25640Closes#25641
This commit allows deleting (soft) a table. For an user, following
command will allow soft deleting a table (bar) in db (foo)
```
influxdb3 table delete --dbname foo --table bar --host $host
```
- Added `soft_delete_table` to `DatabaseManager` trait, which already
hosts `soft_delete_database` method. The code roughly follows the same
flow as db delete. Although like db schema, it does clone on write
because the reference is behind an Arc, `Arc::make_mut` is used in
this change.
- Moved db delete related cli parser under "manage" module that has both
db and table delete functionality
- Some minor tidyups (removing unused methods, renaming method so that
the order in name matches actual return type eg. `table_id_and_schema`,
should return (id, schema) and not (schema, id))
closes: https://github.com/influxdata/influxdb/issues/25561
* feat: drop/delete database
This commit allows soft deletion of database using `influxdb3 database
delete <db_name>` command. The write buffer and last value cache are
cleared as well.
closes: https://github.com/influxdata/influxdb/issues/25523
* feat: reuse same code path when deleting database
- In previous commit, the deletion of database immediately triggered
clearing last cache and query buffer. But on restarts same logic had
to be repeated to allow deleting database when starting up. This
commit removes immediate deletion by explicitly calling necessary
methods and moves the logic to `apply_catalog_batch` which already
applies `CatalogOp` and also clearing cache and buffer in
`buffer_ops` method which has hooks to call other places.
closes: https://github.com/influxdata/influxdb/issues/25523
* feat: use reqwest query api for query param
Co-authored-by: Trevor Hilton <thilton@influxdata.com>
* feat: include deleted flag in DatabaseSnapshot
- `DatabaseSchema` serialization/deserialization is delegated to
`DatabaseSnapshot`, so the `deleted` flag should be included in
`DatabaseSnapshot` as well.
- insta test snapshots fixed
closes: https://github.com/influxdata/influxdb/issues/25523
* feat: address PR comments + tidy ups
---------
Co-authored-by: Trevor Hilton <thilton@influxdata.com>
Closes#25461
_Note: the first three commits on this PR are from https://github.com/influxdata/influxdb/pull/25492_
This PR makes the switch from using names for columns to the use of `ColumnId`s. Where column names are used, they are represented as `Arc<str>`. This impacts most components of the system, and the result is a fairly sizeable change set. The area where the most refactoring was needed was in the last-n-value cache.
One of the themes of this PR is to rely less on the arrow `Schema` for handling the column-level information, and tracking that info in our own `ColumnDefinition` type, which captures the `ColumnId`.
I will summarize the various changes in the PR below, and also leave some comments in-line in the PR.
## Switch to `u32` for `ColumnId`
The `ColumnId` now follows the `DbId` and `TableId`, and uses a globally unique `u32` to identify all columns in the database. This was a change from using a `u16` that was only unique within the column's table. This makes it easier to follow the patterns used for creating the other identifier types when dealing with columns, and should reduce the burden of having to manage the state of a table-scoped identifier.
## Changes in the WAL/Catalog
* `WriteBatch` now contains no names for tables or columns and purely uses IDs
* This PR relies on `IndexMap` for `_Id`-keyed maps so that the order of elements in the map is consistent. This has important implications, namely, that when iterating over an ID map, the elements therein will always be produced in the same order which allows us to make assertions on column order in a lot of our tests, and allows for the re-introduction of `insta` snapshots for serialization tests. This map type provides O(1) lookups, but also provides _fast_ iteration, which should help when serializing these maps in write batches to the WAL.
* Removed the need to serialize the bi-directional maps for `DatabaseSchema`/`TableDefinition` via use of `SerdeVecMap` (see comments in-line)
* The `tables` map in `DatabaseSchema` no stores an `Arc<TableDefinition>` so that the table definition can be shared around more easily. This meant that changes to tables in the catalog need to do a clone, but we were already having to do a clone for changes to the DB schema.
* Removal of the `TableSchema` type and consolidation of its parts/functions directly onto `TableDefinition`
* Added the `ColumnDefinition` type, which represents all we need to know about a column, and is used in place of the Arrow `Schema` for column-level meta-info. We were previously relying heavily on the `Schema` for iterating over columns, accessing data types, etc., but this gives us an API that we have more control over for our needs. The `Schema` is still held at the `TableDefinition` level, as it is needed for the query path, and is maintained to be consistent with what is contained in the `ColumnDefinition`s for a table.
## Changes in the Last-N-Value Cache
* There is a bigger distinction between caches that have an explicit set of value columns, and those that accept new fields. The former should be more performant.
* The Arrow `Schema` is managed differently now: it used to be updated more than it needed to be, and now is only updated when a row with new fields is pushed to a cache that accepts new fields.
## Changes in the write-path
* When ingesting, during validation, field names are qualified to their associated column ID
Adds an API for deleting last caches.
- The API allows parameters to be passed in either the request URI query string, or in the body as JSON
- Some additional error modes were handled, specifically, for better HTTP status code responses, e.g., invalid content type is now a 415, URL query string parsing errors are now 400
- An end-to-end test was added to check behaviour of the API
Closes#25096
- Adds a new HTTP API that allows the creation of a last cache, see the issue for details
- An E2E test was added to check success/failure behaviour of the API
- Adds the mime crate, for parsing request MIME types, but this is only used in the code I added - we may adopt it in other APIs / parts of the HTTP server in future PRs