Commit Graph

65 Commits (main)

Author SHA1 Message Date
Trevor Hilton c7854363c4
chore: back-port changes to shutdown code from enterprise (#26206)
* refactor: make ShutdownManager Clone

ShutdownManager can be cloned since its underlying types from tokio are
all shareable via clone.

* refactor: make ShutdownToken not Clone

Alters the API so that the ShutdownToken is not cloneable. This will help
ensure that the Drop implementation is invoked from the correct place.
2025-04-01 11:32:23 -04:00
Trevor Hilton 24887770ef
feat: shutdown on WAL overwritten (#26203)
* feat: trigger shutdown if wal has been overwritten

WAL persist uses PutMode::Create in order to invoke shutdown if another
process writes to the WAL ahead of it.

A test was added to check that it works from CLI test suite.

* chore: clippy
2025-03-31 12:58:18 -04:00
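
The create-only write described in #26203 can be illustrated with a minimal in-memory sketch. This is not the object store API used by the WAL: `MemStore`, `put_create`, and `PutError` are hypothetical names, and a real server would trigger shutdown at the point where this sketch returns an error.

```rust
use std::collections::HashMap;

/// Hypothetical error returned when a create-only put finds an existing object.
#[derive(Debug, PartialEq)]
pub enum PutError {
    AlreadyExists(String),
}

/// Hypothetical in-memory stand-in for an object store.
#[derive(Default)]
pub struct MemStore {
    objects: HashMap<String, Vec<u8>>,
}

impl MemStore {
    /// Create-only put: succeeds only if nothing is stored at `path` yet.
    pub fn put_create(&mut self, path: &str, data: Vec<u8>) -> Result<(), PutError> {
        if self.objects.contains_key(path) {
            // Another writer got there first; the caller would shut down here.
            return Err(PutError::AlreadyExists(path.to_string()));
        }
        self.objects.insert(path.to_string(), data);
        Ok(())
    }
}
```

A second `put_create` to the same path fails, which is the signal that another process has written to the WAL ahead of this one.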
Trevor Hilton eda2fc9b21
refactor: ensure shutdown complete via Drop impl (#26202)
This ensures a ShutdownToken will invoke `complete` by calling it from
its `Drop` implementation. This means registered components are not
required to signal completion, but can if needed.

Some comments and other cleanup refactoring was done as well.
2025-03-31 10:25:55 -04:00
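
The Drop-based completion from #26202 can be sketched roughly as below. This is an illustration under assumptions, not the `influxdb3_shutdown` implementation: the `Token` type is hypothetical, and it uses a std `AtomicBool` as the completion signal where the real crate relies on tokio types.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Hypothetical token handed to a registered component.
/// Deliberately not `Clone`, so completion is signalled from one place only.
pub struct Token {
    done: Arc<AtomicBool>,
}

impl Token {
    pub fn new(done: Arc<AtomicBool>) -> Self {
        Self { done }
    }

    /// Explicitly signal completion; optional, because `Drop` does it too.
    pub fn complete(&self) {
        self.done.store(true, Ordering::SeqCst);
    }
}

impl Drop for Token {
    fn drop(&mut self) {
        // Completion is signalled even if the component never calls `complete`.
        self.complete();
    }
}
```

Because the token is not cloneable, there is exactly one owner whose drop marks the component as finished.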
Trevor Hilton 9401137825
feat: handle graceful shutdown (#26197)
* feat: add influxdb3_shutdown crate

provides basic wait methods for unix/windows OS's

* feat: graceful shutdown

* docs: add rust docs and test to influxdb3_shutdown

Added rustdoc comments to types and methods in the influxdb3_shutdown
crate as well as a test that shows the ordering of a shutdown.
2025-03-31 09:58:40 -04:00
Trevor Hilton b6cb6dd51e
chore: back-port catalog debug log cleanup from enterprise (#26128)
* chore: back-port debug log cleanup for catalog

* chore: back-port debug log cleanup for wal

* chore: back-port debug log cleanup for write
2025-03-12 13:20:21 -04:00
Trevor Hilton 72dc4458fd
chore: backport changes to catalog from enterprise (#26116)
* chore: backport changes to influxdb3_catalog crate

* chore: backport changes to influxdb3_cache crate

* chore: backport changes to influxdb3_write crate

* chore: backport changes to influxdb3_proc_eng crate

* chore: backport influxdb3 crate changes for catalog

* chore: backport changes to influxdb3_id crate

* chore: backport changes to influxdb3_wal crate

* chore: backport changes to influxdb3_clap_blocks crate

* chore: backport changes to influxdb3_client crate

* chore: backport influxdb3_server crate changes

* chore: fix after full backport

* fix: ordering of catalog broadcast
2025-03-11 12:11:51 -04:00
Jackson Newhouse c930d9eef8
feat(processing_engine): error handling for triggers. (#26086) 2025-03-04 09:32:58 -08:00
Michael Gattozzi 1f72bfcc33
feat: Update to Rust 1.85 and 2024 Edition (#26046) 2025-02-20 14:58:07 -05:00
Jackson Newhouse b0a24220c0
feat(processing_engine): Allow async plugin execution. (#25994) 2025-02-13 09:08:19 -08:00
praveen-influx 3058140faf
feat: remove unnecessary item in log (#25974) 2025-02-05 21:45:46 +00:00
Jackson Newhouse d9dd8a32a2
fix(processing_engine): Use the configured request path for Request plugins. (#25945) 2025-01-31 10:36:47 -08:00
Paul Dix d49276a7fb
feat: Refactor plugins to only require creating trigger (#25914)
This refactors plugins and triggers so that plugins no longer need to be "created". Since plugins exist in either the configured local directory or on the GitHub repo, a user now only needs to create a trigger and reference the plugin filename.

Closes #25876
2025-01-27 11:26:46 -05:00
Michael Gattozzi 43e186d761
feat: add no_sync write_lp param for fast writes (#25902) 2025-01-24 13:34:38 -05:00
Trevor Hilton d451ef0de6
refactor: writer-id to node-id (#25905) 2025-01-23 18:09:24 -05:00
Jackson Newhouse f1ea2d8747
feat(processing_engine): Add every mode for scheduled plugins. (#25891) 2025-01-23 11:22:57 -08:00
Trevor Hilton 44ca7a4d36
refactor: reduce catalog locks when getting chunks (#25896)
* refactor: reduce catalog locks when getting chunks

The main refactor was to change the ChunkContainer trait to use the
DatabaseSchema and TableDefinition types directly in the signature, vs.
the names, which then required an additional catalog lock and lookups for
both entities. This was already handled upstream in the QueryTable, so
there was no need to do the lookups again.

This required the addition of a test helper in influxdb3_write::test_helpers
that provides convenience methods for getting record batches from the
WriteBuffer. We have been implementing such a method manually in several
places, so this is nice to have it unified. This provides a blanket impl
so that anything implementing WriteBuffer gets the method.

Some other house cleaning was included.

* refactor: clean up test helpers in influxdb3_write

* refactor: pass original df filters forward with ChunkFilter

* chore: clippy
2025-01-22 14:38:46 -05:00
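
The blanket impl mentioned in #25896 follows a common Rust pattern, sketched here with simplified stand-ins. `Buffer`, `BufferTestHelpers`, `row_count`, and `FixedBuffer` are hypothetical names for illustration; the real helper works with the WriteBuffer trait and record batches.

```rust
/// Hypothetical, simplified stand-in for the WriteBuffer trait.
pub trait Buffer {
    fn rows(&self, table: &str) -> Vec<String>;
}

/// Hypothetical test-helper trait with a convenience method.
pub trait BufferTestHelpers {
    fn row_count(&self, table: &str) -> usize;
}

/// Blanket impl: every type implementing `Buffer` gets the helper for free.
impl<T: Buffer + ?Sized> BufferTestHelpers for T {
    fn row_count(&self, table: &str) -> usize {
        self.rows(table).len()
    }
}

/// Example implementor used only for illustration.
pub struct FixedBuffer(pub Vec<String>);

impl Buffer for FixedBuffer {
    fn rows(&self, _table: &str) -> Vec<String> {
        self.0.clone()
    }
}
```

With this shape, tests no longer need to reimplement the helper for each buffer type.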
Paul Dix e87f8862b3
feat: Add request plugin capability (#25864)
* feat: Add request plugin capability

Adds the request plugin type. Triggers can be bound to an API endpoint at /api/v3/engine/<path>. Requests will get yielded to the plugin with the query parameters, request parameters, and request body.

I didn't implement the test endpoint for this plugin type as it seems much more natural for users to save the file and make a new request. Once #25863 is done it'll make it very easy.

Closes #25862

* chore: fix spelling in error message
2025-01-21 20:22:27 -08:00
Jackson Newhouse 1d8d3d66fc
feat(processing_engine): Add cron plugins and triggers to the processing engine. (#25852)
* feat(processing_engine): Add cron plugins and triggers to the processing engine.

* feat(processing_engine): switch from 'cron plugin' to 'schedule plugin', use TimeProvider.

* feat(processing_engine): add test for test scheduled plugin.
2025-01-18 07:18:18 -05:00
praveen-influx 6ebbf26763
refactor: update tests for wal file removal (#25846)
* refactor: update tests for wal file removal

- update the last wal file seen first so that removal doesn't
  wait for one more cycle
- added the worked out example test
- minor tidy ups (introduce inner so that block scopes are delegated)

* refactor: address PR feedback
2025-01-16 18:47:30 +00:00
praveen-influx 4eccc38129
fix: reproducer for the empty snapshot file issue (#25835)
* fix: reproducer for the empty snapshot file issue

* fix: avoid creating empty (0 dbs) snapshot file
2025-01-15 20:01:57 +00:00
Trevor Hilton db24a62658
refactor: change host-id to writer-id (#25804)
This changes the CLI arg `host-id` to `writer-id` to more accurately
indicate meaning.

This change also goes through the codebase and changes struct fields,
methods, and variables to use the term `writer_id` or `writer_identifier_prefix`
instead of `host_id` etc., to make the meaning clear in the code.

This also changes the catalog serialization to use the field `writer_id`
instead of `host_id`, which is a breaking change.
2025-01-12 11:40:47 -05:00
Paul Dix 491a37b0d4
feat: Update create plugin to use server file (#25803)
This updates the create plugin API and CLI so that it doesn't take the plugin code, but instead takes a file name of a file that must be in the plugin-dir of the server. It returns an error if the plugin-dir is not configured or if the file isn't there.

Also updates the WAL and catalog so that it doesn't store the plugin code directly. The code is read from disk one time when the plugin runs.

Closes #25797
2025-01-11 21:02:51 -05:00
praveen-influx 50963443a4
feat: introduce num wal files to keep (#25801)
* feat: introduce num wal files to keep

This commit allows a configurable number of wal files to be left behind
in OS. This is necessary as enterprise replicas rely on these files.

closes: https://github.com/influxdata/influxdb/issues/25788

* refactor: address PR feedback

* refactor: address PR comment
2025-01-12 00:33:13 +00:00
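
The retention idea in #25801 can be sketched as a small selection function. This is an assumption about the general approach, not the actual WAL cleanup code: `wal_files_to_delete` and its parameters are hypothetical names, and the input is assumed to be sorted oldest first.

```rust
/// Given WAL file sequence numbers in ascending order, return the ones that
/// may be deleted while leaving the newest `num_to_keep` files in place.
pub fn wal_files_to_delete(sorted_files: &[u64], num_to_keep: usize) -> &[u64] {
    // `saturating_sub` avoids underflow when fewer files exist than are kept.
    let delete_count = sorted_files.len().saturating_sub(num_to_keep);
    &sorted_files[..delete_count]
}
```

When there are fewer files than `num_to_keep`, nothing is selected for deletion.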
Trevor Hilton 0bdc2fa953
chore: patch enterprise back to core (#25798) 2025-01-11 17:26:41 -05:00
Trevor Hilton 1ff4f76896
feat: only load wal files after most recent snapshot (#25787) 2025-01-11 10:27:58 -05:00
Paul Dix e8422a240a
feat: Wire up arguments to wal plugin trigger (#25783)
This allows the user to specify arguments that will be passed to each execution of a wal plugin trigger. The CLI test was updated to check this end to end.

Closes #25655
2025-01-10 16:58:18 -05:00
Paul Dix 0da0785960
feat: Finish wiring up WAL plugin with trigger (#25781)
This updates the WAL so that it can have new file notifiers added that will get updated when the wal flushes. The processing engine now implements the WALNotifier trait.

I've updated the CLI test for creating a trigger to run an end-to-end test that defines a plugin, creates a trigger, writes data into the database, triggering the plugin, which writes summary statistics back into the database in a different table. The test queries the destination table to confirm that the plugin worked.
2025-01-10 12:56:49 -05:00
Trevor Hilton c71dafc313
refactor: rename metadata cache to distinct value cache (#25775) 2025-01-10 08:48:51 -05:00
Paul Dix 7230148b58
feat: Update WAL plugin for new structure (#25777)
* feat: Update WAL plugin for new structure

This ended up being a very large change set. In order to get around circular dependencies, the processing engine had to be moved into its own crate, which I think is ultimately much cleaner.

Unfortunately, this required changing a ton of things. There's more testing and things to add on to this, but I think it's important to get this through and build on it.

Importantly, the processing engine no longer resides inside the write buffer. Instead, it is attached to the HTTP server. It is now able to take a query executor, write buffer, and WAL so that the full range of functionality of the server can be exposed to the plugin API.

There are a bunch of system-py feature flags littered everywhere, which I'm hoping we can remove soon.

* refactor: PR feedback
2025-01-10 05:52:33 -05:00
praveen-influx aa9213c4f4
feat: check mem and force snapshot (#25767)
This commit allows checking memory in the background and force
snapshotting if query buffer size is > mem threshold. This hooks into
the function (`force_flush_buffer`) to achieve it.

closes: https://github.com/influxdata/influxdb/issues/25685
2025-01-09 18:40:14 +00:00
praveen-influx 6e2e39cd4c
feat: snapshot when wal buffer is empty (#25765)
* feat: snapshot when wal buffer is empty

- This commit changes the functionality to allow snapshots to happen even when
  wal buffer is empty. For snapshots wal periods are still required but
  not the wal buffer. To allow this, we write a no-op into wal file with
  snapshot details. This enables force snapshotting functionality

closes: https://github.com/influxdata/influxdb/issues/25685

* refactor: address PR feedback
2025-01-09 12:12:37 +00:00
Michael Gattozzi f793d31f63
feat: Cleanup CLI flags for InfluxDB 3 Core (#25737)
This makes quite a few major changes to our CLI and how users interact
with it:

1. All commands are now in the form <verb> <noun>, to make the
   commands consistent. We had last-cache as a noun, but serve as a
   verb in the top level. Given that we could only create or delete,
   all noun-based commands have been moved under a create and delete
   command
2. --host short form is now -H not -h which is reassigned to -h/--help
   for shorter help text and is in line with what users would expect
   for a CLI
3. Only the needed items from clap_blocks have been moved into
   `influxdb3_clap_blocks` and any IOx specific references were changed
   to InfluxDB 3 specific ones
4. References to InfluxDB 3.0 OSS have been changed to InfluxDB 3 Core
   in our CLI tools
5. --dbname has been changed to --database to be consistent with --table
   in many commands. The short -d flag still remains. In the create/
   delete command for the database however the name of the database is
   a positional arg

   e.g. `influxdb3 create database foo` rather than
        `influxdb3 database create --dbname foo`
6. --table has been removed from the delete/create command for tables
   and is now a positional arg much like database
7. clap_blocks was removed as dependency to avoid having IOx specific
   env vars
8. --cache-name is now an optional positional arg for last_cache and meta_cache
9. last-cache/meta-cache commands are now last_cache and meta_cache respectively

Unfortunately we have quite a few options to run the software and I
couldn't cut down on them, but at least with this commands and options
will be more discoverable and we have full control over our CLI options
now.

Closes #25646
2025-01-06 18:51:55 -05:00
Paul Dix 74605ffa5d
feat: Add info log to wal replay (#25752) 2025-01-06 13:22:26 -05:00
Jackson Newhouse 7aa8f41268
feat(processing_engine): Add CLI support for plugins and triggers (#25731) 2025-01-03 12:11:52 -08:00
Jackson Newhouse 29dacc318a
feat(processing_engine): Add REST API endpoints for activating and deactivating triggers. (#25711) 2025-01-02 09:23:18 -08:00
Jackson Newhouse 0db71b69b9
fix(catalog): consistent ordering of catalog operations (#25690) 2024-12-20 15:17:38 -08:00
Jackson Newhouse 8bfccb74ab
feat(processing_engine): Runtime and write-back improvements (#25672)
* Move processing engine invocation to a separate tokio task.
* Support writing back line protocol from python via insert_line_protocol().
* Update structs to work with bincode.
2024-12-17 16:38:12 -08:00
Jackson Newhouse 486d79d801
feat(processing_engine): initial implementation of Processing Engine plugins and triggers (#25639) 2024-12-13 14:11:38 -08:00
Michael Gattozzi 9292a3213d
feat: Significantly decrease startup times for WAL (#25643)
* feat: add startup time to logging output

This change adds a startup time counter to the output when starting up
a server. The main purpose of this is to verify whether the impact of
changes actually speeds up the loading of the server.

* feat: Significantly decrease startup times for WAL

This commit does a few important things to speedup startup times:
1. We avoid changing an Arc<str> to a String with the series key as the
   From<String> impl will call with_column which will then turn it into
   an Arc<str> again. Instead we can just call `with_column` directly
   and pass in the iterator without also collecting into a Vec<String>
2. We switch to using bitcode as the serialization format for the WAL.
   This significantly reduces startup time as this format is faster to
   use instead of JSON, which was eating up massive amounts of time.
   Part of this change involves not using the tag feature of serde as
   it's currently not supported by bincode
3. We also parallelize reading and deserializing the WAL files before
   we then apply them in order. This reduces time waiting on IO and we
   eagerly evaluate each spawned task in order as much as possible.

This gives us about a 189% speedup over what we were doing before.

Closes #25534
2024-12-12 11:27:51 -05:00
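
Point 3 of #25643 (read and deserialize in parallel, then apply in order) can be sketched as follows. This is a simplified illustration, not the WAL replay code: it uses std threads where the server would presumably use async tasks, and `decode` and `load_in_order` are hypothetical names with a toy decoding step.

```rust
use std::thread;

/// Hypothetical decoder: turns one WAL file's bytes into a list of ops.
fn decode(bytes: Vec<u8>) -> Vec<u32> {
    bytes.into_iter().map(u32::from).collect()
}

/// Decode every file on its own thread, then apply results in file order.
pub fn load_in_order(files: Vec<Vec<u8>>) -> Vec<u32> {
    // Spawn all decoding work up front so files are processed concurrently.
    let handles: Vec<_> = files
        .into_iter()
        .map(|bytes| thread::spawn(move || decode(bytes)))
        .collect();

    // Join in spawn order, so ops are applied in the original file order
    // regardless of which thread finishes first.
    let mut applied = Vec::new();
    for handle in handles {
        applied.extend(handle.join().expect("decode thread panicked"));
    }
    applied
}
```

The ordering guarantee comes from joining the handles in the order they were spawned, not from the order in which the work completes.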
Trevor Hilton 9b87cd7a65
refactor: move last cache to influxdb3_cache crate (#25620)
Moved all of the last cache implementation into the `influxdb3_cache`
crate. This also splits out the implementation into three modules:
- `cache.rs`: the core cache implementation
- `provider.rs`: the cache provider used by the database to hold multiple
  caches.
- `table_function.rs`: same as before, holds the DataFusion impls

Tests were preserved and moved to `mod.rs`, however, they were updated to
not rely on the WriteBuffer implementation, and instead use the types in
the `influxdb3_cache::last_cache` module directly. This simplified the
test code, while not changing any of the test assertions at all.
2024-12-05 14:04:25 -05:00
Trevor Hilton 0daa3f2f1d
feat: track persist time in wal file content (#25614) 2024-12-03 15:37:43 -05:00
Michael Gattozzi d2fbd65a44
feat: Deny extra tags on write APIs (#25596)
This commit does three important major changes:

1. We will deny writes to the v1, v2, and v3 write apis that add new tags in
   subsequent writes after the first write
2. We make every table have a series key by default now
3. We enforce sorting order by the series key which is the order the keys came in

With these changes we have consistency across the various write apis and can
make optimizations and future features with the assumption we have a series key.

Closes #25585
2024-12-03 12:10:26 -05:00
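
The tag rule from #25596 can be sketched as a check against the series key fixed by the first write. This is a simplified model under assumptions, not the write path itself: `Table` and `check_write` are hypothetical names, and the sketch assumes that a write using only a subset of the existing tags is still accepted.

```rust
/// Hypothetical table state: the series key is fixed by the first write
/// and keeps the order in which the tags arrived.
#[derive(Default)]
pub struct Table {
    series_key: Option<Vec<String>>,
}

impl Table {
    /// Accept the write only if it introduces no tag outside the series key.
    pub fn check_write(&mut self, tags: &[&str]) -> Result<(), String> {
        if let Some(key) = &self.series_key {
            // Later writes: reject the first tag that is not in the series key.
            return match tags.iter().find(|t| !key.iter().any(|k| k.as_str() == **t)) {
                Some(extra) => Err(format!("new tag not allowed: {extra}")),
                None => Ok(()),
            };
        }
        // First write: its tags define the series key.
        self.series_key = Some(tags.iter().map(|t| t.to_string()).collect());
        Ok(())
    }
}
```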
Trevor Hilton 234d37329a
feat: metacache REST APIs to create and delete (#25587) 2024-11-27 08:41:46 -05:00
Trevor Hilton 8e23032ceb
feat: add metadata cache provider with APIs for write and query (#25566)
This adds the MetaDataCacheProvider for managing metadata caches in the
influxdb3 instance. This includes APIs to create caches through the WAL
as well as from a catalog on initialization, to write data into the
managed caches, and to query data out of them.

The query side is fairly involved, relying on Datafusion's TableFunctionImpl
and TableProvider traits to make querying the cache using a user-defined
table function (UDTF) possible.

The predicate code was modified to only support two kinds of predicates:
IN and NOT IN, which simplifies the code, and maps nicely with the DataFusion
LiteralGuarantee which we leverage to derive the predicates from the
incoming queries.

A custom ExecutionPlan implementation was added specifically for the
metadata cache that can report the predicates that are pushed down to
the cache during query planning/execution.

A big set of tests was added to check that queries are working, and
that predicates are being pushed down properly.
2024-11-22 10:57:26 -05:00
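
The restriction to IN and NOT IN predicates described in #25566 can be modelled with a two-variant enum. This is a minimal sketch with hypothetical names (`Predicate`, `matches`); it does not use DataFusion's `LiteralGuarantee` and only shows why the two-form model keeps evaluation simple.

```rust
use std::collections::BTreeSet;

/// Hypothetical predicate type limited to the two supported forms.
pub enum Predicate {
    In(BTreeSet<String>),
    NotIn(BTreeSet<String>),
}

impl Predicate {
    /// Returns true if `value` passes the predicate.
    pub fn matches(&self, value: &str) -> bool {
        match self {
            Predicate::In(set) => set.contains(value),
            Predicate::NotIn(set) => !set.contains(value),
        }
    }
}
```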
praveen-influx 3cde24feb4
feat: delete table (#25572)
This commit allows deleting (soft) a table. For a user, the following
command will allow soft deleting a table (bar) in db (foo):

```
influxdb3 table delete --dbname foo --table bar --host $host
```

- Added `soft_delete_table` to `DatabaseManager` trait, which already
  hosts `soft_delete_database` method. The code roughly follows the same
  flow as db delete. Like the db schema, the table definition is cloned
  on write because the reference is behind an Arc, so `Arc::make_mut` is
  used in this change.
- Moved db delete related cli parser under "manage" module that has both
  db and table delete functionality
- Some minor tidyups (removing unused methods, renaming method so that
  the order in name matches actual return type eg. `table_id_and_schema`,
  should return (id, schema) and not (schema, id))

closes: https://github.com/influxdata/influxdb/issues/25561
2024-11-22 08:42:45 +00:00
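
The clone-on-write behaviour of `Arc::make_mut` mentioned in #25572 can be shown in a few lines. `TableDef` and `soft_delete` are hypothetical, simplified names for illustration; only `Arc::make_mut` itself is the std API referenced in the commit.

```rust
use std::sync::Arc;

/// Hypothetical, simplified table definition.
#[derive(Clone, Debug, PartialEq)]
pub struct TableDef {
    pub name: String,
    pub deleted: bool,
}

/// Mark the table deleted. `Arc::make_mut` clones the inner value only if
/// other `Arc` handles still point at it, so existing readers are unaffected.
pub fn soft_delete(table: &mut Arc<TableDef>) {
    Arc::make_mut(table).deleted = true;
}
```

A reader holding an older `Arc` keeps seeing the table as not deleted, while the handle passed to `soft_delete` points at the updated copy.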
Jackson Newhouse 956e223388
fix: don't rebuild snapshot if it has already been taken. (#25570) 2024-11-20 08:55:42 -08:00
praveen-influx 33c2d47ba9
feat: drop/delete database (#25549)
* feat: drop/delete database

This commit allows soft deletion of database using `influxdb3 database
delete <db_name>` command. The write buffer and last value cache are
cleared as well.

closes: https://github.com/influxdata/influxdb/issues/25523

* feat: reuse same code path when deleting database

- In previous commit, the deletion of database immediately triggered
  clearing last cache and query buffer. But on restarts same logic had
  to be repeated to allow deleting database when starting up. This
  commit removes immediate deletion by explicitly calling necessary
  methods and moves the logic to `apply_catalog_batch` which already
  applies `CatalogOp` and also clearing cache and buffer in
  `buffer_ops` method which has hooks to call other places.

closes: https://github.com/influxdata/influxdb/issues/25523

* feat: use reqwest query api for query param

Co-authored-by: Trevor Hilton <thilton@influxdata.com>

* feat: include deleted flag in DatabaseSnapshot

- `DatabaseSchema` serialization/deserialization is delegated to
 `DatabaseSnapshot`, so the `deleted` flag should be included in
 `DatabaseSnapshot` as well.
- insta test snapshots fixed

closes: https://github.com/influxdata/influxdb/issues/25523

* feat: address PR comments + tidy ups

---------

Co-authored-by: Trevor Hilton <thilton@influxdata.com>
2024-11-19 16:08:14 +00:00
praveen-influx 814eb31309
chore: update core deps (#25532)
* chore: update core deps

- arrow/parquet deps are patched (as in core)
- three specific code changes to cope with changes in core crates
  - TransitionPartitionId, use `from_parts` instead of `new`
  - arrow buffers can take &[u8] directly without `to_vec()`/`vec!`
    (used only in tests)
  - `schema` and `influxdb_line_protocol` crates need `v3` feature enabled

* chore: update deny.toml

* chore: formatting and deny toml changes

The Unicode-3.0 license is added to the allowed licenses list; without it
we end up with 19 errors (`zerovec`, `zerovec-derive` etc.)

* chore: address PR feedback

- move enabling v3 feature to root Cargo.toml
- added the upstream PR for datafusion-common that introduced RUSTSEC-2024-0384
2024-11-12 16:07:31 +00:00
Trevor Hilton 3bb63b2d71
fix: throw error when adding fields to non-existent table in WAL (#25525)
* fix: throw error when adding fields to non-existent table

* test: add test for expected behaviour in catalog op apply

This also added in some helpers to the wal crate that were previously
added to pro.
2024-11-08 13:15:07 -05:00
Trevor Hilton 5698e79a34
feat: helper methods on WalOp (#25486) 2024-11-01 17:19:20 -04:00