influxdb

Commit Graph

Author	SHA1	Message	Date
praveen-influx	4eccc38129	fix: reproducer for the empty snapshot file issue (#25835 ) * fix: reproducer for the empty snapshot file issue * fix: avoid creating empty (0 dbs) snapshot file	2025-01-15 20:01:57 +00:00
Michael Gattozzi	aa8a8c560d	feat: Set 72 hour query/write limit for Core (#25810 ) This commit sets InfluxDB 3 Core to have a 72 hour limit for queries and writes. What this means is that writes that contain historical data older than 72 hours will be rejected and queries will filter out data older than 72 hours. Core is intended to be a recent timeseries database and performance over data older than 72 hours will degrade without a garbage collector, a core feature of InfluxDB 3 Enterprise. InfluxDB 3 Enterprise does not have this write or query limit in place. Note that this does not mean older data is deleted. Older data is still accessible in object storage as Parquet files that can still be used in other services and analyzed with dataframe libraries like pandas and polars. This commit does a few things: - Uses timestamps in the year 2065 for tests as these should not break for longer than many of us will be working in our lifetimes. This is only needed for the integration tests as other tests use the MockProvider for time. - Filters the buffer and persisted files to only show data newer than 3 days ago - Fixes the integration tests to work with the fact that writes older than 3 days are rejected	2025-01-12 13:08:01 -05:00
Trevor Hilton	db24a62658	refactor: change host-id to writer-id (#25804 ) This changes the CLI arg `host-id` to `writer-id` to more accurately indicate meaning. This change also goes through the codebase and changes struct fields, methods, and variables to use the term `writer_id` or `writer_identifier_prefix` instead of `host_id` etc., to make the meaning clear in the code. This also changes the catalog serialization to use the field `writer_id` instead of `host_id`, which is a breaking change.	2025-01-12 11:40:47 -05:00
praveen-influx	50963443a4	feat: introduce num wal files to keep (#25801 ) * feat: introduce num wal files to keep This commit allows a configurable number of wal files to be left behind in OS. This is necessary as enterprise replicas rely on these files. closes: https://github.com/influxdata/influxdb/issues/25788 * refactor: address PR feedback * refactor: address PR comment	2025-01-12 00:33:13 +00:00
Trevor Hilton	0bdc2fa953	chore: patch enterprise back to core (#25798 )	2025-01-11 17:26:41 -05:00
Trevor Hilton	1ff4f76896	feat: only load wal files after most recent snapshot (#25787 )	2025-01-11 10:27:58 -05:00
Trevor Hilton	c71dafc313	refactor: rename metadata cache to distinct value cache (#25775 )	2025-01-10 08:48:51 -05:00
Paul Dix	7230148b58	feat: Update WAL plugin for new structure (#25777 ) * feat: Update WAL plugin for new structure This ended up being a very large change set. In order to get around circular dependencies, the processing engine had to be moved into its own crate, which I think is ultimately much cleaner. Unfortunately, this required changing a ton of things. There's more testing and things to add on to this, but I think it's important to get this through and build on it. Importantly, the processing engine no longer resides inside the write buffer. Instead, it is attached to the HTTP server. It is now able to take a query executor, write buffer, and WAL so that the full range of functionality of the server can be exposed to the plugin API. There are a bunch of system-py feature flags littered everywhere, which I'm hoping we can remove soon. * refactor: PR feedback	2025-01-10 05:52:33 -05:00
Paul Dix	2d18a61949	feat: Add query API to Python plugins (#25766 ) This ended up being a couple things rolled into one. In order to add a query API to the Python plugin, I had to pull the QueryExecutor trait out of server into a place so that the python crate could use it. This implements the query API, but also fixes up the WAL plugin test CLI a bit. I've added a test in the CLI section so that it shows end-to-end operation of the WAL plugin test API and exercise of the entire Plugin API. Closes #25757	2025-01-09 20:13:20 -05:00
Trevor Hilton	63d3b867f1	chore: patch changes from enterprise (#25776 ) - reduce parquet row group size to 100k - add cli option to disable cached parquet loader	2025-01-09 16:02:12 -05:00
praveen-influx	aa9213c4f4	feat: check mem and force snapshot (#25767 ) This commit allows checking memory in the background and force snapshotting if query buffer size is > mem threshold. This hooks into the function (`force_flush_buffer`) to achieve it. closes: https://github.com/influxdata/influxdb/issues/25685	2025-01-09 18:40:14 +00:00
praveen-influx	6e2e39cd4c	feat: snapshot when wal buffer is empty (#25765 ) * feat: snapshot when wal buffer is empty - This commit changes the functionality to allow snapshots to happen even when wal buffer is empty. For snapshots wal periods are still required but not the wal buffer. To allow this, we write a no-op into wal file with snapshot details. This enables force snapshotting functionality closes: https://github.com/influxdata/influxdb/issues/25685 * refactor: address PR feedback	2025-01-09 12:12:37 +00:00
Paul Dix	1ce6a24c3f	feat: Implement WAL plugin test API (#25704 ) * feat: Implement WAL plugin test API This implements the WAL plugin test API. It also introduces a new API for the Python plugins to be called, get their data, and call back into the database server. There are some things that I'll want to address in follow on work: * CLI tests, but will wait on #25737 to land for a refactor of the CLI here * Would be better to hook the Python logging to call back into the plugin return state like here: https://pyo3.rs/v0.23.3/ecosystem/logging.html#the-python-to-rust-direction * We should only load the LineBuilder interface once in a module, rather than on every execution of a WAL plugin * More tests all around But I want to get this in so that the actual plugin and trigger system can get updated to build around this model. * refactor: PR feedback	2025-01-06 17:32:17 -05:00
Trevor Hilton	d429490f74	feat: table creation also creates db if it does not exist (#25754 )	2025-01-06 15:44:59 -05:00
Michael Gattozzi	ccda3dd3a9	feat: remove required field restriction for tables (#25738 ) This commit removes the required fields restriction when using the CLI or the API to create a new table. As users can't write via the line protocol without a field this is fine and the schema will be updated on write. This expands the test to check for the correct response code now and make sure that we can both query the empty table and write new data to it. Closes #25735	2025-01-03 18:10:56 -05:00
Jackson Newhouse	7aa8f41268	feat(processing_engine): Add CLI support for plugins and triggers (#25731 )	2025-01-03 12:11:52 -08:00
Jackson Newhouse	29dacc318a	feat(processing_engine): Add REST API endpoints for activating and deactivating triggers. (#25711 )	2025-01-02 09:23:18 -08:00
Trevor Hilton	de227b95d9	refactor: cleanup v3 write API and series key method on catalog (#25723 ) Store the series key column names on the TableDefinition in catalog so looking up the series key by column names is more efficient Remove the /api/v3/write API and related code/tests	2024-12-30 09:32:54 -05:00
Trevor Hilton	6f4639262d	feat: track lines rejected in prometheus metrics (#25722 ) * feat: track lines rejected in prometheus metrics This adds the `influxdb3_write_lines_rejected` metric, which tracks the total number of lines rejected from incoming writes. Note that this only tracks the number of rejected lines when the default `accept_partial` of `true` is provided to incoming write requests.	2024-12-29 10:05:57 -05:00
Jackson Newhouse	0db71b69b9	fix(catalog): consistent ordering of catalog operations (#25690 )	2024-12-20 15:17:38 -08:00
Trevor Hilton	d10ad87f2c	feat: write metrics (#25692 ) Added prometheus metrics to track lines written and bytes written per database. The write buffer does the tracking after validation of incoming line protocol. Tests added to verify.	2024-12-20 11:02:36 -05:00
Paul Dix	0eab724bee	fix: Field not in queryable buffer (#25691 )	2024-12-19 19:22:47 -05:00
Michael Gattozzi	e51bea65b4	feat: create DB and Tables via REST and CLI (#25687 ) * feat: create DB and Tables via REST and CLI This commit does a few things: 1. It brings the database command naming scheme for types in line with the rest of the CLI types 2. It brings the table command naming scheme for types in line with the rest of the CLI types 3. Adds tests to check that the num of dbs is not exceeded and that you cannot create more than one database with a given name. 4. Adds tests to check that you can create a table, put data into it, and query it 5. Adds tests for the CLI for both the database and table commands 6. It creates an endpoint to create databases given a JSON blob 7. It creates an endpoint to create tables given a JSON blob With this users can now create a database or table without first needing to write to the database via the line protocol! Closes #25640 Closes #25641	2024-12-19 16:01:34 -05:00
Paul Dix	56576402cc	fix: Ensure tags are never null (#25680 ) * fix: Ensure tags are never null This injects empty strings into tags for any rows in the buffer where the tag value is null. This is required because the tags are what make up the series key, which must have all non-null values. There is an ongoing discussion about what the real behavior should be here, but for now this will get our users running that break without this behavior. Discussion is in #25674. Fixes #25648 * fix: clippy failures	2024-12-18 17:09:23 -05:00
Trevor Hilton	93222f756b	feat: log errors on panic in sort_dedupe_persist (#25678 ) This adds some error handling and logging around the method that sorts, deduplicates, and persists parquet data during the snapshot process The errors will need to be handled in follow-on work, but this is for helping debug fatal errors during the process.	2024-12-18 13:51:30 -05:00
Jackson Newhouse	8bfccb74ab	feat(processing_engine): Runtime and write-back improvements (#25672 ) * Move processing engine invocation to a separate tokio task. * Support writing back line protocol from python via insert_line_protocol(). * Update structs to work with bincode.	2024-12-17 16:38:12 -08:00
Paul Dix	31b9209dd6	fix: Snapshot QueryableBuffer error (#25673 ) Fixes bug in queryable buffer where if a block of data was missing one of the columns defined in a table sort key, the creation of the logical plan to sort and dedupe the data would fail, causing a panic. Fixes #25670	2024-12-17 16:57:07 -05:00
Jackson Newhouse	486d79d801	feat(processing_engine): initial implementation of Processing Engine plugins and triggers (#25639 )	2024-12-13 14:11:38 -08:00
Michael Gattozzi	535ddd606d	feat: Parallelize loading snapshots from storage (#25657 )	2024-12-13 15:47:56 -05:00
Michael Gattozzi	9292a3213d	feat: Significantly decrease startup times for WAL (#25643 ) * feat: add startup time to logging output This change adds a startup time counter to the output when starting up a server. The main purpose of this is to verify whether the impact of changes actually speeds up the loading of the server. * feat: Significantly decrease startup times for WAL This commit does a few important things to speed up startup times: 1. We avoid changing an Arc<str> to a String with the series key as the From<String> impl will call with_column which will then turn it into an Arc<str> again. Instead we can just call `with_column` directly and pass in the iterator without also collecting into a Vec<String> 2. We switch to using bitcode as the serialization format for the WAL. This significantly reduces startup time as this format is faster to use instead of JSON, which was eating up massive amounts of time. Part of this change involves not using the tag feature of serde as it's currently not supported by bincode 3. We also parallelize reading and deserializing the WAL files before we then apply them in order. This reduces time waiting on IO and we eagerly evaluate each spawned task in order as much as possible. This gives us about a 189% speedup over what we were doing before. Closes #25534	2024-12-12 11:27:51 -05:00
Trevor Hilton	37219af9d4	feat: track parquet cache metrics (#25632 ) * feat: parquet cache metrics * feat: track parquet cache metrics Adds metrics to track the following in the in-memory parquet cache: * cache size in bytes (also included a fix in the calculation of that) * cache size in n files * cache hits * cache misses * cache misses while the oracle is fetching a file A test was added to check this functionality * refactor: clean up logic and fix cache removal tracking error Some logic and naming was cleaned up and the boolean to optionally track metrics on entry removal was removed, as it was incorrect in the first place: a fetching entry still has a size, which counts toward the size of the cache. So, this makes it such that anytime an entry is removed, whether its state is success or fetching, its size will be decremented from the cache size metrics. The sizing calculations were made to be correct, and the cache metrics test was updated with more thorough assertions	2024-12-10 09:32:15 -05:00
Trevor Hilton	0bfef47ff9	refactor: move parquet cache to influxdb3_cache crate (#25630 )	2024-12-09 11:56:52 -05:00
Trevor Hilton	154ff7da23	feat: LastCacheExec to track predicate pushdown in last cache queries (#25621 )	2024-12-06 10:53:19 -08:00
Trevor Hilton	9b87cd7a65	refactor: move last cache to influxdb3_cache crate (#25620 ) Moved all of the last cache implementation into the `influxdb3_cache` crate. This also splits out the implementation into three modules: - `cache.rs`: the core cache implementation - `provider.rs`: the cache provider used by the database to hold multiple caches. - `table_function.rs`: same as before, holds the DataFusion impls Tests were preserved and moved to `mod.rs`, however, they were updated to not rely on the WriteBuffer implementation, and instead use the types in the `influxdb3_cache::last_cache` module directly. This simplified the test code, while not changing any of the test assertions at all.	2024-12-05 14:04:25 -05:00
Trevor Hilton	dbb1f55b5e	chore: update core for latest sync (#25617 )	2024-12-04 14:11:13 -05:00
Trevor Hilton	0daa3f2f1d	feat: track persist time in wal file content (#25614 )	2024-12-03 15:37:43 -05:00
Michael Gattozzi	d2fbd65a44	feat: Deny extra tags on write APIs (#25596 ) This commit makes three major changes: 1. We will deny writes to the v1, v2, and v3 write apis that add new tags in subsequent writes after the first write 2. We make every table have a series key by default now 3. We enforce sorting order by the series key which is the order the keys came in With these changes we have consistency across the various write apis and can make optimizations and future features with the assumption we have a series key. Closes #25585	2024-12-03 12:10:26 -05:00
Trevor Hilton	81d1ff1d62	chore: upgrade to rust 1.83.0 (#25605 )	2024-11-29 18:21:48 -05:00
Trevor Hilton	b7fd8e2386	feat: remove metadata caches on db and table delete (#25599 )	2024-11-28 11:35:29 -05:00
Trevor Hilton	234d37329a	feat: metacache REST APIs to create and delete (#25587 )	2024-11-27 08:41:46 -05:00
Trevor Hilton	8e23032ceb	feat: add metadata cache provider with APIs for write and query (#25566 ) This adds the MetaDataCacheProvider for managing metadata caches in the influxdb3 instance. This includes APIs to create caches through the WAL as well as from a catalog on initialization, to write data into the managed caches, and to query data out of them. The query side is fairly involved, relying on Datafusion's TableFunctionImpl and TableProvider traits to make querying the cache using a user-defined table function (UDTF) possible. The predicate code was modified to only support two kinds of predicates: IN and NOT IN, which simplifies the code, and maps nicely with the DataFusion LiteralGuarantee which we leverage to derive the predicates from the incoming queries. A custom ExecutionPlan implementation was added specifically for the metadata cache that can report the predicates that are pushed down to the cache during query planning/execution. A big set of tests was added to check that queries are working, and that predicates are being pushed down properly.	2024-11-22 10:57:26 -05:00
praveen-influx	3cde24feb4	feat: delete table (#25572 ) This commit allows deleting (soft) a table. For a user, the following command will allow soft deleting a table (bar) in db (foo) ``` influxdb3 table delete --dbname foo --table bar --host $host ``` - Added `soft_delete_table` to `DatabaseManager` trait, which already hosts `soft_delete_database` method. The code roughly follows the same flow as db delete. Although like db schema, it does clone on write because the reference is behind an Arc, `Arc::make_mut` is used in this change. - Moved db delete related cli parser under "manage" module that has both db and table delete functionality - Some minor tidyups (removing unused methods, renaming method so that the order in name matches actual return type eg. `table_id_and_schema`, should return (id, schema) and not (schema, id)) closes: https://github.com/influxdata/influxdb/issues/25561	2024-11-22 08:42:45 +00:00
Jackson Newhouse	956e223388	fix: don't rebuild snapshot if it has already been taken. (#25570 )	2024-11-20 08:55:42 -08:00
Michael Gattozzi	230bf02f93	feat: delete old Catalogs on persist (#25568 ) This commit changes the code so that we only keep the 10 most recent Catalogs. When a new one is persisted we delete any old ones that exist. If the deletion would fail we don't panic and let a future persist cleanup the catalogs rather than failing the persist itself. This commit also adds a test to make sure that only the catalogs we expect to are deleted on persist.	2024-11-19 12:42:20 -05:00
praveen-influx	33c2d47ba9	feat: drop/delete database (#25549 ) * feat: drop/delete database This commit allows soft deletion of database using `influxdb3 database delete <db_name>` command. The write buffer and last value cache are cleared as well. closes: https://github.com/influxdata/influxdb/issues/25523 * feat: reuse same code path when deleting database - In previous commit, the deletion of database immediately triggered clearing last cache and query buffer. But on restarts same logic had to be repeated to allow deleting database when starting up. This commit removes immediate deletion by explicitly calling necessary methods and moves the logic to `apply_catalog_batch` which already applies `CatalogOp` and also clearing cache and buffer in `buffer_ops` method which has hooks to call other places. closes: https://github.com/influxdata/influxdb/issues/25523 * feat: use reqwest query api for query param Co-authored-by: Trevor Hilton <thilton@influxdata.com> * feat: include deleted flag in DatabaseSnapshot - `DatabaseSchema` serialization/deserialization is delegated to `DatabaseSnapshot`, so the `deleted` flag should be included in `DatabaseSnapshot` as well. - insta test snapshots fixed closes: https://github.com/influxdata/influxdb/issues/25523 * feat: address PR comments + tidy ups --------- Co-authored-by: Trevor Hilton <thilton@influxdata.com>	2024-11-19 16:08:14 +00:00
Trevor Hilton	53f54a6845	feat: metadata cache core impl (#25552 ) * feat: core metadata cache structs with basic tests Implement the base MetaCache type that holds the hierarchical structure of the metadata cache providing methods to create and push rows from the WAL into the cache. Added a prune method as well as a method for gathering record batches from a meta cache. A test was added to check the latter for various predicates and that the former works, though, pruning shows that we need to modify how record batches are produced such that expired entries are not emitted. * refactor: filter expired entries and do some clean up in the meta cache	2024-11-18 12:28:12 -05:00
Trevor Hilton	2ac3df1bca	refactor: use `SerdeVecMap` in `PersistedSnapshot` (#25541 ) * refactor: use SerdeVecMap in PersistedSnapshot This changes from the use of a HashMap to store the DB -> Table structure in the PersistedSnapshot files to using a SerdeVecMap, which will have the identifiers serialized as integers instead of strings. * test: add a snapshot test for persisted snapshots	2024-11-12 16:31:36 -05:00
praveen-influx	814eb31309	chore: update core deps (#25532 ) * chore: update core deps - arrow/parquet deps are patched (as in core) - three specific code changes to cope with changes in core crates - TransitionPartitionId, use `from_parts` instead of `new` - arrow buffers can take &[u8] directly without `to_vec()`/`vec!` (used only in tests) - `schema` and `influxdb_line_protocol` crates need `v3` feature enabled * chore: update deny.toml * chore: formatting and deny toml changes Unicode-3.0 license is added to allowed licenses list, without it end up with 19 errors (`zerovec`, `zerovec-derive` etc.) * chore: address PR feedback - move enabling v3 feature to root Cargo.toml - added the upstream PR for datafusion-common that introduced RUSTSEC-2024-0384	2024-11-12 16:07:31 +00:00
Paul Dix	35e29d1408	feat: Update catalog to use sequence number in path (#25526 ) Updates the catalog to use its own sequence number in the path. This will enable downstream Pro systems that pick up PersistedSnapshots to get the specific catalog that a snapshot is associated with since its sequence number is included. Also updated the type to be CatalogSequenceNumber to make it more clear & readable when being used.	2024-11-08 15:50:15 -05:00
praveen-influx	c2b8a3a355	feat: add column names to last cache sys table (#25521 ) * feat: add column names to last cache sys table closes: https://github.com/influxdata/influxdb/issues/25511 * feat: move all `get_by_id` methods to take reference in schema	2024-11-08 16:08:30 +00:00

166 Commits (bugfix/run_triggers_on_start)