influxdb

Commit Graph

Author	SHA1	Message	Date
dependabot[bot]	ada6561f4a	chore(deps): Bump serde_json from 1.0.113 to 1.0.114 (#24687)	2024-02-25 14:34:37 +00:00
dependabot[bot]	278ecbeb56	chore(deps): Bump serde from 1.0.196 to 1.0.197 (#24689)	2024-02-25 14:26:15 +00:00
Paul Dix	3c5e5bf241	feat: Add segment persist of closed buffer segment (#24659) * feat: add catalog sequence tracking to OpenBufferSegment * feat: Add segment persist of closed buffer * refactor: pr review updates * refactor: PR updates	2024-02-14 10:55:09 -05:00
Paul Dix	4d9095e58d	feat: add segmenting and wal persistence to WriteBuffer (#24624) * refactor: move write buffer into its own dir * feat: implement write buffer segment with wal flushing This creates the WriteBufferFlusher and OpenBufferSegment. If a wal is passed into the buffer, data written into it will be persisted to the wal for the initialized segment id. * refactor: use crossbeam in flusher and pr cleanup	2024-02-12 12:36:10 -05:00
Michael Gattozzi	ff567cd33f	chore(deps): Update arrow and datafusion to 49.0.0 (#24605) * chore(deps): Update arrow and datafusion to 49.0.0 This commit copies in our dependency code from influxdb_iox in order for us to be able to upgrade from a forked version of 46.0.0 to 49.0.0 of both arrow and datafusion. Most of the important changes were around how we consumed the crates in influxdb3(_server/_write). Those diffs are particularly worth looking at as the rest was a straight copy and we don't touch those crates in our development currently for influxdb3 edge. * fix: regenerate workspace hack crate * fix: Protobuf issues with incompatibility labels * fix: Broken CI yaml * fix: buf version * fix: Only check IOx repo * fix: Remove protobuf lint * fix: Comment out call to protobuf-lint	2024-01-31 19:18:51 -05:00
Michael Gattozzi	001a2a6653	feat: Implement Persister for PersisterImpl (#24588) * feat: Implement Catalog r/w for persister This commit implements reading and writing the Catalog to the object store. This was already stubbed out functionality, but it just needed an implementation. Saving it to the object store is pretty straightforward as it just serializes it to JSON and writes it to the object store. For loading, it finds the most recently added Catalog based on the file name and returns it from the object store to the caller in its deserialized form. This commit also adds some tests to make sure that the above functionality works as intended. * feat: Implement Segment r/w for persister This commit continues the work on the persister by implementing the persist_segment and load_segment functions for the persister. Much like the Catalog implementation, it's serialized to JSON before being persisted to the object store in persist_segment. This is pretty straightforward. For the loading though we need to find the most recent n segment files and so we need to list them and then return the most recent n. This is a little more complicated to do, but there are comments in the code to make it easier to grok. We also implement more tests to make sure that this part of the persister works as expected. * feat: Implement Parquet r/w to persister This commit does a few things: - First we add methods to the persister trait for reading and writing parquet files as these were not stubbed out in prior commits - Secondly we add a method to serialize a SendableRecordBatchStream into Parquet bytes - With these in place implementing the trait methods is pretty straightforward: hand a path in and a stream and get back some metadata about the file persisted and also get the bytes back if loading from the store Of course we also add more tests to make sure this all works as expected. Do note that this does nothing to make sure that we bound how much memory is used or if this is the most efficient way to write parquet files. This is mostly to get things working with the understanding that future refinement on the approach might be needed. * fix: Update smallvec for crate advisory * fix: Implement better filename handling * feat: Handle loading > 1000 Segment Info files	2024-01-25 14:31:57 -05:00
Michael Gattozzi	e13cc476bb	feat: Add paths module to influxdb3_write (#24579) This commit introduces 4 new types in the paths module for the influxdb3_write crate. They are: - ParquetFilePath - CatalogFilePath - SegmentInfoFilePath - SegmentWalFilePath Each of these corresponds to an object store path and for the WAL file an on-disk path that we can use to address the needed files in a consistent way and not need to have path construction be duplicated to address these files. These types also Deref/AsRef to the object_store::path::Path type (or the std::path::Path type for the Wal) so that they can be used in places that expect the type such as various object_store/std::fs and so that we can use the underlying type's methods without needing to implement them for each type as they are just a thin wrapper around those types. This commit adds some tests to make sure that the path construction works as intended and also updates the `wal.rs` file to use the new `SegmentWalFilePath` instead of just a `PathBuf`. Closes: #24578	2024-01-19 10:57:54 -05:00
Paul Dix	02b4d28637	feat: add basic wal implementation for Edge (#24570) * feat: add basic wal implementation for Edge This WAL implementation uses some of the code from the wal crate, but departs pretty significantly from it in many ways. For now it uses simple JSON encoding for the serialized ops, but we may want to switch that to Protobuf at some point in the future. This version of the wal doesn't have its own buffering. That will be implemented higher up in the BufferImpl, which will use the wal and SegmentWriter to make data in the buffer durable. The write flow will be that writes will come into the buffer and validate/update against an in-memory Catalog. Once validated, writes will get buffered up in memory and then flushed into the WAL periodically (likely every 10-20ms). After being flushed to the wal, the entire batch of writes will be put into the in-memory queryable buffer. After that responses will be sent back to the clients. This should reduce the write lock pressure on the in-memory buffer considerably. In this PR: - Update the Wal, WalSegmentWriter, and WalSegmentReader traits to line up with new design/understanding - Implement wal (mainly just a way to identify segment files in a directory) - Implement WalSegmentWriter (write header, op batch with crc, and track sequence number in segment, re-open existing file) - Implement WalSegmentReader * refactor: make Wal return impl reader/writer * refactor: clean up wal segment open * fix: WriteBuffer and Wal usage Turn wal and write buffer references into a concrete type, rather than dyn. * fix: have wal loading ignore invalid files	2024-01-12 11:52:28 -05:00
Michael Gattozzi	8ee13bca48	fix: Failing CI on main (#24562) * fix: build, upgrade rustc, and deps This commit upgrades Rust to 1.75.0, the latest release. We also upgraded our dependencies to stay up to date and to clear out any unneeded deps from the lockfile. In order to make sure everything works this also fixes the build by upgrading the workspace-hack crate using cargo hakari and removing the `workspace.lint` that was in influxdb3_write that didn't need to be there, probably from a merge issue. With this we can build influxdb3 as our default on main, but this alone is not enough to fix CI and will be addressed in future commits. * fix: warnings for influxdb3 build This commit fixes the warnings emitted by `cargo build` when compiling influxdb3. Mainly it adds needed lifetimes and removes unnecessary imports and function calls. * fix: all of the clippy lints This for the most part just applies suggested fixes by clippy with a few exceptions: - Generated type crates had additional allows added since we can't control what code gets made - Things that couldn't be automatically fixed were done so manually, in particular adding a Send bound for traits that created a Future that should be Send We also had to fix a build issue by adding a feature for tokio-compat due to the upgrade of deps. The workspace crate was updated accordingly. * fix: failing test due to rust panic message change Between rustc 1.72 and rustc 1.75, the way that error messages were displayed when panicking changed. One of our tests depended on the output of that behavior and this commit updates the error message to the new form so that tests will pass. * fix: broken cargo doc link * fix: cargo formatting run * fix: add workspace-hack to influxdb3 crates This was the last change needed to make sure that the workspace-hack crate CI lint would pass. * fix: remove tests that cannot run anymore We removed iox code from this code base and as a result some tests cannot be run anymore and so this commit removes them from the code base so that we can get a green build.	2024-01-09 15:11:35 -05:00
Paul Dix	5831cf8cee	feat: Add basic Edge server structure (#24552) * WIP: basic influxdb3 command and http server * WIP: write lp, buffer, query out * WIP: test write & query on influxdb3_server, fix warnings * WIP: pull write buffer and catalog into separate crate * WIP: sketch out types used for write: buffer, wal, persister * WIP: remove a bunch of old IOx stuff and fmt	2024-01-08 11:50:59 -05:00

160 Commits (8966cfb3d3e436f224b4fa2523f91eb78d427f73)