influxdb

Commit Graph

Author	SHA1	Message	Date
Michael Gattozzi	a5082ec432	feat: Add limits for InfluxDB Edge (#24703 ) This commit is the final piece for the write_lp endpoint. It adds limits to Edge such that: - There can only be 5 Databases - There can only be 500 Columns per Table - There can only be 2000 Tables across all Databases We do this by modifying the catalog code to error out whenever one of these limits would be exceeded before permanently modifying the schema. These are hard coded limits and cannot be configured by the user. Closes #24554	2024-03-04 10:24:33 -05:00
Trevor Hilton	f7892ebee5	feat: add the `api/v3/query_influxql` API (#24696 ) feat: add query_influxql api This PR adds support for the /api/v3/query_influxql API. This re-uses code from the existing query_sql API, but some refactoring was done to allow for code re-use between the two. The main change to the original code from the existing query_sql API was that the format is determined up front, in the event that the user provides some incorrect Accept header, so that the 400 BAD REQUEST is returned before performing the query. Support of several InfluxQL queries that previously required a bridge to be executed in 3.0 was added: SHOW MEASUREMENTS SHOW TAG KEYS SHOW TAG VALUES SHOW FIELD KEYS SHOW DATABASES Handling of qualified measurement names in SELECT queries (see below) This is accomplished with the newly added iox_query_influxql_rewrite crate, which provides the means to re-write an InfluxQL statement to strip out a database name and retention policy, if provided. Doing so allows the query_influxql API to have the database parameter optional, as it may be provided in the query string. Handling qualified measurement names in SELECT The implementation in this PR will inspect all measurements provided in a FROM clause and extract the database (DB) name and retention policy (RP) name (if not the default). If multiple DB/RP's are provided, an error is thrown. Testing E2E tests were added for performing basic queries against a running server on both the query_sql and query_influxql APIs. In addition, the test for query_influxql includes some of the InfluxQL-specific queries, e.g., SHOW MEASUREMENTS. Other Changes The influxdb3_client now has the api_v3_query_influxql method (and a basic test was added for this)	2024-03-01 12:27:38 -05:00
Michael Gattozzi	3c9e6ed836	fix: Add docker folder back for CI (#24720 )	2024-02-29 16:47:41 -05:00
Michael Gattozzi	59d8e23d49	fix: Readd the Dockerfile for the main branch (#24719 )	2024-02-29 16:33:36 -05:00
Michael Gattozzi	73e261c021	feat: Split out shared core crates from Edge (#24714 ) This commit is a major refactor for the code base. It mainly does four things: 1. Splits code shared between the internal IOx repository and this one into its own repo over at https://github.com/influxdata/influxdb3_core 2. Removes any docs or anything else that did not relate to this project 3. Reorganizes the Cargo.toml files to use the top level Cargo.toml to declare dependencies and versions to keep all crates in sync and sets all others to use `<dep>.workspace = true` unless it's an optional dependency 4. Set the top level Cargo.toml to point to the core crates as git dependencies With this any changes specific to Edge will be contained here, updating deps will be a PR over in `influxdata/influxdb3_core`, and we can prove out the viability for this model to use for IOx.	2024-02-29 16:21:41 -05:00
Paul Dix	2da5803bfd	feat: implement loader for persisted state (#24705 ) * fix: persister loading with no segments Fixes a bug where the persister would throw an error if attempting to load segments when none had been persisted. Moved persister tests into tests block. * feat: implement loader for persisted state This implements a loader for the write buffer. It loads the catalog and the buffer from the WAL. Move Persister errors into their own type now that the write buffer load could return errors from the persister. This doesn't yet rotate segments or trigger persistence of newly closed segments, which will be addressed in a future PR. * fix: cargo update to fix audit * refactor: add error type to persister trait * refactor: use generics instead of dyn --------- Co-authored-by: Trevor Hilton <thilton@influxdata.com>	2024-02-29 15:58:19 -05:00
Brandon Pfeifer	3dcf2778d6	chore: remove unused CircleCI scripts (#24701 )	2024-02-28 09:48:57 -05:00
Michael Gattozzi	8fec1d636e	feat: Add write_lp partial write, name check, and precision (#24677 ) * feat: Add partial write and name check to write_lp This commit adds new behavior to the v3 write_lp http endpoint by implementing both partial writes and checking the db name for validity. It also sets the partial write behavior as the default now, whereas before we would reject the entire request if one line was incorrect. Users who do actually want that behavior can now opt in by putting 'accept_partial=false' into the url of the request. We also check that the db name used in the request contains only numbers, letters, underscores and hyphens and that it must start with either a number or letter. We also introduce a more standardized way to return errors to the user as JSON that we can expand over time to give actionable error messages to the user that they can use to fix their requests. Finally tests have been included to mock out and test the behavior for all of the above so that changes to the error messages are reflected in tests, that both partial and not partial writes work as expected, and that invalid db names are rejected without writing. * feat: Add precision to write_lp http endpoint This commit adds the ability to control the precision of the time stamp passed in to the endpoint. For example if a user chooses 'second' and the timestamp 20 that will be 20 seconds past the Unix Epoch. If they choose 'millisecond' instead it will be 20 milliseconds past the Epoch. Up to this point we assumed that all data passed in was of nanosecond precision. The data is still stored in the database as nanoseconds. Instead upon receiving the data we convert it to nanoseconds. If the precision URL parameter is not specified we default to auto and take a best effort guess at what the user wanted based on the order of magnitude of the data passed in. This change will allow users finer grained control over what precision they want to use for their data as well as trying our best to make a good user experience and having things work as expected and not creating a failure mode whereby a user wanted seconds and instead put in nanoseconds by default.	2024-02-27 11:57:10 -05:00
Trevor Hilton	298055e9fb	feat: support FlightSQL in 3.0 (#24678 ) * feat: support FlightSQL by serving gRPC requests on same port as HTTP This commit adds support for FlightSQL queries via gRPC to the influxdb3 service. It does so by ensuring the QueryExecutor implements the QueryNamespaceProvider trait, and the underlying QueryDatabase implements QueryNamespace. Satisfying those requirements allows the construction of a FlightServiceServer from the service_grpc_flight crate. The FlightServiceServer is a gRPC server that can be served via tonic at the API surface; however, enabling this required some tower::Service wrangling. The influxdb3_server/src/server.rs module was introduced to house this code. The objective is to serve both gRPC (via the newly introduced tonic server) and standard REST HTTP requests (via the existing HTTP server) on the same port. This is accomplished by the HybridService which can handle either gRPC or non-gRPC HTTP requests. The HybridService is wrapped in a HybridMakeService which allows us to serve it via hyper::Server on a single bind address. End-to-end tests were added in influxdb3/tests/flight.rs. These cover some basic FlightSQL cases. A common.rs module was added that introduces some fixtures to aid in end-to-end tests in influxdb3.	2024-02-26 15:07:48 -05:00
Michael Gattozzi	75afbbd20e	chore: Remove dependabot for our repo (#24693 )	2024-02-26 13:38:20 -05:00
dependabot[bot]	ada6561f4a	chore(deps): Bump serde_json from 1.0.113 to 1.0.114 (#24687 )	2024-02-25 14:34:37 +00:00
dependabot[bot]	fca7b702f0	chore(deps): Bump ring from 0.17.7 to 0.17.8 (#24684 )	2024-02-25 14:32:26 +00:00
dependabot[bot]	f67968c159	chore(deps): Bump insta from 1.34.0 to 1.35.1 (#24688 )	2024-02-25 14:27:40 +00:00
dependabot[bot]	278ecbeb56	chore(deps): Bump serde from 1.0.196 to 1.0.197 (#24689 )	2024-02-25 14:26:15 +00:00
dependabot[bot]	bc1e8fc15e	chore(deps): Bump unicode-normalization from 0.1.22 to 0.1.23 (#24690 )	2024-02-25 14:24:47 +00:00
dependabot[bot]	f817d63cf7	chore(deps): Bump ahash from 0.8.8 to 0.8.9 (#24692 )	2024-02-25 14:22:32 +00:00
dependabot[bot]	4b6f630387	chore(deps): Bump clap from 4.5.0 to 4.5.1 (#24691 )	2024-02-25 14:22:09 +00:00
Trevor Hilton	6ce3165aac	feat: add write and query CLI sub-commands (#24671 ) * feat: add query and write cli for influxdb3 Adds two new sub-commands to the influxdb3 CLI: - query: perform queries against the running server - write: perform writes against the running server Both share a common set of parameters for connecting to the database which are managed in influxdb3/src/commands/common.rs. Currently, query supports all underlying output formats, and can write the output to a file on disk. It only supports SQL as the query language, but will eventually also support InfluxQL. Write supports line protocol for input and expects the source of data to be from a file.	2024-02-20 16:14:19 -05:00
Michael Gattozzi	de102bc927	feat: Add All or Nothing Bearer token auth support (#24666 ) This commit adds basic authorization support to Edge. Up to this point we didn't have authorization at all and so the server would receive and accept requests from anyone. This isn't exactly secure or ideal for a deployment and so we add a basic form of authentication. The way this works is that a user passes in a hex encoded sha256 hash of a given token to the '--bearer-token' flag of the serve command. When the server starts with this flag it will now check a header of the form 'Authorization: Bearer <token>' by making sure it is valid in the sense that it is not malformed and that when the token is hashed it matches the value passed in on the command line. The request is denied with either a 400 Bad Request if the header is malformed or a 401 Unauthorized if the hash does not match or the header is missing. The user is provided a new subcommand of the form: 'influxdb3 create token <token>' where the output contains the command to run the server with and what the header should look like to make requests. I can see future work including multiple tokens and rotating between them or adding new ones to a live service, but for now this shall suffice. As part of the commit end-to-end tests are included to run the server and make requests against the HTTP API and to make sure that requests are denied for being unauthorized, accepted for having the right header, or denied for being malformed. Also as part of this commit a small fix is included for 'Accept: */*' headers. We were not checking for them and if this header was included we were denying it instead of sending back the default payload return value.	2024-02-20 15:34:39 -05:00
Trevor Hilton	80505d2b42	feat: add the `influxdb3_client` crate (#24665 ) A new crate, influxdb3_client, was added, which provides the Client struct. This gives programmatic access to the influxdb3 HTTP API. Two primary methods are provided: - `api_v3_write_lp` - `api_v3_query_sql` Each API uses a builder approach to composing the request to be sent. Response handling was kept somewhat naive, in `write_lp` case not returning anything, and in `query_sql`, returning raw `Bytes`. We may improve this in future once the respective APIs have their responses more finalized. Both methods, as well as all associated types are documented with rustdocs. The general approach to these methods was to use a builder style API so that the user of the client can build their requests functionally before sending them to the server.	2024-02-16 15:02:16 -05:00
Paul Dix	3c5e5bf241	feat: Add segment persist of closed buffer segment (#24659 ) * feat: add catalog sequence tracking to OpenBufferSegment * feat: Add segment persist of closed buffer * refactor: pr review updates * refactor: PR updates	2024-02-14 10:55:09 -05:00
Paul Dix	4d9095e58d	feat: add segmenting and wal persistence to WriteBuffer (#24624 ) * refactor: move write buffer into its own dir * feat: implement write buffer segment with wal flushing This creates the WriteBufferFlusher and OpenBufferSegment. If a wal is passed into the buffer, data written into it will be persisted to the wal for the initialized segment id. * refactor: use crossbeam in flusher and pr cleanup	2024-02-12 12:36:10 -05:00
Michael Gattozzi	b555ddf18b	feat: Add different output support to queries (#24616 ) This commit adds the ability to choose the output format of a query via the v3 api so that a user can choose, whether by Accept headers or the format url param, how the data will be returned to them. Prior to this commit the default was a pretty printed text format, but that instead has been changed to json as the default. There are multiple formats one can choose: 1. json 2. csv 3. pretty printed text 4. parquet I've tested each of these out and it works well. In particular the parquet output is exciting as users will be able to perform a query and receive back parquet data that they can then load into say a Python script or something else to work on and operate it. As we extend what data can be queried, as well as persisting it, what people will be able to do with Edge will be really cool and I'm interested to see how users will end up using this functionality in the future.	2024-02-12 12:04:05 -05:00
Michael Gattozzi	8a68ae3f11	fix: Remove nightly CI build from Circle CI runs (#24637 ) Prior to this change we've had CI fail nightly because we can't push the image to CI due to permissions issues. The problem is that influxdata/influxdb_iox is the one that actually has access to push that data to quay. This commit removes the nightly build and references to it as this image is built nightly by the IOx team. If things break we have access to fix it, but I don't think it'll be an issue.	2024-02-12 10:21:15 -05:00
Trevor Hilton	397ee6e73b	fix: add rust-analyzer to toolchain file (#24636 ) * fix: add rust-analyzer to toolchain file Added the rust-analyzer component to the rust-toolchain.toml file so that the correct version of rust-analyzer is installed on Apple Silicon. This will allow the LSP to work on Apple Silicon machines. * chore: update deps for cargo deny	2024-02-06 16:04:03 -05:00
Michael Gattozzi	ff567cd33f	chore(deps): Update arrow and datafusion to 49.0.0 (#24605 ) * chore(deps): Update arrow and datafusion to 49.0.0 This commit copies in our dependency code from influxdb_iox in order for us to be able to upgrade from a forked version of 46.0.0 to 49.0.0 of both arrow and datafusion. Most of the important changes were around how we consumed the crates in influxdb3(_server/_write). Those diffs are particularly worth looking at as the rest was a straight copy and we don't touch those crates in our development currently for influxdb3 edge. * fix: regenerate workspace hack crate * fix: Protobuf issues with incompatibility labels * fix: Broken CI yaml * fix: buf version * fix: Only check IOx repo * fix: Remove protobuf lint * fix: Comment out call to protobuf-lint	2024-01-31 19:18:51 -05:00
Michael Gattozzi	001a2a6653	feat: Implement Persister for PersisterImpl (#24588 ) * feat: Implement Catalog r/w for persister This commit implements reading and writing the Catalog to the object store. This was already stubbed out functionality, but it just needed an implementation. Saving it to the object store is pretty straightforward as it just serializes it to JSON and writes it to the object store. For loading, it finds the most recently added Catalog based on the file name and returns that from the object store in its deserialized form to the caller. This commit also adds some tests to make sure that the above functionality works as intended. * feat: Implement Segment r/w for persister This commit continues the work on the persister by implementing the persist_segment and load_segment functions for the persister. Much like the Catalog implementation, it's serialized to JSON before being persisted to the object store in persist_segment. This is pretty straightforward. For the loading though we need to find the most recent n segment files and so we need to list them and then return the most recent n. This is a little more complicated to do, but there are comments in the code to make it easier to grok. We also implement more tests to make sure that this part of the persister works as expected. * feat: Implement Parquet r/w to persister This commit does a few things: - First we add methods to the persister trait for reading and writing parquet files as these were not stubbed out in prior commits - Secondly we add a method to serialize a SendableRecordBatchStream into Parquet bytes - With these in place implementing the trait methods is pretty straightforward: hand a path in and a stream and get back some metadata about the file persisted and also get the bytes back if loading from the store Of course we also add more tests to make sure this all works as expected. Do note that this does nothing to make sure that we bound how much memory is used or if this is the most efficient way to write parquet files. This is mostly to get things working with the understanding that future refinement on the approach might be needed. * fix: Update smallvec for crate advisory * fix: Implement better filename handling * feat: Handle loading > 1000 Segment Info files	2024-01-25 14:31:57 -05:00
Michael Gattozzi	e13cc476bb	feat: Add paths module to influxdb3_write (#24579 ) This commit introduces 4 new types in the paths module for the influxdb3_write crate. They are: - ParquetFilePath - CatalogFilePath - SegmentInfoFilePath - SegmentWalFilePath Each of these corresponds to an object store path and for the WAL file an on disk path that we can use to address the needed files in a consistent way and not need to have path construction be duplicated to address these files. These types also Deref/AsRef to the object_store::path::Path type (or the std::path::Path type for the Wal) so that they can be used in places that expect the type such as various object_store/std::fs and so that we can use the underlying type's methods without needing to implement them for each type as they are just a thin wrapper around those types. This commit adds some tests to make sure that the path construction works as intended and also updates the `wal.rs` file to use the new `SegmentWalFilePath` instead of just a `PathBuf`. Closes: #24578	2024-01-19 10:57:54 -05:00
François Martin	58bec1d819	docs: rename influxdb_iox to influxdata (#24577 )	2024-01-16 13:34:23 -05:00
Paul Dix	02b4d28637	feat: add basic wal implementation for Edge (#24570 ) * feat: add basic wal implementation for Edge This WAL implementation uses some of the code from the wal crate, but departs pretty significantly from it in many ways. For now it uses simple JSON encoding for the serialized ops, but we may want to switch that to Protobuf at some point in the future. This version of the wal doesn't have its own buffering. That will be implemented higher up in the BufferImpl, which will use the wal and SegmentWriter to make data in the buffer durable. The write flow will be that writes will come into the buffer and validate/update against an in memory Catalog. Once validated, writes will get buffered up in memory and then flushed into the WAL periodically (likely every 10-20ms). After being flushed to the wal, the entire batch of writes will be put into the in memory queryable buffer. After that responses will be sent back to the clients. This should reduce the write lock pressure on the in-memory buffer considerably. In this PR: - Update the Wal, WalSegmentWriter, and WalSegmentReader traits to line up with new design/understanding - Implement wal (mainly just a way to identify segment files in a directory) - Implement WalSegmentWriter (write header, op batch with crc, and track sequence number in segment, re-open existing file) - Implement WalSegmentReader * refactor: make Wal return impl reader/writer * refactor: clean up wal segment open * fix: WriteBuffer and Wal usage Turn wal and write buffer references into a concrete type, rather than dyn. * fix: have wal loading ignore invalid files	2024-01-12 11:52:28 -05:00
Michael Gattozzi	028a05fbde	fix: remove deploy step for images (#24566 ) We currently don't need or want to deploy influxdb as we're still building out the Edge product. Maybe later for a demo, but for now it just breaks CI and so this removes it.	2024-01-10 13:51:19 -05:00
Michael Gattozzi	89d28ade42	fix: change circle-ci config from iox to influxdb3 (#24564 ) This commit changes the circle-ci config to use influxdb3 rather than iox in our ci config script as the repo is influxdb not influxdb_iox. While we could probably strip out a lot more here as a first attempt to get this to build release images and push them on main this will do just fine.	2024-01-10 12:36:38 -05:00
Michael Gattozzi	9d81c73785	fix: set Dockerfile to build influxdb3 not IOx (#24563 ) Now that we're transitioning the repo code to have influxdb3 Edge not IOx be what's here, we can update the Dockerfile to build influxdb3. This is mostly just updating which version of Rust to use, changing the command that's run when docker runs the container to serve, and changing influxdb_iox to influxdb3 everywhere in the file.	2024-01-09 15:19:21 -05:00
Michael Gattozzi	8ee13bca48	fix: Failing CI on main (#24562 ) * fix: build, upgrade rustc, and deps This commit upgrades Rust to 1.75.0, the latest release. We also upgraded our dependencies to stay up to date and to clear out any unneeded deps from the lockfile. In order to make sure everything works this also fixes the build by upgrading the workspace-hack crate using cargo hakari and removing the `workspace.lint` that was in influxdb3_write that didn't need to be there, probably from a merge issue. With this we can build influxdb3 as our default on main, but this alone is not enough to fix CI and will be addressed in future commits. * fix: warnings for influxdb3 build This commit fixes the warnings emitted by `cargo build` when compiling influxdb3. Mainly it adds needed lifetimes and removes unnecessary imports and function calls. * fix: all of the clippy lints This for the most part just applies suggested fixes by clippy with a few exceptions: - Generated type crates had additional allows added since we can't control what code gets made - Things that couldn't be automatically fixed were done so manually in particular adding a Send bound for traits that created a Future that should be Send We also had to fix a build issue by adding a feature for tokio-compat due to the upgrade of deps. The workspace crate was updated accordingly. * fix: failing test due to rust panic message change In between rustc 1.72 and rustc 1.75 the way that error messages were displayed when panicking changed. One of our tests depended on the output of that behavior and this commit updates the error message to the new form so that tests will pass. * fix: broken cargo doc link * fix: cargo formatting run * fix: add workspace-hack to influxdb3 crates This was the last change needed to make sure that the workspace-hack crate CI lint would pass. * fix: remove tests that can not run anymore We removed iox code from this code base and as a result some tests cannot be run anymore and so this commit removes them from the code base so that we can get a green build.	2024-01-09 15:11:35 -05:00
Paul Dix	5831cf8cee	feat: Add basic Edge server structure (#24552 ) * WIP: basic influxdb3 command and http server * WIP: write lp, buffer, query out * WIP: test write & query on influxdb3_server, fix warnings * WIP: pull write buffer and catalog into separate crate * WIP: sketch out types used for write: buffer, wal, persister * WIP: remove a bunch of old IOx stuff and fmt	2024-01-08 11:50:59 -05:00
Joshua Powers	acfef87659	chore: Sync and release v1.0.1 of influxdb-line-protocol (#24527 ) * chore: Backport influxdb line protocol changes, release v1.0.1 * chore: Update influxdb_line_protocol to 2.0 --------- Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>	2023-12-22 15:12:41 -05:00
Joshua Powers	2fabaf98a4	chore: Run cargo fmt --all (#24528 )	2023-12-20 14:11:04 -07:00
Jamie Strandboge	bb6a5c0bf6	chore: ignore Go in .github/dependabot.yml, take 3 (#24439 ) Update to use the documented dependency-name: "*" methodology rather than an undocumented example. References: - https://docs.github.com/en/code-security/dependabot/dependabot-version-updates/configuration-options-for-the-dependabot.yml-file	2023-11-02 08:16:13 -04:00
Jeffrey Smith II	560c2ef846	chore: Update dependabot.yml (#24391 ) Add the v3 label to dependabot Rust PRs	2023-11-02 08:10:04 -04:00
Jamie Strandboge	4baa25e56f	chore: ignore Go in .github/dependabot.yml, take 2 (#24438 )	2023-11-02 07:29:35 -04:00
Jamie Strandboge	361a82a84a	chore: ignore Go in .github/dependabot.yml (#24430 ) Before switching to rust-based IOx, influxdb was a Go project which dependabot tracked. After the switch, dependabot would issue alerts for go files that no longer exist. Tell dependabot to ignore "gomod"	2023-10-26 10:53:42 -05:00
dependabot[bot]	d34fc59217	chore(deps): Bump rustix from 0.38.8 to 0.38.19 (#24421 ) Bumps [rustix](https://github.com/bytecodealliance/rustix) from 0.38.8 to 0.38.19. - [Release notes](https://github.com/bytecodealliance/rustix/releases) - [Commits](https://github.com/bytecodealliance/rustix/compare/v0.38.8...v0.38.19) --- updated-dependencies: - dependency-name: rustix dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2023-10-19 16:24:15 -05:00
Rick Spencer	2f0e2dcb9d	chore: Update README.md (#24379 ) Update README to include link to Flux repo	2023-09-26 09:59:23 -04:00
Paul Dix	7b7475983b	chore: update README with new v3 details	2023-09-21 09:31:53 -04:00
Paul Dix	aa458ed166	Merge branch 'iox-repo'	2023-09-21 09:22:15 -04:00
Paul Dix	cafe37bd1f	Merge branch 'pd/influxdb3-oss'	2023-09-21 09:15:41 -04:00
Dom	427daa82b0	Merge pull request #8788 from influxdata/dependabot/cargo/tokio-util-0.7.9 chore(deps): Bump tokio-util from 0.7.8 to 0.7.9	2023-09-21 14:04:16 +01:00
Paul Dix	de835d8c33	feat: remove everything to make way for 3.0, the last database rewrite you'll ever need.	2023-09-21 09:03:38 -04:00
Dom	25f3147dc7	Merge branch 'main' into dependabot/cargo/tokio-util-0.7.9	2023-09-21 13:37:57 +01:00
Dom	008b60cffb	Merge pull request #8790 from influxdata/dependabot/cargo/insta-1.32.0 chore(deps): Bump insta from 1.31.0 to 1.32.0	2023-09-21 13:36:11 +01:00
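
Note on commit 8fec1d636e: its message describes a db name rule (only numbers, letters, underscores and hyphens, starting with a number or letter) and a conversion of incoming timestamps to nanoseconds. The Python sketch below illustrates those two rules only. It is not the project's Rust code. The function names, the ASCII-only character set, and the 'microsecond' and 'nanosecond' precision names are assumptions; the message names only 'second' and 'millisecond'. The 'auto' guess is left out because the message does not state its thresholds.

```python
import re

# Rule from the commit message: letters, numbers, underscores and hyphens only,
# and the first character must be a letter or a number.
# Restricting "letters" and "numbers" to ASCII is an assumption.
_DB_NAME = re.compile(r"[A-Za-z0-9][A-Za-z0-9_-]*")

# Multipliers to nanoseconds. 'second' and 'millisecond' come from the commit
# message; the other two names are hypothetical.
NANOS_PER_UNIT = {
    "second": 10**9,
    "millisecond": 10**6,
    "microsecond": 10**3,
    "nanosecond": 1,
}


def is_valid_db_name(name: str) -> bool:
    """Return True if the whole name matches the described rule."""
    return _DB_NAME.fullmatch(name) is not None


def to_nanos(timestamp: int, precision: str) -> int:
    """Convert a timestamp in the given precision to nanoseconds since the Unix Epoch."""
    return timestamp * NANOS_PER_UNIT[precision]
```

With these definitions, to_nanos(20, "second") is 20 seconds past the Epoch expressed in nanoseconds, which matches the example in the commit message.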

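Note on commit de102bc927: its message describes an all-or-nothing bearer token check. The server is started with the hex-encoded sha256 hash of a token, a malformed header gives 400, and a missing header or a hash mismatch gives 401. The Python sketch below illustrates that decision logic under stated assumptions; it is not the project's Rust code. The function names, the exact definition of "malformed", and the constant-time comparison are choices made here for illustration.

```python
import hashlib
import hmac
from typing import Optional


def hash_token(token: str) -> str:
    """Hex-encoded sha256 of the token, the form the commit says the server is given."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()


def check_authorization(header: Optional[str], expected_hash: str) -> int:
    """Return an HTTP status for the value of the Authorization header.

    Missing header -> 401, malformed header -> 400, hash mismatch -> 401.
    "Malformed" is assumed to mean anything other than 'Bearer <token>'.
    """
    if header is None:
        return 401
    parts = header.split(" ")
    if len(parts) != 2 or parts[0] != "Bearer" or not parts[1]:
        return 400
    presented = hash_token(parts[1]).encode("ascii")
    expected = expected_hash.lower().encode("utf-8")
    # Constant-time comparison is a choice made in this sketch.
    if hmac.compare_digest(presented, expected):
        return 200
    return 401
```

Here hash_token stands in for whatever the 'influxdb3 create token <token>' subcommand computes; only the hash, never the token itself, is passed to the server.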
49389 Commits (praveen/ring-buffer-optimizations)
