540 Commits (8e796677769d1e0c2f64dbad5fc73690daa8cdf9)

Author SHA1 Message Date
Michael Gattozzi c88cb5f093
feat: build binaries and Docker images in CI (#24751)
For releases we need to have Docker images and binary images available for the
user to actually run influxdb3. These CI changes will build the binaries on a
release tag and the Docker image as well, test, sign, and publish them and make
them available for download.

Co-Authored-By: Brandon Pfeifer <bpfeifer@influxdata.com>
2024-05-03 16:39:42 -04:00
Trevor Hilton 0d5b591ec9
chore: point at latest core (#24937)
Minor core update to bring in security updates and cargo optimizations from core.
2024-04-23 12:55:30 -04:00
Trevor Hilton 1982244e65
chore: update to latest core (#24876)
* chore: update to latest core
2024-04-03 09:36:28 -04:00
Trevor Hilton 2dde602995
feat: report system stats in load generator (#24871)
* feat: report system stats in load generator

Added the mechanism to report system stats during load generation. The
following stats are saved in a CSV file:

- cpu_usage
- disk_written_bytes
- disk_read_bytes
- memory
- virtual_memory

This only works when running the load generator against a local instance
of influxdb3, i.e., one that is running on your machine.

Generating system stats is done by passing the --system-stats flag to the
load generator.
2024-04-02 17:16:17 -04:00
Trevor Hilton e0465843be
feat: `/ping` API to serve version and revision (#24864)
* feat: /ping API to serve version

The /ping API was added, which is served for both the GET and
POST methods. The API responds with a JSON body
containing the version and revision of the build.

A new crate was added, influxdb3_process, which
takes the process_info.rs module from the influxdb3
crate, and puts it in a separate crate so that other
crates (influxdb3_server) can depend on it. This was
needed in order to have access to the version and
revision values, which are generated at build time,
in the HTTP API code of influxdb3_server.

An E2E test was added to check that /ping works.

E2E TestServer can now have logs emitted using the
TEST_LOG environment variable.
2024-04-01 16:57:10 -04:00
Paul Dix 1827866d00
feat: initial load generator implementation (#24808)
* feat: initial load generator implementation

This adds a load generator as a new crate. Initially it only generates write load, but the scaffolding is there to add a query load generator to complement the write load tool.

This could have been added as a subcommand to the influxdb3 program, but I thought it best to have it separate for now.

It's fairly light on tests and error handling given it's an internal tooling CLI. I've added only something very basic to test the line protocol generation and run the actual write command by hand.

I included pretty detailed instructions and some runnable examples.

* refactor: address PR feedback
2024-03-25 08:26:24 -04:00
Trevor Hilton caae9ca9f2
chore: `influxdb3_core` update (#24798)
chore: sync in latest core changes
2024-03-21 10:29:56 -04:00
Trevor Hilton 1fe414c14b
feat: support v1 query API (#24746)
feat: support the v1 query API

This PR adds support for the `/api/v1/query` API, which is meant to
serve the original InfluxDB v1 query API, to serve single statement
`SELECT` and `SHOW` queries. The response, which is returned as JSON,
can be chunked via the `chunked` and optional `chunk_size` parameters.
An optional `epoch` parameter can be supplied to have `time` column
timestamps converted to a UNIX epoch with the given precision.
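
The `epoch` conversion described above can be sketched as follows. This
is only an illustration, not the code from this PR: the function name
`to_epoch` is hypothetical, and the accepted precision strings are
assumed to match the InfluxDB v1 `epoch` values (`ns`, `u`, `µ`, `ms`,
`s`, `m`, `h`).

```rust
/// Convert a nanosecond UNIX timestamp to the requested precision.
///
/// Hypothetical helper: the precision names are an assumption based on
/// the InfluxDB v1 `epoch` parameter, not taken from this PR's code.
/// Returns `None` for an unrecognized precision.
fn to_epoch(nanos: i64, precision: &str) -> Option<i64> {
    // Number of nanoseconds in one unit of the requested precision.
    let divisor: i64 = match precision {
        "ns" => 1,
        "u" | "µ" => 1_000,
        "ms" => 1_000_000,
        "s" => 1_000_000_000,
        "m" => 60 * 1_000_000_000,
        "h" => 3_600 * 1_000_000_000,
        _ => return None,
    };
    // Integer division truncates toward zero, dropping sub-unit digits.
    Some(nanos / divisor)
}
```

For example, `to_epoch(1_710_524_295_123_456_789, "ms")` yields
`Some(1_710_524_295_123)`.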

## Buffering

The response is buffered by default: if the `chunked` parameter
is not supplied, or is passed as `false`, then the entire query
result will be buffered into memory before being returned in the
response. This is how the original API behaves, so we are replicating
that here.

When `chunked` is passed as `true`, then the response will be a
stream of chunks, where each chunk is a self-contained response,
with the same structure as that of the non-chunked response. Chunks
are split up by the provided `chunk_size`, or by series, i.e.,
measurement, whichever comes first. The default chunk size is 10,000
rows.
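
The chunking rule above (split by `chunk_size` or by series, whichever
comes first) can be sketched as follows. This is a simplified
illustration over in-memory rows, not the `ChunkBuffer` implementation
from this PR; the `Row` type and `chunk_rows` function are hypothetical.

```rust
/// One row of a query result. Hypothetical type for illustration only:
/// the series (measurement) the row belongs to, plus a single value.
#[derive(Debug, Clone, PartialEq)]
struct Row {
    series: String,
    value: f64,
}

/// Split rows into chunks, closing the current chunk when it reaches
/// `chunk_size` rows or when the series changes, whichever comes first.
fn chunk_rows(rows: &[Row], chunk_size: usize) -> Vec<Vec<Row>> {
    assert!(chunk_size > 0, "chunk_size must be non-zero");
    let mut chunks: Vec<Vec<Row>> = Vec::new();
    let mut current: Vec<Row> = Vec::new();
    for row in rows {
        // A chunk never mixes rows from two different series.
        let series_changed = current
            .last()
            .map_or(false, |prev| prev.series != row.series);
        if series_changed || current.len() == chunk_size {
            chunks.push(std::mem::take(&mut current));
        }
        current.push(row.clone());
    }
    // Flush the final, possibly partial, chunk.
    if !current.is_empty() {
        chunks.push(current);
    }
    chunks
}
```

With three `cpu` rows followed by one `mem` row and a `chunk_size` of 2,
this produces three chunks: two `cpu` rows, one `cpu` row, and one `mem`
row.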

Buffering is implemented with the `QueryResponseStream` and
`ChunkBuffer` types. The former implements the `Stream` trait,
which allows it to be streamed in the HTTP response directly with
`hyper`'s `Body::wrap_stream`. The `QueryResponseStream` is a wrapper
around the inner arrow `RecordBatchStream`, which buffers the
streamed `RecordBatch`es according to the requested chunking parameters.

## Testing

Two new E2E tests were added to test basic query functionality and
chunking behaviour, respectively. In addition, some manual testing
was done to verify that the InfluxDB Grafana plugin works with this
API.
2024-03-15 13:38:15 -04:00
Michael Gattozzi ce8c158956
feat: Change Bearer Auth Token to use random bits (#24733)
This changes the 'influxdb3 create token' command so that it will just
automatically generate a completely random base64 encoded token prepended with
'apiv3_' that is then fed into a Sha512 algorithm instead of Sha256. The
user can no longer pass in a token to be turned into the proper output.

This also changes the server code to handle the change to Sha512 as well.

Closes #24704
2024-03-06 12:43:00 -05:00
Trevor Hilton fb4f09d675
feat: support `SHOW RETENTION POLICIES` (#24729)
feat: support SHOW RETENTION POLICIES

Added support through the influxdb3 Query Executor to perform
SHOW RETENTION POLICIES queries, both on a specific database as well
as across all databases.

Test cases were added to check this functionality.
2024-03-05 15:40:58 -05:00
Trevor Hilton f7892ebee5
feat: add the `api/v3/query_influxql` API (#24696)
feat: add query_influxql api

This PR adds support for the /api/v3/query_influxql API. This re-uses code from the existing query_sql API, but some refactoring was done to allow for code re-use between the two.

The main change to the original code from the existing query_sql API was that the format is determined up front, in the event that the user provides some incorrect Accept header, so that the 400 BAD REQUEST is returned before performing the query.

Support of several InfluxQL queries that previously required a bridge to be executed in 3.0 was added:

SHOW MEASUREMENTS
SHOW TAG KEYS
SHOW TAG VALUES
SHOW FIELD KEYS
SHOW DATABASES

Handling of qualified measurement names in SELECT queries (see below)

This is accomplished with the newly added iox_query_influxql_rewrite crate, which provides the means to re-write an InfluxQL statement to strip out a database name and retention policy, if provided. Doing so allows the query_influxql API to have the database parameter optional, as it may be provided in the query string.

Handling qualified measurement names in SELECT

The implementation in this PR will inspect all measurements provided in a FROM clause and extract the database (DB) name and retention policy (RP) name (if not the default). If multiple DB/RP's are provided, an error is thrown.
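
The extraction described above can be sketched as follows. This is a deliberately simplified illustration, not the iox_query_influxql_rewrite implementation: it assumes unquoted names of the form `db.rp.measurement` or `db..measurement`, splits on `.` instead of using an InfluxQL parser, and treats every other form as unqualified. The `Qualifier` type and both function names are hypothetical.

```rust
/// Database and retention policy taken from a qualified measurement name.
/// Hypothetical type for illustration only.
#[derive(Debug, Clone, PartialEq)]
struct Qualifier {
    database: String,
    /// `None` means the default retention policy (`db..measurement`).
    retention_policy: Option<String>,
}

/// Split `db.rp.measurement` or `db..measurement` into its qualifier and
/// bare measurement name. Anything else is returned as unqualified.
/// Assumption: names are unquoted and contain no dots of their own.
fn split_qualified(name: &str) -> (Option<Qualifier>, &str) {
    let parts: Vec<&str> = name.split('.').collect();
    match parts.as_slice() {
        [db, rp, measurement] => {
            let retention_policy = if rp.is_empty() {
                None
            } else {
                Some(rp.to_string())
            };
            let qualifier = Qualifier {
                database: db.to_string(),
                retention_policy,
            };
            (Some(qualifier), *measurement)
        }
        _ => (None, name),
    }
}

/// Strip the qualifier from every measurement in a FROM clause and return
/// the single DB/RP they agree on, or an error if they disagree.
fn resolve_qualifier(names: &[&str]) -> Result<(Option<Qualifier>, Vec<String>), String> {
    let mut found: Option<Qualifier> = None;
    let mut bare: Vec<String> = Vec::new();
    for name in names {
        let (qualifier, measurement) = split_qualified(name);
        if let Some(q) = qualifier {
            let conflict = found.as_ref().map_or(false, |existing| *existing != q);
            if conflict {
                return Err(format!("multiple DB/RPs in FROM clause at {}", name));
            }
            found = Some(q);
        }
        bare.push(measurement.to_string());
    }
    Ok((found, bare))
}
```

Under these assumptions, `mydb.autogen.cpu` and `mydb.autogen.mem` resolve to the database `mydb` with retention policy `autogen` and the bare names `cpu` and `mem`, while mixing `a..cpu` with `b..mem` returns an error.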

Testing

E2E tests were added for performing basic queries against a running server on both the query_sql and query_influxql APIs. In addition, the test for query_influxql includes some of the InfluxQL-specific queries, e.g., SHOW MEASUREMENTS.

Other Changes

The influxdb3_client now has the api_v3_query_influxql method (and a basic test was added for this)
2024-03-01 12:27:38 -05:00
Michael Gattozzi 73e261c021
feat: Split out shared core crates from Edge (#24714)
This commit is a major refactor for the code base. It mainly does four
things:

1. Splits code shared between the internal IOx repository and this one
   into its own repo over at https://github.com/influxdata/influxdb3_core
2. Removes any docs or anything else that did not relate to this project
3. Reorganizes the Cargo.toml files to use the top level Cargo.toml to
   declare dependencies and versions to keep all crates in sync and sets
   all others to use `<dep>.workspace = true` unless it's an optional
   dependency
4. Set the top level Cargo.toml to point to the core crates as git
   dependencies

With this any changes specific to Edge will be contained here, updating
deps will be a PR over in `influxdata/influxdb3_core`, and we can prove
out the viability for this model to use for IOx.
2024-02-29 16:21:41 -05:00
Trevor Hilton 80505d2b42
feat: add the `influxdb3_client` crate (#24665)
A new crate, influxdb3_client, was added, which provides the Client
struct. This gives programmatic access to the influxdb3 HTTP API.

Two primary methods are provided:
- `api_v3_write_lp`
- `api_v3_query_sql`

Each API uses a builder approach to composing the request to be sent.
Response handling was kept somewhat naive, in the `write_lp` case not returning
anything, and in `query_sql`, returning raw `Bytes`. We may improve this in
the future once the respective APIs have their responses more finalized.

Both methods, as well as all associated types are documented with rustdocs.

The general approach to these methods was to use a builder style API so that
the user of the client can build their requests functionally before sending them
to the server.
2024-02-16 15:02:16 -05:00
Michael Gattozzi ff567cd33f
chore(deps): Update arrow and datafusion to 49.0.0 (#24605)
* chore(deps): Update arrow and datafusion to 49.0.0

This commit copies in our dependency code from influxdb_iox in order for
us to be able to upgrade from a forked version of 46.0.0 to 49.0.0 of
both arrow and datafusion. Most of the important changes were around how
we consumed the crates in influxdb3(_server/_write). Those diffs are
particularly worth looking at as the rest was a straight copy and we
don't touch those crates in our development currently for influxdb3
edge.

* fix: regenerate workspace hack crate

* fix: Protobuf issues with incompatibility labels

* fix: Broken CI yaml

* fix: buf version

* fix: Only check IOx repo

* fix: Remove protobuf lint

* fix: Comment out call to protobuf-lint
2024-01-31 19:18:51 -05:00
Paul Dix 5831cf8cee
feat: Add basic Edge server structure (#24552)
* WIP: basic influxdb3 command and http server

* WIP: write lp, buffer, query out

* WIP: test write & query on influxdb3_server, fix warnings

* WIP: pull write buffer and catalog into separate crate

* WIP: sketch out types used for write: buffer, wal, persister

* WIP: remove a bunch of old IOx stuff and fmt
2024-01-08 11:50:59 -05:00
Paul Dix cafe37bd1f Merge branch 'pd/influxdb3-oss' 2023-09-21 09:15:41 -04:00
Andrew Lamb 65d0ea2055
chore: Update DataFusion (#8765)
* chore: Update DataFusion pin again

* chore: update for different type

* fix: statistics

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-09-19 22:26:53 +00:00
Andrew Lamb 58d892fcdf
chore: Update DataFusion pin (#8749)
* chore: Update DataFusion pin and `chrono`

* chore: Update for deprecation

* chore: Update plans

* fix: fix update logic in percentile

* chore: update to avoid deprecated from_exprs api

* fix: Update arrow pin, fix plan errors

* test: for describe

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-09-18 18:11:23 +00:00
Andrew Lamb ed2da2a831
Revert "chore: Update DataFusion pin (#8698)" (#8714)
This reverts commit 74c0851fc2.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-09-11 17:19:04 +00:00
Andrew Lamb 74c0851fc2
chore: Update DataFusion pin (#8698)
* chore: Update DataFusion pin

* chore: Update for new API

* fix: fix test

* fix: only check error messages

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-09-11 13:54:24 +00:00
Andrew Lamb 45c6bfea9c
chore: Update datafusion, arrow/flight/parquet to `46.0.0` , object_store to `0.7.0` (#8577)
* chore: Update DataFusion pin

* chore: Update for new API

* fix: Update for API

* fix: update compactor test

* fix: Update to patched version of arrow 46.0.0

* fix: map `DataFusionError::Configuration` to an internal error

* fix: do not use deprecated API

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-09-08 12:49:57 +00:00
Dom Dwyer c0b4a10874
feat(gossip): add gossip_compaction crate
Adds a crate that layers compaction-specific gossip types and
abstractions over the underlying gossip transport for a nicer (and
decoupled!) internal API.
2023-09-04 14:05:39 +02:00
Marco Neumann 12f2716180
feat: scaffolding for ingester->querier V2 client (#8632)
Adds basic structure for #8349. This will be filled in using separate
PRs for easier review.

The layer structure was chosen to simplify testing and allow composition
of features (like retries, circuit breaking, metrics, etc.). In contrast
to the V1 client (`querier::ingester`) a client here addresses exactly 1
ingester, not multiple (via an `addr` parameter). The tracking around
multiple states in the V1 version is not really nice and overly
complicated.
2023-09-01 07:58:26 +00:00
Dom fa738cfec3
Merge branch 'main' into dom/gossip-parquet-crate 2023-08-25 13:07:07 +01:00
Andrew Lamb e4505912a1
chore: Update DataFusion pin (#8544)
* chore: Update DataFusion pin

* refactor: Use upstream check

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-08-24 18:31:33 +00:00
Dom Dwyer 6a0c54758b
feat(gossip): new parquet file gossip crate
Adds a reusable "gossip_parquet_file" crate that provides a use-case
specific wrapper over the underlying gossip transport.

This crate deals with the encoding and decoding of parquet gossip
messages, handing them off to the application, and decoupling latency
of handlers from the gossip reactor.
2023-08-24 11:23:17 +02:00
Dom Dwyer 2e77507f7b
feat: implement gossip_schema crate
Adds a new gossip_schema crate that provides a high-level interface to
schema change notifications.

This crate layers schema-specific interfaces over the existing low-level
gossip crate. Users can obtain best-effort schema change notifications
by implementing a SchemaEventHandler delegate given to a SchemaRx, or
efficiently dispatch schema change notifications to listening peers
using a SchemaTx.

Schema notifications are sent over the Topic::SchemaChanges topic
(ID=1), which the caller must register as an interest on receiving
gossip nodes.
2023-08-22 12:45:22 +02:00
Andrew Lamb 967aef0e9d
chore: Update datafusion (#8515)
* chore: Update datafusion

* fix: update for API

* fix: Verify unsupported statements, with tests

* fix: update tests

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-08-21 17:49:21 +00:00
Paul Dix 8263c4f3cf feat: implement new ingest structure
Define new proto for the structure that gets sent from router to ingester and persisted in the ingester WAL.
Create ingest_structure crate with functions to convert from line protocol to new proto structure while validating schema.
Add function to convert new proto structure to RecordBatch.
2023-08-20 17:31:52 -04:00
Andrew Lamb af8967f9e1
chore: Update DataFusion to get fix for string functions on tags (#8479)
* chore: Update DataFusion pin

* test: add test

* fix: Update test with correct query
2023-08-17 17:00:04 +00:00
Andrew Lamb 232eee059f
chore: Update DataFusion (#8460)
* chore: Update DataFusion

* chore: update for API changes

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-08-10 14:54:52 +00:00
Andrew Lamb ad663842cb
chore: Update `datafusion` / `arrow` / `parquet` to `45.0.0` (#8452)
* chore: Update datafusion / arrow / parquet to `45.0.0`

* chore: remove deprecated API

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-08-09 13:18:18 +00:00
Andrew Lamb 46bfa0badc
chore: Update DataFusion (#8447)
* chore: Update DataFusion pin

* chore: Update for API changes
2023-08-08 13:41:14 +00:00
Andrew Lamb 6e13ff8cb8
chore: Update DataFusion pin (#8390)
* chore: Update DataFusion pin

* chore: Update for API

* fix: update plans

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-08-02 14:58:16 +00:00
Andrew Lamb de79619e71
chore: Update datafusion (#8355)
* chore: Update datafusion pin

* fix: Update for change in API

* chore: Update plan

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-31 15:41:00 +00:00
Andrew Lamb 3eb48ef210
chore: Update datafusion again (#8247)
* chore: Update datafusion to get new grouping

* chore: Update for new API

* chore: update tests

* fix: new API

* fix: state type

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-21 11:20:36 +00:00
Andrew Lamb 2b164c0037
chore: Update DataFusion pin (#8246) 2023-07-17 15:53:54 +00:00
Andrew Lamb c3788ee2b8
chore: Update datafusion pin (#8223) 2023-07-12 18:18:41 +00:00
Dom ceecd11064
Merge branch 'main' into dom/gossip-basic 2023-07-12 10:50:57 +01:00
Andrew Lamb b24f9c81ba
chore: Update DataFusion pin, updates for API changed (#8199) 2023-07-11 13:36:38 +00:00
Dom Dwyer 69ab70ce99
feat: init gossip package
Adds a new empty "gossip" package to the workspace.
2023-07-10 12:11:12 +02:00
Dom Dwyer 5027c9a88c
chore: sort workspace members
Sort the package names in the workspace member declaration.
2023-07-10 12:11:03 +02:00
Andrew Lamb 3ce11d8d66
chore: Update DataFusion (#8190)
* chore: Update DataFusion

* chore: Run cargo hakari tasks

* fix: Update for API changes

* fix: use display format

* chore: Update explain plan output

* fix: update plans

---------

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-07-10 09:54:50 +00:00
Andrew Lamb 4a1f8db254
chore: Update datafusion + arrow/arrow-flight/parquet to patched version `42.0.0` (#8113)
* Revert "Revert "chore: Update datafusion + arrow/arrow-flight/parquet to version `42.0.0` (#8036)" (#8049)"

This reverts commit fb0674fc01.

* chore: Update Cargo and hakari

* chore: Update to patched version

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-06-30 12:59:31 +00:00
Marco Neumann 9874b283b7
feat: tokio metrics bridge (#8091) 2023-06-29 08:43:57 +00:00
Andrew Lamb fb0674fc01
Revert "chore: Update datafusion + arrow/arrow-flight/parquet to version `42.0.0` (#8036)" (#8049)
This reverts commit 70ffedadc7.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-06-22 11:03:25 +00:00
Andrew Lamb 70ffedadc7
chore: Update datafusion + arrow/arrow-flight/parquet to version `42.0.0` (#8036)
* chore: Update datafusion + arrow/arrow-flight/parquet to version `42.0.0`

* chore: Update for new APIs

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-06-21 16:11:36 +00:00
wiedld 8b7ef69f6f
refactor: move partitions_source to scheduler (#8010)
* refactor: make compactor_scheduler crate

* refactor: move PartitionsSource into the compactor_scheduler

    The compactor currently uses PartitionsSource in two ways:
       * for the preparation of PartitionIds prior to the compactor pipeline.
       * for the abstraction which utilizes the PartitionIds during the IO pipeline.
    This commit is a refactoring to enable us to delineate between these two utilizations.
    The former (preparation) utilization will now be done in the compactor_scheduler.
    Since the compactor is dependent on the compactor_scheduler, it made sense to move the trait to the scheduler.
2023-06-16 10:02:13 -07:00
Stuart Carnie e10b8c93c8
chore: Update DataFusion and other dependencies (#8014)
* chore: Update DataFusion pin

* chore: Update API changes

* chore: Don't use deprecated API

* chore: Run cargo hakari tasks

* chore: Update tests due to changes in logical plan nodes from DF update

* chore: Fix broken links in docs

* chore: Adjust changes to expected output

---------

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2023-06-16 10:39:36 +00:00
Andrew Lamb 5889c96501
chore: Update `datafusion` and other dependencies (#7981)
* chore: Update DataFusion pin

* chore: Update other dependencies

* chore: Update hakari

* fix: Update for API changes

* fix: Update explain plan

* fix: Update influxql plans

* fix: rustdoc links

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-06-16 09:48:55 +00:00