feat: support query parameters
This adds support for parameters in the /api/v3/query_sql
and /api/v3/query_influxql API
The new parameter `params` is supported in the URL query string
of a GET request, or in the JSON body of a POST request.
Two new E2E tests were added to check successful GET/POST as well
as error scenario when params are not provided for a query string
that would expect them.
feat: support the v1 query API
This PR adds support for the `/api/v1/query` API, which is meant to
serve the original InfluxDB v1 query API, to serve single statement
`SELECT` and `SHOW` queries. The response, which is returned as JSON,
can be chunked via the `chunked` and optional `chunk_size` parameters.
An optional `epoch` parameter can be supplied to have `time` column
timestamps converted to a UNIX epoch with the given precision.
## Buffering
The response is buffered by default, but if the `chunked` parameter
is not supplied, or is passed as `false`, then the entire query
result will be buffered into memory before being returned in the
response. This is how the original API behaves, so we are replicating
that here.
When `chunked` is passed as `true`, then the response will be a
stream of chunks, where each chunk is a self-contained response,
with the same structure as that of the non-chunked response. Chunks
are split up by the provided `chunk_size`, or by series, i.e.,
measurement, which ever comes first. The default chunk size is 10,000
rows.
Buffering is implemented with the `QueryResponseStream` and
`ChunkBuffer` types, the former implements the `Stream` trait,
which allows it to be streamed in the HTTP response directly with
`hyper`'s `Body::wrap_stream`. The `QueryResponseStream` is a wrapper
around the inner arrow `RecordBatchStream`, which buffers the
streamed `RecordBatch`es according to the requested chunking parameters.
## Testing
Two new E2E tests were added to test basic query functionality and
chunking behaviour, respectively. In addition, some manual testing
was done to verify that the InfluxDB Grafana plugin works with this
API.
This commit re-enables the limits test after making a fix that has it
run <1 second on my laptop vs the old behavior of >=30 seconds. It does
so by constructing one single write_lp request to create 1995 tables
rather than 1995 individual requests that make a table. This is far more
efficient.
feat: support authenticating v1 APIs with p parameter
The p URL query parameter can be used to authenticate requests
to the /api/v1/query and /api/v1/write APIs
A test was added to ensure this works
* feat: add `Authorizer` impls to authz REST and gRPC
This adds two new Authorizer implementations to Edge: Default and
AllOrNothing, which will provide the two auth options for Edge.
Both gRPC requests and HTTP REST request will be authorized by
the same Authorizer implementation.
The SHA512 digest action was moved into the `Authorizer` impl.
* feat: add `ServerBuilder` to construct `Server
A builder was added to the Server in this commit, as part of an
attempt to get the server creation to be more modular.
* refactor: use test server fixture in auth e2e test
Refactored the `auth` integration test in `influxdb3` to use the
`TestServer` fixture; part of this involved extending the fixture
to be configurable, so that the `TestServer` can be spun up with
an auth token.
* test: add test for authorized gRPC
A new end-to-end test, auth_grpc, was added to check that
authorization is working with the influxdb3 Flight service.
This changes the 'influxdb3 create token' command so that it will just
automatically generate a completely random base64 encoded token prepended with
'apiv3_' that is then fed into a Sha512 algorithm instead of Sha256. The
user can no longer pass in a token to be turned into the proper output.
This also changes the server code to handle the change to Sha512 as well.
Closes#24704
feat: support SHOW RETENTION POLICIES
Added support through the influxdb3 Query Executor to perform
SHOW RETENTION POLICIES queries, both on a specific database as well
as accross all databases.
Test cases were added to check this functionality.
This commit is the final piece for the write_lp endpoint. It adds limits
to Edge such that:
- There can only be 5 Databases
- There can only be 500 Columns per Table
- There can only be 2000 Tables across all Databases
We do this by modifying the catalog code to error out whenever one of
these limits would be exceeded before permanently modifying the schema.
These are hard coded limits and cannot be configured by the user.
Closes#24554
feat: add query_influxql api
This PR adds support for the /api/v3/query_influxql API. This re-uses code from the existing query_sql API, but some refactoring was done to allow for code re-use between the two.
The main change to the original code from the existing query_sql API was that the format is determined up front, in the event that the user provides some incorrect Accept header, so that the 400 BAD REQUEST is returned before performing the query.
Support of several InfluxQL queries that previously required a bridge to be executed in 3.0 was added:
SHOW MEASUREMENTS
SHOW TAG KEYS
SHOW TAG VALUES
SHOW FIELD KEYS
SHOW DATABASES
Handling of qualified measurement names in SELECT queries (see below)
This is accomplished with the newly added iox_query_influxql_rewrite crate, which provides the means to re-write an InfluxQL statement to strip out a database name and retention policy, if provided. Doing so allows the query_influxql API to have the database parameter optional, as it may be provided in the query string.
Handling qualified measurement names in SELECT
The implementation in this PR will inspect all measurements provided in a FROM clause and extract the database (DB) name and retention policy (RP) name (if not the default). If multiple DB/RP's are provided, an error is thrown.
Testing
E2E tests were added for performing basic queries against a running server on both the query_sql and query_influxql APIs. In addition, the test for query_influxql includes some of the InfluxQL-specific queries, e.g., SHOW MEASUREMENTS.
Other Changes
The influxdb3_client now has the api_v3_query_influxql method (and a basic test was added for this)
* feat: Add partial write and name check to write_lp
This commit adds new behavior to the v3 write_lp http endpoint by
implementing both partial writes and checking the db name for validity.
It also sets the partial write behavior as the default now, whereas
before we would reject the entire request if one line was incorrect.
Users who *do* actually want that behavior can now opt in by putting
'accept_partial=false' into the url of the request.
We also check that the db name used in the request contains only
numbers, letters, underscores and hyphens and that it must start with
either a number or letter.
We also introduce a more standardized way to return errors to the user
as JSON that we can expand over time to give actionable error messages
to the user that they can use to fix their requests.
Finally tests have been included to mock out and test the behavior for
all of the above so that changes to the error messages are reflected in
tests, that both partial and not partial writes work as expected, and
that invalid db names are rejected without writing.
* feat: Add precision to write_lp http endpoint
This commit adds the ability to control the precision of the time stamp
passed in to the endpoint. For example if a user chooses 'second' and
the timestamp 20 that will be 20 seconds past the Unix Epoch. If they
choose 'millisecond' instead it will be 20 milliseconds past the Epoch.
Up to this point we assumed that all data passed in was of nanosecond
precision. The data is still stored in the database as nanoseconds.
Instead upon receiving the data we convert it to nanoseconds. If the
precision URL parameter is not specified we default to auto and take a
best effort guess at what the user wanted based on the order of
magnitude of the data passed in.
This change will allow users finer grained control over what precision
they want to use for their data as well as trying our best to make a
good user experience and having things work as expected and not creating
a failure mode whereby a user wanted seconds and instead put in
nanoseconds by default.
* feat: support FlightSQL by serving gRPC requests on same port as HTTP
This commit adds support for FlightSQL queries via gRPC to the influxdb3 service. It does so by ensuring the QueryExecutor implements the QueryNamespaceProvider trait, and the underlying QueryDatabase implements QueryNamespace. Satisfying those requirements allows the construction of a FlightServiceServer from the service_grpc_flight crate.
The FlightServiceServer is a gRPC server that can be served via tonic at the API surface; however, enabling this required some tower::Service wrangling. The influxdb3_server/src/server.rs module was introduced to house this code. The objective is to serve both gRPC (via the newly introduced tonic server) and standard REST HTTP requests (via the existing HTTP server) on the same port.
This is accomplished by the HybridService which can handle either gRPC or non-gRPC HTTP requests. The HybridService is wrapped in a HybridMakeService which allows us to serve it via hyper::Server on a single bind address.
End-to-end tests were added in influxdb3/tests/flight.rs. These cover some basic FlightSQL cases. A common.rs module was added that introduces some fixtures to aid in end-to-end tests in influxdb3.
This commit adds basic authorization support to Edge. Up to this point
we didn't need have authorization at all and so the server would
receive and accept requests from anyone. This isn't exactly secure or
ideal for a deployment and so we add a basic form of authentication.
The way this works is that a user passes in a hex encoded sha256 hash of
a given token to the '--bearer-token' flag of the serve command. When
the server starts with this flag it will now check a header of the form
'Authorization: Bearer <token>' by making sure it is valid in the sense
that it is not malformed and that when token is hashed it matches the
value passed in on the command line. The request is denied with either a
400 Bad Request if the header is malformed or a 401 Unauthorized if the
hash does not match or the header is missing.
The user is provided a new subcommand of the form: 'influxdb3 create
token <token>' where the output contains the command to run the server
with and what the header should look like to make requests.
I can see future work including multiple tokens and rotating between
them or adding new ones to a live service, but for now this shall
suffice.
As part of the commit end-to-end tests are included to run the server
and make requests against the HTTP API and to make sure that requests
are denied for being unauthorized, accepted for having the right header,
or denied for being malformed.
Also as part of this commit a small fix is included for 'Accept: */*'
headers. We were not checking for them and if this header was included
we were denying it instead of sending back the default payload return
value.