* feat: add create and get database to API
This commit is the start of the IOx-specific API. It puts everything under /iox/api/v1 since this is the beginning of the IOx API. Creating a database is done with a PUT request, and a GET request retrieves the DatabaseRules details.
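For illustration, here's a minimal sketch of exercising these endpoints with reqwest. The host/port, exact path segments, and request body are assumptions (an empty rules body leans on the DatabaseRules defaults added in the commit below), and it assumes reqwest's `json` feature plus a tokio runtime:

```
use serde_json::json;

#[tokio::main]
async fn main() -> Result<(), reqwest::Error> {
    let client = reqwest::Client::new();
    let base = "http://localhost:8080/iox/api/v1"; // assumed host/port

    // Create a database with PUT; the empty body relies on DatabaseRules
    // defaults (see the commit below).
    client
        .put(format!("{}/databases/mydb", base))
        .json(&json!({}))
        .send()
        .await?
        .error_for_status()?;

    // Retrieve the DatabaseRules details with GET.
    let rules: serde_json::Value = client
        .get(format!("{}/databases/mydb", base))
        .send()
        .await?
        .json()
        .await?;
    println!("{}", rules);
    Ok(())
}
```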
* feat: add defaults for DatabaseRules for create_database
This pulls the different backing implementations into their own modules. They're about to get more complex, so it felt like it was time to separate them out rather than building towards a single multi-thousand-line lib.rs. The error type is only defined in lib.rs and imported by the individual modules, which I think makes it easier to work with.
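As a minimal sketch of that layout (module and variant names here are illustrative, not the crate's actual ones):

```
// lib.rs defines the one shared error type; submodules import it
// via `crate::Error`.
#[derive(Debug)]
pub enum Error {
    DatabaseNotFound { name: String }, // illustrative variant
}

pub type Result<T, E = Error> = std::result::Result<T, E>;

// In the real crate this would be `mod memory;` backed by its own file.
pub mod memory {
    use crate::{Error, Result};

    pub fn find_database(name: &str) -> Result<()> {
        Err(Error::DatabaseNotFound { name: name.into() })
    }
}
```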
* refactor: Create database with mutable buffer, read buffer and parquet files
* docs: Apply suggestions from code review
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
* fix: rename planners to clarify what they are
* refactor: simplify traits
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
* feat: Create configuration system, port IOx to use it
* docs: Apply suggestions from code review
Co-authored-by: Paul Dix <paul@influxdata.com>
* fix: fix test for setting values
Co-authored-by: Paul Dix <paul@influxdata.com>
* refactor: Update docs, remove unused field
* refactor: rename partition -> chunk
* feat: Introduce new partition, which is a holder for Chunks
* refactor: Remove use of wal from mutable database
* refactor: cleanups, remove last direct use of chunks
* fix: delete old benchmarks
* fix: clippy sacrifice
* docs: tidy up comments
* refactor: remove unused error types
* chore: remove commented out tests
This moves the HTTP API over to Routerify, which has the basic route parsing logic that will enable the API design for IOx.
I had a little trouble with the error handling in Routerify, so I ended up creating a macro for constructing error responses in the HTTP API. I'm not sure what I think of this pattern, so I'm interested in what others think. Another option would be to have two functions for each API endpoint: one, `x_handler`, with a Routerify function signature, and another, plain `x`, with the `Result<Response<Body>, ApplicationError>` return type, which would make the `?` operator work in those functions. That would eliminate the need for the `return_err` macro.
I'm happy to refactor to that if people prefer it.
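For concreteness, a rough sketch of that two-function alternative; the types are simplified stand-ins and the real Routerify signatures differ:

```
use hyper::{Body, Response, StatusCode};

#[derive(Debug)]
struct ApplicationError {
    status: StatusCode,
    message: String,
}

// The inner function returns a Result, so `?` works naturally inside it.
async fn create_database(name: &str) -> Result<Response<Body>, ApplicationError> {
    if name.is_empty() {
        return Err(ApplicationError {
            status: StatusCode::BAD_REQUEST,
            message: "database name must not be empty".into(),
        });
    }
    Ok(Response::new(Body::from("created")))
}

// The outer `_handler` adapts the Result into the response the router
// expects, replacing the need for a `return_err`-style macro.
async fn create_database_handler(name: &str) -> Response<Body> {
    match create_database(name).await {
        Ok(resp) => resp,
        Err(e) => Response::builder()
            .status(e.status)
            .body(Body::from(e.message))
            .unwrap(),
    }
}
```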
* chore: Connect server crate to http routes and server command
This updates the http_routes and main server to use the Server crate. This is the first step in a larger effort to start hooking up the initial IOx API and get things running end to end with in-memory database, WAL buffer, and object storage.
For the time being, this disables the previous disk-based WAL; rather, it uses the WriteBufferDb without it. That means this IOx server has no persistence until later. Because of this, the restart step in the end-to-end test was removed.
Later PRs will add the WAL buffer and restart logic that loads from object store. We can opt to bring the local disk-based WAL back later, but it will likely require some refactoring to work with how the WAL Buffer will operate.
Adds telemetry / tracing with support for a Jaeger backend, and changes the
logger from env_logger to a tracing subscriber to collect the log entries.
Events are batched and then emitted asynchronously via UDP to the Jaeger
collector using the tokio runtime. There's a bunch of settings (env
vars) related to batch sizes, flush frequency, etc. - they're all using
their default values at the moment (if it ain't broke...). See the docs
for more info:
https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/sdk-environment-variables.md#opentelemetry-environment-variable-specification
This is only part 1 of telemetry - it does NOT propagate traces across RPC
boundaries as we're still defining how all this should work. I've created #541
to track this.
Closes #202 and closes #203.
* fix: Error if hint argument is provided to read_group
* fix: Verify compatible group and group_keys settings
* docs: Add clarifying comments on validation
* refactor: use into() rather than String::from for consistency
* feat: Implement write buffer to Parquet snapshotting
This introduces a snapshot module to the server package to manage snapshotting. It also introduces a new trait for representing a Partition. There is a very crude API wired up in http_routes for testing purposes. Follow-on work will bring the server package into http_routes and rework the snapshot API.
Previously, $ORG and $BUCKET were joined as:
$ORG + "_" + $BUCKET
Which is fine unless either $ORG or $BUCKET includes a "_", such as:
$ORG = "org_a"
$BUCKET = "bucket"
and
$ORG = "org"
$BUCKET = "a_bucket"
This change continues to join $ORG and $BUCKET with an underscore, but
disallows underscores in either $ORG or $BUCKET. It appears these values
are non-zero u64s in the gRPC protocol converted to their base-10 string
representations for the DB name, so this seems safe to enforce.
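A minimal sketch of the join-plus-validation rule (the function name and error type here are illustrative):

```
fn org_and_bucket_to_database(org: &str, bucket: &str) -> Result<String, String> {
    // Reject underscores in the inputs so the joined name is unambiguous.
    if org.contains('_') || bucket.contains('_') {
        return Err("org and bucket must not contain underscores".to_string());
    }
    Ok(format!("{}_{}", org, bucket))
}

fn main() {
    assert_eq!(org_and_bucket_to_database("org", "bucket").unwrap(), "org_bucket");
    // "org_a" + "bucket" can no longer collide with "org" + "a_bucket".
    assert!(org_and_bucket_to_database("org_a", "bucket").is_err());
    assert!(org_and_bucket_to_database("org", "a_bucket").is_err());
}
```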
In addition, this change introduces a `DatabaseName` type to avoid
passing bare strings around, and allow consuming code to ensure only
valid database names are provided at compile time. This type works with
both owned & borrowed content so doesn't force a string copy where we
can avoid it, and derefs to `str` to make it easier to use with existing
code.
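A rough sketch of such a type, assuming underscore rejection is the only validation (the real type likely checks more and uses a proper error type):

```
use std::borrow::Cow;
use std::ops::Deref;

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct DatabaseName<'a>(Cow<'a, str>);

impl<'a> DatabaseName<'a> {
    // Accepts both &'a str (borrowed, no copy) and String (owned).
    pub fn new<T: Into<Cow<'a, str>>>(name: T) -> Result<Self, String> {
        let name = name.into();
        if name.contains('_') {
            return Err(format!("invalid database name: {}", name));
        }
        Ok(Self(name))
    }
}

// Deref to str so existing code taking &str keeps working.
impl<'a> Deref for DatabaseName<'a> {
    type Target = str;

    fn deref(&self) -> &str {
        &self.0
    }
}
```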
I've been minimally invasive in pushing the `DatabaseName` through the
existing code and figured I'd see what the sentiment is first.
Candidates for conversion from `str` to `DatabaseName` that seem to make
sense to me include:
- `DatabaseStore` trait
- `RemoteServer` trait
- Others? Basically anywhere other than the "edge" API inputs
Fixes #436 (thanks @zeebo)
Rather than having to specify unique ports for test server instances, have the
kernel randomly assign ports and configure the storage gRPC client to use them.
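The usual trick for this is binding to port 0 and reading back the kernel-assigned address; a minimal sketch (the actual test harness wiring differs):

```
use std::net::TcpListener;

fn main() -> std::io::Result<()> {
    // Port 0 asks the kernel to pick any free port.
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?; // e.g. 127.0.0.1:54321
    println!("gRPC client would connect to http://{}", addr);
    Ok(())
}
```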
* feat: Update storage protobuf definitions, add stubs for read_window_aggregate
* refactor: Extract the features field in a clearer way
* docs: Add provenance information to service.proto
* feat: Allow binary tag references in gRPC, predicate matching patterns
* feat: New predicate format and builder
* fix: Update to work with branches
* test: Add test coverage for rpc predicate conversion
* refactor: use From trait
* refactor: make logic more idiomatic
* refactor: remove spurious log message
* fix: Apply suggestions from code review
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
* refactor: use TryFrom trait
* fix: make it compile again
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
* feat: write_database support for predicates
* fix: temporarily pull in arrow fork to pick up fix for ARROW-10136
* fix: Update mutex usage based on PR feedback
* fix: more mutex polish and use OptionExt
* fix: update comments
* fix: rust-fu the table lookup
* fix: update docs
* fix: more idiomatic Rust types
* fix: better usage of reference types
* refactor: move GrpcInputs to its own module so I can reuse it
* feat: Basic gRPC support for listing measurements, tests for same
* fix: move Fixture definition, rename client
* fix: remove confusing doc comment
* test: traits for database and tests for http handler
* refactor: Use generics and trait bounds instead of trait objects
* refactor: Replace trait objects with an associated type
* refactor: Extract an associated Error type on the Database traits
* refactor: Remove some explicit conversions to_string that Snafu takes care of
* docs: add comments
* refactor: move traits into storage module
Co-authored-by: Carol (Nichols || Goulding) <carol.nichols@integer32.com>
This is the initial prototype of the WriteBuffer and WAL (a rough sketch of the implied API shape follows this list). This does the following:
* accepts a slice of ParsedLine into the DB
* writes those into an in memory structure with tags represented as u32 dictionaries and all field types supported
* persists those writes into the WAL as Flatbuffer blobs (one WAL entry per slice of lines written, or WriteBatch)
* has a method to return a table from the buffer as an Arrow RecordBatch
* recovers the WAL after the database is closed and opened back up again
* has a single test that covers the end-to-end from the DB side
* It doesn't include partitioning yet, although the write_lines method does attempt to partition on time. That'll be changed to something more general defined by a per-database configuration.
* hooked up to the v2 HTTP write API
* hooked up to a read API which will execute a SQL query against the data in the buffer
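Purely as illustration, the list above implies roughly this shape for the buffer API; every name here is invented, and ParsedLine is stubbed out rather than being the real parser type:

```
use arrow::record_batch::RecordBatch;

/// Stand-in for the real line protocol parser type.
pub struct ParsedLine<'a> {
    pub raw: &'a str,
}

pub trait WriteBufferDb {
    type Error;

    /// Accept a slice of parsed lines, writing them to the in-memory
    /// buffer and persisting the batch to the WAL.
    fn write_lines(&mut self, lines: &[ParsedLine<'_>]) -> Result<(), Self::Error>;

    /// Materialize one table from the buffer as an Arrow RecordBatch.
    fn table_to_record_batch(&self, table_name: &str) -> Result<RecordBatch, Self::Error>;
}
```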
This includes a refactor of the WAL:
Refactors the WAL to remove async and threading so that it can be moved higher up. This simplifies the API while keeping just about the same amount of code in PartitionStore to handle the asynchronous writes.
This also modifies the WAL to remove the SideFile implementation, which was causing significant performance problems and write amplification. The downside is that WAL writes are no longer guaranteed to be atomic.
Further, this modifies the WAL to keep the active segment file handle open. Appends no longer have to list the directory contents, look for the latest file, and open a file handle on every write, which should also improve performance and reduce IOPS.
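A sketch of the "keep the active segment handle open" idea: open once in append mode and reuse the handle for every write (the framing and naming here are invented for illustration):

```
use std::fs::{File, OpenOptions};
use std::io::{self, Write};
use std::path::Path;

struct Segment {
    file: File,
}

impl Segment {
    fn open(path: &Path) -> io::Result<Self> {
        // Opened once; no directory listing or re-open on each append.
        let file = OpenOptions::new().append(true).create(true).open(path)?;
        Ok(Self { file })
    }

    fn append(&mut self, entry: &[u8]) -> io::Result<()> {
        // Length-prefix each entry so recovery can re-frame the blobs.
        self.file.write_all(&(entry.len() as u32).to_le_bytes())?;
        self.file.write_all(entry)?;
        self.file.sync_all()
    }
}
```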
The `/api/v2/create_bucket` API was delorean-specific for testing
purposes. This change makes it match the [Influx 2.0 API][influx] and
adds a method to the client for creating buckets.
The client will always send an empty array of `retentionRules` because
that is a required parameter for the Influx API. Delorean always ignores
`retentionRules`. The `description` and `rp` parameters are optional and
are never sent.
[influx]: https://v2.docs.influxdata.com/v2.0/api/#operation/PostBuckets
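Sketching the request body the client sends, based on the linked API doc (the struct and field selection here are assumptions, not the actual client code):

```
use serde::Serialize;

#[derive(Serialize)]
struct PostBucket {
    #[serde(rename = "orgID")]
    org_id: String,
    name: String,
    // Always sent empty: required by the Influx API, ignored by delorean.
    #[serde(rename = "retentionRules")]
    retention_rules: Vec<RetentionRule>,
}

#[derive(Serialize)]
struct RetentionRule {} // never populated here

fn main() {
    let body = PostBucket {
        org_id: "0000111100001111".to_string(),
        name: "mybucket".to_string(),
        retention_rules: vec![],
    };
    println!("{}", serde_json::to_string(&body).unwrap());
    // {"orgID":"0000111100001111","name":"mybucket","retentionRules":[]}
}
```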
I believe the gRPC create bucket is also delorean-specific and perhaps
not needed, but I'm leaving it in for now with a note.
This gets us built-in help text and error messages, and does less work
before failing because of an unsupported value.
Before this change, the help text was:
```
OPTIONS:
--compression-level <compression_level>
Compression level: max or compatibility (default). [default: compatibility]
```
After this change, the help text is:
```
OPTIONS:
--compression-level <compression_level>
How much to compress the output data. 'max' compresses the most; 'compatibility' compresses in a manner more
likely to be readable by other tools. [default: compatibility] [possible values: max, compatibility]
```
Before this change, if you supplied an unsupported value, the error was:
```
[2020-06-29T14:47:42Z INFO delorean::commands::convert] convert starting
[2020-06-29T14:47:42Z INFO delorean::commands::convert] Preparing to convert 591 bytes from tests/fixtures/lineproto/temperature.lp
Conversion failed: Error creating a parquet table writer Unknown compression level 'foo'. Valid options 'max' or 'compatibility'
```
After this change, the error is:
```
error: 'foo' isn't a valid value for '--compression-level <compression_level>'
[possible values: compatibility, max]
```
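Roughly how the flag can be declared to get that behavior, assuming structopt; the exact attributes in the real code may differ:

```
use structopt::StructOpt;

#[derive(Debug, StructOpt)]
struct Convert {
    /// How much to compress the output data. 'max' compresses the most;
    /// 'compatibility' compresses in a manner more likely to be readable
    /// by other tools.
    #[structopt(
        long,
        default_value = "compatibility",
        possible_values = &["max", "compatibility"]
    )]
    compression_level: String,
}

fn main() {
    // clap now rejects unsupported values before any conversion work starts.
    let opts = Convert::from_args();
    println!("{:?}", opts);
}
```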
Then fix the failures, mostly by adding derives and removing some unneeded (cheap) clones.
unneeded (cheap) clones.
Document places where we purposefully don't use the same lints.
Not unifying missing_docs.
👀 https://github.com/rust-lang/cargo/issues/5034
* feat: Implement boolean support for the line protocol parser
* fix: Apply suggestions from code review
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
* fix: fmt+clippy
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>