* feat: Implment data driven query testing and port explain tests
* fix: do not fmt the auto generated cases
* refactor: split setup and parser into separate modules
* refactor: Add log to runner, add end to end tests
* docs: fixu cpmments
* test: Reproducer showing that the min/max selectors are order dependent
* fix: pick correct timestamp for first/last selectors
* refactor: remove println
* docs: Fixup comments and add to link to arrow-datafusion/issues/600
* fix: Add debug if timestamp is null
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
Instead of waiting for the server ID to be set and then mark the server
as errored, directly check the object store on startup. This is
important so that we fail fast when Istio isn't up and running yet.
* feat: Implement DeduplicateExec
* fix: Doc comments
* fix: fix comment
* fix: Update with arrow ticket references and use datafusion coalsce batches impl
* refactor: rename inner.rs to algo.rs
* docs: Add additional documentation on rationale for last field value
* docs: Apply suggestions from code review
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
* docs: Update query/src/provider/deduplicate/algo.rs
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
* docs: Apply suggestions from code review
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
* refactor: do not use pub(crate)
* docs: fix test comments
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
This adds a new module, persistence_windows, to the mutable buffer crate. Later PRs will add this into the mutable buffer chunk where it can be used to track when the lifecycle for persistence should be triggered.
This PR factors out the tracing/logging CLI optinos into the `trogging` utility crate,
so that multiple binaries from the IOx suite (such as conductor) can use the same (and quite complex)
logging/tracing configuration options (flags and env vars).
Closesinfluxdata/conductor#343
This prevents users from `parquet_file::metadata` to also depend on
`parquet` directly. Furthermore they don't need to important dozend of
functions and can instead just use `IoxParquetMetaData` directly.
__Rationale__
We currently use the `tracing` framework to output to both log outputs (e.g. stdout for k8s) and distributed tracing collectors (e.g. opentelemetry jaeger).
However, due to a limitation in the `tracing` SDK, we can only have one "filter" level that applies
to both logs and tracing outputs. This is unpractical because tracing collectors are designed
to receive high verbosity data (which will be then sampled within the opentelemetry library),
while logs generally are limited to the DEBUG level on production.
This PR adds a `FilteredLayer` tracing subscriber layer, that wraps a subscriber layer with a independent
filter, which can filter events goint to the wrapper subscriber layer more agressively than the global layer.
This will allow us to emit logs at INFO or DEBUG level while passing all events to opentelemetry at TRACE
level (and opentelemetry SDK will then sample the events so that only a small part will be sent to the
ot collector)
__Note__
This PR just implements the `FilteredLayer` and a test. Another PR will integrate this with
our log/tracing setup code.
* refactor: Separate query_tests into its own crate
* fix: references
* refactor: break out server benchmarks
* fix: Update query_tests/src/lib.rs
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>