Prior to this commit, calling shutdown() on the CompactorServerType (the
server layer run by the iox binary) would cancel its own
CancellationToken, while the CompactorHandler (the actual compaction
workload entrypoint) would be watching its own, different token.
This commit removes the redundant CancellationToken in the
CompactorServerType, instead using the inner CompactorHandler for
cancellation notification & completion.
Prior to this commit, calling shutdown() on the QuerierServer (the
server layer run by the iox binary) would cancel its own
CancellationToken, while the QuerierHandlerImpl (the actual querier
workload entrypoint) would be watching its own, different token.
This commit removes the redundant CancellationToken in the
QuerierServer, instead using the inner QuerierHandlerImpl for cancellation
notification & completion.
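The two shutdown fixes above share one idea: the server layer should not own a second token, but should delegate to the handler that the workload actually watches. A minimal, std-only sketch of that shape follows. `Token`, `Handler` and `Server` are hypothetical stand-ins, not the real CompactorHandler / QuerierHandlerImpl types, and an `AtomicBool` stands in for a real CancellationToken.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for a cancellation token (std-only).
#[derive(Clone, Default)]
struct Token(Arc<AtomicBool>);

impl Token {
    fn cancel(&self) {
        self.0.store(true, Ordering::SeqCst);
    }

    fn is_cancelled(&self) -> bool {
        self.0.load(Ordering::SeqCst)
    }
}

/// Hypothetical workload entrypoint; it owns the only token.
struct Handler {
    shutdown: Token,
}

impl Handler {
    fn new() -> Self {
        Self { shutdown: Token::default() }
    }

    fn shutdown(&self) {
        self.shutdown.cancel();
    }

    fn is_shutting_down(&self) -> bool {
        self.shutdown.is_cancelled()
    }
}

/// Hypothetical server layer; it holds no token of its own.
struct Server {
    handler: Arc<Handler>,
}

impl Server {
    /// Delegates to the handler, so the workload sees the request.
    fn shutdown(&self) {
        self.handler.shutdown();
    }
}

fn main() {
    let handler = Arc::new(Handler::new());
    let server = Server { handler: Arc::clone(&handler) };

    assert!(!handler.is_shutting_down());
    server.shutdown();
    // The workload side observes the shutdown requested via the server.
    assert!(handler.is_shutting_down());
}
```

With a single token there is no way for the server and the workload to disagree about whether shutdown was requested.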
* refactor: inline function that is used once
* refactor: generalize multi-chunk creation for NG
* refactor: `TwoMeasurementsManyFieldsTwoChunks` is OG-specific
* refactor: generalize `OneMeasurementTwoChunksDifferentTagSet`
* refactor: port `OneMeasurementFourChunksWithDuplicates` to NG
* refactor: `TwoMeasurementsManyFieldsLifecycle` is OG-specific
* refactor: simplify NG chunk generation
* refactor: port `ThreeDeleteThreeChunks` to NG
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
Add the generic components to create two-chunk scenarios. Includes small
scenario fixes for things like system tables that are not identical
between OG and NG (see also #4111).
Ref #3934.
Some query test scenarios are duplicates and are very OG specific. Let's
use generic scenarios (i.e. the ones that contain all chunk stages
instead of a specific one) where applicable.
For #3934.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
Set to_delete to the time the file was marked as deleted rather than
true.
Fixes #4059.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
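The `to_delete` change above can be sketched as swapping a boolean flag for an optional timestamp. This is only an illustration: `Timestamp`, `ParquetFile`, `flag_for_delete` and `deletable_before` are hypothetical names, and the real catalog types may differ. One plausible benefit, assumed here rather than stated in the commit, is that cleanup can select files by how long ago they were marked.

```rust
/// Hypothetical timestamp: nanoseconds since the Unix epoch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Timestamp(i64);

/// Hypothetical catalog record for a parquet file.
#[derive(Debug, Clone, PartialEq, Eq)]
struct ParquetFile {
    id: i64,
    /// `None` while the file is live; `Some(t)` once it was marked
    /// as deleted at time `t` (previously this would have been `true`).
    to_delete: Option<Timestamp>,
}

impl ParquetFile {
    /// Record when the file was marked; a repeated call keeps the first time.
    fn flag_for_delete(&mut self, now: Timestamp) {
        self.to_delete.get_or_insert(now);
    }
}

/// IDs of files that were marked strictly before `cutoff`.
fn deletable_before(files: &[ParquetFile], cutoff: Timestamp) -> Vec<i64> {
    files
        .iter()
        .filter(|f| matches!(f.to_delete, Some(t) if t.0 < cutoff.0))
        .map(|f| f.id)
        .collect()
}

fn main() {
    let mut old = ParquetFile { id: 1, to_delete: None };
    old.flag_for_delete(Timestamp(100));

    let files = vec![
        old,
        ParquetFile { id: 2, to_delete: None },
        ParquetFile { id: 3, to_delete: Some(Timestamp(500)) },
    ];

    // Only file 1 was marked before the cutoff.
    assert_eq!(deletable_before(&files, Timestamp(150)), vec![1]);
}
```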
The sort key is optional and currently only produced by `iox_tests`.
Writing it within the ingester/compactor is tracked by #3968. The sort
key is read by the querier (and this will be verified by the query tests
and is required to merge #4103).
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* feat: schema client and CLI
* chore: clarification in comment in schema command
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
This includes some type changes to dispatch between OG and NG and allows
some tests to be run against the NG querier. This only contains parquet
files though, so the scope is somewhat limited.
For #3934.
Removed some unnecessary tests as they no longer apply with the new buffer structure. This will hopefully reduce the memory footprint of the ingesters significantly.
Closes #4072
* feat: schema grpc server & proto in router2
* chore: comments in schema proto
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* refactor: dyn-dispatch database in query subsystem
This is similar to #4080 but concerns the database itself.
For #3934.
* docs: improve wording
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
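For readers unfamiliar with the "dyn-dispatch database" refactor above, the general pattern is to have the query subsystem take a trait object rather than a concrete database type, so the same code path can serve OG and NG. A rough sketch under that assumption follows. The trait `QueryDatabase`, the structs `OgDb` / `NgDb`, and the table names are all hypothetical and are not the actual IOx definitions.

```rust
use std::sync::Arc;

/// Hypothetical trait covering what the query layer needs from a database.
trait QueryDatabase {
    fn kind(&self) -> &'static str;
    fn table_names(&self) -> Vec<String>;
}

/// Hypothetical OG database.
struct OgDb;

/// Hypothetical NG (querier-backed) database.
struct NgDb;

impl QueryDatabase for OgDb {
    fn kind(&self) -> &'static str {
        "og"
    }

    fn table_names(&self) -> Vec<String> {
        vec!["cpu".to_string(), "mem".to_string()]
    }
}

impl QueryDatabase for NgDb {
    fn kind(&self) -> &'static str {
        "ng"
    }

    fn table_names(&self) -> Vec<String> {
        vec!["cpu".to_string()]
    }
}

/// Query-side code sees only the trait object, never the concrete type.
fn describe(db: &dyn QueryDatabase) -> String {
    format!("{}: {} table(s)", db.kind(), db.table_names().len())
}

fn main() {
    // The same test or query code can be run against either backend.
    let dbs: Vec<Arc<dyn QueryDatabase>> = vec![Arc::new(OgDb), Arc::new(NgDb)];
    for db in &dbs {
        println!("{}", describe(db.as_ref()));
    }
}
```

The trade-off is a virtual call per method in exchange for not threading a generic parameter through every query-layer type.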
* feat: `TombstoneRepo::list_by_table`
* feat: `ParquetFileRepo::list_by_table_not_to_delete`
* refactor: `querier` w/o `db`
Get the `querier` to work w/o relying on `db`. A few notes:
- Testing is kinda shallow, we really need to get `query_tests` working
w/ `querier` (see #3934).
- We still run a sync loop for namespaces, tables and schemas. This will
be replaced by "update namespace incl. tables and schemas on demand".
Note however that we cannot fetch single tables and schemas on demand
at the moment, because DataFusion doesn't implement async schema
inspection (only `scan` / "give me all the chunks" is async). I think
that's OK for now and we can address this later.
- There is NO cache for parquet files and tombstones at the moment. For
correctness, they need to be fetched in a single transaction (or we
need a kinda tricky sequence number / logical clock tracking) and I am
not sure yet how this makes sense when we have the ingester data wired
up and predicates pushed down to the catalog (see next point). So
let's measure first and then decide on a caching strategy for this.
- Predicates are currently NOT pushed down to the catalog. I'll need to
figure out how to extract time range from generic DataFusion
expressions to make that work (it's easier for InfluxRPC queries, but
they are not tested at the moment, see first point).
Sorry that this commit is kinda huge. I initially planned to only
migrate the chunks away from `db` and leave the tables and schemas for a
follow-up PR, but the DataFusion trait structure (chunks are bound to
their tables) makes this kinda pointless.
Closes #3974.
* docs: explain what we're doing
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
* docs: mention tracking issues
* docs: explain what we're doing
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
This should never be done on its own so doesn't really need to be its
own method. We also don't do anything with the returned data, so no need
to allocate those vectors.