Immutable `Box<Vec<T>>`/`Arc<Vec<T>>` are better stored as
`Box<[T]>`/`Arc<[T]>` because:
- allocation always exact (no need for `shrink_to_fit`)
- smaller (the fat pointer is just the memory address and the length, no
capacity required)
- less allocation (`Box`/`Arc` -> slice instead of `Box`/`Arc` -> `Vec`
-> buffer); in fact the vector itself was offen missing in the
accounting code
Found while I was working on #7987.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
Avoid that the querier accesses files that were flagged for deletion a
long time ago. This would happen if the following conditions hold:
- we have very long-running querier pods (e.g. over holidays)
- the table doesn't receive any writes (or the partition if we ever
change the cache granularity), hence the querier is never informed
that its state is out-of-date
- a compactor runs a cold compaction, and by doing so flags a file for
deletion
- the GC finally wants to delete it
This is mostly a safety measure to prevent weird internal server errors
that should nearly never happen. On the other hand I do not want to hunt
Heisenbugs.
This is the major part of #7470. Additional clean ups (e.g. to remove
the actual types from `data_types`) will follow.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
This commit fixes loads of crates (47!) had unused dependencies, or
mis-configured dependencies (test deps as normal deps).
I added the "unused_crate_dependencies" to all crates to help prevent
this mess from growing again!
https://doc.rust-lang.org/beta/nightly-rustc/rustc_lint_defs/builtin/static.UNUSED_CRATE_DEPENDENCIES.html
This has the minor downside of false-positives when specifying
dev-dependencies for test/bench binaries - these are files in /test or
/benches (not normal tests). This commit includes a workaround,
importing them in lib.rs (gated by a feature flag). I think the
trade-off of better dependency management is worth it!
Introduce a new header called `iox-debug` which when set enables certain
debug features. The first one will be the `system.queries` table which
is a process-local, namespace-scoped query log. In most prod setups this
is only useful for debugging and will confuse the user a lot because
when multiple queries are deployed then the K8s routing decides which
pod/process the users hits. This leads to an inconsistent view. However
the log is still useful for debugging.
This also wires the "debug header set" flag through the Flight ticket,
because JDBC proved (integration tests FTW!) that headers are only
passed to `GetFlightInfo` but not to `DoGet` and the ticket must encode
all the relevant information.
Closes#7119.
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
As of #7250 / #7449 the partition sort key is no longer required for
query planning. Instead we use a combination of
`QueryChunk::partition_id` and `QueryChunk::sort_key` which is more
robust and easier to reason about.
Removing it simplifies the querier code a lot since we no longer need to
have a sort key for the ingester chunks and also don't need to "sync"
the sort key between chunks for consistency.
Warming up a cache should not block the planning, it is a mere signal to
the cache system to start to fetch data. See code comment for more
details.
This lowers the query latency in a few cases. I've seen at least one
trace were this would have been useful. This will never make things
worse (because the cache system drives the request to completion anyways).
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
Tests that use the in-memory catalog are creating different shards that
then creates old-style Parquet file paths, but in production, everything
uses the transition shard now. To make the tests more like production,
only ever create and use the transition shard, and stop checking for
different shard IDs.
This is helpful to test changes in our defaults but also for testing.
Required for https://github.com/influxdata/idpe/issues/17474 .
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* chore: Update datafusion and arrow/parquet to 37, tonic to 0.9.1
* refactor: Update for FieldRef and other API changes
* fix: Update field size calculation
* fix: Use `NullBuffer` directly
* fix: remove outdated comment
* chore: Update test for tonic
* chore: Run cargo hakari tasks
* chore: cargo update
---------
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
When pre-warming the catalog cache before the ingester responses have
returned, we don't have any ingester parquet file counts. This is
different than asking the ingesters for the parquet file counts and not
getting any. So keep the Option to be able to treat "not present"
differently from "present but empty".
* feat: improve querier->ingester tracing
- add more hierarchy items on the querier side
- ensure that streaming is correctly traced by the querier
* refactor: improve span name
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
* docs: `QueryDataTracer`
---------
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
In one prod case the majority of this was NOT spend on creating the
child chunks. I suspect that the summary creation and the string cloning
involved in there are quite slow. So let's have slightly more detailed
tracing and see.
The regex for replacing UUIDs needed to be changed like the normalizer's
regex did, so keep them in sync by using the same code.
This might point to the normalizer needing to be moved somewhere else,
or changing these tests to be e2e?
* chore: Update datafusion
* chore: update the plans
* fix: update some plans
* chore: Update plans and port some explain plans to use insta snapshots
* fix: another plan
* chore: Run cargo hakari tasks
---------
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
This commit adds initial support for "soft" namespace deletion, where
the actual records & data remain, but are no longer queryable /
writeable.
Soft deletion is eventually consistent - users can expect to continue
writing to and reading from a bucket after issuing a soft delete call,
until the various components either restart, or have their caches
flushed.
The components treat soft-deleted namespaces differently:
* router: ignore soft deleted namespaces
* ingester: accept soft deleted namespaces
* compactor: accept soft deleted namespaces
* querier: ignore soft deleted namespaces
* various gRPC services: ignore soft deleted namespaces
This ensures that the ingester & compactor do not see rows "vanishing"
from the database, and continue to make forward progress.
Writes for the deleted namespace that are buffered in the ingester will
be persisted as normal, allowing us to support "un-delete" operations
where the system is restored to a the state at which the delete was
issued (rather than loosing the buffered data).
Follow-on work is required to ensure GC drops the orphaned parquet files
after the configured GC time, and optimisations such as not compacting
parquet from soft-deleted namespaces seems like a trivial win.
It seems that tonic is caching DNS results for too long and clings to an
old ingester that no longer exists.
See https://github.com/influxdata/idpe/issues/17022 (not sure though if
this fix is sufficient, let's see).
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
* feat: introduce a new way of max_sequence_number for ingester, compactor and querier
* chore: cleanup
* feat: new column max_l0_created_at to order files for deduplication
* chore: cleanup
* chore: debug info for chnaging cpu.parquet
* fix: update test parquet file
Co-authored-by: Marco Neumann <marco@crepererum.net>
* refactor: invalidate querier cache if ingester is gone
For #6549 but I think even w/o the plan illustrated there, this is the
right thing to do.
Also changes the cache system to use flats sorted vectors instead of costly hash
maps.
* refactor: simplify code
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
- Place IDs last because they may hit the "max line length" limit and be
truncated. The other information should NOT be truncated with it.
- Unpack IDs to integer to remove useless `ParquetFileID(...)` wrappers
in output.
- Print number of files in addition to the actual list to simplify
debugging.
This was added in c82d0d8ca6dc02dcdd40a4c656a1ee51f3f9bfee with the
comment:
> Right now this would clearly indicate a bug and before I am trying to
> understand some prod issues, I wanna rule that one out.
In the RPC write path, this isn't a bug, it's quite expected.
Updating the sort key is not commutative and MUST be serialised. The
correctness of the current catalog interface relies on the caller
serialising updates globally, something it cannot reasonably assert in a
distributed system.
This change of the catalog interface pushes this responsibility to the
catalog itself where it can be effectively enforced, and allows a caller
to detect parallel updates to the sort key.
* refactor: speed up partition sort key syncing
Prior to syncing, all chunks have a "locally correct" partiton sort key,
i.e. one that at least covers all chunk columns (this is ensured during
chunk creation, both for parquet chunks as well as ingester chunks).
However due to the timing, some chunks may have a newer (= longer)
partition sort key. All we need to do to fix this is to pick the longest
partition sort key, there is no need to go through the whole cache
system again.
For #6358.
* docs: improve
Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: Andrew Lamb <alamb@influxdata.com>