During the schema merge the new tables are already iterated over (to find
which tables and columns are new), so the counts needed for the metrics
can be pre-computed, sparing two extra loops over the new tables and new
columns returned in `ChangeStats`.
This adds some computational overhead when merging a new namespace
schema with what's in the router's local cache, but will allow
gossiping of changes.
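As a rough illustration (hypothetical, simplified types - not the actual namespace schema structures), the counting can happen inside the merge loop itself rather than in a second pass over the collections returned in `ChangeStats`:

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical, simplified stand-in for the real change statistics.
#[derive(Debug, Default)]
struct ChangeStats {
    new_tables: BTreeSet<String>,
    new_columns: BTreeMap<String, BTreeSet<String>>,
    /// Counts pre-computed during the merge itself, so callers emitting
    /// metrics don't have to loop over the sets above again.
    new_table_count: usize,
    new_column_count: usize,
}

fn merge(
    local: &mut BTreeMap<String, BTreeSet<String>>, // table name -> column names
    incoming: &BTreeMap<String, BTreeSet<String>>,
) -> ChangeStats {
    let mut stats = ChangeStats::default();

    for (table, columns) in incoming {
        let existing = local.entry(table.clone()).or_insert_with(|| {
            stats.new_tables.insert(table.clone());
            stats.new_table_count += 1; // counted inline - no extra loop later
            BTreeSet::new()
        });

        for column in columns {
            if existing.insert(column.clone()) {
                stats
                    .new_columns
                    .entry(table.clone())
                    .or_default()
                    .insert(column.clone());
                stats.new_column_count += 1; // likewise counted inline
            }
        }
    }

    stats
}
```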
PR #8327 introduced a bunch of metrics for the sqlx connection pool. One
of them was the "used" metric, which was supposed to count
"currently in use" connections. In prod, however, this metric underflows to
a very large integer. It seems the "acquire" callback is only invoked by sqlx for
re-used connections (i.e. for the transition from "idle" to "used").
We could try to work around that, but since there is no "close
connection" callback, I doubt it is possible to do so accurately.
Luckily we don't really need that counter: sqlx already offers
"active" (defined as idle + used) and "idle", so "used" is just
the difference. I nevertheless removed the "used" metric, because
"active" and "idle" are read independently from each other (based on atomic
integers) and are NOT guaranteed to be in sync. Calculating the
difference within IOx would give the illusion that they are, so
I leave this to the dashboard / alert / whatever, where it is
usually understood that metrics are samples and may be out of sync for a
very short time.
A nice side effect of this change is that it simplifies the code quite a
bit.
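For reference, a minimal sketch of sampling the gauges sqlx does expose (`size()` roughly corresponds to "active", `num_idle()` to "idle"); the subtraction is deliberately left to whoever consumes the metrics:

```rust
use sqlx::postgres::PgPool;

/// Sample the pool gauges that sqlx exposes directly: `size()` counts all
/// open connections (idle + in use), `num_idle()` counts the idle ones.
/// "Used" would be the difference, but since the two values are sampled
/// independently, that computation is left to the dashboard/alerting layer.
fn sample_pool_gauges(pool: &PgPool) -> (u32, usize) {
    (pool.size(), pool.num_idle())
}
```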
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
In very rare cases a panic mid-write can result in a partially completed
write to the WAL which contains no table data. This is now not replayed
(as there is nothing to replay) and does not panic when encountered,
but records the occurrence in the WAL replayed ops metric and logs a
warning.
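A simplified sketch of that replay behaviour (hypothetical types and log wiring, not the actual WAL code):

```rust
use tracing::warn;

/// Hypothetical stand-ins for a replayed WAL entry and its per-table data.
struct WalOp {
    sequence_number: u64,
    table_batches: Vec<TableBatch>,
}
struct TableBatch;

/// Replay a single op: an op containing no table data is counted and warned
/// about, but otherwise skipped instead of causing a panic.
fn replay_op(op: &WalOp, ops_replayed: &mut u64) {
    *ops_replayed += 1;

    if op.table_batches.is_empty() {
        warn!(
            sequence_number = op.sequence_number,
            "skipping empty WAL op containing no table data"
        );
        return;
    }

    // ... apply the table batches to the in-memory buffer ...
}
```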
Exposes the `ERROR_WINDOW` parameter that controls the router's
downstream error-gate health check behaviour as an environment
variable/command line flag. This allows tuning, per-environment, the
period over which the error rate of 80% must be exceeded to cause an
ingester to appear unhealthy.
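A sketch of how such a knob can be surfaced via clap's derive API (assuming clap with the `env` feature plus humantime for duration parsing; the flag, env var, and default shown here are illustrative, not the real names/values):

```rust
use std::time::Duration;

use clap::Parser;

#[derive(Debug, Parser)]
struct RouterHealthConfig {
    /// Period over which the downstream error rate must exceed 80% before an
    /// ingester is considered unhealthy.
    #[clap(
        long = "rpc-write-health-error-window",
        env = "INFLUXDB_IOX_RPC_WRITE_HEALTH_ERROR_WINDOW",
        default_value = "5s",
        value_parser = humantime::parse_duration
    )]
    error_window: Duration,
}
```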
Cache the merged Schema of all the RecordBatch within a buffer at
snapshot generation time.
To be useful, this cached schema is made available to the PartitionData
for re-use, allowing the schema of "hot" data within a partition's
mutable buffer to be read without generating a RecordBatch first.
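A rough sketch of the idea using plain Arrow types (the actual buffer structures differ): merge the schema once at snapshot generation time and hand out the cached copy afterwards:

```rust
use std::sync::Arc;

use arrow::datatypes::Schema;
use arrow::error::ArrowError;
use arrow::record_batch::RecordBatch;

/// Simplified stand-in for a generated snapshot: the merged schema is
/// computed exactly once, at snapshot generation time, and cached.
struct Snapshot {
    batches: Vec<RecordBatch>,
    schema: Arc<Schema>,
}

impl Snapshot {
    fn new(batches: Vec<RecordBatch>) -> Result<Self, ArrowError> {
        let schema = Arc::new(Schema::try_merge(
            batches.iter().map(|b| b.schema().as_ref().clone()),
        )?);
        Ok(Self { batches, schema })
    }

    /// Cheap schema access - no RecordBatch has to be generated to answer
    /// "what columns does the hot data contain?".
    fn schema(&self) -> Arc<Schema> {
        Arc::clone(&self.schema)
    }
}
```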
This commit implements peer exchange (abbreviated PEX) between peers
of the gossip cluster.
This allows combining a set of fixed seeds with dynamic node membership -
nodes can come and go without having to be manually configured on every
peer in order to communicate.
"Dead" peers are periodically cleaned from the local list of active
peers, ensuring the list of peers doesn't grow forever as node churn
occurs. This is a best-effort, conservative process, biasing towards
reliability/deliverability rather than accuracy and fast removal - it's
not a health check!
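A simplified sketch (not the actual gossip types) of that conservative, periodic clean-up - a peer that is merely slow gets re-added the next time a frame arrives from it:

```rust
use std::collections::HashMap;
use std::net::SocketAddr;
use std::time::{Duration, Instant};

/// Simplified peer list keyed by peer identity, remembering when each peer
/// was last heard from.
struct PeerList {
    peers: HashMap<u64, (SocketAddr, Instant)>,
}

impl PeerList {
    /// Record that a frame was just received from `identity` at `addr`.
    fn observe(&mut self, identity: u64, addr: SocketAddr) {
        self.peers.insert(identity, (addr, Instant::now()));
    }

    /// Periodically invoked: drop peers not heard from within `max_age`.
    /// Deliberately conservative - this is about bounding list growth under
    /// node churn, not about health checking.
    fn remove_dead(&mut self, max_age: Duration) {
        let now = Instant::now();
        self.peers
            .retain(|_, (_, last_seen)| now.duration_since(*last_seen) <= max_age);
    }
}
```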
Provide row count & timestamp min/max statistics on a per-partition
basis.
This commit builds on the FSM summary statistics, merging all FSM
statistics across all data within the PartitionData (in various states)
and making them available to the caller.
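A simplified illustration (hypothetical types) of how such summary statistics merge across buffers in their various states:

```rust
/// Hypothetical per-buffer summary statistics.
#[derive(Debug, Clone, Copy)]
struct BufferStats {
    row_count: usize,
    timestamp_min: i64,
    timestamp_max: i64,
}

impl BufferStats {
    /// Merge two summaries: row counts add, timestamp ranges union.
    fn merge(self, other: Self) -> Self {
        Self {
            row_count: self.row_count + other.row_count,
            timestamp_min: self.timestamp_min.min(other.timestamp_min),
            timestamp_max: self.timestamp_max.max(other.timestamp_max),
        }
    }
}

/// Combine the stats of the "hot" buffer, any snapshots, and any persisting
/// data into a single per-partition summary (None if there is no data).
fn partition_stats(buffers: impl IntoIterator<Item = BufferStats>) -> Option<BufferStats> {
    buffers.into_iter().reduce(BufferStats::merge)
}
```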
Cache the row count & timestamp min/max values within the partition FSM
/ buffer, and make them available through the Queryable trait.
This allows the PartitionData to read the row count of a buffer (either
"hot" for writes, a "snapshot" of immutable RecordBatch, or "persisting"
for in-flight persisting data).
These values will enable early partition pruning.
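As a small, illustrative sketch of that pruning: a buffer whose cached timestamp range does not overlap the query's time range can be skipped without generating a RecordBatch at all:

```rust
/// Returns true if a buffer with the given cached timestamp min/max might
/// contain rows in the queried time range; if false, the buffer can be
/// pruned without materialising its data.
fn may_contain(query_min: i64, query_max: i64, buf_ts_min: i64, buf_ts_max: i64) -> bool {
    buf_ts_min <= query_max && buf_ts_max >= query_min
}
```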
To better gauge how many connections we use, and especially whether we hit
the max connection limit, it would be helpful to have some metrics
available for pool usage. This change adds a few basic metrics.