484 Commits (07505c8f721993c4b38a111458eea2023cfa974d)

Author SHA1 Message Date
Nga Tran 77a2541172
feat: flag partitions for delete (#6075)
* feat: flag partition for delete

* fix: compare the right date and time

* chore: Run cargo hakari tasks

* chore: cleanup

* fix: typos

* chore: rust style tidy ups in catalog

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: Luke Bond <luke.n.bond@gmail.com>
2022-11-09 12:06:23 +00:00
Dom d9c97795fc
feat: use IDs in ingester query API (#6093)
* refactor: NS+table ID (instead of name) in querier<>ingester

* feat(ingester): use IDs for query API

Changes the ingester to utilise the ID fields (instead of names) sent
over the query wire message wrapped within the Flight API.

BREAKING: this changes the "query-ingester" CLI command arguments which
now expects the namespace & table IDs, rather than their names.

* refactor(ingester): add more query logging context

Updates the log messages during query execution to include more context
fields.

* style: remove unused import

Co-authored-by: Marco Neumann <marco@crepererum.net>
2022-11-09 11:25:13 +00:00
Dom Dwyer 38b0459994 test: simplify tests / remove catalog
Remove the catalog from tests that only initialised an implementation in
order to call buffer_operation().
2022-11-08 17:02:01 +01:00
Dom Dwyer 226f14a97f perf(ingester): remove table lookup query
Now that DML operations contain the table ID, the ingester has all necessary
data to initialise the TableData buffer node without having to query the
catalog.

This also removes the catalog from the buffer_operation() call path,
simplifying testing.
2022-11-08 17:00:44 +01:00
Dom Dwyer 225c3b97c1 perf(ingester): remove namespace lookup query
Now that DML operations contain the namespace ID, the ingester has all
necessary data to initialise the NamespaceData buffer node without
having to query the catalog.
2022-11-08 16:57:53 +01:00
Dom Dwyer 8ebea0df37 feat: table/namespace IDs in write protocol
Expose the Table and Namespace IDs encoded within the serialised DML
write (added in #6036).

This makes the IDs available for use in the consumers, ending the
transition period. This commit DOES NOT remove the strings sent over the
wire.
2022-11-08 16:57:53 +01:00
Dom b7f7ee6a13
Merge branch 'main' into dom/mutex-pushdown 2022-11-08 14:57:32 +00:00
Dom Dwyer b73d07c22b perf(ingester): granular per-partition locking
This commit pushes the existing table-level mutex down to the partition.

This allows the ingester to gather data from multiple partitions within
a single table in parallel, and reduces contention between ingest/query
workloads.
2022-11-08 15:45:59 +01:00
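The locking change described in this commit can be illustrated with a minimal sketch. The `Table` and `Partition` types below are hypothetical stand-ins, not the ingester's real types, and the fixed set of partition keys is an assumption made to keep the example short. The point is only that each partition carries its own mutex, so holding one partition's lock does not block work on another.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical partition buffer; the real ingester types are not shown in this log.
#[derive(Default)]
pub struct Partition {
    pub rows: Vec<u64>,
}

/// Hypothetical table node: no table-wide mutex, one mutex per partition.
pub struct Table {
    partitions: HashMap<String, Arc<Mutex<Partition>>>,
}

impl Table {
    /// Build a table with a fixed set of partition keys (an assumption that
    /// keeps the sketch short).
    pub fn new(keys: &[&str]) -> Self {
        let partitions = keys
            .iter()
            .map(|k| (k.to_string(), Arc::new(Mutex::new(Partition::default()))))
            .collect();
        Self { partitions }
    }

    /// Return a handle to one partition without locking any other partition.
    pub fn partition(&self, key: &str) -> Option<Arc<Mutex<Partition>>> {
        self.partitions.get(key).cloned()
    }

    /// Append a row, locking only the target partition. Returns false if the
    /// partition does not exist.
    pub fn write(&self, key: &str, row: u64) -> bool {
        match self.partitions.get(key) {
            Some(p) => {
                p.lock().unwrap().rows.push(row);
                true
            }
            None => false,
        }
    }
}
```

With a single table-level mutex, a query holding the lock for partition "a" would also stall writes to partition "b"; in this sketch the two proceed independently.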
Dom Dwyer b8181119e1 refactor: push down per-partition op skipping
This moves the logic that skips operations that do not need to be
applied to a partition during shard replay from the table level, to the
partition level.
2022-11-08 15:45:52 +01:00
Dom Dwyer 4c8882e33a docs: ref link to fix PR 2022-11-08 15:17:46 +01:00
Dom Dwyer d71f023a57 refactor: inline helpers
Inline the hash generation & key comparator.
2022-11-08 15:17:46 +01:00
Dom Dwyer 8dd7f2c603 refactor: accept owned key for insert()
Changes the bounds on the ArcMap to accept an owned key, avoiding an
extra allocation.

Cleans up the bounds on other fn to ensure the borrowed key impl Eq and
is the ref type of K.
2022-11-08 15:17:46 +01:00
Dom Dwyer bbc2afe2a1 refactor: extract key equality checking
Creates a shared fn for checking key equality to DRY the various
chaining checks.
2022-11-08 15:17:46 +01:00
Dom Dwyer 8eaccd518b fix: cross-thread map entry visibility
This commit changes the ArcMap HashBuilder to use the same instance as
the underlying HashMap hasher.

This prevents divergent hashing across threads that MAY initialise a
hasher with a different seed.
2022-11-08 15:17:46 +01:00
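A small sketch of why a shared hasher state matters, assuming a seeded hasher that behaves like the standard library's `RandomState` (the commit does not name the hasher type, so this is an assumption). `SharedHasher` and `hash_with` are hypothetical names. Every lookup goes through the same state instance, so a key hashes to the same value no matter which thread asks.

```rust
use std::collections::hash_map::RandomState;
use std::hash::{BuildHasher, Hash, Hasher};

/// Hash `key` with a hasher built from `state`.
pub fn hash_with<S: BuildHasher, K: Hash + ?Sized>(state: &S, key: &K) -> u64 {
    let mut hasher = state.build_hasher();
    key.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical wrapper that owns ONE hasher state and uses it for every
/// lookup, regardless of which thread performs it.
pub struct SharedHasher {
    state: RandomState,
}

impl SharedHasher {
    pub fn new() -> Self {
        Self {
            state: RandomState::new(),
        }
    }

    /// Always hashes with the single stored state.
    pub fn hash_key(&self, key: &str) -> u64 {
        hash_with(&self.state, key)
    }
}
```

Two separately constructed `RandomState` values are seeded differently and will almost certainly hash the same key to different values, which is the kind of divergence the fix avoids.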
Dom Dwyer 66a6e8e929 test: cross-thread hashmap entry visibility
At the time of this commit, this test fails. Performing a get() on a key
previously inserted by another thread should not fail.
2022-11-08 15:17:46 +01:00
Dom Dwyer fbd25a06d0 revert: push down per-partition op skipping
This reverts commit 425fd46def.
2022-11-08 10:31:51 +01:00
Dom Dwyer 7ac0857a28 revert: granular per-partition locking
This reverts commit 79d24fa350.
2022-11-08 10:31:37 +01:00
Dom Dwyer 79d24fa350 perf(ingester): granular per-partition locking
This commit pushes the existing table-level mutex down to the partition.

This allows the ingester to gather data from multiple partitions within
a single table in parallel, and reduces contention between ingest/query
workloads.
2022-11-07 13:45:03 +01:00
Dom Dwyer 425fd46def refactor: push down per-partition op skipping
This moves the logic that skips operations that do not need to be
applied to a partition during shard replay from the table level, to the
partition level.
2022-11-07 13:45:03 +01:00
kodiakhq[bot] 5e297e259b
Merge branch 'main' into dom/arcmap-get_or_insert_with 2022-11-07 11:47:00 +00:00
Andrew Lamb 034d9b371d
chore: Update datafusion and arrow/arrow-flight/parquet to `26.0.0` (#6061)
* chore: Update datafusion and arrow/arrow-flight/parquet to `26.0.0`

* fix: Update query_functions

* fix: update for TimestampNanosecondArray API changes

* fix: update for TimestampNanosecondArray API changes

* chore: Update flatbuffers and remove rustsec warning

* chore: Update text

* fix: update more test

* fix: Lock ahash to exactly 0.8.0

* fix: Update datafusion pin

* chore: Run cargo hakari tasks

Co-authored-by: Carol (Nichols || Goulding) <carol.nichols@gmail.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-07 11:01:58 +00:00
Dom Dwyer 2b9e0e173f refactor: rename ArcMap::get_or_insert_with()
Renames ArcMap::get_or_else() to ArcMap::get_or_insert_with() for
consistency with the stdlib HashMap Entry.
2022-11-07 11:56:55 +01:00
Marco Neumann f511db380c
refactor: remove table name from chunks (#6063)
It should be always clear from the context to which table a chunk
belongs.

I think having a table name bound to a chunk goes back to a time where
chunks had multiple tables.

Helps with #6049.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-07 10:42:57 +00:00
YIXIAO SHI 586035b34d
chore: delete metric duplicate character (#6057)
* chore: delete metric duplicate character

* fix: failure ci test case

* fix: failure ci test case

* fix: failure ci test case

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-07 10:04:31 +00:00
Dom Dwyer 6fa48731aa feat: NamespaceId in DmlDelete
Changes the DmlDelete to contain the NamespaceId for which it should be
applied, propagating this value over the wire.

Like the existing IDs within the DmlWrite, these values are marked
unsafe to use to avoid the consumers utilising them accidentally
during deployment. Unlike DmlWrite, the DmlDelete is completely unused,
so this is less of an issue.
2022-11-03 13:57:40 +01:00
Dom Dwyer 30f69ce4f6 feat: ArcMap values() snapshot
Returns a snapshot of the values within an ArcMap.
2022-11-03 11:49:01 +01:00
Dom Dwyer 17890a9906 feat: add ArcMap map type
Implements a map of K -> Arc<V> with exactly-once initialisation
semantics.

This map can be used to ensure a given key maps to singleton instances
of V; exactly what all the nodes in the ingester "buffer tree" of shard
-> namespace -> table -> partition require.

This impl contains unused funcs (silenced with an allow(dead_code)) due
to it being picked from a future branch.
2022-11-03 11:29:09 +01:00
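The "K -> Arc<V> with exactly-once initialisation" idea can be sketched as follows. `SketchArcMap` is a hypothetical type built on a plain `Mutex<HashMap>`; it is not the ArcMap from this commit, whose internals are not shown in this log. The method names `get_or_insert_with()` and `values()` are borrowed from other commits in this listing.

```rust
use std::collections::HashMap;
use std::hash::Hash;
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for the ArcMap described above; NOT the IOx implementation.
pub struct SketchArcMap<K, V> {
    inner: Mutex<HashMap<K, Arc<V>>>,
}

impl<K: Eq + Hash, V> SketchArcMap<K, V> {
    pub fn new() -> Self {
        Self {
            inner: Mutex::new(HashMap::new()),
        }
    }

    /// Return the value for `key`, running `init` only if the key is absent.
    /// The lock is held across the check and the insert, so `init` runs at
    /// most once per key and every caller gets the same Arc.
    pub fn get_or_insert_with<F>(&self, key: K, init: F) -> Arc<V>
    where
        F: FnOnce() -> Arc<V>,
    {
        let mut guard = self.inner.lock().unwrap();
        Arc::clone(guard.entry(key).or_insert_with(init))
    }

    /// Return a point-in-time snapshot of the values.
    pub fn values(&self) -> Vec<Arc<V>> {
        self.inner.lock().unwrap().values().cloned().collect()
    }
}
```

Holding one lock for the whole map is the simplest way to get the singleton guarantee; a production map would likely use finer-grained synchronisation.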
Andrew Lamb 4fb2843d05
refactor: Rename `schema::selection::Selection` to `schema::projection::Projection` (#6037)
* chore: Rename `schema::selection::Selection` to `schema::projection::Projection`

* fix: docs

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-02 18:15:04 +00:00
Dom Dwyer ddd6ab0ba4 refactor(write_buffer): pass IDs in wire format
This commit is part of a two-part change in order to add the table &
namespace IDs to the write buffer wire format. This commit forms the
first half; changing the producer to send the IDs.

In this commit the new ID values are never read on the consumer side,
ensuring there is no consumer dependency on them. This ensures they
remain operational during a rollout, where the consumer may be updated
to the latest code dependent on the IDs before the producer is updated
to send them. This also ensures we have a window of time where the
consumers can be rolled back after being updated, and still handle
replaying messages in Kafka.
2022-11-02 13:28:56 +01:00
Marco Neumann 45b3984aa3
refactor: simplify `QueryChunk` data access (#6015)
* refactor: simplify `QueryChunk` data access

We have only two types for chunks (now that the RUB is gone):

1. In-memory RecordBatches
2. Parquet files

Loads of logic is duplicated in the different `read_filter`
implementations. Also `read_filter` hides a solid amount of logic from
DataFusion, which will prevent certain (future) optimizations. To enable #5897
and to simplify the interface, let the chunks return the data (batches
or metadata for parquet files) directly and let `iox_query` perform the
actual heavy-lifting.

* docs: improve

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

* docs: improve

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-11-02 08:18:33 +00:00
Marco Neumann 072439e428
refactor: mandatory `QueryChunkMeta::summary` (#5997)
With #5963 merged, all chunks now provide a summary (even though it may
not contain data for all columns). So let's make it mandatory, which
also removes a few 🙈-style `.expect(...)` calls.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-31 16:38:02 +00:00
Carol (Nichols || Goulding) dad1ad1318
feat: Add the catalog service to ingester, querier, and compactor
So that `remote get` that uses the catalog service can work no matter
what kind of server you contact.
2022-10-28 10:49:26 -04:00
Carol (Nichols || Goulding) 53445af25d
chore: Alphabetize some dependencies
I can't handle not knowing where to look for a dependency or knowing
where to add a new dependency.
2022-10-28 10:34:25 -04:00
Andrew Lamb e9d04ffcb5
feat: Log how long each persist plan takes to complete (#5989)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-28 13:52:39 +00:00
kodiakhq[bot] 1567227b49
Merge branch 'main' into dom/require-partition-key 2022-10-28 10:31:22 +00:00
Marco Neumann 8447d46093
refactor: remove `QueryChunkMeta::timestamp_min_max` (#5963)
Use the table summary instead. This allows us to have a single mechanism
that both IOx and DataFusion understand. This basically lifts the "basic
table summary" mechanism that the querier uses to `iox_query` and lets
the compactor and ingester use the same mechanism.

While not strictly necessary, simplifying the `QueryChunk[Meta]`
interface helps with #5897.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-10-28 10:29:16 +00:00
Dom Dwyer 72a358e52f refactor(dml): PartitionKey required for writes
Changes the DmlWrite type to require a PartitionKey be specified,
instead of accepting an Option.

This requirement was already in place - the write buffer upheld an
invariant that all writes contained a partition key value (was not
"None") or it panicked at runtime when attempting to enqueue the write.

It is now possible to encode this invariant in the type system, which is
what this change does.
2022-10-28 10:57:30 +02:00
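A minimal sketch of moving this invariant from a runtime check into the type system. All names here (`PartitionKey` as a newtype over `String`, `OptionalKeyWrite`, `RequiredKeyWrite`, `enqueue`) are hypothetical and chosen for illustration; they are not the real DmlWrite API.

```rust
/// Hypothetical partition key newtype.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PartitionKey(pub String);

/// Before: the key is optional, so the invariant is only checked at runtime.
pub struct OptionalKeyWrite {
    pub partition_key: Option<PartitionKey>,
}

impl OptionalKeyWrite {
    /// Panics when the key is missing, mirroring the runtime check the
    /// commit message describes.
    pub fn enqueue(&self) -> &PartitionKey {
        self.partition_key
            .as_ref()
            .expect("write without partition key")
    }
}

/// After: a write cannot be constructed without a key, so there is no
/// failure path left to handle.
pub struct RequiredKeyWrite {
    partition_key: PartitionKey,
}

impl RequiredKeyWrite {
    pub fn new(partition_key: PartitionKey) -> Self {
        Self { partition_key }
    }

    pub fn enqueue(&self) -> &PartitionKey {
        &self.partition_key
    }
}
```

The second form turns a missing key into a compile error at the call site instead of a panic when the write is enqueued.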
Dom Dwyer 5d2f4a0ad1 docs: fix issue URL for memory tracking bug 2022-10-27 10:15:15 +02:00
Dom Dwyer f6416675c2 docs: mark hyperlink in rustdoc comments 2022-10-27 10:15:15 +02:00
Dom Dwyer 678fb81892 refactor(ingester): use partition buffer FSM
This commit makes use of the partition buffer state machine introduced
in https://github.com/influxdata/influxdb_iox/pull/5943.

This commit significantly changes the buffering, and querying, of data
from a partition, swapping out the existing "DataBuffer" for the new
state machine implementation (itself simplified due to temporary lack of
incremental snapshot generation, see #5944).

This commit simplifies the query path, removing multiple types that
wrapped one-another to pass around various state necessary to perform a
query, with various query functions needing different types or
combinations of types. The query path now operates using a single type
(named "QueryAdaptor") that provides a queryable interface over the set
of RecordBatch returned from a partition.

There is significantly increased testing of the PartitionData itself,
covering data in various states and the ordering of returned RecordBatch
(to ensure correct materialisation of updates). There are also
invariants upheld by the type system / compiler to minimise the
complexities of working with empty batches & states, and many asserts
that ensure (mostly existing!) invariants are upheld.
2022-10-27 10:15:15 +02:00
Carol (Nichols || Goulding) 88c3a1f5e7
feat: Use workspace dep inheritance for the arrow-flight crate 2022-10-26 10:34:54 -04:00
Carol (Nichols || Goulding) 3145e2c05b
feat: Use workspace dep inheritance for the arrow crate 2022-10-26 10:34:29 -04:00
Carol (Nichols || Goulding) 44936f661a
feat: Use workspace dep inheritance for datafusion instead of shim crate 2022-10-26 10:33:56 -04:00
Carol (Nichols || Goulding) 2e83e04eab
feat: Use workspace package metadata to reduce differences and repetition 2022-10-24 13:04:09 -04:00
Dom Dwyer 39f826518b revert: use histogram to record TTBR
This reverts commit c63312ce12.

This change fixed a low-priority alert when there was no traffic flowing
through the system. The loss in TTBR value fidelity due to bucketing is
a greater concern as it affects live, high-volume clusters and hinders
operational insight.
2022-10-24 10:27:22 +02:00
Dom Dwyer 7b3fa43209 refactor: disable incremental snapshot generation
This commit removes the on-demand, incremental snapshot generation
driven by queries.

This functionality is "on hold" due to concerns documented in:

    https://github.com/influxdata/influxdb_iox/issues/5805

Incremental snapshots will be introduced alongside incremental
compactions of those same snapshots.
2022-10-21 17:41:43 +02:00
Dom db83053be7
Merge branch 'main' into dom/buffer-fsm 2022-10-21 16:32:54 +01:00
Dom Dwyer 8ca72ceff1 docs: fix state mod comments 2022-10-21 17:32:19 +02:00
Dom Dwyer c8fdd76033 feat(ingester): partition buffer state machine
This commit introduces code that is intended to replace the current
implicit state machine used by PartitionData. The existing code is still
in use, the new code is NOT used in this commit. A follow-up commit will
switch over to minimise the diff.

This change has two main goals:
    * encapsulation & simplification for callers
    * robust implementation so developing correct additions is easier

This is a significant refactor of the partition buffering logic to
encapsulate the various states of data (buffering, snapshot, persisting
and the mixed states between them) within the Partition. This relieves
the rest of the system from having to be concerned with the differences
between "buffering" data, and "unpersisted data", "snapshot data",
"persisting data", "persisting with snapshots" etc - callers now invoke
a method called get_query_data() and they are provided with all the
relevant data for a partition. This abstraction change alone
significantly reduces code and test complexity in the rest of the
ingester.

For the second goal, the new implementation leverages an explicit state
machine, encoded using typestates. Typestate ensures compile-time
correctness of transitions and method calls, and the explicit FSM itself
helps ensure the system progresses in the desired manner - this fixes
and helps prevent bugs caused by implicit states such as:

    https://github.com/influxdata/influxdb_iox/issues/5805

This state machine makes the system states explicit and
self-descriptive, helping to reduce the cost of developer on-boarding
(no prior knowledge of "how this bit works") and reduces ongoing
developer burden. This explicit nature also de-risks adding new
functionality - it should be relatively easy to add concurrent snapshot
generation or incremental compaction without introducing bugs. The state
transition logic is abstracted away from callers, minimising the
overhead of this strategy.
2022-10-21 14:25:51 +02:00
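The typestate approach mentioned in this commit can be sketched in a few lines. The states and methods below (`Buffering`, `Persisting`, `PartitionBuffer`, `into_persisting`, `mark_persisted`) are hypothetical and far simpler than the real state machine; only the `get_query_data()` name is taken from the commit message. Each state is its own type, and a transition consumes the old state, so calling a method in the wrong state fails to compile.

```rust
/// Hypothetical state: rows are still being accepted.
pub struct Buffering {
    rows: Vec<u64>,
}

/// Hypothetical state: a frozen snapshot is being persisted.
pub struct Persisting {
    snapshot: Vec<u64>,
}

/// The buffer is generic over its current state, so each state exposes only
/// the methods that are valid for it.
pub struct PartitionBuffer<S> {
    state: S,
}

impl PartitionBuffer<Buffering> {
    pub fn new() -> Self {
        Self {
            state: Buffering { rows: Vec::new() },
        }
    }

    /// Writes are only possible while buffering.
    pub fn write(&mut self, row: u64) {
        self.state.rows.push(row);
    }

    pub fn get_query_data(&self) -> &[u64] {
        &self.state.rows
    }

    /// Consumes the buffering state; the old value cannot be written to again.
    pub fn into_persisting(self) -> PartitionBuffer<Persisting> {
        PartitionBuffer {
            state: Persisting {
                snapshot: self.state.rows,
            },
        }
    }
}

impl PartitionBuffer<Persisting> {
    /// Callers use the same method name in every state.
    pub fn get_query_data(&self) -> &[u64] {
        &self.state.snapshot
    }

    /// Finish persisting and report how many rows were persisted.
    pub fn mark_persisted(self) -> usize {
        self.state.snapshot.len()
    }
}
```

In this sketch, calling `write()` on a `PartitionBuffer<Persisting>` is rejected by the compiler because no such method exists for that state.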
Carol (Nichols || Goulding) 59e1c1d5b9
feat: Pass trace id through Flight requests from querier to ingester
Fixes #5723.
2022-10-20 08:55:30 -04:00