Commit Graph

192 Commits (37c7ce793cc12a7f5dc29b3bc3d575c6c500f2bf)

Author SHA1 Message Date
Andrew Lamb 02893e598c
chore: Update datafusion and upgrade arrow/parquet/arrow-flight to 13 (#4516)
* chore: Tool for automating arrow version update

* chore: Update datafusion and arrow/parquet/arrow-flight

* fix: update for changes in Arrow API

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-05 00:21:02 +00:00
dependabot[bot] 420c306caa
chore(deps): Bump tokio from 1.17.0 to 1.18.0 (#4453)
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.17.0 to 1.18.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.17.0...tokio-1.18.0)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-04-28 08:21:17 +00:00
Marco Neumann 59f6556483
fix: do not create empty batches in ingester (#4443)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-27 17:52:22 +00:00
Nga Tran fa2c1febf4
feat: use stored partition sort key to deduplicate data (#4360)
* feat: use stored sort key to deduplicate data

* refactor: verify if one is a super sort key of the other

* test: unit tests for scan and deduplication plans

* fix: typo

* refactor: refactor and add comments

* feat: cache partition sort key to read during planning as needed

* test: tests for query plans with different overlap groups

* chore: cleanup

* chore: resolve merge conflicts

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-26 20:36:32 +00:00
Marco Neumann 11f87cffdd
fix: memorize max persisted tombstone (#4430) 2022-04-26 16:13:09 +00:00
Marco Neumann bd600bbac6
refactor: allow ingester to be integrated into query tests (#4427)
* refactor: improve `IngesterData` public interface

* feat: impl `Debug` for `Test{Namespace,Sequencer}`

* refactor: trait interface for `LifecyleHandle`

This is required to mock the lifecycle for query tests.

* refactor: trait for partitioner
2022-04-26 13:44:30 +00:00
二手掉包工程师 4b47d723b1
refactor: Rename time to iox_time (#4416)
Signed-off-by: hi-rustin <rustin.liu@gmail.com>

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-26 00:19:59 +00:00
Marco Neumann 86e8f05ed1
fix: make all catalog IDs 64bit (#4418)
Closes #4365.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-25 16:49:34 +00:00
Nga Tran d963110842
feat: group chunk overlaps based on time range only (#4389)
* feat: overlap for NG querier

* chore: cleanup

* refactor: address review comments

* fix: typo

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-25 13:32:07 +00:00
Marco Neumann f444e63960
test: include materialized delete predicates in NG query tests (#4371)
* refactor: move `batch_filter` to `datafusion_util`

* fix: outdated docstring

* feat: allow passing record batches to `iox_tests` parquet files

* test: include materialized delete predicates in NG query tests

* docs: improve wording

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-21 13:00:13 +00:00
Andrew Lamb 73bed810da
chore: Update arrow, arrow-flight, parquet, tonic, prost, etc (#4357)
* chore: Update datafusion

* chore: Update arrow/arrow-flight/parquet to 12

* chore: update datafusion correctly

* chore: Update prost, tonic, and dependents

* fix: Fixup some api changes

* fix: Update test output in db

* fix: Update test output in parquet_file

* fix: remove old pbjson types

* fix: Add "--experimental_allow_proto3_optional" flag

* chore: Run cargo hakari tasks

* fix: compile error

* chore: Update heappy

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-20 11:12:17 +00:00
Andrew Lamb 5ea676d3f7
feat: add per kafka partition durability reporting to write info response (#4341)
* feat: add per kafka partition durability reporting to write info response

* fix: buf lint + test cleanup

* fix: clean up protobuf

* refactor: pull out conversion of KafkaPartitionStatus into a function

* fix: fmt

* fix: typo

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-19 16:46:20 +00:00
Marco Neumann 5b48675435
fix: actually transmit record-batch metadata from querier (#4347)
Attaching the "batch => partition" mapping via per-batch schema KV
metadata does NOT work because flight will transmit the schema once for
all batches (even though on the Rust side we have a schema ref attached
to every batch, probably for convenience). Instead we now use the same
global protobuf metadata that we also use for the "partition => max
sequence number" information. This somewhat limits our ability to create
record batches lazily on the ingester side (since the global metadata is
sent before any actual payload) but I think we should not modify the
usage of the flight protocol too much right now (e.g. by sending more
schema messages). If this becomes an issue, we can always find a more
complex solution in the future.
2022-04-19 10:54:23 +00:00
Nga Tran 2a601c3099
fix: Revert "chore: Revert "fx: Revert "fix: Revert "feat: Use the sort key stored in the catalog during compaction" (#4299)" (#4303)" (#4327)" (#4328)
* fix: Revert "chore: Revert "fx: Revert "fix: Revert "feat: Use the sort key stored in the catalog during compaction" (#4299)" (#4303)" (#4327)"

This reverts commit 7e5d719027.

* chore: resolve merge conflict

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-18 15:27:39 +00:00
Nga Tran 7e5d719027
chore: Revert "fix: Revert "fix: Revert "feat: Use the sort key stored in the catalog during compaction" (#4299)" (#4303)" (#4327)
This reverts commit fe8d9948d5.
2022-04-14 17:11:55 +00:00
Carol (Nichols || Goulding) fe8d9948d5
fix: Revert "fix: Revert "feat: Use the sort key stored in the catalog during compaction" (#4299)" (#4303)
This reverts commit 7ddbf7c025.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-14 15:42:28 +00:00
Marco Neumann 351b0d0c15
fix: unknown namespace/table in querier<>ingester flight protocol (#4307)
* fix: return "not found" gRPC error instead of "internal" when ingester does not know table

* fix: properly handle "namespace not found" in ingester queries

* fix: make `initialize_db` work with async code

* test: add custom step for NG tests

* fix: handle "unknown table/namespace" resp. in querier

* docs: explain test setup

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2022-04-14 12:36:15 +00:00
Marco Neumann 8bf2fbb7d3
fix: ingester min-unpersisted-sequence-number calc + doc (#4302)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-14 07:05:06 +00:00
Carol (Nichols || Goulding) 7ddbf7c025
fix: Revert "feat: Use the sort key stored in the catalog during compaction" (#4299)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-13 14:11:10 +00:00
kodiakhq[bot] 21f748062e
Merge branch 'main' into cn/sort-in-compactor 2022-04-13 12:43:31 +00:00
Marco Neumann 83f77712b1
refactor: querier<>ingester flight protocol adjustments (#4286)
* refactor: querier<>ingester flight protocol adjustments

This makes a few adjustments to the querier<>ingester flight protocol.

Query Scope
===========
The querier will request data for ALL sequencer IDs for now. There is
no reason to have a request per sequencer ID. We can add a range/set
filter later if we want, but this is not required for now.

Partition-level
===============
The only time when the querier cares about sequencer IDs (i.e. sharding)
at all is when it selects which ingesters to ask for unpersisted data
(this is currently not implemented, it just asks all ingesters).
Afterwards the querier only cares about partitions (which are bound to
specific sequencers anyways) because this is the level where parquet
file persistence and compaction as well as deduplication happen. So we
make partitions a first-class citizen in the ingester response.

Metadata VS RecordBatches
=========================
The global app-metadata will list all partitions and their max
persisted parquet files and tombstones (theoretically tombstones are at
table-level, but the ingester could in the future break them down to the
partition-level). Then it receives a stream of record batches. Each
record batch is tagged (via key-value metadata in its schema) so it can
be assigned to a partition. At the moment the ingester returns 0 or 1
batches per unpersisted partition (0 in case we've filtered out all the
data via the predicate), but in the future it is free to return multiple
batches. This setup gives the ingester more freedom over memory
management and (potentially parallel) query processing, while at the
same time keeps the set of duplicated information minimal and allows
easy extensions (since the global metadata is a full-blown protobuf
message).

Querier
=======
At the moment the querier ignores all the metdata. Follow-up PRs will
change that.

* docs: improve

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* refactor: make code clearer

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2022-04-12 16:48:40 +00:00
Carol (Nichols || Goulding) 87e4a1a51d
refactor: Move ingester sort key to schema sort key
This logic isn't actually ingester specific
2022-04-11 14:09:45 -04:00
Carol (Nichols || Goulding) a053077a05
refactor: Make compute_sort_key more general than the ingester
Enable computing sort keys for a schema and an iterator of record
batches.
2022-04-11 14:09:45 -04:00
Dom Dwyer 5c3cbb14b4 test: join ingester background tasks 2022-04-08 14:24:56 +01:00
Dom Dwyer dce939c580 refactor: use SequencedStreamHandler
Removes the old stream_in_sequenced_entries() write buffer handler,
replacing it with the SequencedStreamHandler introduced in #4203.

This change will affect the metrics emitted by an ingester as outlined
in #4243.
2022-04-08 11:28:39 +01:00
Dom Dwyer 71a278ac7e refactor: accept !Sync write buffer streams
Removes the Sync bound SequencedStreamHandler input stream type, as the
BoxStream returned by the WriteBufferStreamHandler is not Sync.

This change means the SequencedStreamHandler is not Sync either, but is
still Send and therefore can be moved into tokio tasks.
2022-04-08 11:28:39 +01:00
Dom Dwyer c2236fa3fb feat: impl DmlSink for IngesterData
This commit adds an adaptor (IngestSinkAdaptor) that provides a DmlSink
implementation for the existing write path (IngesterData). With this,
the existing write path becomes compatible with the new
op stream handler (SequencedStreamHandler).
2022-04-08 11:28:39 +01:00
Andrew Lamb a30a85e62c
feat: Add get_write_info service (#4227)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-07 19:24:58 +00:00
kodiakhq[bot] 8bd0bfb669
Merge branch 'main' into dom/ingester-op-instrumentation 2022-04-07 16:33:25 +00:00
kodiakhq[bot] f5996c5ab4
Merge branch 'main' into cn/sort-key-across-persists 2022-04-07 14:40:55 +00:00
Dom 998a66fd98
docs: Update ingester/src/stream_handler/sink_instrumentation.rs
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2022-04-07 12:18:14 +01:00
Carol (Nichols || Goulding) 30c3ef5aa6
fix: Only save relevant columns in parquet file's sort key 2022-04-06 14:09:08 -04:00
Dom Dwyer 24eeddce8a chore: fix lint warnings 2022-04-06 16:45:31 +01:00
Dom Dwyer 091640bb23 feat: emit tracing span for op apply
This commit uses the tracing metadata within the DmlOperation to emit a
tracing span from the ingester covering the DmlSink::apply() operation.
2022-04-06 16:32:00 +01:00
Dom Dwyer f6c65f52a3 refactor: impl WatermarkFetcher
Implement WatermarkFetcher for PeriodicWatermarkFetcher and remove
unnecessary async.
2022-04-06 16:32:00 +01:00
Dom Dwyer 436da19d9a feat: DmlSink instrumentation
This commit adds the SinkInstrumentation type that decorates an inner
DmlSink with call latency and write buffer metrics.

The write buffer / sink call metrics may be split apart into two
separate responsibilities in the future if there are multiple DmlSink
that need instrumentation, but deferring adding more types until it is
needed.
2022-04-06 16:32:00 +01:00
Andrew Lamb c244b03281
feat: Add `SequencerProgress` reporting to ingester (#4238)
* feat: Add `SequencerProgress` reporting to ingester

* refactor: Use KafkaPartition in write_summary

* fix: Update docstrings

* refactor: Change ingester to use KafkaPartition everywhere

* refactor: add SequencerProgress::combine

* refactor: return new SequencerProgress rather than updating

* fix: distinguish between yes/no/unknown in WriteSummary

* docs: Update data_types2/src/lib.rs

Co-authored-by: Paul Dix <paul@pauldix.net>

Co-authored-by: Paul Dix <paul@pauldix.net>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-06 15:13:21 +00:00
Carol (Nichols || Goulding) bf3cb45723
refactor: Pass PartitionInfo as argument 2022-04-06 09:31:42 -04:00
Carol (Nichols || Goulding) f0d5987317
feat: Update partition sort_key in catalog after persist
Connects to #4196.
2022-04-06 09:31:42 -04:00
Carol (Nichols || Goulding) b16fcc284d
feat: Add new columns to the sort key during compaction
Connects to #4196.
2022-04-06 09:31:42 -04:00
Carol (Nichols || Goulding) 98d052dba7
feat: Use catalog sort key if specified
Pass the sort key from the catalog through to compact_persisting_batch.
If the sort key is Some, use that. If the sort key is None, compute it
from the data's cardinality with compute_sort_key.

Connects to #4196.
2022-04-06 09:31:42 -04:00
Dom Dwyer 891d2e1368 feat: periodic kafka max watermark offset fetcher
Adds the PeriodicWatermarkFetcher type responsible for querying write
buffer / Kafka for the maximum sequence number / offset, surfacing any
errors via both logs & metrics.

This high watermark / max offset value is used within the ingest
instrumentation metrics. This use case is tolerant of caching / stale
values, and as such the value is periodically updated to minimise load
on the write buffer.
2022-04-05 12:02:07 +01:00
Dom Dwyer aaa677dec8 docs: describe graceful shutdown behaviour 2022-04-05 11:31:55 +01:00
Dom Dwyer 8edefc415d refactor: rename ttbr -> write_time in tests 2022-04-05 11:31:55 +01:00
Dom Dwyer a387ec361d refactor: use self.deref() instead of **self 2022-04-05 11:31:55 +01:00
Dom Dwyer f15275cf96 feat: expose ingest sequencer errors
Instruments the SequencedStreamHandler with a series of new metrics that
record the various error classes observable in the stream handler.

These metrics are labelled with potential_data_loss=true where relevant
to surface potential data loss events for alerting & further review.
2022-04-05 11:31:55 +01:00
Dom Dwyer 083ff1f8e3 refactor: ingest stream handler
Refactors the stream_in_sequenced_entries() into a new impl in the
SequencedStreamHandler type, decoupling the reading / decoding of ops
from Kafka (and associated error handling) from the "what happens to
those ops" concern to ease testing, encapsulate the specifics of "how to
get an op" and improve flexibility.

This is intended to provide robust error handling within what is
reasonably possible (unexpected errors are always unexpected!) while
retaining the existing metrics and functionality. I've also separated
out code that exists in the current impl specifically to drive tests
from the prod code path, instead driving those behaviours through mocks.

As of this commit, the handler is not used - this commit simply adds the
new impl.
2022-04-05 11:31:54 +01:00
Paul Dix 81d41f81a1
fix: ingester replay logic (#4212)
Fix the ingester to track the max persisted sequence number per partition.
Ensure replay takes in data from unpersisted partitions.
Simplify the table persist info to not return a max persisted sequence number for the table as that information isn't needed.
2022-04-04 18:04:34 +00:00
Carol (Nichols || Goulding) d41adf074f
test: Add assertions for sort keys 2022-04-01 13:13:04 -04:00
Carol (Nichols || Goulding) f4b5fa1b5e
feat: Implement distinct counts in terms of distinct values
For one record batch.

Connects to #4194.
2022-03-31 16:46:27 -04:00