Commit Graph

6857 Commits (4db27eec68cda9b5f7fd5b23d58e629ea0d586c8)

Author SHA1 Message Date
Dom Dwyer b38deaa721 refactor: decouple error types for DmlHandler
Allows the DmlHandler to return different types for each method.

This enables a DmlHandler implementation decorating an inner handler to
return the inner handler's error directly, avoiding any "wrapper"
errors.
2022-01-28 11:01:06 +00:00
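A minimal sketch of what "different error types for each method" could look like, assuming a simplified trait. The method signatures, the `Inner` and `Counting` types, and the error structs below are hypothetical illustrations, not the actual router2 definitions. The point it shows is that a decorating handler can re-export the inner handler's associated error types and return those errors unchanged, with no wrapper enum.

```rust
use std::fmt::Debug;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical, simplified handler trait: each method has its own
/// associated error type instead of one shared error.
pub trait DmlHandler {
    type WriteError: Debug;
    type DeleteError: Debug;

    fn write(&self, namespace: &str, lines: &str) -> Result<(), Self::WriteError>;
    fn delete(&self, namespace: &str, table: &str) -> Result<(), Self::DeleteError>;
}

/// Illustrative error types (assumptions, not from the commit).
#[derive(Debug, PartialEq)]
pub struct WriteRejected(pub String);

#[derive(Debug, PartialEq)]
pub struct DeleteRejected(pub String);

/// Innermost handler that owns the concrete error types.
pub struct Inner;

impl DmlHandler for Inner {
    type WriteError = WriteRejected;
    type DeleteError = DeleteRejected;

    fn write(&self, namespace: &str, lines: &str) -> Result<(), WriteRejected> {
        if lines.is_empty() {
            return Err(WriteRejected(format!("empty write to {}", namespace)));
        }
        Ok(())
    }

    fn delete(&self, namespace: &str, table: &str) -> Result<(), DeleteRejected> {
        if table.is_empty() {
            return Err(DeleteRejected(format!("no table given for {}", namespace)));
        }
        Ok(())
    }
}

/// Decorator that counts calls and forwards to the inner handler.
pub struct Counting<H> {
    inner: H,
    calls: AtomicUsize,
}

impl<H> Counting<H> {
    pub fn new(inner: H) -> Self {
        Self { inner, calls: AtomicUsize::new(0) }
    }

    pub fn calls(&self) -> usize {
        self.calls.load(Ordering::Relaxed)
    }
}

impl<H: DmlHandler> DmlHandler for Counting<H> {
    // Reuse the inner handler's error types directly: no wrapper error.
    type WriteError = H::WriteError;
    type DeleteError = H::DeleteError;

    fn write(&self, namespace: &str, lines: &str) -> Result<(), Self::WriteError> {
        self.calls.fetch_add(1, Ordering::Relaxed);
        self.inner.write(namespace, lines)
    }

    fn delete(&self, namespace: &str, table: &str) -> Result<(), Self::DeleteError> {
        self.calls.fetch_add(1, Ordering::Relaxed);
        self.inner.delete(namespace, table)
    }
}
```

With this shape, a caller of `Counting::new(Inner).write(..)` matches on `WriteRejected` directly, exactly as if it had called `Inner`.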
Dom 1669acc0c2
build: update tokio (#3566)
Release notes:
    https://github.com/tokio-rs/tokio/releases/tag/tokio-1.16.0
2022-01-28 10:36:12 +00:00
Marco Neumann a22ca7c3d7
fix: router2 writer buffer topic (#3555)
- Kafka does not support `_` in topic names, but `-` works, so let's
  change the default
- Expose topic config via CLI/env
2022-01-28 10:10:04 +00:00
Dom 32d7c4cbfe
refactor: remove InfluxColumnType::IOx (#3565)
* refactor: remove InfluxColumnType::IOx

Remove unused column variant - see #3554 for context.

* refactor: reserve SEMANTIC_TYPE_IOX name in proto

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-27 21:15:36 +00:00
Andrew Lamb b486258dfb
chore: run cargo update (#3559)
Co-authored-by: Edd Robinson <me@edd.io>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-27 21:05:31 +00:00
Dom 9201023ea4
feat: schema validation for MutableBuffer instances (#3554)
* refactor: Debug bounds on Catalog trait

* feat: validate MutableBatch schema

Changes the schema validation code to validate MutableBatch instances
(coming from a pre-parsed LP write, and non-LP-based writes) instead of
parsed LP lines.

* refactor: Send bound on boxed errors

* refactor: clippy

Allow assert_eq!(bool, bool) for readability.

* refactor: no PartialEq<MB Column> for ColumnSchema

Remove the PartialEq<mutable_buffer::Column> for ColumnSchema - it's
definitely more readable as a method call.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-27 20:55:18 +00:00
Andrew Lamb f24ce03754
fix: provide correct environment variable to change log filter (#3561)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-27 20:45:06 +00:00
Edd Robinson 87af662d3b
refactor: include comments in test cases (#3550) 2022-01-27 20:35:14 +00:00
Luke Bond 4a96e52290
feat: router2 sharder benchmarking (#3558)
* feat: benchmarking the router2 sharder

* chore: added throughput to sharder benchmarks; vary num buckets
2022-01-27 18:09:16 +00:00
Raphael Taylor-Davies 5efc42494c
feat: add chunk order to chunk columns table (#3556)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-27 17:14:28 +00:00
Raphael Taylor-Davies d1d45fe818
feat: columnar predicate pruning across `Chunks` (#3553)
* feat: columnar predicate pruning

* fix: doc

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-27 17:02:46 +00:00
Nga Tran fb33a88dc8
test: Delete application during Ingester's compaction (#3542)
* test: Delete application during Ingester's compaction

* fix: typos

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

* chore: remove comments

Co-authored-by: Andrew Lamb <alamb@influxdata.com>

Co-authored-by: Andrew Lamb <alamb@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-27 16:53:17 +00:00
Andrew Lamb 2062267d0f
chore: Update hashbrown (#3551)
* chore: Update hashbrown

* fix: hakari

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-27 15:34:10 +00:00
Edd Robinson 5befa7922b
refactor: move chunk metrics to module (#3548)
* refactor: move chunk metrics into module

* refactor: use Metrics as internal name

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-27 15:24:31 +00:00
Dom ce568ab447
build(iox_catalog): remove unused dependencies (#3552)
* build: don't pull in all of tokio

We already specify the tokio features we need so "full" (all features)
is not necessary.

* build: remove chrono dependency

Appears unused.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-27 15:14:42 +00:00
Andrew Lamb 8dd96127d6
fix: Reuse the same DataFusion DiskManager and MemoryManager (Do not recreate temp files) (#3515)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-27 15:04:06 +00:00
Dom 5447554aee
refactor(router2): DML handler stack (#3549)
* refactor: composable DmlHandler stack

Changes the DmlHandler trait to allow composition of handler logic in
order to construct the complete request processing pipeline.

* feat: debug log write/delete requests

Log requests hitting the HTTP endpoint at DEBUG.

* refactor: dml_handler -> dml_handlers

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-27 14:54:27 +00:00
Raphael Taylor-Davies 21c1824a7a
refactor: remove table_names from Predicate (#3545)
* refactor: remove table_names from Predicate

* chore: fix benchmarks

* chore: review feedback

Co-authored-by: Edd Robinson <me@edd.io>

* chore: review feedback

* chore: replace Default::default with InfluxRpcPredicate::default()

Co-authored-by: Edd Robinson <me@edd.io>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-27 14:44:49 +00:00
Paul Dix 16d584b2ff
feat: Add db_name/namespace to DmlWrite and DmlDelete (#3531)
* feat: Add db_name/namespace to DmlWrite and DmlDelete

This is required for the new ingester to be able to work with the write buffer. The protobuf that gets serialized over Kafka already includes the database name, it just wasn't getting carried through to the marshaled Dml operation.

* fix: database != namespace, propagation through write buffer

Co-authored-by: Marco Neumann <marco@crepererum.net>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-27 14:12:20 +00:00
Andrew Lamb 5488c257d1
chore: Update datafusion, upgrade to arrow/parquet/arrow-flight 8.0.0 (#3517)
* chore: Update datafusion

* chore: update to arrow 8

* fix: update to use new DataFusion APIs

* fix: update case for sortedness

* fix: cargo hakari
2022-01-27 13:33:27 +00:00
Andrew Lamb 7261571abf
fix: Revert "chore: temporarily hack around datafusion tempfiles" (#3525)
This reverts commit ae5763c1cb6bb4a98ffe0779a3a35f6daaf10971.

Co-authored-by: Edd Robinson <me@edd.io>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-27 12:08:49 +00:00
Carol (Nichols || Goulding) bc44d33108
feat: Implement a snapshot method on DataBuffer (#3518)
* feat: Implement a snapshot method on DataBuffer

Fixes #3510.

* test: Add a test snapshotting batches with different but compatible schemas

* fix: Simplify min/max sequencer number collection

The first batch should always have the min sequencer number. The last
batch should always have the max sequencer number. The min should always
be less than (or equal to, in case there's only one batch) the max.
2022-01-26 15:22:51 +00:00
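The simplification described in bc44d33108 can be sketched as follows, assuming batches are kept in arrival order so that sequencer numbers never decrease. The `Batch` struct and the function name are hypothetical; the real `DataBuffer` types are not shown in the log.

```rust
/// Hypothetical stand-in for a buffered batch; only the field needed
/// for the illustration is included.
#[derive(Debug, Clone, PartialEq)]
pub struct Batch {
    pub sequencer_number: u64,
}

/// Returns `(min, max)` sequencer numbers for an ordered run of batches,
/// or `None` if there are no batches.
///
/// Assumes the batches are ordered by sequencer number, so the first
/// batch carries the minimum and the last batch carries the maximum.
pub fn min_max_sequencer_numbers(batches: &[Batch]) -> Option<(u64, u64)> {
    let min = batches.first()?.sequencer_number;
    let max = batches.last()?.sequencer_number;
    // With a single batch, min == max; otherwise min must not exceed max.
    debug_assert!(min <= max, "batches are expected to be ordered");
    Some((min, max))
}
```

Reading only the first and last elements avoids scanning every batch, at the cost of relying on the ordering invariant stated in the commit message.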
Edd Robinson 0a0b8b2150
feat: decouple read buffer row group size from Datafusion batch size (#3538)
* feat: add chunk builder

* test: test coverage for chunk builder

* refactor: apply suggestions from code review

* refactor: address PR feedback
2022-01-26 12:39:29 +00:00
Luke Bond 107f39d53c
feat: add trace collector to router2 (#3529)
* feat: add trace collector to router2

* chore: fmt
2022-01-26 11:51:17 +00:00
Dom 6b0f7e6b2b
feat: initialise ShardedWriteBuffer (#3528)
Initialises a ShardedWriteBuffer for the hard-coded "iox_shared" topic.

Adds the following CLI flags:

    * --write-buffer: type of buffer [kafka, rskafka, file]
    * --write-buffer-addr: write buffer endpoint address

The server uses these config options to initialise the appropriate write
buffer backend, and configure the TableNamespaceSharder to shard
operations over the set of sequencers exposed by the write buffer.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-26 10:49:34 +00:00
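As a rough illustration of the two flags added in 6b0f7e6b2b, the sketch below parses `--write-buffer` and `--write-buffer-addr` using only the standard library. The flag names and the accepted values (`kafka`, `rskafka`, `file`) come from the commit message; the `WriteBufferType` and `WriteBufferConfig` types, the `parse_args` function, and the space-separated `--flag value` form are assumptions made for the example and say nothing about how the server actually parses its arguments.

```rust
use std::str::FromStr;

/// Write buffer backends named in the commit message.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WriteBufferType {
    Kafka,
    RsKafka,
    File,
}

impl FromStr for WriteBufferType {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "kafka" => Ok(Self::Kafka),
            "rskafka" => Ok(Self::RsKafka),
            "file" => Ok(Self::File),
            other => Err(format!("unknown write buffer type: {}", other)),
        }
    }
}

/// Hypothetical config struct holding the two parsed options.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct WriteBufferConfig {
    pub kind: WriteBufferType,
    pub addr: String,
}

/// Parses `--flag value` pairs; both flags are required in this sketch.
pub fn parse_args(args: &[&str]) -> Result<WriteBufferConfig, String> {
    let mut kind = None;
    let mut addr = None;

    let mut it = args.iter();
    while let Some(flag) = it.next() {
        let value = it
            .next()
            .ok_or_else(|| format!("missing value for {}", flag))?;
        match *flag {
            "--write-buffer" => kind = Some(value.parse::<WriteBufferType>()?),
            "--write-buffer-addr" => addr = Some(value.to_string()),
            other => return Err(format!("unknown flag: {}", other)),
        }
    }

    Ok(WriteBufferConfig {
        kind: kind.ok_or("--write-buffer is required")?,
        addr: addr.ok_or("--write-buffer-addr is required")?,
    })
}
```

For instance, `parse_args(&["--write-buffer", "kafka", "--write-buffer-addr", "localhost:9092"])` yields a config with `WriteBufferType::Kafka` and the given address, where the address value is only a placeholder.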
Raphael Taylor-Davies 1b6aed063d
feat: add per-partition tracing (#3532)
* feat: add per-partition tracing

* chore: docs

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-26 10:39:21 +00:00
Marco Neumann 2928254c0f
fix: test logging (#3536)
- Use a more standard way to set up the tracing subsystem (as described
  in tracing-subscriber docs)
- Also capture content from `log` crate
- Play nice w/ Rust's libtest message capture

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-26 10:28:51 +00:00
Marco Neumann 7feb10dd30
fix: bring back GIT revision to our prod images (#3537)
This was likely broken since #3313 and leads to IOx reporting `UKNOWN`
instead of a proper GIT revision. Having the latter one available can be
very useful for debugging a binary or if you look through log files (we
print the IOx version and revision during server startup).

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-26 10:12:43 +00:00
Edd Robinson 1c2681c24e
refactor: remove unused const (#3521) 2022-01-26 09:23:50 +00:00
Nga Tran 52866fe6a9
fix: merge record batches into one batch (#3535)
* fix: merge record batches into one batch

refactor: address review comments

* chore: update test output
2022-01-25 23:29:16 +00:00
Nga Tran d559561fd7
refactor: have the deduplicate work without chunk statistics (#3519)
* refactor: have the deduplicate work without chunk statistics

* test: more tests for duplicates data on different combinations of record batches

* refactor: address review comments
2022-01-25 17:00:25 +00:00
Dom b846ead320
feat(router2): shard writes/deletes into write buffer (#3499)
* feat: Sequencer wrapper

This type wraps an underlying WriteBufferWriter implementation, tagging
it with a sequencer ID it should use when enqueuing operations to the
buffer.

* feat: mock sharder

Implements a mock Sharder impl that returns pre-configured responses to
shard(), and captures the input to the call.

* feat: sharded write buffer

Implements sharding of ops into an underlying WriteBuffer.

Writes are sharded by some abstract Sharder impl, collated per shard to
maximise the size of each op (and therefore compression efficiency),
converted into a DML operation and then enqueued in parallel to the
underlying WriteBuffer implementation.

Deletes are modelled as being mapped to a single write buffer shard,
which is the case while we support sharding based on the table &
namespace only. Deletes will be extended to support (potentially)
multiple shards when column overrides are implemented.

* refactor: runtime write buffers

Switch from using static dispatch, to using a runtime specified
WriteBufferWriting implementation.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-25 15:19:48 +00:00
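The sharding flow described in b846ead320 (pick a shard from the table and namespace, then collate writes per shard before enqueuing) might be sketched like this. The hash-based `shard_for` function is an assumption standing in for the abstract Sharder impl, and `collate` with its tuple-based input is a hypothetical simplification. The conversion to DML operations and the parallel enqueue are omitted.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Maps a (table, namespace) pair to one of `num_shards` shards.
/// Hashing is an assumed strategy; the commit only says the sharder
/// is based on table and namespace.
pub fn shard_for(table: &str, namespace: &str, num_shards: usize) -> usize {
    assert!(num_shards > 0, "at least one shard is required");
    let mut hasher = DefaultHasher::new();
    table.hash(&mut hasher);
    namespace.hash(&mut hasher);
    (hasher.finish() % num_shards as u64) as usize
}

/// Groups `(table, line)` writes by destination shard, so each shard
/// receives one larger operation rather than many small ones.
pub fn collate<'a>(
    namespace: &str,
    writes: &[(&'a str, &'a str)],
    num_shards: usize,
) -> BTreeMap<usize, Vec<(&'a str, &'a str)>> {
    let mut per_shard: BTreeMap<usize, Vec<(&'a str, &'a str)>> = BTreeMap::new();
    for &(table, line) in writes {
        let shard = shard_for(table, namespace, num_shards);
        per_shard.entry(shard).or_default().push((table, line));
    }
    per_shard
}
```

Because the shard depends only on table and namespace, a delete against one table resolves to a single shard via `shard_for`, which matches the "single write buffer shard" model for deletes in the commit message.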
kodiakhq[bot] 5ad6ca1fda
Merge pull request #3527 from influxdata/squash
chore: Set default to squash
2022-01-25 15:08:07 +00:00
Marko Mikulicic d0590104ff
chore: Set default to squash 2022-01-25 15:57:10 +01:00
Raphael Taylor-Davies db46ac04d0
feat: support line protocol precision parameter (#3522) (#3526)
* feat: support line protocol precision parameter (#3522)

* chore: format imports

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-25 14:12:22 +00:00
Andrew Lamb 51a0f8a56a
chore: temporarily hack around datafusion tempfiles (#3524) 2022-01-25 12:30:29 +00:00
Raphael Taylor-Davies 54ae5de9bf
feat: chunk pruning metrics (#3516)
Co-authored-by: Edd Robinson <me@edd.io>
2022-01-25 11:11:50 +00:00
kodiakhq[bot] 6287d7dc5c
Merge pull request #3514 from influxdata/ntran/ingest_compact
feat: ingester's compaction
2022-01-24 20:01:08 +00:00
NGA-TRAN f9c1e80a7f chore: update thread_local
2022-01-24 13:37:52 -05:00
NGA-TRAN c6a195b0e6 refactor: address review comments 2022-01-24 13:05:44 -05:00
NGA-TRAN 797ba459b9 chore: merge main to branch 2022-01-24 12:06:23 -05:00
NGA-TRAN 5f98a07b7f chore: add Cargo.lock 2022-01-24 12:03:02 -05:00
NGA-TRAN 939ea536d4 feat: add but ignore a few compaction tests 2022-01-24 12:00:23 -05:00
kodiakhq[bot] 5eb2e8b7fe
Merge pull request #3506 from influxdata/pd/ingester-server
feat: Add scaffolding for ingester server
2022-01-24 17:00:00 +00:00
kodiakhq[bot] bf0bb3c643
Merge pull request #3505 from influxdata/pd/refactor-catalog-api
refactor: Clean up the Catalog API
2022-01-24 15:20:12 +00:00
NGA-TRAN ee0a468b4d feat: a few tests for compaction 2022-01-21 18:15:23 -05:00
Paul Dix bb893510a0 feat: Add scaffolding for ingester server
* Adds a new ingester command to start an ingester server
* Moves previous ingester server over to handler
* Skeleton for gRPC and HTTP handlers
2022-01-21 18:02:19 -05:00
NGA-TRAN fa41067e3d refactor: for paul 2022-01-21 16:50:49 -05:00
NGA-TRAN cd01b141f3 refactor: for paul 2022-01-21 16:49:02 -05:00
Paul Dix bfa54033bd refactor: Clean up the Catalog API
This updates the catalog API to make it easier to work with for consumers. I also found a bug in the MemCatalog implementation while refactoring the tests to work with the new API definition. Consumers will now be able to Arc wrap the catalog and use it across awaits.
2022-01-21 16:01:13 -05:00