Commit Graph

889 Commits (23280d0489112d50ab3c7952f4dfb68f8663c971)

Author SHA1 Message Date
Marco Neumann ebeccf037c
feat: limit querier concurrency by limiting number of active namespaces (#4752)
This is a rather quick fix for prod. On the mid-term we probably wanna
rethink our deployment strategy, e.g. by using "one query per pod" and
by deploying queryd w/ IOx into the same pod.
2022-06-01 11:59:35 +00:00
Paul Dix 6af32b7750
feat: add concurrency limit for ingester queries (#4703)
I've defaulted it to 20, we can adjust as needed.

Closes #4657

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-30 10:22:17 +00:00
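
The concurrency limit described in the commit above can be illustrated with a small counter-based limiter. This is only a sketch under stated assumptions: the names `QueryLimiter`, `QueryPermit`, and `DEFAULT_MAX_CONCURRENT_QUERIES` are hypothetical, only the default of 20 comes from the commit message, and the sketch rejects excess queries instead of queueing them, which may differ from what the actual change does.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical constant; the value 20 is the default named in the commit message.
pub const DEFAULT_MAX_CONCURRENT_QUERIES: usize = 20;

/// Counts in-flight queries and refuses new ones once the limit is reached.
#[derive(Debug)]
pub struct QueryLimiter {
    limit: usize,
    in_flight: AtomicUsize,
}

/// Held for the duration of one query; gives its slot back when dropped.
#[derive(Debug)]
pub struct QueryPermit {
    limiter: Arc<QueryLimiter>,
}

impl QueryLimiter {
    pub fn new(limit: usize) -> Arc<Self> {
        Arc::new(Self {
            limit,
            in_flight: AtomicUsize::new(0),
        })
    }

    /// Tries to claim a slot; returns `None` if the limit is already reached.
    pub fn try_acquire(limiter: &Arc<Self>) -> Option<QueryPermit> {
        let mut current = limiter.in_flight.load(Ordering::Acquire);
        loop {
            if current >= limiter.limit {
                return None;
            }
            // Retry if another thread changed the counter in the meantime.
            match limiter.in_flight.compare_exchange_weak(
                current,
                current + 1,
                Ordering::AcqRel,
                Ordering::Acquire,
            ) {
                Ok(_) => {
                    return Some(QueryPermit {
                        limiter: Arc::clone(limiter),
                    })
                }
                Err(observed) => current = observed,
            }
        }
    }

    /// Number of queries currently holding a permit.
    pub fn in_flight(&self) -> usize {
        self.in_flight.load(Ordering::Acquire)
    }
}

impl Drop for QueryPermit {
    fn drop(&mut self) {
        self.limiter.in_flight.fetch_sub(1, Ordering::AcqRel);
    }
}
```

A caller would hold the returned `QueryPermit` while the query runs; dropping it frees the slot for the next request.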
Andrew Lamb 700a1de8f3
fix: fix at least one intermittent failure (#4711) 2022-05-26 21:24:37 +00:00
Andrew Lamb 633117e595
feat: avoid catalog access on each query (#4650)
* feat: cache catalog access on query

* fix: Apply suggestions from code review

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
2022-05-26 20:44:22 +00:00
Nga Tran 6cc767efcc
feat: teach compactor to compact smaller number of files (#4671)
* refactor: split compact_partition into two functions to handle concurrency better

* feat: limit number of files to compact

* test: add test for limit num files

* chore: fix clippy

* feat: split group if over max size

* fix: split the overlapped group to limit size or file num

* chore: reduce config values

* test: add tests and clearer comments for the split_overlapped_groups and test_limit_size_and_num_files

* chore: more comments

* chore: cleanup
2022-05-25 19:54:34 +00:00
Marko Mikulicic 9ddb0a816e
fix: Return panic message in internal error (#4693)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-25 15:11:17 +00:00
Marco Neumann a08a91c5ba
fix: ensure querier cache is refreshed for partition sort key (#4660)
* test: call `maybe_start_logging` in auto-generated cases

* fix: ensure querier cache is refreshed for partition sort key

Fixes #4631.

* docs: explain querier sort key handling and test

* test: test another version of issue 4631

* fix: correctly invalidate partition sort keys

* fix: fix `table_not_found_on_ingester`
2022-05-25 10:44:42 +00:00
Marko Mikulicic cdbe546e50
fix: return gRPC error on panic (#4686) 2022-05-25 07:06:25 +00:00
Andrew Lamb a8d5f7f5f7
test: add debug output to test (#4684) 2022-05-24 19:57:11 +00:00
Marco Neumann 9c1ffc2b0d
test: panic handling, add compactor to end to end test harness (#4677)
* feat: add test gRPC client

* test: start compactor in mini cluster

* test: assert panic handling

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-24 14:55:26 +00:00
dependabot[bot] ca49820a0f
chore(deps): Bump console-subscriber from 0.1.5 to 0.1.6 (#4670)
Bumps [console-subscriber](https://github.com/tokio-rs/console) from 0.1.5 to 0.1.6.
- [Release notes](https://github.com/tokio-rs/console/releases)
- [Commits](https://github.com/tokio-rs/console/compare/console-subscriber-v0.1.5...console-subscriber-v0.1.6)

---
updated-dependencies:
- dependency-name: console-subscriber
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-24 08:24:12 +00:00
dependabot[bot] 76f7043417
chore(deps): Bump once_cell from 1.11.0 to 1.12.0 (#4666)
Bumps [once_cell](https://github.com/matklad/once_cell) from 1.11.0 to 1.12.0.
- [Release notes](https://github.com/matklad/once_cell/releases)
- [Changelog](https://github.com/matklad/once_cell/blob/master/CHANGELOG.md)
- [Commits](https://github.com/matklad/once_cell/compare/v1.11.0...v1.12.0)

---
updated-dependencies:
- dependency-name: once_cell
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-24 08:14:03 +00:00
Marco Neumann 2029bd16ba
feat: enable debugging of failed querier->ingester requests (#4659)
* feat: enable debugging of failed querier->ingester requests

- extend `query-ingester` CLI to allow usage of predicates
- on failed requests: log all information that is required for the CLI
- test the "ingester fails" scenario

* test: explain

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* docs: improve

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* refactor: move b64 pred. serde into a single crate

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2022-05-23 15:37:31 +00:00
Carol (Nichols || Goulding) c811bebdb7
feat: Add ingester CLI option to skip to oldest available WB seq num
The default behavior of the ingester is to panic if the min unpersisted
sequence number in the catalog is unknown to the write buffer due to the
retention policies having evicted that sequence number.

Specifying `--skip-to-oldest-available` changes this behavior to skip to
the oldest sequence number the write buffer does have available and go
from there.

Fixes #4624.
2022-05-20 10:51:07 -04:00
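
The behaviour described in the commit above comes down to a three-way decision, sketched below. The function and type names are hypothetical, and where the commit message says the ingester panics, this sketch returns an `Unavailable` value so the logic can be exercised without aborting. Only the `--skip-to-oldest-available` flag name comes from the commit message.

```rust
/// Outcome of deciding where the ingester should resume reading the write buffer.
#[derive(Debug, PartialEq, Eq)]
pub enum StartDecision {
    /// Resume from this sequence number.
    StartAt(u64),
    /// The wanted sequence number was evicted and skipping was not requested.
    /// Per the commit message the real ingester panics in this case.
    Unavailable { wanted: u64, oldest_available: u64 },
}

/// `min_unpersisted` is the minimum unpersisted sequence number from the catalog,
/// `oldest_available` is the oldest sequence number the write buffer still holds,
/// `skip_to_oldest_available` mirrors the `--skip-to-oldest-available` flag.
pub fn decide_start(
    min_unpersisted: u64,
    oldest_available: u64,
    skip_to_oldest_available: bool,
) -> StartDecision {
    if min_unpersisted >= oldest_available {
        // Nothing was evicted: resume exactly where the catalog says.
        StartDecision::StartAt(min_unpersisted)
    } else if skip_to_oldest_available {
        // Retention evicted the wanted data: start at what is left.
        StartDecision::StartAt(oldest_available)
    } else {
        StartDecision::Unavailable {
            wanted: min_unpersisted,
            oldest_available,
        }
    }
}
```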
dependabot[bot] 6bc0c74c7d
chore(deps): Bump once_cell from 1.10.0 to 1.11.0 (#4646)
* chore(deps): Bump once_cell from 1.10.0 to 1.11.0

Bumps [once_cell](https://github.com/matklad/once_cell) from 1.10.0 to 1.11.0.
- [Release notes](https://github.com/matklad/once_cell/releases)
- [Changelog](https://github.com/matklad/once_cell/blob/master/CHANGELOG.md)
- [Commits](https://github.com/matklad/once_cell/compare/v1.10.0...v1.11.0)

---
updated-dependencies:
- dependency-name: once_cell
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore: Run cargo hakari tasks

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-20 07:40:38 +00:00
Marco Neumann 20fa70d54b feat: add `measurement_fields` support to `influxdb_iox storage` 2022-05-19 16:50:46 +02:00
Marco Neumann 52346642a0
ci: fix cargo deny (#4629)
* ci: fix cargo deny

* chore: downgrade `socket2`, version 0.4.5 was yanked

* chore: rename `query` to `iox_query`

`query` is already taken on crates.io and yanked and I am getting tired
of working around that.
2022-05-18 09:38:35 +00:00
Andrew Lamb 3a33e806c7
chore: Update datafusion + `arrow`/`parquet`/`arrow-flight` to `14.0.0` (#4619)
* chore: Update datafusion deps

* chore: update arrow/parquet/arrow flight deps

* chore: Run cargo hakari tasks

* chore: Update location of utils

* chore: Update some more APIs

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2022-05-17 14:13:03 +00:00
Marco Neumann 779f0e9cdf
feat: querier RAM pool (#4593)
* feat: `SortKey::size`

* feat: `FunctionEstimator`

* feat: querier RAM pool

Let's put all the caches into a single RAM pool, so we can at least
somewhat control RAM usage. Note that this does NOT limit the peak
memory during query execution though, but should at least stop unlimited
cache growth. A follow-up PR will add metrics.

* refactor: improve some size calculations

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-17 13:11:20 +00:00
dependabot[bot] 259d2486c1
chore(deps): Bump tokio-util from 0.7.1 to 0.7.2 (#4605)
Bumps [tokio-util](https://github.com/tokio-rs/tokio) from 0.7.1 to 0.7.2.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-util-0.7.1...tokio-util-0.7.2)

---
updated-dependencies:
- dependency-name: tokio-util
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-16 11:42:31 +00:00
Raphael Taylor-Davies f2bb0fdf77
feat: update to crates.io object_store version (#4595)
* feat: update to crates.io object_store version

* chore: Run cargo hakari tasks

* fix: tests

* chore: remove object store integration test plumbing

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2022-05-13 16:26:07 +00:00
Carol (Nichols || Goulding) 55313d290a
fix: Update or remove comments that mention NG or OG
Connects to #4450.
2022-05-12 16:09:08 -04:00
Carol (Nichols || Goulding) 30e53fd09c
fix: Rename end-to-end NG tests to not contain NG
Connects to #4450.
2022-05-12 16:09:07 -04:00
Carol (Nichols || Goulding) 48e6e5713d
fix: Rename test_helpers_end_to_end_ng to test_helpers_end_to_end
Connects to #4450.
2022-05-12 16:09:07 -04:00
Carol (Nichols || Goulding) 78bbe629b2
feat: Add more logging to understand the flaky multi ingester test better (#4580)
* feat: Increase logging to investigate multi ingester flaky test

* feat: Temporarily disable a test while logging is increased in CI

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-12 20:05:05 +00:00
Carol (Nichols || Goulding) 2079cf98f6
fix: Add back a test case that needs to check ingester for write info
Specifically because the querier doesn't know about the ingester.
2022-05-11 15:30:59 -04:00
Carol (Nichols || Goulding) 48b84b3bdf
feat: Querier can get write status from ingesters
Connects to influxdata/influxdb-iox-client-go#27.
2022-05-11 14:12:10 -04:00
Andrew Lamb 381ad3b81d chore: Update heappy 2022-05-11 09:49:10 -04:00
Andrew Lamb b8cb4c3f2b
feat: Interrogate schema from querier (as well as router) (#4557)
* refactor: move SchemaService into `service_grpc_schema`

* feat: implement schema gRPC for querier

* chore: Run cargo hakari tasks

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2022-05-10 20:55:58 +00:00
Andrew Lamb 03ee6840d0
feat: Add `debug namespaces` CLI command (#4556) 2022-05-10 18:35:05 +00:00
Andrew Lamb 84fd883688
feat: Add query_ingester CLI command (#4554)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-10 18:18:07 +00:00
Raphael Taylor-Davies 84d60ce56e
fix: feature flags (#4550)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-10 13:42:51 +00:00
Raphael Taylor-Davies 99b1a9b83f
refactor: split out ObjectStoreMetrics (#4547)
* refactor: split out ObjectStoreMetrics

* chore: add workspace hack

* fix: compile
2022-05-10 10:56:28 +00:00
Raphael Taylor-Davies 8b379c83cc
refactor: simplify object_store path handling (#4534)
* refactor: simplify object_store path handling

* fix: aws integration tests

* chore: lint

* fix: update gcs tests

* refactor: move errors into submodules

* chore: lint

* chore: review feedback

* refactor: replace provider with Display

* fix: failing tests

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-09 18:43:22 +00:00
Carol (Nichols || Goulding) 1759443a13
fix: Remove unused dependencies in influxdb_iox found by manual inspection 2022-05-06 14:51:54 -04:00
Carol (Nichols || Goulding) fcd4815645
fix: Rename router2 to router 2022-05-06 14:51:52 -04:00
Carol (Nichols || Goulding) 0650a9bb77
fix: Rename ioxd_router2 to ioxd_router 2022-05-06 14:45:39 -04:00
Carol (Nichols || Goulding) 068096e7e1
fix: Rename data_types2 to data_types 2022-05-06 14:45:39 -04:00
Carol (Nichols || Goulding) 0541c6e40f
fix: Remove data_types crate where it's no longer used 2022-05-06 14:45:39 -04:00
Carol (Nichols || Goulding) 485d6edb8f
refactor: Move IngesterQueryRequest to generated_types 2022-05-06 14:45:37 -04:00
Carol (Nichols || Goulding) ea46830954
fix: Remove iox_object_store crate; move ParquetFilePath to parquet_file 2022-05-06 14:45:36 -04:00
Carol (Nichols || Goulding) f8bdb022bc
fix: Remove job_registry crate 2022-05-06 11:35:11 -04:00
Carol (Nichols || Goulding) c45a85ca81
fix: Remove now-obsolete 'debug dump catalog' command 2022-05-06 11:30:36 -04:00
Carol (Nichols || Goulding) b88d071ce7
fix: Remove server 2022-05-06 11:30:36 -04:00
Carol (Nichols || Goulding) e0bc1801ac
fix: Remove router 2022-05-06 11:30:36 -04:00
Carol (Nichols || Goulding) cae32209da
fix: Remove parquet_catalog 2022-05-06 11:30:26 -04:00
Carol (Nichols || Goulding) 2d8656e2e1
fix: Remove mutable_buffer 2022-05-06 11:27:33 -04:00
Carol (Nichols || Goulding) e7de16732d
fix: Remove internal_types 2022-05-06 11:27:33 -04:00
Carol (Nichols || Goulding) 7286b4391a
fix: Remove db crate 2022-05-06 11:27:33 -04:00
kodiakhq[bot] 256b60d670
Merge branch 'main' into cn/aio-default 2022-05-06 15:08:07 +00:00
Carol (Nichols || Goulding) 3ae491c801 fix: Give an error variant a non-redundant name
That is, fix rather than allow the clippy lint
2022-05-06 10:59:58 -04:00
Carol (Nichols || Goulding) 32a41b6c05 fix: Remove commented-out enum variants 2022-05-06 10:59:58 -04:00
Andrew Lamb 1d749b3b09 Merge remote-tracking branch 'origin/main' into cn/aio-default 2022-05-06 10:58:11 -04:00
Carol (Nichols || Goulding) 75356ab9e6
feat: Make top-level command optional; run all-in-one mode by default 2022-05-06 10:26:24 -04:00
Carol (Nichols || Goulding) 62a70d9705
feat: Make all-in-one options available at the top command level 2022-05-06 10:24:47 -04:00
Carol (Nichols || Goulding) 7cadb0192b
fix: Make all-in-one mode be the default run mode 2022-05-06 10:24:45 -04:00
Jake Goulding c765455c73
feat: Remove the `server` command 2022-05-06 09:48:30 -04:00
Jake Goulding ece38417bc
feat: Remove the `run database` command 2022-05-06 09:48:30 -04:00
Jake Goulding b939d80db8
feat: Remove the `run router` command 2022-05-06 09:48:30 -04:00
Jake Goulding 041e758a6f
refactor: Move the `schema` command to `debug schema` 2022-05-06 09:48:30 -04:00
Jake Goulding fabfbada60
refactor: Move the `database query` command to `query` 2022-05-06 09:48:29 -04:00
Jake Goulding 3844c65617
refactor: Move the `database write` command to `write` 2022-05-06 09:48:29 -04:00
Andrew Lamb 02893e598c
chore: Update datafusion and upgrade arrow/parquet/arrow-flight to 13 (#4516)
* chore: Tool for automating arrow version update

* chore: Update datafusion and arrow/parquet/arrow-flight

* fix: update for changes in Arrow API

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-05 00:21:02 +00:00
dependabot[bot] 590a71010d
chore(deps): Bump serde_json from 1.0.80 to 1.0.81 (#4514)
Bumps [serde_json](https://github.com/serde-rs/json) from 1.0.80 to 1.0.81.
- [Release notes](https://github.com/serde-rs/json/releases)
- [Commits](https://github.com/serde-rs/json/compare/v1.0.80...v1.0.81)

---
updated-dependencies:
- dependency-name: serde_json
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-04 14:10:35 +00:00
dependabot[bot] c05383bbf5
chore(deps): Bump serde_json from 1.0.79 to 1.0.80 (#4501)
Bumps [serde_json](https://github.com/serde-rs/json) from 1.0.79 to 1.0.80.
- [Release notes](https://github.com/serde-rs/json/releases)
- [Commits](https://github.com/serde-rs/json/compare/v1.0.79...v1.0.80)

---
updated-dependencies:
- dependency-name: serde_json
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-02 15:33:47 +00:00
kodiakhq[bot] e09157e81a
Merge branch 'main' into cn/bye-og-grpc-client 2022-05-02 14:54:54 +00:00
dependabot[bot] e3adff2281
chore(deps): Bump console-subscriber from 0.1.4 to 0.1.5 (#4503)
Bumps [console-subscriber](https://github.com/tokio-rs/console) from 0.1.4 to 0.1.5.
- [Release notes](https://github.com/tokio-rs/console/releases)
- [Commits](https://github.com/tokio-rs/console/compare/console-subscriber-v0.1.4...console-subscriber-v0.1.5)

---
updated-dependencies:
- dependency-name: console-subscriber
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-02 14:19:11 +00:00
dependabot[bot] c59eaada5a
chore(deps): Bump thiserror from 1.0.30 to 1.0.31 (#4498)
Bumps [thiserror](https://github.com/dtolnay/thiserror) from 1.0.30 to 1.0.31.
- [Release notes](https://github.com/dtolnay/thiserror/releases)
- [Commits](https://github.com/dtolnay/thiserror/compare/1.0.30...1.0.31)

---
updated-dependencies:
- dependency-name: thiserror
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-02 13:43:45 +00:00
Andrew Lamb dd3147c2ec
fix: allow `--grpc-bind` and `--api-bind` args in all-in-one mode (#4494)
* fix: allow `--grpc-bind`  and `--api-bind` args in all-in-one mode

* fix: shutdown compactor more quickly on cancel

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-01 20:20:16 +00:00
Carol (Nichols || Goulding) a4443e4c31
fix: Remove OG gRPC client code and APIs 2022-04-29 16:29:49 -04:00
Carol (Nichols || Goulding) 8e21d062fb
fix: Remove use of OG list databases from sql observer mode
Fixes #4478.
2022-04-29 14:33:48 -04:00
Carol (Nichols || Goulding) ede51d2529
fix: Remove OG 'show databases' command in the SQL REPL
Connects to #4478.
2022-04-29 14:33:22 -04:00
Andrew Lamb bc77f87ea9
feat: port influxrpc read_group and read_window_aggregate e2e test to NG (#4464)
port more
2022-04-29 16:13:32 +00:00
Carol (Nichols || Goulding) bb7910f689
fix: Remove influxdb_iox unused dev deps 2022-04-29 10:32:40 -04:00
Carol (Nichols || Goulding) e5cbee9fa8
fix: Delete OG end-to-end tests 2022-04-29 10:32:21 -04:00
Andrew Lamb 6381ea60bb
chore: port remaining read_filter influxrpc tests to NG (#4383)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-29 14:06:50 +00:00
Marco Neumann 0a20086a58
feat: expose catalog timeouts via CLI/env (#4472)
This is useful for local instances that run against a prod system,
because port forwarding can lead to long connection delays.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-29 11:14:15 +00:00
Andrew Lamb ed1ad858c0
chore: add some logging to flaky end to end test (#4465)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-29 11:06:06 +00:00
dependabot[bot] 01a6cf1c2a
chore(deps): Bump http from 0.2.6 to 0.2.7 (#4470)
Bumps [http](https://github.com/hyperium/http) from 0.2.6 to 0.2.7.
- [Release notes](https://github.com/hyperium/http/releases)
- [Changelog](https://github.com/hyperium/http/blob/master/CHANGELOG.md)
- [Commits](https://github.com/hyperium/http/compare/v0.2.6...v0.2.7)

---
updated-dependencies:
- dependency-name: http
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-04-29 09:56:48 +00:00
Paul Dix 8e48fcd620
feat: add remote pull partition (#4433)
Add lookup of partitions by table id to catalog.
Add API to catalog to return partitions by table id.
Add to client to return partitions by table id.
Add CLI to pull remote schema, partition, and parquet files into a local catalog and object store.
2022-04-28 21:04:27 +00:00
Andrew Lamb 63df3ceb6f
feat: share server processes in end to end test (#4387)
* feat: share server processes in end to end test

* fix: Apply suggestions from code review

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-28 17:51:14 +00:00
Carol (Nichols || Goulding) 5c89e7b952
fix: Remove max catalog connection config from all-in-one mode 2022-04-28 09:29:03 -04:00
Carol (Nichols || Goulding) a024e07147
fix: Small corrections to some comments 2022-04-28 09:29:02 -04:00
Carol (Nichols || Goulding) df1afa3481
feat: Log what kind of server in startup/shutdown logs 2022-04-28 09:29:02 -04:00
Carol (Nichols || Goulding) d6d50f83c2
feat: Set different catalog config defaults for all-in-one mode
Connects to #4399.

If `--catalog-dsn` is specified, use that Postgres catalog. If
`--catalog-dsn` is not specified, use an in-memory catalog.
2022-04-28 09:29:02 -04:00
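
The catalog default described in the commit above can be sketched as a single match on the optional DSN. `CatalogChoice` and `choose_catalog` are hypothetical names for illustration; only the `--catalog-dsn` flag and the Postgres / in-memory split come from the commit message.

```rust
/// Which catalog backend all-in-one mode would use.
#[derive(Debug, PartialEq, Eq)]
pub enum CatalogChoice {
    /// `--catalog-dsn` was given: use that Postgres catalog.
    Postgres { dsn: String },
    /// No DSN was given: fall back to an in-memory catalog.
    Memory,
}

/// `catalog_dsn` is the value of `--catalog-dsn`, if the user supplied one.
pub fn choose_catalog(catalog_dsn: Option<&str>) -> CatalogChoice {
    match catalog_dsn {
        Some(dsn) => CatalogChoice::Postgres {
            dsn: dsn.to_owned(),
        },
        None => CatalogChoice::Memory,
    }
}
```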
Carol (Nichols || Goulding) 941dd12dd1
feat: Set different write buffer config defaults for all-in-one mode
Connects to #4399.

Only file-based write buffer is supported. If `--data-dir` is specified,
store it there, otherwise store it in a temp directory to be ephemeral
2022-04-28 09:29:02 -04:00
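
The write buffer default described in the commit above follows the same pattern: a persistent location when `--data-dir` is given, otherwise a temporary one. In this sketch the type name, the function name, and both subdirectory names are assumptions, and no cleanup of the temporary directory is attempted.

```rust
use std::path::{Path, PathBuf};

/// Where the file-based write buffer would live in all-in-one mode.
#[derive(Debug, PartialEq, Eq)]
pub enum WriteBufferDir {
    /// Under the user-supplied `--data-dir`; survives restarts.
    Persistent(PathBuf),
    /// Under the system temp directory; treated as throwaway.
    Ephemeral(PathBuf),
}

/// `data_dir` is the value of `--data-dir`, if the user supplied one.
/// The subdirectory names are hypothetical.
pub fn choose_write_buffer_dir(data_dir: Option<&Path>) -> WriteBufferDir {
    match data_dir {
        Some(dir) => WriteBufferDir::Persistent(dir.join("write_buffer")),
        None => WriteBufferDir::Ephemeral(std::env::temp_dir().join("iox_write_buffer")),
    }
}
```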
Carol (Nichols || Goulding) 0cfd16263c
refactor: Change run_config to logging_config
The only spot this method is used actually wants the logging config
2022-04-28 09:29:01 -04:00
Carol (Nichols || Goulding) 06342f9ed8
feat: Set different ingester defaults for all-in-one mode
Connects to #4399.

Manually flatten the arguments to set different defaults, not allow
setting the partition min/max, but still allow customization of the
other arguments.
2022-04-28 09:29:01 -04:00
Dom Dwyer 246af0c3ca test: remove hard-coded metric count
Prior to this commit, adding any metric to the catalog (and only the
catalog) would cause the end_to_end_ng_cases::metrics::test_metrics test
to fail due to asserting an exact number of metrics observed.

This commit changes the check condition to a more permissive >= rather
than ==.
2022-04-28 11:28:48 +01:00
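
The relaxed check described in the commit above can be sketched as follows. This assumes the test counts sample lines in Prometheus-style text output, which the commit message does not state; both function names are hypothetical.

```rust
/// Counts sample lines in Prometheus-style text output,
/// skipping blank lines and `#` comment lines (HELP / TYPE).
pub fn count_metric_samples(body: &str) -> usize {
    body.lines()
        .map(str::trim)
        .filter(|line| !line.is_empty() && !line.starts_with('#'))
        .count()
}

/// Passes when at least `minimum` samples are present, so adding a new
/// metric elsewhere does not break the test. An exact `==` check would.
pub fn check_metric_count(body: &str, minimum: usize) -> Result<usize, String> {
    let observed = count_metric_samples(body);
    if observed >= minimum {
        Ok(observed)
    } else {
        Err(format!(
            "expected at least {minimum} metrics, observed {observed}"
        ))
    }
}
```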
dependabot[bot] 420c306caa
chore(deps): Bump tokio from 1.17.0 to 1.18.0 (#4453)
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.17.0 to 1.18.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.17.0...tokio-1.18.0)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-04-28 08:21:17 +00:00
dependabot[bot] a195973cfb
chore(deps): Bump clap from 3.1.11 to 3.1.12 (#4406)
* chore(deps): Bump clap from 3.1.11 to 3.1.12

Bumps [clap](https://github.com/clap-rs/clap) from 3.1.11 to 3.1.12.
- [Release notes](https://github.com/clap-rs/clap/releases)
- [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md)
- [Commits](https://github.com/clap-rs/clap/compare/v3.1.11...v3.1.12)

---
updated-dependencies:
- dependency-name: clap
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* fix: Update tests now that the clap crossed-streams bug has been fixed

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Carol (Nichols || Goulding) <carol.nichols@gmail.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-27 17:43:55 +00:00
Andrew Lamb bea4556749
chore: Ignore flaky OG end to end test (#4441)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-27 15:17:46 +00:00
Dom Dwyer fb777e7e51 feat(router2): configurable max HTTP requests
Adds a CLI / env configuration option controlling the maximum number of
simultaneous HTTP requests in the router.
2022-04-26 11:13:25 +01:00
二手掉包工程师 4b47d723b1
refactor: Rename time to iox_time (#4416)
Signed-off-by: hi-rustin <rustin.liu@gmail.com>

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-26 00:19:59 +00:00
Jake Goulding e3caf24954
feat: Rewrite the NG end-to-end metrics test (#4404)
* refactor: Expose data generation tool for wider use

* feat: Add a step for retrieving the server metrics

* refactor: Copy the OG end-to-end metrics test to NG

* feat: Rewrite the NG end-to-end metrics test

This is still broken because the row timestamp metrics don't exist
in NG.

* fix: Test metrics relevant to NG

* refactor: Move the data generator to the test helper crate

* refactor: Extract a ReadFilter request builder into the test helper crate

* refactor: Make test helper request builder able to build other gRPC requests

Co-authored-by: Carol (Nichols || Goulding) <carol.nichols@gmail.com>
2022-04-25 19:47:56 +00:00
Marco Neumann 86e8f05ed1
fix: make all catalog IDs 64bit (#4418)
Closes #4365.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-25 16:49:34 +00:00
Carol (Nichols || Goulding) f482286ac9
fix: Update function name that was fixed upstream 2022-04-25 11:04:37 -04:00
Carol (Nichols || Goulding) 0bda66a01d
feat: Write end-to-end tests for the gRPC write API in NG
Fixes #3941.
2022-04-25 09:46:13 -04:00
Jake Goulding f92fa69c8c
fix: Typo in test helper method name (#4402)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-22 20:21:30 +00:00
Carol (Nichols || Goulding) 117569184e
feat: Port end-to-end logging test to NG (#4400)
* feat: Copy end-to-end logging test to NG

This was created with:

cp influxdb_iox/tests/end_to_end_cases/influxdb_ioxd.rs influxdb_iox/tests/end_to_end_ng_cases/logging.rs

* feat: Port logging test to NG end-to-end tests

And re-enable it, it was ignored.

* fix: Specify that an in-memory catalog should be used for the logging test

* fix: Check for gRPC instead of HTTP

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-22 18:05:43 +00:00
Andrew Lamb 14cb2f5674
fix: less async shenanigans in end to end tests (#4384)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-22 14:38:57 +00:00
Andrew Lamb ff902c40d2
chore: port debug end to end tests to NG (#4393)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-22 14:30:09 +00:00
dependabot[bot] e8bfd7a537
chore(deps): Bump clap from 3.1.10 to 3.1.11 (#4390)
* chore(deps): Bump clap from 3.1.10 to 3.1.11

Bumps [clap](https://github.com/clap-rs/clap) from 3.1.10 to 3.1.11.
- [Release notes](https://github.com/clap-rs/clap/releases)
- [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md)
- [Commits](https://github.com/clap-rs/clap/compare/v3.1.10...v3.1.11)

---
updated-dependencies:
- dependency-name: clap
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* fix: update tests for changes to clap

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-22 11:15:48 +00:00
Andrew Lamb c9c41f4aed
refactor: use `info!` rather than `println!` in end to end tests (#4380)
* refactor: use `info!` rather than `println!` in end to end tests

* chore: change from println to info in end_to_end_ng_cases too

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-22 10:01:20 +00:00
Andrew Lamb c0ed688043
refactor: Split influxrpc end to end tests into smaller modules (#4382)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-22 07:53:45 +00:00
Carol (Nichols || Goulding) 2087dad1f7
feat: Port tracing tests to NG end-to-end tests 2022-04-21 16:22:43 -04:00
Carol (Nichols || Goulding) 02692751c7
feat: Copy end-to-end tracing tests to NG end-to-end tests
This doesn't compile at all, but this is a straight cp so should make it
a bit easier to see what has changed in the port in the next commit.
2022-04-21 16:20:34 -04:00
Andrew Lamb d41086ac7f
refactor: Move Connection in ServerFixture to make TestServer shareable (#4381)
* refactor: Move Connection in ServerFixture to make TestServer shareable

* fix: Update docstrings

* fix: restore panic on previous error

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-21 18:15:53 +00:00
Andrew Lamb abd005e0c2
refactor: Consolidate cluster creation in end to end tests (#4373) 2022-04-21 18:03:17 +00:00
Andrew Lamb cc9205024c
test: Begin porting influxrpc tests to ng end to end tests (#4372)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-21 16:24:45 +00:00
Andrew Lamb 2984578aac
chore: Update pprof, remove old versions of prost (#4374)
* chore: Upgrade pprof to 0.8

* chore: Update heappy
2022-04-21 10:52:23 +00:00
Andrew Lamb c05612935b
test: add end to end querier test with multiple ingesters (#4309)
* test: add a test with multiple ingesters

* docs: improve doc test

* refactor: Print out step number in StepTest

* fix: make timeout more explicit

* fix: add tests for merge_info
2022-04-20 11:56:41 +00:00
Dom Dwyer c363242902 refactor: emit panic metrics for server types
Configures the long-running / server modes to emit panic metrics.
2022-04-20 12:30:02 +01:00
Andrew Lamb 73bed810da
chore: Update arrow, arrow-flight, parquet, tonic, prost, etc (#4357)
* chore: Update datafusion

* chore: Update arrow/arrow-flight/parquet to 12

* chore: update datafusion correctly

* chore: Update prost, tonic, and dependents

* fix: Fixup some api changes

* fix: Update test output in db

* fix: Update test output in parquet_file

* fix: remove old pbjson types

* fix: Add "--experimental_allow_proto3_optional" flag

* chore: Run cargo hakari tasks

* fix: compile error

* chore: Update heappy

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-20 11:12:17 +00:00
Andrew Lamb 5ea676d3f7
feat: add per kafka partition durability reporting to write info response (#4341)
* feat: add per kafka partition durability reporting to write info response

* fix: buf lint + test cleanup

* fix: clean up protobuf

* refactor: pull out conversion of KafkaPartitionStatus into a function

* fix: fmt

* fix: typo

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-19 16:46:20 +00:00
Andrew Lamb 6088c1a588
refactor: rewrite schema command end to end tests to use new StepTest framework (#4337)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-19 16:29:50 +00:00
Paul Dix 103629b01d
feat: add client and CLI to get file from object store (#4343)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-19 16:16:27 +00:00
Marco Neumann 5b48675435
fix: actually transmit record-batch metadata from querier (#4347)
Attaching the "batch => partition" mapping via per-batch schema KV
metadata does NOT work because flight will transmit the schema once for
all batches (even though on the Rust side we have a schema ref attached
to every batch, probably for convenience). Instead we now use the same
global protobuf metadata that we also use for the "partition => max
sequence number" information. This somewhat limits our ability to create
record batches lazily on the ingester side (since the global metadata is
sent before any actual payload) but I think we should not modify the
usage of the flight protocol too much right now (e.g. by sending more
schema messages). If this becomes an issue, we can always find a more
complex solution in the future.
2022-04-19 10:54:23 +00:00
Paul Dix 5bf4550259
feat: add object store service to router (#4338)
Add method to catalog to get parquet file by object store id.
Add gRPC service for object store to get a file by its UUID.
Add the object store service to router2 with object store config.
2022-04-16 17:58:31 +00:00
Paul Dix 197a3818d3
feat: add catalog client and remote command (#4329)
* feat: add catalog client and remote command

Adds the catalog gRPC service to influxdb_iox_client.
Adds a new remote command to execute commands against a remote IOx host.
Adds partition subcommand to remote to get the details of a partition by id.

* test: add end to end test for `remote partition` CLI (#4336)

* chore: cleanup partition CLI PR feedback

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2022-04-15 13:59:49 +00:00
Marco Neumann 351b0d0c15
fix: unknown namespace/table in querier<>ingester flight protocol (#4307)
* fix: return "not found" gRPC error instead of "internal" when ingester does not know table

* fix: properly handle "namespace not found" in ingester queries

* fix: make `initialize_db` work with async code

* test: add custom step for NG tests

* fix: handle "unknown table/namespace" resp. in querier

* docs: explain test setup

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2022-04-14 12:36:15 +00:00
Andrew Lamb 85f3e696e8
refactor: Use declarative steps to reduce duplication in end to end testing (#4301)
* refactor: Use declarative steps to reduce duplication in end to end testing

* fix: improve whitespace formatting

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-13 16:24:57 +00:00
Andrew Lamb bd4566cbe0
fix: Do not specify a querier address by default (#4289) 2022-04-12 20:52:18 +00:00
Marco Neumann 83f77712b1
refactor: querier<>ingester flight protocol adjustments (#4286)
* refactor: querier<>ingester flight protocol adjustments

This makes a few adjustments to the querier<>ingester flight protocol.

Query Scope
===========
The querier will request data for ALL sequencer IDs for now. There is
no reason to have a request per sequencer ID. We can add a range/set
filter later if we want, but this is not required for now.

Partition-level
===============
The only time when the querier cares about sequencer IDs (i.e. sharding)
at all is when it selects which ingesters to ask for unpersisted data
(this is currently not implemented, it just asks all ingesters).
Afterwards the querier only cares about partitions (which are bound to
specific sequencers anyways) because this is the level where parquet
file persistence and compaction as well as deduplication happen. So we
make partitions a first-class citizen in the ingester response.

Metadata VS RecordBatches
=========================
The global app-metadata will list all partitions and their max
persisted parquet files and tombstones (theoretically tombstones are at
table-level, but the ingester could in the future break them down to the
partition-level). The querier then receives a stream of record batches. Each
record batch is tagged (via key-value metadata in its schema) so it can
be assigned to a partition. At the moment the ingester returns 0 or 1
batches per unpersisted partition (0 in case we've filtered out all the
data via the predicate), but in the future it is free to return multiple
batches. This setup gives the ingester more freedom over memory
management and (potentially parallel) query processing, while at the
same time keeping the set of duplicated information minimal and allowing
easy extensions (since the global metadata is a full-blown protobuf
message).

Querier
=======
At the moment the querier ignores all the metadata. Follow-up PRs will
change that.

* docs: improve

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* refactor: make code clearer

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2022-04-12 16:48:40 +00:00
Dom Dwyer f08c8373aa feat: tokio-console support
This commit adds an optional feature (disabled by default) to the IOx
server binary named "tokio_console". When enabled, it adds support for
the tokio-console to IOx:

    https://github.com/tokio-rs/console

Enabling this feature drops the compile-time log filter to TRACE, and
enables the necessary dependencies to support the instrumentation needed
for the console to function.

Unfortunately, this feature uses tonic 0.7 (latest) while we use tonic
0.6, so we wind up with two tonic versions being compiled in when this
feature is enabled.
2022-04-12 13:02:31 +01:00
Andrew Lamb d8de38cdb9
feat: MVP include un-persisted results from the ingester in query results (#4255)
* feat: Return not-yet-persisted data in query results

* fix: comments from code review

* fix: update for logical merge conflict

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-12 11:03:19 +00:00
Marco Neumann 380cd9bbff
refactor: use a single flight client implementation (#4273)
"end-user -> querier" and "querier -> ingester" should use a single
Flight client implementation. The difference is just the request and
response metadata.

This changes our default Flight client to use protobuf instead of JSON
for the ticket format.
2022-04-12 09:08:25 +00:00
Andrew Lamb 3f5eab7648
feat: allow the querier to talk with multiple ingesters (#4271)
* refactor: Move querier config to clap_blocks

* refactor: Add tests

* refactor: allow multiple addresses

* refactor: Update to use multiple addresses

* fix: bow to clippy

* fix: docstring

* fix: error if address is repeated multiple times

* chore: Add error enum, plumb through

* fix: clippy

* refactor: improve Rust API

* fix: fix test
2022-04-11 18:49:49 +00:00
Andrew Lamb f6e6821276
feat: Add basic Querier <--> Ingester "Service Configuration" (#4259)
* feat: Add basic Querier <--> Ingester "Service Configuration"

* docs: update comments in test

* refactor: cleanup tests a little

* refactor: make trait more consistent

* docs: improve comments in IngesterPartition
2022-04-11 11:50:22 +00:00
Andrew Lamb eb7d41f7a1
test: Add schema validation to end to end querier test (#4258)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-08 18:11:00 +00:00
Andrew Lamb 2cdd1951d9
fix: fix pprof (#4261) 2022-04-08 17:46:01 +00:00
Andrew Lamb a30a85e62c
feat: Add get_write_info service (#4227)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-07 19:24:58 +00:00
Andrew Lamb edda409b19
refactor: Extract `ioxd_test`, `ioxd_compactor`, `ioxd_ingester`; remove `ioxd` (#4210)
* refactor: Extract test, compactor, ingester, and test

* chore: Run cargo hakari tasks

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-03 10:42:22 +00:00
Andrew Lamb 833c10c083
feat: return write_token from HTTP writes to router2 (#4202)
* feat: return write_token from HTTP writes to router2

* fix: Update router2/src/dml_handlers/instrumentation.rs

Co-authored-by: Dom <dom@itsallbroken.com>

* refactor: Use WriteSummary::default more vigorously

* fix: fix typo and add links to follow on issues

Co-authored-by: Dom <dom@itsallbroken.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-04-02 10:34:51 +00:00
Nga Tran 77ad4a7dad
feat: replace a compactor constant with a CLI config param (#4204) 2022-04-01 17:50:43 +00:00
Andrew Lamb d37af1a7f5
fix: include git sha (again) in release build (#4193)
* fix: error if git-sha can not be found

* refactor: move main to influxdb_iox

* fix: fmt
2022-03-31 19:14:21 +00:00
Andrew Lamb 532d227d11
refactor: extract router2 into ioxd_router2 (#4183)
* refactor: extract router2 from ioxd

* chore: Run cargo hakari tasks

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2022-03-31 17:39:05 +00:00
Andrew Lamb 367e926d35
refactor: extract querier into ioxd_querier (#4182)
* refactor: extract querier into ioxd_querier

* fix: dep
2022-03-31 16:03:31 +00:00
Andrew Lamb a384448b92
refactor: rename Sequence::id and Sequence::number field names (#4190)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-31 15:17:58 +00:00
Andrew Lamb a1df864283
feat: Support 'SHOW NAMESPACES' in sql repl (#4164)
* feat: Support `SHOW NAMESPACES` in sql repl

* feat: add basic support to clients

* fix: add get_namespaces service test

* fix: proper error handling

* test: end to end test for namespace client

* refactor: Use QuerierDatabase rather than Catalog

* refactor: remove unused function
2022-03-31 12:57:33 +00:00
Paul Dix 04d961e70d
feat: wire up compactor scheduler and config (#4139)
Add configuration options for compactor for the max size of level 0 files and split percentage.
Add metrics for compaction to track the number of candidates, compactions, and durations.
Add functions to separate identifying partitions to compact from running compaction.
Make compaction run in smaller chunks, specifically per partition.
Update compaction to automatically promote level 0 files that are non-overlapping without waiting some period of time.

Closes #4120

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-30 17:45:24 +00:00
Andrew Lamb 92da65a065
feat: Add end to end tests for querier and schema client (#4178)
* refactor: split up ingester schema test, add mini cluster

* feat: add schema cli test

* feat: add end to end test for querier

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-30 17:07:32 +00:00
Edd Robinson 7a437387d9
feat: add KeySortCapability capability (#4176) 2022-03-30 15:57:03 +00:00
Andrew Lamb f2a3dd58b2
refactor: rename `influxdb_ioxd` to `ioxd` (#4162)
* refactor: rename influxdb_ioxd to ioxd

* fix: fmt

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-30 13:31:27 +00:00
Marco Neumann 2b76c31157
refactor: make statistics null counts optional (#4160)
Min/max values and distinct counts are already optional, so let's make
the null counts optional as well. This will be helpful for NG to deal w/
partial statistics (e.g. we only populate stats for the time column).

Note that the total count is still mandatory, but we normally have the
chunk/file-level row count at hand.
2022-03-29 17:47:57 +00:00
Andrew Lamb 4ca52e5ae0
refactor: Extract common, OG database and router out of influxdb_ioxd (#4149)
* refactor: Extract common, OG database and router out of influxdb_ioxd

* chore: Run cargo hakari tasks

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2022-03-29 13:07:19 +00:00
dependabot[bot] 17af5fcbd1
chore(deps): Bump tokio-util from 0.7.0 to 0.7.1 (#4154)
* chore(deps): Bump tokio-util from 0.7.0 to 0.7.1

Bumps [tokio-util](https://github.com/tokio-rs/tokio) from 0.7.0 to 0.7.1.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-util-0.7.0...tokio-util-0.7.1)

---
updated-dependencies:
- dependency-name: tokio-util
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore: Run cargo hakari tasks

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-29 08:39:02 +00:00
Andrew Lamb 196ad50f5a
test: add e2e test for schema CLI (#4132)
* test: add e2e test for schema CLI

* fix: logical conflict
2022-03-25 18:11:03 +00:00
Marco Neumann 9886ff42cc refactor: clean up querier public interface 2022-03-25 11:54:52 +01:00
Andrew Lamb e222acbb48
refactor: use typed `TestConfig` rather than environment variable names for NG end to end tests (#4126)
* refactor: move environment variable mapping in end to end tests into TestConfig

* fix: clippy

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-24 18:36:47 +00:00
Andrew Lamb 39d9f30f12
refactor: Make `server` an implementation detail (#4122) 2022-03-24 15:58:04 +00:00
Andrew Lamb 5c69a3f43b
chore: Update deps: datafusion, arrow/arrow-flight/parquet to 11, zstd to 0.11 (#4119)
* chore: update datafusion

* chore(deps): Bump arrow from 10.0.0 to 11.0.0

Bumps [arrow](https://github.com/apache/arrow-rs) from 10.0.0 to 11.0.0.
- [Release notes](https://github.com/apache/arrow-rs/releases)
- [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/apache/arrow-rs/compare/10.0.0...11.0.0)

---
updated-dependencies:
- dependency-name: arrow
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore(deps): Bump arrow-flight from 10.0.0 to 11.0.0

Bumps [arrow-flight](https://github.com/apache/arrow-rs) from 10.0.0 to 11.0.0.
- [Release notes](https://github.com/apache/arrow-rs/releases)
- [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/apache/arrow-rs/compare/10.0.0...11.0.0)

---
updated-dependencies:
- dependency-name: arrow-flight
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore: update parquet to 11.0.0

* fix: error on create schema, test for same

* fix: upgrade zstd

* chore: Run cargo hakari tasks

* fix: fix logical merge conflict

* fix: hakari

* fix: hakari

* fix: update newly introduced dep

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-24 15:27:36 +00:00
Andrew Lamb fb75ef7b82
test: add end to end test for all in one mode, restructure fixture (#4114)
* test: add end to end test for all in one mode, restructure fixture

* docs: fix typos and clarify schema requirements
2022-03-24 12:53:25 +00:00
Andrew Lamb 7f2c2fde2c
fix: fix all in one mode argument handling so it can start (#4115)
* fix: fix all in one mode argument handling

* fix: clippy

* fix: fmt

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-23 18:16:22 +00:00
Paul Dix 4f5321d19b
feat: add compactor configuration for kafka topic and sequencers (#4107) 2022-03-23 14:11:47 +00:00
Luke Bond e109fa4987
feat: schema client and CLI (#4105)
* feat: schema client and CLI

* chore: clarification in comment in schema command

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-23 13:49:24 +00:00
Andrew Lamb b83b000590
chore: Update datafusion (#4071)
* chore: update to datafusion 5936edc2a94d5fb20702a41eab2b80695961b9dc

* chore: Update apis to match datafusion changes
2022-03-22 13:17:41 +00:00
Dom Dwyer 6fb1a9b592 feat(querier): enable object store metrics 2022-03-15 16:32:52 +00:00
Dom Dwyer f4d836eed7 feat(ingester): enable object store metrics 2022-03-15 16:32:52 +00:00
Dom Dwyer 65273721b6 feat(compactor): enable object store metrics 2022-03-15 16:32:52 +00:00
Dom Dwyer 5585dd3c21 refactor: switch to using DynObjectStore
Changes all consumers of the object store to use the dynamically
dispatched DynObjectStore type, instead of using a hardcoded concrete
implementation type.
2022-03-15 16:32:52 +00:00
Dom Dwyer 1d5066c421 refactor: rename ObjectStore -> ObjectStoreImpl
Frees up the name so we can use `dyn ObjectStore` throughout the
code instead of `ObjectStoreApi`.
2022-03-15 16:29:43 +00:00
Andrew Lamb 9b3f946c10
feat: all in 1 IOx NG mode (#3965)
* feat: Add all_in_one mode

* fix: doc

* docs: fix truncated docs

* refactor: correctly identify PG connections

* refactor: resolve failed merge

Co-authored-by: Dom Dwyer <dom@itsallbroken.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-15 16:28:37 +00:00
Nga Tran 5a29d070ea
feat: Implement the compact function for NG Compactor (#4001)
* feat: initial implementation of compact a given list of overlapped parquet files

* feat: Add QueryableParquetChunk and some refactoring

* feat: build queryable parquet chunks for parquet files with tombstones

* feat: second half the implementation for Compactor's compact. Tests will be next

* fix: comments for trait functions of QueryChunkMeta

* test: add tests for compactor's compact function

* fix: typos

* refactor: address Jake's review comments

* refactor: address Andrew's comments and add one more test for files in different order in the vector

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-11 20:25:19 +00:00
Andrew Lamb cc4875cca0
refactor: decouple ingester setup and creation logic from the config structs (#4020)
* refactor: decouple ingester setup and creation logic from the config structs

* fix: clippy

* refactor: remove comments
2022-03-11 19:25:50 +00:00
Andrew Lamb b24ae7d23b
refactor: extract out compactor creation from config (#4018)
* refactor: extract out compactor creation from config

* fix: fmt

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-11 14:46:34 +00:00
Andrew Lamb 29df46975f
refactor: extract out querier creation from config (#4017)
* refactor: extract out querier creation from config

* fix: clippy
2022-03-11 14:38:09 +00:00
dependabot[bot] bd2824aa59
chore(deps): Bump pprof from 0.6.2 to 0.7.0 (#4004)
* chore(deps): Bump pprof from 0.6.2 to 0.7.0

Bumps [pprof](https://github.com/tikv/pprof-rs) from 0.6.2 to 0.7.0.
- [Release notes](https://github.com/tikv/pprof-rs/releases)
- [Changelog](https://github.com/tikv/pprof-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/tikv/pprof-rs/commits)

---
updated-dependencies:
- dependency-name: pprof
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* fix: `pprof` features for version 0.7

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Marco Neumann <marco@crepererum.net>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-11 11:08:59 +00:00
Nga Tran f03ebd79ab
refactor: move querier's test utils to a new crate to get reused by tests in other crates (#4013)
* refactor: move querier's test utils to a new crate to be reused by tests in other crates

* chore: remove unused import
2022-03-10 18:17:58 +00:00
Andrew Lamb 2c3d30ca32
chore: Update datafusion, arrow, flight and parquet (#4000)
* chore: Update datafusion, arrow, flight and parquet

* fix: api change

* fix: fmt

* fix: update test metadata size

* fix: Update sizes in parquet test

* fix: more metadata size update
2022-03-10 12:24:47 +00:00
Carol (Nichols || Goulding) c891f3c4f2
fix: Print an error with sample env var variable if unset 2022-03-09 09:55:43 -05:00
Carol (Nichols || Goulding) 8af2f60b59
fix: Run catalog setup as part of end-to-end test setup 2022-03-09 09:55:43 -05:00
Carol (Nichols || Goulding) 93b0cdbcc4
fix: Create the test database as part of ng server fixture startup 2022-03-09 09:55:43 -05:00
Carol (Nichols || Goulding) e4fb227c6e
feat: Set the catalog URL explicitly in the test config 2022-03-09 09:55:43 -05:00
Carol (Nichols || Goulding) 465fb0272d
fix: Remove unneeded server id const 2022-03-09 09:55:43 -05:00
Carol (Nichols || Goulding) 6d086705a8
fix: Remove ng create_shared server fixture method for now 2022-03-09 09:55:42 -05:00
Carol (Nichols || Goulding) e315012fe3
fix: Switch NG tests to use TEST_INFLUXDB_IOX_CATALOG_DSN too 2022-03-09 09:55:42 -05:00
Carol (Nichols || Goulding) a14e642f39
refactor: Extract NG end-to-end tests from OG end-to-end tests 2022-03-09 09:55:42 -05:00
Carol (Nichols || Goulding) 1536bdeca0
test: Add an end-to-end NG test 2022-03-09 09:55:42 -05:00
Carol (Nichols || Goulding) 66a5649258
test: Print out the server type with server log messages
So that when you have more than one server running in a test, it's
easier to see which one is saying what
2022-03-09 09:55:42 -05:00
Carol (Nichols || Goulding) ae45d9f750
test: Add Router2 as a supported ServerFixture type 2022-03-09 09:55:42 -05:00
Andrew Lamb d2c0acdd46
refactor: Remove serving readiness gate (#3986)
* refactor: Remove serving_readiness

* fix: remove more

* fix: remove test

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-09 12:17:44 +00:00
Nga Tran 09fba1d2c0
feat: NG Compactor - main function for finding and compacting parquet files (#3973)
* feat: main function for finding and compacting parquet files

* chore: Apply suggestions from code review

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* refactor: rename file and struct

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-08 16:34:43 +00:00
Andrew Lamb 357ea52d8d
refactor: decouple router2 setup and creation logic from the config structs, switch to `Arc<dyn ServerType>` (#3969)
* refactor: Extract router server instantiation into `influxdb_ioxd`

* refactor: complete dyn ServerType

* fix: remove some leftovers

* fix: clippy

* fix: restore startup message order

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-08 14:06:18 +00:00
Andrew Lamb 7fa17ef1d7
refactor: remove unused error enums (#3970)
* refactor: remove unused error enums

* fix: fmt

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-07 21:40:55 +00:00
Paul Dix 337e432a0f
feat: ingester persists on cold partitions (#3942)
Add configuration and lifecycle to trigger partition persistence if it hasn't received a write in a given number of seconds.

Fixes #3869
2022-03-07 18:55:56 +00:00
Dom Dwyer aaec1c7828 build: propagate feature flags to influxdb_ioxd
When you did a "cargo build --no-default-features" before this commit,
the influxdb_iox crate was compiled without the default features
(namely jemalloc stuff) but the influxdb_ioxd crate was compiled with
jemalloc (it was not possible to compile without jemalloc support).

This commit fixes propagation of the jemalloc flag
("jemalloc_replacing_malloc") into the influxdb_ioxd crate.
2022-03-07 15:44:13 +00:00
Raphael Taylor-Davies 80fb75d90b
feat: add a flag to enable per-partition tracing (#3928)
* feat: add a flag to enable per-partition tracing

* chore: rename constant

* feat: use BooleanFlag and cache result
2022-03-07 13:49:23 +00:00
kodiakhq[bot] e47bfc40be
Merge branch 'main' into dom/router-grpc-write 2022-03-07 11:13:19 +00:00
dependabot[bot] 48908054d1
chore(deps): Bump once_cell from 1.9.0 to 1.10.0 (#3955)
* chore(deps): Bump once_cell from 1.9.0 to 1.10.0

Bumps [once_cell](https://github.com/matklad/once_cell) from 1.9.0 to 1.10.0.
- [Release notes](https://github.com/matklad/once_cell/releases)
- [Changelog](https://github.com/matklad/once_cell/blob/master/CHANGELOG.md)
- [Commits](https://github.com/matklad/once_cell/compare/v1.9.0...v1.10.0)

---
updated-dependencies:
- dependency-name: once_cell
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore: Run cargo hakari tasks

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2022-03-07 08:54:20 +00:00
Dom Dwyer 7c5ba34d44 refactor: enable gRPC handler
Plumbs the gRPC write handler into the existing router2 server.
2022-03-04 14:51:43 +00:00
Raphael Taylor-Davies 636381c805
fix: fix read-window-aggregate CLI defaults (#3930) 2022-03-04 12:09:07 +00:00
Dom Dwyer bb9b140f4b refactor: sequencer metrics
Records per-sequencer (kafka partition) enqueue latency / counts broken
down by operation success/error.
2022-03-03 23:40:13 +00:00
Raphael Taylor-Davies e304613546
feat: include trace ID in query log (#3912) (#3923)
* feat: include trace ID in query log (#3912)

* chore: fmt

* chore: lint
2022-03-03 17:50:49 +00:00
Andrew Lamb d6afda0227
refactor: split `influxdb_ioxd`, `clap_blocks` and `serving_readiness` from influxdb_iox crate (#3908)
* refactor: split influxdb_ioxd, clap_blocks, and serving_readiness out of influxdb_iox

split out serving readiness, get compiling

* fix: hakari

* fix: hakari again

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-03 16:48:30 +00:00
Raphael Taylor-Davies 82aa314659
feat: make wait-server-initialized wait for database init (#3915)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-03 15:37:22 +00:00
Edd Robinson ea32bc366a feat: add read_group tracing 2022-03-03 14:27:01 +00:00
Edd Robinson 32baaa1ee7 feat: add tracing to field_columns 2022-03-03 14:27:01 +00:00
Edd Robinson 787a848bf5 feat: add tracing for tag_values 2022-03-03 14:27:01 +00:00
Edd Robinson 6a6fbf73ae feat: add tracing support tag_keys 2022-03-03 14:27:01 +00:00
kodiakhq[bot] 04a7a957fe
Merge branch 'main' into dom/rustls 2022-03-03 11:59:40 +00:00
kodiakhq[bot] c0ebe5a2f1
Merge branch 'main' into dom/fix-feature-flags 2022-03-03 11:48:35 +00:00
Dom Dwyer 6204a9046d fix: jemalloc feature flag
In #2435 we moved the jemalloc metrics around, and missed a few feature
gates to avoid pulling in jemalloc unless it is requested.
2022-03-03 11:48:24 +00:00
Raphael Taylor-Davies 0660b514e3
feat: allow persisting all partitions and/or all tables from CLI (#3874)
* feat: allow persisting all partitions and/or all tables

* chore: use BTreeSet

* feat: non-zero exit code on error continue

* test: test continue-on-error

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-03 11:17:34 +00:00
Dom Dwyer 46bb107be4 refactor: use rustls
Removes openssl as a dependency, switching to rustls[1] as the TLS
implementation throughout.

It is important to note that this change brings with it a significant
behavioural difference - rustls does not currently support IP SANs in
certificates (instead only supporting fully-qualified names / DNS) and
this will manifest as a failure to connect to IP endpoints over TLS.
This might be a blocker that prevents us using rustls exclusively, but
there's no easy way to know without trying it. Fortunately the rustls
project has received funding to work on IP SAN support[2].

[1]: https://github.com/rustls/rustls
[2]: https://www.abetterinternet.org/post/preparing-rustls-for-wider-adoption/
2022-03-03 11:05:20 +00:00
kodiakhq[bot] caba3e9fd2
Merge branch 'main' into cn/querier-flight-request 2022-03-02 20:30:00 +00:00
Carol (Nichols || Goulding) 806a43eb6c
fix: Temporarily remove ingester end-to-end test case
This now needs more setup because without any data, the flight query
request returns an error. The setup is proving to be nontrivial.
2022-03-02 15:29:48 -05:00
Edd Robinson 3d047073b9
feat: add tracing down to the chunk level (#3804)
* refactor: wire execution context to Deduplicator

* feat: example trace to chunk read_filter

* refactor: make execution context required

* refactor: expose metadata API

* refactor: more span context for chunk read_filter

* refactor: fix build

* refactor: push context into result stream

* refactor: make executor optional
2022-03-02 19:08:22 +00:00
Carol (Nichols || Goulding) 2a90841715
refactor: Move IngesterQueryRequest to data_types2
So that querier doesn't need to depend on ingester.
2022-03-02 13:52:13 -05:00
Carol (Nichols || Goulding) 8f3e44bf76
refactor: Extract a crate for shared data types in the new design 2022-03-02 12:16:15 -05:00
Carol (Nichols || Goulding) 16d86ed05b
feat: Deserialize metadata to get max_sequencer_number
And add an end-to-end test for the flight request to the ingester.
2022-03-02 11:50:47 -05:00
Carol (Nichols || Goulding) 141a6087d0
feat: Querier able to send Flight queries to Ingester
Fixes #3773.
2022-03-02 11:50:45 -05:00
Andrew Lamb 286d5f7b2b
feat: add `success` column to system.queries (#3891)
* feat: add `success` column to system.queries

* refactor: Remove lifetime from QueryCompletedToken and thread through flight

* test: update test to make incomplete query clearer

* refactor: use better pattern to set complete

* fix: logical merge conflict
2022-03-02 15:05:06 +00:00
Dom Dwyer bd64f55658 feat: http ingest metrics
Records LP line count, field count & request body size (decompressed,
byte size) for writes, and request body byte size for deletes.
2022-03-02 13:05:55 +00:00
Marco Neumann af57664f53
feat: wire up flight into the querier (#3889) 2022-03-02 09:20:26 +00:00
Marko Mikulicic 39e39f92a8
fix: remove rskafka from valid options for INFLUXDB_IOX_WRITE_BUFFER_TYPE (#3890)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-03-01 17:29:02 +00:00
Marco Neumann 01f4d3ed3b
fix: check arrow instead of storage gRPC health for SQL REPL (#3886)
There is no reason a query pod should support the storage API. Note that
some features like the observer mode or `show databases;` still need the
management API. We'll probably need to fix that for NG at some point.
2022-03-01 15:46:02 +00:00
Marco Neumann ace4af1b66
feat: `DedicatedExecutor` async `join` and job `detach`. (#3835)
* feat: detach dedicated exec jobs

* feat: async `DedicatedExecutor::join`

Now `DedicatedExecutor` follows the system we use for other server
components:

- `shutdown`:  a quick sync call that signals the shutdown but doesn't
  drop
- `join`: async awaits until the executor has finished shutdown
- `drop`: warn but still try to shut down

* test: improve `detach_receiver` test

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2022-03-01 15:25:31 +00:00
Marco Neumann e5c45aeab6
refactor: generic top-level flight implementation (#3882)
This allows us to implement flight for the NG querier by just
implementing a few traits and reuse all the existing glue code and
optimizations (like dictionary handling).
2022-03-01 14:33:08 +00:00
Marco Neumann 48722783f9
feat: offer metrics for in-mem catalog (#3876)
This can be quite helpful to test certain caching behavior w/o writing
yet-another abstraction layer.
2022-03-01 11:33:54 +00:00
Marco Neumann 33851be3a5
chore: upgrade Rust to 1.59 (#3875)
Mostly a few new clippy lints around `flat_map`, `and_then`, and
"underscore locks" (!!!):
https://rust-lang.github.io/rust-clippy/master/index.html#let_underscore_lock

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-28 15:14:19 +00:00
Marco Neumann b213796c98
feat: sync namespaces in querier (#3865)
* feat: `NamespaceRepo::list`

* feat: sync namespaces in querier

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-28 15:01:28 +00:00
Raphael Taylor-Davies 2a842fbb1a
feat: correctly sort data and store in catalog metadata (#3864)
* feat: respect sort order in ChunkTableProvider (#3214)

feat: persist sort order in catalog (#3845)

refactor: owned SortKey (#3845)

* fix: size tests

* refactor: immutable SortKey

* test: test sort order restart (#3845)

* chore: explicit None for sort key

* chore: test cleanup

* fix: handling of sort keys containing fields

* chore: remove unused selected_sort_key

* chore: more docs

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-25 17:56:27 +00:00
Dom Dwyer 3579252682 refactor: configurable max catalog connections
Allow the maximum number of catalog (postgres) connections to be
specified as part of the catalog configuration.
2022-02-25 11:23:21 +00:00
kodiakhq[bot] a1b38b7102
Merge branch 'main' into dom/ns-cache-content-metrics 2022-02-24 16:31:09 +00:00
Luke Bond 34e06e8689
fix: compactor server stays up; removed unused delegates (#3855)
* fix: compactor server stays up; removed unused delegates

* chore: fmt

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-24 16:30:44 +00:00
Luke Bond 4731913c44
feat: skeleton of querier CLI (#3843)
* feat: skeleton of querier CLI

* chore: wrap metrics in opt&arc in querier to satisfy new api

* chore: derive debug in querier handler

* chore: add join handles and their shutdown to nascent querier server

* chore: querier server http unimpl -> 404

* fix: join/shutdown fix in querier; removed unused delegates
2022-02-24 15:42:56 +00:00
Dom Dwyer 4024e95ce9 refactor: borrow metric registry
There's no need for the namespace metrics to take (shared) ownership of
the metric registry, so lend it at the call site instead of cloning the
arc.
2022-02-24 15:04:49 +00:00
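The "borrow metric registry" change above can be illustrated with a small sketch: a constructor that only needs the registry while it registers its counters can take `&Registry` rather than an owned `Arc<Registry>`. The `Registry` and `NamespaceMetrics` types below are hypothetical stand-ins, not the IOx types.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for a shared metric registry.
#[derive(Default)]
pub struct Registry {
    counters: Mutex<HashMap<String, Arc<AtomicU64>>>,
}

impl Registry {
    /// Returns the counter registered under `name`, creating it if needed.
    pub fn register_counter(&self, name: &str) -> Arc<AtomicU64> {
        let mut counters = self.counters.lock().unwrap();
        Arc::clone(counters.entry(name.to_string()).or_default())
    }
}

/// Hypothetical metrics holder: it keeps the counter, not the registry.
pub struct NamespaceMetrics {
    hits: Arc<AtomicU64>,
}

impl NamespaceMetrics {
    /// Borrows the registry only for the duration of construction.
    pub fn new(registry: &Registry) -> Self {
        Self {
            hits: registry.register_counter("ns_cache_hits"),
        }
    }

    pub fn record_hit(&self) {
        self.hits.fetch_add(1, Ordering::Relaxed);
    }
}
```

A caller holding an `Arc<Registry>` can pass `&registry` directly; the reference count of the `Arc` is left untouched.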
Dom Dwyer d7eda88581 refactor: early schema validation
Changes the configuration of the router request pipeline to move schema
validation before partitioning.

This reduces the concurrency of calls into the schema validator when a
single write is split into one or more partitions, reducing contention
and cache thrashing. It also ensures we don't bother partitioning the
writes if the request will fail.
2022-02-23 18:59:14 +00:00
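The ordering argument in the "early schema validation" commit above can be sketched as follows: validating the whole write once, before it is split, means one validator call per request rather than one per partition, and a rejected write is never partitioned. All types and the day-based partition key are hypothetical simplifications.

```rust
use std::collections::BTreeMap;

/// Hypothetical row of a write: a table name, a partition key, a value.
pub struct Row {
    pub table: &'static str,
    pub day: &'static str,
    pub value: i64,
}

/// Hypothetical validator that counts how often it is called.
pub struct Validator {
    pub calls: usize,
    pub known_tables: Vec<&'static str>,
}

impl Validator {
    pub fn validate(&mut self, rows: &[Row]) -> Result<(), String> {
        self.calls += 1;
        for row in rows {
            if !self.known_tables.contains(&row.table) {
                return Err(format!("unknown table: {}", row.table));
            }
        }
        Ok(())
    }
}

/// Splits a write into per-day partitions.
pub fn partition(rows: Vec<Row>) -> BTreeMap<&'static str, Vec<Row>> {
    let mut out: BTreeMap<&'static str, Vec<Row>> = BTreeMap::new();
    for row in rows {
        out.entry(row.day).or_default().push(row);
    }
    out
}

/// Validate first, partition second: one validator call per request,
/// and no partitioning work for requests that fail validation.
pub fn handle_write(
    validator: &mut Validator,
    rows: Vec<Row>,
) -> Result<BTreeMap<&'static str, Vec<Row>>, String> {
    validator.validate(&rows)?;
    Ok(partition(rows))
}
```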
kodiakhq[bot] 3e69a5e1b4
Merge branch 'main' into dom/catalog-instrumentation 2022-02-23 16:14:40 +00:00
Marco Neumann 657ac249e9
feat: track ingester jobs (#3836) 2022-02-23 15:33:47 +00:00
Dom Dwyer aaf8951927 feat: instrument postgres catalog impl
Wraps the postgres implementation of the catalog with a MetricDecorator.

This is slightly intrusive, with the metrics registry being pushed into
the PostgresCatalog type in order to decorate the impls returned when
calling the Catalog::repositories() and Catalog::start_transaction()
methods (rather than being a pure decorator) in order to use static
dispatch and let the compiler optimise away as much overhead as
possible.
2022-02-23 14:34:26 +00:00
Raphael Taylor-Davies 746b18bbed
feat: allow overriding CLI request timeout (#3824)
* feat: allow overriding CLI request timeout

* chore: rename to --rpc-timeout to avoid name collision

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-23 11:35:45 +00:00
dependabot[bot] b63f920d4c
chore(deps): Bump parquet from 9.0.2 to 9.1.0 (#3828)
* chore(deps): Bump parquet from 9.0.2 to 9.1.0

Bumps [parquet](https://github.com/apache/arrow-rs) from 9.0.2 to 9.1.0.
- [Release notes](https://github.com/apache/arrow-rs/releases)
- [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/apache/arrow-rs/compare/9.0.2...9.1.0)

---
updated-dependencies:
- dependency-name: parquet
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore: update chunk size test

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Raphael Taylor-Davies <r.taylordavies@googlemail.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-23 11:25:15 +00:00
dependabot[bot] 5a79b3a68b
chore(deps): Bump arrow-flight from 9.0.2 to 9.1.0 (#3829)
Bumps [arrow-flight](https://github.com/apache/arrow-rs) from 9.0.2 to 9.1.0.
- [Release notes](https://github.com/apache/arrow-rs/releases)
- [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/apache/arrow-rs/compare/9.0.2...9.1.0)

---
updated-dependencies:
- dependency-name: arrow-flight
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-23 11:03:22 +00:00
dependabot[bot] 3b7d31c88a
chore(deps): Bump arrow from 9.0.2 to 9.1.0 (#3826)
Bumps [arrow](https://github.com/apache/arrow-rs) from 9.0.2 to 9.1.0.
- [Release notes](https://github.com/apache/arrow-rs/releases)
- [Changelog](https://github.com/apache/arrow-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/apache/arrow-rs/compare/9.0.2...9.1.0)

---
updated-dependencies:
- dependency-name: arrow
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-02-23 09:25:46 +00:00
dependabot[bot] ad3868ed7c
chore(deps): Bump tokio from 1.16.1 to 1.17.0 (#3814)
* chore(deps): Bump tokio from 1.16.1 to 1.17.0

Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.16.1 to 1.17.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.16.1...tokio-1.17.0)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* build: update workspace-hack

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Dom Dwyer <dom@itsallbroken.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-22 16:27:43 +00:00
dependabot[bot] 65ab5213e5
chore(deps): Bump clap from 3.0.14 to 3.1.1 (#3809)
Bumps [clap](https://github.com/clap-rs/clap) from 3.0.14 to 3.1.1.
- [Release notes](https://github.com/clap-rs/clap/releases)
- [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md)
- [Commits](https://github.com/clap-rs/clap/compare/v3.0.14...v3.1.1)

---
updated-dependencies:
- dependency-name: clap
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-02-22 14:51:53 +00:00
Raphael Taylor-Davies 1960645055
feat: add wildcard support to persist partition CLI command (#3790)
* feat: add wildcard support to persist partition

* chore: fmt

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-22 14:21:06 +00:00
Raphael Taylor-Davies 0229147909
feat: preserve catalog on error (#1522) (#3802)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-22 14:10:20 +00:00
kodiakhq[bot] 326871f149
Merge branch 'main' into dom/router-handler-chain 2022-02-22 13:49:09 +00:00
Luke Bond 0f012de70c
feat: adding compactor CLI command and crate
Closes: #3777
2022-02-21 12:24:09 +00:00
Raphael Taylor-Davies 39c42678d7
feat: trigger persistence if over soft limit and no evictable chunks (#3791)
* feat: trigger persistence if over soft limit and no evictable chunks

* chore: fmt

* fix: avoid test_full_lifecycle exceeding soft limit

* fix: don't expect chunk to be unloaded

* feat: only trigger if no outstanding persist

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-21 10:14:27 +00:00
Dom Dwyer bb132b61ad refactor: chain DML handlers
The router is composed of several DML handlers called in sequence in
order to construct the full request handling pipeline. Prior to this
commit, each handler nested the next handler it calls internally,
producing a nested call chain that resulted in metrics (added in #3764)
recording cumulative latency like this:

              ┌ ─

              │     ┌───────────────┐
                    │  NS Creation  │
              │     └───────────────┘
                            │  ┌───────────────┐
              │             │  │  Partitioner  │
                            │  └───────────────┘
              │             │          │
                            │          │
 Cumulative   │             │          │  ┌───────────────┐
   Timings                1.5s        1s  │    etc...     │
              │             │          │  └───────────────┘
                            │          │
              │             │          │
                            │  ┌───────────────┐
              │             │  │  Partitioner  │
                            │  └───────────────┘
              │     ┌───────────────┐
                    │  NS Creation  │
              │     └───────────────┘

              └ ─

This meant it was hard to determine the latency of a single handler
without knowing (and subtracting the latency of) all the child handlers
it calls.

This commit replaces the intrusive nested handler call chain with an
external Chain combinator type to compose together individual handlers,
resulting in correct per-handler timings and simpler code/tests:

          ┌───────────────┐
          │  NS Creation  │
          └───────────────┘
                  │
                 .5s       ┌───────────────┐
                  └───────▶│  Partitioner  │
                           └───────────────┘
                                   │
                                  1s    ┌───────────────┐
                                   └───▶│    etc...     │
                                        └───────────────┘
2022-02-18 14:19:53 +00:00
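The external `Chain` combinator described in the "chain DML handlers" commit above might look roughly like the sketch below. This is an illustration under assumptions, not the router's code: the `Handler` trait, the `Timed` decorator and the two toy handlers are hypothetical, and the real handlers are presumably async.

```rust
use std::collections::BTreeMap;
use std::time::{Duration, Instant};

/// Hypothetical handler trait with typed input and output.
pub trait Handler {
    type Input;
    type Output;
    fn handle(&mut self, input: Self::Input) -> Result<Self::Output, String>;
}

/// Composes two handlers externally: `second` receives the output of `first`.
pub struct Chain<A, B> {
    pub first: A,
    pub second: B,
}

impl<A, B> Chain<A, B>
where
    A: Handler,
    B: Handler<Input = A::Output>,
{
    pub fn new(first: A, second: B) -> Self {
        Self { first, second }
    }
}

impl<A, B> Handler for Chain<A, B>
where
    A: Handler,
    B: Handler<Input = A::Output>,
{
    type Input = A::Input;
    type Output = B::Output;

    fn handle(&mut self, input: Self::Input) -> Result<Self::Output, String> {
        let intermediate = self.first.handle(input)?;
        self.second.handle(intermediate)
    }
}

/// Records the latency of the wrapped handler only, not of later stages.
pub struct Timed<H> {
    pub inner: H,
    pub calls: u32,
    pub total: Duration,
}

impl<H> Timed<H> {
    pub fn new(inner: H) -> Self {
        Self { inner, calls: 0, total: Duration::ZERO }
    }
}

impl<H: Handler> Handler for Timed<H> {
    type Input = H::Input;
    type Output = H::Output;

    fn handle(&mut self, input: Self::Input) -> Result<Self::Output, String> {
        let start = Instant::now();
        let result = self.inner.handle(input);
        self.total += start.elapsed();
        self.calls += 1;
        result
    }
}

/// Toy stage that passes the write through unchanged.
pub struct NamespaceCreator;

impl Handler for NamespaceCreator {
    type Input = Vec<(String, i64)>;
    type Output = Vec<(String, i64)>;
    fn handle(&mut self, input: Self::Input) -> Result<Self::Output, String> {
        Ok(input)
    }
}

/// Toy stage that groups values by key.
pub struct Partitioner;

impl Handler for Partitioner {
    type Input = Vec<(String, i64)>;
    type Output = BTreeMap<String, Vec<i64>>;
    fn handle(&mut self, input: Self::Input) -> Result<Self::Output, String> {
        let mut out: BTreeMap<String, Vec<i64>> = BTreeMap::new();
        for (key, value) in input {
            out.entry(key).or_default().push(value);
        }
        Ok(out)
    }
}
```

Because each `Timed` wrapper sits around exactly one stage, its `total` covers that stage alone, which matches the per-handler timings in the second diagram.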
Raphael Taylor-Davies 83cba3d2fb
feat: template static router config (#3781)
* feat: template static router config

* chore: lint and improved failure output

* chore: clarify docs

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-18 10:53:10 +00:00
Marco Neumann f54ef92b77
fix: supervise and shutdown ingester background tasks (#3769)
* fix: supervise and shutdown ingester background tasks

Closes #3761.
Closes #3762.

* docs: improve wording

Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>

* test: join/shutdown handling for ingester

Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
2022-02-18 09:35:29 +00:00
Andrew Lamb 9588b43a90
fix: Make errors in rewriting return `Error` rather than a `panic` (#3767)
* test: add test for predicate errors

* fix: Return errors properly rather than panic

* fix: handle errors in influxrpc planner

* fix: appease clippy

* fix: tests

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-17 15:39:14 +00:00
kodiakhq[bot] c89fa3701e
Merge branch 'main' into dom/router2-metrics 2022-02-17 15:16:39 +00:00
Edd Robinson 7a2b43f1fb refactor: emit trace ID information 2022-02-16 19:01:14 +00:00
Edd Robinson e4e9b56930 feat: add support for auto-generating a query trace 2022-02-16 18:08:49 +00:00
Dom Dwyer d6d0ae8d80 feat: add instrumentation to request pipeline
Wraps the sharded write buffer, schema validator, partitioner and
overall request handler in instrumentation to record call latencies and
export them via the /metrics endpoint.
2022-02-16 14:00:49 +00:00
Dom Dwyer 92fe507e52 feat: instrumented namespace cache
Decorates the NamespaceCache with a set of cache get hit/miss counters,
and put insert/update counters to expose cache behaviour.
2022-02-16 14:00:49 +00:00
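The decorator described in the "instrumented namespace cache" commit above could be sketched like this. The trait shape, the `u64` schema placeholder and the counter names are assumptions made for illustration; only the idea (hit/miss counters on `get`, insert/update counters on `put`) comes from the commit message.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical cache interface; `u64` stands in for a namespace schema.
pub trait NamespaceCache {
    fn get(&self, namespace: &str) -> Option<u64>;
    /// Returns the previous value, if any.
    fn put(&mut self, namespace: &str, schema: u64) -> Option<u64>;
}

#[derive(Default)]
pub struct MemoryCache {
    entries: HashMap<String, u64>,
}

impl NamespaceCache for MemoryCache {
    fn get(&self, namespace: &str) -> Option<u64> {
        self.entries.get(namespace).copied()
    }
    fn put(&mut self, namespace: &str, schema: u64) -> Option<u64> {
        self.entries.insert(namespace.to_string(), schema)
    }
}

/// Wraps any cache and counts get hits/misses and put inserts/updates.
#[derive(Default)]
pub struct InstrumentedCache<C> {
    inner: C,
    hits: AtomicU64,
    misses: AtomicU64,
    inserts: AtomicU64,
    updates: AtomicU64,
}

impl<C: NamespaceCache> InstrumentedCache<C> {
    /// Returns (hits, misses, inserts, updates).
    pub fn snapshot(&self) -> (u64, u64, u64, u64) {
        (
            self.hits.load(Ordering::Relaxed),
            self.misses.load(Ordering::Relaxed),
            self.inserts.load(Ordering::Relaxed),
            self.updates.load(Ordering::Relaxed),
        )
    }
}

impl<C: NamespaceCache> NamespaceCache for InstrumentedCache<C> {
    fn get(&self, namespace: &str) -> Option<u64> {
        let result = self.inner.get(namespace);
        let counter = if result.is_some() { &self.hits } else { &self.misses };
        counter.fetch_add(1, Ordering::Relaxed);
        result
    }

    fn put(&mut self, namespace: &str, schema: u64) -> Option<u64> {
        let previous = self.inner.put(namespace, schema);
        let counter = if previous.is_some() { &self.updates } else { &self.inserts };
        counter.fetch_add(1, Ordering::Relaxed);
        previous
    }
}
```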
kodiakhq[bot] d0965bb0b2
Merge branch 'main' into dom/mb-partitioning 2022-02-16 11:30:42 +00:00
Paul Dix f542045485
feat: wire up persistence in ingester (#3685)
This adds persistence into the ingester with a lifecycle manager. The persist operation must still be updated to keep track of the min_unpersisted_sequence_number for each sequencer.
2022-02-16 00:13:40 +00:00
Edd Robinson 7ac9e216c4 refactor: use same log message 2022-02-15 14:36:55 +00:00
Edd Robinson 8a5ea29190 refactor: add measurement to log 2022-02-15 14:31:26 +00:00
Marco Neumann 44ee0166a0
fix: start Kafka write buffer stream at "earliest" offset, not at "0" (#3748) 2022-02-15 13:36:59 +00:00
Marco Neumann 9e7a27b344
fix: default Kafka topic name is `iox-shared` (#3747)
Do NOT use underscores in the Kafka topic because this is not supported
by Kafka. This was initially fixed by #3555 but reverted by #3623.
2022-02-15 12:34:46 +00:00
Andrew Lamb a30803e692
chore: Update datafusion, update `arrow`/`parquet`/`arrow-flight` to 9.0 (#3733)
* chore: Update datafusion

* chore: Update arrow

* fix: missing updates

* chore: Update cargo.lock

* fix: update for smaller parquet size

* fix: update test for smaller parquet files

* test: ensure parquet_file tests write multiple row groups

* fix: update callsite

* fix: Update for tests

* fix: hakari

* fix: use IoxObjectStore::existing

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-15 12:10:24 +00:00
Dom Dwyer e055800039 refactor: enable Partitioner in request pipeline
Adds the Partitioner DML handler into the handler stack, modifying the
input types of down-stream handlers to accept the partitioned data.
2022-02-15 11:34:33 +00:00
dependabot[bot] 89105ccfab
chore(deps): Bump tokio-util from 0.6.9 to 0.7.0 (#3743)
Bumps [tokio-util](https://github.com/tokio-rs/tokio) from 0.6.9 to 0.7.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/commits)

---
updated-dependencies:
- dependency-name: tokio-util
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-02-15 11:33:41 +00:00
Dom Dwyer e99922d518 refactor: parametrise DML handler input type
Allow a DML handler to specify the write input type on which it
operates.

This allows us to construct a write handler pipeline that transforms the
request as it passes through the various handlers. We'll use this to
implement a handler that annotates a normal set of table writes with the
partition key, modifying downstream handlers to expect this annotated
input.
2022-02-15 11:23:45 +00:00
Marco Neumann c6e374a025
feat: allow catalog access w/o a transaction (#3735)
* feat: allow catalog access w/o a transaction

Now the caller has full control over whether to use a transaction or
not.

* fix: remove non-transaction-safe `create_many`

* fix: remove unnecessary transactions
2022-02-15 10:15:36 +00:00
dependabot[bot] 60a7f87645
chore(deps): Bump serde_json from 1.0.78 to 1.0.79 (#3739)
Bumps [serde_json](https://github.com/serde-rs/json) from 1.0.78 to 1.0.79.
- [Release notes](https://github.com/serde-rs/json/releases)
- [Commits](https://github.com/serde-rs/json/compare/v1.0.78...v1.0.79)

---
updated-dependencies:
- dependency-name: serde_json
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-14 20:42:54 +00:00
Raphael Taylor-Davies 26fd5273f0
feat: static database configuration (#2436) (#3732)
* feat: static database configuration (#2436)

* chore: fmt

* feat: don't base64 encode UUIDs in ServerConfigFile

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-14 19:42:49 +00:00
Raphael Taylor-Davies c79050254f
refactor: traitify database configuration (#2436) (#3730)
* refactor: traitify database configuration (#2436)

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-13 09:26:44 +00:00
Raphael Taylor-Davies 866777ecd2
feat: static router configuration (#2436) (#3725)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-11 14:09:37 +00:00
Raphael Taylor-Davies 4e3f66ed07
feat: CLI and gRPC APIs for shutting down and restarting databases (#3720)
* feat: allow catalog wipe and rebuild whilst shutdown

* feat: CLI and gRPC APIs for shutting down and restarting databases

* feat: add ability to skip replay on restart

* fix: test_wipe_persisted_catalog_error_db_exists

* fix: wipe_preserved_catalog
2022-02-11 10:14:43 +00:00
Raphael Taylor-Davies 910f381355
refactor: require UUID to create Database (#3715)
* refactor: require UUID to create Database

* chore: review feedback

* chore: fmt

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-10 20:04:06 +00:00
Raphael Taylor-Davies b1190262b7
feat: restartable `Database` (#3368) (#3711)
* feat: restartable `Database` (#3368)

* chore: fmt

* fix: wipe_preserved_catalog

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-10 18:32:05 +00:00
Andrew Lamb d9f331ba2a
chore: update datafusion, stop repartitioning so aggressively (#3633)
* chore: update datafusion

* fix: Update to use new datafusion api

* chore: update expected plans

* fix: support zero output partitions

* fix: update test

* fix: Update for new DataFusion API

* fix: newly added system table

* fix: update cargo lock
2022-02-09 19:53:41 +00:00
Carol (Nichols || Goulding) 73828323ac
feat: Ingester Flight gRPC API (#3623)
* feat: Add a way to run ingester with an in-memory catalog from the CLI

If you set the --catalog-dsn string to "mem", rather than using that as
a Postgres connection URL, create an in-memory catalog.

Planning on using this in tests, so not documenting.

* fix: Set default topic to the same value as SHARED_KAFKA_TOPIC

Namely, both should use an underscore. I don't think there's a way to
directly share these values between a constant and an annotation.

* feat: Add a flight API (handshake only) to ingester

* fix: Create partitions if using file-based write buffer

* fix: Change the server fixture to handle ingester server type

For now, the ingester doesn't implement the deployment API. Not sure if
it should or not.

* feat: Start implementing ingester do_get, namely decoding the query

Skip serialization of the predicate for the moment.

* refactor: Rename ingest protos to ingester to match crate name

* refactor: Rename QueryResults to QueryData

* feat: Move ingester flight client to new querier crate

* fix: Off by one error, different starting indexes in sequencers

* fix: Create new CLI argument to pick the catalog type

* fix: Create a CLI option to set the number of topics to auto-create in the write buffer

* fix: Check the arrow flight service's health to tell that the ingester gRPC is up

* fix: Set postgres as the default catalog type

* fix: Return an error rather than panicking if CLI args aren't right
2022-02-09 19:07:44 +00:00
Edd Robinson 2334e779eb feat: implement read_window_aggregate sub-command 2022-02-09 12:32:48 +00:00
Edd Robinson 0774e1d328 feat: add read_window_aggregate request builder 2022-02-09 12:32:48 +00:00
Marco Neumann 4bddab56e2
feat: create new sequencers in ingester on demand (#3671)
There is no need to introduce yet another admin action to do that. If
the sequencer does not exist yet, we can just create it and set the
`min_unpersisted_sequence_number` to 0 (which is done by `create_or_get`).
2022-02-09 12:26:30 +00:00
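The on-demand creation described in the "create new sequencers" commit above amounts to a get-or-insert. A minimal in-memory sketch follows, assuming a hypothetical `SequencerRepo` keyed by Kafka partition; the real `create_or_get` lives in the catalog and its signature is not shown in the commit message.

```rust
use std::collections::HashMap;

/// Hypothetical sequencer record.
pub struct Sequencer {
    pub id: u32,
    pub kafka_partition: i32,
    pub min_unpersisted_sequence_number: u64,
}

/// Hypothetical in-memory repository of sequencers.
#[derive(Default)]
pub struct SequencerRepo {
    by_partition: HashMap<i32, Sequencer>,
}

impl SequencerRepo {
    /// Returns the existing sequencer for the partition, or creates one with
    /// `min_unpersisted_sequence_number` set to 0. Existing state is kept.
    pub fn create_or_get(&mut self, kafka_partition: i32) -> &mut Sequencer {
        let next_id = self.by_partition.len() as u32 + 1;
        self.by_partition.entry(kafka_partition).or_insert(Sequencer {
            id: next_id,
            kafka_partition,
            min_unpersisted_sequence_number: 0,
        })
    }
}
```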
Edd Robinson dfa6fd8579 feat: add quiet option to storage 2022-02-08 21:27:29 +00:00
Edd Robinson 11855a5eff feat: add format flag 2022-02-08 21:15:07 +00:00
Edd Robinson c175ccd1b4
feat: make start/stop/predicate global (#3681) 2022-02-08 20:06:47 +00:00
kodiakhq[bot] ace76cef14
Merge branch 'main' into dom/sharded-cache 2022-02-08 16:09:48 +00:00
Paul Dix 59b2141c0b
feat: Add lifecycle manager to ingester (#3645)
This adds the lifecycle manager to the ingester. It will trigger based on a threshold for max partition size or age or based on keeping total memory under a certain threshold.

It defines a new interface for a persister, which is stubbed out for IngesterData. I'm not sure yet how persistence errors should be handled. The assumption here is that the persister continues to retry persistence forever until it succeeds.

There is one scenario I can think of that may cause this lifecycle manager problems. If a single partition is very high throughput, it could cause things to back up as persistence is not parallelized within a single partition. Any given partition can currently only run one persistence operation at a time. We can address this later.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-08 15:23:40 +00:00
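The triggers named in the lifecycle manager commit above (per-partition size, per-partition age, total memory) can be sketched as a pure selection function. The config field names and the "largest partition first" policy for the memory threshold are assumptions; the commit message does not say how partitions are chosen once total memory is exceeded.

```rust
use std::time::Duration;

/// Hypothetical thresholds for the lifecycle manager.
pub struct LifecycleConfig {
    pub partition_size_threshold: usize,
    pub partition_age_threshold: Duration,
    pub total_memory_threshold: usize,
}

/// Hypothetical per-partition bookkeeping.
pub struct PartitionStats {
    pub id: u32,
    pub bytes: usize,
    pub age: Duration,
}

/// Returns the IDs of partitions that should be persisted now.
pub fn partitions_to_persist(cfg: &LifecycleConfig, parts: &[PartitionStats]) -> Vec<u32> {
    // 1. Any partition that is too large or too old is persisted.
    let mut picked: Vec<u32> = parts
        .iter()
        .filter(|p| {
            p.bytes >= cfg.partition_size_threshold || p.age >= cfg.partition_age_threshold
        })
        .map(|p| p.id)
        .collect();

    // 2. If the remaining partitions still exceed the memory budget,
    //    persist the largest ones until the total fits (assumed policy).
    let mut remaining: Vec<&PartitionStats> =
        parts.iter().filter(|p| !picked.contains(&p.id)).collect();
    remaining.sort_by(|a, b| b.bytes.cmp(&a.bytes));
    let mut total: usize = remaining.iter().map(|p| p.bytes).sum();
    for p in remaining {
        if total <= cfg.total_memory_threshold {
            break;
        }
        total -= p.bytes;
        picked.push(p.id);
    }
    picked
}
```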
Marco Neumann 5de4d6203f
refactor: catalog transaction (#3660)
* refactor: catalog Unit of Work (= transaction)

Set up an interface to handle Units of Work within our catalog. Previously
both the Postgres and the in-mem backend used "mini-transactions on
demand". Now the caller has a clear way to establish boundaries and
gets read and write isolation. A single `Arc<dyn Catalog>` can create as
many `Box<dyn UnitOfWork>` as you like, but note that depending on the
backend you may not scale infinitely (postgres will likely impose
certain limits and the in-mem backend limits concurrency to 1 to keep
things simple).

* docs: improve wording

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* refactor: rename Unit of Work to Transaction

* test: improve `test_txn_isolation`

* feat: clarify transaction drop semantics

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-08 13:38:33 +00:00
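The transaction boundaries described in the "catalog transaction" commit above can be illustrated with a toy in-memory backend. Holding the mutex guard for the lifetime of the transaction limits concurrency to 1, as the commit notes for the in-mem backend. Everything else here is an assumption: the types are hypothetical, and discarding staged changes when a transaction is dropped without `commit` is this sketch's choice, not a statement of IOx's drop semantics.

```rust
use std::collections::BTreeMap;
use std::sync::{Mutex, MutexGuard};

/// Hypothetical in-memory catalog: a single map behind a mutex.
#[derive(Default)]
pub struct MemCatalog {
    state: Mutex<BTreeMap<String, i64>>,
}

/// A transaction holds the lock (concurrency of 1) and a staged copy.
pub struct Transaction<'a> {
    guard: MutexGuard<'a, BTreeMap<String, i64>>,
    staged: BTreeMap<String, i64>,
}

impl MemCatalog {
    pub fn start_transaction(&self) -> Transaction<'_> {
        let guard = self.state.lock().unwrap();
        let staged = guard.clone();
        Transaction { guard, staged }
    }

    /// Non-transactional read; blocks while a transaction is open.
    pub fn get(&self, key: &str) -> Option<i64> {
        self.state.lock().unwrap().get(key).copied()
    }
}

impl Transaction<'_> {
    pub fn put(&mut self, key: &str, value: i64) {
        self.staged.insert(key.to_string(), value);
    }

    /// Reads see this transaction's own staged writes.
    pub fn get(&self, key: &str) -> Option<i64> {
        self.staged.get(key).copied()
    }

    /// Publishes the staged state. Dropping the transaction without
    /// calling this discards the staged changes (assumed semantics).
    pub fn commit(mut self) {
        *self.guard = std::mem::take(&mut self.staged);
    }
}
```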
kodiakhq[bot] 4567800901
Merge branch 'main' into er/feat/tag_values_cli 2022-02-08 13:07:59 +00:00
Raphael Taylor-Davies be662ec731
feat: lazy query log! (#3654)
* feat: lazy query log

* chore: fmt

* chore: review feedback

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-08 13:07:28 +00:00
Edd Robinson 6c10e1e901 feat: support _measurement/_field tag keys 2022-02-08 11:32:28 +00:00
Edd Robinson eb733042ca feat: add support for tag_values cli 2022-02-07 22:02:29 +00:00
Edd Robinson 38a889ecf6 refactor: remove unnecessary struct 2022-02-07 22:02:29 +00:00
Marco Neumann d9cc9f5a2a
feat: expose write buffer connection config via CLI (#3651)
* feat: improve rskafka config error messages

* feat: expose write buffer connection config via CLI
2022-02-07 16:24:28 +00:00
Marco Neumann 977ccc1989
fix: use a single metric registry for ingester (#3652)
With this change write buffer ingestion metrics are showing up under
`/metrics`

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-07 15:56:54 +00:00
Edd Robinson 87ac926e06
feat: add queries system table (#3655)
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2022-02-07 15:26:06 +00:00
Carol (Nichols || Goulding) 2e30483f1f
refactor: Remove predicate module from predicate crate (#3648)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-07 14:54:07 +00:00
Marco Neumann e2db1df11f
refactor: improve write buffer consumer interface (#3631)
* refactor: improve write buffer consumer interface

The change looks huge but is actually rather simple. To
understand the interface change, let me first explain what we want:

- be able to fetch watermarks for any sequencer
- have streams:
  - each stream tracks a sequencer and has an offset state (no read
    multiplexing)
  - we can seek a stream
  - seeking and streaming cannot be done at the same time (that would be
    weird and would likely lead to many bugs both in the write buffer and
    in the user code)
- ideally we don't need to create streams of all sequencers but can
  choose a subset

Before this change we had one mutable consumer struct where you can get
all streams and watermark functions (this mutable-borrows the consumer)
or you can seek a single stream (this also mutable-borrows the
consumer). This is a bit weird for multiple reasons:

- you cannot seek a single stream without dropping all of them
- the mutable-borrow construct makes it really difficult to pass the
  streams into separate threads
- the consumer is boxed (because it's mutable) which makes it more
  difficult to handle in a large-scale application

What this change does is the following:

- you have an immutable consumer (similar to the producer)
- the consumer offers the following methods:
  - get the set of sequencer IDs
  - get watermark for any sequencer
  - get a stream handler (see next point) for any sequencer
- the stream handler captures the stream state (offset) and provides you
  a standard `Stream<_>` interface as well as a seek function.
  Mutable-borrows ensure that you cannot use both at the same time.

The stream handler provides you the stream via `handler.stream()`. It
doesn't implement `Stream<_>` itself because of the way boxing, dynamic
dispatch, and pinning interact (i.e. I couldn't get it to work
without the indirection).

As a bonus point (which we don't use however) you can now create
multiple streams for the same sequencer and they all have their own
offset.

* fix: review comments

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-07 12:24:17 +00:00
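The consumer shape described in the "improve write buffer consumer interface" commit above can be sketched synchronously. This is a simplification under assumptions: a plain `Iterator` stands in for the async `Stream<_>`, entries are strings, and all type and method names are hypothetical. What it does show is the borrow structure: the consumer is immutable, each handler owns its offset, and `stream()` and `seek()` both take `&mut self`, so they cannot be used at the same time.

```rust
use std::collections::BTreeMap;
use std::sync::Arc;

/// Immutable consumer over a fixed set of sequencers.
pub struct Consumer {
    sequencers: BTreeMap<u32, Arc<Vec<String>>>,
}

impl Consumer {
    pub fn new(data: BTreeMap<u32, Vec<String>>) -> Self {
        Self {
            sequencers: data.into_iter().map(|(id, e)| (id, Arc::new(e))).collect(),
        }
    }

    pub fn sequencer_ids(&self) -> Vec<u32> {
        self.sequencers.keys().copied().collect()
    }

    /// Offset one past the last entry of the given sequencer.
    pub fn fetch_high_watermark(&self, sequencer_id: u32) -> Option<usize> {
        self.sequencers.get(&sequencer_id).map(|e| e.len())
    }

    /// Each handler has its own offset; several may exist per sequencer.
    pub fn stream_handler(&self, sequencer_id: u32) -> Option<StreamHandler> {
        self.sequencers
            .get(&sequencer_id)
            .map(|e| StreamHandler { entries: Arc::clone(e), offset: 0 })
    }
}

/// Captures the stream state (offset) for one sequencer.
pub struct StreamHandler {
    entries: Arc<Vec<String>>,
    offset: usize,
}

impl StreamHandler {
    /// Yields `(offset, entry)` pairs and advances the handler's offset.
    /// The returned iterator mutably borrows the handler.
    pub fn stream(&mut self) -> impl Iterator<Item = (usize, String)> + '_ {
        std::iter::from_fn(move || {
            let entry = self.entries.get(self.offset)?.clone();
            let offset = self.offset;
            self.offset += 1;
            Some((offset, entry))
        })
    }

    /// Only callable while no stream borrowed from this handler is alive.
    pub fn seek(&mut self, offset: usize) {
        self.offset = offset;
    }
}
```

Calling `seek` while an iterator returned by `stream()` is still in use is rejected by the borrow checker, which mirrors the "cannot use both at the same time" guarantee in the commit message.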
Edd Robinson a52c0a26e6 feat: print read filter results 2022-02-04 22:14:22 +00:00
Edd Robinson d328b37803 feat: teach IOx to convert RPC frames into Recordbatches 2022-02-04 18:34:54 +00:00
Edd Robinson 4cdaaf96bf refactor: clean up errors 2022-02-04 18:34:54 +00:00
Edd Robinson ea0ece8b4b feat: issue read_filter request 2022-02-04 18:34:54 +00:00
Dom Dwyer 0b044b95fb perf: use sharded namespace cache
Enables the ShardedCache for the namespace schema cache.
2022-02-04 16:12:51 +00:00
Dom Dwyer 026a557c0b refactor: rename TableNamespaceSharder
Rename to JumpHash and expose the hashing internals for reuse (outside
of only table & namespace sharding).
2022-02-04 15:56:09 +00:00
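Assuming the `JumpHash` name in the rename commit above refers to the jump consistent hash algorithm published by Lamping and Veach, the core of such a sharder can be sketched as below. This is a transcription of the published algorithm, not the IOx implementation; how IOx derives the `u64` key from a table or namespace name is not shown in the commit message.

```rust
/// Jump consistent hash: maps `key` to a bucket in `0..num_buckets`.
/// When the bucket count grows from n to n + 1, a key either keeps its
/// bucket or moves to the new bucket n.
pub fn jump_hash(mut key: u64, num_buckets: u32) -> u32 {
    assert!(num_buckets > 0, "at least one bucket is required");
    let mut bucket: i64 = -1;
    let mut jump: i64 = 0;
    while jump < i64::from(num_buckets) {
        bucket = jump;
        // Linear congruential step from the published algorithm.
        key = key.wrapping_mul(2862933555777941757).wrapping_add(1);
        let scale = (1u64 << 31) as f64 / ((key >> 33) + 1) as f64;
        jump = ((bucket + 1) as f64 * scale) as i64;
    }
    bucket as u32
}
```

The property in the doc comment is what makes the hash suitable for sharding: adding a shard moves only the keys that land on the new shard.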
Dom Dwyer 0fd122e365 refactor: "inf" retention const
Adds the iox_catalog::INFINITE_RETENTION_POLICY constant.
2022-02-04 15:35:33 +00:00
Dom Dwyer f1ba50f40b feat: resolve query pool ID at startup
This commit adds a --query-pool flag to router2, used to upsert a
catalog record at startup. Auto-created namespaces will reference this
query pool.

This is for testing only and will be removed in a future commit.
2022-02-04 15:35:30 +00:00
Dom Dwyer aefc70a9ea feat(router2): namespace auto-creation
Decorate the existing request handler pipeline with a layer that
implicitly creates the namespace when a write request is received.
2022-02-04 15:34:15 +00:00
Marco Neumann 0c01044677
fix: partition range in ingester CLI has INCLUSIVE end (#3641) 2022-02-04 13:41:57 +00:00
Marco Neumann d2ccf23263
fix: use standard DSN argument for router2 CLI (#3632)
- support long-form (instead of relying on positional arguments)
- use same code as everything else

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-03 17:20:52 +00:00
Markus Westerlind 0bd7941a18
fix(REPL): Don't buffer lines until a trailing semicolon is found and add history hinting (#3630)
* fix(REPL): Don't buffer lines until a trailing semicolon is found

The repl would silently buffer all lines until a trailing semicolon was found, which
resulted in some very confusing error messages: I would input invalid commands followed
by a command I thought was valid, yet I'd still get an error due to the previous command being buffered.

This uses rustyline's helper feature to detect incomplete input (no trailing semicolon) and makes
it accept multiline input until the input is completed.

I also included some of rustyline's default hint and highlighting while I was at it.

* chore: cargo clippy

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-03 17:11:01 +00:00
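The completeness check behind the REPL fix above can be sketched as a standalone function. This is an assumption-laden illustration: the `InputState` enum and `check_input` function are hypothetical, and in the real REPL a check like this would sit behind rustyline's validation hook rather than be called directly.

```rust
/// Hypothetical result of checking the text typed so far.
#[derive(Debug, PartialEq)]
pub enum InputState {
    /// The statement ends with a semicolon and can be executed.
    Complete,
    /// Keep reading further lines into the same statement.
    Incomplete,
}

/// Treats input as complete once it ends with a semicolon,
/// ignoring trailing whitespace and newlines.
pub fn check_input(buffer: &str) -> InputState {
    if buffer.trim_end().ends_with(';') {
        InputState::Complete
    } else {
        InputState::Incomplete
    }
}
```

With a check like this the line editor keeps the unfinished statement visible as multiline input, instead of silently buffering it across prompts.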
Marco Neumann b3b2d9b623
feat: catalog setup CLI command (#3627)
Closes #3509.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-03 14:16:21 +00:00
Andrew Lamb ab3c7573f5
test: add end to end for read_filter and empty string predicates (#3619)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-03 14:05:32 +00:00
Marco Neumann 50cff27b01
chore: remove rdkafka dependency (#3625)
All features are now covered by rskafka. This also removes the need to
specify a server ID for write buffer consumers. This was only used for
rdkafka since there we needed to specify a consumer group, even though
we did not use any transactions.
2022-02-03 13:33:56 +00:00
kodiakhq[bot] 3197ea945b
Merge branch 'main' into dom/extract-ns-cache 2022-02-03 12:30:37 +00:00
Andrew Lamb 77b80e7618
fix(InfluxQL): treat null tags as `''` rather than `null` in storagerpc queries (#3557)
* fix(InfluxQL): treat null tags as `''` rather than `null` in storage rpc queries

* test: add one more case

* fix: Update comment

Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>

Co-authored-by: Raphael Taylor-Davies <1781103+tustvold@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-03 12:14:43 +00:00
Paul Dix ce46bbaada
feat: wire up the write buffer to the ingester process (#3533)
This adds the scaffolding for the ingester server to consume data from Kafka. This ingests data into an in-memory structure while creating records in the catalog for any partitions that don't yet exist.

I've removed catalog_update.rs in ingester for now. That was mostly a placeholder and will be going in a combination of handler.rs and data.rs on my next PR which will have some primitive lifecycle wired up.

There's one ugly bit here where the DML write is cloned because it's getting borrowed to output spans and metrics. I'll need to follow up with a refactor to make it so that the DML write's tables can be consumed without it gumming up the metrics stuff.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-03 11:47:28 +00:00
Dom Dwyer 3cc4481616 refactor: extract NamespaceSchema cache
Breaks the in-memory cache of NamespaceSchema out into a decoupled type
that can be shared across multiple DML handlers.
2022-02-03 10:01:07 +00:00
Carol (Nichols || Goulding) a534136ccc
fix: Correct a 'long' argument name so the ingester command can run (#3621)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-02 20:09:41 +00:00
Andrew Lamb 429d59f1b6
feat: Simplify predicates in the `InfluxRpcFrontend` before using them (#3588)
* feat: normalize + simplify RPC predicates before using them

* docs: Update predicate/src/rpc_predicate.rs

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-02-02 19:46:57 +00:00
kodiakhq[bot] 6de8ed4adc
Merge branch 'main' into dom/schema-validation 2022-02-02 16:05:41 +00:00
Luke Bond 6da15d9690 chore: cleanup catalog CLI output & args
Co-authored-by: Marco Neumann <marco@crepererum.net>
2022-02-02 15:30:19 +00:00
Luke Bond 15827a534b feat: catalog CLI command with update subcommand 2022-02-02 15:30:19 +00:00
Dom Dwyer 39d489d9e7 refactor: enable schema validation
Adds the SchemaValidator to the DML handler stack - this adds it into
the request path in router2.
2022-02-02 14:04:14 +00:00
Edd Robinson 5441682207 feat: add support for parsing predicate 2022-02-02 11:02:33 +00:00
Edd Robinson 08901c13cd feat: support parsing timerange 2022-02-02 11:02:33 +00:00
Edd Robinson a424d1c912 feat: shell command read_filter 2022-02-02 11:02:33 +00:00
Marco Neumann 59a2c74352
refactor: reusable ingester/router2 CLI pieces (#3590)
* refactor: use a single CLI parser for ingester/router2 WB

* refactor: reusable catalog DSN CLI handling

We are going to need DSN handling for the router as well as for some
admin tools.

* fix: DNS -> DSN
2022-02-01 12:57:58 +00:00
Marco Neumann 22778a3a80
chore: upgrade rskafka and parking_lot (#3592) 2022-02-01 11:50:42 +00:00
Marco Neumann b326b62b44
feat: buffer writes when writing to RSKafka (#3520) 2022-02-01 10:07:52 +00:00
Carol (Nichols || Goulding) c633c9bc5c
feat: Wire object store into ingester persistence 2022-01-31 10:36:30 -05:00
Marco Neumann c50fc8764d
feat: basic non-panic HTTP/gRPC interface for ingester (#3583)
Don't panic when K8s requests a health status or someone requests a
non-found HTTP route; or when we just TRY to start up the gRPC service.
2022-01-31 11:13:14 +00:00
Andrew Lamb 36642eb71d
test: add end to end tests that query missing tags (#3563)
* test: add end to end tests that query missing tags

* fix: add github reference

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-28 18:42:57 +00:00
Marco Neumann 3659a7f799
refactor: clean up ingester CLI (#3569)
- use same args/envs names as router2 does
- kafka => write buffer
- add long forms to all CLI args so we don't have to pass positional
  arguments

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-28 17:58:16 +00:00
Raphael Taylor-Davies 4101d16f71
chore: feature flag consistency (#3574)
* chore: feature flag consistency

* chore: add aarch64-apple-darwin to hakari

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-28 16:38:59 +00:00
Marco Neumann a22ca7c3d7
fix: router2 write buffer topic (#3555)
- Kafka does not support `_` in topic names, but `-` works, so let's
  change the default
- Expose topic config via CLI/env
2022-01-28 10:10:04 +00:00
Andrew Lamb f24ce03754
fix: provide correct environment variable to change log filter (#3561)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-27 20:45:06 +00:00
Andrew Lamb 2062267d0f
chore: Update hashbrown (#3551)
* chore: Update hashbrown

* fix: hakari

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-27 15:34:10 +00:00
Dom 5447554aee
refactor(router2): DML handler stack (#3549)
* refactor: composable DmlHandler stack

Changes the DmlHandler trait to allow composition of handler logic in
order to construct the complete request processing pipeline.

* feat: debug log write/delete requests

Log requests hitting the HTTP endpoint at DEBUG.

* refactor: dml_handler -> dml_handlers

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-27 14:54:27 +00:00
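The composable handler idea from the DML handler stack commit above can be sketched roughly as follows. This is only an illustration under stated assumptions: the trait signature, the `Chain` combinator, and the two toy handlers (`RejectEmpty`, `CountLines`) are hypothetical and do not mirror the actual router2 `DmlHandler` trait, which the commit message does not show.

```rust
/// Hypothetical, simplified stand-in for a composable handler trait.
pub trait DmlHandler {
    type Input;
    type Output;
    fn write(&self, namespace: &str, input: Self::Input) -> Result<Self::Output, String>;
}

/// Runs `first`, then feeds its output into `second`.
pub struct Chain<A, B> {
    pub first: A,
    pub second: B,
}

impl<A, B> DmlHandler for Chain<A, B>
where
    A: DmlHandler,
    B: DmlHandler<Input = A::Output>,
{
    type Input = A::Input;
    type Output = B::Output;

    fn write(&self, namespace: &str, input: Self::Input) -> Result<Self::Output, String> {
        // An error from the first stage short-circuits the pipeline.
        let intermediate = self.first.write(namespace, input)?;
        self.second.write(namespace, intermediate)
    }
}

/// Toy validation stage (assumption): rejects empty payloads.
pub struct RejectEmpty;

impl DmlHandler for RejectEmpty {
    type Input = String;
    type Output = String;

    fn write(&self, _namespace: &str, input: String) -> Result<String, String> {
        if input.is_empty() {
            Err("empty write".to_string())
        } else {
            Ok(input)
        }
    }
}

/// Toy terminal stage (assumption): reports how many lines were accepted.
pub struct CountLines;

impl DmlHandler for CountLines {
    type Input = String;
    type Output = usize;

    fn write(&self, _namespace: &str, input: String) -> Result<usize, String> {
        Ok(input.lines().count())
    }
}
```

Building `Chain { first: RejectEmpty, second: CountLines }` yields a single handler whose `write` runs both stages in order, which is the sense in which handler logic composes into one request pipeline.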
Raphael Taylor-Davies 21c1824a7a
refactor: remove table_names from Predicate (#3545)
* refactor: remove table_names from Predicate

* chore: fix benchmarks

* chore: review feedback

Co-authored-by: Edd Robinson <me@edd.io>

* chore: review feedback

* chore: replace Default::default with InfluxRpcPredicate::default()

Co-authored-by: Edd Robinson <me@edd.io>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-27 14:44:49 +00:00
Paul Dix 16d584b2ff
feat: Add db_name/namespace to DmlWrite and DmlDelete (#3531)
* feat: Add db_name/namespace to DmlWrite and DmlDelete

This is required for the new ingester to be able to work with the write buffer. The protobuf that gets serialized over Kafka already includes the database name; it just wasn't getting carried through to the marshaled Dml operation.

* fix: database != namespace, propagation through write buffer

Co-authored-by: Marco Neumann <marco@crepererum.net>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-27 14:12:20 +00:00
Andrew Lamb 5488c257d1
chore: Update datafusion, upgrade to arrow/parquet/arrow-flight 8.0.0 (#3517)
* chore: Update datafusion

* chore: update to arrow 8

* fix: update to use new DataFusion APIs

* fix: update case for sortedness

* fix: cargo hakari
2022-01-27 13:33:27 +00:00
Luke Bond 107f39d53c
feat: add trace collector to router2 (#3529)
* feat: add trace collector to router2

* chore: fmt
2022-01-26 11:51:17 +00:00
Dom 6b0f7e6b2b
feat: initialise ShardedWriteBuffer (#3528)
Initialises a ShardedWriteBuffer for the hard-coded "iox_shared" topic.

Adds the following CLI flags:

    * --write-buffer: type of buffer [kafka, rskafka, file]
    * --write-buffer-addr: write buffer endpoint address

The server uses these config options to initialise the appropriate write
buffer backend, and configure the TableNamespaceSharder to shard
operations over the set of sequencers exposed by the write buffer.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-26 10:49:34 +00:00
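As a rough illustration of the commit above, the sketch below parses the three `--write-buffer` values it lists and picks a sequencer by hashing the table and namespace. Everything here is an assumption made for illustration: `WriteBufferKind`, `HashSharder`, and the hash-modulo strategy are hypothetical and are not taken from the actual `ShardedWriteBuffer` or `TableNamespaceSharder` code.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::str::FromStr;

/// Hypothetical enum for the values accepted by `--write-buffer`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WriteBufferKind {
    Kafka,
    RsKafka,
    File,
}

impl FromStr for WriteBufferKind {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "kafka" => Ok(Self::Kafka),
            "rskafka" => Ok(Self::RsKafka),
            "file" => Ok(Self::File),
            other => Err(format!("unknown write buffer type: {}", other)),
        }
    }
}

/// Hypothetical sharder: maps a (table, namespace) pair onto one of a fixed
/// set of shards (e.g. sequencers) by hashing.
pub struct HashSharder<T> {
    shards: Vec<T>,
}

impl<T> HashSharder<T> {
    /// Panics if `shards` is empty, since there would be nothing to route to.
    pub fn new(shards: Vec<T>) -> Self {
        assert!(!shards.is_empty(), "at least one shard is required");
        Self { shards }
    }

    /// The same (table, namespace) pair always maps to the same shard.
    pub fn shard(&self, table: &str, namespace: &str) -> &T {
        let mut hasher = DefaultHasher::new();
        table.hash(&mut hasher);
        namespace.hash(&mut hasher);
        let idx = (hasher.finish() % self.shards.len() as u64) as usize;
        &self.shards[idx]
    }
}
```

Hashing on both table and namespace keeps all writes for one table of one namespace on a single sequencer while spreading different tables across the set.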
Raphael Taylor-Davies 1b6aed063d
feat: add per-partition tracing (#3532)
* feat: add per-partition tracing

* chore: docs

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-26 10:39:21 +00:00
Raphael Taylor-Davies db46ac04d0
feat: support line protocol precision parameter (#3522) (#3526)
* feat: support line protocol precision parameter (#3522)

* chore: format imports

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-25 14:12:22 +00:00
Paul Dix bb893510a0 feat: Add scaffolding for ingester server
* Adds a new ingester command to start an ingester server
* Moves previous ingester server over to handler
* Skeleton for gRPC and HTTP handlers
2022-01-21 18:02:19 -05:00
Andrew Lamb 9c19cd6cc4
fix: clamp start/end of TimestampRange to min/max valid timestamp values (#3487)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-20 16:08:00 +00:00
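A minimal sketch of the clamping behaviour named in the commit above, assuming a range type with `i64` nanosecond bounds. The `MIN_VALID` and `MAX_VALID` constants and the accessor names are placeholders chosen for illustration; the commit message does not state the real limits or the real type layout.

```rust
/// Placeholder bounds (assumption): the real valid range is defined in the codebase.
pub const MIN_VALID: i64 = i64::MIN + 2;
pub const MAX_VALID: i64 = i64::MAX - 1;

/// Hypothetical range type whose constructor clamps both ends.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TimestampRange {
    start: i64,
    end: i64,
}

impl TimestampRange {
    /// Out-of-range inputs are pulled back to the nearest valid value
    /// instead of being stored as-is.
    pub fn new(start: i64, end: i64) -> Self {
        Self {
            start: start.clamp(MIN_VALID, MAX_VALID),
            end: end.clamp(MIN_VALID, MAX_VALID),
        }
    }

    pub fn start(&self) -> i64 {
        self.start
    }

    pub fn end(&self) -> i64 {
        self.end
    }
}
```

With this shape, a caller passing `i64::MIN` and `i64::MAX` to mean "all time" still ends up with a range made only of valid timestamps.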
Andrew Lamb 9751301374
refactor: rename `storage_api.rs` end to end test to `influxrpc.rs` for consistency (#3497)
* refactor: rename `storage_api` end to end test to `influxrpc` for consistency

* fix: fmt
2022-01-20 14:25:13 +00:00
Marco Neumann 168afb63ad feat: add `size` methods to DML-related types
This will be helpful when we want to batch DML operations in memory
(e.g. when using RSKafka).

This also ensures that `MBChunk` accounts for the column names that
are stored within `MutableBatch`.
2022-01-18 13:52:31 +01:00
Dom 40a290f6f7 feat: router2 HTTP handlers
Implements the HTTP v2 write API endpoint for router2.
2022-01-17 11:57:28 +00:00
Marco Neumann c399e676ca chore: upgrade clap to v3 2022-01-17 12:12:46 +01:00
Raphael Taylor-Davies 89db894df4
fix: serde_json `arbitrary_precision` (#3458) (#3469)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-17 09:50:54 +00:00
Andrew Lamb b036db293f
fix: Format `ParenExpression` RPC storage Expression `Node`s (#3463)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-14 16:02:51 +00:00
Edd Robinson 36ec6019f9 feat: wire up token to query frontends 2022-01-14 10:26:11 +00:00
Andrew Lamb dd23056efd
chore: update datafusion, arrow, prost, tonic, pbjson, etc (#3455)
* chore: update datafusion, arrow, prost, tonic, etc

* fix: update pprof as well

* chore: update hakari

* fix: update pbjson

* chore: update heappy

* fix: hakari

* fix: workaround https://github.com/influxdata/influxdb_iox/issues/3458

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-01-13 17:07:15 +00:00
Dom 430823c148
docs: fix typo
Co-authored-by: Marko Mikulicic <mkm@influxdata.com>
2022-01-13 15:42:30 +00:00
Dom 3d32901877 fix: undetected gzip HTTP body truncation
When reading the gzip-encoded body of an HTTP request, the stream was read
up until the configured maximum number of allowable bytes, at which
point the body was silently truncated. This could allow fields in
submitted line protocol to be silently lost (amongst other bad things).

This change ensures that truncation results in a RequestSizeExceeded
error.
2022-01-13 14:38:37 +00:00
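The truncation fix described in the commit above can be illustrated with a size-limited read that errors instead of silently cutting the body short. This is a sketch under assumptions, not the router's actual code: `read_limited` and `BodyError` are hypothetical names (only `RequestSizeExceeded` comes from the commit message), and any `std::io::Read` stands in for the gzip decoder stream.

```rust
use std::io::Read;

/// Hypothetical error type; only the `RequestSizeExceeded` name is from the commit.
#[derive(Debug, PartialEq)]
pub enum BodyError {
    /// The body was larger than the configured limit (carried here in bytes).
    RequestSizeExceeded(usize),
    Io(String),
}

/// Reads the whole body, failing if it is larger than `max_bytes`.
pub fn read_limited<R: Read>(body: R, max_bytes: usize) -> Result<Vec<u8>, BodyError> {
    let mut buf = Vec::new();
    // Read one byte past the limit so that an oversized body can be told
    // apart from a body that fits exactly.
    body.take((max_bytes as u64).saturating_add(1))
        .read_to_end(&mut buf)
        .map_err(|e| BodyError::Io(e.to_string()))?;

    if buf.len() > max_bytes {
        return Err(BodyError::RequestSizeExceeded(max_bytes));
    }
    Ok(buf)
}
```

The key point is the extra byte: capping the read at exactly `max_bytes` cannot distinguish a full-size body from a truncated one, which is the failure mode the commit describes.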