
812 Commits (5398a1b986ceb42d76a62dff43c7b005159774f1)

Author SHA1 Message Date
Marco Neumann 15b3705f9a
feat: add "read group" support to storage CLI (#5601)
* fix: do not panic if measurement name is not the first tag

* feat: add "read group" support to storage CLI

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-09-12 08:04:09 +00:00
Marko Mikulicic 6eaa971a52
chore: Allow running all-in-one with external object store (#5600)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-09-10 12:03:54 +00:00
dependabot[bot] 786ce75e26
chore(deps): Bump tokio-util from 0.7.3 to 0.7.4 (#5596)
Bumps [tokio-util](https://github.com/tokio-rs/tokio) from 0.7.3 to 0.7.4.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-util-0.7.3...tokio-util-0.7.4)

---
updated-dependencies:
- dependency-name: tokio-util
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-09-09 07:40:16 +00:00
Marco Neumann 267a53a9e8
chore: update `tracing-subscriber`, fix trogging, fix CLI test port allocation (#5581)
* test: use dedicated ports for CLI tests

* chore: update `tracing-subscriber`

* fix: work around tracing-subscriber weirdness

It seems that trogging with tracing-subscriber >= 0.3.14 does not
produce any output at all. I suspect we are hitting
<https://github.com/tokio-rs/tracing/issues/2265>. Let's change the
construct to not use multiple optional layers but a single dyn-dispatch
layer. Logging shouldn't have such a high throughput that this makes any
difference, esp. because the dyn-dispatch happens AFTER the filter.
2022-09-08 09:37:37 +00:00
YIXIAO SHI 52ae60bf2e
chore: fix comment typo (#5551)
Co-authored-by: Dom <dom@itsallbroken.com>
2022-09-07 08:49:29 +00:00
Luke Bond a280acb860
Merge branch 'main' into alamb/guilio-python-main 2022-09-06 16:57:00 +01:00
Marco Neumann adeacf416c
ci: fix (#5569)
* ci: use same feature set in `build_dev` and `build_release`

* ci: also enable unstable tokio for `build_dev`

* chore: update tokio to 1.21 (to fix console-subscriber 0.1.8)

* fix: "must use"
2022-09-06 14:13:28 +00:00
dependabot[bot] b494c73cb3
chore(deps): Bump console-subscriber from 0.1.7 to 0.1.8 (#5558)
Bumps [console-subscriber](https://github.com/tokio-rs/console) from 0.1.7 to 0.1.8.
- [Release notes](https://github.com/tokio-rs/console/releases)
- [Commits](https://github.com/tokio-rs/console/compare/tokio-console-v0.1.7...console-subscriber-v0.1.8)

---
updated-dependencies:
- dependency-name: console-subscriber
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-09-06 12:46:07 +00:00
dependabot[bot] 9f0b0328f7
chore(deps): Bump thiserror from 1.0.33 to 1.0.34 (#5556)
Bumps [thiserror](https://github.com/dtolnay/thiserror) from 1.0.33 to 1.0.34.
- [Release notes](https://github.com/dtolnay/thiserror/releases)
- [Commits](https://github.com/dtolnay/thiserror/compare/1.0.33...1.0.34)

---
updated-dependencies:
- dependency-name: thiserror
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-09-06 09:18:41 +00:00
dependabot[bot] 366c4d9965
chore(deps): Bump once_cell from 1.13.1 to 1.14.0 (#5555)
Bumps [once_cell](https://github.com/matklad/once_cell) from 1.13.1 to 1.14.0.
- [Release notes](https://github.com/matklad/once_cell/releases)
- [Changelog](https://github.com/matklad/once_cell/blob/master/CHANGELOG.md)
- [Commits](https://github.com/matklad/once_cell/compare/v1.13.1...v1.14.0)

---
updated-dependencies:
- dependency-name: once_cell
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-09-06 09:02:28 +00:00
Juul Christiaens 8b419ecd84 refactor: changed iox_shared to iox-shared
Changed iox_shared to iox-shared in the following files: update_catalog.rs, partition.rs, lib.rs (in the service_grpc_catalog folder) and lib.rs (in the service_grpc_object_store folder).
2022-09-04 07:59:07 -04:00
Marco Neumann 92656edddd
chore: upgrade jemalloc to version 5.3.0 (#5542)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-09-02 10:45:13 +00:00
dependabot[bot] 00ed79ff1b
chore(deps): Bump thiserror from 1.0.32 to 1.0.33 (#5524)
Bumps [thiserror](https://github.com/dtolnay/thiserror) from 1.0.32 to 1.0.33.
- [Release notes](https://github.com/dtolnay/thiserror/releases)
- [Commits](https://github.com/dtolnay/thiserror/compare/1.0.32...1.0.33)

---
updated-dependencies:
- dependency-name: thiserror
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-09-01 09:11:31 +00:00
Dom ed2490deb2
Merge branch 'main' into dom/ingester-row-limit 2022-08-31 14:56:42 +01:00
Dom Dwyer 2a19606456 feat(ingester): restrict partition row count
This limit restricts a single partition to containing at most N rows
before it is marked for persistence (note: being marked for persistence
does not currently prevent further ingest for that partition.)
2022-08-31 15:48:18 +02:00
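The behaviour described in the commit above can be illustrated with a minimal sketch. This is not the ingester's actual code: `PartitionBuffer`, its fields and methods are hypothetical names, and marking the partition once the row count exceeds the limit (rather than on reaching it) is an assumption. The sketch shows only the two stated properties: a partition is marked for persistence by a row limit, and marking does not block further ingest.

```rust
/// Hypothetical in-memory buffer for one partition (illustrative only).
#[derive(Debug, Default)]
pub struct PartitionBuffer {
    rows: usize,
    marked_for_persist: bool,
}

impl PartitionBuffer {
    /// Buffers `n` more rows. If the total then exceeds `max_rows`, the
    /// partition is marked for persistence. Assumption: the comparison is
    /// "greater than"; the commit message does not say.
    /// Marking does not reject this write or any later one.
    pub fn ingest(&mut self, n: usize, max_rows: usize) {
        self.rows += n;
        if self.rows > max_rows {
            self.marked_for_persist = true;
        }
    }

    /// Number of rows currently buffered.
    pub fn rows(&self) -> usize {
        self.rows
    }

    /// Whether the row limit has been crossed.
    pub fn is_marked_for_persist(&self) -> bool {
        self.marked_for_persist
    }
}
```

With a limit of 5, two calls to `ingest(3, 5)` leave the buffer marked, and a later `ingest(10, 5)` is still accepted and counted.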
Andrew Lamb 6669d85fb4
chore: Update datafusion + arrow/parquet to `21.0.0` (#5519)
* chore: Update arrow/arrow-flight/parquet to 21.0.0

* chore: Update datafusion pin

* chore: Fix arrow update script

* chore: Update Cargo.lock

* chore: Update for new API
2022-08-31 13:30:47 +00:00
Nga Tran cb10a7c6d8
feat: More accurate memory estimate for compaction (#5471)
* feat: initial implementation of memory estimation for a compaction

* feat: estimate size of files and have the right actions for the needed budget

* feat: run candidates in parallel

* fix: have the right name for the column field of the output struct

* feat: add metrics for estimated budgets

* chore: cleanup

* chore: Apply suggestions from code review

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

* fix: fix syntax after applying review's suggestions

* refactor: Convert a Vec to VecDeque to go well with pop and push

* chore: remove max_concurrent_size_bytes and input_size_threshold_bytes

* chore: remove input_file_count_threshold

* test: tests for estimate_arrow_bytes_for_file

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-30 13:44:44 +00:00
Carol (Nichols || Goulding) dbd27f648f
refactor: Rename more mentions of Kafka to their other name where appropriate 2022-08-29 14:27:02 -04:00
Carol (Nichols || Goulding) 1b49ad25f7
refactor: Rename KafkaTopicId to TopicId 2022-08-29 14:27:02 -04:00
Carol (Nichols || Goulding) 58f0b63cdc
refactor: Rename KafkaTopic to Topic or TopicMetadata or topic name as appropriate 2022-08-29 14:27:02 -04:00
Carol (Nichols || Goulding) 3aa3ae2ba5
docs: Add more comments about why to use ShardIndex or ShardId 2022-08-29 14:07:20 -04:00
Carol (Nichols || Goulding) 74c9529062
fix: Rename KafkaPartition to ShardIndex 2022-08-29 14:07:18 -04:00
Carol (Nichols || Goulding) 6443858870
fix: Rename compactor option from sequencer to shard 2022-08-29 14:06:45 -04:00
Carol (Nichols || Goulding) 698f1a47ff
refactor: Rename test structures from sequencer to shard where appropriate 2022-08-29 14:06:44 -04:00
Jake Goulding 4abf21c724
refactor: Rename Sequencer (and its entourage) to Shard 2022-08-29 14:06:43 -04:00
Marco Neumann 8bc7606cb5
refactor: provide process-wide static strings (version, UUID) (#5487)
We currently only use the human-readable version string for the CLI
help, but for #5464 I want to use the GIT hash and a process-time UUID.
This is the prep work for that.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-29 12:25:19 +00:00
Luke Bond 3950ca3a17
feat: upsert partition & update sort key for each day in bulk ingest (#5447)
* feat: upsert partition & update sort key for each day in bulk ingest

feat: import schema now supports earliest/latest time merging
chore: tests & tidying up for bulk ingest catalog update

* fix: always sort time last in PK in import schema update catalog

* chore: additional test for computing sort key in bulk ingest

* chore: bulk import catalog update gets sequencer from sharder service

chore: import update schema tests refactor using sharder svc mock

* chore: dead code fix

* chore: import schema sequencer lookup test

* chore: clarifying comment in import schema catalog update
2022-08-25 10:47:12 +00:00
Marko Mikulicic 99daa13897
test: Test dotenvy regression (#5461) 2022-08-24 09:39:55 +00:00
Marko Mikulicic 4beb721a9a
fix: Revert Bump dotenvy from 0.15.1 to 0.15.2 (#5450) (#5455)
This reverts commit 84acbd2fad.

Closes #5454

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-24 09:10:09 +00:00
Nga Tran 3220c6f88b
feat: add file_count_threshold for compacting cold partitions (#5456)
* feat: add file_count_threshold for compacting cold partitions to make it consistent with the hot case and make it easier to avoid OOMs

* chore: remove unnecessary comments
2022-08-23 20:12:21 +00:00
dependabot[bot] 84acbd2fad
chore(deps): Bump dotenvy from 0.15.1 to 0.15.2 (#5450)
Bumps [dotenvy](https://github.com/allan2/dotenvy) from 0.15.1 to 0.15.2.
- [Release notes](https://github.com/allan2/dotenvy/releases)
- [Changelog](https://github.com/allan2/dotenvy/blob/master/CHANGELOG.md)
- [Commits](https://github.com/allan2/dotenvy/commits/v0.15.2)

---
updated-dependencies:
- dependency-name: dotenvy
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-08-23 11:24:42 +00:00
Luke Bond f4443f0b3a
feat: import schema override (#5420)
* chore: struct for overrides of import schema conflicts

* chore: import schema override shouldn't support tags

* feat: import schema merge can take an override schema

* fix: schema override in test had superfluous tag

* chore: test for batch schema merge with override in import schema

* feat: import schema merge now takes override schema
2022-08-17 14:59:50 +00:00
Andrew Lamb 7f0ae53d6f
chore: Update to (almost) released object_store 0.4.0 (#5419)
* chore: update object_store

* chore: update hakari config

* chore: Run cargo hakari tasks

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2022-08-17 13:44:48 +00:00
dependabot[bot] 78665d3092
chore(deps): Bump once_cell from 1.13.0 to 1.13.1 (#5413)
Bumps [once_cell](https://github.com/matklad/once_cell) from 1.13.0 to 1.13.1.
- [Release notes](https://github.com/matklad/once_cell/releases)
- [Changelog](https://github.com/matklad/once_cell/blob/master/CHANGELOG.md)
- [Commits](https://github.com/matklad/once_cell/compare/v1.13.0...v1.13.1)

---
updated-dependencies:
- dependency-name: once_cell
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-17 08:31:46 +00:00
Luke Bond 10fee5535a
feat: import schema updates iox catalog (#5385)
* feat: import schema updates iox catalog

- renamed import/schema module to aggregate_tsm_schema to not conflict
  with schema crate
- fetch schema from iox catalog, and validate/merge/create as needed

chore: add catalog dsn config to import schema command
chore: import schema command connects to catalog
chore: import schema merge validation errors return non-zero code
chore: simplified and tidied import update catalog code

chore: tests and refactoring of import schema catalog update

* chore: require retention on ns creation in import

* chore: fixed bad test in import schema validation

* chore: friendlier errors & more tests in import schema catalog update
2022-08-16 11:05:27 +00:00
Carol (Nichols || Goulding) 549a267e3c
fix: Use Self instead of unnecessary structure name repetition
As now caught by clippy. https://rust-lang.github.io/rust-clippy/master/index.html#use_self
2022-08-11 15:21:02 -04:00
Carol (Nichols || Goulding) b982bdaf2f
fix: Derive Eq when we derive PartialEq and members can derive Eq
Allow this in generated code that we don't control, though.

Recommended by clippy now. https://rust-lang.github.io/rust-clippy/master/index.html#derive_partial_eq_without_eq
2022-08-11 15:04:06 -04:00
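As a rough illustration of the clippy recommendation in the commit above, the sketch below derives `Eq` wherever every member supports it and suppresses the lint for a stand-in "generated" module. All type and module names are hypothetical and are not taken from the repository.

```rust
use std::collections::HashSet;

/// All members are `Eq`, so `Eq` is derived alongside `PartialEq`.
/// `Eq` (plus `Hash`) is what allows the type to be stored in a `HashSet`.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct ColumnName(pub String);

/// `f64` implements only `PartialEq`, so `Eq` cannot be derived here
/// and the lint does not apply.
#[derive(Debug, Clone, PartialEq)]
pub struct Measurement {
    pub value: f64,
}

/// Stand-in for generated code that is not under our control:
/// the lint is allowed for the whole module instead of editing the derives.
#[allow(clippy::derive_partial_eq_without_eq)]
pub mod generated {
    #[derive(Debug, Clone, PartialEq)]
    pub struct Message {
        pub id: u32,
    }
}

/// Counts distinct names; compiles only because `ColumnName: Eq + Hash`.
pub fn distinct(names: Vec<ColumnName>) -> usize {
    names.into_iter().collect::<HashSet<_>>().len()
}
```

Passing two equal names and one different name to `distinct` yields 2, which relies on the derived `Eq` and `Hash` implementations.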
dependabot[bot] ae6ac27960
chore(deps): Bump console-subscriber from 0.1.6 to 0.1.7 (#5374)
Bumps [console-subscriber](https://github.com/tokio-rs/console) from 0.1.6 to 0.1.7.
- [Release notes](https://github.com/tokio-rs/console/releases)
- [Commits](https://github.com/tokio-rs/console/compare/console-subscriber-v0.1.6...tokio-console-v0.1.7)

---
updated-dependencies:
- dependency-name: console-subscriber
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-11 11:08:10 +00:00
Carol (Nichols || Goulding) 9321c96aaf
test: Add a step for running compaction in e2e tests 2022-08-10 11:30:22 -04:00
Jake Goulding 3915841a53
feat: Introduce a separate config for the compactor command 2022-08-10 11:30:21 -04:00
Jake Goulding 7787c51b57
feat: add new CLI command to run the compactor once 2022-08-10 11:28:51 -04:00
Carol (Nichols || Goulding) 463a13b814
test: Remove the compactor from the test MiniCluster 2022-08-10 11:28:51 -04:00
Carol (Nichols || Goulding) 45f8e567ed
fix: Revert adjustment of e2e test to expect compaction
This was part of
"feat: Different branch to hook up new compaction algorithm (#5194)"

and will be added back in a new test for compaction specifically in the
next commit.

This reverts part of commit 69640c0ba5.
2022-08-10 11:28:50 -04:00
Luke Bond 7e9918f067
chore: import validate merged schema (#5367)
* feat: import schema merge now outputs validation results

chore: refactor import crate

chore: renamed some structs for clarity in import crate

* chore: tests for import schema merge validation

* chore: Run cargo hakari tasks

* chore: clippy

* chore: make hashmap loop easier to read in import schema validation

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-10 12:15:37 +00:00
Luke Bond c5f062bba0
feat: initial commit of schema merge bulk import tool (#5344)
* feat: initial commit of schema merge bulk import tool

* chore: use observability deps instead of tracing-*

* chore: removed debug printlns

* chore: fix feature decls for cloud providers for import crate

* chore: use println instead of info in import- no need for a simple CLI

* chore: tidy whitespace

* chore: remove unused dep in import

* chore: Run cargo hakari tasks

* chore: removed unimpld import job subcommand

* chore: clarifying comment about custom serialisation code

* chore: clarifying comment about schema merge code in import

* chore: fix wrong comment in import command

* chore: bump object store dep to get bugfix

* chore: rename import schema struct for clarity

* chore: run `cargo hakari generate`

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-10 09:07:38 +00:00
Andrew Lamb 16ddc5efc6
chore: Update datafusion / arrow/parquet/arrow-flight and prost/tonic ecosystem (#5360)
* chore: Update datafusion and arrow

* chore: Update Cargo.lock

* chore: update to Decimal128

* chore: Update tonic/prost/pbjson/etc

* chore: Run cargo hakari tasks

* fix: doctest in generated types

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2022-08-09 17:30:44 +00:00
Carol (Nichols || Goulding) da0b031c44
feat: Add parameters to limit total memory usage of cold partition compaction 2022-08-04 16:55:48 -04:00
Carol (Nichols || Goulding) d55f45a5c2
feat: Run compaction of hot partitions a configurable number of times more than cold 2022-08-04 16:55:48 -04:00
dependabot[bot] e8231b2986
chore(deps): Bump serde_json from 1.0.82 to 1.0.83 (#5297)
* chore(deps): Bump serde_json from 1.0.82 to 1.0.83

Bumps [serde_json](https://github.com/serde-rs/json) from 1.0.82 to 1.0.83.
- [Release notes](https://github.com/serde-rs/json/releases)
- [Commits](https://github.com/serde-rs/json/compare/v1.0.82...v1.0.83)

---
updated-dependencies:
- dependency-name: serde_json
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore: Run cargo hakari tasks

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-04 14:28:29 +00:00
kodiakhq[bot] b3958321d3
Merge branch 'main' into crepererum/issue5272 2022-08-03 21:06:15 +00:00
dependabot[bot] 55e1e2ec2b
chore(deps): Bump thiserror from 1.0.31 to 1.0.32 (#5294)
* chore(deps): Bump thiserror from 1.0.31 to 1.0.32

Bumps [thiserror](https://github.com/dtolnay/thiserror) from 1.0.31 to 1.0.32.
- [Release notes](https://github.com/dtolnay/thiserror/releases)
- [Commits](https://github.com/dtolnay/thiserror/compare/1.0.31...1.0.32)

---
updated-dependencies:
- dependency-name: thiserror
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore: Run cargo hakari tasks

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-03 16:20:36 +00:00
Marco Neumann 4bd8977d55 refactor: add some main function debug logs 2022-08-03 18:00:28 +02:00
Marco Neumann 840e4801b8
feat: make querier RAM pool split a proper feature (#5283)
* feat: make querier RAM pool split a proper feature

- use proper pool names
- expose sizing via CLI/env

Closes https://github.com/influxdata/conductor/issues/1102.

* refactor: improve naming and docs

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-03 15:27:23 +00:00
Marco Neumann 663a20d743
refactor: remove `--ingester-address` (#5255)
Closes #5002.
2022-08-03 15:05:01 +00:00
Marco Neumann 273b3cc165
chore: replace `dotenv` with `dotenvy` (#5285)
The latter one is a maintained fork. This avoids having both crates
after #5282.
2022-08-03 12:41:38 +00:00
Andrew Lamb 9215a534d0
chore: Update datafusion and `arrow`/`parquet`/`arrow-flight` to `19.0.0` (#5229)
* chore: Update datafusion and `arrow`/`parquet`/`arrow-flight` to `19.0.0`

* chore: Run cargo hakari tasks

* fix: Update for API changes

* fix: clippy

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-07-28 08:10:47 +00:00
Marco Neumann 9a9a1a4777
feat: limit per-table chunk data for every query (#5223)
* feat: `QueryChunk::as_any`

* feat: allow `ChunkPruner::prune_chunks` to fail

* feat: limit per-table chunk data for every query

Closes #5211.

* fix: address review comments

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2022-07-27 13:20:05 +00:00
Andrew Lamb e4dc8c2067
refactor: rename garbage collector crates for consistency (#5196)
* refactor: rename garbage collector crates for consistency

* fix: cargo fmt

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-07-25 12:44:37 +00:00
Nga Tran 69640c0ba5
feat: Different branch to hook up new compaction algorithm (#5194)
* chore: cherry pick the first 3 commits of branch cn/connect-new-compaction

* fix: modify the test to work correctly with compactor running

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-07-22 19:29:47 +00:00
Marko Mikulicic 5a0af921c8
chore: Roll forward: Sync ReadWindowAggregate API: TagKeyMetaNames (#5186)
This reverts commit 5d02c755687ef041f5f45dbfc3e633a833284edb.
2022-07-22 10:44:06 +00:00
Marko Mikulicic 07cdb99192
chore: Revert "Sync ReadWindowAggregate API: TagKeyMetaNames" (#5184)
We're noticing a possible regression (OOMs) in our testing cluster that roughly correlates with this.
2022-07-22 09:26:42 +00:00
Nga Tran 69cb3f2b19
refactor: remove min_sequence_number from Compactor and Querier, add `count_by_overlaps_with_level_0` and `count_by_overlaps_with_level_1` to catalog (#5151)
* refactor: remove min_sequence_number

* fix: typos

* fix: remove min_sequence_number from new files from merging main

* fix: add back throwing error if the compactor compacts files persisted by the ingester after the ingester sends max seq_num back to querier

* test: add test_compactor_collision back but modify the input to make it work with new changes

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-07-21 13:51:54 +00:00
Marko Mikulicic 21d033eafd fix: Sync ReadWindowAggregate API: TagKeyMetaNames
The storage API has been updated in https://github.com/influxdata/idpe/pull/12868
in January, but since we forked the `.proto` files we never noticed.
2022-07-21 15:07:04 +02:00
Andrew Lamb efe39033b3
fix: tab complete from hints in `sql` repl (itch scratch) (#5155)
* fix: tab complete from hints

* fix: remove default impl

* docs: Update influxdb_iox/src/commands/sql/repl.rs

* fix: fmt

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-07-20 11:41:01 +00:00
Marko Mikulicic c20288f60e
fix: Add TagKeyMetaNamesCapability capability (#5160)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-07-20 10:52:40 +00:00
dependabot[bot] 278a7f91af
chore(deps): Bump bytes from 1.1.0 to 1.2.0 (#5156)
Bumps [bytes](https://github.com/tokio-rs/bytes) from 1.1.0 to 1.2.0.
- [Release notes](https://github.com/tokio-rs/bytes/releases)
- [Changelog](https://github.com/tokio-rs/bytes/blob/master/CHANGELOG.md)
- [Commits](https://github.com/tokio-rs/bytes/compare/v1.1.0...v1.2.0)

---
updated-dependencies:
- dependency-name: bytes
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-07-20 10:00:08 +00:00
Marko Mikulicic b8236e2b9d
fix: Fix SeriesKey sort order for special _measurement and _field (#5150)
* fix: Fix SeriesKey sort order for special _measurement and _field

* fix: Update expected test output

* fix: Update more tests

* fix: Re-sort tag key when using binary encoding

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2022-07-20 08:45:17 +00:00
Jake Goulding f7a0fd43d2
feat: make object store garbage collector into a long-running service (#5135)
* refactor: remove unused logging config

* chore: remove the object store garbage collector CLI tool

* refactor: accept an object store and catalog

* refactor: make Result type alias public like the error

* refactor: remove public modifier from modules

* refactor: allow shutting down the object store garbage collector

* feat: Introduce the object-store garbage collection server

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-07-18 21:27:38 +00:00
Carol (Nichols || Goulding) 07e10852a8
feat: Add an input file count threshold to the compactor settings 2022-07-18 15:41:17 -04:00
Carol (Nichols || Goulding) 128833e7d9
fix: Change placeholder new_param to input_size_threshold_bytes 2022-07-18 15:16:43 -04:00
Carol (Nichols || Goulding) d62b1ed7ee
feat: Select a subset of parquet files for a partition to compact
Fixes #5120.
2022-07-18 15:14:22 -04:00
Carol (Nichols || Goulding) 4416f1ce37
fix: Remove max number of level 0 files configuration option 2022-07-18 15:08:16 -04:00
Carol (Nichols || Goulding) 57c70fcec5
fix: Remove redundant 'compaction' naming from CompactorConfig fields 2022-07-18 15:03:33 -04:00
Nga Tran c8f4000f04
feat: Select compaction candidates (#5131)
* feat: initial implementation for selecting compaction candidates

* feat: 2 catalog functions to choose the highest-throughput partitions to compact, plus the candidate selection function itself

* test: tests for the new 2 queries

* feat: more tests and metrics for choosing compaction candidates

* chore: Apply suggestions from self review

* chore: cleanup

* chore: fix doc comment

* chore: Apply suggestions from code review

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

* refactor: address review comments

* fix: get the right time provider for the tests

* refactor: remove the left over compaction_

* fix: typos

* fix: make the param name and env name consistent

* refactor: make relevant iSomething to uSomething

* fix: typo

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
2022-07-18 18:05:13 +00:00
Andrew Lamb e2d871b00b
chore: Update datafusion and arrow/parquet/arrow-flight to `18.0.0` (#5079)
* chore: Update datafusion to 10.0.0, arrow/parquet/arrow-flight to 18

* chore: Run cargo hakari tasks

* fix: update cargo pin

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-07-18 15:01:03 +00:00
dependabot[bot] 1eeee9809c
chore(deps): Bump rustyline from 9.1.2 to 10.0.0 (#5139)
Bumps [rustyline](https://github.com/kkawakam/rustyline) from 9.1.2 to 10.0.0.
- [Release notes](https://github.com/kkawakam/rustyline/releases)
- [Changelog](https://github.com/kkawakam/rustyline/blob/master/History.md)
- [Commits](https://github.com/kkawakam/rustyline/compare/v9.1.2...v10.0.0)

---
updated-dependencies:
- dependency-name: rustyline
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-07-18 08:10:22 +00:00
dependabot[bot] 9b67de2f43
chore(deps): Bump tokio from 1.19.2 to 1.20.0
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.19.2 to 1.20.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.19.2...tokio-1.20.0)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-07-14 01:21:43 +00:00
Carol (Nichols || Goulding) 61c023139b
refactor: Switch compaction levels to an enum with values rather than separate consts
Bonuses:

- Type checking
- Validation
- Less casting
- Exhaustiveness checking
- Less use of the numerical value
2022-07-13 11:30:36 -04:00
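The bonuses listed in the commit above can be seen in a small sketch. This is an illustration under assumptions, not the repository's definition: the variant names, the `i16` representation, the `String` error type and the `describe` helper are all hypothetical.

```rust
use std::convert::TryFrom;

/// Hypothetical compaction levels carrying explicit numeric values.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
#[repr(i16)]
pub enum CompactionLevel {
    /// Freshly persisted file that may overlap with others (assumed meaning).
    Initial = 0,
    /// File that has already been compacted (assumed meaning).
    NonOverlapping = 1,
}

/// Validation: a raw value (for example one read from storage) converts
/// only if it names a known level.
impl TryFrom<i16> for CompactionLevel {
    type Error = String;

    fn try_from(value: i16) -> Result<Self, Self::Error> {
        match value {
            0 => Ok(CompactionLevel::Initial),
            1 => Ok(CompactionLevel::NonOverlapping),
            other => Err(format!("invalid compaction level: {}", other)),
        }
    }
}

/// Exhaustiveness checking: adding a variant makes this `match` fail to
/// compile until the new case is handled.
pub fn describe(level: CompactionLevel) -> &'static str {
    match level {
        CompactionLevel::Initial => "initial",
        CompactionLevel::NonOverlapping => "non-overlapping",
    }
}
```

The numeric value is needed only at the boundary, for example `CompactionLevel::Initial as i16` when writing and `CompactionLevel::try_from(raw)` when reading. Everywhere else the typed enum is passed around, which is where the type checking and reduced casting come from.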
Nga Tran 5c5c964dfe
feat: config params for Compactor (#5108)
* feat: config params for Compactor

* refactor: address review comments

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-07-13 13:50:07 +00:00
Andrew Lamb c46e1c6347
chore: Update datafusion + arrow/parquet/arrow-flight to `17.0.0` (#5021)
* fix: correct nullability declaration of system tables

* chore: Update datafusion and arrow/parquet/arrow-flight

* chore: Run cargo hakari tasks

* fix: Update tests

* fix: Update tests

* fix: predicate pruning

* fix: add some tests

* fix: query_functions

* fix: fix read_buffer test

* fix: fix clippy

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-07-07 19:22:15 +00:00
Carol (Nichols || Goulding) 6b4325642f
refactor: Rename inner_main to main as it's now public 2022-07-07 09:48:06 -04:00
Carol (Nichols || Goulding) 9a681e75cc
fix: Box a large enum variant 2022-07-07 09:48:06 -04:00
Carol (Nichols || Goulding) 70dd6009e8
fix: Integrate the object store garbage collector into the main binary 2022-07-07 09:48:06 -04:00
Carol (Nichols || Goulding) 89f5091546
refactor: Don't require DSN for test config
This enables writing a test for all-in-one's ephemeral mode, which
currently isn't working
2022-07-06 11:00:29 -04:00
Nga Tran 1de022136c
feat: add max desired file size config param (#5025)
* feat: add max desired file size config param

* fix: comment typos

* chore: Apply suggestions from code review

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* chore: Apply suggestions from code review

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2022-07-05 15:32:45 +00:00
dependabot[bot] 68eff79594
chore(deps): Bump once_cell from 1.12.0 to 1.13.0 (#5033)
Bumps [once_cell](https://github.com/matklad/once_cell) from 1.12.0 to 1.13.0.
- [Release notes](https://github.com/matklad/once_cell/releases)
- [Changelog](https://github.com/matklad/once_cell/blob/master/CHANGELOG.md)
- [Commits](https://github.com/matklad/once_cell/compare/v1.12.0...v1.13.0)

---
updated-dependencies:
- dependency-name: once_cell
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-07-05 08:54:51 +00:00
Marco Neumann be53716e4d
refactor: use IDs for `parquet_file.column_set` (#4965)
* feat: `ColumnRepo::list_by_table_id`

* refactor: use IDs for `parquet_file.column_set`

Closes #4959.

* refactor: introduce `TableSchema::column_id_map`
2022-06-30 15:08:41 +00:00
Sam Arnold 9438570ba1
test: document how to run tests (#4982)
* test: document how to run tests

Also fix a few issues for local runs.

* docs: add back one-liner for running end to end tests

* docs: add comment for clap_blocks test

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* docs: add comment in influxdb_iox/tests/end_to_end_cases/cli.rs

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2022-06-30 14:01:35 +00:00
Carol (Nichols || Goulding) 3049479b78
feat: Implement new querier to ingester config design 2022-06-30 08:26:50 -04:00
Carol (Nichols || Goulding) 59da2dccb8
feat: Assert if no ingester addresses are found
Temporarily support `--ingester-addresses` (and always return all
ingesters) so that this PR can be deployed during the switchover.
2022-06-30 08:22:47 -04:00
Carol (Nichols || Goulding) 0e450deca8
feat: Support a sequencer being mapped to multiple ingesters 2022-06-30 08:22:47 -04:00
Carol (Nichols || Goulding) 7965bda42f
fix: Accept JSON ingester/shard config as CLI param value or env var value 2022-06-30 08:22:47 -04:00
Carol (Nichols || Goulding) 4e91121e29
feat: Allow specification of sequencer to ingester mappings in a JSON file 2022-06-30 08:22:46 -04:00
dependabot[bot] 40a8525520
chore(deps): Bump serde_json from 1.0.81 to 1.0.82 (#4992)
Bumps [serde_json](https://github.com/serde-rs/json) from 1.0.81 to 1.0.82.
- [Release notes](https://github.com/serde-rs/json/releases)
- [Commits](https://github.com/serde-rs/json/compare/v1.0.81...v1.0.82)

---
updated-dependencies:
- dependency-name: serde_json
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-06-30 09:54:08 +00:00
Raphael Taylor-Davies 835e1c91c7
chore: update object_store to 0.3.0 (#4707)
* chore: update object_store to 0.3.0

* chore: review feedback

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-06-29 21:44:03 +00:00
Nga Tran cfcc4b8426
refactor: change level 1 to level 2 preparing for next design changes (#4954)
* refactor: change level 1 to level 2 preparing for next design changes

* fix: make level-2 consistent everywhere

* chore: remove unused comments

* refactor: change all the names level_1 to level_2 to completely replace 1 with 2 and make everything consistent

* chore: add corresponding constants for the compaction levels in the comments

Co-authored-by: Dom <dom@itsallbroken.com>
2022-06-29 14:08:58 +00:00
Andrew Lamb bfddb032ce
docs: improve docs for `persist_partition_size_threshold_bytes` / `INFLUXDB_IOX_PERSIST_PARTITION_SIZE_THRESHOLD_BYTES` (#4877)
* docs: improve docs for `persist_partition_size_threshold_bytes` / `INFLUXDB_IOX_PERSIST_PARTITION_SIZE_THRESHOLD_BYTES`

* docs: improve comments about LifecycleConfig::partition_size_threshold

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-06-27 21:52:40 +00:00
kodiakhq[bot] c22aed4347
Merge branch 'main' into dom/schema-api 2022-06-27 21:34:07 +00:00
Marco Neumann 215f297162
refactor: parquet file metadata from catalog (#4949)
* refactor: remove `ParquetFileWithMetadata`

* refactor: remove `ParquetFileRepo::parquet_metadata`

* refactor: parquet file metadata from catalog

Closes #4124.
2022-06-27 15:38:39 +00:00
Dom Dwyer 75c425f375 refactor(schema-api): column data type enum
Previously the column data type was exposed using an internal i32 value.
This commit changes the Schema API to use a self-descriptive proto enum
for the column data type.
2022-06-27 16:14:49 +01:00
Andrew Lamb 087dbd3eca
fix: fix heappy + update docs (#4917)
* docs: Update heap profiling documentation

* fix: fix heappy builds

* fix: do not run cli tests with heappy

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-06-21 19:53:28 +00:00
Marco Neumann c3912e34e9
refactor: store per-file column set in catalog (#4908)
* refactor: store per-file column set in catalog

Together with the table-wide schema and the partition-wide sort key, this should
be everything we need to read a parquet file directly into memory
without peeking any file-level metadata.

The querier will use this to directly load parquet files into the read
buffer.

**WARNING: This requires a catalog wipe!**

Ref #4124.

* refactor: use proper `ColumnSet` type
2022-06-21 10:26:12 +00:00
Marco Neumann 0fbff981ec
chore(deps): Bump sqlx to 0.6.0 and uuid to 1 (#4894)
Closes #4889.
Closes #4890.

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-06-17 10:28:28 +00:00
Marco Neumann 66c7d95312
refactor: use new ingester<>querier wire protocol (#4867)
* refactor: use new ingester<>querier wire protocol

Use and document the new and more flexible ingester<>querier wire
protocol.

Note that the ingester does NOT stream the response data yet, but the
internal data structures would allow that. A follow-up change will
adjust the ingester code to stream the data.

Ref #4849.

* fix: typos

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* refactor: clarify naming and public interface

* test: add schema assertion to `ingester_response_to_record_batches`

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2022-06-16 08:02:28 +00:00
Carol (Nichols || Goulding) e9cdaffe74
fix: Create querier sharder from catalog sequencer info
Panic if there are no sharders in the catalog.
2022-06-15 10:18:54 -04:00
Carol (Nichols || Goulding) 874ef89daa
feat: Make specifying the write buffer, and thus getting a sharder, optional in querier 2022-06-15 10:01:45 -04:00
Carol (Nichols || Goulding) 127467b5c4
feat: Create a sharder in the querier 2022-06-15 10:01:45 -04:00
Carol (Nichols || Goulding) 148bc57e7b
refactor: Make the querier server constructor more like other server constructors 2022-06-15 10:01:45 -04:00
Marco Neumann 3bd24b67ba
feat: extend flight client to accept multiple (changing) schemas (#4853)
* feat: extend flight client to accept multiple (changing) schemas

See #4849.

Originally I intended not to use Flight at all for the new
ingester<>querier protocol. However since flight also deals with
dictionary batches and multiple batches and the gRPC protocol that I
would write would look very similar, I will use Flight with a bit more
flexible message types.

The rough idea for the protocol is the following stream:

- for each partition:
  1. "none" message with partition metadata
  2. for each chunk (can have different schemas under certain
     circumstances):
     1. "schema" message (resets dictionary state)
     2. (optional) dictionary batch messages
     3. one or more "record batch" message

The nice thing about it is that the same arrow client works also for the
existing client<>querier protocol since there we just send:

1. "schema" message (no app metadata)
2. (optional) dictionary batch messages
3. zero, one or more "record batch" message (no app metadata)
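
The ingester<>querier message ordering described above can be sketched as a small state machine. This is only an illustration of the ordering rules as written in this commit message, not the actual client code: the `Msg` enum, the `State` enum, and `is_valid_ingester_stream` are hypothetical names, and the sketch assumes that a partition may have zero chunks and that dictionary batches only appear directly after a "schema" message.

```rust
/// Hypothetical message kinds, named after the stream description above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Msg {
    /// The "none" message carrying partition metadata.
    PartitionMetadata,
    /// A "schema" message (resets dictionary state).
    Schema,
    /// An optional dictionary batch message.
    DictionaryBatch,
    /// A "record batch" message.
    RecordBatch,
}

/// Position within the stream (hypothetical, for illustration only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum State {
    /// Nothing received yet.
    Start,
    /// Partition metadata received, no chunk started.
    InPartition,
    /// Schema (and possibly dictionaries) received, no record batch yet.
    AfterSchema,
    /// At least one record batch received for the current chunk.
    InBatches,
}

/// Returns true if `msgs` follows the per-partition / per-chunk ordering.
pub fn is_valid_ingester_stream(msgs: &[Msg]) -> bool {
    let mut state = State::Start;
    for &msg in msgs {
        state = match (state, msg) {
            // A new partition may start at the beginning, after an empty
            // partition (assumption), or after a completed chunk.
            (State::Start | State::InPartition | State::InBatches, Msg::PartitionMetadata) => {
                State::InPartition
            }
            // A new chunk starts with a schema message.
            (State::InPartition | State::InBatches, Msg::Schema) => State::AfterSchema,
            // Dictionary batches may follow the schema.
            (State::AfterSchema, Msg::DictionaryBatch) => State::AfterSchema,
            // One or more record batches complete the chunk.
            (State::AfterSchema | State::InBatches, Msg::RecordBatch) => State::InBatches,
            _ => return false,
        };
    }
    // A chunk needs at least one record batch, so the stream must not end
    // directly after a schema or dictionary message.
    state != State::AfterSchema
}
```

The client<>querier stream would need a looser check, since it starts with a "schema" message and allows zero record batches.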

* refactor: separate high- and low-level flight client

It is very unlikely that a user will use the high-level batch-producing
functionality and the low-level stuff within the same session. So let's
split this into two clients (high-level uses the low-level one
internally) to avoid confusion.

Also add documentation on our protocol handling.

* refactor: enumerate all variants in match statement to better catch errors in the future
2022-06-15 11:38:08 +00:00
Andrew Lamb 005610b172
refactor: remove some `&` use in iox_catalog (#4862)
* refactor: remove some `&` use in iox_catalog

* fix: Update data_types/src/lib.rs
2022-06-15 11:31:49 +00:00
Andrew Lamb e91d00b10c
chore: Update datafusion + `arrow`/`parquet`/`arrow-flight` to `16.0.0` (#4851)
* chore: TEMP Update DataFusion to pre-release

* chore: update arrow et al to 16.0.0

* chore: Run cargo hakari tasks

* fix: update reader read_dictionary API

* chore: Update to real Datafusion release

* fix: Update parquet API

* fix: update test

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2022-06-14 16:31:40 +00:00
dependabot[bot] 23c9e38ea7
chore(deps): Bump clap from 3.1.18 to 3.2.1 (#4848)
* chore(deps): Bump clap from 3.1.18 to 3.2.1

Bumps [clap](https://github.com/clap-rs/clap) from 3.1.18 to 3.2.1.
- [Release notes](https://github.com/clap-rs/clap/releases)
- [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md)
- [Commits](https://github.com/clap-rs/clap/compare/v3.1.18...clap_complete-v3.2.1)

---
updated-dependencies:
- dependency-name: clap
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore: fix clap deprecations

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Marco Neumann <marco@crepererum.net>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-06-14 15:42:18 +00:00
Dom Dwyer b41ea1d718 refactor: PartitionKey type
This commit changes the code base to use a new reference-counted
PartitionKey type wrapper, instead of passing a bare String around.

This allows the compiler to type check & verify usage of the partition
key, instead of passing a bare string around. By reference counting the
underlying string, we reduce memory usage for some use cases.
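
A minimal sketch of what such a wrapper might look like, assuming an `Arc<str>` as the shared storage. The field layout, the `handle_count` helper, and the `describe` function are hypothetical and only illustrate the idea; the real type in the code base may differ.

```rust
use std::fmt;
use std::sync::Arc;

/// Hypothetical reference-counted partition key wrapper.
/// Cloning copies a pointer and bumps a counter instead of copying the string.
#[derive(Debug, Clone, PartialEq, Eq, Hash, PartialOrd, Ord)]
pub struct PartitionKey(Arc<str>);

impl PartitionKey {
    /// Borrow the underlying key as a string slice.
    pub fn as_str(&self) -> &str {
        &self.0
    }

    /// Number of handles sharing the same string (illustration only).
    pub fn handle_count(&self) -> usize {
        Arc::strong_count(&self.0)
    }
}

impl From<&str> for PartitionKey {
    fn from(s: &str) -> Self {
        Self(Arc::from(s))
    }
}

impl From<String> for PartitionKey {
    fn from(s: String) -> Self {
        Self(Arc::from(s))
    }
}

impl fmt::Display for PartitionKey {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(&self.0)
    }
}

/// Taking `&PartitionKey` instead of `&str` lets the compiler reject calls
/// that pass some other string (e.g. a table name) by mistake.
pub fn describe(table: &str, key: &PartitionKey) -> String {
    format!("{}/{}", table, key)
}
```

With this shape, passing a bare `String` where a `PartitionKey` is expected is a compile error, and every clone of a key shares one allocation.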
2022-06-14 14:47:56 +01:00
Andrew Lamb 9fdbfb05e7
refactor: Use scan_and_filter in ReorgPlanner (#4822)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-06-10 17:31:25 +00:00
Carol (Nichols || Goulding) 1c7cbaf5ae
refactor: Use DurationHistogram in more places 2022-06-09 14:20:51 -04:00
dependabot[bot] 3ecb1ee056
chore(deps): Bump http from 0.2.7 to 0.2.8 (#4796)
Bumps [http](https://github.com/hyperium/http) from 0.2.7 to 0.2.8.
- [Release notes](https://github.com/hyperium/http/releases)
- [Changelog](https://github.com/hyperium/http/blob/master/CHANGELOG.md)
- [Commits](https://github.com/hyperium/http/compare/v0.2.7...v0.2.8)

---
updated-dependencies:
- dependency-name: http
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-06-07 13:35:01 +00:00
dependabot[bot] 04c685b3b7
chore(deps): Bump tokio-util from 0.7.2 to 0.7.3 (#4784)
Bumps [tokio-util](https://github.com/tokio-rs/tokio) from 0.7.2 to 0.7.3.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-util-0.7.2...tokio-util-0.7.3)

---
updated-dependencies:
- dependency-name: tokio-util
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-06-06 14:46:27 +00:00
dependabot[bot] e03bf94420
chore(deps): Bump tokio from 1.18.2 to 1.19.1 (#4783)
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.18.2 to 1.19.1.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.18.2...tokio-1.19.1)

---
updated-dependencies:
- dependency-name: tokio
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-06-06 14:15:12 +00:00
Andrew Lamb 3592aa52d8
chore: Update datafusion + `arrow`/`parquet`/`arrow-flight` to `15.0.0` (#4743)
* chore: Update datafusion + `arrow`/`parquet`/`arrow-flight` to `15.0.0`

* chore: Update APIs

* chore: Run cargo hakari tasks

* feat: normalize parquet file metadata

* chore: update size tests

* chore: add docs on metadata stripping

* chore: TEMP UPDATE TO DF BRANCH

* chore: Update for new API

* fix: Update to latest DF

* fix: cargo hakari

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: Raphael Taylor-Davies <r.taylordavies@googlemail.com>
2022-06-03 10:32:26 +00:00
Andrew Lamb 1472ec272f
refactor: consolidate duplicate testing logic (#4708)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-06-01 20:02:13 +00:00
Dom Dwyer 1caeb04869 test(e2e): do not mangle prod database
Unset all the env vars for the following CLI e2e tests:

    * default_mode_is_run_all_in_one
    * default_run_mode_is_all_in_one

This prevents them from executing against the "prod" catalog, running
migrations and inserting values to the prod database specified in the
prod DSN env (INFLUXDB_IOX_CATALOG_DSN).
2022-06-01 17:12:12 +01:00
Dom Dwyer 60de97ac26 test(e2e): ensure "partition pull" writes files
Adds a test case covering the "remote partition pull" command configured
with file-based object storage.
2022-06-01 16:41:57 +01:00
Dom Dwyer 6d647fb7a9 refactor: warn for silly object store configs
Warn when downloading files to an in-memory object store.

The "remote partition pull" command downloads parquet files from an
object store via a router, and saves them locally. It's pretty unlikely
the user intends to download those files to memory of the CLI process
which then exits when the pull is complete, throwing away the downloaded
files, but this is the default.
2022-06-01 16:41:57 +01:00
Marco Neumann ebeccf037c
feat: limit querier concurrency by limiting number of active namespaces (#4752)
This is a rather quick fix for prod. On the mid-term we probably wanna
rethink our deployment strategy, e.g. by using "one query per pod" and
by deploying queryd w/ IOx into the same pod.
2022-06-01 11:59:35 +00:00
Paul Dix 6af32b7750
feat: add concurrency limit for ingester queries (#4703)
I've defaulted it to 20, we can adjust as needed.

Closes #4657

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-30 10:22:17 +00:00
Andrew Lamb 700a1de8f3
fix: fix at least one intermittent failure (#4711) 2022-05-26 21:24:37 +00:00
Andrew Lamb 633117e595
feat: avoid catalog access on each query (#4650)
* feat: cache catalog access on query

* fix: Apply suggestions from code review

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
2022-05-26 20:44:22 +00:00
Nga Tran 6cc767efcc
feat: teach compactor to compact smaller number of files (#4671)
* refactor: split compact_partition into two functions to handle concurrency better

* feat: limit number of files to compact

* test: add test for limit num files

* chore: fix clippy

* feat: split group if over max size

* fix: split the overlapped group to limit size or file num

* chore: reduce config values

* test: add tests and clearer comments for the split_overlapped_groups and test_limit_size_and_num_files

* chore: more comments

* chore: cleanup
2022-05-25 19:54:34 +00:00
Marko Mikulicic 9ddb0a816e
fix: Return panic message in internal error (#4693)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-25 15:11:17 +00:00
Marco Neumann a08a91c5ba
fix: ensure querier cache is refreshed for partition sort key (#4660)
* test: call `maybe_start_logging` in auto-generated cases

* fix: ensure querier cache is refreshed for partition sort key

Fixes #4631.

* docs: explain querier sort key handling and test

* test: test another version of issue 4631

* fix: correctly invalidate partition sort keys

* fix: fix `table_not_found_on_ingester`
2022-05-25 10:44:42 +00:00
Marko Mikulicic cdbe546e50
fix: return gRPC error on panic (#4686) 2022-05-25 07:06:25 +00:00
Andrew Lamb a8d5f7f5f7
test: add debug output to test (#4684) 2022-05-24 19:57:11 +00:00
Marco Neumann 9c1ffc2b0d
test: panic handling, add compactor to end to end test harness (#4677)
* feat: add test gRPC client

* test: start compactor in mini cluster

* test: assert panic handling

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-24 14:55:26 +00:00
dependabot[bot] ca49820a0f
chore(deps): Bump console-subscriber from 0.1.5 to 0.1.6 (#4670)
Bumps [console-subscriber](https://github.com/tokio-rs/console) from 0.1.5 to 0.1.6.
- [Release notes](https://github.com/tokio-rs/console/releases)
- [Commits](https://github.com/tokio-rs/console/compare/console-subscriber-v0.1.5...console-subscriber-v0.1.6)

---
updated-dependencies:
- dependency-name: console-subscriber
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-24 08:24:12 +00:00
dependabot[bot] 76f7043417
chore(deps): Bump once_cell from 1.11.0 to 1.12.0 (#4666)
Bumps [once_cell](https://github.com/matklad/once_cell) from 1.11.0 to 1.12.0.
- [Release notes](https://github.com/matklad/once_cell/releases)
- [Changelog](https://github.com/matklad/once_cell/blob/master/CHANGELOG.md)
- [Commits](https://github.com/matklad/once_cell/compare/v1.11.0...v1.12.0)

---
updated-dependencies:
- dependency-name: once_cell
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-24 08:14:03 +00:00
Marco Neumann 2029bd16ba
feat: enable debugging of failed querier->ingester requests (#4659)
* feat: enable debugging of failed querier->ingester requests

- extend `query-ingester` CLI to allow usage of predicates
- on failed requests: log all information that is required for the CLI
- test the "ingester fails" scenario

* test: explain

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* docs: improve

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* refactor: move b64 pred. serde into a single crate

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2022-05-23 15:37:31 +00:00
Carol (Nichols || Goulding) c811bebdb7
feat: Add ingester CLI option to skip to oldest available WB seq num
The default behavior of the ingester is to panic if the min unpersisted
sequence number in the catalog is unknown to the write buffer due to the
retention policies having evicted that sequence number.

Specifying `--skip-to-oldest-available` changes this behavior to skip to
the oldest sequence number the write buffer does have available and go
from there.
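
The decision described above could be sketched roughly as follows. This is not the ingester's code: `StartDecision` and `choose_start` are hypothetical names, the sketch assumes "unknown to the write buffer" means the catalog's sequence number is lower than the oldest one still retained, and it returns an error where the real ingester panics.

```rust
/// Where the ingester starts reading from the write buffer (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum StartDecision {
    /// The catalog's min unpersisted sequence number is still available.
    ResumeAt(u64),
    /// The sequence number was evicted; start from the oldest available one.
    SkipTo(u64),
}

/// Pick the starting sequence number.
///
/// `min_unpersisted` comes from the catalog, `oldest_available` from the
/// write buffer, and `skip_to_oldest_available` mirrors the CLI flag.
pub fn choose_start(
    min_unpersisted: u64,
    oldest_available: u64,
    skip_to_oldest_available: bool,
) -> Result<StartDecision, String> {
    if min_unpersisted >= oldest_available {
        // Nothing was evicted: resume exactly where the catalog says.
        return Ok(StartDecision::ResumeAt(min_unpersisted));
    }
    if skip_to_oldest_available {
        // Accept the gap and continue from what the write buffer still has.
        Ok(StartDecision::SkipTo(oldest_available))
    } else {
        // Default behavior: refuse to continue (the real ingester panics).
        Err(format!(
            "sequence number {} is no longer available (oldest is {})",
            min_unpersisted, oldest_available
        ))
    }
}
```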

Fixes #4624.
2022-05-20 10:51:07 -04:00
dependabot[bot] 6bc0c74c7d
chore(deps): Bump once_cell from 1.10.0 to 1.11.0 (#4646)
* chore(deps): Bump once_cell from 1.10.0 to 1.11.0

Bumps [once_cell](https://github.com/matklad/once_cell) from 1.10.0 to 1.11.0.
- [Release notes](https://github.com/matklad/once_cell/releases)
- [Changelog](https://github.com/matklad/once_cell/blob/master/CHANGELOG.md)
- [Commits](https://github.com/matklad/once_cell/compare/v1.10.0...v1.11.0)

---
updated-dependencies:
- dependency-name: once_cell
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore: Run cargo hakari tasks

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-20 07:40:38 +00:00
Marco Neumann 20fa70d54b feat: add `measurement_fields` support to `influxdb_iox storage` 2022-05-19 16:50:46 +02:00
Marco Neumann 52346642a0
ci: fix cargo deny (#4629)
* ci: fix cargo deny

* chore: downgrade `socket2`, version 0.4.5 was yanked

* chore: rename `query` to `iox_query`

`query` is already taken on crates.io and yanked and I am getting tired
of working around that.
2022-05-18 09:38:35 +00:00
Andrew Lamb 3a33e806c7
chore: Update datafusion + `arrow`/`parquet`/`arrow-flight` to `14.0.0` (#4619)
* chore: Update datafusion deps

* chore: update arrow/parquet/arrow flight deps

* chore: Run cargo hakari tasks

* chore: Update location of utils

* chore: Update some more APIs

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2022-05-17 14:13:03 +00:00
Marco Neumann 779f0e9cdf
feat: querier RAM pool (#4593)
* feat: `SortKey::size`

* feat: `FunctionEstimator`

* feat: querier RAM pool

Let's put all the caches into a single RAM pool, so we can at least
somewhat control RAM usage. Note that this does NOT limit the peak
memory during query execution though, but should at least stop unlimited
cache growth. A follow-up PR will add metrics.

* refactor: improve some size calculations

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-17 13:11:20 +00:00
dependabot[bot] 259d2486c1
chore(deps): Bump tokio-util from 0.7.1 to 0.7.2 (#4605)
Bumps [tokio-util](https://github.com/tokio-rs/tokio) from 0.7.1 to 0.7.2.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-util-0.7.1...tokio-util-0.7.2)

---
updated-dependencies:
- dependency-name: tokio-util
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-16 11:42:31 +00:00
Raphael Taylor-Davies f2bb0fdf77
feat: update to crates.io object_store version (#4595)
* feat: update to crates.io object_store version

* chore: Run cargo hakari tasks

* fix: tests

* chore: remove object store integration test plumbing

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2022-05-13 16:26:07 +00:00
Carol (Nichols || Goulding) 55313d290a
fix: Update or remove comments that mention NG or OG
Connects to #4450.
2022-05-12 16:09:08 -04:00
Carol (Nichols || Goulding) 30e53fd09c
fix: Rename end-to-end NG tests to not contain NG
Connects to #4450.
2022-05-12 16:09:07 -04:00
Carol (Nichols || Goulding) 48e6e5713d
fix: Rename test_helpers_end_to_end_ng to test_helpers_end_to_end
Connects to #4450.
2022-05-12 16:09:07 -04:00
Carol (Nichols || Goulding) 78bbe629b2
feat: Add more logging to understand the flaky multi ingester test better (#4580)
* feat: Increase logging to investigate multi ingester flaky test

* feat: Temporarily disable a test while logging is increased in CI

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-12 20:05:05 +00:00
Carol (Nichols || Goulding) 2079cf98f6
fix: Add back a test case that needs to check ingester for write info
Specifically because the querier doesn't know about the ingester.
2022-05-11 15:30:59 -04:00
Carol (Nichols || Goulding) 48b84b3bdf
feat: Querier can get write status from ingesters
Connects to influxdata/influxdb-iox-client-go#27.
2022-05-11 14:12:10 -04:00