Commit Graph

594 Commits (7daf680e76ac035342094c4bc00c890ffbcdfa54)

Author SHA1 Message Date
Andrew Lamb 3592aa52d8
chore: Update datafusion + `arrow`/`parquet`/`arrow-flight` to `15.0.0` (#4743)
* chore: Update datafusion + `arrow`/`parquet`/`arrow-flight` to `15.0.0`

* chore: Update APIs

* chore: Run cargo hakari tasks

* feat: normalize parquet file metadata

* chore: update size tests

* chore: add docs on metadata stripping

* chore: TEMP UPDATE TO DF BRANCH

* chore: Update for new API

* fix: Update to latest DF

* fix: cargo hakari

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: Raphael Taylor-Davies <r.taylordavies@googlemail.com>
2022-06-03 10:32:26 +00:00
Andrew Lamb 1472ec272f
refactor: consolidate duplicate testing logic (#4708)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-06-01 20:02:13 +00:00
Dom Dwyer 1caeb04869 test(e2e): do not mangle prod database
Unset the all env vars for the following CLI e2e tests:

    * default_mode_is_run_all_in_one
    * default_run_mode_is_all_in_one

This prevents them from executing against the "prod" catalog, running
migrations and inserting values to the prod database specified in the
prod DSN env (INFLUXDB_IOX_CATALOG_DSN).
2022-06-01 17:12:12 +01:00
Dom Dwyer 60de97ac26 test(e2e): ensure "partition pull" writes files
Adds a test case covering the "remote partition pull" command configured
with file-based object storage.
2022-06-01 16:41:57 +01:00
Dom Dwyer 6d647fb7a9 refactor: warn for silly object store configs
Warn when downloading files to an in-memory object store.

The "remote partition pull" command downloads parquet files from an
object store via a router, and saves them locally. It's pretty unlikely
the user intends to download those files to memory of the CLI process
which then exits when the pull is complete, throwing away the downloaded
files, but this is the default.
2022-06-01 16:41:57 +01:00
Marco Neumann ebeccf037c
feat: limit querier concurrency by limiting number of active namespaces (#4752)
This is a rather quick fix for prod. On the mid-term we probably wanna
rethink our deployment strategy, e.g. by using "one query per pod" and
by deploying queryd w/ IOx into the same pod.
2022-06-01 11:59:35 +00:00
Paul Dix 6af32b7750
feat: add concurrency limit for ingester queries (#4703)
I've defaulted it to 20, we can adjust as needed.

Closes #4657

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-30 10:22:17 +00:00
Andrew Lamb 700a1de8f3
fix: fix at least one intermittent failure (#4711) 2022-05-26 21:24:37 +00:00
Andrew Lamb 633117e595
feat: avoid catalog access on each query (#4650)
* feat: cache catalog access on query

* fix: Apply suggestions from code review

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
2022-05-26 20:44:22 +00:00
Nga Tran 6cc767efcc
feat: teach compactor to compact smaller number of files (#4671)
* refactor: split compact_partition into two functions to handle concurrency better

* feat: limit number of files to compact

* test: add test for limit num files

* chore: fix cipply

* feat: split group if over max size

* fix: split the overlapped group to limit size or file num

* chore: reduce config values

* test: add tests and clearer comments for the split_overlapped_groups and test_limit_size_and_num_files

* chore: more comments

* chore: cleanup
2022-05-25 19:54:34 +00:00
Marko Mikulicic 9ddb0a816e
fix: Return panic message in internal error (#4693)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-25 15:11:17 +00:00
Marco Neumann a08a91c5ba
fix: ensure querier cache is refreshed for partition sort key (#4660)
* test: call `maybe_start_logging` in auto-generated cases

* fix: ensure querier cache is refreshed for partition sort key

Fixes #4631.

* docs: explain querier sort key handling and test

* test: test another version of issue 4631

* fix: correctly invalidate partition sort keys

* fix: fix `table_not_found_on_ingester`
2022-05-25 10:44:42 +00:00
Marko Mikulicic cdbe546e50
fix: return gRPC error on panic (#4686) 2022-05-25 07:06:25 +00:00
Andrew Lamb a8d5f7f5f7
test: add debug output to test (#4684) 2022-05-24 19:57:11 +00:00
Marco Neumann 9c1ffc2b0d
test: panic handling, add compactor to end to end test harness (#4677)
* feat: add test gRPC client

* test: start compactor in mini cluster

* test: assert panic handling

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-24 14:55:26 +00:00
dependabot[bot] ca49820a0f
chore(deps): Bump console-subscriber from 0.1.5 to 0.1.6 (#4670)
Bumps [console-subscriber](https://github.com/tokio-rs/console) from 0.1.5 to 0.1.6.
- [Release notes](https://github.com/tokio-rs/console/releases)
- [Commits](https://github.com/tokio-rs/console/compare/console-subscriber-v0.1.5...console-subscriber-v0.1.6)

---
updated-dependencies:
- dependency-name: console-subscriber
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-24 08:24:12 +00:00
dependabot[bot] 76f7043417
chore(deps): Bump once_cell from 1.11.0 to 1.12.0 (#4666)
Bumps [once_cell](https://github.com/matklad/once_cell) from 1.11.0 to 1.12.0.
- [Release notes](https://github.com/matklad/once_cell/releases)
- [Changelog](https://github.com/matklad/once_cell/blob/master/CHANGELOG.md)
- [Commits](https://github.com/matklad/once_cell/compare/v1.11.0...v1.12.0)

---
updated-dependencies:
- dependency-name: once_cell
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-24 08:14:03 +00:00
Marco Neumann 2029bd16ba
feat: enable debugging of failed querier->ingester requests (#4659)
* feat: enable debugging of failed querier->ingester requests

- extend `query-ingester` CLI to allow usage of predicates
- on failed requests: log all information that required for the CLI
- test the "ingester fails" scenario

* test: explain

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* docs: improve

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>

* refactor: move b64 pred. serde into a single crate

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2022-05-23 15:37:31 +00:00
Carol (Nichols || Goulding) c811bebdb7
feat: Add ingester CLI option to skip to oldest available WB seq num
The default behavior of the ingester is to panic if the min unpersisted
sequence number in the catalog is unknown to the write buffer due to the
retention policies having evicted that sequence number.

Specifying `--skip-to-oldest-available` changes this behavior to skip to
the oldest sequence number the write buffer does have available and go
from there.

Fixes #4624.
2022-05-20 10:51:07 -04:00
dependabot[bot] 6bc0c74c7d
chore(deps): Bump once_cell from 1.10.0 to 1.11.0 (#4646)
* chore(deps): Bump once_cell from 1.10.0 to 1.11.0

Bumps [once_cell](https://github.com/matklad/once_cell) from 1.10.0 to 1.11.0.
- [Release notes](https://github.com/matklad/once_cell/releases)
- [Changelog](https://github.com/matklad/once_cell/blob/master/CHANGELOG.md)
- [Commits](https://github.com/matklad/once_cell/compare/v1.10.0...v1.11.0)

---
updated-dependencies:
- dependency-name: once_cell
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore: Run cargo hakari tasks

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-20 07:40:38 +00:00
Marco Neumann 20fa70d54b feat: add `measurement_fields` support to `influxdb_iox storage` 2022-05-19 16:50:46 +02:00
Marco Neumann 52346642a0
ci: fix cargo deny (#4629)
* ci: fix cargo deny

* chore: downgrade `socket2`, version 0.4.5 was yanked

* chore: rename `query` to `iox_query`

`query` is already taken on crates.io and yanked and I am getting tired
of working around that.
2022-05-18 09:38:35 +00:00
Andrew Lamb 3a33e806c7
chore: Update datafusion + `arrow`/`parquet`/`arrow-flight` to `14.0.0` (#4619)
* chore: Update datafusion deps

* chore: update arrow/parquet/arrow flight deps

* chore: Run cargo hakari tasks

* chore: Update location of utils

* chore: Update some more APIs

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2022-05-17 14:13:03 +00:00
Marco Neumann 779f0e9cdf
feat: querier RAM pool (#4593)
* feat: `SortKey::size`

* feat: `FunctionEstimator`

* feat: querier RAM pool

Let's put all the caches into a single RAM pool, so we can at least
somewhat control RAM usage. Note that this does NOT limit the peak
memory during query execution though, but should at least stop unlimited
cache growth. A follow-up PR will add metrics.

* refactor: improve some size calculations

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-17 13:11:20 +00:00
dependabot[bot] 259d2486c1
chore(deps): Bump tokio-util from 0.7.1 to 0.7.2 (#4605)
Bumps [tokio-util](https://github.com/tokio-rs/tokio) from 0.7.1 to 0.7.2.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-util-0.7.1...tokio-util-0.7.2)

---
updated-dependencies:
- dependency-name: tokio-util
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-05-16 11:42:31 +00:00
Raphael Taylor-Davies f2bb0fdf77
feat: update to crates.io object_store version (#4595)
* feat: update to crates.io object_store version

* chore: Run cargo hakari tasks

* fix: tests

* chore: remove object store integration test plumbing

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2022-05-13 16:26:07 +00:00
Carol (Nichols || Goulding) 55313d290a
fix: Update or remove comments that mention NG or OG
Connects to #4450.
2022-05-12 16:09:08 -04:00
Carol (Nichols || Goulding) 30e53fd09c
fix: Rename end-to-end NG tests to not contain NG
Connects to #4450.
2022-05-12 16:09:07 -04:00
Carol (Nichols || Goulding) 48e6e5713d
fix: Rename test_helpers_end_to_end_ng to test_helpers_end_to_end
Connects to #4450.
2022-05-12 16:09:07 -04:00
Carol (Nichols || Goulding) 78bbe629b2
feat: Add more logging to understand the flaky multi ingester test better (#4580)
* feat: Increase logging to investigate multi ingester flaky test

* feat: Temporarily disable a test while logging is increased in CI

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-12 20:05:05 +00:00
Carol (Nichols || Goulding) 2079cf98f6
fix: Add back a test case that needs to check ingester for write info
Specifically because the querier doesn't know about the ingester.
2022-05-11 15:30:59 -04:00
Carol (Nichols || Goulding) 48b84b3bdf
feat: Querier can get write status from ingesters
Connects to influxdata/influxdb-iox-client-go#27.
2022-05-11 14:12:10 -04:00
Andrew Lamb 381ad3b81d chore: Update heappy 2022-05-11 09:49:10 -04:00
Andrew Lamb b8cb4c3f2b
feat: Interrogate schema from querier (as well as router) (#4557)
* refactor: move SchemaService into `service_grpc_schema`

* feat: implement schema gRPC for querier

* chore: Run cargo hakari tasks

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2022-05-10 20:55:58 +00:00
Andrew Lamb 03ee6840d0
feat: Add `debug namespaces` CLI command (#4556) 2022-05-10 18:35:05 +00:00
Andrew Lamb 84fd883688
feat: Add query_ingester CLI command (#4554)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-10 18:18:07 +00:00
Raphael Taylor-Davies 84d60ce56e
fix: feature flags (#4550)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-10 13:42:51 +00:00
Raphael Taylor-Davies 99b1a9b83f
refactor: split out ObjectStoreMetrics (#4547)
* refactor: split out ObjectStoreMetrics

* chore: add workspace hack

* fix: compile
2022-05-10 10:56:28 +00:00
Raphael Taylor-Davies 8b379c83cc
refactor: simplify object_store path handling (#4534)
* refactor: simplify object_store path handling

* fix: aws integration tests

* chore: lint

* fix: update gcs tests

* refactor: move errors into submodules

* chore: lint

* chore: review feedback

* refactor: replace provider with Display

* fix: failing tests

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-05-09 18:43:22 +00:00
Carol (Nichols || Goulding) 1759443a13
fix: Remove unused dependencies in influxdb_iox found by manual inspection 2022-05-06 14:51:54 -04:00
Carol (Nichols || Goulding) fcd4815645
fix: Rename router2 to router 2022-05-06 14:51:52 -04:00
Carol (Nichols || Goulding) 0650a9bb77
fix: Rename ioxd_router2 to ioxd_router 2022-05-06 14:45:39 -04:00
Carol (Nichols || Goulding) 068096e7e1
fix: Rename data_types2 to data_types 2022-05-06 14:45:39 -04:00
Carol (Nichols || Goulding) 0541c6e40f
fix: Remove data_types crate where it's no longer used 2022-05-06 14:45:39 -04:00
Carol (Nichols || Goulding) 485d6edb8f
refactor: Move IngesterQueryRequest to generated_types 2022-05-06 14:45:37 -04:00
Carol (Nichols || Goulding) ea46830954
fix: Remove iox_object_store crate; move ParquetFilePath to parquet_file 2022-05-06 14:45:36 -04:00
Carol (Nichols || Goulding) f8bdb022bc
fix: Remove job_registry crate 2022-05-06 11:35:11 -04:00
Carol (Nichols || Goulding) c45a85ca81
fix: Remove now-obsolete 'debug dump catalog' command 2022-05-06 11:30:36 -04:00
Carol (Nichols || Goulding) b88d071ce7
fix: Remove server 2022-05-06 11:30:36 -04:00
Carol (Nichols || Goulding) e0bc1801ac
fix: Remove router 2022-05-06 11:30:36 -04:00