Commit Graph

8955 Commits (439bcf08d98066c167b3b144d4caaecd2853babe)

Author SHA1 Message Date
Marco Neumann 26f3849191
feat: improve cache policies and their testing (#5375)
1. add GET support for the subscribers (this is needed so that
   TTL/refreshers and LRU systems "know" when a key was used)
2. improve multi-threading test to not rely on a wait loop but use a 2nd
   barrier instead
3. ensure that panics in the testing background thread are propagated
   and make the test fail
4. implement defaults for `Subscriber` methods to reduce boilerplate for
   implementors

Helps with #5320.
2022-08-11 13:53:22 +00:00
Marco Neumann 90fec1365f
feat: intern schemas during query planning (#5215)
* feat: intern schemas during query planning

Helps with #5202.

* refactor: `SchemaMerger::build` shall return an `Arc`

* feat: `SchemaMerger::with_interner`

* refactor: hash-based schema interning
2022-08-11 12:28:51 +00:00
kodiakhq[bot] 4867c6b682
Merge pull request #5376 from influxdata/dom/rskafka-bump
build: bump rskafka
2022-08-11 12:20:25 +00:00
Dom Dwyer 7174f38f3f
build: bump rskafka
Bumps rskafka to HEAD to pick up:

    https://github.com/influxdata/rskafka/pull/164
2022-08-11 14:00:21 +02:00
dependabot[bot] ae6ac27960
chore(deps): Bump console-subscriber from 0.1.6 to 0.1.7 (#5374)
Bumps [console-subscriber](https://github.com/tokio-rs/console) from 0.1.6 to 0.1.7.
- [Release notes](https://github.com/tokio-rs/console/releases)
- [Commits](https://github.com/tokio-rs/console/compare/console-subscriber-v0.1.6...tokio-console-v0.1.7)

---
updated-dependencies:
- dependency-name: console-subscriber
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-11 11:08:10 +00:00
Marco Neumann 13f2a2ebc7
feat: new policy system for caches (#5370)
* feat: new policy system for caches

This is the framework part for the policy system outlined in #5320. It
does NOT port any existing policies (likely TTL, LRU and shared) over to
the new system. This will be a follow-up.

* docs: improve example

Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
2022-08-11 09:41:17 +00:00
dependabot[bot] 692ec97c2f
chore(deps): Bump ahash from 0.7.6 to 0.8.0 (#5373)
* chore(deps): Bump ahash from 0.7.6 to 0.8.0

Bumps [ahash](https://github.com/tkaitchuck/ahash) from 0.7.6 to 0.8.0.
- [Release notes](https://github.com/tkaitchuck/ahash/releases)
- [Commits](https://github.com/tkaitchuck/ahash/compare/v0.7.6...v0.8.0)

---
updated-dependencies:
- dependency-name: ahash
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore: Run cargo hakari tasks

* fix: `ahash` features

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: Marco Neumann <marco@crepererum.net>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-11 08:37:31 +00:00
Marco Neumann 6b8b922fe7
fix: do not lose data when Kafka reports that offset is above watermark (#5322)
* fix: do not lose data when Kafka reports that offset is above watermark

This can happen in certain cluster rebalance settings.

This is also linked to https://github.com/influxdata/rskafka/issues/147
but for the upstream issue I currently have no idea how to fix it, so
let's at least harden IOx against it.

Fixes #5128.

* refactor: panic for `SequenceNumberAfterWatermark`
2022-08-11 07:32:04 +00:00
Andrew Lamb 3a945dbcb2
chore: return a struct with named and documented fields from `compact_persisting_batch` (#5346)
* chore: return a struct with named and documented fields from `compact_persisting_batch`

* docs: Remove extra 'the' and fix a typo

Co-authored-by: Carol (Nichols || Goulding) <carol.nichols@gmail.com>
2022-08-10 20:22:29 +00:00
Andrew Lamb b834bc630c
chore: more readability improvements to sort keys (#5366)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-10 17:59:25 +00:00
kodiakhq[bot] 07729bf153
Merge pull request #5334 from influxdata/cn+jg/compactor-e2e-tests
feat: Compactor e2e tests and a "run once" compactor command
2022-08-10 16:19:06 +00:00
kodiakhq[bot] 5a672dc3da
Merge branch 'main' into cn+jg/compactor-e2e-tests
2022-08-10 16:11:50 +00:00
Luke Bond 061444a23c
chore: added helper script for a docker catalog (#5372)
2022-08-10 15:49:40 +00:00
Carol (Nichols || Goulding) 1b77abdda7
fix: Explain the purpose of this macro, make arg name better
2022-08-10 11:33:43 -04:00
Carol (Nichols || Goulding) 96acf3c54b
fix: Don't rustfmt this module because of a rustfmt bug
2022-08-10 11:30:39 -04:00
Carol (Nichols || Goulding) 75bdd470a2
docs: Rewrap doc comments to 100 cols now that they're indented more
2022-08-10 11:30:22 -04:00
Carol (Nichols || Goulding) 9321c96aaf
test: Add a step for running compaction in e2e tests
2022-08-10 11:30:22 -04:00
Carol (Nichols || Goulding) 8ec9117836
test: Log a command run by e2e tests in a convenient way
So that you can copy-paste it if you want to run it again! 🎉
2022-08-10 11:30:22 -04:00
Carol (Nichols || Goulding) 9981d1636b
refactor: Move dump_log_to_stdout so it can be used in more than server fixtures
2022-08-10 11:30:22 -04:00
Jake Goulding 3915841a53
feat: Introduce a separate config for the compactor command
2022-08-10 11:30:21 -04:00
Jake Goulding 21864f35e1
refactor: Generate the CompactorConfig in a macro
This will allow us to have related but different configurations for
service and command mode.
2022-08-10 11:30:20 -04:00
Jake Goulding 7787c51b57
feat: add new CLI command to run the compactor once
2022-08-10 11:28:51 -04:00
Jake Goulding 68e64af4d1
refactor: extract compactor loop body to call it separately
2022-08-10 11:28:51 -04:00
Jake Goulding 49c5281454
refactor: Supersede old CompactorHandlerImpl constructor
2022-08-10 11:28:51 -04:00
Jake Goulding e9140df476
refactor: extract method to build `compactor` from CLI configuration
2022-08-10 11:28:51 -04:00
Jake Goulding ce908c8678
refactor: Use CompactorHandlerImpl::new_with_compactor in service
2022-08-10 11:28:51 -04:00
Jake Goulding cc061b6ce9
refactor: add CompactorHandlerImpl::new_with_compactor
This will allow us to refactor the code a level up to create a
`Compactor` directly.
2022-08-10 11:28:51 -04:00
Carol (Nichols || Goulding) 463a13b814
test: Remove the compactor from the test MiniCluster
2022-08-10 11:28:51 -04:00
Carol (Nichols || Goulding) 45f8e567ed
fix: Revert adjustment of e2e test to expect compaction
This was part of
"feat: Different branch to hook up new compaction algorithm (#5194)"

and will be added back in a new test for compaction specifically in the
next commit.

This reverts part of commit 69640c0ba5.
2022-08-10 11:28:50 -04:00
Marco Neumann 22037b2461
chore: update libm to 0.2.5 (#5371)
The former version (0.2.4) was yanked, so our CI is now failing.
2022-08-10 15:03:27 +00:00
Luke Bond 7e9918f067
chore: import validate merged schema (#5367)
* feat: import schema merge now outputs validation results

chore: refactor import crate

chore: renamed some structs for clarity in import crate

* chore: tests for import schema merge validation

* chore: Run cargo hakari tasks

* chore: clippy

* chore: make hashmap loop easier to read in import schema validation

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-10 12:15:37 +00:00
Andrew Lamb c0fc91c627
chore: Warn if a parquet file has no sort key (#5368)
2022-08-10 11:56:50 +00:00
Andrew Lamb ce3e2c3a15
chore: make terminology in iox_query::Provider consistent (remove super notation) (#5349)
* chore: make terminology in iox_query::Provider consistent (remove super notation)

* refactor: be more specific about *which* sort key is meant

* refactor: rename another sort_key --> output_sort_key

* refactor: rename additional sort_key to output_sort_key

* refactor: rename sort_key --> chunk_sort_key

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-10 10:59:47 +00:00
Andrew Lamb ee2013ce52
chore: Update docstrings for Partition::sort_keys (#5347)
* chore: Update docstrings for Partition::sort_keys

* docs: describe update details
2022-08-10 10:52:24 +00:00
Marco Neumann 3446127b65
chore: enable and fix warnings for `clap_blocks` (#5365)
Esp. this fixes "unused import" warnings when not all features are
enabled, so developer IDEs don't shout.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-10 10:14:34 +00:00
dependabot[bot] 624b1b33af
chore(deps): Bump libc from 0.2.127 to 0.2.129 (#5363)
Bumps [libc](https://github.com/rust-lang/libc) from 0.2.127 to 0.2.129.
- [Release notes](https://github.com/rust-lang/libc/releases)
- [Commits](https://github.com/rust-lang/libc/compare/0.2.127...0.2.129)

---
updated-dependencies:
- dependency-name: libc
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-08-10 09:23:29 +00:00
Marco Neumann 4da124d862
feat: concurrent garbage collector deletes (#5364)
This should speed up the prod process a bit.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-10 09:14:46 +00:00
Luke Bond c5f062bba0
feat: initial commit of schema merge bulk import tool (#5344)
* feat: initial commit of schema merge bulk import tool

* chore: use observability deps instead of tracing-*

* chore: removed debug printlns

* chore: fix feature decls for cloud providers for import crate

* chore: use println instead of info in import- no need for a simple CLI

* chore: tidy whitespace

* chore: remove unused dep in import

* chore: Run cargo hakari tasks

* chore: removed unimpld import job subcommand

* chore: clarifying comment about custom serialisation code

* chore: clarifying comment about schema merge code in import

* chore: fix wrong comment in import command

* chore: bump object store dep to get bugfix

* chore: rename import schema struct for clarity

* chore: run `cargo hakari generate`

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-10 09:07:38 +00:00
dependabot[bot] 4bc16356bc
chore(deps): Bump serde from 1.0.142 to 1.0.143 (#5362)
Bumps [serde](https://github.com/serde-rs/serde) from 1.0.142 to 1.0.143.
- [Release notes](https://github.com/serde-rs/serde/releases)
- [Commits](https://github.com/serde-rs/serde/compare/v1.0.142...v1.0.143)

---
updated-dependencies:
- dependency-name: serde
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-08-10 08:37:06 +00:00
dependabot[bot] 7975d70632
chore(deps): Bump chrono from 0.4.20 to 0.4.21 (#5361)
* chore(deps): Bump chrono from 0.4.20 to 0.4.21

Bumps [chrono](https://github.com/chronotope/chrono) from 0.4.20 to 0.4.21.
- [Release notes](https://github.com/chronotope/chrono/releases)
- [Changelog](https://github.com/chronotope/chrono/blob/main/CHANGELOG.md)
- [Commits](https://github.com/chronotope/chrono/compare/v0.4.20...v0.4.21)

---
updated-dependencies:
- dependency-name: chrono
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore: Run cargo hakari tasks

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2022-08-10 08:25:45 +00:00
Andrew Lamb 16ddc5efc6
chore: Update datafusion / arrow/parquet/arrow-flight and prost/tonic ecosystem (#5360)
* chore: Update datafusion and arrow

* chore: Update Cargo.lock

* chore: update to Decimal128

* chore: Update tonic/prost/pbjson/etc

* chore: Run cargo hakari tasks

* fix: doctest in generated types

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2022-08-09 17:30:44 +00:00
Raphael Taylor-Davies dadcc369b1
chore: update object_store to fix credentials client (#5359)
* chore: update object_store to fix credentials client

* chore: Run cargo hakari tasks

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2022-08-09 13:17:43 +00:00
Andrew Lamb b21799acae
chore: Update datafusion, get `date_bin` (#5340)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-09 11:01:37 +00:00
Raphael Taylor-Davies dfa862fd53
chore: temporary allow http always (#5357)
2022-08-09 10:54:42 +00:00
Andrew Lamb 7219f512c3
fix: update sort key in catalog before adding parquet file to catalog (#5333)
* fix: update sort key before parquet file

* fix: Remove left over debugging

* fix: fix bug, improve logging

* chore: move debug log after catalog update, improve args and docs
2022-08-09 10:27:51 +00:00
Raphael Taylor-Davies ccb45d7bac
chore: update to rusoto-less object_store (#5342)
* chore: update to rusoto-less object_store

* chore: Run cargo hakari tasks

* chore: further fixes

* chore: document workaround

* chore: review feedback

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2022-08-09 09:06:03 +00:00
kodiakhq[bot] a4923a709a
Merge pull request #5341 from influxdata/dom/instrument-kafka
feat: instrument Kafka produce latency
2022-08-09 08:27:05 +00:00
kodiakhq[bot] eebeae0fc6
Merge branch 'main' into dom/instrument-kafka
2022-08-09 08:19:54 +00:00
Andrew Lamb 172f893368
fix: fix logging typo in querier (#5345)
* fix: fix logging typo

* fix: fix type in typo fix ;(
2022-08-09 06:34:06 +00:00
Marco Neumann 3f55335d91
fix: create proper storage regex request via CLI (#5343)
The literals for regex comparisons within the storage API are NOT
strings but regex nodes. This was a mistake in #5281.
2022-08-08 15:41:24 +00:00