8883 Commits (fc162b9dc2ada841d75626d4716b1dd75c07afcd)

Author SHA1 Message Date
Jake Goulding 68e64af4d1
refactor: extract compactor loop body to call it separately 2022-08-10 11:28:51 -04:00
Jake Goulding 49c5281454
refactor: Supersede old CompactorHandlerImpl constructor 2022-08-10 11:28:51 -04:00
Jake Goulding e9140df476
refactor: extract method to build `compactor` from CLI configuration 2022-08-10 11:28:51 -04:00
Jake Goulding ce908c8678
refactor: Use CompactorHandlerImpl::new_with_compactor in service 2022-08-10 11:28:51 -04:00
Jake Goulding cc061b6ce9
refactor: add CompactorHandlerImpl::new_with_compactor
This will allow us to refactor the code a level up to create a
`Compactor` directly.
2022-08-10 11:28:51 -04:00
Carol (Nichols || Goulding) 463a13b814
test: Remove the compactor from the test MiniCluster 2022-08-10 11:28:51 -04:00
Carol (Nichols || Goulding) 45f8e567ed
fix: Revert adjustment of e2e test to expect compaction
This was part of
"feat: Different branch to hook up new compaction algorithm (#5194)"

and will be added back in a new test for compaction specifically in the
next commit.

This reverts part of commit 69640c0ba5.
2022-08-10 11:28:50 -04:00
Marco Neumann 22037b2461
chore: update libm to 0.2.5 (#5371)
The former version (0.2.4) was yanked, so our CI is now failing.
2022-08-10 15:03:27 +00:00
Luke Bond 7e9918f067
chore: import validate merged schema (#5367)
* feat: import schema merge now outputs validation results

chore: refactor import crate

chore: renamed some structs for clarity in import crate

* chore: tests for import schema merge validation

* chore: Run cargo hakari tasks

* chore: clippy

* chore: make hashmap loop easier to read in import schema validation

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-10 12:15:37 +00:00
Andrew Lamb c0fc91c627
chore: Warn if a parquet file has no sort key (#5368) 2022-08-10 11:56:50 +00:00
Andrew Lamb ce3e2c3a15
chore: make terminology in iox_query::Provider consistent (remove super notation) (#5349)
* chore: make terminology in iox_query::Provider consistent (remove super notation)

* refactor: be more specific about *which* sort key is meant

* refactor: rename another sort_key --> output_sort_key

* refactor: rename additional sort_key to output_sort_key

* refactor: rename sort_key --> chunk_sort_key

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-10 10:59:47 +00:00
Andrew Lamb ee2013ce52
chore: Update docstrings for Partition::sort_keys (#5347)
* chore: Update docstrings for Partition::sort_keys

* docs: describe update details
2022-08-10 10:52:24 +00:00
Marco Neumann 3446127b65
chore: enable and fix warnings for `clap_blocks` (#5365)
Esp. this fixes "unused import" warnings when not all features are
enabled, so developer IDEs don't shout.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-10 10:14:34 +00:00
dependabot[bot] 624b1b33af
chore(deps): Bump libc from 0.2.127 to 0.2.129 (#5363)
Bumps [libc](https://github.com/rust-lang/libc) from 0.2.127 to 0.2.129.
- [Release notes](https://github.com/rust-lang/libc/releases)
- [Commits](https://github.com/rust-lang/libc/compare/0.2.127...0.2.129)

---
updated-dependencies:
- dependency-name: libc
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-08-10 09:23:29 +00:00
Marco Neumann 4da124d862
feat: concurrent garbage collector deletes (#5364)
This should speed up the prod process a bit.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-10 09:14:46 +00:00
Luke Bond c5f062bba0
feat: initial commit of schema merge bulk import tool (#5344)
* feat: initial commit of schema merge bulk import tool

* chore: use observability deps instead of tracing-*

* chore: removed debug printlns

* chore: fix feature decls for cloud providers for import crate

* chore: use println instead of info in import- no need for a simple CLI

* chore: tidy whitespace

* chore: remove unused dep in import

* chore: Run cargo hakari tasks

* chore: removed unimpld import job subcommand

* chore: clarifying comment about custom serialisation code

* chore: clarifying comment about schema merge code in import

* chore: fix wrong comment in import command

* chore: bump object store dep to get bugfix

* chore: rename import schema struct for clarity

* chore: run `cargo hakari generate`

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-10 09:07:38 +00:00
dependabot[bot] 4bc16356bc
chore(deps): Bump serde from 1.0.142 to 1.0.143 (#5362)
Bumps [serde](https://github.com/serde-rs/serde) from 1.0.142 to 1.0.143.
- [Release notes](https://github.com/serde-rs/serde/releases)
- [Commits](https://github.com/serde-rs/serde/compare/v1.0.142...v1.0.143)

---
updated-dependencies:
- dependency-name: serde
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-08-10 08:37:06 +00:00
dependabot[bot] 7975d70632
chore(deps): Bump chrono from 0.4.20 to 0.4.21 (#5361)
* chore(deps): Bump chrono from 0.4.20 to 0.4.21

Bumps [chrono](https://github.com/chronotope/chrono) from 0.4.20 to 0.4.21.
- [Release notes](https://github.com/chronotope/chrono/releases)
- [Changelog](https://github.com/chronotope/chrono/blob/main/CHANGELOG.md)
- [Commits](https://github.com/chronotope/chrono/compare/v0.4.20...v0.4.21)

---
updated-dependencies:
- dependency-name: chrono
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

* chore: Run cargo hakari tasks

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2022-08-10 08:25:45 +00:00
Andrew Lamb 16ddc5efc6
chore: Update datafusion / arrow/parquet/arrow-flight and prost/tonic ecosystem (#5360)
* chore: Update datafusion and arrow

* chore: Update Cargo.lock

* chore: update to Decimal128

* chore: Update tonic/prost/pbjson/etc

* chore: Run cargo hakari tasks

* fix: doctest in generated types

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2022-08-09 17:30:44 +00:00
Raphael Taylor-Davies dadcc369b1
chore: update object_store to fix credentials client (#5359)
* chore: update object_store to fix credentials client

* chore: Run cargo hakari tasks

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2022-08-09 13:17:43 +00:00
Andrew Lamb b21799acae
chore: Update datafusion, get `date_bin` (#5340)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-09 11:01:37 +00:00
Raphael Taylor-Davies dfa862fd53
chore: temporarily allow http always (#5357) 2022-08-09 10:54:42 +00:00
Andrew Lamb 7219f512c3
fix: update sort key in catalog before adding parquet file to catalog (#5333)
* fix: update sort key before parquet file

* fix: Remove left over debugging

* fix: fix bug, improve logging

* chore: move debug log after catalog update, improve args and docs
2022-08-09 10:27:51 +00:00
Raphael Taylor-Davies ccb45d7bac
chore: update to rusoto-less object_store (#5342)
* chore: update to rusoto-less object_store

* chore: Run cargo hakari tasks

* chore: further fixes

* chore: document workaround

* chore: review feedback

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
2022-08-09 09:06:03 +00:00
kodiakhq[bot] a4923a709a
Merge pull request #5341 from influxdata/dom/instrument-kafka
feat: instrument Kafka produce latency
2022-08-09 08:27:05 +00:00
kodiakhq[bot] eebeae0fc6
Merge branch 'main' into dom/instrument-kafka 2022-08-09 08:19:54 +00:00
Andrew Lamb 172f893368
fix: fix logging typo in querier (#5345)
* fix: fix logging typo

* fix: fix type in typo fix ;(
2022-08-09 06:34:06 +00:00
Marco Neumann 3f55335d91
fix: create proper storage regex request via CLI (#5343)
The literals for regex comparisons within the storage API are NOT
strings but regex nodes. This was a mistake in #5281.
2022-08-08 15:41:24 +00:00
Dom Dwyer 87e4290e1f
refactor(write_buffer): database_name -> topic_name
Previously IOx mapped a single database to a single kafka topic - this
is no longer the case, so referring to the kafka topic name as the
"database name" is confusing.
2022-08-08 15:24:35 +02:00
Dom Dwyer c133cf22c6
refactor: use kafka produce instrumentation
This commit changes the IOx write buffer initialisation code to add the
KafkaProducerMetrics instrumentation to the per-partition Kafka clients.
2022-08-08 15:24:35 +02:00
Dom Dwyer 284a3069ce
feat: Kafka client produce() instrumentation
Adds a decorator over the underlying kafka client to capture the latency
distribution of the low-level kafka writes, independent of the
aggregation/DML batching framework that sits "above" this client.

The latency measurements include the serialisation overhead, protocol
overhead, and actual network I/O.
2022-08-08 15:24:35 +02:00
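The decorator approach described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the `Producer` trait, `InstrumentedProducer`, and `NoopProducer` are hypothetical names, a plain `Vec<Duration>` stands in for a real latency histogram, and recording failed calls as well as successful ones is an assumption.

```rust
use std::sync::Mutex;
use std::time::{Duration, Instant};

/// Hypothetical minimal producer interface; NOT the real Kafka client API.
pub trait Producer {
    fn produce(&self, payload: &[u8]) -> Result<(), String>;
}

/// Decorator that times every `produce()` call made through it.
pub struct InstrumentedProducer<P> {
    inner: P,
    // Assumption: a Vec stands in for a proper latency histogram metric.
    latencies: Mutex<Vec<Duration>>,
}

impl<P: Producer> InstrumentedProducer<P> {
    pub fn new(inner: P) -> Self {
        Self {
            inner,
            latencies: Mutex::new(Vec::new()),
        }
    }

    /// Number of calls observed so far.
    pub fn observations(&self) -> usize {
        self.latencies.lock().unwrap().len()
    }
}

impl<P: Producer> Producer for InstrumentedProducer<P> {
    fn produce(&self, payload: &[u8]) -> Result<(), String> {
        let start = Instant::now();
        // Delegate to the wrapped client; the elapsed time covers everything
        // the inner call does (serialisation, protocol work, network I/O).
        let result = self.inner.produce(payload);
        // Assumption: record the latency whether the call succeeded or not.
        self.latencies.lock().unwrap().push(start.elapsed());
        result
    }
}

/// Stand-in client used only for this example; it does no I/O.
pub struct NoopProducer;

impl Producer for NoopProducer {
    fn produce(&self, _payload: &[u8]) -> Result<(), String> {
        Ok(())
    }
}
```

Because the wrapper implements the same trait as the client it wraps, callers can swap `InstrumentedProducer::new(client)` in for `client` without other changes.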
Dom Dwyer d003fe0047
refactor: const KafkaPartition::new()
Allows this fn to be called from const contexts (useful in test setups).
2022-08-08 14:56:03 +02:00
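As a rough illustration of why a `const` constructor helps in test setups, the sketch below uses a hypothetical stand-in type. The `i32` field, the `get` accessor, and the `TEST_PARTITION` constant are assumptions made for the example and may not match the real `KafkaPartition`.

```rust
/// Hypothetical stand-in for a Kafka partition identifier.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct KafkaPartition(i32);

impl KafkaPartition {
    /// Marking the constructor `const fn` allows it to be evaluated at
    /// compile time, e.g. in `const` and `static` items.
    pub const fn new(v: i32) -> Self {
        Self(v)
    }

    /// Accessor added for the example only.
    pub const fn get(&self) -> i32 {
        self.0
    }
}

/// A shared test fixture; this line only compiles because `new` is `const`.
pub const TEST_PARTITION: KafkaPartition = KafkaPartition::new(42);
```

Without the `const` qualifier, the `TEST_PARTITION` item would be rejected by the compiler and each test would have to construct the value at runtime.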
Dom Dwyer 323788767d
refactor: impl TimeProvider for Arc<TimeProvider>
This allows the MockProvider to be used in tests with consuming code
that uses generics/static dispatch instead of a dyn TimeProvider, while
still retaining a ref to the MockProvider instance.
2022-08-08 14:56:03 +02:00
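A minimal sketch of the pattern this commit describes, under simplifying assumptions: the trait here returns a bare `u64` instead of a real time type, and `MockProvider::set` and the `Stamper` consumer are hypothetical names introduced for illustration.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical, simplified clock trait (assumption: returns a plain u64).
pub trait TimeProvider {
    fn now(&self) -> u64;
}

/// Forward the trait through `Arc`, so `Arc<T>` is itself a `TimeProvider`.
impl<T: TimeProvider + ?Sized> TimeProvider for Arc<T> {
    fn now(&self) -> u64 {
        (**self).now()
    }
}

/// Mock clock whose time can be changed from the test.
pub struct MockProvider {
    now: Mutex<u64>,
}

impl MockProvider {
    pub fn new(start: u64) -> Self {
        Self { now: Mutex::new(start) }
    }

    pub fn set(&self, t: u64) {
        *self.now.lock().unwrap() = t;
    }
}

impl TimeProvider for MockProvider {
    fn now(&self) -> u64 {
        *self.now.lock().unwrap()
    }
}

/// Hypothetical consumer that uses generics (static dispatch) and takes
/// ownership of its clock instead of holding a `dyn TimeProvider`.
pub struct Stamper<P: TimeProvider> {
    clock: P,
}

impl<P: TimeProvider> Stamper<P> {
    pub fn new(clock: P) -> Self {
        Self { clock }
    }

    pub fn stamp(&self) -> u64 {
        self.clock.now()
    }
}
```

A test can then keep one `Arc<MockProvider>` for itself and hand a clone to `Stamper::new`, so time changes made through the test's handle are visible to the consumer.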
Marco Neumann cd0dc42b4a
refactor: use a single chunk filter/pruning step in querier (#5338)
We already prune all chunks in the query-access layer. There's no need
to do that another time (which is actually the first time) in
`QuerierTable::chunks`. The time savings we get from feeding fewer chunks
into the state reconciling should be negligible. On the pro side, however,
we get a more streamlined data flow and actually correct chunk pruning
metrics. Also see #5336.

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-08 12:55:14 +00:00
Andrew Lamb f3913f89e3
chore: Update datafusion (to get fix for pruning bug) (#5339)
* chore: Update datafusion

* chore: Update AggregateSelector API
2022-08-08 12:28:21 +00:00
Marco Neumann 5f407ec8cd
chore: ignore a few profiling-related files (#5337)
* chore: git-ignore heaptrack output

* chore: git-ignore perf outputs
2022-08-08 12:04:06 +00:00
Andrew Lamb f9d0e37144
chore: reduce h2 and hyper logging level in tests (#5332)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-08 09:39:26 +00:00
dependabot[bot] 3a697df261
chore(deps): Bump sqlparser from 0.19.0 to 0.20.0 (#5335)
Bumps [sqlparser](https://github.com/sqlparser-rs/sqlparser-rs) from 0.19.0 to 0.20.0.
- [Release notes](https://github.com/sqlparser-rs/sqlparser-rs/releases)
- [Changelog](https://github.com/sqlparser-rs/sqlparser-rs/blob/main/CHANGELOG.md)
- [Commits](https://github.com/sqlparser-rs/sqlparser-rs/compare/v0.19.0...v0.20.0)

---
updated-dependencies:
- dependency-name: sqlparser
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-08-08 08:23:31 +00:00
Andrew Lamb 7066c4e679
fix: make it clear rpc_predicates are only ever specialized when a schema is known (#5315)
* fix: make it clear rpc_predicates are only ever specialized when a schema is known

* fix: handle case of no schema

* fix: Update predicate/src/rpc_predicate.rs
2022-08-06 10:56:53 +00:00
Nga Tran b71c1a09ea
feat: only sleep when there are neither hot nor cold partitions to compact (#5329)
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-05 16:36:36 +00:00
Andrew Lamb 38a0cdbb4a
fix: Install cargo deny in ci image (#5317)
* fix: install cargo deny in ci image

* fix: Update docker/Dockerfile.ci

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>

* fix: Apply suggestions from code review

Co-authored-by: Marco Neumann <marco@crepererum.net>

Co-authored-by: Carol (Nichols || Goulding) <193874+carols10cents@users.noreply.github.com>
Co-authored-by: Marco Neumann <marco@crepererum.net>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2022-08-05 15:51:35 +00:00
Marco Neumann fc1870ff76
fix: chunk pruning stats (#5319)
- emit a warning if we cannot even attempt to prune chunks due to an
  error. This is always either a missing feature or a bug (even though
  it does not impact correctness but _only_ performance). Also see
  https://github.com/influxdata/conductor/issues/1107
- change metrics to clearly differentiate between "could not prune" and
  "not pruned"
- add new "not pruned" observer hook (this was missing for some reason,
  the "pruned" hook existed though)
2022-08-05 10:50:31 +00:00
kodiakhq[bot] 4898a7f1e3
Merge pull request #5303 from influxdata/cn/upgrade-cold-nonoverlapping-l0
feat: Compact cold partitions; upgrade a single non-overlapping level 0 file to level 1 without running compaction
2022-08-04 21:02:28 +00:00
Carol (Nichols || Goulding) facc967320
fix: Specify hot or cold in more log messages 2022-08-04 16:55:48 -04:00
Carol (Nichols || Goulding) c9d66c30b1
fix: Make this field name consistent
With the other fields on this struct and with the corresponding field on
the clap block struct.
2022-08-04 16:55:48 -04:00
Carol (Nichols || Goulding) da0b031c44
feat: Add parameters to limit total memory usage of cold partition compaction 2022-08-04 16:55:48 -04:00
Carol (Nichols || Goulding) 9d8f94d0d7
fix: Remove an unneeded sleep
The cold case won't make a hot busy loop (hah), we'll just go back to
working on the hot partitions if there are no cold partitions to do.
2022-08-04 16:55:48 -04:00
Carol (Nichols || Goulding) e1c45e836a
test: Remove copypastaed assertions that duplicate a different test 2022-08-04 16:55:48 -04:00
Carol (Nichols || Goulding) cb6442018e
test: Add more test cases varying number of partitions per sequencer 2022-08-04 16:55:48 -04:00
Carol (Nichols || Goulding) d55f45a5c2
feat: Run compaction of hot partitions a configurable number of times more than cold 2022-08-04 16:55:48 -04:00