Commit Graph

11917 Commits (523fd0cabf73d712fb06011eebf7e76e391f9375)

Author SHA1 Message Date
Stuart Carnie 252da2b75f
chore: add tests for the selector functions 2023-04-16 08:42:37 +10:00
Stuart Carnie 8f5f3b2057
chore: clarify comment 2023-04-16 08:36:59 +10:00
Stuart Carnie ccfd334834
chore: correct typo 🔨 2023-04-16 08:32:01 +10:00
Stuart Carnie acd6cff631
chore: validate single-selector with tags or fields is not implemented 2023-04-16 08:21:19 +10:00
Stuart Carnie 8274d584f5
chore: update all remaining code to use `error` and `error::map` module 2023-04-16 08:00:12 +10:00
Stuart Carnie 69d75745cc
feat: add limited `last`, `first`, `min` and `max` selector functions
Returns a `NotImplemented` error when attempting to execute a
selector query, which projects a single selector function and additional
tags or fields until #7533 is implemented.

Introduced `error` module to simplify error handling and ensure
consistency of error messages.
2023-04-16 07:59:28 +10:00
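The check described in 69d75745cc can be illustrated with a minimal sketch. The `Projection` and `QueryError` types and the `validate_projection` function are hypothetical names invented for this example; the log does not show the planner's real types.

```rust
/// Hypothetical projection item; not the planner's real type.
#[derive(Debug, Clone, PartialEq)]
enum Projection {
    /// A selector call such as `last(usage)`.
    Selector(String),
    /// A plain tag or field reference.
    Column(String),
}

/// Hypothetical error type with a single variant for this sketch.
#[derive(Debug, PartialEq)]
enum QueryError {
    NotImplemented(String),
}

/// Rejects a projection made of exactly one selector plus extra tags or fields.
fn validate_projection(items: &[Projection]) -> Result<(), QueryError> {
    let selectors = items
        .iter()
        .filter(|p| matches!(p, Projection::Selector(_)))
        .count();
    let columns = items.len() - selectors;
    if selectors == 1 && columns > 0 {
        return Err(QueryError::NotImplemented(
            "single selector with additional tags or fields".to_string(),
        ));
    }
    Ok(())
}
```

In this sketch a lone selector passes validation, and only the combination of one selector with additional columns is rejected.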
Stuart Carnie 03ea8ea2b8
feat: add `last` selector function
This does not complete the implementation, as we must still use the
timestamp of the `struct_selector_last` when the projection semantics
are selector
2023-04-15 13:54:42 +10:00
Stuart Carnie 007d5b90f3
chore: add APIs to find selector user-defined aggregate functions
This will be used to complete queries that have selector semantics,
meaning they project a single selector function and therefore
use the timestamp for the time column.
2023-04-15 13:54:42 +10:00
Stuart Carnie d11097cf18
chore: add APIs to find selector user-defined aggregate functions
This will be used to complete queries that have selector semantics,
meaning they project a single selector function and therefore
use the timestamp for the time column.
2023-04-15 13:54:42 +10:00
Stuart Carnie 42074e7a9d
chore: refactor and rename `validate_select`
This will be used to determine the semantics of the projection clause
2023-04-15 13:54:42 +10:00
Stuart Carnie 3529762726
chore: remove `println` 2023-04-15 13:54:42 +10:00
Phil Bracikowski 1d64cb1b1e fix(garbage_collector): delay initial s3 checker loop, fix dryrun
This PR makes 3 improvements.

* It adds the configured sleep interval at the start of the object store
  checker to avoid issues with making a remote list immediately at
  startup. We see issues with the S3 API.
* The --dry-run flag was stopping deletes of objects from object store,
  but the retention flagger was still making updates to the catalog.
  These writes to the catalog are surprising when the --dry-run flag is
  provided. Now, with --dry-run the catalog is not updated. The logging
  instead says how many records would be updated because of retention.
* It decreases logging in should_delete of the checker as it will be
  extremely noisy when reporting files it skips. An internal
  environment has 3.8 million parquet files, most of which would be
  skipped.

* related to #7363
* fixes influxdata/idpe#17451
2023-04-14 17:03:07 -07:00
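The dry-run behaviour described in 1d64cb1b1e (count what would change, write nothing) can be sketched roughly as below. The `Catalog` struct and `flag_expired` function are hypothetical stand-ins, not the garbage collector's actual API.

```rust
/// Hypothetical stand-in for the catalog; only the retention flag is modelled.
struct Catalog {
    /// `true` means the record is already flagged for deletion.
    flagged: Vec<bool>,
}

/// Flags the records at the `expired` indexes unless `dry_run` is set.
/// Returns how many records were, or would have been, updated.
fn flag_expired(catalog: &mut Catalog, expired: &[usize], dry_run: bool) -> usize {
    let mut count = 0;
    for &idx in expired {
        if let Some(flag) = catalog.flagged.get_mut(idx) {
            if !*flag {
                count += 1;
                // In dry-run mode the catalog is left untouched.
                if !dry_run {
                    *flag = true;
                }
            }
        }
    }
    count
}
```

The returned count is what a caller could log in place of performing the update.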
Carol (Nichols || Goulding) 31043811d9
feat: Log cold compaction selection 2023-04-14 17:50:51 -04:00
Carol (Nichols || Goulding) bb02e1ce1b
feat: Don't actually compact anything if you're running in cold compaction mode 2023-04-14 17:33:05 -04:00
Carol (Nichols || Goulding) 9350b64314
feat: Add a CompactionType in compactor2::config as well as clap blocks
It's a little weird to have such similar types and have to convert them,
but doing this prevents too many crates from having to depend on/know
about each other.
2023-04-14 17:33:05 -04:00
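One common way to keep two similar types in separate crates, as 9350b64314 describes, is a `From` conversion at the boundary. The sketch below assumes two hypothetical enums, `CliCompactionType` and `CompactionType`; the real type names and variants may differ.

```rust
/// Hypothetical CLI-facing enum, as it might appear in a clap block.
#[derive(Debug, Clone, Copy, PartialEq)]
enum CliCompactionType {
    Hot,
    Cold,
}

/// Hypothetical enum owned by the compactor's own config module.
#[derive(Debug, Clone, Copy, PartialEq)]
enum CompactionType {
    Hot,
    Cold,
}

/// The conversion lives at the boundary, so neither crate needs the other's internals.
impl From<CliCompactionType> for CompactionType {
    fn from(value: CliCompactionType) -> Self {
        match value {
            CliCompactionType::Hot => Self::Hot,
            CliCompactionType::Cold => Self::Cold,
        }
    }
}
```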
Carol (Nichols || Goulding) 4f7fe18e51
fix: Organize use statements 2023-04-14 17:33:05 -04:00
Carol (Nichols || Goulding) b4bad29357
docs: Wrap comments at 100 cols 2023-04-14 17:33:05 -04:00
Carol (Nichols || Goulding) 565a9c454d
refactor: Extract a function for creating the PartitionsSourceConfig
And then add some unit tests for that function. It's getting a smidge
complicated.
2023-04-14 17:33:05 -04:00
Carol (Nichols || Goulding) 76d155fe89
feat: Configuration for hot vs cold thresholds
This creates a separate option for the number of minutes *without* a
write that a partition must have before being considered for cold
compaction.

This is a new CLI flag so that it can have a different default from hot
compaction's compaction_partition_minute_threshold.

I didn't add "hot" to compaction_partition_minute_threshold's name so
that k8s-idpe doesn't have to change to continue running hot compaction
as it is today.

Then use the relevant threshold earlier, when creating the
PartitionsSourceConfig, to make it clearer which threshold is used
where.

Right now, this will silently ignore any CLI flag specified that isn't
relevant to the current compaction mode. We might want to change that to
warn or error to save debugging time in the future.
2023-04-14 17:33:05 -04:00
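The threshold selection described in 76d155fe89 can be sketched as a single match on the compaction mode. Only `compaction_partition_minute_threshold` is named in the commit message; the struct, the cold field name, and `partition_threshold` are assumptions made for illustration.

```rust
use std::time::Duration;

#[derive(Debug, Clone, Copy, PartialEq)]
enum CompactionType {
    Hot,
    Cold,
}

/// Hypothetical subset of the compactor's CLI options.
struct CompactorOptions {
    compaction_type: CompactionType,
    /// Existing flag, in minutes; used for hot compaction.
    compaction_partition_minute_threshold: u64,
    /// Minutes without a write before a partition counts as cold (assumed field name).
    compaction_cold_partition_minute_threshold: u64,
}

/// Picks the threshold relevant to the current mode; the other flag is silently ignored.
fn partition_threshold(opts: &CompactorOptions) -> Duration {
    let minutes = match opts.compaction_type {
        CompactionType::Hot => opts.compaction_partition_minute_threshold,
        CompactionType::Cold => opts.compaction_cold_partition_minute_threshold,
    };
    Duration::from_secs(minutes * 60)
}
```

Resolving the threshold once, when the configuration is built, makes it explicit which value applies in which mode.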
Carol (Nichols || Goulding) 15a7c527b4
feat: Add a --compaction-type CLI arg that can be hot or cold 2023-04-14 17:33:05 -04:00
Andrew Lamb c26981d51b
chore: Update datafusion again (#7557)
* chore: Update datafusion again

* chore: Run cargo hakari tasks

---------

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-04-14 17:37:00 +00:00
Carol (Nichols || Goulding) 5e6dbec909
fix: Remove tombstones as they aren't functional currently 2023-04-14 13:36:08 -04:00
kodiakhq[bot] e2b1acf1c0
Merge pull request #7560 from influxdata/cn/remove-obsolete-docs-infra
fix: Remove outdated documentation and infrastructure having to do with Kafka
2023-04-14 17:25:22 +00:00
Carol (Nichols || Goulding) 5f2d82fbc6
fix: Remove tombstones from querier; they're unused 2023-04-14 13:20:39 -04:00
kodiakhq[bot] bc3b69ef3f
Merge branch 'main' into cn/remove-obsolete-docs-infra 2023-04-14 17:14:45 +00:00
Carol (Nichols || Goulding) 8d3e285251
fix: Remove outdated documentation that discusses Kafka 2023-04-14 13:08:21 -04:00
Dom d55d41b174
Merge pull request #7559 from influxdata/dom/coalesce-partition-fetches
feat: coalesce partition catalog fetches
2023-04-14 17:02:21 +01:00
Dom 4cead9391d
Merge branch 'main' into dom/coalesce-partition-fetches 2023-04-14 16:55:59 +01:00
Dom Dwyer 395224407f
docs: clarify test comment
The mock will panic!
2023-04-14 17:54:10 +02:00
Dom Dwyer b333ddeab0
refactor: remove unwrapping, use nested match
It is simpler, and likely faster.
2023-04-14 17:51:39 +02:00
wiedld 42b5f6d517
Merge pull request #7550 from influxdata/idpe-17449/content-encoding-identity
fix(idpe-17449): content-encoding identity
2023-04-14 08:21:37 -07:00
wiedld a4d9e58e10
Merge branch 'main' into idpe-17449/content-encoding-identity 2023-04-14 08:14:35 -07:00
Dom Dwyer e6dc3bb72f
feat(ingester): coalesce partition fetch requests
Reduce N concurrent partition fetch requests for the same partition into
a single catalog query. This prevents multiple queries executing when
all but one result is thrown away.

This removes a potential request amplification when the catalog is
unavailable, where a number of queries for the same partition execute,
see an error, and retry forever (with backoff) until the catalog
recovers, while more catalog queries are started.
2023-04-14 16:29:01 +02:00
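The request coalescing described in e6dc3bb72f can be illustrated with a blocking, thread-based sketch. This is not the ingester's implementation, which is presumably async. `Slot`, `Coalescer`, and the `u64` key are hypothetical, and the sketch does not handle a panicking `fetch` (waiters would block forever).

```rust
use std::collections::HashMap;
use std::sync::{Arc, Condvar, Mutex};

/// Shared slot that the first caller fills and later callers wait on.
struct Slot<V> {
    value: Mutex<Option<V>>,
    ready: Condvar,
}

/// Coalesces concurrent lookups for the same key into one call to `fetch`.
struct Coalescer<V> {
    in_flight: Mutex<HashMap<u64, Arc<Slot<V>>>>,
}

impl<V: Clone> Coalescer<V> {
    fn new() -> Self {
        Self { in_flight: Mutex::new(HashMap::new()) }
    }

    fn get(&self, key: u64, fetch: impl FnOnce() -> V) -> V {
        // Decide under the map lock whether this caller leads or follows.
        let (slot, leader) = {
            let mut map = self.in_flight.lock().unwrap();
            let existing = map.get(&key).cloned();
            match existing {
                Some(slot) => (slot, false),
                None => {
                    let slot = Arc::new(Slot {
                        value: Mutex::new(None),
                        ready: Condvar::new(),
                    });
                    map.insert(key, Arc::clone(&slot));
                    (slot, true)
                }
            }
        };

        if leader {
            // Only the leader runs the expensive fetch.
            let v = fetch();
            *slot.value.lock().unwrap() = Some(v.clone());
            self.in_flight.lock().unwrap().remove(&key);
            slot.ready.notify_all();
            v
        } else {
            // Followers block until the leader publishes the result.
            let mut guard = slot.value.lock().unwrap();
            while guard.is_none() {
                guard = slot.ready.wait(guard).unwrap();
            }
            guard.clone().unwrap()
        }
    }
}
```

Callers that arrive while a fetch for the same key is in flight share its result. A caller that arrives after the fetch completes starts a new one.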
Dom Dwyer b32a21d093
refactor: request coalescing partition resolver
Implement a new PartitionResolver decorator that coalesces concurrent
requests to the inner PartitionResolver with minimal memory overhead.
2023-04-14 16:29:01 +02:00
Dom Dwyer 435499e9d7
refactor: resolve Arc-wrapped PartitionData
Changes the PartitionResolver trait to return a ref-counted
PartitionData instance, instead of a plain PartitionData (which is then
wrapped in an Arc anyway).

This allows resolver implementations to return multiple references to
the same physical instance.
2023-04-14 16:29:00 +02:00
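The effect of the trait change in 435499e9d7 can be shown with a simplified, synchronous sketch. The trait signature, `PartitionData` fields, and `CachedResolver` below are hypothetical; the real trait is likely async and takes different arguments.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for the ingester's partition state.
#[derive(Debug)]
struct PartitionData {
    id: i64,
}

/// Simplified version of the trait: it returns a ref-counted handle.
trait PartitionResolver {
    fn resolve(&self, id: i64) -> Arc<PartitionData>;
}

/// A resolver that hands out references to one cached instance.
struct CachedResolver {
    cached: Arc<PartitionData>,
}

impl PartitionResolver for CachedResolver {
    fn resolve(&self, _id: i64) -> Arc<PartitionData> {
        // Cloning the Arc bumps a reference count; the data is not copied.
        Arc::clone(&self.cached)
    }
}
```

Because the return type is `Arc<PartitionData>`, two calls can yield handles to the same allocation, which a plain `PartitionData` return value could not express.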
Andrew Lamb f46d06d56f
chore: Update DataFusion + arrow ecosystem to 37 (#7544)
* chore: Update datafusion and arrow/parquet to 37, tonic to 0.9.1

* refactor: Update for FieldRef and other API changes

* fix: Update field size calculation

* fix: Use `NullBuffer` directly

* fix: remove outdated comment

* chore: Update test for tonic

* chore: Run cargo hakari tasks

* chore: cargo update

---------

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-04-14 12:43:01 +00:00
dependabot[bot] d578367341
chore(deps): Bump hyper from 0.14.25 to 0.14.26 (#7554)
Bumps [hyper](https://github.com/hyperium/hyper) from 0.14.25 to 0.14.26.
- [Release notes](https://github.com/hyperium/hyper/releases)
- [Changelog](https://github.com/hyperium/hyper/blob/v0.14.26/CHANGELOG.md)
- [Commits](https://github.com/hyperium/hyper/compare/v0.14.25...v0.14.26)

---
updated-dependencies:
- dependency-name: hyper
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-04-14 10:22:21 +00:00
dependabot[bot] 567ce82df2
chore(deps): Bump assert_cmd from 2.0.10 to 2.0.11 (#7553)
Bumps [assert_cmd](https://github.com/assert-rs/assert_cmd) from 2.0.10 to 2.0.11.
- [Release notes](https://github.com/assert-rs/assert_cmd/releases)
- [Changelog](https://github.com/assert-rs/assert_cmd/blob/master/CHANGELOG.md)
- [Commits](https://github.com/assert-rs/assert_cmd/compare/v2.0.10...v2.0.11)

---
updated-dependencies:
- dependency-name: assert_cmd
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-04-14 09:42:31 +00:00
dependabot[bot] 072d685f94
chore(deps): Bump predicates from 3.0.2 to 3.0.3 (#7552)
Bumps [predicates](https://github.com/assert-rs/predicates-rs) from 3.0.2 to 3.0.3.
- [Release notes](https://github.com/assert-rs/predicates-rs/releases)
- [Changelog](https://github.com/assert-rs/predicates-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/assert-rs/predicates-rs/compare/v3.0.2...v3.0.3)

---
updated-dependencies:
- dependency-name: predicates
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-04-14 09:25:19 +00:00
dependabot[bot] 48d98cc30b
chore(deps): Bump clap from 4.2.1 to 4.2.2 (#7551)
Bumps [clap](https://github.com/clap-rs/clap) from 4.2.1 to 4.2.2.
- [Release notes](https://github.com/clap-rs/clap/releases)
- [Changelog](https://github.com/clap-rs/clap/blob/master/CHANGELOG.md)
- [Commits](https://github.com/clap-rs/clap/compare/v4.2.1...v4.2.2)

---
updated-dependencies:
- dependency-name: clap
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-04-14 09:13:20 +00:00
wiedld ca492b09d2 fix(idpe-17449): accept content-encoding identity for the parseBody 2023-04-13 17:09:21 -07:00
wiedld b1d10671b9 fix(idpe-17449): accept content-encoding identity as a valid header 2023-04-13 17:07:56 -07:00
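In HTTP, the `identity` content coding means the body has no transformation applied, so accepting it amounts to treating the body as unencoded. A minimal sketch of such header handling follows. `BodyEncoding` and `parse_content_encoding` are hypothetical names, gzip support is an assumption, and the sketch is in Rust for consistency with the rest of this log rather than the language of the actual fix.

```rust
/// Hypothetical set of body encodings; gzip support is assumed for this sketch.
#[derive(Debug, PartialEq)]
enum BodyEncoding {
    Identity,
    Gzip,
}

/// Maps a Content-Encoding header value to an encoding.
/// A missing or empty header is treated the same as `identity`.
fn parse_content_encoding(header: Option<&str>) -> Result<BodyEncoding, String> {
    match header.map(|h| h.trim().to_ascii_lowercase()).as_deref() {
        None | Some("") | Some("identity") => Ok(BodyEncoding::Identity),
        Some("gzip") => Ok(BodyEncoding::Gzip),
        Some(other) => Err(format!("unsupported content-encoding: {other}")),
    }
}
```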
Chunchun Ye 69da3c2495
feat(flightsql): Support `GetCrossReference` metadata endpoint with an empty RecordBatch (#7548)
* feat: support CommandGetCrossReference metadata endpoint with tests

* chore: create two tables in the test for GetCrossReference endpoint

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-04-13 19:53:37 +00:00
Chunchun Ye 5182ab0037
chore: add more tests and logic for gRPC database header names (#7529)
* chore: support returning the database name if all the keys refer to the same database

* test: add test cases to check for same, different, and no database in request header

* chore: lint

* chore: more lint

* refactor: replace empty string with None for database_name

* refactor: simplify logic for NoFlightSQLDatabase error

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-04-13 18:28:29 +00:00
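The header rule summarised in 5182ab0037 (return the database name only when every key that is present refers to the same database, and use `None` rather than an empty string otherwise) might look like the sketch below. The `database_name` function and its input shape are hypothetical, and returning `None` for conflicting values is an assumption; the real code may report an error instead.

```rust
/// Returns the database name if at least one header value is present and
/// all present values agree; otherwise returns `None`.
fn database_name<'a>(values: &[Option<&'a str>]) -> Option<&'a str> {
    // Skip headers that are absent from the request.
    let mut present = values.iter().flatten().copied();
    let first = present.next()?;
    if present.all(|v| v == first) {
        Some(first)
    } else {
        None
    }
}
```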
Chunchun Ye 8bf47df621
feat(flightsql): Support `GetImportedKeys` metadata endpoint with an empty RecordBatch (#7546)
* feat: support CommandGetImportedKeys metadata endpoint with tests

* chore: remove comments that are no longer valid

---------

Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-04-13 18:06:47 +00:00
dependabot[bot] b4003a70fe
chore(deps): Bump h2 from 0.3.16 to 0.3.17 (#7547)
Bumps [h2](https://github.com/hyperium/h2) from 0.3.16 to 0.3.17.
- [Release notes](https://github.com/hyperium/h2/releases)
- [Changelog](https://github.com/hyperium/h2/blob/master/CHANGELOG.md)
- [Commits](https://github.com/hyperium/h2/compare/v0.3.16...v0.3.17)

---
updated-dependencies:
- dependency-name: h2
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-04-13 17:37:47 +00:00
Andrew Lamb 134ff2ef83
chore: update DataFusion pin (right before arrow 37 update) (#7540)
* chore: update DataFusion pin

* refactor: Update for deprecated API

* chore: Run cargo hakari tasks

---------

Co-authored-by: CircleCI[bot] <circleci@influxdata.com>
Co-authored-by: kodiakhq[bot] <49736102+kodiakhq[bot]@users.noreply.github.com>
2023-04-13 17:25:24 +00:00
kodiakhq[bot] 08274b12d2
Merge pull request #7541 from influxdata/dom/remove-bad-comments
docs: remove misleading API comments
2023-04-13 14:31:13 +00:00
Dom Dwyer 3a8803c43c
docs: remove misleading API comments
These fields are very much in use now!
2023-04-13 16:17:48 +02:00
kodiakhq[bot] b7096bdad4
Merge pull request #7524 from influxdata/cn/remove-old-querier
fix: Remove old querier
2023-04-13 14:05:53 +00:00